Hibernate Cache Problems


1. About the Hibernate cache

1.1. Basic Cache principles

The Hibernate cache is divided into two levels.

The first level is the cache held inside the Session. It is enabled by default and cannot be detached from the Session.

The second level is the process-level cache managed by the SessionFactory. It is a globally shared cache, and any query method that goes through the second-level cache benefits from it. It only works after it has been configured correctly, and for conditional queries you must also use the appropriate methods to read data from the cache, for example Query.iterate(), load(), get(), and so on. Note in particular that Session.find() always fetches data from the database and never reads from the second-level cache, even when the cache already holds the required data.

A query that uses the cache proceeds as follows: first the required data is looked up in the first-level cache; if it is not there, the second-level cache is checked; if it is not there either, the database is queried. Each of these three lookups is slower than the one before it.
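
As a minimal sketch of that lookup order, assuming the Hibernate 2.x API (net.sf.hibernate) that the configuration examples later in this article use, and borrowing the entity class mapped in section 3 (the ID value and HQL are illustrative only):

    import java.util.List;

    import net.sf.hibernate.Session;
    import net.sf.hibernate.SessionFactory;

    import com.sobey.sbm.model.entitysystem.vo.DatatypeVO;

    public class CacheLookupSketch {
        public void lookup(SessionFactory sessionFactory) throws Exception {
            Session session = sessionFactory.openSession();
            try {
                // load() by ID: first-level cache -> second-level cache -> database
                DatatypeVO vo = (DatatypeVO) session.load(DatatypeVO.class, new Long(1));

                // Session.find() always goes straight to the database, even if the
                // second-level cache already holds the matching rows
                List fromDb = session.find("from DatatypeVO");
            } finally {
                session.close();
            }
        }
    }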

1.2. Existing Problems

1.2.1. Limitations of the first-level cache and reasons to use the second-level cache

Because the Session life cycle is usually very short, the first-level cache inside it (the fastest cache) is just as short-lived, so its hit rate is very low and its contribution to system performance is limited. The Session's internal cache mainly exists to keep the Session's own data state consistent; Hibernate does not provide it as a way to dramatically improve system performance.

To improve Hibernate performance, besides common techniques such as lazy loading, eager (outer-join) fetching, and query filtering, you also need to configure Hibernate's second-level cache. Its improvement to overall system performance is often immediate.

(In my experience on previous projects, it generally brings a 3 to 4 times performance improvement.)

1.2.2. The N+1 query problem

1.2.2.1 When does the 1+N problem occur?

Premise: by default, Hibernate associates tables using fetch="select" rather than fetch="join"; this is all in preparation for lazy loading.

1) One-to-many (<set>, <list>): one SQL query retrieves an object, but because of the association its associated collections must also be loaded; if there are N of them, N additional SQL statements are issued, so the original single query becomes 1+N.

2) Many-to-one: one SQL query retrieves N objects, but because of the association the "one"-side object of each of those N objects must also be fetched, so the original single query becomes 1+N.

3) When querying with iterate(), the cache is always checked first (the initial SQL query fetches only the IDs). For every ID that misses the cache, the object is then fetched from the database one by one, giving 1+N SQL statements.

1.2.2.2 How can the 1+N problem be solved?

1) lazy="true". Hibernate 3 uses lazy="true" by default. With lazy loading, the associated object is not queried immediately; the query is issued only when the associated object is actually needed (that is, when a non-ID property of it is accessed).

2) Use the second-level cache. With the second-level cache in place, the 1+N issue is nothing to fear: even if the first query is slow (cache misses), later queries hit the cache directly and are very fast, which actually turns the 1+N pattern to our advantage.

3) You can also set fetch="join" so that all associated tables are queried in one go, but then the lazy-loading feature is lost (see the mapping sketch below).
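
To make option 3's trade-off concrete, here is a hedged Hibernate 3 style mapping sketch; the entity, collection, and column names are invented for illustration. lazy="true" defers loading the collection, while changing fetch to "join" would load it in a single outer-join query at the cost of lazy loading.

    <class name="OrderVO" table="T_ORDER">
        <id name="id" column="ORDER_ID">
            <generator class="native"/>
        </id>
        <!-- lazy="true": items are loaded only when accessed -->
        <!-- fetch="join" instead would load them in one outer join, losing laziness -->
        <set name="items" lazy="true" fetch="select">
            <key column="ORDER_ID"/>
            <one-to-many class="OrderItemVO"/>
        </set>
    </class>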

When performing conditional queries, the iterate() method has the famous "N+1" query problem: on the first query, iterate() executes one query plus one query per matching result (N+1 in total). However, this only happens on the first query; when the same query is executed again later, performance improves greatly. This method is suitable for querying business data with a large amount of data.

However, note that when the data volume is very large (for example, continuously growing transaction-log style data), you need to configure a specific cache policy for that persistent object, such as the maximum number of records kept in the cache and how long cached entries live. Otherwise the system may load a huge amount of data into memory at once, rapidly exhaust memory resources, and end up reducing performance instead.

1.3. Other considerations when using the Hibernate second-level cache

1.3.1. Data Validity

Hibernate maintains the data in the second-level cache itself to keep the cached data consistent with the real data in the database. Whenever you pass an object to save(), update(), or saveOrUpdate(), or obtain an object via load(), get(), list(), iterate(), or scroll(), that object is added to the Session's internal cache; when flush() is subsequently called, its state is synchronized with the database.

In other words, when data is deleted, updated, or added, the cache is updated at the same time, and this includes the second-level cache.

As long as all database work is done through the Hibernate API, Hibernate automatically guarantees the validity of the cached data.

However, if you bypass Hibernate and operate on the database directly through JDBC, Hibernate neither knows about nor can detect the changes made to the database, so it cannot guarantee that the data in the cache is still valid.

This is a common problem for all ORM products. Fortunately, Hibernate exposes cache-clearing methods, which give us a way to ensure data validity manually.

Both the first-level and second-level caches have corresponding clearing methods.

The second-level cache provides the following clearing methods:

Clearing the cache for an entire entity class;

Clearing the cache for a single instance, identified by entity class and primary key ID;

Clearing the cached data of an object's collections (see the sketch below).
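
As a sketch, these correspond to the following SessionFactory calls in the Hibernate 2.x API; the entity is the class mapped in section 3, and the collection role name is hypothetical, since that mapping defines no collection:

    import net.sf.hibernate.HibernateException;
    import net.sf.hibernate.SessionFactory;

    import com.sobey.sbm.model.entitysystem.vo.DatatypeVO;

    public class SecondLevelCacheEviction {
        public void clear(SessionFactory sessionFactory) throws HibernateException {
            // Clear every cached instance of one entity class
            sessionFactory.evict(DatatypeVO.class);

            // Clear only the cached instance with a specific primary key ID
            sessionFactory.evict(DatatypeVO.class, new Long(1));

            // Clear the cached data of one collection role ("ClassName.propertyName")
            sessionFactory.evictCollection("com.sobey.sbm.model.entitysystem.vo.DatatypeVO.items");
        }
    }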

1.3.2. Applicable situations

Not every situation is suitable for the second-level cache; it has to be decided case by case, and you can also configure a specific cache policy for each persistent object.

Situations suitable for the second-level cache:

1. The data will not be modified by a third party;

Generally, it is better not to configure a second-level cache for data that is modified outside of Hibernate, to avoid inconsistencies. However, if such data must be cached for performance reasons even though it may be modified by a third party (for example, through direct SQL), you can still configure a second-level cache for it; you just have to call the cache-clearing methods manually after the SQL modification to keep the data consistent.

2. The data volume is within an acceptable range;

If the data volume of a table is extremely large, it is not suitable for the second-level cache: the cached data set would be so big that it could exhaust memory and reduce performance instead.

If the data volume of a table is extremely large but usually only the newest data is used, you can still configure a second-level cache for it. In that case, however, you must configure a separate cache policy for that persistent class, such as the maximum number of cached entries and the cache expiry time, and keep those parameters in a reasonable range (too high and memory runs short; too low and the cache is of little use).
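
For example, if EHCache is the configured provider, such limits are normally set per region in ehcache.xml. The following is only an assumed illustration: the region name matches the class mapped in section 3 and the numbers are arbitrary.

    <ehcache>
        <diskStore path="java.io.tmpdir"/>

        <!-- Fallback settings for classes without their own region -->
        <defaultCache maxElementsInMemory="1000" eternal="false"
                      timeToIdleSeconds="120" timeToLiveSeconds="300"
                      overflowToDisk="false"/>

        <!-- A tighter region for the very large, mostly-recent table -->
        <cache name="com.sobey.sbm.model.entitysystem.vo.DatatypeVO"
               maxElementsInMemory="500"
               eternal="false"
               timeToIdleSeconds="120"
               timeToLiveSeconds="600"
               overflowToDisk="false"/>
    </ehcache>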

3. The data is updated infrequently;

For frequently updated data, the cost of constantly synchronizing the cache can be as high as the benefit gained from reading it; the pros and cons cancel out and caching is of little value.

4. Non-critical data (for example, not financial data)

Financial data and the like are extremely important, and stale or invalid values are absolutely unacceptable, so for safety it is best not to use the second-level cache for them.

After all, "correctness" matters far more than "high performance".

2. Suggestions for using the Hibernate cache in the system

2.1. Current situation

In general, there are three situations in which the system bypasses Hibernate to perform database operations:

1. Multiple application systems access the same database simultaneously

In this case, using the Hibernate second-level cache will inevitably cause data inconsistencies, so careful design is needed: for example, avoid simultaneous writes to the same table, or use the database's locking mechanisms at the appropriate levels.

2. Dynamic tables

A "dynamic table" is a data table automatically created based on the user's operating system during system operation.

For example, "Custom forms" and other functional modules of the custom extension development nature, because the data table is created at runtime, so the hibernate ing cannot be performed. Therefore, the operation can only be performed by bypassing the direct JDBC operation of hibernate.

If the data in such dynamic tables is not cached, there is no data-inconsistency problem.

If you have designed your own caching mechanism for them, call your own cache-synchronization methods.

3. Using SQL to batch-delete from tables mapped to Hibernate persistent objects

After a batch delete is executed, the deleted rows may still exist in the cache.

Analysis:

After the SQL of case 3 above (the batch-delete statement) has been executed, subsequent queries fall into three categories of methods:

A. The Session.find() method:

As summarized earlier, find() does not read from the second-level cache; it queries the database directly.

Therefore, no data-validity problem arises.

B. Conditional queries executed with the iterate() method:

Because of the way iterate() works, it always queries the database first for the IDs that match the condition and then fetches the data from the cache by ID, falling back to a database query only when an ID is not in the cache.

If a record has been deleted directly by SQL, its ID no longer appears in the ID query, so even if the record is still in the cache it will never be returned to the caller, and no inconsistency occurs. (This has been tested and verified.)

C. Queries by ID using the get() or load() methods:

Objectively, stale data would be returned here. However, the batch-delete SQL in the system is generally aimed at intermediate association tables, and such tables are normally accessed through conditional queries; the probability of querying an association by ID is very low, so in practice the problem does not arise.

If a value object does need to query an association by ID, and SQL batch deletion is used because of the large data volume, then when both conditions are met you can manually evict that object's data from the second-level cache to ensure that queries by ID return correct results. (This situation is unlikely to occur.)

2.2. Suggestions

1. We recommend not using SQL to directly update the data of persistent objects; batch deletes, however, are acceptable. (There are few places in the system that require batch updates.)

2. If you must use SQL to update data, you must also clear that object's cached data by calling methods such as

sessionFactory.evict(Class)

sessionFactory.evict(Class, id).

3. When the amount of data to batch-delete is small, you can perform the batch delete directly through Hibernate; this avoids the cache-consistency problems that executing raw SQL would cause.

4. We do not recommend using Hibernate's batch-delete approach to delete large numbers of records.

The reason is that for a batch delete Hibernate executes one query statement followed by N delete statements for the matching records, rather than a single conditional DELETE statement.

When there is a lot of data to delete, this becomes a serious performance bottleneck. If the volume to delete is large, say more than 50 records, you can delete the data directly through JDBC: only one SQL DELETE statement is executed and performance improves greatly. At the same time, use Hibernate to clear the related data from the second-level cache.

To do so, call sessionFactory.evict(Class) or sessionFactory.evict(Class, id), as above. A sketch of the whole pattern follows.
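
Here is a hedged sketch of that pattern: one conditional DELETE through plain JDBC, followed by an explicit second-level cache eviction so Hibernate does not keep serving the deleted rows. The table, column, and entity names follow the mapping in section 3 but are illustrative only.

    import java.sql.Connection;
    import java.sql.PreparedStatement;

    import net.sf.hibernate.Session;
    import net.sf.hibernate.SessionFactory;
    import net.sf.hibernate.Transaction;

    import com.sobey.sbm.model.entitysystem.vo.DatatypeVO;

    public class BatchDeleteWithEviction {
        public void deleteByDbType(SessionFactory sessionFactory, String dbType) throws Exception {
            Session session = sessionFactory.openSession();
            try {
                Transaction tx = session.beginTransaction();
                // One conditional DELETE instead of 1 + N statements
                Connection con = session.connection();
                PreparedStatement ps =
                        con.prepareStatement("delete from DCM_DATATYPE where DBTYPE = ?");
                try {
                    ps.setString(1, dbType);
                    ps.executeUpdate();
                } finally {
                    ps.close();
                }
                tx.commit();

                // Hibernate never saw the delete, so clear the stale cache entries ourselves
                sessionFactory.evict(DatatypeVO.class);
            } finally {
                session.close();
            }
        }
    }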

Therefore, for ordinary application-system development (not involving clusters or distributed data-synchronization issues), raw SQL is executed only when batch-deleting intermediate association tables, and those tables are generally accessed by conditional queries rather than by ID. So you can execute the SQL delete directly without calling the cache-clearing methods, and configuring the second-level cache later will not introduce data-validity problems.

And even if somewhere the intermediate-table objects really are queried by ID, you can simply call the cache-clearing methods there as well.

3. Specific Configuration Methods

In my experience, many Hibernate users assume that once they call the corresponding methods, "Hibernate will take care of performance by itself", or that "Hibernate will automatically use the cache for all our operations". The reality is that although Hibernate provides a good caching mechanism and supports pluggable cache frameworks, it only makes a difference when it is called correctly. So when a system built on Hibernate performs poorly, the cause is often not Hibernate itself but a misunderstanding of how to use it; configure it properly and the performance gain can be a pleasant surprise. The specific configuration method is described below.

Hibernate provides a second-level cache provider interface,

net.sf.hibernate.cache.CacheProvider,

together with a default implementation, net.sf.hibernate.cache.HashtableCacheProvider.

You can also configure other implementations such as EHCache and JBossCache.

The configuration goes in the hibernate.cfg.xml file:

    <property name="hibernate.cache.use_query_cache">true</property>
    <property name="hibernate.cache.provider_class">net.sf.hibernate.cache.HashtableCacheProvider</property>

Many Hibernate users think that the configuration is now complete.

Note: at this point the Hibernate second-level cache is not being used at all. And because the Session is usually closed right after use, the first-level cache plays hardly any role either. The result is that no cache is used at all and every Hibernate operation goes straight to the database; the performance is easy to imagine.

The correct approach is, in addition to the configuration above, to configure a specific cache policy for each VO object in its mapping file. For example:

    <hibernate-mapping>
        <class name="com.sobey.sbm.model.entitysystem.vo.DatatypeVO" table="DCM_DATATYPE">
            <cache usage="read-write"/>
            <id name="id" column="TYPEID" type="java.lang.Long">
                <generator class="sequence"/>
            </id>
            <property name="name" column="NAME" type="java.lang.String"/>
            <property name="dbtype" column="DBTYPE" type="java.lang.String"/>
        </class>
    </hibernate-mapping>

The key is the <cache usage="read-write"/> element; the usage attribute has several options: read-only, read-write, transactional, and so on.

Then, when executing queries, note that for conditional queries (and queries that return all rows) the Session.find() method does not read data from the cache; cached data is used only when the Query.iterate() method is called.

The get() and load() methods both read from the cache.
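
Related to the hibernate.cache.use_query_cache property configured above, a conditional query can also cache its whole result set when it is explicitly marked cacheable. The following is only a hedged sketch; the HQL, parameter values, and property names are illustrative, based on the mapping in section 3.

    import java.util.Iterator;
    import java.util.List;

    import net.sf.hibernate.Query;
    import net.sf.hibernate.Session;
    import net.sf.hibernate.SessionFactory;

    public class ConditionalQuerySketch {
        public void query(SessionFactory sessionFactory) throws Exception {
            Session session = sessionFactory.openSession();
            try {
                // iterate(): only the IDs come from the database; the entities
                // themselves are resolved through the first/second-level caches
                Query q = session.createQuery("from DatatypeVO d where d.dbtype = :t");
                q.setString("t", "oracle");
                Iterator it = q.iterate();

                // Query cache: the whole result set is cached, but only when the
                // query is marked cacheable and use_query_cache is enabled
                Query cached = session.createQuery("from DatatypeVO d where d.dbtype = :t");
                cached.setString("t", "oracle");
                cached.setCacheable(true);
                List results = cached.list();
            } finally {
                session.close();
            }
        }
    }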

The exact configuration details vary between cache frameworks, but the configuration above is the common part. (As for configuring transactional caches and cluster-capable environments, I will try to cover them in a separate article.)

