When and how is the Hibernate cache used?

Source: Internet
Author: User

Hibernate caching works at two levels. The first level lives inside the Session; it is enabled by default and cannot be turned off. The second level is a process-level cache managed by the SessionFactory; it is a globally shared cache, and any query method that goes through the second-level cache benefits from it.

1. About Hibernate caching

1.1. Basic caching principles

Hibernate caching has two levels.

The first level is the cache inside the Session. It is enabled by default and cannot be disabled.

The second level is the process-level cache managed by the SessionFactory. It is a globally shared cache, and any query method that goes through the second-level cache benefits from it. The second-level cache only works when it is configured correctly, and you must also use methods that actually read from the cache when running conditional queries, such as Query.iterate(), load(), and get(). Note that Session.find() always fetches data from the database and never reads from the second-level cache, even when the cache already holds the data it needs.

When a query runs, Hibernate first checks the first-level cache for the required data; if it misses, it checks the second-level cache; if that misses too, it queries the database. Note that these three paths get progressively slower.
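The lookup order above can be sketched in plain Java. This is an illustrative model only, not Hibernate's actual implementation; the class and method names are invented for the example:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal sketch of the described lookup order:
// first-level cache -> second-level cache -> database.
class TwoLevelCache {
    private final Map<Long, Object> firstLevel = new HashMap<>();   // per-"Session"
    private final Map<Long, Object> secondLevel = new HashMap<>();  // per-"SessionFactory"
    private final Function<Long, Object> database;                  // stands in for the real DB

    TwoLevelCache(Function<Long, Object> database) {
        this.database = database;
    }

    Object get(Long id) {
        Object value = firstLevel.get(id);      // 1. fastest: session cache
        if (value == null) {
            value = secondLevel.get(id);        // 2. slower: process-level cache
            if (value == null) {
                value = database.apply(id);     // 3. slowest: the database
                secondLevel.put(id, value);
            }
            firstLevel.put(id, value);
        }
        return value;
    }

    // Simulates closing the Session: only the first-level cache disappears.
    void closeSession() {
        firstLevel.clear();
    }
}
```

Closing the "session" drops only the first-level map; the shared second-level map keeps serving later lookups, which is exactly why the second-level cache raises the hit rate across many short-lived Sessions.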

1.2. Problems that exist

1.2.1. First-level cache limitations and why a second-level cache is needed

Because a Session is usually short-lived, the first-level cache that lives inside it is equally short-lived, so its hit rate is very low and its contribution to system performance is limited. The main purpose of this Session-internal cache is to keep the Session's internal data state consistent; Hibernate does not provide it as a means of significantly improving system performance.

To improve Hibernate performance, besides the usual techniques that deserve attention, such as lazy loading, eager (join) fetching, and query filtering, you also need to configure Hibernate's second-level cache. Its effect on overall system performance is often immediate.

(Based on experience from previous projects, this generally yields a performance improvement of more than 100%.)

1.2.2. The n+1 query problem

1.2.2.1. When do you encounter the 1+n problem?

Prerequisite: Hibernate's default association fetching mode is fetch="select", not fetch="join", which is intended to support lazy loading.

1) One-to-many (<set>, <list>): on the "one" side, 1 SQL query retrieves the parent objects; because of the association, the collection belonging to each parent must also be loaded, and with n collections to load, n more SQL statements are issued. So the original 1 SQL query becomes 1+n.

2) Many-to-one (<many-to-one>): on the "many" side, 1 SQL query retrieves n objects; because of the association, the "one"-side object corresponding to each of those n objects must also be fetched, so the original 1 SQL query becomes 1+n.

3) iterate() queries always check the cache first (1 SQL statement retrieves only the IDs of the matching set); for every ID that misses the cache, it goes back to the database by ID, resulting in 1+n SQL statements.

1.2.2.2. How to solve the 1+n problem?

1) lazy=true. Since Hibernate 3 the default is lazy=true; with it, the associated object is not queried immediately. The query is only issued when the associated object is actually needed (that is, when one of its properties other than the ID field is accessed).

2) Use the second-level cache. With the second-level cache in place, the 1+n problem is far less harmful: even if the first query is slow (all misses), subsequent queries hit the cache directly and are fast. The initial 1+n queries effectively warm the cache.

3) You can also set fetch="join", which loads the association in a single joined query, but then you lose lazy loading.
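As a hedged sketch of option 3, a one-to-many mapping with fetch="join" might look like the fragment below. The Order/OrderLine names, table, and columns are invented for the example; batch-size is noted as a middle-ground alternative that keeps lazy loading:

```xml
<!-- fetch="join" loads the collection in the same SQL statement as the parent,
     avoiding 1+n at the cost of lazy loading. Alternatively, keeping
     fetch="select" and adding batch-size="10" loads collections in batches. -->
<class name="com.example.Order" table="ORDERS">
  <id name="id" column="ORDER_ID">
    <generator class="native"/>
  </id>
  <set name="lines" fetch="join" lazy="false">
    <key column="ORDER_ID"/>
    <one-to-many class="com.example.OrderLine"/>
  </set>
</class>
```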

When you run a conditional query with the iterate() method, you hit the well-known "n+1" problem: on the first execution, iterate() issues one additional query per matching result, n+1 queries in total. The problem only exists on the first execution, though; repeating the same query later is much faster because the results are cached. This approach suits business data that is queried repeatedly in large volumes.

Note, however: when the data volume is especially large (e.g., transaction-stream data), you need to configure a specific cache policy for that persistent object, such as the maximum number of records in the cache and the cache expiry time, to prevent the system from loading huge amounts of data into memory, quickly exhausting memory resources, and actually degrading performance.

1.3. Additional considerations for using the Hibernate second-level cache

1.3.1. Data validity

Hibernate maintains the data in the second-level cache to keep the cached data consistent with the real data in the database. Whenever you pass an object to save(), update(), or saveOrUpdate(), or obtain an object via load(), get(), list(), iterate(), or scroll(), that object is added to the Session's internal cache. When flush() is subsequently called, the object's state is synchronized with the database.

This means the cache is updated whenever data is deleted, updated, or inserted; and of course this includes the second-level cache.

As long as you perform database work through the Hibernate API, Hibernate automatically keeps your cached data valid.

However, if you bypass Hibernate and operate on the database directly via JDBC, Hibernate cannot perceive those changes and can no longer guarantee the validity of the data in the cache.

This is a problem common to all ORM products. Fortunately, Hibernate exposes cache-eviction methods, which give us a chance to keep the data valid manually.

Both the first-level and the second-level cache have corresponding eviction methods.

The second-level cache provides eviction methods that can:

Evict the cache for an entire object class;

Evict the cache entry for an object class given its primary key ID;

Evict the cached data of an object's collections; and so on.
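A sketch of those eviction calls against the net.sf.hibernate API used in this article. The DataTypeVO class comes from the mapping example later in the article; the collection role name "items" and the id value are invented for illustration:

```java
// Second-level cache: evict all cached instances of a class.
sessionFactory.evict(DataTypeVO.class);

// Second-level cache: evict one instance by primary key.
sessionFactory.evict(DataTypeVO.class, new Long(42));

// Second-level cache: evict a mapped collection role (role name is hypothetical).
sessionFactory.evictCollection("com.sobey.sbm.model.entitySystem.vo.DataTypeVO.items");

// First-level cache: cleared per Session.
session.evict(dataTypeVO); // detach a single object
session.clear();           // empty the whole Session cache
```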

1.3.2. Suitable use cases

Not every situation is suitable for the second-level cache; it must be decided case by case. You can also configure a specific caching policy for each persistent object.

Situations suitable for the second-level cache:

1. The data will not be modified by third parties;

In general, data that is modified outside of Hibernate is best left out of the second-level cache, to avoid data inconsistencies. However, if such data must be cached for performance reasons even though it may be modified by third parties (e.g., via direct SQL), you can still configure a second-level cache for it; you just have to call the cache's eviction methods manually after the SQL modification runs, to keep the data consistent.

2. The data size is within an acceptable range;

If a table holds a particularly large amount of data, it is not suitable for the second-level cache: caching too much data strains memory resources and actually degrades performance.
If the table is very large but only a relatively fresh subset of it is used frequently, you can still configure a second-level cache for it. However, you must configure the persistent class's caching policy separately, e.g. the maximum cached element count and the cache expiry time, keeping these parameters in a reasonable range (too high strains memory; too low defeats the purpose of caching).
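As a hedged sketch, with EHCache as the cache provider such a per-class policy might be expressed in an ehcache.xml fragment like the following. The class name reuses the DataTypeVO mapping shown later in this article; all the numbers are illustrative, not recommendations:

```xml
<ehcache>
  <defaultCache maxElementsInMemory="1000" eternal="false"
                timeToIdleSeconds="300" timeToLiveSeconds="600"
                overflowToDisk="false"/>
  <!-- Per-class region: cap the element count and expiry for one entity. -->
  <cache name="com.sobey.sbm.model.entitySystem.vo.DataTypeVO"
         maxElementsInMemory="500" eternal="false"
         timeToIdleSeconds="120" timeToLiveSeconds="300"
         overflowToDisk="false"/>
</ehcache>
```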

3. The data update frequency is low;

For data that is updated too frequently, the cost of constantly synchronizing the cache can match or exceed the benefit gained from reading it, and the advantage is cancelled out. In that case caching has little value.

4. Non-critical data (not financial data, etc.)

Financial data and the like are extremely important; invalid data must never appear or be used, so for safety it is best not to use the second-level cache there.

At that point the importance of "correctness" far outweighs the importance of "high performance."

2. Suggestions for using the Hibernate cache in the current system

2.1. Current situation

In a typical system there are three scenarios that bypass Hibernate to perform database operations:

1. Multiple application systems access the same database simultaneously

In this case, using the Hibernate second-level cache will inevitably cause data inconsistencies, so a detailed design is needed: for example, avoid simultaneous writes to the same table in the design, use the database's various locking mechanisms, and so on.

2. Dynamic tables

A "dynamic table" is a data table created automatically at runtime according to the user's operations.

For example, "custom forms" and similar user-customizable extension modules: because their tables are created at runtime, Hibernate cannot map them, so they can only be manipulated with direct JDBC operations that bypass Hibernate.

If no caching is designed for the data in dynamic tables, no data-inconsistency problem arises.

If you do design your own caching mechanism for them, call your own cache synchronization methods.

3. Using SQL to bulk-delete rows from a Hibernate-persisted object's table

After a bulk delete, the deleted rows are still present in the cache.

Analysis:

After scenario 3 (a SQL bulk delete) runs, subsequent queries can only take one of the following three forms:

A. The Session.find() method:

As summarized earlier, find() does not read from the second-level cache; it queries the database directly.

So there is no problem with data validity.

B. Conditional queries via the iterate() method:

Because of how iterate() works, each execution queries the database for the IDs matching the condition and then fetches the data for each ID from the cache, issuing a database query only when the cache does not contain that ID;

If a record was deleted directly via SQL, iterate()'s ID query no longer returns that ID. So even if the cache still holds the record, it is never handed to the caller, and no inconsistency arises. (This case has been tested and verified.)

C. Queries by ID via get() or load():

Objectively, stale data can be returned here. However, SQL bulk deletes in the system generally target intermediate association tables, and queries against such tables are generally conditional; querying an association by ID is very rare, so in practice this problem does not arise.

If a value object really does need to be queried by ID, and its data volume is large enough that SQL bulk deletes are used, then, to guarantee correct results for by-ID queries, you can manually evict that object's data from the second-level cache. (This situation is unlikely to occur.)

2.2. Recommendations

1. It is recommended not to use SQL directly to update the data of persistent objects, though bulk deletes may be performed this way. (There are few places in the system that need batch updates.)

2. If you must use SQL to update data, you must evict that object's cached data by calling

SessionFactory.evict(Class)

SessionFactory.evict(Class, id)

and similar methods.

3. When the amount of data to delete is small, Hibernate's own batch delete can be used directly, so there is no cache-consistency problem caused by bypassing Hibernate with SQL.

4. It is not recommended to use Hibernate's batch delete to remove large numbers of records.

The reason is that Hibernate's batch delete executes 1 query statement plus n DELETE statements, one per matching row, instead of a single conditional DELETE statement.
With a large amount of data to delete, this becomes a serious performance bottleneck. If the volume to delete is large (for example, more than 50 rows), it can be deleted directly via JDBC; the benefit is that only one SQL DELETE statement executes, which greatly improves performance. For the resulting cache-synchronization problem, you can use Hibernate's methods for evicting the related data from the second-level cache.
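A hedged sketch of that JDBC-plus-eviction pattern follows. The table and column names are illustrative only, and the connection handling is reduced to the essentials; the Session and SessionFactory are assumed to exist in scope:

```java
// One conditional DELETE instead of 1 query + n single-row DELETEs.
Connection con = session.connection();  // borrow the Session's JDBC connection
PreparedStatement ps = con.prepareStatement(
        "DELETE FROM DCM_DATATYPE WHERE DBTYPE = ?");
try {
    ps.setString(1, "obsolete");       // illustrative condition
    int deleted = ps.executeUpdate();  // a single SQL statement, whatever n is
} finally {
    ps.close();
}
// Hibernate did not see the delete, so evict the stale entries manually.
sessionFactory.evict(DataTypeVO.class);
```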

Call

SessionFactory.evict(Class);

SessionFactory.evict(Class, id)

and similar methods.

So, for typical application-system development (not involving clustering, distributed data synchronization, and the like), since SQL deletes are only issued for bulk deletion on intermediate association tables, and intermediate association tables are generally queried conditionally and almost never by ID, you can execute the SQL delete directly without even calling the cache-eviction methods. Doing so will not cause data-validity problems once a second-level cache is configured later.

Stepping back, even if something does query an intermediate-table object by ID, the problem can be resolved by calling the cache-eviction methods.

3. Specific configuration method

Among the many Hibernate users I know, quite a few superstitiously believe that "Hibernate handles performance for us" or "Hibernate automatically uses the cache for all our operations." In reality, Hibernate provides a good caching mechanism and support for pluggable cache frameworks, but it only takes effect when invoked correctly. Many of the performance problems in Hibernate-based systems are not Hibernate's fault but the result of users not understanding how to use it properly. Conversely, configured correctly, Hibernate's performance can leave you quite "pleasantly surprised." Below I explain the specific configuration.

Hibernate provides a second-level cache interface:

net.sf.hibernate.cache.CacheProvider,

and a default implementation, net.sf.hibernate.cache.HashtableCacheProvider.

Other implementations, such as EHCache and JBossCache, can also be configured.

The configuration goes in the hibernate.cfg.xml file:

    1. <property name="hibernate.cache.use_query_cache">true</property>
    2. <property name="hibernate.cache.provider_class">net.sf.hibernate.cache.HashtableCacheProvider</property>

Many Hibernate users stop at this step and think they are done.

Note: in fact, with only this much configured, Hibernate's second-level cache is not used at all. Moreover, because most of the time the Session is closed right after use, the first-level cache plays no role either. The result is that no cache is used at all and every Hibernate operation goes straight to the database. The performance can be imagined.

The correct approach is, in addition to the configuration above, to configure each VO object's specific cache policy in its mapping file. For example:

  1. <hibernate-mapping>
  2. <class name="com.sobey.sbm.model.entitySystem.vo.DataTypeVO" table="Dcm_datatype">
  3. <cache usage="read-write"/>
  4. <id name="id" column="TYPEID" type="java.lang.Long">
  5. <generator class="sequence"/>
  6. </id>
  7. <property name="name" column="name" type="java.lang.String"/>
  8. <property name="dbType" column="DbType" type="java.lang.String"/>
  9. </class>
  10. </hibernate-mapping>

The key is the <cache usage="read-write"/> element; the usage attribute has several options: read-only, read-write, transactional, and so on.

Then, when executing queries, note that for conditional queries, or queries returning all results, Session.find() does not read data from the cache; cached data is only used when the Query.iterate() method is called.

Both get() and load() read from the cache.
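A sketch of the access-path difference, against the Hibernate 2.x API used above; the HQL string and the id value are illustrative:

```java
// Always hits the database, never the second-level cache:
List all = session.find("from DataTypeVO");

// First run: 1 SQL query for the IDs, then per-ID lookups that populate
// the cache. Later runs: the per-ID lookups are served from the cache.
Iterator it = session.iterate("from DataTypeVO");

// Both consult the cache before touching the database:
DataTypeVO a = (DataTypeVO) session.get(DataTypeVO.class, new Long(1));
DataTypeVO b = (DataTypeVO) session.load(DataTypeVO.class, new Long(1));
```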

Different cache frameworks have their own configuration methods, but the general configuration is as above. (Support for transactional caches and cluster-enabled configurations I will try to cover in subsequent articles.)
