A cache sits between an application and a physical data source. Its purpose is to reduce the frequency with which the application accesses the physical data source, thereby improving application performance. The data in the cache is a copy of the data in the physical data source; at run time the application reads and writes data through the cache, and at particular moments or events the cache is synchronized with the physical data source.
The cache medium is usually memory, so reads and writes are very fast. However, if the amount of cached data is very large, the hard disk can also serve as the cache medium. A cache implementation must consider not only the storage medium but also how to manage concurrent access to the cache and the life cycle of the cached data.
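As a rough illustration of these ideas (not Hibernate-specific), here is a minimal Java sketch of a read-through cache sitting between application code and a data source; the loader function that stands in for the physical data source is a hypothetical placeholder.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Minimal read-through cache: cached values are copies of data-source records,
// reads consult the cache first, and evict() ends a copy's life cycle so the
// next read goes back to the data source.
public class ReadThroughCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for access to the physical data source

    public ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    public V get(K key) {
        // computeIfAbsent also takes care of concurrent access to the same key
        return store.computeIfAbsent(key, loader);
    }

    public void evict(K key) {
        store.remove(key);
    }
}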
Hibernate's caches include the Session cache and the SessionFactory cache, and the SessionFactory cache is further divided into two categories: the built-in cache and the external cache. The Session cache is built in and cannot be removed; it is also known as Hibernate's first-level cache. The SessionFactory's built-in cache is similar in implementation to the Session cache: the former is the data held in some collection properties of the SessionFactory object, and the latter is the data held in some collection properties of the Session object. The SessionFactory's built-in cache holds mapping metadata and predefined SQL statements; the mapping metadata is a copy of the data in the mapping files, and the predefined SQL statements are derived from the mapping metadata during Hibernate's initialization phase. The built-in cache is read-only: the application cannot modify the mapping metadata or the predefined SQL statements in it, so the SessionFactory never needs to synchronize the built-in cache with the mapping files. The SessionFactory's external cache is a configurable plug-in, and by default it is not enabled. The data in the external cache is a copy of database data, and its medium can be either memory or the hard disk. The SessionFactory's external cache is also known as Hibernate's second-level cache.
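A sketch of how the external cache plug-in might be enabled programmatically. The property names are Hibernate's documented settings; the region factory class shown here assumes the Ehcache provider and will differ for other providers and Hibernate versions.

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

public class SecondLevelCacheSetup {
    public static SessionFactory build() {
        Configuration cfg = new Configuration().configure(); // reads hibernate.cfg.xml
        // The external (second-level) cache is a plug-in and is disabled by default.
        cfg.setProperty("hibernate.cache.use_second_level_cache", "true");
        // Cache adapter for the chosen provider; Ehcache is assumed here.
        cfg.setProperty("hibernate.cache.region.factory_class",
                "org.hibernate.cache.ehcache.EhCacheRegionFactory");
        return cfg.buildSessionFactory();
    }
}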
Both levels of Hibernate's cache sit in the persistence layer and store copies of database data, so what is the difference between them? To understand it, you need to understand two characteristics of a persistence-layer cache: its scope and its concurrency access strategy.
The scope of the persistence-layer cache
The scope of a cache determines its life cycle and who can access it. Cache scope falls into three categories.
1. Transaction scope: the cache can only be accessed by the current transaction. Its life cycle depends on the life cycle of the transaction; when the transaction ends, the cache's life cycle ends too. In this scope the cache medium is memory. A transaction can be a database transaction or an application transaction; each transaction has its own cache, and the data in the cache is usually held in the form of interrelated objects.
2. Process scope: the cache is shared by all transactions within the process. Because these transactions may access the cache concurrently, the necessary transaction isolation mechanisms must be applied to it. The cache's life cycle depends on the life cycle of the process; when the process ends, the cache's life cycle ends too. A process-scoped cache may hold a large amount of data, so its medium can be either memory or the hard disk. The data in the cache can be held either as interrelated objects or as the objects' loose (disassembled) data. The loose data form is somewhat similar to serialized object data, but the algorithm for disassembling objects into loose data is required to be faster than object serialization.
3. Cluster scope: in a clustered environment, the cache is shared by processes on one machine or on multiple machines. The cached data is replicated to every process node in the cluster, and the copies are kept consistent through remote communication. The data in the cache is usually held in the objects' loose data form.
For most applications, whether a cluster-scoped cache is really needed should be considered carefully, because accessing it is not necessarily faster than accessing the database directly.
The persistence layer can provide caches of several scopes. If the requested data is not found in the transaction-scoped cache, it can be looked up in the process-scoped or cluster-scoped cache; if it is not found there either, it is read from the database. The transaction-scoped cache is the persistence layer's first-level cache and is usually mandatory; the process-scoped or cluster-scoped cache is the persistence layer's second-level cache and is usually optional.
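The lookup order can be pictured roughly as follows. This is an illustrative sketch only, not Hibernate's actual implementation; the database access is simulated by a loader function passed in by the caller.

import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative two-level lookup: first-level (transaction scope), then
// second-level (process scope), then the database.
public class TwoLevelLookup {
    private final Map<Serializable, Object> firstLevel = new HashMap<>();            // per transaction/Session
    private final Map<Serializable, Object> secondLevel = new ConcurrentHashMap<>(); // shared by the process
    private final Function<Serializable, Object> databaseLoader;                     // simulated database access

    public TwoLevelLookup(Function<Serializable, Object> databaseLoader) {
        this.databaseLoader = databaseLoader;
    }

    public Object load(Serializable id) {
        Object entity = firstLevel.get(id);        // 1. transaction-scoped cache
        if (entity == null) {
            entity = secondLevel.get(id);          // 2. process-scoped cache
        }
        if (entity == null) {
            entity = databaseLoader.apply(id);     // 3. database
            secondLevel.put(id, entity);
        }
        firstLevel.put(id, entity);
        return entity;
    }
}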
Concurrency access strategies for the persistence-layer cache
When multiple concurrent transactions access the same data in the persistence layer's cache, concurrency problems can occur, so the necessary transaction isolation measures must be taken.
Concurrency problems arise in the process-scoped or cluster-scoped cache, that is, in the second-level cache. The following four concurrency access strategies can therefore be set on it, each corresponding to a transaction isolation level.
Transactional: applicable only in a managed environment. It provides the repeatable read transaction isolation level. It can be used for data that is frequently read but rarely modified, because it prevents concurrency problems such as dirty reads and non-repeatable reads.
Read-write: provides the read committed transaction isolation level. It is applicable only in non-clustered environments. It can be used for data that is frequently read but rarely modified, because it prevents concurrency problems such as dirty reads.
Nonstrict read-write: does not guarantee consistency between the cache and the database. If two transactions may access the same cached data at the same time, a very short expiration time must be configured for that data to minimize dirty reads. This strategy can be used for data that is rarely modified and for which occasional dirty reads are tolerable.
Read-only: can be used for data that is never modified, such as reference data.
The transactional strategy provides the highest transaction isolation level, and read-only provides the lowest. The higher the isolation level, the lower the concurrency performance.
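With annotation-based mappings, the strategy is declared on the class (a sketch; the Country entity is a made-up example, in XML mappings the equivalent is the <cache usage="..."/> element, and newer Hibernate versions use the jakarta.persistence package instead of javax.persistence).

import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

// Reference-like data that is read often and modified rarely, so the
// read-write strategy (read committed isolation) is a reasonable fit.
// Other strategy values: READ_ONLY, NONSTRICT_READ_WRITE, TRANSACTIONAL.
@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
public class Country {

    @Id
    private Long id;

    private String name;
}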
What data is suitable for storage in the second-level cache?
1. Data that is rarely modified
2. Data that is not critically important, for which occasional concurrency problems are tolerable
3. Data that will not be accessed concurrently
4. Reference data
What data is not suitable for storage in the second-level cache?
1. Frequently modified data
2. Financial data, for which concurrency problems are absolutely not allowed
3. Data that is shared with other applications.
Hibernate's second-level cache
As mentioned earlier, Hibernate provides two levels of cache. The first level is the Session cache. Because the life cycle of a Session object usually corresponds to a database transaction or an application transaction, its cache is a transaction-scoped cache. The first-level cache is mandatory; it cannot be turned off and in fact cannot be removed. In the first-level cache, each instance of a persistent class has a unique OID.
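A sketch of the first-level cache at work, reusing the Country entity from the earlier sketch and assuming an already built SessionFactory. Repeating a get by the same OID within one Session issues only one SQL query and returns the same instance.

import org.hibernate.Session;
import org.hibernate.SessionFactory;

public class FirstLevelCacheDemo {
    public static void demo(SessionFactory sessionFactory) {
        Session session = sessionFactory.openSession();
        try {
            Country c1 = (Country) session.get(Country.class, 1L); // queries the database
            Country c2 = (Country) session.get(Country.class, 1L); // served from the Session cache, no SQL
            System.out.println(c1 == c2); // true: one OID maps to one instance within a Session
        } finally {
            session.close();
        }
    }
}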
The second-level cache is a pluggable cache plug-in managed by the SessionFactory. Because the life cycle of the SessionFactory object corresponds to the entire application process, the second-level cache is a process-scoped or cluster-scoped cache. The objects in this cache are stored in loose data form. Data in the second-level cache may be accessed concurrently, so an appropriate concurrency access strategy must be chosen for it; the strategy provides the transaction isolation level for the cached data. A cache adapter is used to integrate the specific cache implementation software with Hibernate. The second-level cache is optional, and it can be configured at per-class or per-collection granularity.
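A sketch of the per-class and per-collection granularity using annotations (the Department entity and its phoneExtensions collection are made-up examples; each gets its own cache region and strategy).

import java.util.HashSet;
import java.util.Set;
import javax.persistence.ElementCollection;
import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)     // per-class granularity
public class Department {

    @Id
    private Long id;

    @ElementCollection
    @Cache(usage = CacheConcurrencyStrategy.READ_WRITE)  // per-collection granularity
    private Set<String> phoneExtensions = new HashSet<>();
}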
The general process of Hibernate's second-level cache strategy is as follows (a code sketch follows the list):
1) When a conditional query is executed, Hibernate always issues an SQL statement of the form SELECT * FROM table_name WHERE ... (selecting all fields) to query the database and obtain all the data objects at once.
2) All the obtained data objects are put into the second-level cache, keyed by ID.
3) When Hibernate accesses a data object by ID, it first checks the Session cache; if a second-level cache is configured, it then checks the second-level cache; if the object is still not found there, it queries the database and puts the result into the cache, keyed by ID.
4) When data is deleted, updated, or inserted, the cache is updated at the same time.
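The following sketch illustrates steps 2) and 3): the first Session loads the object from the database and places it in the second-level cache, and a second Session then finds it there by ID without issuing SQL. It reuses the Country entity from the earlier sketches and assumes a SessionFactory built with the second-level cache enabled and hibernate.generate_statistics set to true.

import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.stat.Statistics;

public class SecondLevelCacheDemo {
    public static void demo(SessionFactory sessionFactory) {
        Session s1 = sessionFactory.openSession();
        s1.get(Country.class, 1L);   // misses both caches, queries the database
        s1.close();                  // the first-level cache disappears with the Session

        Session s2 = sessionFactory.openSession();
        s2.get(Country.class, 1L);   // found in the second-level cache by ID, no SQL issued
        s2.close();

        Statistics stats = sessionFactory.getStatistics();
        System.out.println("Second-level cache hits: " + stats.getSecondLevelCacheHitCount());
    }
}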
Hibernate's second-level cache strategy is a strategy for queries by ID, so it has no effect on conditional queries. For conditional queries, Hibernate provides a query cache.
The process of Hibernate's query cache strategy is as follows (see the sketch after the list):
1) Hibernate first builds a query key. The query key contains the general information of the conditional query request: the SQL, the parameters the SQL requires, the record range (starting position and maximum number of records), and so on.
2) Hibernate looks up the corresponding result list in the query cache by this query key. If it is present, the result list is returned; if not, Hibernate queries the database, obtains the result list, and puts the entire result list into the query cache, keyed by the query key.
3) The SQL in a query key involves certain tables; if data in any of these tables is modified, deleted, or inserted, the related query keys are evicted from the cache.
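A sketch of a cacheable conditional query, reusing the Country entity and assuming the query cache has been enabled with hibernate.cache.use_query_cache=true (it is off by default); the typed Query API shown here is from Hibernate 5.2 and later.

import java.util.List;
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.query.Query;

public class QueryCacheDemo {
    public static List<Country> loadByName(SessionFactory sessionFactory, String name) {
        Session session = sessionFactory.openSession();
        try {
            Query<Country> query = session.createQuery(
                    "from Country c where c.name = :name", Country.class);
            query.setParameter("name", name);
            query.setCacheable(true); // the HQL, its parameters, and the record range form the query key
            return query.list();      // the cached result list is reused until a relevant table changes
        } finally {
            session.close();
        }
    }
}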
Hibernate's first-level cache and second-level cache