This article is organized from the Hibernate reference documentation, a few books, and project experience. It only outlines the main points and ideas; for specific techniques, please discuss in the comments or look for more detailed, targeted material.
Anyone who uses Hibernate has probably run into performance problems. For the same functionality, a tenfold performance gap between Hibernate and plain JDBC is not unusual, and if tuning is not done early, it is likely to affect the overall progress of the project.
Basically, the main considerations for Hibernate performance tuning are as follows:
- Database design tuning
- HQL optimization
- Proper use of the API (e.g., choosing collection types and query APIs according to the type of business operation)
- Main configuration parameters (logging, query cache, fetch_size, batch_size, etc.)
- Mapping file optimization (ID generation strategy, second-level cache, lazy loading, association optimization)
- Management of the first-level cache
- The many strategies available for the second-level cache
- Transaction control strategy
1. Database design
a) Reduce the complexity of associations
b) Avoid composite primary keys where possible
c) ID generation mechanism: the mechanisms provided by different databases are not exactly the same
d) Allow appropriate data redundancy instead of over-pursuing high normal forms
2. HQL optimization
Setting aside HQL's ties to Hibernate's own caching mechanisms, HQL optimization techniques are the same as ordinary SQL optimization techniques, and material on them is easy to find on the web.
3. Main configuration
a) Query cache: this is not the same as the caches discussed below; it is a cache of HQL statement results, so that when exactly the same statement is executed again, the cached data can be reused. In a transactional system, however (where data changes frequently and the odds of repeating exactly the same query conditions are small), the query cache can be counterproductive: it consumes considerable system resources but rarely comes into play.
b) fetch_size: similar to the corresponding JDBC parameter; larger is not necessarily better, so set it according to the characteristics of the business.
c) batch_size: same as above.
d) In a production system, remember to turn off SQL statement logging.
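As a rough illustration, the parameters above might be set as follows in hibernate.cfg.xml (property names are from Hibernate 3.x; the values are placeholders to be tuned per application):

```xml
<hibernate-configuration>
  <session-factory>
    <!-- JDBC fetch size: rows fetched per database round trip -->
    <property name="hibernate.jdbc.fetch_size">50</property>
    <!-- JDBC batch size: statements grouped into one batch -->
    <property name="hibernate.jdbc.batch_size">25</property>
    <!-- Query cache: off by default; enable only if identical queries repeat -->
    <property name="hibernate.cache.use_query_cache">false</property>
    <!-- Turn SQL logging off in production -->
    <property name="hibernate.show_sql">false</property>
  </session-factory>
</hibernate-configuration>
```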
4. Cache
a) Database-level cache: this level of caching is the most efficient and safest, but manageability varies between databases; in Oracle, for example, you can specify at table-creation time that the entire table be kept in the cache.
b) Session-level cache: valid within a single Hibernate Session. This first-level cache is not very controllable (it is mostly managed automatically by Hibernate), but Hibernate does provide ways to clear it, which is useful in large-volume insert/update operations. For example, inserting 100,000 records in the usual way is likely to end in an OutOfMemoryError, so you may need to clear this cache manually with Session.evict() and Session.clear().
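The flush-and-clear pattern for bulk inserts described above can be sketched as follows. To keep the example self-contained and runnable, a minimal MiniSession interface stands in for org.hibernate.Session (an assumption of this sketch); with Hibernate you would call the same save()/flush()/clear() methods on a real Session inside a transaction.

```java
import java.util.List;

// Stand-in for org.hibernate.Session (hypothetical; real code would use
// the Hibernate Session, which offers the same three calls used here).
interface MiniSession {
    void save(Object entity);  // queue an insert in the first-level cache
    void flush();              // push pending inserts to the database
    void clear();              // evict everything from the first-level cache
}

class BulkInsert {
    static final int BATCH_SIZE = 25; // should match hibernate.jdbc.batch_size

    // Saves all entities, flushing and clearing every BATCH_SIZE records so
    // the first-level cache never holds more than one batch. Returns the
    // number of flushes performed.
    static int insertAll(MiniSession session, List<Object> entities) {
        int flushes = 0;
        for (int i = 0; i < entities.size(); i++) {
            session.save(entities.get(i));
            if ((i + 1) % BATCH_SIZE == 0) {
                session.flush(); // write this batch out
                session.clear(); // free memory held by the first-level cache
                flushes++;
            }
        }
        session.flush(); // flush the final, possibly partial, batch
        session.clear();
        return flushes + 1;
    }
}
```

Without the periodic clear(), every saved entity stays attached to the Session until commit, which is exactly what produces the OutOfMemoryError mentioned above.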
c) Application-level cache: valid within a SessionFactory, and therefore the priority for optimization; accordingly, it comes with a wider variety of strategies. Before putting data into this level of cache, some preconditions need to be considered:
I. The data will not be modified by a third party (for example, is another application also modifying it?)
II. The data is not too large
III. The data is not updated frequently (otherwise using the cache may backfire)
IV. The data will be queried frequently
V. The data is not critical data (money, security, etc.)
Several cache concurrency strategies can be configured in the mapping file: read-only (suitable for static or historical data that seldom changes; the most efficient), nonstrict-read-write, read-write (the most common form; moderate efficiency), and transactional (requires JTA, and few cache products support it).
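In a mapping file, the strategy is selected with a single <cache> element; a sketch follows (the class, table, and column names are placeholders):

```xml
<class name="com.example.Account" table="ACCOUNT">
  <!-- usage is one of: read-only | nonstrict-read-write | read-write | transactional -->
  <cache usage="read-write"/>
  <id name="id" column="ID">
    <generator class="native"/>
  </id>
  <property name="name" column="NAME"/>
</class>
```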
d) Distributed cache: configured the same way as c); only the choice of cache product differs, and there are currently not many options for Hibernate (OSCache, JBoss Cache). Most current projects take a conservative attitude toward using them in a cluster (especially in critical transactional systems). In a clustered environment, using only the database-level cache is safest.
5. Lazy loading
a) Entity lazy loading: implemented through dynamic proxies
b) Collection lazy loading: Hibernate provides this support through its own implementations of Set/List
c) Property lazy loading
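In Hibernate 3.x mapping files, entity and collection lazy loading are both controlled by the lazy attribute; a sketch (the class and collection names are placeholders):

```xml
<class name="com.example.Account" table="ACCOUNT" lazy="true">
  <id name="id" column="ID">
    <generator class="native"/>
  </id>
  <!-- the collection is loaded only when first accessed -->
  <set name="orders" lazy="true" inverse="true">
    <key column="ACCOUNT_ID"/>
    <one-to-many class="com.example.Order"/>
  </set>
</class>
```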
6. Method selection
a) For the same task, Hibernate offers several options, and the choice affects both performance and code. For example, returning 100,000 records at once (as a List/Set/Bag/Map, etc.) for processing is likely to cause memory problems, whereas a cursor-based result set (ScrollableResults) or an iterator does not have this issue.
b) Session.load() vs. Session.get(): the former can use the second-level cache, while the latter does not.
c) Query.list() vs. Query.iterate(): if you look at them carefully, you may find many interesting differences (with Spring, these correspond to HibernateTemplate's find and iterate methods); the two main ones are:
I. list() can only take advantage of the query cache (of little use in a transactional system) and cannot use individual entities already in the second-level cache, although the objects it retrieves are written to the second-level cache. On the other hand, it generally produces fewer SQL statements, in many cases (with no associations) just one.
II. iterate() can take advantage of the second-level cache. For a query, it first fetches the IDs of all matching records from the database, then looks each ID up in the cache, and builds statements to load from the database only the records the cache lacks. It is thus easy to see that if the cache contains no matching records, iterate() produces n+1 SQL statements (where n is the number of matching records).
III. Using iterate() together with the cache-management API can solve memory problems nicely in mass-data queries, for example:
while (it.hasNext()) {
    YouObject object = (YouObject) it.next();
    session.evict(object);
    sessionFactory.evict(YouObject.class, object.getId());
}
If you used the list() method instead, you would be likely to hit an OutOfMemoryError.
IV. With the above explanation, you should know how to choose between these two methods.
7. Choice of collections
Section "19.5. Understanding Collection Performance" of the Hibernate 3.1 documentation describes this in detail.
8. Transaction control
The main factors affecting performance are the choice of transaction mode, the transaction isolation level, and the choice of locks.
a) Transaction mode: if your transactions do not span multiple transaction managers, there is no need for JTA; JDBC transaction control is enough.
b) Transaction isolation level: see the standard SQL transaction isolation levels.
c) Lock selection: pessimistic locks (generally implemented by the underlying transaction manager) are inefficient for long transactions but safe. Optimistic locks (generally implemented at the application level), such as the version field that can be defined in Hibernate, will clearly fail if multiple applications manipulate the same data and they do not all use the same optimistic-locking mechanism. There should therefore be different strategies for different data; as in many other cases, we are striking a balance between efficiency and safety/accuracy.
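The version-field mechanism mentioned above can be sketched in plain Java. The Account class below is hypothetical; with Hibernate, the version column is declared in the mapping file and managed automatically, producing SQL of the form UPDATE ... SET version = version + 1 WHERE id = ? AND version = ?.

```java
// Minimal sketch of version-based optimistic locking (hypothetical class;
// Hibernate does this for you once a <version> property is mapped).
class Account {
    long balance;
    int version;

    // Returns true if the update won; false means another writer changed
    // the row first (the version no longer matches), i.e. a stale update.
    boolean updateBalance(long newBalance, int expectedVersion) {
        if (this.version != expectedVersion) {
            return false; // concurrent modification detected
        }
        this.balance = newBalance;
        this.version++; // every successful write bumps the version
        return true;
    }
}
```

In Hibernate, a failed version check surfaces as a StaleObjectStateException rather than a boolean, and the application decides whether to retry or report a conflict.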
9. Batch operations
Even with plain JDBC, there is a significant difference in efficiency between updating data with and without batching. We can set batch_size to make Hibernate support bulk operations.
For example, when bulk-deleting the objects in a table with "delete Account", you will find from the SQL log that Hibernate first fetches all the account IDs and then deletes them one by one; this is done mainly to maintain the second-level cache, so the efficiency is certainly not high. Bulk delete/update statements were added in later releases, but they do not solve the cache-maintenance problem. In other words, because of second-level cache maintenance, Hibernate's batch-operation efficiency leaves something to be desired.
As can be seen from many of the points above, we are often striking a balance between efficiency and safety/accuracy. In any case, optimization is not a purely technical problem: you should have a thorough understanding of your application and its business characteristics. In general, the optimization plan should be largely settled during the design phase; otherwise it may lead to unnecessary rework and project delays, and to complaints aimed at architects, project managers, and developers alike. We have little control over changes in user requirements, but technical and architectural risks are something we can identify early and prepare countermeasures for.
It is also worth noting that application-layer caching is only icing on the cake; never treat it as a lifeline. The fundamentals of the application (database design, algorithms, efficient operation statements, the choice of appropriate APIs, and so on) matter most.