Hibernate optimization solution

Developers who use Hibernate have probably run into performance problems: for the same functionality, a performance gap of ten times or more between Hibernate and plain JDBC is not unusual. If the problem is not addressed early, it may affect the progress of the entire project.

In general, Hibernate performance tuning focuses on the following areas:

* Database design adjustments

* HQL optimization

* Correct use of the APIs (for example, choosing different collection types and query APIs for different kinds of work)

* Main configuration parameters (logging, query cache, fetch_size, batch_size, and so on)

* Mapping file optimization (ID generation strategy, second-level cache, lazy loading, association optimization)

* First-level (session) cache management

* Second-level cache, which has many strategies of its own

* Transaction control strategy

1. Database Design

A) Reduce the complexity of associations

B) Avoid composite primary keys where possible

C) Choose an ID generation mechanism; different databases provide different mechanisms (see the sketch after this list)

D) Allow a reasonable amount of redundant data rather than pursuing high normal forms at all costs
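
As an illustration of point C), here is a minimal sketch of two common ID generation mappings using JPA annotations. The Customer and Invoice entities, the generator name, and the INVOICE_SEQ sequence are assumptions made for the example.

import javax.persistence.*;

@Entity
class Customer {
    // identity column, e.g. MySQL AUTO_INCREMENT or SQL Server IDENTITY
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;
}

@Entity
class Invoice {
    // database sequence, e.g. on Oracle
    @Id
    @GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "invoice_gen")
    @SequenceGenerator(name = "invoice_gen", sequenceName = "INVOICE_SEQ")
    private Long id;
}

An identity-style generator hands key generation to the database at insert time, while a sequence-style generator lets Hibernate obtain keys in advance, which also combines better with the JDBC batching discussed in section 3.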

2. HQL Optimization

Apart from its interaction with some of Hibernate's caching mechanisms, HQL optimization uses the same techniques as ordinary SQL optimization, so plenty of experience is available online.

  • Bulk delete/update, introduced in Hibernate 3, can greatly improve the flexibility and efficiency of batch data operations (a sketch follows below).
  • In the SELECT clause, list only the properties you actually need, so that only the required data is returned.
  • SQL statement optimization

See blog: http://blog.csdn.net/xdgofloadrunner/article/details/4131604
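
As a rough sketch of the first two points, assuming an open org.hibernate.Session named session and a hypothetical Account entity with the fields and parameters shown:

// bulk delete (Hibernate 3+): executed directly in the database,
// without loading the affected entities into the session
int deleted = session.createQuery(
        "delete from Account a where a.closed = :closed")
        .setParameter("closed", Boolean.TRUE)
        .executeUpdate();

// projection query: fetch only the properties you need
// instead of returning full entities
List rows = session.createQuery(
        "select a.id, a.balance from Account a where a.balance > :min")
        .setParameter("min", minBalance)
        .list();

Bulk statements operate directly on the database rather than on objects in the session, so any entities already loaded in memory may need to be refreshed afterwards.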

3. Main Configuration

A) Query cache: this is different from the caches described below. It caches the results of HQL statements, so that when an identical statement is executed again the cached data can be reused. However, in a transactional system (where data changes frequently and identical query conditions rarely recur) the query cache can be counterproductive: it consumes a lot of system resources while rarely being hit.
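
A minimal sketch of enabling the query cache for a single HQL query, assuming hibernate.cache.use_query_cache=true and a second-level cache provider are configured, and assuming a hypothetical Product entity and an open Session named session:

List products = session.createQuery(
        "from Product p where p.category = :cat")
        .setString("cat", "books")    // an identical statement with identical parameters can hit the cache
        .setCacheable(true)           // store and read this query's results in the query cache
        .list();

As noted above, if the underlying data changes often the cached result sets are invalidated constantly, so this usually pays off only for rarely changing reference data.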

B) fetch_size, which corresponds to the JDBC parameter of the same name. A larger value is not automatically better; it should be set according to the characteristics of your workload.

It sets the number of records retrieved from the database each time the JDBC Statement reads data.

C) batch_size, likewise.

Batch Size sets how many statements are grouped together for batched deletes, updates, and inserts, which is somewhat like setting a buffer size.

hibernate.jdbc.fetch_size=50
hibernate.jdbc.batch_size=25

These two options are very important! They seriously affect the CRUD performance of Hibernate.

(CRUD: Create, Read, Update, Delete)
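
For reference, a hedged sketch of setting these properties programmatically with org.hibernate.cfg.Configuration before building the SessionFactory; the same values can also live in hibernate.cfg.xml or hibernate.properties:

Configuration cfg = new Configuration().configure();     // reads hibernate.cfg.xml
cfg.setProperty("hibernate.jdbc.fetch_size", "50");
cfg.setProperty("hibernate.jdbc.batch_size", "25");
cfg.setProperty("hibernate.show_sql", "false");           // see D) below: no SQL logging in production
SessionFactory sessionFactory = cfg.buildSessionFactory();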

Fetch Size is the number of records retrieved from the database each time the JDBC Statement reads data.

For example, if a query returns 10,000 records, the Oracle JDBC driver does not fetch all 10,000 at once; it retrieves only Fetch Size records at a time, and once the result set has iterated through them, it fetches the next Fetch Size records from the database.

This greatly reduces unnecessary memory consumption. Naturally, the larger the Fetch Size, the fewer round trips to the database and the faster the reads; the smaller the Fetch Size, the more round trips and the slower the reads.

This is a bit like writing a file to disk through a buffer: each write goes into the buffer, and the buffer is flushed to disk only when it is full. The same principle applies here.

The default Fetch Size of the Oracle JDBC driver is 10, which is a very conservative setting. In my tests, a Fetch Size of 50 roughly doubled performance; at Fetch Size = 100 performance improved by a further 20% or so, and increasing it beyond that brought no significant gain.

Therefore, we recommend that you set the Fetch Size to 50 for Oracle.

However, not all databases support the Fetch Size feature. For example, MySQL does not.

MySQL behaves like the worst case described above: it always fetches all 10,000 records at once, and the memory consumption can be staggering. There is no good way around this.

As mentioned above, Batch Size controls how many statements are grouped together for batched deletes, updates, and inserts, somewhat like a buffer size for the SQL sent to the database.

The larger the Batch Size, the fewer times the batched operation has to send SQL to the database, and the faster it runs. In one test, deleting 10,000 records from an Oracle database through Hibernate took 25 seconds with Batch Size = 0 and only 5 seconds with Batch Size = 50.

That shows how much performance can be gained. Many people find that Hibernate is at least twice as fast as their own JDBC code; this is usually because Hibernate uses batched inserts while their hand-written JDBC does not.

In my experience, a Batch Size of around 30 is suitable for Oracle. A value of 50 is also fine and still improves performance slightly, but beyond 50 the gains are negligible and it only consumes more memory.

D) In production systems, remember to turn off SQL statement logging (hibernate.show_sql).

4. Cache

A) Database-level cache: this level of cache is the most efficient and the safest, but the management facilities differ from database to database. In Oracle, for example, you can specify that an entire table be cached when it is created.

B) SESSION cache: the cache is valid within one Hibernate Session. It is managed automatically by Hibernate and cannot be relied on for much tuning, but Hibernate does provide methods to clear it, which is important in bulk add/update operations. For example, if you insert 100,000 records in one session you may run into an OutOfMemoryError, in which case you need to clear the cache manually with Session.evict() and Session.clear().
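
A sketch of a bulk insert that keeps the session cache from growing without bound, assuming hibernate.jdbc.batch_size=25 is configured and a hypothetical Account entity and constructor:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();
for (int i = 0; i < 100000; i++) {
    Account account = new Account("user-" + i);   // hypothetical entity and constructor
    session.save(account);
    if (i % 25 == 0) {                            // same size as hibernate.jdbc.batch_size
        session.flush();                          // send the pending inserts to the database
        session.clear();                          // empty the first-level cache
    }
}
tx.commit();
session.close();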

C) Application cache: it is valid within one SessionFactory, so it is the main focus of optimization and there are many strategies to consider. Before putting data into this cache level, check the following prerequisites:

I. The data will not be modified by a third party (for example, is another application also modifying this data?)

II. The data is not too large

III. The data is not updated frequently (otherwise the cache may be counterproductive)

IV. The data is queried frequently

V. The data is not critical (for example, money or security-sensitive data)

In the mapping file the cache can be configured with several concurrency strategies: read-only (for static or historical data that rarely changes), nonstrict-read-write, read-write (the common choice, with average efficiency), and transactional (requires JTA, and few cache products support it).
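
As an illustration, a sketch of marking an entity cacheable with the read-only strategy using Hibernate annotations; the Country entity is an assumption, and with hbm.xml mappings the equivalent is a <cache usage="read-only"/> element inside the class mapping:

import javax.persistence.Entity;
import javax.persistence.Id;
import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_ONLY)   // static reference data that rarely changes
public class Country {
    @Id
    private String code;
    private String name;
}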

D) Distributed cache: configured the same way as in C), but the choice of cache product differs; there are not many options available for Hibernate, for example OSCache and JBoss Cache. Most current projects are conservative about using clusters (especially for critical transaction systems); in a cluster environment, the database-level cache is the safest.

5. Lazy Loading

A) Entity lazy loading: implemented through dynamic proxies

B) Collection (SET/LIST) lazy loading: Hibernate supports this with its own Set/List implementations

C) Lazy attribute loading: lazy loading of individual properties, which in Hibernate 3 requires bytecode instrumentation (a mapping sketch follows this list)
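
A sketch of a lazily loaded association using annotations; the Order and OrderLine entities are assumptions made for the example (with hbm.xml mappings the equivalent is lazy="true" on the collection):

import java.util.HashSet;
import java.util.Set;
import javax.persistence.*;

@Entity
@Table(name = "ORDERS")
class Order {
    @Id
    @GeneratedValue
    private Long id;

    // the collection wrapper defers the SQL for the line items
    // until the collection is actually accessed
    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private Set<OrderLine> lines = new HashSet<OrderLine>();
}

@Entity
class OrderLine {
    @Id
    @GeneratedValue
    private Long id;

    @ManyToOne(fetch = FetchType.LAZY)   // the many-to-one side can also be lazy (proxy-based)
    private Order order;
}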

6. Method Selection

A) For the same task, Hibernate often provides several options, and the choice can affect performance and code. For example, returning 100,000 records at once for processing (as a List/Set/Bag/Map) may exhaust memory, whereas a cursor-based result set (ScrollableResults) or an iterator avoids this (see the sketch below).
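
A sketch of cursor-based processing with ScrollableResults, assuming a hypothetical YouObject entity and an open Session named session:

ScrollableResults rs = session.createQuery("from YouObject")
        .scroll(ScrollMode.FORWARD_ONLY);   // forward-only cursor; rows are fetched as needed
int count = 0;
while (rs.next()) {
    YouObject obj = (YouObject) rs.get(0);
    // ... process obj ...
    if (++count % 100 == 0) {
        session.clear();                    // keep the first-level cache small
    }
}
rs.close();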

B) Session load()/get() methods (a short illustration follows the list below).

1. If no matching record is found, get() returns null, while load() throws ObjectNotFoundException.
2. load() can return a proxy instance of the entity, while get() always returns the entity class itself.
3. load() can make full use of the data in the second-level cache and the internal (session) cache, while get() only searches the internal cache; if nothing is found there, it skips the second-level cache and goes directly to the database with SQL.

See: http://blog.csdn.net/nuoyan666/article/details/4620505
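
A small illustration of points 1 and 2 above, assuming a hypothetical Account entity with a getBalance() accessor and an open Session named session:

// get(): hits the cache/database immediately and returns null when nothing matches
Account byGet = (Account) session.get(Account.class, 42L);
if (byGet == null) {
    // no row with id 42 exists
}

// load(): may return an uninitialized proxy without touching the database;
// the exception is thrown only when the proxy is first initialized
Account byLoad = (Account) session.load(Account.class, 42L);
byLoad.getBalance();   // throws ObjectNotFoundException if the row does not exist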

C) Query list() versus iterate(). If you study them carefully you will find many interesting differences. The main ones are (with Spring, the corresponding HibernateTemplate methods are find() and iterate()):

I. list() can only use the query cache (the HQL cache, which does not help much in a transactional system) and cannot read individual entities from the second-level cache, although the objects it returns are written to the second-level cache. On the other hand, it usually issues only a small number of SQL statements, in many cases just one (when no associations are involved).

II. iterate() can use the second-level cache. For a query it first fetches the IDs of all matching records from the database and then looks each one up in the cache; for records not found in the cache it issues another statement to load them from the database. It is therefore easy to see that if none of the matching records are cached, iterate() will issue N + 1 SQL statements (where N is the number of matching records).

III. Using iterate() together with the cache-management APIs can solve memory problems in large queries, for example:

Iterator it = session.createQuery("from YouObject").iterate();
while (it.hasNext()) {
    YouObject youObject = (YouObject) it.next();
    // ... process youObject ...
    // evict the object from the first-level cache, freeing the memory it occupies
    // and turning it from a persistent object into a detached one
    session.evict(youObject);
    // remove the object with this ID from the second-level cache and free its resources
    sessionFactory.evict(YouObject.class, youObject.getId());
}

If the list() method were used instead, an OutOfMemoryError might occur.

IV. Given the above, you should know when to use each of the two methods.

7. Collection Selection

A detailed discussion is given in "19.5. Understanding Collection performance" in the Hibernate 3.1 reference documentation.

See blog: http://iceside.iteye.com/blog/1047967

8. Transaction Control

The main impact of transactions on performance comes from the choice of transaction mechanism, the transaction isolation level, and the choice of locks.

A) Choice of transaction mechanism: if the transaction does not span multiple transaction managers, there is no need to use JTA; plain JDBC transaction control is sufficient.

B) Transaction isolation level: see the standard SQL transaction isolation levels.

C) Choice of locks: pessimistic locks (usually implemented by the underlying transaction manager) are safe but inefficient for long transactions. Optimistic locks (usually implemented at the application level) can be set up in Hibernate, for example, by defining a version field (see the sketch below). Obviously, if several applications modify the same data and they do not all use the same optimistic locking mechanism, the optimistic lock becomes ineffective, so different data calls for different strategies. As in many of the cases above, we are usually looking for a balance between efficiency and safety/accuracy; optimization is not a purely technical problem, and you need a good understanding of your application and its business characteristics.
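
A sketch of the version-field approach to optimistic locking, using annotations on a hypothetical Account entity:

import java.math.BigDecimal;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.Version;

@Entity
class Account {
    @Id
    @GeneratedValue
    private Long id;

    @Version               // Hibernate adds "where version = ?" to updates and throws
    private int version;   // StaleObjectStateException when a concurrent change is detected

    private BigDecimal balance;
}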

9. Batch Operations

Even with plain JDBC, updating a large amount of data with batching differs greatly in efficiency from doing it without batching. Setting batch_size enables Hibernate's support for batch operations.

For example, to delete the objects of a table in bulk, such as with a "delete account" statement, Hibernate first finds the IDs of all accounts and then deletes them one by one, mainly in order to keep the second-level cache consistent; this is clearly not efficient. Bulk delete/update was added in later versions, but it cannot solve the cache-maintenance problem. In other words, because of second-level cache maintenance, the efficiency of Hibernate's batch operations is still not satisfactory.
