Batch Processing in Hibernate Applications


1. Bulk Insert

During project development, we often need to insert large volumes of data into the database: hundreds of thousands, millions, or even tens of millions of rows. Inserting data at this scale through Hibernate can raise exceptions, the most common being OutOfMemoryError.

First, let's briefly review the mechanics of a Hibernate insert. Hibernate maintains an internal (session-level) cache, and every object we insert is placed into that cache to be managed.

Speaking of caching: Hibernate has both an internal (first-level) cache and a second-level cache, and it manages the two quite differently. We can configure the size of the second-level cache, but Hibernate takes a laissez-faire attitude toward the internal cache, placing no limit on its capacity. This is the crux of the problem: during a massive insert, every object we create is held in the internal cache, which lives in memory, so system memory is eaten up bit by bit until the system finally blows up. That outcome is entirely predictable.

Let's consider how to handle this better. Some projects are constrained to do this processing with Hibernate; others are more flexible and can look for alternatives.

I recommend two ways to do this:
(1) Optimize Hibernate: insert in segments and clear the cache promptly.
(2) Bypass the Hibernate API and perform the bulk insert directly through the JDBC API; this method gives the best performance and is the fastest.
For method 1, the basic idea is to optimize Hibernate by setting the hibernate.jdbc.batch_size parameter in the configuration file to specify the number of SQL statements submitted per batch, and to have the program insert in segments, clearing the cache in good time (the Session implements asynchronous write-behind, which allows Hibernate to batch up write operations explicitly). That is, every time a certain amount of data has been inserted, remove those objects from the internal cache promptly to free the memory they occupy.

The reason for configuring the hibernate.jdbc.batch_size parameter is to make as few round trips to the database as possible: the larger the value, the fewer round trips are required, and the faster the insert runs.
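As a minimal sketch of where this parameter lives (assuming a programmatic org.hibernate.cfg.Configuration; the same property can equally be placed in hibernate.cfg.xml or hibernate.properties):

    import org.hibernate.SessionFactory;
    import org.hibernate.cfg.Configuration;

    // Sketch: set the JDBC batch size before building the SessionFactory.
    // The value 20 is illustrative and matches the flush interval used in
    // the example below; it is not a universal recommendation.
    Configuration cfg = new Configuration().configure();
    cfg.setProperty("hibernate.jdbc.batch_size", "20");
    SessionFactory sessionFactory = cfg.buildSessionFactory();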
 
For the program implementation, take inserting 10,000 rows as an example:

    Session session = HibernateUtil.currentSession();
    Transaction tx = session.beginTransaction();
    for (int i = 0; i < 10000; i++) {
        Student st = new Student();
        st.setName("Feifei");
        session.save(st);
        if (i % 20 == 0) { // treat every 20 rows as one unit, matching the JDBC batch size
            session.flush(); // keep the database in sync with the cached state
            session.clear(); // clear all data from the internal cache, freeing memory in time
        }
    }
    tx.commit();

At this data scale, this approach keeps the system's memory usage within a relatively stable range.

Note: the second-level cache mentioned earlier deserves a word here. If the second-level cache is enabled, Hibernate, by its own mechanics, keeps it populated: inserts, updates, and deletes all push the corresponding data into the second-level cache, which costs noticeable performance. I therefore recommend disabling the second-level cache in batch-processing scenarios.
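As a sketch of what that looks like (two standard options: the global configuration property, or the per-session cache mode; cfg and session are the variables from the surrounding examples):

    // Option 1: disable the second-level cache globally in the configuration.
    cfg.setProperty("hibernate.cache.use_second_level_cache", "false");

    // Option 2: leave it configured, but have this particular session
    // ignore it for the duration of the batch job.
    session.setCacheMode(org.hibernate.CacheMode.IGNORE);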

For method 2, use traditional JDBC batching, handled through the JDBC API:

    Connection conn = db.getConnection();
    PreparedStatement stmt = conn.prepareStatement("insert into t_student (name) values (?)");
    for (int j = 0; j < 200; j++) {
        for (int i = 0; i < 50; i++) {
            stmt.setString(1, "Feifei");
            stmt.addBatch();
        }
        stmt.executeBatch();
        conn.commit();
    }

Look at the code above: don't you feel something is off? Right, there is no Hibernate in it at all; this is traditional JDBC programming, without a trace of Hibernate flavor.
You can modify the code as follows:
    Transaction tx = session.beginTransaction(); // use Hibernate's transaction boundary
    Connection conn = session.connection();
    PreparedStatement stmt = conn.prepareStatement("insert into t_student (name) values (?)");
    for (int j = 0; j < 200; j++) {
        for (int i = 0; i < 50; i++) {
            stmt.setString(1, "Feifei");
            stmt.addBatch();
        }
        stmt.executeBatch();
    }
    tx.commit(); // use Hibernate's transaction boundary

This change gives the code a Hibernate flavor. In my tests, batch processing through the JDBC API performed nearly ten times better than through the Hibernate API; JDBC's performance here is undoubtedly superior.

2. Batch Update and Delete

In Hibernate 2, a bulk update works by first loading the data that meets the criteria and then updating it. Bulk deletion works the same way: load the qualifying data first, then delete it.
This has two major drawbacks:
(1) It consumes a great deal of memory.
(2) With large data volumes, the number of update/delete statements executed is massive, and each statement operates on only one object, so the database is hit constantly and performance is predictably poor.
Hibernate 3 introduced bulk update/delete, which performs a batch update or delete with a single HQL statement, much like JDBC's batch update/delete. In performance terms, this is a significant improvement over Hibernate 2.

Here is the reference code, which removes all the data from the T_STUDENT table:

    Transaction tx = session.beginTransaction();
    String hql = "delete Student";
    Query query = session.createQuery(hql);
    int size = query.executeUpdate();
    tx.commit();


The console output shows a single delete statement, Hibernate: delete from T_STUDENT. Few statements are executed, performance is close to that of raw JDBC, and it is a good way to improve performance. That said, for the best performance I still recommend doing bulk updates and deletes through JDBC; the method and the underlying ideas are essentially the same as in bulk-insert method 2 above, so I won't repeat them here beyond the brief sketch that follows.
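For completeness, a minimal sketch of such a JDBC batch update in the same style as method 2 (t_student is the table used earlier; idsToRename is a hypothetical List<Long> of the rows to change):

    Transaction tx = session.beginTransaction(); // Hibernate transaction boundary, as before
    Connection conn = session.connection();
    PreparedStatement stmt = conn.prepareStatement("update t_student set name = ? where id = ?");
    for (Long id : idsToRename) {
        stmt.setString(1, "Feifei");
        stmt.setLong(2, id);
        stmt.addBatch();
    }
    stmt.executeBatch();
    tx.commit();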

Returning to HQL bulk operations, here is an update example:

    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();
    String hqlUpdate = "update Customer set name = :newName where name = :oldName";
    int updatedEntities = session.createQuery(hqlUpdate)
            .setString("newName", newName)
            .setString("oldName", oldName)
            .executeUpdate();
    tx.commit();
    session.close();

To execute an HQL DELETE, use the same Query.executeUpdate() method (a method that will feel familiar to anyone who knows JDBC's PreparedStatement.executeUpdate()):
    Session session = sessionFactory.openSession();
    Transaction tx = session.beginTransaction();
    String hqlDelete = "delete Customer where name = :oldName";
    int deletedEntities = session.createQuery(hqlDelete)
            .setString("oldName", oldName)
            .executeUpdate();
    tx.commit();
    session.close();

Another way to improve performance is to approach it from the database side and call a stored procedure from the Hibernate side. Stored procedures run on the database server, which makes them faster. Taking batch update as the example, here is the reference code.
First, create a stored procedure named batchUpdateStudent on the database side:

    create or replace procedure batchUpdateStudent(a in number) as
    begin
        update STUDENT set age = age + 1 where age > a;
    end;
The calling code is as follows:

    Transaction tx = session.beginTransaction();
    Connection conn = session.connection();
    String pd = "{call batchUpdateStudent(?)}";
    CallableStatement cstmt = conn.prepareCall(pd);
    cstmt.setInt(1, 20); // set the age parameter to 20
    cstmt.execute();
    tx.commit();

Notice that this code also bypasses the Hibernate API and invokes the stored procedure through the JDBC API, while still using Hibernate's transaction boundary. Stored procedures are undoubtedly a good way to improve batch performance: they run directly on the database side and, to some extent, shift the batch-processing load onto the database.


The HibernateTemplate class (from Spring) has a saveOrUpdateAll(Collection c) method, which should also be able to process data in batches.
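A sketch of how that might look in a DAO extending Spring's HibernateDaoSupport (the method name addAll is hypothetical; saveOrUpdateAll simply calls saveOrUpdate on each element, so every entity still lands in the session cache, and very large lists may still need the flush/clear pattern shown below):

    public void addAll(List<TBuildPipelineHistArea> list) {
        // One call saves or updates every element of the collection.
        getHibernateTemplate().saveOrUpdateAll(list);
    }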

The method I used before:

    // bulk insert
    public void addPatch(List<TBuildPipelineHistArea> list) {
        Session session = getHibernateTemplate().getSessionFactory().getCurrentSession();
        for (int i = 0; i < list.size(); i++) {
            session.save(list.get(i));
            // treat every 50 rows as one processing unit
            if (i % 50 == 0) {
                // flush only the data held in the Hibernate cache to the database,
                // keeping it in sync with the database
                session.flush();
                // clear all data from the internal cache, freeing memory
                session.clear();
            }
        }
    }

