Hibernate processing batch updates and bulk deletions

9.4 Batch processing data
Typically, a Session object's cache holds only a limited number of persistent objects, and the Session is closed once it finishes processing its transaction, so the memory occupied by the session cache is released promptly.
Batch processing means handling a large amount of data within a single transaction. The following program updates, in one transaction, the age field of every record in the CUSTOMERS table whose age is greater than 0:
Transaction tx = session.beginTransaction();

Iterator customers =
    session.createQuery("from Customer c where c.age > 0").list().iterator();
while (customers.hasNext()) {
    Customer customer = (Customer) customers.next();
    customer.setAge(customer.getAge() + 1);
}

tx.commit();
session.close();

If the CUSTOMERS table contains 10,000 records with age greater than 0, Hibernate loads 10,000 Customer objects into memory at once. When the tx.commit() method executes, the cache is flushed and Hibernate issues 10,000 UPDATE statements against the CUSTOMERS table:

update CUSTOMERS set AGE=? ... where ID=i;
update CUSTOMERS set AGE=? ... where ID=j;
......
update CUSTOMERS set AGE=? ... where ID=k;

This approach to batch updating has two drawbacks:
It consumes a large amount of memory: all 10,000 Customer objects must be loaded into memory before they can be updated.
Too many UPDATE statements are executed: each UPDATE statement updates only one Customer object, so 10,000 statements must be sent to update 10,000 objects. Such frequent database access can greatly reduce the application's performance.
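As a back-of-envelope illustration of the second drawback, the following plain-Java sketch (no Hibernate involved) compares the number of statement round trips needed when each UPDATE is sent individually versus when statements are grouped into batches; treating one batch as one round trip is a simplifying assumption, and the batch size of 20 is just an example figure:

```java
public class RoundTripMath {

    // Round trips needed to send `rows` statements in groups of `batchSize`,
    // assuming one round trip per (possibly partial) batch.
    static int roundTrips(int rows, int batchSize) {
        return (rows + batchSize - 1) / batchSize;  // ceiling division
    }

    public static void main(String[] args) {
        System.out.println("unbatched: " + roundTrips(10000, 1));   // one trip per row
        System.out.println("batched:   " + roundTrips(10000, 20));  // one trip per 20 rows
    }
}
```

Under these assumptions, batching 20 statements per round trip cuts 10,000 trips down to 500.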

Generally speaking, you should avoid bulk operations at the application tier whenever possible and instead perform them directly at the database tier, for example by executing SQL bulk update or delete statements in the database. If the logic of the bulk operation is complex, you can perform it by running a stored procedure in the database.
Not every database system supports stored procedures. At the time of writing, for example, MySQL did not support them, so stored procedures could not be used there for bulk updates or deletions.
Of course, bulk operations can also be performed at the application layer, mainly in the following ways:
(1) Bulk operations through the Session.
(2) Bulk operations through a StatelessSession.
(3) Bulk operations through HQL.
(4) Bulk operations directly through the JDBC API.

9.4.1 Bulk operations through the Session

The Session's save() and update() methods store the objects they process in its cache. If you process a large number of persistent objects through one Session object, you should evict objects that have already been processed and will not be accessed again. The practice is to call the flush() method immediately after processing an object or a small batch of objects, and then call the clear() method to empty the cache.

Bulk operations through the Session are subject to the following constraints:
(1) The JDBC batch size needs to be set in the Hibernate configuration file; a reasonable value is usually between 10 and 50, for example:

hibernate.jdbc.batch_size=20

In the bulk operations described in this section, you should make the number of SQL statements sent to the database per batch consistent with this batch_size property.
(2) If the object uses the "identity" identifier generator, Hibernate cannot perform bulk inserts at the JDBC layer.
(3) During bulk operations, it is recommended to turn off Hibernate's second-level cache. (This book's companion volume, "Proficient in Hibernate: Advanced Topics", describes the second-level cache in detail.) The Session cache is Hibernate's first-level cache, which is normally transaction-scoped: each transaction has its own first-level cache. The cache external to the SessionFactory is Hibernate's second-level cache, which is application-scoped: all transactions share the same second-level cache. The first-level cache is always enabled; the second-level cache is off by default and can be explicitly disabled in the Hibernate configuration file with:

hibernate.cache.use_second_level_cache=false
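For reference, the same two settings can also be expressed in hibernate.cfg.xml rather than a hibernate.properties file; the following fragment is a minimal sketch, with the surrounding session-factory configuration omitted:

```xml
<!-- Inside the <session-factory> element of hibernate.cfg.xml;
     the property names match those shown above. -->
<property name="hibernate.jdbc.batch_size">20</property>
<property name="hibernate.cache.use_second_level_cache">false</property>
```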

1. Bulk inserting data
The following code inserts 100,000 Customer records into the database, 20 records per batch:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

for (int i = 0; i < 100000; i++) {
    Customer customer = new Customer(...);
    session.save(customer);
    if (i % 20 == 0) {    // the batch size is 20
        session.flush();  // flush the cache, executing a batch of 20 INSERT statements
        session.clear();  // evict the Customer objects from the cache
    }
}

tx.commit();
session.close();

In the program above, each call to session.flush() inserts a batch of 20 records into the database; the subsequent session.clear() call then evicts the 20 newly saved Customer objects from the cache.
To ensure the program runs correctly, the following constraints must be observed:
In the Hibernate configuration file, the hibernate.jdbc.batch_size property should also be set to 20.
Turn off the second-level cache. If a second-level cache were used, every Customer object created in the first-level cache (that is, the Session cache) would be copied into the second-level cache before being saved to the database, causing a great deal of unnecessary overhead.
The Customer object's identifier generator cannot be "identity".
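To make the memory behavior of the flush-and-clear loop concrete, the following Hibernate-free sketch simulates it: simple counters stand in for save(), flush(), and clear(), and no real persistence happens. It counts how many flushes occur and how many objects the simulated session cache holds at its peak:

```java
public class BatchFlushDemo {

    // Returns {flushCount, peakCacheSize, objectsLeftForCommit}.
    static int[] simulate(int total, int batchSize) {
        int pending = 0;  // objects currently in the simulated session cache
        int flushes = 0;
        int peak = 0;
        for (int i = 0; i < total; i++) {
            pending++;                       // stands in for session.save(customer)
            if (pending > peak) peak = pending;
            if (i % batchSize == 0) {        // same condition as the loop above
                flushes++;                   // stands in for session.flush()
                pending = 0;                 // stands in for session.clear()
            }
        }
        return new int[] { flushes, peak, pending };
    }

    public static void main(String[] args) {
        int[] r = simulate(100000, 20);
        System.out.println("flushes=" + r[0]
                + ", peak cache size=" + r[1]
                + ", objects left for tx.commit()=" + r[2]);
    }
}
```

The cache never holds more than 20 objects at a time, which is the point of the pattern. Note a quirk of the condition i % 20 == 0: the very first flush happens after a single save, and the last 19 objects are flushed only when tx.commit() runs.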

2. Bulk updating data

When doing a bulk update, it is clearly undesirable to load all the objects into the Session's cache and then update them there one by one. To solve this problem, you can use a scrollable result set, org.hibernate.ScrollableResults; Query's scroll() method returns a ScrollableResults object. The following code demonstrates a bulk update of Customer objects, beginning by loading them through a ScrollableResults object:
Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

ScrollableResults customers = session.createQuery("from Customer")
        .scroll(ScrollMode.FORWARD_ONLY);
int count = 0;
while (customers.next()) {
    Customer customer = (Customer) customers.get(0);
    customer.setAge(customer.getAge() + 1);  // update the age property of the Customer object
    if (++count % 20 == 0) {  // the batch size is 20
        session.flush();      // flush the cache, executing a batch of 20 UPDATE statements
        session.clear();      // evict the Customer objects from the cache
    }
}

tx.commit();
session.close();

In the code above, Query's scroll() method returns a ScrollableResults object that does not actually contain any Customer objects; it holds only a cursor used to locate CUSTOMERS records in the database. A Customer object is not loaded from the database until the program navigates to the corresponding element of the ScrollableResults object.
To ensure the program runs correctly, the following constraints must be observed:
In the Hibernate configuration file, the hibernate.jdbc.batch_size property should also be set to 20.
Turn off the second-level cache. If the second-level cache has already been enabled in the configuration file, you can still bypass it in your program as follows:

ScrollableResults customers = session.createQuery("from Customer")
        .setCacheMode(CacheMode.IGNORE)  // ignore the second-level cache
        .scroll(ScrollMode.FORWARD_ONLY);

9.4.2 Bulk operations through a StatelessSession

The Session maintains a cache to keep objects in memory synchronized with the corresponding data in the database; objects in the Session cache are persistent objects. In bulk operations, however, keeping a large number of objects in the Session cache consumes a lot of memory. As an alternative, a stateless session, StatelessSession, can be used for bulk operations.
The following code uses a StatelessSession for a bulk update:

StatelessSession session = sessionFactory.openStatelessSession();
Transaction tx = session.beginTransaction();

ScrollableResults customers = session.getNamedQuery("GetCustomers")
        .scroll(ScrollMode.FORWARD_ONLY);
while (customers.next()) {
    Customer customer = (Customer) customers.get(0);
    customer.setAge(customer.getAge() + 1);  // update the age property of the Customer object in memory
    session.update(customer);  // immediately execute an UPDATE statement for the corresponding CUSTOMERS record
}

tx.commit();
session.close();

On the surface, a StatelessSession is used much like a Session. Compared to the Session, the StatelessSession differs in the following ways:
(1) A StatelessSession has no cache; objects loaded, saved, or updated through a StatelessSession are in the detached state.
(2) A StatelessSession does not interact with Hibernate's second-level cache.
(3) When the StatelessSession's save(), update(), or delete() method is invoked, it executes the corresponding SQL statement immediately, rather than merely scheduling one for later execution.
(4) A StatelessSession does not automatically perform dirty checking on loaded objects. So in the program above, after modifying the Customer object's properties in memory, you must explicitly update the corresponding data in the database through the StatelessSession's update() method.
(5) A StatelessSession does not perform any cascading operations on associated objects. For example, when you save a Customer object through a StatelessSession, the Order objects associated with it are not saved in cascade.
(6) Operations performed by a StatelessSession can be intercepted by an Interceptor, but they are ignored by Hibernate's event-handling system.
(7) Loading the Customer object with OID 1 twice through the same StatelessSession object yields two Customer objects at different memory addresses, for example:

StatelessSession session = sessionFactory.openStatelessSession();
Customer c1 = (Customer) session.get(Customer.class, new Long(1));
Customer c2 = (Customer) session.get(Customer.class, new Long(1));
System.out.println(c1 == c2);  // prints false

9.4.3 Bulk operations through HQL

HQL (the Hibernate Query Language, in Hibernate 3) can not only retrieve data but also perform bulk updates, deletes, and inserts. These bulk operations are executed directly in the database; the affected data is never stored in the Session's cache and therefore occupies no memory.
The Query.executeUpdate() method is similar to PreparedStatement.executeUpdate() in the JDBC API: it executes an HQL update, delete, or insert statement, which in turn executes the corresponding SQL update, delete, or insert statement.

1. Bulk updating data
The following code demonstrates bulk updating of Customer objects via HQL:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

String hqlUpdate =
    "update Customer c set c.name = :newName where c.name = :oldName";
int updatedEntities = session.createQuery(hqlUpdate)
        .setString("newName", "Mike")
        .setString("oldName", "Tom")
        .executeUpdate();

tx.commit();
session.close();

The SQL statement this code sends to the database is:

update CUSTOMERS set NAME='Mike' where NAME='Tom'

2. Bulk deleting data
The Session's delete() method can delete only one object at a time and is not suitable for bulk deletes. The following code demonstrates bulk deletion of Customer objects via HQL:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

String hqlDelete = "delete Customer c where c.name = :oldName";
int deletedEntities = session.createQuery(hqlDelete)
        .setString("oldName", "Tom")
        .executeUpdate();

tx.commit();
session.close();

The SQL statement this code submits to the database is:

delete from CUSTOMERS where NAME='Tom'

3. Bulk inserting data
The HQL syntax for inserting data is:

insert into EntityName properties_list select_statement

Here EntityName is the name of a persistent class, properties_list is a list of the persistent class's properties, and select_statement is a subquery.
HQL supports only the "insert into ... select ..." form; it does not support insert statements of the "insert into ... values ..." form.
The following example shows how to bulk insert data via HQL. Assume the DelinquentAccount and Customer classes both have id and name properties, that the tables corresponding to these two classes are DELINQUENT_ACCOUNTS and CUSTOMERS respectively, and that DelinquentAccount.hbm.xml and Customer.hbm.xml are their mapping files. The following code copies data from the CUSTOMERS table into the DELINQUENT_ACCOUNTS table:

Session session = sessionFactory.openSession();
Transaction tx = session.beginTransaction();

String hqlInsert = "insert into DelinquentAccount (id, name) select c.id, c.name from Customer c where c.id > 1";
int createdEntities = session.createQuery(hqlInsert)
        .executeUpdate();
tx.commit();
session.close();

The SQL statement this code submits to the database is:

insert into DELINQUENT_ACCOUNTS (ID, NAME) select ID, NAME from CUSTOMERS where ID>1
9.4.4 Bulk operations directly through the JDBC API

When SQL insert, update, and delete statements are executed through the JDBC API, the data involved is not loaded into memory and therefore occupies no memory space.
The following program executes an SQL bulk update statement directly through the JDBC API:

Transaction tx = session.beginTransaction();
// Obtain the database connection used by this Session
java.sql.Connection con = session.connection();
// Execute an SQL bulk update statement through the JDBC API
PreparedStatement stmt = con.prepareStatement("update CUSTOMERS set AGE=AGE+1 "
        + "where AGE>0");
stmt.executeUpdate();

tx.commit();

The program above obtains the database connection used by the Session through the Session's connection() method, then creates a PreparedStatement object and executes the SQL statement through it. Note that the application still declares the transaction boundaries through Hibernate's Transaction interface.
Note also that in Hibernate 3, although the Session's connection() method still exists, it has been deprecated and its use is discouraged. Hibernate 3 provides an alternative: the org.hibernate.jdbc.Work interface represents an operation that accesses the database directly through the JDBC API, and the Work interface's execute() method performs that operation:

public interface Work {
    // Access the database directly through the JDBC API
    public void execute(Connection connection) throws SQLException;
}

The Session's doWork(Work work) method performs the operation specified by a Work object, that is, it invokes the Work object's execute() method, passing it the database connection currently in use.

The following program demonstrates performing a bulk operation through the Work interface and the Session's doWork() method:

Transaction tx = session.beginTransaction();
// Define an anonymous class that implements the Work interface
Work work = new Work() {
    public void execute(Connection connection) throws SQLException {
        // Execute an SQL bulk update statement through the JDBC API
        PreparedStatement stmt = connection
                .prepareStatement("update CUSTOMERS set AGE=AGE+1 "
                        + "where AGE>0");
        stmt.executeUpdate();
    }
};

// Execute the work
session.doWork(work);
tx.commit();

When SQL statements are executed through the JDBC API's PreparedStatement interface, the data involved is not loaded into the Session's cache and therefore occupies no memory space.
