Copyright notice: this is an original article by the blogger; please credit the source when reproducing it.
Recently a colleague ran into trouble importing a large amount of data, so I looked into it. Large-scale data import has two main problems: memory overflow and a slow import rate.
Memory overflow: reading every row of the file into a collection causes a Java OutOfMemoryError once the data set is large enough. Raising the JVM's maximum heap size (-Xmx) relieves the symptom, but it is not a real fix.
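Raising the heap means passing the -Xmx flag when launching the importer; a minimal example (the jar name here is hypothetical, and 4g is just an illustrative value):

```shell
java -Xms512m -Xmx4g -jar csv-importer.jar
```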
MySQL supports inserting multiple records with a single SQL statement, which is far more efficient than one INSERT per row. However, MySQL also limits the size of the packets it will accept: a statement that inserts too many rows at once exceeds that limit and is rejected by the server. This can be worked around by raising MySQL's max_allowed_packet setting, but again it is not a real fix.
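For reference, the packet limit can be inspected and raised at runtime (64 MB here is just an illustrative value, not a recommendation):

```sql
-- Check the current limit
SHOW VARIABLES LIKE 'max_allowed_packet';

-- Raise it for the running server (applies to new connections; lost on restart)
SET GLOBAL max_allowed_packet = 64 * 1024 * 1024;
```

To make the change persistent, set `max_allowed_packet = 64M` under the `[mysqld]` section of my.cnf instead.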
Slow import: single-row inserts are off the table, so insert multiple records per SQL statement.
As noted above, the total size of each such statement must also stay under the max_allowed_packet limit.
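One way to honor that limit automatically is to cap each batch by its estimated byte size rather than a fixed row count. A minimal sketch (the class name, method, and the 16-byte per-row overhead for quoting, commas, and parentheses are my own assumptions, not from this article):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/**
 * Split rows into batches whose estimated SQL payload stays under a byte
 * budget, e.g. a safe fraction of max_allowed_packet.
 */
public class BatchSizer {

    public static List<List<String>> split(List<String> rows, int maxBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int bytes = 0;
        for (String row : rows) {
            // Value bytes plus a rough 16-byte guess for per-row SQL overhead
            int rowBytes = row.getBytes(StandardCharsets.UTF_8).length + 16;
            if (!current.isEmpty() && bytes + rowBytes > maxBytes) {
                batches.add(current);       // adding this row would exceed the budget: flush
                current = new ArrayList<>();
                bytes = 0;
            }
            current.add(row);
            bytes += rowBytes;
        }
        if (!current.isEmpty()) {
            batches.add(current);           // flush the remainder
        }
        return batches;
    }
}
```

Each resulting sub-list can then be turned into one multi-row INSERT that is guaranteed to stay under the configured budget.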
Below is a rate comparison of the two bulk-insert approaches, PreparedStatement batching versus directly concatenated SQL, inserting 10,000 rows at a time:
```java
package org.javaio.csv;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.text.SimpleDateFormat;
import java.util.Date;

/**
 * Bulk-import a large CSV file into MySQL.
 */
public class Test {

    private static final String URL = "jdbc:mysql://localhost:3306/demo_test?useSSL=true&characterEncoding=utf8";
    private static final String NAME = "root";
    private static final String PWD = "20121221";

    private static Connection conn;
    private static PreparedStatement ps;

    public static void main(String[] args) throws Exception {
        Test test = new Test();
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

        // psBatch timing -- start
        String startTime = sdf.format(new Date());
        System.out.println("psBatch start time: " + startTime);
        System.out.println("psBatch running ...");
        int idx = test.psBatch();           // bulk insert via PreparedStatement
        System.out.println("psBatch done, inserted " + idx + " rows");
        String endTime = sdf.format(new Date());
        System.out.println("psBatch end time: " + endTime);
        System.out.println();

        // sqlBatch timing -- start
        startTime = sdf.format(new Date());
        System.out.println("sqlBatch start time: " + startTime);
        System.out.println("sqlBatch running ...");
        idx = test.sqlBatch();              // bulk insert via concatenated SQL
        System.out.println("sqlBatch done, inserted " + idx + " rows");
        endTime = sdf.format(new Date());
        System.out.println("sqlBatch end time: " + endTime);
    }

    /** Bulk insert with PreparedStatement batching. */
    private int psBatch() throws Exception {
        int idx = 0;                        // row counter
        try {
            // Read the CSV file
            FileInputStream fis = new FileInputStream("C:/Users/chen/Desktop/data/ceshi.csv");
            InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
            BufferedReader br = new BufferedReader(isr);
            String line;                    // one row of data
            String[] column;                // its columns

            conn = getConnection();
            conn.setAutoCommit(false);      // disable auto-commit

            String sql = "INSERT INTO test (name, `desc`, column1, column2, column3, column4) "
                       + "VALUES (?, ?, ?, ?, ?, ?)";
            ps = conn.prepareStatement(sql);

            while ((line = br.readLine()) != null) {    // read row by row
                idx++;
                column = line.split(",");
                ps.setString(1, column[0]);
                ps.setString(2, column.length >= 2 && column[1] != null ? column[1] : "");
                ps.setString(3, column.length >= 3 && column[2] != null ? column[2] : "");
                ps.setString(4, column.length >= 4 && column[3] != null ? column[3] : "");
                ps.setString(5, "type");
                ps.setString(6, "1");
                ps.addBatch();
                if (idx % 10000 == 0) {     // flush every 10,000 rows
                    ps.executeBatch();
                    conn.commit();
                    ps.clearBatch();
                }
            }
            if (idx % 10000 != 0) {         // flush the remainder
                ps.executeBatch();
                conn.commit();
                ps.clearBatch();
            }
        } catch (Exception e) {
            System.out.println("insert failed around row " + idx);
        } finally {
            try {
                if (ps != null) ps.close();     // close the statement
                if (conn != null) conn.close(); // close the connection
            } catch (Exception e2) {
                e2.printStackTrace();
            }
        }
        return idx;
    }

    /** Bulk insert with a concatenated multi-row INSERT statement. */
    private int sqlBatch() {
        int idx = 0;                        // row counter
        try {
            // Read the CSV file
            FileInputStream fis = new FileInputStream("C:/Users/chen/Desktop/data/ceshi.csv");
            InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
            BufferedReader br = new BufferedReader(isr);
            String line;                    // one row of data
            String[] column;                // its columns

            conn = getConnection();

            StringBuffer sql = new StringBuffer(
                "INSERT INTO test (name, `desc`, column1, column2, column3, column4) VALUES ");
            while ((line = br.readLine()) != null) {    // read row by row
                idx++;
                column = line.split(",");
                sql.append("('").append(column[0]).append("', '");
                sql.append(column.length >= 2 && column[1] != null ? column[1] : "").append("', '");
                sql.append(column.length >= 3 && column[2] != null ? column[2] : "").append("', '");
                sql.append(column.length >= 4 && column[3] != null ? column[3] : "").append("', '");
                sql.append("type', '1'),");
                if (idx % 10000 == 0) {     // execute every 10,000 rows
                    String executeSql = sql.substring(0, sql.lastIndexOf(","));
                    ps = conn.prepareStatement(executeSql);
                    ps.executeUpdate();
                    sql = new StringBuffer(
                        "INSERT INTO test (name, `desc`, column1, column2, column3, column4) VALUES ");
                }
            }
            if (idx % 10000 != 0) {         // execute the remainder
                String executeSql = sql.substring(0, sql.lastIndexOf(","));
                ps = conn.prepareStatement(executeSql);
                ps.executeUpdate();
            }
        } catch (Exception e) {
            System.out.println("insert failed around row " + idx);
        } finally {
            try {
                if (ps != null) ps.close();     // close the statement
                if (conn != null) conn.close(); // close the connection
            } catch (Exception e2) {
                e2.printStackTrace();
            }
        }
        return idx;
    }

    /** Open the database connection. */
    private Connection getConnection() throws Exception {
        Class.forName("com.mysql.jdbc.Driver");
        conn = DriverManager.getConnection(URL, NAME, PWD);
        return conn;
    }
}
```
Rate comparison: to exclude other effects, both imports were run against an empty table.
Bulk insert via concatenated SQL: about 3-4 minutes.
Bulk insert via PreparedStatement batching: about 10 minutes.
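A likely explanation for PreparedStatement batching losing to string concatenation here is the MySQL Connector/J default behavior: unless rewriteBatchedStatements=true is set, the driver sends each batched INSERT to the server as a separate statement. Enabling it makes the driver rewrite a batch into a multi-row INSERT, which in my experience usually makes PreparedStatement batching competitive or faster. Against this article's example connection string, that would look like:

```
jdbc:mysql://localhost:3306/demo_test?useSSL=true&characterEncoding=utf8&rewriteBatchedStatements=true
```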
Java bulk data import (MySQL)