Java Bulk Data Import (MySQL)

Source: Internet
Author: User
Tags: bulk insert, parse csv file, readline

Copyright notice: This article is the blogger's original work; please indicate the source when reproducing it.

A colleague recently ran into trouble importing a large amount of data, which got me interested as well. Large-scale data import has two main problems: memory overflow and a slow import rate.

Memory overflow: if all of the data in the file is read into a collection at once, too much data triggers a Java memory overflow. This can be mitigated by raising the JVM's maximum heap size (-Xmx),

but that is no cure-all.
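The alternative to loading everything into a collection is to stream the file and handle one row at a time, so only the current line is ever in memory. A minimal sketch (the reader source and row handling are assumptions for illustration, not the article's code):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/**
 * Sketch: read a CSV source line by line with BufferedReader instead of
 * loading the whole file into a collection; each row is processed and
 * then becomes eligible for garbage collection.
 */
public class StreamCsvDemo {

    public static int countRows(Reader source) throws IOException {
        BufferedReader br = new BufferedReader(source);
        int rows = 0;
        String line;
        while ((line = br.readLine()) != null) {
            String[] column = line.split(","); // process one row at a time
            rows++;                            // ... insert column[] here ...
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // A small in-memory stand-in for a CSV file
        System.out.println(countRows(new StringReader("a,1\nb,2\nc,3")));
    }
}
```

For a real file, replace the StringReader with an InputStreamReader over a FileInputStream, as the full listing below does.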

MySQL supports SQL statements that insert multiple records at once, which is far more efficient than inserting one row at a time. But MySQL also limits the size of the packet it will accept,

so inserting too many rows in one statement can overflow the packet. This can be addressed by raising MySQL's max_allowed_packet setting,

but that too is no cure-all.

Slow import rate: single-row insertion is out of the question, so instead insert multiple records with one SQL statement.

Following the point above, you must also ensure that the size of any one INSERT statement stays below the max_allowed_packet configuration.
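One way to enforce that limit is to split the rows into several multi-row INSERT statements, starting a new statement whenever the next VALUES tuple would push the current one past a budget chosen comfortably below max_allowed_packet. A sketch (table and column names are placeholders; character count is used as a stand-in for bytes, which is exact only for ASCII data):

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch: split VALUES tuples across multiple INSERT statements so each
 * statement stays under a size budget (chosen below max_allowed_packet).
 */
public class BatchSqlBuilder {

    private static final String HEAD = "INSERT INTO test (name, `desc`) VALUES ";

    public static List<String> build(List<String> valueTuples, int maxChars) {
        List<String> statements = new ArrayList<>();
        StringBuilder sql = new StringBuilder(HEAD);
        boolean hasTuple = false;
        for (String tuple : valueTuples) {
            // Would appending this tuple (plus a separating comma) overflow?
            if (hasTuple && sql.length() + tuple.length() + 1 > maxChars) {
                statements.add(sql.toString()); // flush the current statement
                sql = new StringBuilder(HEAD);
                hasTuple = false;
            }
            if (hasTuple) {
                sql.append(',');
            }
            sql.append(tuple);
            hasTuple = true;
        }
        if (hasTuple) {
            statements.add(sql.toString());
        }
        return statements;
    }

    public static void main(String[] args) {
        List<String> tuples = List.of("('a','x')", "('b','y')", "('c','z')");
        for (String s : build(tuples, 60)) {
            System.out.println(s);
        }
    }
}
```

The article's own code takes a simpler route and flushes every 10,000 rows, which works as long as 10,000 rows reliably fit under the packet limit.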

Below is a rate comparison of two bulk-insert approaches: PreparedStatement batching versus directly spliced SQL (flushing 10,000 rows at a time).

package org.javaio.csv;

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.text.SimpleDateFormat;
import java.util.Date;

/**
 * Import a large CSV file into MySQL.
 */
public class Test {

    private final static String url =
            "jdbc:mysql://localhost:3306/demo_test?useSSL=true&characterEncoding=UTF-8";
    private final static String name = "root";
    private final static String pwd = "20121221";

    private static Connection conn;
    private static PreparedStatement ps;

    /**
     * Parse the CSV file and insert it into the database (JDBC).
     */
    public static void main(String[] args) throws Exception {
        Test test = new Test();

        // psBatch timing - start
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        String startTime = sdf.format(new Date());
        System.out.println("psBatch start time: " + startTime);
        System.out.println("psBatch starting...");

        // Bulk insert with PreparedStatement
        int idx = test.psBatch();

        // timing - end
        System.out.println("psBatch done, inserted " + idx + " rows in total");
        String endTime = sdf.format(new Date());
        System.out.println("psBatch end time: " + endTime);
        System.out.println();

        // timing - start
        startTime = sdf.format(new Date());
        System.out.println("sqlBatch start time: " + startTime);
        System.out.println("sqlBatch starting...");

        // Bulk insert with spliced SQL statements
        idx = test.sqlBatch();

        // timing - end
        System.out.println("sqlBatch done, inserted " + idx + " rows in total");
        endTime = sdf.format(new Date());
        System.out.println("sqlBatch end time: " + endTime);
    }

    /**
     * Bulk insert with PreparedStatement.
     */
    private int psBatch() throws Exception {
        int idx = 0; // row count
        try {
            // Read the CSV file
            FileInputStream fis = new FileInputStream("C:/users/chen/desktop/data/ceshi.csv");
            InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
            BufferedReader br = new BufferedReader(isr);
            String line;     // row data
            String[] column; // column data

            // Get a database connection and disable auto-commit
            conn = getConnection();
            conn.setAutoCommit(false);

            String sql = "INSERT INTO test (name, `desc`, column1, column2, column3, column4) "
                    + "VALUES (?, ?, ?, ?, ?, ?)";
            ps = conn.prepareStatement(sql);

            while ((line = br.readLine()) != null) { // read each row in a loop
                idx++; // count
                column = line.split(",");
                ps.setString(1, column[0]);
                if (column.length >= 2 && column[1] != null) {
                    ps.setString(2, column[1]);
                } else {
                    ps.setString(2, "");
                }
                if (column.length >= 3 && column[2] != null) {
                    ps.setString(3, column[2]);
                } else {
                    ps.setString(3, "");
                }
                if (column.length >= 4 && column[3] != null) {
                    ps.setString(4, column[3]);
                } else {
                    ps.setString(4, "");
                }
                ps.setString(5, "type");
                ps.setString(6, "1");
                ps.addBatch();
                if (idx % 10000 == 0) { // flush every 10,000 rows
                    ps.executeBatch();
                    conn.commit();
                    ps.clearBatch();
                }
            }
            if (idx % 10000 != 0) { // flush the remainder
                ps.executeBatch();
                conn.commit();
                ps.clearBatch();
            }
        } catch (Exception e) {
            System.out.println("Insert failed around row " + idx + "...");
        } finally {
            try {
                if (ps != null) { // close resources
                    ps.close();
                }
                if (conn != null) {
                    conn.close();
                }
            } catch (Exception e2) {
                e2.printStackTrace();
            }
        }
        return idx;
    }

    /**
     * Bulk insert with spliced SQL statements.
     */
    private int sqlBatch() {
        int idx = 0; // row count
        try {
            // Read the CSV file
            FileInputStream fis = new FileInputStream("C:/users/chen/desktop/data/ceshi.csv");
            InputStreamReader isr = new InputStreamReader(fis, "UTF-8");
            BufferedReader br = new BufferedReader(isr);
            String line;     // row data
            String[] column; // column data

            // Get a database connection
            conn = getConnection();

            StringBuffer sql = new StringBuffer(
                    "INSERT INTO test (name, `desc`, column1, column2, column3, column4) VALUES ");
            while ((line = br.readLine()) != null) { // read each row in a loop
                idx++; // count
                column = line.split(",");
                sql.append("('" + column[0] + "', '");
                if (column.length >= 2 && column[1] != null) {
                    sql.append(column[1] + "', '");
                } else {
                    sql.append("', '");
                }
                if (column.length >= 3 && column[2] != null) {
                    sql.append(column[2] + "', '");
                } else {
                    sql.append("', '");
                }
                if (column.length >= 4 && column[3] != null) {
                    sql.append(column[3] + "', '");
                } else {
                    sql.append("', '");
                }
                sql.append("type', '1'),");
                if (idx % 10000 == 0) { // execute every 10,000 rows
                    String executeSql = sql.toString().substring(0, sql.toString().lastIndexOf(","));
                    ps = conn.prepareStatement(executeSql);
                    ps.executeUpdate();
                    sql = new StringBuffer(
                            "INSERT INTO test (name, `desc`, column1, column2, column3, column4) VALUES ");
                }
            }
            if (idx % 10000 != 0) { // execute the remainder
                String executeSql = sql.toString().substring(0, sql.toString().lastIndexOf(","));
                ps = conn.prepareStatement(executeSql);
                ps.executeUpdate();
            }
        } catch (Exception e) {
            System.out.println("Insert failed around row " + idx + "...");
        } finally {
            try {
                if (ps != null) { // close resources
                    ps.close();
                }
                if (conn != null) {
                    conn.close();
                }
            } catch (Exception e2) {
                e2.printStackTrace();
            }
        }
        return idx;
    }

    /**
     * Get a database connection.
     */
    private Connection getConnection() throws Exception {
        Class.forName("com.mysql.jdbc.Driver");
        conn = DriverManager.getConnection(url, name, pwd);
        return conn;
    }
}

Rate comparison: to exclude other effects, both imports were run against an empty table.

Bulk insert with spliced SQL took about 3-4 minutes.

Bulk insert with PreparedStatement took about 10 minutes.
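A likely reason PreparedStatement batching lost here: by default, MySQL Connector/J sends each batched row to the server as a separate statement. Enabling rewriteBatchedStatements=true in the JDBC URL lets the driver rewrite an addBatch()/executeBatch() sequence into multi-row INSERTs on the wire, which often closes the gap with hand-spliced SQL while keeping values safely parameterized. A sketch of the URL (host and schema here are placeholders):

```java
/**
 * Sketch: a MySQL Connector/J URL with batch rewriting enabled, so
 * executeBatch() is sent as multi-row INSERT statements.
 */
public class JdbcUrlDemo {

    public static String url(String hostPort, String schema) {
        return "jdbc:mysql://" + hostPort + "/" + schema
                + "?useSSL=true&characterEncoding=UTF-8"
                + "&rewriteBatchedStatements=true";
    }

    public static void main(String[] args) {
        System.out.println(url("localhost:3306", "demo_test"));
    }
}
```

It would be worth re-running the comparison above with this flag turned on before concluding that SQL splicing is inherently faster.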
