How do I import 40 million rows of data from Oracle to MySQL?


Our company is replacing its advertising system and needs to test the new system's performance. The old system runs on Oracle, the new one on MySQL, and the table structures are not the same. I created a user in the local Oracle database, created data tables with the same structure as the MySQL ones, and reorganized the old data into the new tables. The next step was to import the data from the local Oracle database into MySQL.

Exporting to a SQL script produced several GB of script, the two databases are not on the same network segment, and execution was very slow; worse, the imported data ended up inconsistent with the source. Having found no good export/import tool for this, I chose to implement it in a program. Below is all my code:

 

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class DumpData {

    public static void main(String[] args) {

        Connection con = null;   // MySQL (target)
        Connection conn = null;  // Oracle (source)

        PreparedStatement pst = null;
        Statement sqlStatement = null;

        try {
            long startTime = System.currentTimeMillis();
            // Format elapsed milliseconds as HH:mm:ss by zeroing the time zone offset.
            SimpleDateFormat sdf = new SimpleDateFormat("HH:mm:ss");
            TimeZone t = sdf.getTimeZone();
            t.setRawOffset(0);
            sdf.setTimeZone(t);

            Class.forName("oracle.jdbc.driver.OracleDriver");
            String url = "jdbc:oracle:thin:@219.239.**.***:1521:itcpn";
            conn = DriverManager.getConnection(url, "***", "***");
            if (conn != null) System.out.println("Oracle connection successful");

            Class.forName("org.gjt.mm.mysql.Driver");
            System.out.println("Success loading MySQL driver!");
            con = DriverManager.getConnection(
                    "jdbc:mysql://219.239.**.***:3306/***?useUnicode=true"
                    + "&characterEncoding=gb2312&rewriteBatchedStatements=true",
                    "****", "*****");
            if (con != null) System.out.println("MySQL connection successful");
            con.setAutoCommit(false);

            sqlStatement = conn.createStatement();
            // Pull rows from Oracle in chunks of 1000 rather than the driver default.
            sqlStatement.setFetchSize(1000);

            String countSql = "select /*+ parallel(t, 4) */ count(1) from \"solution\" t";
            ResultSet res = sqlStatement.executeQuery(countSql);
            int count = 0;
            while (res.next()) {
                count = res.getInt(1);
            }
            res.close();
            int loop = count / 100000;

            // The target table has 39 columns: one "?" here plus 38 appended below.
            StringBuilder insertSql = new StringBuilder("insert into solution values (?");
            for (int i = 0; i < 38; i++) {
                insertSql.append(", ?");
            }
            insertSql.append(")");
            pst = con.prepareStatement(insertSql.toString());
            System.out.println(insertSql.toString());

            for (int n = 0; n <= loop; n++) {
                // Page through the source table 100,000 rows at a time using nested rownum filters.
                String query = "select id, account, memo, channeltype, channelid, bannergroupid, bannerid, configid, "
                        + "enable, gid, priority, percent, aid, method, type, startday, starttime, quitday, quittime, campaignid, "
                        + "fixedchannelid, nopaying, optflag, keyword, keywordflag, keywordtext, keywordencoding, reserveflag, "
                        + "forecastflag, estimatemediaexpense, castingtype, count, crm_priority, crm_limitnum, crm_limitdate, "
                        + "deleteflag, solutiontypeid, price, achievedflag "
                        + "from (select id, account, memo, channeltype, channelid, bannergroupid, bannerid, configid, "
                        + "enable, gid, priority, percent, aid, method, type, startday, starttime, quitday, quittime, campaignid, "
                        + "fixedchannelid, nopaying, optflag, keyword, keywordflag, keywordtext, keywordencoding, reserveflag, "
                        + "forecastflag, estimatemediaexpense, castingtype, count, crm_priority, crm_limitnum, crm_limitdate, "
                        + "deleteflag, solutiontypeid, price, achievedflag, rownum rn "
                        + "from \"solution\" where rownum <= " + ((n + 1) * 100000) + ") where rn > " + (n * 100000);

                System.out.println(query);
                ResultSet result = sqlStatement.executeQuery(query);

                long endTime1 = System.currentTimeMillis();
                System.out.println("complete data query, time: " + sdf.format(new Date(endTime1 - startTime)));

                while (result.next()) {
                    for (int i = 1; i < 40; i++) {
                        pst.setObject(i, result.getObject(i));
                    }
                    pst.addBatch();
                    // Flush every 10,000 rows so the batch does not grow without bound.
                    if (result.getRow() % 10000 == 0) {
                        System.out.println("insert 10000 data records!");
                        pst.executeBatch();
                        pst.clearBatch();
                    }
                }
                pst.executeBatch();
                pst.clearBatch();
                con.commit();
                System.out.println("100,000 records inserted!");
                result.close();
                System.gc();
            }

            long endTime = System.currentTimeMillis();
            System.out.println("complete data insertion, time: " + sdf.format(new Date(endTime - startTime)));

        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        } catch (SQLException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (sqlStatement != null) sqlStatement.close();
                if (pst != null) pst.close();
                if (con != null && !con.isClosed()) {
                    con.setAutoCommit(true);
                    con.close();
                }
                if (conn != null && !conn.isClosed()) {
                    conn.close();
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }
}

 

The insertion uses addBatch(). Calling addBatch() only buffers the SQL statement locally, which takes almost no time; the time-consuming part is calling executeBatch(). However, letting too many statements accumulate in one batch can crash MySQL, so I settled on a moderate batch size of 10,000.
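Distilled, the flush-every-N pattern looks like this. A minimal sketch: t_user and its two columns are placeholder names, not the real advertising table, and the connection and result set are assumed to come from elsewhere.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class BatchFlushSketch {
    // Copies rows from an open ResultSet into a placeholder t_user table.
    static void copyRows(Connection con, ResultSet rs) throws SQLException {
        PreparedStatement ps = con.prepareStatement(
                "insert into t_user (id, uname) values (?, ?)");
        int buffered = 0;
        while (rs.next()) {
            ps.setObject(1, rs.getObject(1));
            ps.setObject(2, rs.getObject(2));
            ps.addBatch();                 // cheap: only buffers the row locally
            if (++buffered % 10000 == 0) {
                ps.executeBatch();         // expensive: the actual round trip
                ps.clearBatch();
            }
        }
        ps.executeBatch();                 // flush whatever is left in the buffer
        ps.clearBatch();
        con.commit();
        ps.close();
    }
}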

For connection performance, the rewriteBatchedStatements parameter is added. It defaults to false and must be set to true by hand. When rewriteBatchedStatements is false, execution goes through executeBatchSerially(), which sends the statements one at a time, making batch inserts barely faster than non-batched ones. When it is true, executeBatchedInserts() runs instead, which rewrites the batch into a single multi-row INSERT. MySQL natively supports insert statements of this form:

SQL code:

insert into t_user (id, uname) values (1, '1'), (2, '2'), (3, '3'), ...

Reference: http://www.javaeye.com/topic/770032?page=3
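The parameter can be set in the JDBC URL, as in the code above, or through a Properties object. A minimal sketch, assuming placeholder host, database name, and credentials:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class MySqlConnectSketch {
    // Opens a MySQL connection with batch rewriting enabled via Properties.
    static Connection open() throws Exception {
        Properties props = new Properties();
        props.setProperty("user", "user");                      // placeholder
        props.setProperty("password", "password");              // placeholder
        props.setProperty("useUnicode", "true");
        props.setProperty("characterEncoding", "gb2312");
        props.setProperty("rewriteBatchedStatements", "true");  // default is false
        return DriverManager.getConnection("jdbc:mysql://localhost:3306/testdb", props);
    }
}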

The MySQL driver used is mysql-connector-java-5.1.14-bin.jar.

The record query strategy was also optimized later. The /*+ parallel(t, 4) */ hint lets Oracle process the query in parallel; generally, parallel execution only pays off on full table or index scans when the data volume reaches the millions.

Increasing the fetch size reduces the number of round trips for transferring result data, as well as the server's data preparation time, which improves performance. A value of no more than 1000 is recommended: going higher yields no further gain and only raises the risk of memory overflow.

The query SQL itself was tuned as well: fetching 100,000 records at a time through a nested rownum query also improves query speed (see the sketch below). After these optimizations the program runs dramatically faster. On an intranet PC it moved over 200,000 rows in a little more than 30 seconds, and I estimated an internet-facing server would roughly double that speed. In the end, importing all 40 million records to the internet server took just over 50 minutes. I have heard there are even better ways to export data; if you are interested, it is worth studying.
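For reference, the nested rownum pagination reduces to the shape below. A simplified sketch: the real query spells out every column explicitly instead of t.*.

public class PageQuerySketch {
    // Builds page n (0-based) of pageSize rows: the inner query caps rownum at
    // the page's upper bound, the outer filter drops rows below the lower bound.
    static String pageQuery(int n, int pageSize) {
        return "select * from ("
                + "select t.*, rownum rn from \"solution\" t "
                + "where rownum <= " + ((n + 1) * pageSize)
                + ") where rn > " + (n * pageSize);
    }

    public static void main(String[] args) {
        System.out.println(pageQuery(3, 100000)); // rows 300,001..400,000
    }
}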
