FAST text data import to the database, greatly improving the speed and Performance

Source: Internet
Author: User
Tags psql

The company's communication billing management system has an important function: Text data is imported into the database, the format of text data is:

The analyzed caller number, the analyzed called number, start time (string), End Time (string), call duration, incoming relay, outgoing relay, and service type, CTX group ID, billing mark, reason for termination

61915421,075486321177, 2010-04-1023:59:34. 91,2010-04-1100:00:04. 46, 29.55, 2*37,2 *, 7, 0

61915400,02085512676, 2010-04-1023:59:38. 85,2010-04-1100:00:15. 56, 36.71, 2*37,2*7, 0

61915403,0208411616, 2010-04-1023:59:40. 82,2010-04-1100:00:17. 36, 36.54, 2*37,2*7,7, 0

61915419,02039991550, 2010-04-1023:59:46. 90,2010-04-1100:00:21. 09, 34.19, 2*37,2*7, 0

61915402,02084549963, 2010-04-1023:59:41. 21,2010-04-1100:00:27. 28, 46.07, 2*37,2 *, 7, 0

4904341,428615242528484, 2010-04-1023:59:10. 67,2010-04-1100:00:28. 59, 77.92, 2*6, 7, 0

619152.16,02089239027, 2010-04-1023:59:44. 05,2010-04-1100:00:34. 18, 50.13, 2*37,2 *, 7, 0

At the beginning, we used the Readline () method of streamreader to read data in a row, and then split the data by commas using the string split () method, create a able manually in the memory, and then write the method to import the datatable to the database. The specific implementation is as follows:

1. Generate datetable for text data

 

Public datatable getcdrtext (string strfilepathname) {streamreader objsr = NULL; string strfilename = ""; try {strfilename = path. getfilename (strfilepathname); objsr = new streamreader (strfilepathname, system. text. encoding. default); // set the table column 'able able dt = new datatable (); DT. tablename = strfilename; string strfistline = objsr. readline (). tostring (); string [] Columns = strfistline. split (",". tochararray (); For (INT I = 0; I <columns. length; I ++) {If (columns [I]. equals ("analyzed caller number") {columns [I] = "analyzed caller number";} else if (columns [I]. equals ("analyzed called Number") {columns [I] = "analyzed called Number";} else if (columns [I]. equals ("Start Time (string)") {columns [I] = "Start Time";} else if (columns [I]. equals ("End Time (string)") {columns [I] = "End Time";} else if (columns [I]. equals ("Call duration") {columns [I] = "duration";} else if (columns [I]. equals ("business category") {columns [I] = "traffic type";} else if (columns [I]. equals ("Start Time (integer)") {columns [I] = "Start Time integer";} else if (columns [I]. equals ("End Time (integer)") {columns [I] = "End Time integer";} DT. columns. add (columns [I]);} DT. columns. add ("servicecategory"); string [] srvalue = NULL; while (objsr. peek ()>-1) {srvalue = objsr. readline (). tostring (). split (",". tochararray (); // dttmp. rows. clear (); DT. rows. add (srvalue); // Add data to the table} objsr. close (); Return DT;} catch (system. exception e) {objsr. close (); throw e ;}}

 

2. SQL statement Design

 

Public int insertdata (datatable DT) {string strtablename = string. empty; sqlparameter [] pparamete = NULL; string Psql = string. empty; // table name Psql = "insert into originaldata"; Psql + = "(callernumber, calleenumber, starttime, endtime, duration, intrunk, outtrunk, servicecategory, role, chargeflag, releasereason) values "; Psql + =" (@ analyzed caller number, @ analyzed called number, @ start time, @ End Time, @ duration, @ relay, @ outbound relay, @ traffic type, @ CTX group ID, @ billing mark, @ termination reason )"; pparamete = new sqlparameter [] {New sqlparameter ("@ analyzed caller number", sqldbtype. varchar, 20, "analyzed caller number"), new sqlparameter ("@ analyzed called Number", sqldbtype. varchar, 35, "analyzed called Number"), new sqlparameter ("@ start time", sqldbtype. datetime, 30, "Start Time"), new sqlparameter ("@ end time", sqldbtype. datetime, 30, "End Time"), new sqlparameter ("@ duration", sqldbtype. decimal, 18, "duration"), new sqlparameter ("@ traffic type", sqldbtype. varchar, 6, "servicecategory"), new sqlparameter ("@ relay", sqldbtype. varchar, 6, "inbound relay"), new sqlparameter ("@ outbound relay", sqldbtype. varchar, 6, "outbound relay"), new sqlparameter ("@ CTX group ID", sqldbtype. int, 6, "CTX group ID"), new sqlparameter ("@ billing mark", sqldbtype. int, 6, "billing mark"), new sqlparameter ("@ termination reason", sqldbtype. int, 6, "reason for termination")}; return executeinsert (Psql, DT, pparamete );}

 

3. insert data from the entire table to the database

 

Public int executeinsert (string sqlstring, datatable DT, Params sqlparameter [] partition parms) {sqldataadapter objadapter = NULL; sqlcommand objcomm = NULL; try {sqlconnection objconn = new sqlconnection (system. configuration. configurationmanager. appsettings ["connectionstring"]); If (objconn. state = connectionstate. closed) {objconn. open ();} objcomm = new sqlcommand (sqlstring, objconn); objcomm. commandtimeout = 0; objadapter = new sqldataadapter (); For (INT I = 0; I <partition parms. length; I ++) {objcomm. parameters. add (partition parms [I]);} objadapter. updatecommand = objcomm; objadapter. insertcommand = objcomm; int intret = objadapter. update (DT); objconn. close (); Return intret;} catch (system. exception e) {Throw e ;}}

 

When using this method, the text files of about 4 MB can still be competent, but it takes a long time to import each file of more than 10 MB, in addition, sometimes there will be a reminder of insufficient memory (the computer for development preparation garbage, 1 GB physical memory, 1 GB virtual memory ). Therefore, this method does not work for me, So Baidu and Google finally found a better solution, using the SQL Server OpenRowSet and bulk methods to achieve fast data import, if you do not understand these two methods, you can use Baidu or Google to improve the performance by a factor of (exaggerated). Data Import significantly reduces time consumption and achievesCodeAs follows:

1. Data Files in XML file format

 

<? XML version = "1.0"?> <Bcpformat xmlns = "http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns: xsi = "http://www.w3.org/2001/XMLSchema-instance"> <record> <field ID = "1" xsi: TYPE = "charterm" Terminator = ", "max_length =" 20 "/> <field ID =" 2 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 35 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 3 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 50 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 4 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 50 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 5 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 50 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 6 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 6 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 7 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 6 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 8 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 16 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 9 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 16 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 10 "xsi: TYPE =" charterm "Terminator = ", "max_length =" 16 "collation =" SQL _latin1_general_cp1_ci_as "/> <field ID =" 11 "xsi: type = "charterm" Terminator = "\ n" max_length = "16" collation = "SQL _latin1_general_cp1_ci_as"/> </record> <row> <column source = "1" name =" callernumber "xsi: type = "sqlvarychar"/> <column source = "2" name = "calleenumber" xsi: type = "sqlvarychar"/> <column source = "3" name = "starttime" xsi: type = "sqldatetime"/> <column source = "4" name = "endtime" xsi: type = "sqldatetime"/> <column source = "5" name = "duration" xsi: type = "sqlvarychar"/> <column source = "6" name = "intrunk" xsi: type = "sqlvarychar"/> <column source = "7" name = "outtrunk" xsi: type = "sqlvarychar"/> <column source = "8" name = "servicecategory" xsi: type = "sqlvarychar"/> <column source = "9" name = "centrexgroupid" xsi: type = "sqlsmallint"/> <column source = "10" name = "chargeflag" xsi: type = "sqlsmallint"/> <column source = "11" name = "releasereason" xsi: TYPE = "sqlsmallint"/> </row> </bcpformat>

2. concatenate SQL statements

 

 

If (openfiledialog1.showdialog () = dialogresult. OK) {try {datetime dt1 = datetime. now; string strfilename1 = openfiledialog1.filename; string strfilename2 = appdomain. currentdomain. basedirectory + "format. XML "; stringbuilder stbsql = new stringbuilder (); stbsql. append ("insert into originaldata (callernumber, calleenumber, starttime, endtime, duration, servicecategory,"); stbsql. append ("intrunk, outtrunk, centrexgroupid, chargeflag, releasereason)"); stbsql. append ("select callernumber, calleenumber, starttime, endtime, ceiling (duration), servicecategory, intrunk, outtrunk, centrexgroupid, chargeflag, releasereason"); stbsql. appendformat ("from OpenRowSet (bulk '{0}', formatfile = '{1}', firstrow = 2) as t", strfilename1, strfilename2); executeinsert (stbsql. tostring (); // This is just a simple method for executing SQL statements. Here, datetime dt2 = datetime is not pasted. now; timespan Ts = dt2-dt1; label1.text + = ts. tostring () ;}catch (exception ex) {Throw ex ;}}

 

The execution speed is as follows: the size of the selected imported text file is 10 MB. Since I import data to the same table in the same database, therefore, after each import, I use truncate table to clear data.

 

We can see that the txtread method consumes about 27 times the Time of the bulk method.

 

I am not writing well. You are welcome to make a picture. If anyone has a better way, let's talk about it!

 

Source code: windemo

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.