Batch insert and update massive data in C #

Source: Internet
Author: User

Http://blog.csdn.net/axman/article/details/2200840

For the insertion and update of massive data, ADO. NET is indeed inferior to JDBC, and JDBC has a unified model for batch operations.
Very convenient:
Preparedstatement PS = conn. preparestatement ("insert or update arg1, args2 ....");
Then you can
For (INT I = 0; I <1000000000000000; I ++ ){
PS. setxxx (realarg );
.....
PS. addbatch ();
If (I % 500 = 0) {// assume that five hundred entries are submitted once.
Ps.exe cutebatch ();
// Clear parame batch
}
}
Ps.exe cutebatch ();
 
This operation not only brings about extremely high performance, but also is very convenient. To implement this function in ADO. net, it should be directly in the command interface.
Or the dataadapter interface provides addbat and commitbat APIs, but ADO. Net does not implement this simply, but requires developers to pass
Complex workarounds.
For a large number of insert operations, you can use an empty able to Add rows to be inserted. after a certain number of rows are submitted, you can clear the table,
Implementation is not complicated:

Datetime begin = datetime. now;
String connectionstring = ......;
Using (sqlconnection conn = new sqlconnection (connectionstring ))...{
Conn. open ();
Sqldataadapter SD = new sqldataadapter ();
SD. selectcommand = new sqlcommand ("select devid, data_time, data_value from currenttest", Conn );
SD. insertcommand = new sqlcommand ("insert into currenttest (devid, data_time, data_value )"
+ "Values (@ devid, @ data_time, @ data_value);", Conn );
SD. insertcommand. Parameters. Add ("@ devid", sqldbtype. Char, 18, "devid ");
SD. insertcommand. Parameters. Add ("@ data_time", sqldbtype. Char, 19, "data_time ");
SD. insertcommand. Parameters. Add ("@ data_value", sqldbtype. Int, 8, "data_value ");
SD. insertcommand. updatedrowsource = updaterowsource. None;
SD. updatebatchsize = 0;

Dataset dataset = new dataset ();
SD. Fill (Dataset );
Random r = new random (1000 );
For (INT I = 0; I <100000; I ++ )...{
Object [] ROW =... {"devid" + I, datetime. now. tostring ("yyyy-mm-dd hh: mm: SS"), R. next (1,1000 )};
Dataset. Tables [0]. Rows. Add (ROW );
If (I % 300 = 0 )...{
SD. Update (Dataset. Tables [0]);
Dataset. Tables [0]. Clear ();
}
}
SD. Update (Dataset. Tables [0]);
Dataset. Tables [0]. Clear ();
SD. Dispose ();
Dataset. Dispose ();
Conn. Close ();

}
Timespan Ts = datetime. Now-begin;
MessageBox. Show ("Ts =" + ts. totalmilliseconds );

For this test, it takes 28 seconds to insert 0.1 million pieces of data. The performance is still commendable. However, for batch update and search all over the world, the record fill is added to the dataset and then the rows
In my small data size test, fill 0.1 million pieces of data into dataset is no longer enough. If it is a million pieces, how can it be done? Do you have to record the batch operation first?
First get to dataset? That is to say, which records should I choose to query to update?

So I still use an empty datatable to add the record to be updated:

SD. selectcommand = new sqlcommand ("select devid, data_time, data_value from currenttest where 1 = 0", Conn );
// The Condition 1 = 0 ensures that an empty table is obtained.
SD. updatecommand = new sqlcommand ("Update currenttest set data_time = @ data_time, data_value = @ data_value where devid = @ devid", Conn );
SD. updatecommand. Parameters. Add ("@ data_time", sqldbtype. Char, 19, "data_time ");
SD. updatecommand. Parameters. Add ("@ data_value", sqldbtype. Int, 4, "data_value ");
SD. updatecommand. Parameters. Add ("@ devid", sqldbtype. Char, 20, "devid ");
SD. updatecommand. updatedrowsource = updaterowsource. None;
SD. updatebatchsize = 0;

For (INT I = 0; I <300; I ++ ){
..............................
Dataset. Tables [0]. Rows. Add (ROW );
}
SD. Update (Dataset. Tables [0]);
Update 300 records first. If all records are successfully updated again, the system prompts that the insert operation requires insertcommand. Because an empty table is added to the row operation, rowstate is added,

If the database is updated at this time, the insert operation is executed and cannot be updated. Change:
For (INT I = 0; I <300; I ++ ){
..............................

Row = {fill in the initialization value };
Dataset. Tables [0]. Rows. Add (ROW );
}
Dataset. acceptchanges ();
For (INT I = 0; I <300; I ++ ){
..............................
Dataset. Tables [0]. Rows [I] [x] = "xxxxxxx ";
..............................
}
SD. Update (Dataset. Tables [0]);
Insert data in the datatable, use acceptchanges (), modify rowstate to unchanged, and then modify the data in the table to change the unchanged status.

Datatable changes from current status to original, and then updates the row of the datatable.

Update successful, but it is inconvenient to do so.

Adjust the idea, first fetch 200 entries from the database (size of batch update), and directly obtain an original datatable.

SD. selectcommand = new sqlcommand ("select top 200 devid, data_time, data_value from currenttest", Conn );
Dataset dataset = new dataset ();
SD. Fill (Dataset );
Use the 200 space to store other data to be updated:
 
For (INT I = 0; I <100; I ++)
{
Dataset. Tables [0]. Rows [I]. beginedit ();
Dataset. Tables [0]. Rows [I] ["data_time"] = "2222-22-22 22:22:22 ";
Dataset. Tables [0]. Rows [I] ["data_value"] = 100;
Dataset. Tables [0]. Rows [I] ["devid"] = "devid" + (I + 10000); // update the records from devid10000 to devid10200
Dataset. Tables [0]. Rows [I]. endedit ();
}
SD. Update (Dataset. Tables [0]);
OK, success. Haha. Keep the data to be updated in this space and submit it when it is filled up. In this way, only a few cycles are required to update 100000 pieces of data.

Datetime begin = datetime. now;
String connectionstring = "";
Using (sqlconnection conn = new sqlconnection (connectionstring ))...{
Conn. open ();

Sqldataadapter SD = new sqldataadapter ();
SD. selectcommand = new sqlcommand ("select top 200 devid, data_time, data_value from currenttest", Conn );

Dataset dataset = new dataset ();
SD. Fill (Dataset );
Random r = new random (1000 );

SD. updatecommand = new sqlcommand ("Update currenttest"
+ "Set data_time = @ data_time, data_value = @ data_value where devid = @ devid", Conn );
SD. updatecommand. Parameters. Add ("@ data_time", sqldbtype. Char, 19, "data_time ");
SD. updatecommand. Parameters. Add ("@ data_value", sqldbtype. Int, 4, "data_value ");
SD. updatecommand. Parameters. Add ("@ devid", sqldbtype. Char, 20, "devid ");
SD. updatecommand. updatedrowsource = updaterowsource. None;
SD. updatebatchsize = 0;
For (int count = 0; count <100000
...{

For (INT I = 0; I <200; I ++, Count ++)
...{
Dataset. Tables [0]. Rows [I]. beginedit ();
Dataset. Tables [0]. Rows [I] ["data_time"] = "2222-22-22 22:22:22 ";
Dataset. Tables [0]. Rows [I] ["data_value"] = 100;
Dataset. Tables [0]. Rows [I] ["devid"] = "devid" + count;
Dataset. Tables [0]. Rows [I]. endedit ();
}
SD. Update (Dataset. Tables [0]);
}


Dataset. Tables [0]. Clear ();
SD. Dispose ();
Dataset. Dispose ();
Conn. Close ();

}
Timespan Ts = datetime. Now-begin;
MessageBox. Show ("Ts =" + ts. totalmilliseconds );

Note that the above update operation refers to the constant updating of records in one million, million, and ten million records. These records to be updated are not from the beginning

To the end of this order, but constantly update any records according to the conditions, I cannot put tens of thousands of records first fill into Ds and then select

This record is updated. Therefore, every 200 update operations are submitted in datatable, And the JDBC addbat and executebat operations are implemented.

It takes 32 seconds to update 0.1 million entries in this operation.

Kao, there is no more elegant method. I have to use it like this.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.