Painless transfer of massive data: SqlBulkCopy

Source: Internet
Author: User

Recently our team has been porting Company A's X platform over to Company B. As everyone knows, data migration is unavoidable in this kind of project: data has to be copied from one place to another. Company B's old X platform has been on the market for five or six years, and it is common for a single table to hold millions of records. Simply getting the data properly "placed" on the new platform is easy; the real question is how long transferring that much data takes. Hours? Working days? The sum of all the transfer times is the time our services must stay offline during the final cutover from the old platform to the new one. Other project constraints aside, if the entire transfer took three to five business days, our customers, product owners, and business colleagues would never accept it.

I was lucky, if also ignorant: a new colleague pointed out last week that .NET 2.0 has a SqlBulkCopy class that can help us copy data quickly.

I first found a blog article titled "Batch writing data from a DataTable to the database". The original article is as follows:

Recently I was importing Excel files into the database from a program. Because of the large data volume, it was very slow. Later I used the SqlBulkCopy class to solve the speed problem. Below I compare the performance of three methods: INSERT statements via SqlCommand.ExecuteNonQuery(), SqlDataAdapter.Update(dataSet, tableName), and SqlBulkCopy.WriteToServer(dataTable).

1. Generate a DataTable for testing. The table structure is as follows:

UniqueID (primary key, auto-increment) | CompanyName | CompanyCode | Address | Owner | Memo

There are 6 fields in total.
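
The original post doesn't show the DDL for the test table. As a hedged reconstruction (the column types and sizes are inferred from the parameter definitions in section 3 below, so treat them as assumptions), the table could be created like this:

// Hypothetical DDL for the test table; the original post does not show
// the actual schema, so the types and sizes below are inferred.
string createTableSql =
    "CREATE TABLE [Table_1] (" +
    "[UniqueID] INT IDENTITY(1,1) PRIMARY KEY, " +
    "[CompanyName] NCHAR(50), " +
    "[CompanyCode] NCHAR(25), " +
    "[Address] NCHAR(255), " +
    "[Owner] NCHAR(25), " +
    "[Memo] NCHAR(255))";
using (SqlConnection conn = new SqlConnection(connectionString))
{
    conn.Open();
    new SqlCommand(createTableSql, conn).ExecuteNonQuery();
}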

// Requires: using System; using System.Data; using System.Data.SqlClient;
SqlConnection sqlConnection = new SqlConnection(connectionString);
// "WHERE 1 = 2" returns no rows, so Fill only retrieves the table schema
SqlDataAdapter sqlDataAdapter = new SqlDataAdapter("SELECT * FROM Table_1 WHERE 1 = 2", sqlConnection);
DataSet dataSet = new DataSet();
sqlDataAdapter.Fill(dataSet, "Table_1");
DataTable dataTable = dataSet.Tables[0];

// Generate 20000 records
for (int i = 0; i < 20000; i++)
{
    DataRow dataRow = dataTable.NewRow();
    dataRow["CompanyName"] = "CompanyName" + string.Format("{0:0000}", i);
    dataRow["CompanyCode"] = "CompanyCode" + string.Format("{0:0000}", i);
    dataRow["Address"] = "Address" + string.Format("{0:0000}", i);
    dataRow["Owner"] = "Owner" + string.Format("{0:0000}", i);
    dataRow["Memo"] = "Memo" + string.Format("{0:0000}", i);
    dataTable.Rows.Add(dataRow);
}

2. Use SqlCommand.ExecuteNonQuery() to insert data.

foreach (DataRow dataRow in dataTable.Rows)
{
    // Build a literal INSERT statement for each row
    string sql = "INSERT INTO [Table_1] ([CompanyName], [CompanyCode], [Address], [Owner], [Memo]) " +
        "VALUES ('" + dataRow["CompanyName"].ToString() + "'" +
        ", '" + dataRow["CompanyCode"].ToString() + "'" +
        ", '" + dataRow["Address"].ToString() + "'" +
        ", '" + dataRow["Owner"].ToString() + "'" +
        ", '" + dataRow["Memo"].ToString() + "')";
    // A new connection is opened per row, as in the original test
    using (SqlConnection sqlConn = new SqlConnection(connectionString))
    {
        sqlConn.Open();
        SqlCommand sqlCommand = new SqlCommand(sql, sqlConn);
        sqlCommand.ExecuteNonQuery();
        sqlConn.Close();
    }
}

Time to insert 20,000 records: 00:00:29.7336000

3. Use SqlDataAdapter.Update(dataSet, tableName)

SqlCommand insertCommand = new SqlCommand(
    "INSERT INTO [Table_1] ([CompanyName], [CompanyCode], [Address], [Owner], [Memo]) " +
    "VALUES (@CompanyName, @CompanyCode, @Address, @Owner, @Memo)",
    new SqlConnection(connectionString));
insertCommand.Parameters.Add("@CompanyName", SqlDbType.NChar, 50, "CompanyName");
insertCommand.Parameters.Add("@CompanyCode", SqlDbType.NChar, 25, "CompanyCode");
insertCommand.Parameters.Add("@Address", SqlDbType.NChar, 255, "Address");
insertCommand.Parameters.Add("@Owner", SqlDbType.NChar, 25, "Owner");
insertCommand.Parameters.Add("@Memo", SqlDbType.NChar, 255, "Memo");
sqlDataAdapter.InsertCommand = insertCommand;

sqlDataAdapter.Update(dataSet, "Table_1");

Time to insert 20,000 records: 00:00:22.8938000

4. Use SqlBulkCopy.WriteToServer(dataTable)

SqlBulkCopy sqlBulkCopy = new SqlBulkCopy(connectionString, SqlBulkCopyOptions.UseInternalTransaction);
sqlBulkCopy.DestinationTableName = "Table_1"; // name of the destination table in the database

sqlBulkCopy.WriteToServer(dataSet.Tables[0]);

Time to insert 20,000 records: 00:00:00.3276000

So SqlBulkCopy is the fastest by a wide margin (roughly 90 times faster here than per-row inserts), followed by SqlDataAdapter.Update(), with SqlCommand.ExecuteNonQuery() slowest.

This article comes from a CSDN blog; when reproducing it, please cite the source: http://blog.csdn.net/humozhi/archive/2009/03/31/4039868.aspx

I'm in the habit of being skeptical of new things, so I was surprised when several colleagues and I put it to the test ourselves: the average import speed was roughly 10 times that of the traditional method.

After checking some documentation, I found that Microsoft describes it as follows:

The SqlBulkCopy class lets you write managed code solutions that provide similar functionality. There are other ways to load data into a SQL Server table (INSERT statements, for example), but SqlBulkCopy offers a significant performance advantage over them.
The SqlBulkCopy class can only be used to write data to SQL Server tables. The data source, however, is not limited to SQL Server; any data source can be used, as long as the data can be loaded into a DataTable instance or read with an IDataReader instance.

Supplement: SqlBulkCopy is a new class in .NET Framework 2.0, located in the System.Data.SqlClient namespace. Its main purpose is to efficiently bulk-load data from other data sources into SQL Server tables, similar to the command-line utility named bcp that ships with Microsoft SQL Server. With the SqlBulkCopy class, however, you can write a managed code solution, which performs better than the bcp command-line utility and is far better than loading large amounts of data into a SQL Server table with INSERT statements. SqlBulkCopy is suited to mass data transfer regardless of the data source.

In addition, on using the SqlBulkCopy class in ASP.NET 2.0 to copy data in batches:

SqlBulkCopy exposes the WriteToServer method, which copies data from the data source to the data destination. WriteToServer can accept a DataRow[] array, a DataTable, or a DataReader. You can choose among these types depending on the situation, but the DataReader is usually the better choice: because a DataReader is a forward-only, read-only data stream that does not hold the data in memory, it is faster than a DataTable or DataRow[].

The following code copies data from the source table to the target table.

// sourceConnection and destinationConnection are open SqlConnections defined elsewhere
SqlCommand sourceCommand = new SqlCommand("SELECT * FROM Products_Archive", sourceConnection);
SqlDataReader reader = sourceCommand.ExecuteReader();

using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnection.ConnectionString))
{
    bulkCopy.BatchSize = 500;
    bulkCopy.NotifyAfter = 1000;
    bulkCopy.SqlRowsCopied += new SqlRowsCopiedEventHandler(bulkCopy_SqlRowsCopied);
    bulkCopy.DestinationTableName = "Products_Latest";
    bulkCopy.WriteToServer(reader);
}

A couple of points are worth mentioning here. First, I use a DataReader to read the data from the database table; Products_Latest is the destination table, since the data is being copied from the Products_Archive table into the Products_Latest table. The bulkCopy object raises a SqlRowsCopied event each time the number of rows specified by the NotifyAfter property has been processed. In this example the event fires for every 1000 rows processed, because NotifyAfter is set to 1000.
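
The original post never shows the handler wired up above. A minimal sketch of what it might look like (the body here is an assumption; SqlRowsCopiedEventArgs.RowsCopied reports the running total):

private static void bulkCopy_SqlRowsCopied(object sender, SqlRowsCopiedEventArgs e)
{
    // Assumed body: the original post does not show this handler.
    // e.RowsCopied is the total number of rows copied so far.
    Console.WriteLine("{0} rows copied so far...", e.RowsCopied);
}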

The BatchSize property matters a great deal, and the program's performance depends on it. BatchSize is the number of rows in each batch; at the end of each batch, the rows in that batch are sent to the database. I set BatchSize to 500, which means that for every 500 rows the reader reads, they are sent to the database to perform the bulk copy. The default value of BatchSize is 0, which means the entire operation is sent to the database as a single batch.

Different BatchSize settings yield different performance, so you should test BatchSize against your own workload, for example with a timing loop like the sketch below.
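
A rough way to run such a test, as a sketch only (it assumes a pre-filled dataTable from earlier, an open destinationConnection, and that you clear the destination table between runs so every pass starts from the same state):

// Requires: using System.Diagnostics;
foreach (int batchSize in new[] { 1, 100, 500, 1000, 5000 })
{
    // Clear the destination table between runs to keep the comparison fair (not shown).
    Stopwatch stopwatch = Stopwatch.StartNew();
    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(destinationConnection.ConnectionString))
    {
        bulkCopy.BatchSize = batchSize;
        bulkCopy.DestinationTableName = "Products_Latest";
        bulkCopy.WriteToServer(dataTable); // a DataTable can be written repeatedly, unlike a DataReader
    }
    stopwatch.Stop();
    Console.WriteLine("BatchSize {0}: {1}", batchSize, stopwatch.Elapsed);
}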

I also looked into the principle behind it:

SqlBulkCopy streams a large number of records directly from a DataSet or DataReader and appends them quickly to the target database table. Note, though, that it is not the bcp.exe tool we normally use: bcp loads records from a delimited file (optionally combined with a format file) into the database, or exports database data to a file. However, because a DataSet can load XML data, the SqlBulkCopy type can still be used to move XML file data into the database conveniently via a DataSet, as in the sketch below.
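
A minimal sketch of that XML path (the file name products.xml and the destination table are assumptions; the columns inferred from the XML must line up with the destination table, or be mapped via ColumnMappings):

// Load XML into a DataSet, then bulk copy the resulting table.
DataSet xmlDataSet = new DataSet();
xmlDataSet.ReadXml("products.xml"); // infers a DataTable schema from the XML structure
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connectionString))
{
    bulkCopy.DestinationTableName = "Products_Latest";
    bulkCopy.WriteToServer(xmlDataSet.Tables[0]);
}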

Summary

After learning about SqlBulkCopy, we now use it for essentially all of our large data transfers. Some people say that SqlBulkCopy bypasses ADO.NET and operates on the data file (MDF) directly; I haven't verified this, but for now it doesn't stop me from using it. Of course, pay attention to the following points when using SqlBulkCopy:

1. Confirm that a bulk load is genuinely required before using it (for a few dozen rows, use ordinary channels instead).

2. Verify data consistency and have a checking mechanism in place, to avoid primary-key conflicts and inconsistent data.

3. SqlBulkCopy may change the metadata of the target table (for example, when constraint checking is disabled). If that happens, concurrent snapshot-isolation transactions accessing the bulk-inserted table will fail.

4. SqlBulkCopy takes a bulk update lock on the target table. Pay attention to concurrency, so that other connections don't time out after waiting too long. (A sketch of the relevant options follows this list.)
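
As a hedged illustration of points 2 to 4, the SqlBulkCopyOptions flags let you opt back into constraint checking and control locking; which combination fits depends on your table:

// Sketch only: pick the flags your consistency and concurrency needs call for.
SqlBulkCopyOptions options =
    SqlBulkCopyOptions.CheckConstraints  // enforce CHECK/FK constraints during the copy (off by default)
    | SqlBulkCopyOptions.FireTriggers    // run the table's insert triggers during the copy
    | SqlBulkCopyOptions.TableLock;      // hold a bulk update lock for the duration of the operation
using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connectionString, options))
{
    bulkCopy.DestinationTableName = "Table_1";
    bulkCopy.WriteToServer(dataTable);
}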

Extension

SqlBulkCopy's fast data copying brings to mind database backup and multi-machine data synchronization for read/write splitting. While researching, I also came across another application of SqlBulkCopy, "using SqlBulkCopy to implement an online Q&A system". Of course, data synchronization after read/write splitting is a big subject in its own right; SqlBulkCopy can only be a first step toward a solution.
