Original: Limit challenge: importing 1,000,000 rows of data into a SQL Server database with C# in only 4 seconds (with source)
In real work we sometimes need to import a large amount of data into a database so that various programs can run calculations against it. This experiment uses five methods to complete that task and records in detail the time each method takes.
The tools used in this experiment are VS2008, SQL Server 2000, and SQL Server 2008. Each of the five methods is used to import 1,000,000 rows into both SQL 2000 and SQL 2008. The test environment is a Dell 2850 server with dual 2.0 GHz CPUs and 2 GB of memory. Interested readers can download the source code and verify the timings themselves.
It should also be explained that the SQL statements in this experiment are executed through the DataProvider module of the Isline Framework, which simply reads and encapsulates the SQL configuration and has no substantive impact on the final results. For more on the Isline Framework, refer to the Isline Framework series of articles.
Now to the point. The five ways to challenge the 4-second limit are: using the basic INSERT statement, using the BULK INSERT statement, using BULK INSERT in multiple threads, using the SqlBulkCopy class, and using the SqlBulkCopy class in multiple threads.
On the database side, SQL 2000 and SQL 2008 are used; the table name is TableB and the field name is Value1. The database name can be modified in app.config and defaults to test.
Figure 1: The five methods used in the experiment
Method One: Using the basic INSERT statement
This is the most basic approach, and the one most people think of first. But the INSERT statement does not seem well suited to large-volume operations, does it?
This method divides the 1,000,000 rows into 10 batches of 100,000 each, with one transaction per 100,000 rows, so the database is hit with 10 imports.
Basic statement: Insert into TableB (Value1) values ('" + i + "')
Description: The i in the statement is an incrementing variable in the host program used to populate the value of the database field.
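For reference, here is a minimal sketch of how method one could be driven directly from ADO.NET. It is not the article's actual code (which routes its SQL through the Isline Framework DataProvider); the connectionString parameter and the InsertLoop method name are illustrative only, and the method is meant to be dropped into your own class.

Code
using System.Data.SqlClient;

static void InsertLoop(string connectionString)
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        conn.Open();
        int i = 0;
        for (int batch = 0; batch < 10; batch++)                  // 10 batches of 100,000 rows
        {
            using (SqlTransaction tran = conn.BeginTransaction()) // one transaction per batch
            {
                for (int row = 0; row < 100000; row++, i++)
                {
                    using (SqlCommand cmd = new SqlCommand(
                        "insert into TableB (Value1) values ('" + i + "')", conn, tran))
                    {
                        cmd.ExecuteNonQuery();
                    }
                }
                tran.Commit();
            }
        }
    }
}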
SQL 2000 elapsed time: 901,599 ms
SQL 2008 elapsed time: 497,638 ms
Method Two: Using the BULK INSERT statement
In this experiment, this approach produced the most satisfying results: it is the simplest to use, flexible, and fast.
The "BULK INSERT" statement does not seem to be very common, Aicken heard that Oracle has a way to map an external file to an Oracle staging table and then directly import the data from the staging table into other Oracle tables, which is very fast, and the SQL Is the server's bulk insert equally satisfying?
Basic statement: BULK INSERT TableB FROM 'C:\sql.txt' WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '|', BATCHSIZE = 100000)
Description: "C:\\sql.txt" is a pre-generated file containing 100 data with "|" The symbols are separated by one transaction per 100,000 data lines.
SQL 2000 elapsed time: 4,009 ms
SQL 2008 elapsed time: 10,722 ms
Method Three: Using BULK INSERT in multiple threads
Building on method two, the 1,000,000 rows are divided across five threads, each responsible for 200,000 rows with one transaction per 50,000 rows, and the five threads start at the same time. Let's see the effect.
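A sketch of how the five-thread variant could look, assuming each thread gets its own pre-generated 200,000-row file (the names C:\sql0.txt through C:\sql4.txt are hypothetical) and its own connection; this is not the article's source, just one way to structure it:

Code
using System.Data.SqlClient;
using System.Threading;

static void ParallelBulkInsert(string connectionString)
{
    Thread[] threads = new Thread[5];
    for (int t = 0; t < 5; t++)
    {
        string file = @"C:\sql" + t + ".txt";   // pre-generated 200,000-row slice
        threads[t] = new Thread(delegate()
        {
            using (SqlConnection conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (SqlCommand cmd = new SqlCommand(
                    "BULK INSERT TableB FROM '" + file + "' " +
                    "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '|', BATCHSIZE = 50000)", conn))
                {
                    cmd.CommandTimeout = 300;
                    cmd.ExecuteNonQuery();
                }
            }
        });
        threads[t].Start();
    }
    foreach (Thread worker in threads) worker.Join();   // wait for all five imports
}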
SQL 2000 elapsed time: 21,099 ms
SQL 2008 elapsed time: 10,997 ms
Method Four: Using the SqlBulkCopy class
This method is also very fast, but it relies on memory: for tens of millions of rows of complex, multi-field data, memory consumption can be significant, although a 64-bit solution can be used to deal with this problem.
Tens of millions of rows of multi-field data are typically encountered in certain business scenarios. For example, to calculate the consumption of global consumers over a business cycle, you first take a snapshot of the members' consumption records from the master database table and store that snapshot in a staging table, which the calculation program then uses. Sometimes the consumer data does not sit on a single database server but comes from servers in multiple countries, so the data has to be relayed through memory or external storage, then cleaned, merged, and checked before finally being imported into a dedicated table for the calculation program to use.
Basic statement:
Code
using (System.Data.SqlClient.SqlBulkCopy sqlbc = new System.Data.SqlClient.SqlBulkCopy(conn))
{
    sqlbc.BatchSize = 100000;
    sqlbc.BulkCopyTimeout = 60;
    sqlbc.DestinationTableName = "dbo.TableB";
    sqlbc.ColumnMappings.Add("ValueA", "Value1");
    sqlbc.WriteToServer(dt);
}
Description:
BatchSize = 100000: one transaction per 100,000 rows, committed per batch.
BulkCopyTimeout = 60: a 60-second timeout for the operation.
DestinationTableName = "dbo.TableB": the data is imported into the TableB table.
ColumnMappings.Add("ValueA", "Value1"): the in-memory ValueA column is mapped to the Value1 field of TableB.
WriteToServer(dt): writes to the database, where dt is a pre-built DataTable that contains the ValueA column.
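For completeness, here is a small sketch of how the dt DataTable passed to WriteToServer could be built. The string column type and the method name are assumptions, since the article does not state the type of Value1:

Code
using System.Data;

static DataTable BuildSourceTable()
{
    DataTable dt = new DataTable();
    dt.Columns.Add("ValueA", typeof(string));   // assumed type; must match the source side of ColumnMappings
    for (int i = 0; i < 1000000; i++)
    {
        dt.Rows.Add(i.ToString());
    }
    return dt;
}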
SQL 2000 elapsed time: 4,989 ms
SQL 2008 elapsed time: 10,412 ms
Method Five: Using the SqlBulkCopy class in multiple threads
Building on method four, the 1,000,000 rows are divided across five threads, each responsible for 200,000 rows with one transaction per 50,000 rows, and the five threads start at the same time. Let's see the effect.
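A sketch of how method five could be structured, assuming each thread builds its own 200,000-row DataTable and uses its own connection and SqlBulkCopy instance with BatchSize = 50000 (the method name and value ranges are illustrative, not the article's source):

Code
using System.Data;
using System.Data.SqlClient;
using System.Threading;

static void ParallelSqlBulkCopy(string connectionString)
{
    Thread[] threads = new Thread[5];
    for (int t = 0; t < 5; t++)
    {
        int start = t * 200000;   // each thread handles its own 200,000-row slice
        threads[t] = new Thread(delegate()
        {
            DataTable dt = new DataTable();
            dt.Columns.Add("ValueA", typeof(string));
            for (int i = start; i < start + 200000; i++)
                dt.Rows.Add(i.ToString());

            using (SqlConnection conn = new SqlConnection(connectionString))
            {
                conn.Open();
                using (SqlBulkCopy sqlbc = new SqlBulkCopy(conn))
                {
                    sqlbc.BatchSize = 50000;
                    sqlbc.BulkCopyTimeout = 60;
                    sqlbc.DestinationTableName = "dbo.TableB";
                    sqlbc.ColumnMappings.Add("ValueA", "Value1");
                    sqlbc.WriteToServer(dt);
                }
            }
        });
        threads[t].Start();
    }
    foreach (Thread worker in threads) worker.Join();
}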
SQL 2000 elapsed time: 7,682 ms
SQL 2008 elapsed time: 10,870 ms
Summary
Test method                 | SQL 2000 (ms) | SQL 2008 (ms)
Basic INSERT INTO           | 901,599       | 497,638
Single-thread BULK INSERT   | 4,209         | 10,722
Multi-thread BULK INSERT    | 21,099        | 10,997
Single-thread SqlBulkCopy   | 4,989         | 10,412
Multi-thread SqlBulkCopy    | 7,682         | 10,870
These are the results from the past few days of testing. Disappointingly, the data import performance of SQL Server 2008 does not seem to be as good as we expected.
In addition, would anyone who downloads the source code help me figure out why the multithreaded results are not as good as the single-threaded ones? Is it caused by improper use of static resources, or is there some other reason?
Source Address: Http://files.cnblogs.com/isline/sqltest.rar
Test data: Http://files.cnblogs.com/isline/Data.rar
I am Li (Aicken). You are welcome to follow my next article.