Presumably every DBA would like to challenge data import times: the shorter the time, the higher the efficiency, and the better the proof of their skill. In real work we sometimes need to import a large amount of data into the database and then hand it over to various programs for calculation. This article presents an experiment that challenges a 4-second limit for getting one million rows into SQL Server.
This experiment uses five methods to complete the process and records in detail the time spent by each. The tools are Visual Studio 2008 and SQL Server 2000 / SQL Server 2008; each of the five methods imports 1 million rows into both SQL Server 2000 and SQL Server 2008. The experimental environment is a Dell 2850 server with dual 2.0 GHz CPUs and 2 GB of memory. Interested readers can download the source code and verify the timings themselves.
OK, so let's challenge the 4-second limit with five approaches: a basic INSERT statement, the BULK INSERT statement, BULK INSERT in multiple threads, the SqlBulkCopy class, and the SqlBulkCopy class in multiple threads. One more note: the SQL statements in this experiment are executed through the DataProvider module of the Isline Framework, which only reads and encapsulates the SQL configuration and has no essential impact on the final result; for more on the Isline Framework, refer to the Isline Framework series.
The databases are SQL Server 2000 and SQL Server 2008, the table name is TableB, and the field name is Value1. The database name can be modified in app.config; the default is test.
Method One. Using basic INSERT statements
This approach is the most basic one that most people think of first. But the INSERT statement does not seem well suited to high-volume operations, does it?
In this method, the 1 million rows are divided into 10 batches of 100,000, with one transaction per 100,000 rows, imported into the database in 10 rounds.
--> Basic statement:
INSERT INTO TableB (Value1) VALUES ('" + i + "')
Description: i in the statement is an incrementing variable in the host program that populates the value of the database field.
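For illustration, here is a minimal ADO.NET sketch of this batching loop. It is an assumption for clarity only: the article itself executes its SQL through the Isline Framework DataProvider, and the connString parameter and method name are hypothetical.

using System.Data.SqlClient;

static void InsertWithBasicStatements(string connString)
{
    using (SqlConnection conn = new SqlConnection(connString))
    {
        conn.Open();
        int i = 0;
        for (int batch = 0; batch < 10; batch++)                // 10 batches of 100,000 rows
        {
            using (SqlTransaction tx = conn.BeginTransaction()) // one transaction per batch
            {
                for (int n = 0; n < 100000; n++, i++)
                {
                    SqlCommand cmd = new SqlCommand(
                        "INSERT INTO TableB (Value1) VALUES ('" + i + "')", conn, tx);
                    cmd.ExecuteNonQuery();
                }
                tx.Commit();
            }
        }
    }
}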
SQL Server 2000 time consumed: 901599
SQL Server 2008 time consumed: 497638
Method Two. Using the BULK INSERT statement
Of the approaches in this experiment, this statement gave the most satisfying results: it is the simplest to use, flexible, and fast.
The BULK INSERT statement does not seem to be widely known. Aicken has heard that Oracle offers a way to map an external file to an Oracle temporary table and then import data directly from the temporary table into other Oracle tables, a very satisfying approach. Is SQL Server's BULK INSERT equally satisfying?
--> Basic statement:
BULK INSERT TableB FROM 'C:\sql.txt' WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '|', BATCHSIZE = 100000)
Description: "C:\sql.txt" is a previously generated file containing the 1 million rows of data, delimited by the "|" symbol, with one transaction per 100,000 rows.
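For illustration, a minimal ADO.NET sketch of issuing this statement from the host program follows. It is an assumption: the article runs its SQL through the Isline Framework DataProvider, so the plain SqlCommand, connString parameter, and timeout setting here are hypothetical.

using System.Data.SqlClient;

static void RunBulkInsert(string connString)
{
    using (SqlConnection conn = new SqlConnection(connString))
    {
        conn.Open();
        SqlCommand cmd = new SqlCommand(
            "BULK INSERT TableB FROM 'C:\\sql.txt' " +
            "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '|', BATCHSIZE = 100000)", conn);
        cmd.CommandTimeout = 0;   // assumption: disable the command timeout for a long import
        cmd.ExecuteNonQuery();
    }
}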
SQL Server 2000 time consumed: 4009
SQL Server 2008 time consumed: 10722
Method Three. Using BULK INSERT in multiple threads
Building on method two, the 1 million rows are divided among five threads, each responsible for 200,000 rows with one transaction per 50,000 rows; all five threads start at the same time. Let's see the effect.
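A minimal sketch of the threaded variant is below. It assumes the 1 million rows were pre-split into five files C:\sql0.txt through C:\sql4.txt of 200,000 rows each; the article does not describe how the rows were partitioned, so the file names and connString parameter are hypothetical.

using System.Data.SqlClient;
using System.Threading;

static void RunBulkInsertThreaded(string connString)
{
    Thread[] threads = new Thread[5];
    for (int t = 0; t < 5; t++)
    {
        int index = t;   // capture the loop variable for the thread delegate
        threads[t] = new Thread(delegate()
        {
            using (SqlConnection conn = new SqlConnection(connString))
            {
                conn.Open();
                // BATCHSIZE = 50000 gives one transaction per 50,000 rows, per the article
                SqlCommand cmd = new SqlCommand(
                    "BULK INSERT TableB FROM 'C:\\sql" + index + ".txt' " +
                    "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '|', BATCHSIZE = 50000)", conn);
                cmd.CommandTimeout = 0;
                cmd.ExecuteNonQuery();
            }
        });
        threads[t].Start();
    }
    foreach (Thread th in threads) th.Join();   // wait for all five imports to finish
}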
SQL Server 2000 time consumed: 21099
SQL Server 2008 time consumed: 10997
Method Four. Using the SqlBulkCopy class
This approach is also fast, but it depends on memory: for complex data with tens of millions of rows and many fields, memory consumption can be very high, although a 64-bit solution can be used to handle the problem.
The case of tens of millions of rows with many fields is typically encountered in certain business scenarios. For example, when computing one business cycle's consumption for global consumers, you first take a snapshot of the member consumption records in the master database table and store the snapshot in a temporary table, which the calculation program then uses. Sometimes the consumers' data is not on a single database server but spread across many servers in many countries, so we have to use memory or external storage devices to transfer the data, then clean, merge, and validate it, and finally import it into a dedicated table for the calculation program to use.
--> Basic statement:
using (System.Data.SqlClient.SqlBulkCopy sqlBC = new System.Data.SqlClient.SqlBulkCopy(conn))
{
    sqlBC.BatchSize = 100000;
    sqlBC.BulkCopyTimeout = 60;
    sqlBC.DestinationTableName = "dbo.TableB";
    sqlBC.ColumnMappings.Add("Valuea", "Value1");
    sqlBC.WriteToServer(dt);
}
Description:
BatchSize = 100000: commit one transaction per 100,000 rows
BulkCopyTimeout = 60: time out after 60 seconds
DestinationTableName = "dbo.TableB": import the data into the TableB table
ColumnMappings.Add("Valuea", "Value1"): map the in-memory Valuea column to the Value1 field in TableB
WriteToServer(dt): dt is a pre-built DataTable that contains the Valuea column
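For completeness, here is a minimal sketch of how the DataTable passed to WriteToServer might be constructed. The article does not show this step, so the column type and contents (the row counter stored as text, mirroring Method One) are assumptions.

using System.Data;

// Build the in-memory table that WriteToServer consumes.
DataTable dt = new DataTable();
dt.Columns.Add("Valuea", typeof(string));   // assumed column type
for (int i = 0; i < 1000000; i++)
{
    dt.Rows.Add(i.ToString());
}
// dt is then handed to sqlBC.WriteToServer(dt) in the block above.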
SQL Server 2000 time consumed: 4989
SQL Server 2008 time consumed: 10412
Method Five. Using the SqlBulkCopy class in multiple threads
Building on method four, the 1 million rows are divided among five threads, each responsible for 200,000 rows with one transaction per 50,000 rows; all five threads start at the same time. Let's see the effect.
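A minimal sketch of this threaded variant follows: five threads, each bulk-copying its own 200,000-row DataTable with a 50,000-row batch size. The connection handling, connString parameter, and DataTable contents are assumptions, not the article's exact code.

using System.Data;
using System.Data.SqlClient;
using System.Threading;

static void BulkCopyThreaded(string connString)
{
    Thread[] threads = new Thread[5];
    for (int t = 0; t < 5; t++)
    {
        int index = t;   // capture the loop variable for the thread delegate
        threads[t] = new Thread(delegate()
        {
            // each thread builds and writes its own 200,000-row slice
            DataTable dt = new DataTable();
            dt.Columns.Add("Valuea", typeof(string));
            for (int i = index * 200000; i < (index + 1) * 200000; i++)
            {
                dt.Rows.Add(i.ToString());
            }
            using (SqlConnection conn = new SqlConnection(connString))
            {
                conn.Open();
                using (SqlBulkCopy sqlBC = new SqlBulkCopy(conn))
                {
                    sqlBC.BatchSize = 50000;   // batches of 50,000 rows, per the article
                    sqlBC.DestinationTableName = "dbo.TableB";
                    sqlBC.ColumnMappings.Add("Valuea", "Value1");
                    sqlBC.WriteToServer(dt);
                }
            }
        });
        threads[t].Start();
    }
    foreach (Thread th in threads) th.Join();   // wait for all five copies to finish
}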
SQL Server 2000 time consumed: 7682
SQL Server 2008 time consumed: 10870
Results
It took a few days to finally complete the experiment. The somewhat disappointing finding is that SQL Server 2008's data import performance does not seem to be as good as we thought.