A Detailed Performance Comparison of SQL Batch Data Insertion Solutions


Company technical background: the database access class (xxx.DataBase.dll) calls stored procedures to access the database.

Technical solution 1:

The first version, written by the programmer under a tight deadline, simply completed the task with no optimization in the program: it uses the data-access class to call a stored procedure and inserts the records one by one in a loop. Obviously this is not very efficient, so two colleagues ended up discussing the problem of its low efficiency.

Technical solution 2:

Given the large data volume of the batch inserts, I thought of a feature introduced in ADO.NET 2.0: SqlBulkCopy. I tested its performance personally a long time ago, and its efficiency is very high. This is also the technical solution I recommended to my colleagues.

Technical solution 3:

Use a new feature of SQL Server 2008: table-valued parameters (TVPs). Table-valued parameters are available only from SQL Server 2008 onward. With this feature, we can pass a table type as a parameter to a function or stored procedure. Note one characteristic: table-valued parameters perform well when the number of inserted rows is below about 1,000.

Technical solution 4:

For a table with a single-column key, the data to be inserted can be spliced into one string, which the stored procedure then splits back into individual values and inserts one by one. Check the maximum length of a string parameter in a stored procedure and divide it by the length of the field to work out how many rows fit in one call; the result clearly meets the requirement. However, compared with the first method this does not seem to be an improvement, because the underlying principle is the same.
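As a rough illustration of this splicing approach (a sketch of my own, not code from the original project; the procedure name and the comma delimiter are hypothetical): the client joins the keys into one string, and the procedure splits it with CHARINDEX, since SQL Server 2008 has no built-in STRING_SPLIT.

```sql
-- Hypothetical sketch of technical solution 4: split a comma-joined
-- string of keys and insert them one by one (procedure name is illustrative).
CREATE PROCEDURE CreatePassportsFromList
    @List nvarchar(max)   -- e.g. N'key1,key2,key3'
AS
BEGIN
    SET NOCOUNT ON;
    DECLARE @Pos int = CHARINDEX(',', @List);
    WHILE @Pos > 0
    BEGIN
        INSERT INTO Passport (PassportKey) VALUES (LEFT(@List, @Pos - 1));
        SET @List = SUBSTRING(@List, @Pos + 1, LEN(@List));
        SET @Pos = CHARINDEX(',', @List);
    END
    IF LEN(@List) > 0   -- last (or only) key after the final delimiter
        INSERT INTO Passport (PassportKey) VALUES (@List);
END
```

Note that the server still executes one INSERT per row, which is why this behaves like solution 1.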

Technical solution 5:

Consider asynchronous submission with a message queue. This solution is difficult in both design and development.

Technical solution 1 had to be rejected, so what remained was a choice between technical solutions 2 and 3. Given the company's current situation, technical solutions 4 and 5 were not considered for now.

Next I will go into more detail about creating and calling table-valued parameters. The article may run a little long; readers who do not care about the details can skim.

Now for the test plan. There are three test groups: one with 1,000 inserts and two with larger volumes (10,000 and 1,000,000 rows respectively). Each group is run 10 times and the average is taken. Let's go!

1. Create a table.

For simplicity, the table has only one field.
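The original screenshot is not included; judging from the table-valued type and stored procedure later in the article, the table presumably looks like this (a reconstruction, not the author's exact script):

```sql
USE [TestInsert]
GO
-- Single-column test table; the column matches PassportTableType below
CREATE TABLE Passport
(
    PassportKey nvarchar(50)
)
GO
```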

2. Create the table-valued parameter type

Open the query analyzer and execute the following code in the query Analyzer:

CREATE TYPE PassportTableType AS TABLE
(
    PassportKey nvarchar(50)
)

After the statement executes successfully, open Management Studio and expand, in order: Databases → [the database] → Programmability → Types → User-Defined Table Types. There you can see the table-valued type we just created.

The table-valued type has now been created successfully.
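If you prefer not to click through the object tree, the type can also be confirmed from the query analyzer with a standard catalog-view query (my addition, not from the original article):

```sql
-- List the user-defined table types in the current database
SELECT name
FROM sys.types
WHERE is_table_type = 1;
```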

3. Write the Stored Procedure

The Stored Procedure Code is:
USE [TestInsert]
GO
/****** Object: StoredProcedure [dbo].[CreatePassportWithTVP] Script Date: 03/02/2010 00:14:45 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:      <Kevin>
-- Create date: <2010-3-1>
-- Description: <create a passport>
-- =============================================
CREATE PROCEDURE [dbo].[CreatePassportWithTVP]
    @TVP PassportTableType READONLY
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO Passport (PassportKey) SELECT PassportKey FROM @TVP
END

In the query analyzer, the table-valued type may be flagged as incorrect with a red underline. Ignore this and run the code anyway; the stored procedure will be created.
 
4. Write code to call the stored procedure.

The code for all three insertion methods is given below. Because time was tight, the code may not be very readable, but I have added some comments.
using System;
using System.Diagnostics;
using System.Data;
using System.Data.SqlClient;
using com.DataAccess;

namespace ConsoleAppInsertTest
{
    class Program
    {
        static string connectionString = SqlHelper.ConnectionStringLocalTransaction; // database connection string
        static int count = 1000000;                                                  // number of rows to insert

        static void Main(string[] args)
        {
            //long commonInsertRunTime = CommonInsert();
            //Console.WriteLine(string.Format("Inserting {1} rows one by one took {0} ms", commonInsertRunTime, count));
            long sqlBulkCopyInsertRunTime = SqlBulkCopyInsert();
            Console.WriteLine(string.Format("Inserting {1} rows with SqlBulkCopy took {0} ms", sqlBulkCopyInsertRunTime, count));
            long TVPInsertRunTime = TVPInsert();
            Console.WriteLine(string.Format("Inserting {1} rows with a table-valued parameter (TVP) took {0} ms", TVPInsertRunTime, count));
        }

        /// <summary>
        /// Insert rows one by one through the stored procedure
        /// </summary>
        /// <returns>elapsed milliseconds</returns>
        private static long CommonInsert()
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            string passportKey;
            for (int i = 0; i < count; i++)
            {
                passportKey = Guid.NewGuid().ToString();
                SqlParameter[] sqlParameter = { new SqlParameter("@passport", passportKey) };
                SqlHelper.ExecuteNonQuery(connectionString, CommandType.StoredProcedure, "CreatePassport", sqlParameter);
            }
            stopwatch.Stop();
            return stopwatch.ElapsedMilliseconds;
        }

        /// <summary>
        /// Insert rows with SqlBulkCopy
        /// </summary>
        /// <returns>elapsed milliseconds</returns>
        private static long SqlBulkCopyInsert()
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            DataTable dataTable = GetTableSchema();
            string passportKey;
            for (int i = 0; i < count; i++)
            {
                passportKey = Guid.NewGuid().ToString();
                DataRow dataRow = dataTable.NewRow();
                dataRow[0] = passportKey;
                dataTable.Rows.Add(dataRow);
            }
            SqlBulkCopy sqlBulkCopy = new SqlBulkCopy(connectionString);
            sqlBulkCopy.DestinationTableName = "Passport";
            sqlBulkCopy.BatchSize = dataTable.Rows.Count;
            SqlConnection sqlConnection = new SqlConnection(connectionString);
            sqlConnection.Open();
            if (dataTable != null && dataTable.Rows.Count != 0)
            {
                sqlBulkCopy.WriteToServer(dataTable);
            }
            sqlBulkCopy.Close();
            sqlConnection.Close();
            stopwatch.Stop();
            return stopwatch.ElapsedMilliseconds;
        }

        /// <summary>
        /// Insert rows with a table-valued parameter
        /// </summary>
        /// <returns>elapsed milliseconds</returns>
        private static long TVPInsert()
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            DataTable dataTable = GetTableSchema();
            string passportKey;
            for (int i = 0; i < count; i++)
            {
                passportKey = Guid.NewGuid().ToString();
                DataRow dataRow = dataTable.NewRow();
                dataRow[0] = passportKey;
                dataTable.Rows.Add(dataRow);
            }
            SqlParameter[] sqlParameter = { new SqlParameter("@TVP", dataTable) };
            SqlHelper.ExecuteNonQuery(connectionString, CommandType.StoredProcedure, "CreatePassportWithTVP", sqlParameter);
            stopwatch.Stop();
            return stopwatch.ElapsedMilliseconds;
        }

        private static DataTable GetTableSchema()
        {
            DataTable dataTable = new DataTable();
            dataTable.Columns.AddRange(new DataColumn[] { new DataColumn("PassportKey") });
            return dataTable;
        }
    }
}

The key code is really just the following two lines, which pass a DataTable as a parameter to our stored procedure. Simple.

SqlParameter[] sqlParameter = { new SqlParameter("@TVP", dataTable) };
SqlHelper.ExecuteNonQuery(connectionString, CommandType.StoredProcedure, "CreatePassportWithTVP", sqlParameter);
5. Test and record test results
The first group of tests: 1,000 rows inserted.

The second group of tests: 10,000 rows inserted.

The third group of tests: 1,000,000 rows inserted.

From the tests above it is not difficult to see that technical solution 2 still has a clear advantage, in terms of both versatility and performance. Technical solution 3 is also more complex to apply: if we had to create a table-valued type for every table, the workload would be considerable. I therefore stand by my original decision and recommend technical solution 2 to the company.

This article ends here, but the research into new technology continues; there is still a lot to do.

To make learning and exchange easier, the code files have been packaged and uploaded; you are welcome to study and discuss them.
Author: Shen Shan laolin
Source: http://wlb.cnblogs.com/
