Data BULK INSERT MSSQL

Source: Internet
Author: User
Tags bulk insert mssql

Preface

Now there is a requirement is to insert 10w data into the MSSQL database, the table structure is as follows, what would you do, how long do you feel inserting 10W data into the MSSQL table below?

Or how does your bulk data get plugged in? I have a discussion on this issue today.

Test the MVC HTTP interface to see the data

First of all, here is just a reference to understand the performance of the inserted database, and the opening of the demand for no half-dime relationship.

The MVC interface code is as follows:

public bool Add (customerfeedbackentity m)        {            using (var conn=connection)            {                String sql = @ ' INSERT into [ DBO]. [Customerfeedback]                                           ([BusType]                                           , [Customerphone]                                           , [Backtype]                                           , [Content]                                          )                                     VALUES                                           (@BusType                                           , @ Customerphone                                           , @BackType                                           , @Content                                           ) ";                Return Conn. Execute (SQL, M) > 0;            }        }

This MVC interface is measured by a single data inserted into the aggregated data graph of the database.

Use cases like this: 5,000 requests in 500 threads execute the POST request interface.

This diagram tells us that the slowest request only took 4 milliseconds. So let's make an algorithm.

As for the opening demand, we use the minimum response time to calculate.

Then inserting 10w data into the database requires =100000*4 milliseconds, which is roughly 6.67 minutes. So we're running this goal to make the insertion plan.

The most common insert practices

First, our engineers get the requirements and write the code like this:

Executes the number of data bars int cnt = 10 * 10000; Data to be inserted customerfeedbackentity m = new Customerfeedbackentity () {BusType = 1, Customerphone = "1888888888", Backt        ype = 1, Content = "123123DAGVHKFHSDJK will be spoiled by the flourishing of the spoiled to protect the level of the high-quality price of the Su Tak-ha"};                The first public void Fristway () {using (var conn = new SqlConnection (connstr)) { Conn.                Open ();                Stopwatch SW = new Stopwatch (); Sw.                Start ();                StringBuilder sb = new StringBuilder ();                Console.WriteLine ("From:" + DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "Start loop execution:" + cnt + "Bar SQL statement ..."); for (int i = 0; I <= cnt; i++) {sb.                    Clear (); Sb. Append (@ "INSERT into [dbo].[ Customerfeedback] ([BusType], [custome                              Rphone], [Backtype]             , [Content]) VALUES ("); Sb.                    Append (M.bustype); Sb.                    Append (", '"); Sb.                    Append (M.customerphone); Sb.                    Append ("',"); Sb.                    Append (M.backtype); Sb.                    Append (", '"); Sb.                    Append (m.content); Sb.                    Append ("')"); using (SqlCommand cmd = new SqlCommand (SB).                        ToString (), conn)) {cmd.commandtimeout = 0; Cmd.                    ExecuteNonQuery (); }} Console.WriteLine (DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "when executed:" + cnt + "bar s QL Statement Complete! Time consuming: "+ SW." Elapsedmilliseconds + "milliseconds.            "); }        }

The results of the implementation are as follows:

10w data, 693906 milliseconds, 11 minutes, there is no feeling OK, or can accept. I'm spitting blood, I'm not talking, keep writing, you see where the MSSQL database and. NET mates are inserted?

Reviews under:

1, constantly create and release Sqlcommon objects, there will be performance waste.

2, keep the database to establish a connection, there will be a lot of performance loss.

This 2 o'clock and the results of the implementation tell us that this is not advisable, even if this is the most common way to insert data.

Then we optimize for the above two points, 1, create a Sqlcommon object, only to establish a connection with the database. The optimized retrofit code is as follows:

 public void Secondway () {using (var conn = new SqlConnection (connstr)) {con                N.open ();                Stopwatch SW = new Stopwatch (); Sw.                Start ();                StringBuilder sb = new StringBuilder ();                Console.WriteLine ("From:" + DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "Start loop stitching:" + cnt + "Bar SQL statement ..."); for (int i = 0; I <= cnt; i++) {sb. Append (@ "INSERT into [dbo].[ Customerfeedback] ([BusType], [custome                                          Rphone], [Backtype], [Content]                    ) VALUES ("); Sb.                    Append (M.bustype); Sb.                    Append (", '"); Sb.                    Append (M.customerphone); Sb.             Append ("',");       Sb.                    Append (M.backtype); Sb.                    Append (", '"); Sb.                    Append (m.content); Sb.                Append ("')"); } var result = sw.                Elapsedmilliseconds; Console.WriteLine (DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "when looping stitching:" + cnt + "BAR SQL statement complete! Time-consuming: "+ result +" milliseconds.                "); using (SqlCommand cmd = new SqlCommand (SB).                    ToString (), conn)) {cmd.commandtimeout = 0;                    Stopwatch SW1 = new Stopwatch (); SW1.                    Start ();                    Console.WriteLine ("From:" + DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "Start execution:" + cnt + "Bar SQL statement ..."); Cmd.                    ExecuteNonQuery (); Console.WriteLine (DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "execute:" + cnt + "BAR SQL statement complete! Time consuming: "+ SW1. Elapsedmilliseconds + "milliseconds.                "); }            }        }

The results of the implementation are as follows:

Yes, it's strange, why is it not so different from the previous plan?

First we look at how the concatenation of such a long SQL statement is implemented in the database.

1. View the connection of the database

SELECT * from sysprocesses where dbid in (select dbid from sysdatabases where name= ' dbname ')--or select * From[master]. [dbo]. [sysprocesses] WHERE [DBID] In (SELECT    [Dbid]from    [master].[ DBO]. [sysdatabases] WHERE    name= ' dbname ')

2. View the SQL statement that the database is executing

SELECT [Spid] = session_id, ecid, [Database] = db_name (sp.dbid), [User] = Nt_username                                           , [Status] = Er.status, [Wait] = Wait_type, [individual Query] = SUBSTRING (Qt.text, ER.STATEMENT_START_OFFSET/2, (Case WH                                                        EN Er.statement_end_offset = 1 Then LEN (CONVERT (NVARCHAR (MAX), Qt.text)) * 2 ELSE Er.statem                                           Ent_end_offset End-er.statement_start_offset)            /2), [Parent Query] = qt.text, program = program_name, hostname, Nt_domain, start_time from sys.dm_exec_requests er INNER joins sys.sysprocesses sp on ER.    session_id = Sp.spid        Cross APPLY Sys.dm_exec_sql_text (er.sql_handle) as QT WHERE session_id >-Ignore system SPIDs. and session_id not in (@ @SPID)--Ignore the current statement. ORDER by 1, 2

Reviews: Although it seems to be optimized, in fact, the implementation of the previous solution is almost the same, so there is no need to say anything more.

Optimized for user-defined table types in MSSQL database

The code is still on first, so you can be interested in user-defined table types.

CREATE TYPE customerfeedbacktemp as  TABLE (bustype int not null,customerphone varchar (+) not null,backtype int not NUL L,content nvarchar (+) not NULL)
 public void Thirdway () {Stopwatch SW = new Stopwatch ();            Stopwatch SW1 = new Stopwatch ();            DataTable dt = GetTable (); using (var conn = new SqlConnection (connstr)) {String sql = @ "INSERT into[dbo].[ Customerfeedback] ([BusType], [custome                                          Rphone], [Backtype], [Content]                ) Select Bustype,customerphone,backtype,[content] from @TempTb "; using (SqlCommand cmd = new SqlCommand (SQL, conn)) {cmd.                    CommandTimeout = 0; SqlParameter catparam = cmd.                    Parameters.addwithvalue ("@TempTb", DT);                    Catparam.sqldbtype = sqldbtype.structured; Catparam.typename = "dbo."                    Customerfeedbacktemp "; Conn.        Open ();            Console.WriteLine ("From:" + DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "start looping into the memory table:" + cnt + "bar data ..."); Sw.                    Start (); for (int i = 0; i < cnt; i++) {DataRow dr = dt.                        NewRow ();                        Dr[0] = M.bustype;                        DR[1] = M.customerphone;                        DR[2] = M.backtype;                        DR[3] = m.content; Dt.                    Rows.Add (DR); } Console.WriteLine (DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "when the loop is inserted into the memory table:" + cnt + "bar data complete! Time consuming: "+ SW." Elapsedmilliseconds + "milliseconds.                    "); SW1.                    Start (); if (dt! = null && dt. Rows.Count! = 0) {cmd.                        ExecuteNonQuery (); Sw.                    Stop (); } Console.WriteLine (DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "when executed:" + cnt + "bar data datatable data into the database! Time consuming: "+ SW1. Elapsedmilliseconds + "milliseconds.                "); }            }        }

Operation Result:

Wow catch Q, less than 2 seconds, less than 2 seconds, how much faster than every 4 milliseconds, can not believe, is not running out of the problem.

I'll do it again.

I'll do it again.

Yes, you didn't read it correctly, 10w data, less than 2 seconds. Can't wait to know why? Can't wait to know what type of user-defined table are we using?

User-defined table types

First type people should be very easy to understand, like int,varchar,bit, etc. are all types, then this table type is a yarn?

Actually, he is. Users can define a table structure themselves and then treat him as a type.

To create a detailed document for a custom type: https://msdn.microsoft.com/zh-cn/library/ms175007.aspx

Second, there are some restrictions on custom types, security: https://msdn.microsoft.com/zh-cn/library/bb522526.aspx

Then there is how to use this type, and his use is used as a table-valued parameter.

With table-valued parameters, you can send multiple rows of data to a Transact-SQL statement or routine, such as a stored procedure or function, without having to create a temporary table or many parameters.

Table-valued parameters are similar to parameter arrays in OLE DB and ODBC, but have greater flexibility and are more tightly integrated with Transact-SQL. Another advantage of table-valued parameters is the ability to participate in data-set-based operations.

Transact-SQL passes a table-valued parameter by referencing a usually to avoid creating a copy of the input data. You can use table-valued parameters to create and execute Transact-SQL routines, and you can invoke them from Transact-SQL code, managed clients, and native clients using any managed language.

Advantages

As with other parameters, the scope of a table-valued parameter is a stored procedure, function, or dynamic Transact-SQL text. Similarly, a table-type variable has scope just like any other local variable created using the DECLARE statement. Table-valued variables can be declared within dynamic Transact-SQL statements, and these variables can be passed as table-valued parameters to stored procedures and functions.

Table-valued parameters offer greater flexibility, and in some cases, better performance than temporary tables or other methods that pass parameter lists. Table-valued parameters have the following advantages:

    • Locks are not acquired when data is first populated from the client.

    • Provides a simple programming model.

    • Allows complex business logic to be included in a single routine.

    • Reduce the round-trip to the server.

    • A table structure that can have different cardinality.

    • is strongly typed.

    • Enables the client to specify a sort order and a unique key.

    • is cached as a temporary table when used for stored procedures. Starting with SQL Server 2012, table-valued parameters are also cached for parameterized queries.

Limit

Table-valued parameters have the following limitations:

    • SQL Server does not maintain statistical information for table-valued parameter columns.

    • The table-valued parameter must be passed as an input READONLY parameter to the Transact-SQL routine. DML operations such as UPDATE, DELETE, or INSERT cannot be performed on a table-valued parameter in the body of a routine.

    • You cannot use a table-valued parameter as the target of a SELECT into or INSERT EXEC statement. Table-valued parameters can be in the FROM clause of the SELECT into, or in the INSERT EXEC string or stored procedure.

Common BULK INSERT data set insertion optimization
public void Fourway () {Stopwatch SW = new Stopwatch ();            Stopwatch SW1 = new Stopwatch ();            DataTable dt = GetTable (); using (SqlConnection conn = new SqlConnection (connstr)) {SqlBulkCopy bulkcopy = new SqlBulkCopy                (conn);                bulkcopy.bulkcopytimeout = 0;                Bulkcopy.destinationtablename = "Customerfeedback"; bulkcopy.batchsize = dt.                Rows.Count; Conn.                Open ();                Console.WriteLine ("From:" + DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "start looping into the memory table:" + cnt + "bar data ..."); Sw.                Start (); for (int i = 0; i < cnt; i++) {DataRow dr = dt.                    NewRow ();                    Dr[0] = M.bustype;                    DR[1] = M.customerphone;                    DR[2] = M.backtype;                    DR[3] = m.content; Dt.                Rows.Add (DR); } Console.WriteLine (DatetIme. Now.tostring ("Yyyy-mm-dd HH:mm:ss FFF") + "when the loop is inserted into the memory table:" + cnt + "bar data complete! Time consuming: "+ SW." Elapsedmilliseconds + "milliseconds.                "); SW1.                Start (); if (dt! = null && dt.                    Rows.Count! = 0) {bulkcopy.writetoserver (dt); Sw.                Stop (); } Console.WriteLine (DateTime.Now.ToString ("Yyyy-mm-dd HH:mm:ss FFF") + "when executing:" + cnt + "data for DataTable data into data Library! Time consuming: "+ SW1. Elapsedmilliseconds + "milliseconds.            "); }

Execution Result:

1 seconds to complete, 1 seconds to complete, read this simply to complete in 1 seconds 10w data inserted rhythm, inverse day, inverse days ah.

Bulk Insert Explanation: https://msdn.microsoft.com/zh-cn/library/ms188365.aspx

Professional Reviews:

Table-valued parameters are used in a similar way to other data-set-based variables, but frequent use of table-valued parameters will be faster than large data sets. The start-up cost of a bulk operation is larger than the table-valued parameter, and the table-valued parameter performs well when inserting fewer than 1000 rows.

Reused table-valued parameters can benefit from the temporary table cache. This table caching feature provides better scalability than a peer-to-BULK INSERT operation. When you use a small row insert operation, you can obtain small performance improvements by using a parameter list or a bulk statement instead of a BULK insert operation or table-valued parameter. However, these methods are not very convenient to program, and as the rows increase, the performance decreases rapidly.

Table-valued parameters are fairly or even better than the equivalent parameter pattern implementations on the execution performance.

Data BULK INSERT MSSQL

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.