Experience using SqlBulkCopy for massive data import
Reprinted article; original address: http://www.cnblogs.com/mobydick/archive/2011/08/28/2155983.html
Recently, thanks to the laziness of a previous designer, I have had to deal with a table whose extended information is stored in a "key-value" table rather than in normalized form. For example:
Each record in the master table has about 60 "keys". That is, every master-table entry requires 60 rows to be inserted into the subtable.
The master table is expected to eventually hold about 200,000 records, which means 200,000 x 60 = 12 million records in the subtable. And we have four such "Master-Sub" table pairs in total. Leaving aside the query efficiency of these tables, just importing the data once a day is already a huge challenge for us.
Technorati Tags: SqlBulkCopy
Speaking of which, let me vent about the "designer" who turned what should have been a 100,000-record database into one generating tens of millions of junk records. Worse still, the original import scheme generated an INSERT statement for every single record and executed them one at a time with ExecuteNonQuery. The result: a test with 3,000 master-table records (3,000 x 60 x 4 = 720,000 subtable rows) took 7 hours to complete, about 30 records per second. That designer later resigned; his replacement resigned in turn, and the project then fell to the next person in line.
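For reference, the row-by-row scheme would have looked roughly like this (a minimal sketch; the connection string, table name, and column names are my own placeholders, not from the original project):

```csharp
using System.Data.SqlClient;

// Slow approach: one INSERT statement and one server round-trip per record.
// connectionString, subTableRecords, SubTable etc. are placeholders.
using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    foreach (var item in subTableRecords)   // ~720,000 iterations in the test
    {
        using (var cmd = new SqlCommand(
            "INSERT INTO SubTable (MasterId, [Key], [Value]) VALUES (@id, @k, @v)",
            conn))
        {
            cmd.Parameters.AddWithValue("@id", item.MasterId);
            cmd.Parameters.AddWithValue("@k", item.Key);
            cmd.Parameters.AddWithValue("@v", item.Value);
            cmd.ExecuteNonQuery();          // ~30 records/second overall
        }
    }
}
```

The per-row round-trip latency, not the INSERT itself, is what dominates at this scale.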
After the project landed in my hands, I insisted on redesigning the software architecture. But because the database is also used by another subsystem, I could not change its schema, so I had to find an efficient insertion method instead.
At first I tried multithreading: with 10 threads, performance improved to about 300 records per second, and inserting the test records took about 40 minutes. But the production data is 60 times larger, and a 40-hour run clearly cannot meet our daily import requirement.
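A sketch of that multithreaded variant, assuming the records have been pre-partitioned into 10 chunks (all names here are placeholders, not from the original post):

```csharp
using System.Data.SqlClient;
using System.Linq;
using System.Threading.Tasks;

// Ten workers, each with its own connection, but each still issuing one
// INSERT per row -- so throughput only rose to ~300 records/second.
var tasks = Enumerable.Range(0, 10).Select(i => Task.Run(() =>
{
    using (var conn = new SqlConnection(connectionString))
    {
        conn.Open();
        foreach (var item in chunks[i])   // chunks: pre-partitioned records
        {
            using (var cmd = new SqlCommand(
                "INSERT INTO SubTable (MasterId, [Key], [Value]) VALUES (@id, @k, @v)",
                conn))
            {
                cmd.Parameters.AddWithValue("@id", item.MasterId);
                cmd.Parameters.AddWithValue("@k", item.Key);
                cmd.Parameters.AddWithValue("@v", item.Value);
                cmd.ExecuteNonQuery();
            }
        }
    }
})).ToArray();
Task.WaitAll(tasks);
```

Parallelism only multiplies the per-row round-trip approach; it does not change its fundamental cost.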
Through Google, I found SqlBulkCopy.
After testing, I was very satisfied with its performance: about 4,000 records per second. Good enough to use for now; the focus of the next stage is to kill off the "key-value" tables altogether.
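A minimal SqlBulkCopy sketch of the approach (table and column names are hypothetical placeholders): build the rows in memory as a DataTable, then stream them to the server in one bulk operation.

```csharp
using System.Data;
using System.Data.SqlClient;

// Stage the subtable rows in memory first.
var table = new DataTable();
table.Columns.Add("MasterId", typeof(int));
table.Columns.Add("Key", typeof(string));
table.Columns.Add("Value", typeof(string));
foreach (var item in subTableRecords)     // placeholder source collection
    table.Rows.Add(item.MasterId, item.Key, item.Value);

// One bulk operation instead of one round-trip per row.
using (var bulk = new SqlBulkCopy(connectionString))
{
    bulk.DestinationTableName = "SubTable";
    // Map columns by name explicitly to avoid ordinal-mismatch surprises.
    bulk.ColumnMappings.Add("MasterId", "MasterId");
    bulk.ColumnMappings.Add("Key", "Key");
    bulk.ColumnMappings.Add("Value", "Value");
    bulk.WriteToServer(table);
}
```

For very large imports, WriteToServer can also be fed an IDataReader instead of a fully materialized DataTable to keep memory usage flat.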
During use, I also ran into some "inexplicable" problems. I note them down here for future reference.
- 2. The default timeout of SqlBulkCopy is 30 seconds. At 4,000 records per second, 30 x 4,000 = 120,000 rows is nowhere near enough for a one-shot insert into a single table, so I changed it to 3,600 seconds.
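That setting is a single property on the SqlBulkCopy instance (a fragment; `bulk` stands for an already-configured SqlBulkCopy object):

```csharp
// Default BulkCopyTimeout is 30 seconds; at ~4,000 records/second that
// covers only 30 x 4,000 = 120,000 rows. Raise it for multi-million-row
// imports; a value of 0 means no timeout at all.
bulk.BulkCopyTimeout = 3600;   // seconds
```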