Comparing and testing the efficiency of inserting massive amounts of data into a database

Source: Internet
Author: User


Abstract: Inserting massive amounts of data into a database with .NET-related technologies is a common operation. This article compares ADO.NET and LINQ to SQL, using the SqlBulkCopy() and InsertAllOnSubmit() methods respectively, and concludes that for the same insert workload (200 records of data) the two approaches differ dramatically in insertion efficiency!

 

Test scenario:

Prepare two databases, TestDb and TestDb2, each containing a table named T_Users, with the structure shown below:
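Judging from the code that follows, T_Users has two columns, ID and UserName. A minimal sketch of the corresponding in-memory class and LINQ to SQL entity (the property types and mapping attributes here are assumptions; the real entity would normally be generated by the O/R designer as part of DataClasses1DataContext) might look like this:

    // Plain class held in memory and fed to the SqlBulkCopy test.
    public class UserInMemory
    {
        public int ID { get; set; }
        public string UserName { get; set; }
    }

    // LINQ to SQL entity mapped to the T_Users table (sketch only).
    [System.Data.Linq.Mapping.Table(Name = "dbo.T_Users")]
    public class T_User
    {
        [System.Data.Linq.Mapping.Column(IsPrimaryKey = true)]
        public int ID { get; set; }

        [System.Data.Linq.Mapping.Column]
        public string UserName { get; set; }
    }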


The SqlBulkCopy() insert is implemented as follows:

    private static Stopwatch InsertUsingSqlBulkCopy(List<UserInMemory> list)
    {
        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();

        // Copy the in-memory list into a DataTable that SqlBulkCopy can consume.
        DataTable dt = new DataTable();
        dt.Columns.Add("ID");
        dt.Columns.Add("UserName");
        for (int i = 0; i < list.Count; i++)
        {
            DataRow row = dt.NewRow();
            row["ID"] = list[i].ID;
            row["UserName"] = list[i].UserName;
            dt.Rows.Add(row);
        }

        using (SqlConnection con = new SqlConnection(connStr2))
        {
            con.Open();
            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(con))
            {
                try
                {
                    bulkCopy.DestinationTableName = "dbo.T_UserName";
                    bulkCopy.ColumnMappings.Add("ID", "ID");
                    bulkCopy.ColumnMappings.Add("UserName", "UserName");
                    bulkCopy.WriteToServer(dt);   // one bulk write instead of per-row INSERTs
                }
                catch (Exception ex)
                {
                    Console.WriteLine(ex.ToString());
                }
                finally { con.Close(); }
            }
        }

        stopWatch.Stop();
        return stopWatch;
    }

The LINQ to SQL insert using InsertAllOnSubmit() is implemented as follows:

    private static Stopwatch InsertUsingLINQ(List<T_User> _list)
    {
        Stopwatch stopWatch = new Stopwatch();
        stopWatch.Start();

        // Queue all entities for insertion; LINQ to SQL generates one INSERT per row on submit.
        DataClasses1DataContext dataContext = new DataClasses1DataContext();
        dataContext.T_Users.InsertAllOnSubmit(_list);
        dataContext.SubmitChanges();

        stopWatch.Stop();
        return stopWatch;
    }

The preceding code was used to insert 10,000 User records into the database.
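For reference, a minimal test driver along the following lines (the method name, the data-generation loop, and the console output are assumptions added here, not part of the original post) can build the test data and time both methods:

    // Hypothetical driver: build 10,000 users, then time both insert paths.
    private static void RunInsertComparison()
    {
        var memoryList = new List<UserInMemory>();
        var linqList = new List<T_User>();
        for (int i = 1; i <= 10000; i++)
        {
            memoryList.Add(new UserInMemory { ID = i, UserName = "User" + i });
            linqList.Add(new T_User { ID = i, UserName = "User" + i });
        }

        Stopwatch bulkCopyWatch = InsertUsingSqlBulkCopy(memoryList);
        Stopwatch linqWatch = InsertUsingLINQ(linqList);

        Console.WriteLine("SqlBulkCopy:       {0} ms", bulkCopyWatch.ElapsedMilliseconds);
        Console.WriteLine("InsertAllOnSubmit: {0} ms", linqWatch.ElapsedMilliseconds);
    }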

The result is as follows:


Conclusion:

1. In this test, the SqlBulkCopy() method under ADO.NET was about 266 times faster at inserting the data than the InsertAllOnSubmit() method under LINQ to SQL!

2. LINQ to SQL is convenient to use, needs little code, has a low learning cost, and is elegant to write. However, the ADO.NET approach is recommended for massive-data operations where insertion efficiency matters.


Related Source: http://download.csdn.net/detail/fanrong1985/8130953


How can we improve efficiency when analyzing and summarizing massive data?

I have seen a multinational company use preprocessing for this: after the customer sets the time period, the system computes the desired data in advance, generally running about three hours ahead of schedule, and the resulting performance is impressive.

In BI, the standard techniques are data compression, time bucketing, table sharding, and pre-aggregation.
For example, the raw sales log is compressed into one sales record per product per day.
That way, 10,000 products yield 10,000 rows per day, or at most about 3.66 million rows for a full year, which a database can work through in about five seconds.

The principle, therefore, is to compress the data down to the smallest unit of measurement the queries need, and to aggregate it up to the largest output granularity possible.
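As a rough illustration of that principle (the SalesLogEntry and DailyProductSales types and their property names are invented for this sketch), the compression step can be written as a group-by that rolls the raw sales log up to one row per product per day:

    // Hypothetical raw log record: one row per individual sale.
    public class SalesLogEntry
    {
        public int ProductId { get; set; }
        public DateTime SoldAt { get; set; }
        public decimal Amount { get; set; }
    }

    // Aggregated row: one record per product per day, the unit the reports query.
    public class DailyProductSales
    {
        public int ProductId { get; set; }
        public DateTime Day { get; set; }
        public decimal TotalAmount { get; set; }
        public int SaleCount { get; set; }
    }

    public static List<DailyProductSales> CompressToDaily(IEnumerable<SalesLogEntry> log)
    {
        return log
            .GroupBy(e => new { e.ProductId, Day = e.SoldAt.Date })
            .Select(g => new DailyProductSales
            {
                ProductId = g.Key.ProductId,
                Day = g.Key.Day,
                TotalAmount = g.Sum(e => e.Amount),
                SaleCount = g.Count()
            })
            .ToList();
    }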

As for caching, drill-down into the data only becomes available after the data has been processed in this way.

Essentially, it comes down to dedicated tables built for dedicated analyses.

Database insertion efficiency

It is fastest to insert all the data in one batch; best of all is to write the data out to a file and then bulk-import the file. If records are inserted one at a time, the database performs syntax parsing, execution plan generation, and statement execution for every single record, so the efficiency is very low. The overhead is not noticeable with small data volumes, but it becomes very obvious with large ones.
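A rough sketch of the contrast (the table name is reused from the test above; the method names and parameter sizes are assumptions): the first method pays a round trip and a statement execution for every record, while the second reuses one prepared, parameterized command inside a single transaction; a file-based bulk import or SqlBulkCopy, as shown earlier, is faster still.

    // Slow path: a new command and a round trip for every single record.
    private static void InsertRowByRow(List<UserInMemory> list, string connStr)
    {
        using (var con = new SqlConnection(connStr))
        {
            con.Open();
            foreach (var user in list)
            {
                using (var cmd = new SqlCommand(
                    "INSERT INTO dbo.T_Users (ID, UserName) VALUES (@id, @name)", con))
                {
                    cmd.Parameters.AddWithValue("@id", user.ID);
                    cmd.Parameters.AddWithValue("@name", user.UserName);
                    cmd.ExecuteNonQuery();
                }
            }
        }
    }

    // Faster path: one prepared command and one transaction shared by all rows.
    private static void InsertInOneTransaction(List<UserInMemory> list, string connStr)
    {
        using (var con = new SqlConnection(connStr))
        {
            con.Open();
            using (var tran = con.BeginTransaction())
            using (var cmd = new SqlCommand(
                "INSERT INTO dbo.T_Users (ID, UserName) VALUES (@id, @name)", con, tran))
            {
                var idParam = cmd.Parameters.Add("@id", SqlDbType.Int);
                var nameParam = cmd.Parameters.Add("@name", SqlDbType.NVarChar, 50);
                cmd.Prepare();
                foreach (var user in list)
                {
                    idParam.Value = user.ID;
                    nameParam.Value = user.UserName;
                    cmd.ExecuteNonQuery();
                }
                tran.Commit();
            }
        }
    }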
