Efficiently reading and writing large amounts of MySQL data in C#: a detailed guide


Preface

The most common way to work with large amounts of MySQL data from C# is to select the data, process it in C#, and then insert it back into the database. In short, three steps: select -> process -> insert. For small amounts of data (a few million rows, or a few hundred megabytes) the whole job may be over in an hour at most. But for tens of millions of rows it may take days or even longer. So the question is: how do we optimize?

Step 1: Solving the read problem

There are many ways to talk to a database from C#. Let me list them:

1. "Heavy weapons: tank cannons". Use a heavyweight ORM framework such as EF (Entity Framework) or NHibernate.

2. "Light weapons: the AK-47". Use Dapper or PetaPoco, micro-ORMs that fit in a single .cs file. Flexible, efficient and easy to use; a household essential (I prefer PetaPoco :))

3. "Cold weapons: the dagger". Use the native Connection and Command, and write the raw SQL statements yourself.

Analysis:

The "heavy weapons" are ruled out straight away; they belong in large projects.

The "light weapons", Dapper and PetaPoco: read their source code and you will find that they use reflection. They do use IL emission and caching, but reading efficiency still suffers, so they are ruled out too.

That leaves the dagger: raw SQL, a DataReader for efficient reading, and column indexes rather than column names to fetch the values (faster).

The approximate code is as follows:

using MySql.Data.MySqlClient;

using (var conn = new MySqlConnection("Connection String ..."))
{
    conn.Open();
    // Raise the network timeouts here; otherwise a massive read can easily time out
    var c = new MySqlCommand("set net_write_timeout=9999999; set net_read_timeout=9999999", conn);
    c.ExecuteNonQuery();

    MySqlCommand rcmd = new MySqlCommand();
    rcmd.Connection = conn;
    // Backticks quote identifiers; single quotes would turn f1 and f2 into string literals
    rcmd.CommandText = @"select `f1`,`f2` from `table1`";
    // Set the execution timeout for the command
    rcmd.CommandTimeout = 99999999;
    var myData = rcmd.ExecuteReader();

    while (myData.Read())
    {
        // Fetch by column index: 0 is f1, 1 is f2
        var f1 = myData.GetInt32(0);
        var f2 = myData.GetString(1);
        // Do the data processing here ...
    }
}

So, how about that? The code is very primitive, and fetching values by column index is easy to get wrong. But it is all for the sake of performance.
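One way to keep index-based reads while lowering the risk of a wrong index is to resolve each column's ordinal once, before the loop, with `GetOrdinal`. The following is only a minimal sketch: it uses an in-memory `DataTable` in place of MySQL so that it runs without a server, and the class name `OrdinalDemo` and the sample rows are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class OrdinalDemo
{
    // Reads (f1, f2) rows from any IDataReader. Each column name is looked up
    // once, outside the loop, so the per-row reads still go by index.
    public static List<(int F1, string F2)> ReadAll(IDataReader reader)
    {
        int f1Ordinal = reader.GetOrdinal("f1");
        int f2Ordinal = reader.GetOrdinal("f2");

        var rows = new List<(int F1, string F2)>();
        while (reader.Read())
        {
            rows.Add((reader.GetInt32(f1Ordinal), reader.GetString(f2Ordinal)));
        }
        return rows;
    }

    // Builds a small in-memory table standing in for table1 (illustrative data only).
    public static DataTable SampleTable()
    {
        var table = new DataTable("table1");
        table.Columns.Add("f1", typeof(int));
        table.Columns.Add("f2", typeof(string));
        table.Rows.Add(1, "sss");
        table.Rows.Add(2, "bbbb");
        return table;
    }
}
```

Calling `OrdinalDemo.ReadAll(OrdinalDemo.SampleTable().CreateDataReader())` returns the two sample rows. Since a MySQL data reader is also an `IDataReader`, the same method should accept the result of `ExecuteReader()`, though that is not tested here.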

Step 2: Data processing

This step depends on your business requirements, so the code will differ from case to case. It usually comes down to string processing and type conversion, which tests your C# fundamentals, plus knowing how to write efficient regular expressions.

I can't write the specific code for you. Read CLR via C# first and then come and discuss it with me, O(∩_∩)O haha~ Skipping ahead...

Step 3: Data insertion

How do you bulk insert most efficiently? Some readers will say: use a transaction, with BeginTransaction at the start and a commit at the end. That does improve insert throughput. But there is a more efficient way: merging INSERT statements.

So how do we merge?

insert into table1 (f1,f2) values (1,'sss'),(2,'bbbb'),(3,'cccc');

That is, write the VALUES keyword once, join all the value tuples after it with commas, and execute the result as a single statement.
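The merge can be sketched in a few lines of C#. This is a minimal illustration, not the author's code: the class `MergedInsert` and the `Escape` helper are hypothetical, and the escaping assumes MySQL's default mode, where a backslash is the escape character. For real data, the connector's own escaping or parameters are the safer choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MergedInsert
{
    // Hypothetical escaping helper: backslash-escapes backslashes and single
    // quotes so the value can sit inside a '...' SQL string literal.
    public static string Escape(string value) =>
        value.Replace("\\", "\\\\").Replace("'", "\\'");

    // Joins all rows into one INSERT statement with a single VALUES keyword.
    public static string Build(IEnumerable<(int F1, string F2)> rows)
    {
        var tuples = rows.Select(r => $"({r.F1},'{Escape(r.F2)}')");
        return "insert into table1 (`f1`,`f2`) values " + string.Join(",", tuples) + ";";
    }
}
```

For the three rows (1, "sss"), (2, "bbbb") and (3, "cccc"), `Build` returns insert into table1 (`f1`,`f2`) values (1,'sss'),(2,'bbbb'),(3,'cccc');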

Of course, you cannot submit 100 MB of SQL in one go: the MySQL server limits the size of each statement it will accept. The limit is the server's max_allowed_packet variable, which you can inspect with show variables like 'max_allowed_packet'; the default is 1 MB on older server versions and larger on recent ones.
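Note that the limit is measured in bytes, while a C# string's Length counts UTF-16 characters. A rough way to compare a statement against the limit is to count its encoded bytes. The sketch below assumes the connection uses a UTF-8 character set; the class `PacketSize` is a made-up name, and the packet also carries a little protocol overhead, so it is wise to leave some headroom.

```csharp
using System;
using System.Text;

public static class PacketSize
{
    // Number of bytes the statement occupies when encoded as UTF-8
    // (assumption: the connection character set is utf8 or utf8mb4).
    public static int Utf8Bytes(string sql) => Encoding.UTF8.GetByteCount(sql);

    // True when the encoded statement fits within the given byte budget.
    public static bool Fits(string sql, int maxBytes) => Utf8Bytes(sql) <= maxBytes;
}
```

For example, `"中文".Length` is 2, but `PacketSize.Utf8Bytes("中文")` is 6, so a batch full of non-ASCII text can be several times larger on the wire than its character count suggests.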

Let's take a look at the pseudocode.

// Use a StringBuilder for efficient string concatenation
var sqlBuilder = new StringBuilder();
// Add the INSERT header
string sqlHeader = "insert into table1 (`f1`,`f2`) values ";
sqlBuilder.Append(sqlHeader);
using (var conn = new MySqlConnection("Connection String ..."))
{
    conn.Open();
    // Raise the network timeouts here; otherwise a massive read can easily time out
    var c = new MySqlCommand("set net_write_timeout=9999999; set net_read_timeout=9999999", conn);
    c.ExecuteNonQuery();
    MySqlCommand rcmd = new MySqlCommand();
    rcmd.Connection = conn;
    rcmd.CommandText = @"select `f1`,`f2` from `table1`";
    // Set the execution timeout for the command
    rcmd.CommandTimeout = 99999999;
    var myData = rcmd.ExecuteReader();
    while (myData.Read())
    {
        var f1 = myData.GetInt32(0);
        var f2 = myData.GetString(1);
        // Do the data processing here ...
        sqlBuilder.AppendFormat("({0},'{1}'),", f1, AddSlash(f2)); // AddSlash: placeholder for a string-escaping helper
        if (sqlBuilder.Length >= 1024 * 1024) // of course a 1 MB string is not the same as a 1 MB packet ... I know :)
        {
            // Remove the trailing comma, then execute. insertCmd is a placeholder; it needs its own connection, because conn is busy with the open reader
            insertCmd.Execute(sqlBuilder.Remove(sqlBuilder.Length - 1, 1).ToString());
            sqlBuilder.Clear().Append(sqlHeader); // empty the builder and add the INSERT header again
        }
    }
    // After the loop, any rows still left in sqlBuilder need one last execute
}
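The batching logic can be separated from the database calls so that it can be checked on its own. The sketch below is one possible arrangement, not the author's code: `InsertBatcher` and its parameters are hypothetical names. It flushes before a statement would exceed the limit rather than after, and it also emits the final partial batch once the input is exhausted.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class InsertBatcher
{
    // Splits pre-formatted value tuples such as "(1,'sss')" into INSERT
    // statements. A statement only exceeds maxChars when a single tuple
    // plus the header is already longer than the limit.
    public static IEnumerable<string> Batch(string header, IEnumerable<string> tuples, int maxChars)
    {
        var sb = new StringBuilder(header);
        bool hasRows = false;
        foreach (var tuple in tuples)
        {
            // Flush first if adding this tuple (plus a comma) would pass the limit.
            if (hasRows && sb.Length + 1 + tuple.Length > maxChars)
            {
                yield return sb.ToString();
                sb.Clear().Append(header);
                hasRows = false;
            }
            if (hasRows) sb.Append(',');
            sb.Append(tuple);
            hasRows = true;
        }
        // Final flush: emit whatever is left after the last row.
        if (hasRows) yield return sb.ToString();
    }
}
```

Each string the method yields is a complete statement, which could then be executed on a connection separate from the one that holds the open reader.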

All right, that completes the optimized read and insert.

Summary

To sum up, there are just two key techniques here: DataReader and merged SQL statements, both of them old technology. The code above can fairly be called efficient, but it is far from elegant. That is all for this article; I hope it helps, and if you have questions, leave a message.
