MySQL development performance research: INSERT, REPLACE, and INSERT-UPDATE performance comparison

Last Update:2014-09-26 Source: Internet

Author: User

I. Why we ran this experiment

Our system is a batch processing system, structured like a pipeline. Each data table is an endpoint of a pipe, and our programs are the pipes themselves. All we do is extract data from table A, apply some filtering, aggregation, and similar operations, and put the result into table B. If something goes wrong, we run the pipeline again. Our system therefore has no transactional requirements at all: when a job fails, we truncate the target table (or delete from it conditionally) and rerun the job.

This keeps SELECT statements relatively simple; they basically never need a join. Write operations, however, come with some requirements. For example, we have to handle duplicate primary keys (typically because the job has already run once and now has to be rerun): should the write report an error, replace the row, or update it?

After adopting MySQL, we found that it offers solutions to this kind of problem at the SQL statement level: INSERT, REPLACE, and INSERT ... ON DUPLICATE KEY UPDATE. See the MySQL reference manual for the details of each statement. The one thing to watch with INSERT ... ON DUPLICATE KEY UPDATE is that, inside the UPDATE clause, VALUES(col) means the value supplied for that column in the insert list. To reference the value already stored in the table, use the column name directly, without wrapping it in VALUES().
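The distinction between VALUES(col) and a bare column name can be sketched with a small statement builder. This is only an illustration: the function `build_upsert`, the table `page_stats`, and its columns are hypothetical names, and the `%s` placeholders assume the parameter style used by common Python MySQL drivers.

```python
def build_upsert(table, columns, update_exprs):
    """Build a single-row INSERT ... ON DUPLICATE KEY UPDATE statement.

    update_exprs maps a column name to the SQL expression assigned to it
    when a row with the same key already exists.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    assignments = ", ".join(
        f"{col} = {expr}" for col, expr in update_exprs.items()
    )
    return (
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {assignments}"
    )


# hits:  bare "hits" is the value already stored in the row,
#        VALUES(hits) is the value from the insert list, so this adds them.
# label: overwrite the stored value with the incoming one.
sql = build_upsert(
    "page_stats",  # hypothetical table
    ["id", "hits", "label"],
    {"hits": "hits + VALUES(hits)", "label": "VALUES(label)"},
)
print(sql)
```

The statement text is all that matters here; the actual values would be passed separately to the driver when the statement is executed.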

II. Preparation of the experiment

I again used the largest table available to us, which has nearly 200 columns. The experimental environment is the same as in the previous article. Following the comparison in that article, I went straight to multi-row inserts with 10 rows per statement, committing once every 5,000 rows. As a baseline, I also deliberately implemented the traditional insert-update approach: first perform a single-row INSERT, then check the result; if it reports a duplicate primary key error, issue an UPDATE with the same data to replace the row (that is, overwrite the original values directly). Note that this approach cannot use multi-row inserts.
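The traditional insert-update fallback described above might look roughly like this. To keep the sketch runnable without a server, it uses Python's built-in `sqlite3` as a stand-in for MySQL; the table `t`, its columns, and the function `insert_or_update` are hypothetical. With a MySQL driver the duplicate would surface as error 1062 (ER_DUP_ENTRY) rather than `sqlite3.IntegrityError`.

```python
import sqlite3


def insert_or_update(conn, row_id, payload):
    """Try a single-row INSERT; on a duplicate-key error, UPDATE instead.

    Returns "inserted" or "updated" so the caller can see which path ran.
    """
    try:
        conn.execute(
            "INSERT INTO t (id, payload) VALUES (?, ?)", (row_id, payload)
        )
        return "inserted"
    except sqlite3.IntegrityError:
        # Here the only constraint on t is the primary key, so this means
        # a duplicate key. A second statement overwrites the existing row.
        conn.execute(
            "UPDATE t SET payload = ? WHERE id = ?", (payload, row_id)
        )
        return "updated"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")

first = insert_or_update(conn, 1, "old")   # empty table: plain insert
second = insert_or_update(conn, 1, "new")  # same key: falls back to UPDATE
conn.commit()

stored = conn.execute("SELECT payload FROM t WHERE id = 1").fetchone()[0]
print(first, second, stored)
```

Every duplicate row costs two statements, and because the error has to be inspected row by row, the rows cannot be grouped into one multi-row statement.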

Again, to make the scenario more realistic, I created three databases on the same MySQL server, each containing this table, and ran every operation against all three tables. The tool I used is a class library I wrote myself. It connects to the databases from multiple threads (one connection per database); the main thread then sends an INSERT / REPLACE / INSERT-UPDATE / INSERT ... ON DUPLICATE KEY UPDATE command to all threads and waits for all of them to return before continuing. Each thread decides on its own when to commit, based on its accumulated count of affected rows.
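The author's class library is not shown, so the following is only a guess at its general shape. A thread pool stands in for the dedicated per-database threads, the `Worker` class does the commit bookkeeping without touching a real connection, and the names `Worker`, `broadcast`, and `commit_every`, as well as the row counts, are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


class Worker:
    """One worker per database; commits once enough rows have accumulated."""

    def __init__(self, name, commit_every):
        self.name = name
        self.commit_every = commit_every
        self.pending_rows = 0  # affected rows since the last commit
        self.commits = 0       # how many commits this worker has issued

    def run(self, command, affected_rows):
        # A real worker would execute `command` on its own connection here;
        # this stand-in only tracks the affected-row count.
        self.pending_rows += affected_rows
        if self.pending_rows >= self.commit_every:
            self.commits += 1      # stands in for connection.commit()
            self.pending_rows = 0
        return self.name, command


def broadcast(pool, workers, command, affected_rows):
    """Send one command to every worker and block until all have returned."""
    futures = [pool.submit(w.run, command, affected_rows) for w in workers]
    return [f.result() for f in futures]


workers = [Worker(f"db{i}", commit_every=5000) for i in (1, 2, 3)]
with ThreadPoolExecutor(max_workers=len(workers)) as pool:
    for _ in range(1000):  # 1000 statements of 10 rows each per database
        broadcast(pool, workers, "INSERT", affected_rows=10)

summary = [(w.name, w.commits, w.pending_rows) for w in workers]
print(summary)
```

Because `broadcast` waits for every future before returning, no worker ever handles two commands at once, which is why the counters need no locking in this sketch.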

Once more: the machine is weak, so the absolute TPS numbers mean little. Only the trend matters.

III. Experimental results

Description

Multi-row INSERT into an empty table: use "INSERT INTO ... VALUES (..), (..), (..), ..." to insert data into an empty table.
insert-update: on top of the previous step, first perform a single-row INSERT (one row per statement), then check the error output; if there is a duplicate primary key error, issue an UPDATE with the same data to replace the row (that is, overwrite the original values directly).
Multi-row REPLACE into an empty table: use "REPLACE INTO ... VALUES (..), (..), (..), ..." to insert data into an empty table.
insert-duplicate: on top of the previous step, use the "INSERT INTO ... VALUES (..), (..), (..), ... ON DUPLICATE KEY UPDATE ..." syntax.
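The multi-row statement forms listed above can be generated by a small helper like the one below. It is a sketch under assumptions: `build_multi_row`, the table `t`, and the columns `id` and `v` are hypothetical, and the `%s` placeholders again assume the parameter style of common Python MySQL drivers.

```python
def build_multi_row(verb, table, columns, row_count, on_duplicate=None):
    """Build a multi-row statement with one placeholder group per row.

    verb is "INSERT", "INSERT IGNORE" or "REPLACE". on_duplicate, if given,
    lists the columns to overwrite through ON DUPLICATE KEY UPDATE.
    """
    if on_duplicate and verb == "REPLACE":
        raise ValueError("REPLACE does not accept ON DUPLICATE KEY UPDATE")
    group = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = (
        f"{verb} INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([group] * row_count)
    )
    if on_duplicate:
        # VALUES(c) refers to the incoming value for column c.
        sql += " ON DUPLICATE KEY UPDATE " + ", ".join(
            f"{c} = VALUES({c})" for c in on_duplicate
        )
    return sql


insert_sql = build_multi_row("INSERT", "t", ["id", "v"], 3)
replace_sql = build_multi_row("REPLACE", "t", ["id", "v"], 2)
upsert_sql = build_multi_row("INSERT", "t", ["id", "v"], 2, on_duplicate=["v"])

for statement in (insert_sql, replace_sql, upsert_sql):
    print(statement)
```

The flattened row values would be passed as the parameter sequence when the statement is executed, so each statement carries several rows in one round trip.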

The conclusions are as follows:

For empty-table operations, REPLACE performs the same as INSERT, with the added benefit that it can overwrite existing rows. This gives us a hint: if we really do not care about duplicate key errors and want overwrite behavior, REPLACE is a good choice; if we do not care about duplicate key errors but do not want to overwrite either, INSERT IGNORE is the better fit.
The traditional insert-update approach is really slow, and the reason is not complicated: send a statement, wait for the reply, send another, wait again. Switch to INSERT ... ON DUPLICATE KEY UPDATE instead.

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
