MySQL Development Performance Research: INSERT, REPLACE, and INSERT-UPDATE Performance Comparison

Source: Internet
Author: User

I. Why we ran this experiment

Our system is a batch processing system with a pipeline-like architecture. Each data table sits at the end of a pipe, and our programs are the pipes themselves. All we do is extract data from table A, apply some filtering, aggregation, and similar operations, and place the result in table B. If an error occurs, we run the pipeline again. Our system therefore has no real transactional requirements: when a run fails, we truncate the table (or delete conditionally) and re-run.

This keeps SELECT statements relatively simple, since they basically need no joins. Write operations, however, come with some requirements. For example, we have to handle duplicate primary keys (the job may have run before and now needs to be re-run): should the statement report an error, replace the row, or update it?

After adopting MySQL, we found that it offers solutions to this kind of problem at the SQL statement level: INSERT, REPLACE, and INSERT ... ON DUPLICATE KEY UPDATE. The MySQL documentation describes each of them in detail. The only thing to watch out for is INSERT ... ON DUPLICATE KEY UPDATE: in the UPDATE clause, VALUES(col) refers to the value given in the INSERT list. To reference the value already stored in the table, use the column name directly, without wrapping it in VALUES().
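
The difference between VALUES(col) and a bare column name can be illustrated with a small statement builder. This is only a sketch: the table `daily_stats`, its columns, and the helper `build_upsert` are hypothetical names chosen for illustration, and the `%s` placeholders assume a driver that uses that parameter style.

```python
def build_upsert(table, columns, update_exprs):
    """Build a single-row INSERT ... ON DUPLICATE KEY UPDATE statement.

    `update_exprs` maps a column name to the SQL expression assigned to it
    when the inserted key already exists.
    """
    col_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    assignments = ", ".join(
        f"{col} = {expr}" for col, expr in update_exprs.items()
    )
    return (
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {assignments}"
    )


# Hypothetical table: daily_stats(id PRIMARY KEY, hits, label)
sql = build_upsert(
    "daily_stats",
    ["id", "hits", "label"],
    {
        # Bare `hits` is the value already in the table;
        # VALUES(hits) is the value from the INSERT list.
        "hits": "hits + VALUES(hits)",
        # Overwrite the stored label with the value from the INSERT list.
        "label": "VALUES(label)",
    },
)
print(sql)
```

On MySQL 8.0.20 and later the VALUES() form is deprecated in favor of row aliases, but the distinction between the inserted value and the stored value stays the same.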

II. Preparation of the experiment

I am again using the largest table available to us, which has nearly 200 fields. The experimental environment is the same as in the previous article. Building on the comparison in that article, this time I went straight to multi-row inserts of 10 rows per statement, committing once every 5,000 rows. For comparison, I also deliberately implemented the traditional insert-update approach: first perform the insert, then check the output; if a "duplicate primary key" error is reported, issue an UPDATE statement that replaces the row with the same data (that is, the original values are simply overwritten). Note that this approach cannot use multi-row inserts.
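
The batching scheme described above (a fixed number of rows per statement, a commit after a fixed number of rows) could be sketched as follows. The helper names `chunked`, `multi_row_insert`, and `plan` are hypothetical, the code only builds a list of steps rather than talking to a server, and the small numbers in the example are chosen so the result is easy to follow.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def multi_row_insert(table, columns, n_rows, verb="INSERT"):
    """Build '<verb> INTO t (a, b) VALUES (%s, %s), (%s, %s), ...'."""
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    return (
        f"{verb} INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([one_row] * n_rows)
    )


def plan(rows, table, columns, rows_per_stmt=10, commit_every=5000):
    """Return a list of ('execute', sql, params) and ('commit',) steps."""
    steps, pending = [], 0
    for chunk in chunked(rows, rows_per_stmt):
        sql = multi_row_insert(table, columns, len(chunk))
        params = [value for row in chunk for value in row]  # flatten rows
        steps.append(("execute", sql, params))
        pending += len(chunk)
        if pending >= commit_every:
            steps.append(("commit",))
            pending = 0
    if pending:  # commit whatever is left at the end
        steps.append(("commit",))
    return steps


# 25 rows, 10 rows per statement, commit every 20 rows.
steps = plan([(i, f"name{i}") for i in range(25)], "t", ["id", "name"],
             commit_every=20)
print([step[0] for step in steps])
```

Passing `verb="REPLACE"` to `multi_row_insert` yields the corresponding multi-row REPLACE statement.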

As before, to make the scenario more realistic, I created three databases on the same MySQL server and created the table in each of them. All operations run against these three tables. The tool used in the code is a class library I wrote myself. It starts multiple threads, each attached to one database (one connection per database). The main thread sends an INSERT / REPLACE / INSERT-UPDATE / INSERT-ON-DUPLICATE-KEY command to all threads and waits for all of them to return before continuing. Each thread decides on its own when to commit, based on the cumulative number of affected rows.
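
The author's class library is not shown, so the following is only a guess at the general shape of such a dispatcher, not the actual implementation. `FakeConnection` is a stand-in that pretends every command affects 10 rows; `worker` and `run` are hypothetical names.

```python
import queue
import threading


class FakeConnection:
    """Stand-in for a real MySQL connection; only counts commits."""

    def __init__(self):
        self.commits = 0

    def execute(self, command):
        return 10  # pretend every command affects 10 rows

    def commit(self):
        self.commits += 1


def worker(conn, inbox, done, commit_every):
    """Run commands from `inbox`; commit by cumulative affected rows."""
    pending = 0
    while True:
        command = inbox.get()
        if command is None:  # shutdown signal
            if pending:
                conn.commit()
            done.put(None)
            return
        pending += conn.execute(command)
        if pending >= commit_every:
            conn.commit()
            pending = 0
        done.put(command)  # tell the main thread this command finished


def run(commands, n_databases=3, commit_every=20):
    conns = [FakeConnection() for _ in range(n_databases)]
    inboxes = [queue.Queue() for _ in conns]
    done = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(c, q, done, commit_every))
        for c, q in zip(conns, inboxes)
    ]
    for t in threads:
        t.start()
    for command in list(commands) + [None]:
        for q in inboxes:  # send the same command to every thread
            q.put(command)
        for _ in inboxes:  # wait for all threads before continuing
            done.get()
    for t in threads:
        t.join()
    return [c.commits for c in conns]


commits = run(["INSERT ..."] * 5)
print(commits)
```

Each worker owns its connection and its own commit counter, so the main thread never has to coordinate commits; it only acts as a barrier between commands.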

Again, the machine is weak, so the absolute TPS numbers are not meaningful. We are only looking at the trend.

III. Experimental results

Description

    • Multi-row insert into an empty table: use "INSERT INTO ... VALUES (..), (..), (..), ..." to insert data into an empty table.
    • insert-update: on top of the data from the previous step, first perform the insert (one row per INSERT), then check the error output; if there is a "duplicate primary key" error, issue an UPDATE statement directly, replacing the row with the same data (that is, the original values are simply overwritten).
    • Multi-row replace into an empty table: use "REPLACE INTO ... VALUES (..), (..), (..), ..." to insert data into an empty table.
    • insert-duplicate: on top of the data from the previous step, use the "INSERT INTO ... VALUES (..), (..), (..), ... ON DUPLICATE KEY UPDATE ..." syntax.

The conclusions are as follows:

    • For empty-table operations, REPLACE performs the same as INSERT, with the added benefit that it overwrites existing rows. The takeaway: if we do not care about duplicate-key errors and want overwrite behavior, REPLACE is a good choice; if we do not care about duplicate-key errors but do not want to overwrite, INSERT IGNORE is the better choice.
    • The traditional insert-update approach is really slow, and the reason is not complicated: send the statement, wait for the reply, send again, wait for the reply again. Switch to INSERT ... ON DUPLICATE KEY UPDATE instead.
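
The round-trip argument in the second conclusion can be made concrete with a toy model. Nothing here touches MySQL or measures time: `FakeTable` is an in-memory stand-in that only counts statements sent, and `DuplicateKey` stands in for the error a driver raises on MySQL error 1062 (duplicate entry).

```python
class DuplicateKey(Exception):
    """Stand-in for the driver error raised on a duplicate primary key."""


class FakeTable:
    """In-memory stand-in for a table; counts statements sent to it."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})
        self.round_trips = 0

    def insert(self, key, value):
        self.round_trips += 1
        if key in self.rows:
            raise DuplicateKey(key)
        self.rows[key] = value

    def update(self, key, value):
        self.round_trips += 1
        self.rows[key] = value

    def upsert_many(self, items):
        # Models one multi-row INSERT ... ON DUPLICATE KEY UPDATE statement.
        self.round_trips += 1
        self.rows.update(items)


def insert_then_update(table, items):
    """Traditional approach: one INSERT per row, UPDATE on duplicate key."""
    for key, value in items.items():
        try:
            table.insert(key, value)
        except DuplicateKey:
            table.update(key, value)


data = {i: f"v{i}" for i in range(10)}

old = FakeTable(data)  # table already populated, so every INSERT collides
insert_then_update(old, data)

new = FakeTable(data)
new.upsert_many(data)

print(old.round_trips, new.round_trips)  # prints "20 1"
```

In this model, re-running over 10 existing rows costs 20 statements with the traditional approach and one with the multi-row form, which is the trend the experiment points at.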
