Optimization policies for updating SQL statements in mysql large tables

Source: Internet
Author: User

Problem SQL Background: if the project has six tables, the corresponding brand_id field should be written according to the pid field. However, there are two tables of tens of millions. After my worker is running, the online mysql master-slave synchronization is immediately delayed! After running for more than an hour, the latency was 40 minutes, and only tens of thousands of rows of data were updated. The SQL is as follows:

<! -- Update the brand id based on the product id --> <update id = "updateBrandIdByPid" parameterClass = "com. jd. chat. worker. domain. param. updateBrandIdParam "> UPDATE $ tableName $ SET brand_id = # newBrandId # WHERE pid = # pid # AND brand_id = 0 </update>
The mysql expert of the project team helped me analyze the problem. Because the pid field has no index, the mysql engine needs to scan the columns with the same pid value as the input one row by row and then update the data, that is, the SQL statement can be executed only after more than 1000 rows of disk data are scanned. What's more serious is the number of different PIDs in a ten-million-level table, and the number of such SQL statements to be executed.

My colleague suggested vertical table sharding Based on the id field in the SQL code hierarchy. Each time you update 1000 rows of data, the mysql engine does not need to scan the entire table every time, and the database pressure is one thousandth of the previous. As the primary key, id is indexed and indexed, which can greatly optimize the query performance. The optimized SQL statement is as follows:

<! -- Update the brand id based on the product id --> <update id = "updateBrandIdByPid" parameterClass = "com. jd. chat. worker. domain. param. updateBrandIdParam "> UPDATE $ tableName $ SET brand_id = # newBrandId # WHERE pid = # pid # AND brand_id = 0 AND id BETWEEN # startNum # AND # endNum # </update>
Only use the id range statement to vertically cut the code layers of a large table of tens of millions. After the worker is re-launched, there is no latency between the mysql master and slave! After monitoring, tens of thousands of data records were updated in just 10 minutes, which is six times more efficient than before! More importantly, database load balancing allows applications to run properly.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.