A 20-second SQL slow query optimization experience and processing scheme

Last Update:2015-04-11 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Background

A few days ago in the project on-line process, found that a page does not get the data correctly, after troubleshooting the interface call time-out, and the last discovery is because the SQL query length of more than 20 seconds to cause the problem occurs.

Here, there is no advanced theory or technology, just remember to go through and read some of the ideological misunderstanding.

Composition of complex SQL statements

There is not much to describe the business function, but in order to highlight the problem, the analogy statement is used to describe the scene at that time. A complex SQL statement can be expressed as follows:

SELECT * from A_table as a left JOIN b_table as B on A.id=b.id WHERE a.id in (SELECT DISTINCT ID from a_table wher E user_id in (100,102,103) GROUP by USER_ID have count (ID) > 3)

Correlation Query

From the simplified SQL statement above, you can see that the first thing to do is to correlate the query.

Sub-query

Second, it is a nested subquery. This subquery is to find out the group IDs that are common to multiple users. So the "100,102,103" in the statement is based on the scene and needs to correspond to the number of "count (ID) > 3" later. Simply put, it is the group ID of the user intersection.

Where's the time?

Let's assume that the A_table table has a data volume of 20W, while the b_table data is 2000W. Can you think of the main time-consuming part of the query, or the sub-query section?

(think space ....) ）

Problem Locator

For the principle of SQL underlying and advanced theory, I temporarily mastered not deep enough. But I know that it is possible to use analogies and simple tests to verify which part of the problem is wrong.

Preliminary determination

First, for a single user ID, I'll simplify the above statement to:

SELECT * from A_table as a left JOIN b_table as B on A.id=b.id WHERE user_id in (100)

Therefore, the initial assertion should be that the subset of nested subqueries takes up most of the time.

Further verification

Now that you have the problem of nested subquery statements, it is divided into two areas to be cleared: is the subquery itself time consuming, or is it nesting and causing slow queries?

It is easy to see that when I execute the query alone in DB, it is very fast. So ruled out.

The rest of it goes without saying that a slow query of20 seconds is caused by nesting .

But because of the urgency of the line, to make sure, I quickly verified my conclusion:

1, the ID of the subquery is executed separately, and the resulting sequence is manually spelled into an ID, such as: 1,2,3,4, ..., 999

2, manually replace the above-obtained sequence ID with the original SQL statement

3, execution, Discovery, soon! it only takes about a few Ms

Well done! Ready to repair on-line!

Solution Solutions

Online problems, a lot of time is in the location of the problem and analysis of the reasons, since the problem has been found, the reasons are found, the solution is self-evident. The code can be easily processed.

Another point to be aware of

Currently, the actual SQL statement is more complex than this, but it is enough to express the problem. But in the early days, the author also made some SQL code.

Because b_table is larger than a_table, so the first b_table left associated a_table, very slow, about 1 seconds more , and the amount of data is very small, but if in turn, a_table left associated b_table, then quickly, It's about 100 milliseconds .

So, another interesting phenomenon is found:

Large table Left association small table, very slow, small table left associative right table, soon.

Of course, these we all know theoretically, but the actual development will be forgotten. Or when both tables are empty at first, and without taking into account the speed at which the two tables grow later, the pits will be buried in the future.

Summarize

First, nested subqueries are slow.

Reasons, I have not studied carefully, but on the way to work with my colleagues, he said that once read the relevant books, is that every time the subquery will produce an SQL statement, so it is N times query. Another senior QA colleague told me that it should be a m*n problem.

Second, when I started using nested subqueries, there was a misconception that I thought it would be more efficient to give these operations to MySQL itself, after all, there would be a good mechanism within DB to execute these queries.

Then, the actual confession, I was wrong. Because this is not a simple merge MC batch query.

When we decide to use some underlying technology, we can use it more appropriately only when we understand it thoroughly. And because ignorance determines the tools, the framework, the bottom of the omnipotent, often will be in the strokes.

A 20-second SQL slow query optimization experience and processing scheme

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

A 20-second SQL slow query optimization experience and processing scheme

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support

A 20-second SQL slow query optimization experience and processing scheme

Contact Us

What's Trending

Top 10 Tags

Top 10 Keywords

Trending Topic

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support