The method and thinking of solving the problem that the Mysql sending data causes the query to be slow _mysql

Source: Internet
Author: User
Tags mysql query

Recently help locate a MySQL query is very slow problem, positioning process synthesis of a variety of methods, theories, tools, very representative, to share with you.

"Problem phenomenon"

Using Sphinx to support inverted indexing, but Sphinx query source data from MySQL, the number of records in the query is only tens of thousands of, but the query is very slow, about 4-5 minutes

"Processing Process"

1) explain

First suspect that the index is not built, so use explain to view the query plan, the results are as follows:


From the results of the explain, the entire statement's index design is not a problem, except for the first table because the business needs to perform a full table scan, the other tables are accessed through the index

2) show processlist;

Explain see no problem, that is slow in the end where?

So I thought of it. Use show processlist to view the state of the SQL statement execution, and the query results are as follows:


Found for a long time, the query is in the "sending data" state

Query the meaning of "sending data" state, the original name of this state is very misleading, the so-called "sending data" is not simply send data, but includes "Collect + Send data".

The key here is why to collect data, because MySQL uses the index to complete the query, MySQL gets a bunch of row IDs, if some columns are not in the index, MySQL needs to go back to "data rows" to read the data returned to the client.

3) Show profile

To further verify the time distribution of the query, the show Profile command is used to view the detailed time distribution

First Open configuration: Set Profiling=on;
After executing the query, use Show profiles to view the query ID;
Use show profiles for query query_id to view details;

The results are as follows:


As can be seen from the result, the state of sending data executes 216s

4) Investigation and comparison

The above steps have determined that the query is slow because a lot of time is spent on the sending data state, combining the definition of sending data to focus the target on the return column of the query statement

After one by one troubleshooting, it was finalized to a description column designed to: ' description ' varchar (8000) DEFAULT NULL COMMENT ' game description ',

So we took a contrastive approach and looked at how "not returning description results". The results of show profile are as follows:


Can be seen, do not return to description, query time only need 15s, return time, Need 216s, the difference is 15 times times

"Principle Research"

The question is clear, but we need to continue to explore the principle.

This Taobao article is a good explanation of the principles: some optimization suggestions for using large-InnoDB Text,blob

The key message here is that when the InnoDB storage format is ROW_FORMAT=COMPACT (or ROW_FORMAT=REDUNDANT ), the InnoDB stores only the first 768 bytes of the length and the remaining data is stored in the overflow page.

We use Show table status to view information about the table:


You can see that an average line of about 1.5K, also say about 1/10 rows using "Overflow storage", once used in this way to store, the return of data is originally ordered to read the data, it becomes a random read, resulting in a sharp decline in performance.

In addition, in the test process also found that no matter how many times the statement executed, even the entire table select * several times, the execution speed of the statement has not changed significantly. The table's data and index add up to only about 150M, and the entire InnoDB buffer pool has 5G, cache the entire table more than enough, if the overflow page cache, the performance should be significantly improved.

But the measured results did not improve, so from this test it can be inferred that InnoDB did not cache the overflow page (overflow page) into memory.

This design is also logical, because the overflow page is stored large data, if also placed in the cache, there will be a large data column (blob, text, varchar) query, may be all the cache update, This can cause other common query performance to drop dramatically.

"Workaround"

Find the root cause of the problem, the solution is not difficult. There are several ways:

1 query to remove description query, but this is limited to the implementation of the business, may need to do a large business adjustment

2 table structure optimization, will descripion split to another table, this change is larger, need to have a business with the modification, and if the business or to continue to query this description information, the optimization performance will not be greatly improved.

The above is the entire content of this article, I hope to help you learn.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.