MySQL Big Data volume limit optimization

Source: Internet
Author: User
Tags joins


Background: When the data in the database reaches millions of thousands, if the page is to be paged (but not so much in general), what do we need to do if the business requires it?
I used a local production of a table has more than 5 million of the table, to test, the table is named Big_data;
First we look at the following SQL statements:
Before that, we turned on profiling to monitor the execution of SQL statements.
Set profiling=1;
1. Query from the beginning of section 10w data page 10
2. Query page 10 from the data of article 20w
3. Query page 10 from the data of article 30w

3. Query page 10 from the data of article 300w

3. Query page 10 from the data of article 500w

We can see that the query from the beginning of the 200w paging is still relatively fast, but from the beginning of 500w speed has become very slow, this is not very satisfying.

Mysql> Select Id,my_name from Big_data limit 5000000, 10;

+---------+------------+

| ID | My_name |

+---------+------------+

| 5000001 | Kwcwziqhnu |

| 5000002 | nlpqmmwajv |

| 5000003 | kskutlxdbx |

| 5000004 | Ptavbtpubz |

| 5000005 | WHSUSHIUVX |

| 5000006 | TCDLWZHNQT |

| 5000007 | QHMNEKJSMH |

| 5000008 | Uqrmluqvgr |

| 5000009 | UZKEQPEBTQ |

| 5000010 | SKUVSEPMPQ |

+---------+------------+

Rows in Set (2.34 sec)

Mysql> Show Profiles;

+----------+------------+--------------------------------------------------+

| query_id | Duration | Query |

+----------+------------+--------------------------------------------------+

| 1 | 0.02591075 | Select Id,my_name from Big_data limit 100000,10 |

| 2 | 0.05773150 | Select Id,my_name from Big_data limit 200000,10 |

| 3 | 0.08253525 | Select Id,my_name from Big_data limit 300000,10 |

| 4 | 1.38455375 | Select Id,my_name from Big_data limit 3000000,10 |

| 5 | 2.34040775 | Select Id,my_name from Big_data limit 5000000,10 |

+----------+------------+--------------------------------------------------+

5 rows in Set, 1 Warning (0.00 sec)


Show Profiles;

We have the following two ways to solve the problem:
(1), by judging the scope of the ID to page
Select Id,my_sn from Big_data where id>5000000 limit 10;
Also got paged data, but we found that if the ID is not sequential, that is, if the data is deleted, then the paging data will be incorrect, this is flawed.
(2), through the connection query to pagination
We can first query the 500w data to start paging the 10 IDs, and then through the connection query display data
Mysql> Select B.id,b.my_name from big_data as b inner join (SELECT ID from big_data ORDER by ID limit 4500000,10) as TMP on Tmp.id=b.id;

We test the paging data at different starting ends

Mysql> Select B.id,b.my_name from big_data as b inner join (SELECT ID from big_data ORDER by ID limit 5000000,10) as TMP on Tmp.id=b.id;

+---------+------------+

| ID | My_name |

+---------+------------+

| 5000001 | Kwcwziqhnu |

| 5000002 | nlpqmmwajv |

| 5000003 | kskutlxdbx |

| 5000004 | Ptavbtpubz |

| 5000005 | WHSUSHIUVX |

| 5000006 | TCDLWZHNQT |

| 5000007 | QHMNEKJSMH |

| 5000008 | Uqrmluqvgr |

| 5000009 | UZKEQPEBTQ |

| 5000010 | SKUVSEPMPQ |

+---------+------------+

Rows in Set (2.15 sec)

Mysql> Show Profiles;

+----------+------------+-------------------------------------------------------------------------------------- ----------------------------------------------+

| query_id | Duration |                                                                                                                              Query |

+----------+------------+-------------------------------------------------------------------------------------- ----------------------------------------------+

| 1 | 0.02591075 |                                                                                    Select Id,my_name from Big_data limit 100000,10 |

| 2 | 0.05773150 |                                                                                    Select Id,my_name from Big_data limit 200000,10 |

| 3 | 0.08253525 |                                                                                    Select Id,my_name from Big_data limit 300000,10 |

| 4 | 1.38455375 |                                                                                   Select Id,my_name from Big_data limit 3000000,10 |

| 5 | 2.34040775 |                                                                                   Select Id,my_name from Big_data limit 5000000,10 |

| 6 | 0.00004200 |                                                                                                                  Reset Query Cache |

| 7 | 0.01999275 | Select B.id,b.my_name from Big_data as b inner joins (SELECT ID from Big_data order by ID, limit 100000,10) as TMP on TMP. Id=b.id |

| 8 | 0.03888825 | Select B.id,b.my_name from Big_data as b inner joins (SELECT ID from Big_data order by ID, limit 200000,10) as TMP on TMP. Id=b.id |

| 9 | 0.37394450 | Select B.id,b.my_name from Big_data as b inner join (SELECT ID from big_data ORDER by ID limit 1000000,10) as TMP on TMP . id=b.id |

| 10 | 1.33475700 | Select B.id,b.my_name from Big_data as b inner join (SELECT ID from big_data ORDER by ID limit 3000000,10) as TMP on TMP . id=b.id |

| 11 | 2.14759000 | Select B.id,b.my_name from Big_data as b inner join (SELECT ID from big_data ORDER by ID limit 5000000,10) as TMP on TMP . id=b.id |

If we suspect there's a cache, we can clearly cache the query later.

Reset query Cache;

?
Show profile for query 3;//view the execution of the third SQL statement that was recorded
It can be seen that the data found in both methods are consistent, but the speed of the method two is faster than that of the previous single-table query.

Analysis: Because the MySQL paging query is the first paging data before the query, and then after the interception of data is not paged out to get rid of the results so that the amount of data is too large after the slow paging is understandable.
However, we can first query the ID needs to be paged out, because the ID is the primary key ID primary key index, the query is still much faster, and then according to the ID connection query the corresponding paging data, not all the connection query will be more than
A single query should be slow, depending on the situation.

MySQL Big Data volume limit optimization

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.