The data set is too large and paged queries are slow: how can the query be optimized?

Source: Internet
Author: User
Because there are too many user records (3 million), paged queries have become very slow. How should the statement or indexes be optimized? Does anyone have an optimization plan?

Supplemental content:
Thank you for the suggestions; I found them very inspiring, but my situation is rather special and they do not quite apply. The details are as follows:
1. When a user opens the participation-record page, it must display the records of the 10 most recent participants. (These 10 records are not simply the latest 10 rows in the participation-record table; they are the 10 rows filtered by a WHERE condition.)
2. The data for these 10 participants comes from two tables (the user-information table and the user-participation-record table).
3. Users must be able to page forward to the next page and back to the previous page.
Right now the query joins the two tables, applies WHERE conditions on both, and then applies LIMIT. This is too slow, so what optimizations are possible?

Replies:


1. Do not use LIMIT without a condition: LIMIT 1000,20 scans rows 0 through 1020.
2. Remember the primary key of the last user on the previous page and page with SELECT * FROM user WHERE uid > last_uid LIMIT pagesize (see the sketch below).
3. If data-consistency requirements are not strict, consider adding a cache layer between the application and the database.
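
A minimal keyset-pagination sketch of point 2 (assuming uid is the auto-increment primary key; names are illustrative):

-- first page
SELECT * FROM user ORDER BY uid ASC LIMIT 10;
-- next page: the client sends back the last uid it displayed (e.g. 12345)
SELECT * FROM user WHERE uid > 12345 ORDER BY uid ASC LIMIT 10;
-- MySQL seeks directly to uid = 12345 in the primary key, so the cost
-- stays flat no matter how deep the page is.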

EXPLAIN SELECT * FROM lagou WHERE id <= 100

EXPLAIN SELECT * FROM lagou LIMIT 100

Note the type column: one is range, the other is ALL.
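
For reference, roughly what the two plans look like (exact EXPLAIN output varies by MySQL version; this assumes id is the primary key of lagou):

-- EXPLAIN SELECT * FROM lagou WHERE id <= 100;
--   type: range, key: PRIMARY   (reads only the matching index range)
-- EXPLAIN SELECT * FROM lagou LIMIT 100;
--   type: ALL,   key: NULL      (full table scan; LIMIT only stops output early)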

Summary: a painful lesson learned ~

MySQL LIMIT paging optimization (where id is the auto-increment primary key of the posts table):
SELECT * FROM posts ORDER BY id DESC LIMIT 20 OFFSET 10000
This scans 10020 rows, skips the first 10000, and returns the last 20, which is slow.
A better design is to replace specific page numbers with a "next page" button.
Assuming 20 records per page, each query fetches 21 records but displays only 20.
If the 21st record exists, we show a "next page" or "Ajax load more" button, as sketched below.
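
A sketch of the fetch-one-extra trick on the posts table above:

-- ask for pageSize + 1 = 21 rows, render only the first 20
SELECT * FROM posts ORDER BY id DESC LIMIT 21;
-- if 21 rows come back, discard the 21st and show the "next page" button;
-- if 20 or fewer come back, this is the last page.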

Paging with previous/next links:
Previous page (id: 60~41)
Current page (id: 40~21)
Next page (id: 20~1)

Previous page (the 20 newer articles):
page.php?id=40: a positive id requests the previous page of newer articles, where 40 is the id of the topmost article on the current page.
SELECT * FROM posts WHERE id > 40 ORDER BY id ASC LIMIT 20;
This selects ids in ascending order; PHP then uses array_reverse to flip the array for descending output.

Next page (the 20 older articles):
page.php?id=-21: a negative id requests the next page of older articles, where 21 is the id of the bottom article on the current page.
SELECT * FROM posts WHERE id < 21 ORDER BY id DESC LIMIT 20;
If you dislike negative numbers, you can add an extra parameter instead, e.g. page.php?id=21&next

Personally I think this is a paging-query optimization problem: paginate first, and only join the two tables afterwards.

Typical paging optimizations are:

The usual way of writing it:

SELECT * FROM buyer WHERE sellerid = 100 LIMIT 100000, 20;

With an ordinary LIMIT M, N, paging gets slower the further back you go, because MySQL reads the first M+N rows; the larger M is, the worse the performance.

The optimized way of writing it:

SELECT t1.* FROM buyer t1, (SELECT id FROM buyer WHERE sellerid = 100 LIMIT 100000, 20) t2 WHERE t1.id = t2.id;

You need to create an index on the sellerid column of the buyer table, with id as the table's primary key (see the sketch below).
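
A sketch of that supporting index (assuming InnoDB, where a secondary index on sellerid implicitly contains the primary key id, making the inner query a covering-index scan):

ALTER TABLE buyer ADD INDEX idx_sellerid (sellerid);
-- the subquery pages over (sellerid, id) index entries without touching rows;
-- the outer query then fetches only the final 20 full rows by primary key.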

If the id primary key is auto-increment and contiguous (no rows have ever been deleted), you can use WHERE ... BETWEEN instead.

Suppose user_id is the auto-increment primary key:

SELECT * FROM user WHERE user_id >= (SELECT user_id FROM user LIMIT pageIndex, 1) LIMIT pageSize;
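
A concrete instance of the pattern, for page 501 at 20 rows per page (pageIndex = 500 * 20 = 10000), with an explicit ORDER BY added so the scan order is well defined:

SELECT * FROM user
WHERE user_id >= (SELECT user_id FROM user ORDER BY user_id LIMIT 10000, 1)
ORDER BY user_id
LIMIT 20;
-- the subquery walks only the primary-key index, which is far cheaper
-- than applying LIMIT 10000, 20 to full rows.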

Reference: https://mp.weixin.qq.com/s?__biz=MjM5NzMyMjAwMA==&mid=2651477279&idx=1&sn=4a799a4aa2f8563d482973e262e906b8&scene=1&srcid=0629qe4rkqmpbhwouzfjek1b&key=77421cf58af4a65302d0d5987b2f5d1610469f10ba9f2ac129747c2578d1a68463e33f8c7f388537de80a7bc38d00c05&ascene=0&uin=odyxodm1mde4&devicetype=imac+macbookair6%2c2+osx+osx+10.9.5+build(13F34)&version=11020012&pass_ticket=azdru5jchrxbpbvmrg0m%2bkuwb4sjz2hifztt6ldc1xnw2zmj0mgdjwuyzgisqjxa

Define the WHERE clause according to your own query criteria, and do not count the total number of records repeatedly.
1. Remember the last record of the current page and, consistent with the ORDER BY, add a corresponding WHERE clause to the next query instead of skipping N rows (see the sketch after this list).
2. With a large data set the total page count loses practical meaning, yet COUNT is very expensive; if you need it, obtain it once with the first query and reuse it.
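
A sketch of point 1 when the list is ordered by a non-unique column, assuming a created_at column with id as tie-breaker (names are assumptions; :last_created_at and :last_id are PDO-style placeholders filled from the final row of the previous page):

SELECT * FROM participation_record
WHERE created_at < :last_created_at
   OR (created_at = :last_created_at AND id < :last_id)
ORDER BY created_at DESC, id DESC
LIMIT 10;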

I agree with the answer above.

1. Improve search and index creation.
2. Make the table's default ordering match the query's sort order, to reduce sorting time at query time.
3. Where ids are sequential, use the last id as the condition for the next fetch; this greatly shrinks the returned data set.
4. Reduce COUNTs and their precision; for very large data sets you can cache the count value (see the sketch after this list).
5. For larger data sets, split the data by paging rules, types, etc., to reduce the amount of data per query.
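
One hypothetical way to implement point 4, caching the total in a small summary table instead of running COUNT(*) on every page load (all names are assumptions):

CREATE TABLE record_count (
  tbl VARCHAR(64) PRIMARY KEY,
  cnt BIGINT NOT NULL
);
-- refresh periodically, or whenever rows are inserted/deleted
REPLACE INTO record_count (tbl, cnt)
SELECT 'participation_record', COUNT(*) FROM participation_record;
-- page rendering reads the cached value instead of counting
SELECT cnt FROM record_count WHERE tbl = 'participation_record';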

1. Reduce the use of functions in your SQL statements.
2. Reduce ORDER BY and GROUP BY.
3. Add indexes to the columns used in query conditions.
4. With 3 million rows, these should be operation records rather than data that must be shown in real time, so you can cache the data.
5. In JS, use jQuery's $(obj).load("url #id") to do a partial Ajax refresh of the page, to avoid reloading all your resources.
That's all I know.

As I understand it, your filter conditions are all on the user-participation-record table, and the user-information table is only needed to fetch rows by the ids found in the first table. I assume your indexes are already in place; the remaining big levers are data partitioning and pre-sorting. Partition the participation-record table by the main query criteria and the user-information table by id. The participation-record table must also be pre-sorted: if the query has to re-sort, it can never be fast.

Add a WHERE condition on the primary key (e.g. id > 0) and select only the primary key for the paging step; then fetch the content with WHERE id IN (...) over those keys, which is much faster than querying full rows directly (see the sketch below).
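
A sketch of that two-step fetch (assuming an index covers the filter columns; names and values are illustrative):

-- step 1: page over primary keys only
SELECT id FROM participation_record
WHERE id > 0 AND status = 1      -- status = 1 stands in for the real conditions
ORDER BY id DESC
LIMIT 10;
-- step 2: fetch the full content for exactly those keys
SELECT * FROM participation_record WHERE id IN (1009, 1007, 1003);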

Cache the user-information rows, and run the paged query against the user-participation-record table only.

The two tables are joined to display the records of the 10 most recent participants, which makes the query SQL complex. I therefore suggest implementing it as two SQL queries plus a PHP traversal: one SQL for the participation records, one SQL for the members, then iterate over the arrays to build the result you need. In short, split complex SQL into simple SQL (see the sketch below).
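
A sketch of the split into two simple queries (table and column names are assumptions; PHP merges the results afterwards):

-- query 1: the 10 newest matching participation records
SELECT id, uid, created_at
FROM participation_record
WHERE status = 1                 -- placeholder for the real WHERE conditions
ORDER BY created_at DESC
LIMIT 10;
-- query 2: the users for those records, using the uid values from query 1
SELECT uid, username FROM user WHERE uid IN (101, 205, 318);
-- the application indexes the users by uid and merges them into the records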

Build the right indexes.
Query the paging primary keys first, then fetch the content by those keys. We have used this approach on 70 million rows of log data, and paging is fast.

If the data has no strict timeliness requirement, it can be queried with Sphinx.

When the LIMIT offset is large, use BETWEEN on the id range; for data near the end, do a reverse lookup with DESC instead.

You could write a scheduled script that runs every 10 minutes, queries the data, and saves it to a table; the user's page then reads from that table. Since you only need to show the latest participants, a delay of a few minutes should be acceptable (see the sketch below).
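
A sketch of that cache table, rebuilt by the scheduled job (all names and the status filter are assumptions):

CREATE TABLE latest_participants_cache (
  uid INT NOT NULL,
  username VARCHAR(64) NOT NULL,
  participated_at DATETIME NOT NULL,
  PRIMARY KEY (uid, participated_at)
);
-- refresh job, run every ~10 minutes: execute the expensive join once
TRUNCATE TABLE latest_participants_cache;
INSERT INTO latest_participants_cache (uid, username, participated_at)
SELECT u.uid, u.username, r.created_at
FROM participation_record r
JOIN user u ON u.uid = r.uid
WHERE r.status = 1               -- placeholder for the real filter
ORDER BY r.created_at DESC
LIMIT 10;
-- the page reads only this tiny table
SELECT * FROM latest_participants_cache ORDER BY participated_at DESC;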

First review your SQL carefully for room for optimization, then consider database tuning, caching, and read/write splitting.
