In MySQL, LIMIT makes paging easy to implement. However, once a table reaches several million rows, the LIMIT clause itself must be optimized for paging to stay efficient; otherwise your server may grind to a halt.
This becomes a real problem when a table holds millions of records!
For example, SELECT * FROM table LIMIT 0, 10 is no problem at all, but with a large offset such as LIMIT 100000, 10 reading the data becomes very slow. You can follow the steps below to solve this problem.
The first pages always come back quickly; it is the deep pages that are slow.
At the Percona Performance Conference 2009, several Yahoo engineers presented a talk titled "Efficient Pagination Using MySQL".
LIMIT 10000, 20 means MySQL scans 10020 rows that match the conditions, discards the first 10000, and returns only the last 20. That is exactly where the problem lies.
LIMIT 451350, 30 scans more than 450,000 rows, whereas a plain LIMIT 30 scans only 30 rows.
So if we record the largest ID returned so far, we can use it as the starting point of the next query.
For example, a typical paging SQL statement:
SELECT id, name, content FROM users ORDER BY id ASC LIMIT 100000, 20
This scans 100,020 rows.
If the maximum ID from the previous page is recorded:
SELECT id, name, content FROM users WHERE id > 100073 ORDER BY id ASC LIMIT 20
This scans only 20 rows.
With roughly 5 million rows in total, the following query
SELECT * FROM wl_tagindex WHERE byname = 'F' ORDER BY id LIMIT 300000, 10
takes 3.21 s to execute.
After optimization:
SELECT * FROM (
SELECT id FROM wl_tagindex
WHERE byname = 'F' ORDER BY id LIMIT 300000, 10
) a
LEFT JOIN wl_tagindex b ON a.id = b.id
The execution time is 0.11 s, and the speed is significantly improved.
Note that this query filters on byname and sorts on id, so the two fields need a composite index; otherwise the improvement will not be significant.
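A minimal sketch of such a composite index (the index name is illustrative):
CREATE INDEX idx_byname_id ON wl_tagindex (byname, id);
With this index in place, the inner subquery can locate the ten target IDs from the index alone, without reading any full table rows.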
Summary
When a table is very large and the offset in LIMIT offset, length is large, the query becomes very slow. Add an ORDER BY clause and index the ORDER BY column.
If a subquery is used to optimize LIMIT, the data must be continuous; in effect the query should carry no WHERE condition, because a WHERE clause filters rows and breaks the continuity of the sequence.
If the query returns a large volume of data per row, for example TEXT columns, you can first fetch only the IDs in a subquery. Note that MySQL does not accept LIMIT inside an IN subquery, so the ID list has to be joined back as a derived table:
SELECT i.id, i.title, i.content FROM items i JOIN (SELECT id FROM items ORDER BY id LIMIT 900000, 10) t ON i.id = t.id;
If the LIMIT offset is large, you can instead pass the last-seen primary key value, which reduces the offset to 0. This primary key is preferably an AUTO_INCREMENT integer:
SELECT * FROM users WHERE uid > 456891 ORDER BY uid LIMIT 0, 10;
This statement has the same effect as:
SELECT * FROM users WHERE uid >= (SELECT uid FROM users ORDER BY uid LIMIT 895682, 1) LIMIT 0, 10;
If the LIMIT offset gets very large, the user is paging unreasonably deep anyway. You can set a maximum offset and handle anything beyond it specially. Turning page after page is itself a poor experience, so it is better to offer users a more direct route to what they want, such as search.
LIMIT paging optimization methods
1. Subquery optimization method
First locate the row at the target offset, then fetch the rows whose id is greater than or equal to that row's id.
Disadvantage: the data must be continuous. In effect there can be no WHERE condition, because a WHERE clause filters rows and breaks the continuity of the sequence.
Experiment:
mysql> set profiling = 1;
Query OK, 0 rows affected (0.00 sec)
mysql> select count(*) from Member;
+----------+
| count(*) |
+----------+
|  1169566 |
+----------+
1 row in set (0.00 sec)
mysql> pager grep !~ -
PAGER set to 'grep !~ -'
mysql> select * from Member limit 10,100;
100 rows in set (0.00 sec)
mysql> select * from Member where MemberID >= (select MemberID from Member limit 10,1) limit 100;
100 rows in set (0.00 sec)
mysql> select * from Member limit 1000,100;
100 rows in set (0.01 sec)
mysql> select * from Member where MemberID >= (select MemberID from Member limit 1000,1) limit 100;
100 rows in set (0.00 sec)
mysql> select * from Member limit 100000,100;
100 rows in set (0.10 sec)
mysql> select * from Member where MemberID >= (select MemberID from Member limit 100000,1) limit 100;
100 rows in set (0.02 sec)
mysql> nopager
PAGER set to stdout
mysql> show profiles\G
*************************** 1. row ***************************
Query_ID: 1
Duration: 0.00003300
Query: select count(*) from Member
*************************** 2. row ***************************
Query_ID: 2
Duration: 0.00167000
Query: select * from Member limit 10,100
*************************** 3. row ***************************
Query_ID: 3
Duration: 0.00112400
Query: select * from Member where MemberID >= (select MemberID from Member limit 10,1) limit 100
*************************** 4. row ***************************
Query_ID: 4
Duration: 0.00263200
Query: select * from Member limit 1000,100
*************************** 5. row ***************************
Query_ID: 5
Duration: 0.00134000
Query: select * from Member where MemberID >= (select MemberID from Member limit 1000,1) limit 100
*************************** 6. row ***************************
Query_ID: 6
Duration: 0.09956700
Query: select * from Member limit 100000,100
*************************** 7. row ***************************
Query_ID: 7
Duration: 0.02447700
Query: select * from Member where MemberID >= (select MemberID from Member limit 100000,1) limit 100
The results show that the subquery method can effectively improve the performance when the offset is greater than 1000.
2. Inverted table optimization
The inverted-table method is similar to building an index: a separate table maintains the page number of each row, and the page data is then fetched through an efficient join.
Disadvantage: it only suits scenarios where the row count is fixed and rows are never deleted, since the page table is otherwise difficult to maintain.
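A minimal sketch of the idea (the table and column names are illustrative, assuming pages of 40 rows are precomputed):
CREATE TABLE page_index (
  page INT NOT NULL,  -- precomputed page number of the row
  id INT NOT NULL,    -- primary key of the data row
  PRIMARY KEY (page, id)
);
-- fetching page 20000 becomes an index lookup plus a join, with no large offset to scan past:
SELECT m.* FROM page_index p JOIN Member m ON m.MemberID = p.id WHERE p.page = 20000;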
3. Reverse search optimization method
When the offset exceeds half of the total record count, sort in the opposite direction, so that the offset is counted from the other end and becomes small.
Disadvantage: ORDER BY is harder to optimize; it requires an index, the index slows down data modification, and the total record count must be known, since the trick only applies when the offset is greater than half of the data.
Reference
LIMIT offset algorithm:
Forward lookup: (current page - 1) * page length
Reverse lookup: total records - current page * page length
Perform an experiment to check the performance.
Total records: 1,628,775
Records per page: 40
Total pages: 1,628,775 / 40 ≈ 40,720
Middle page: 40,720 / 2 = 20,360
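As a worked example of the two formulas, take page 21,000 (used below): the forward offset is (21,000 - 1) * 40 = 839,960, and the reverse offset is 1,628,775 - 21,000 * 40 = 788,775.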
Page 21,000
Forward query SQL:
SELECT * FROM `ABC` WHERE `batchid` = 123 LIMIT 839960, 40
Time: 1.8696 seconds
Reverse query SQL:
SELECT * FROM `ABC` WHERE `batchid` = 123 ORDER BY InputDate DESC LIMIT 788775, 40
Time: 1.8336 seconds
Page 30,000
Forward query SQL:
SELECT * FROM `ABC` WHERE `batchid` = 123 LIMIT 1199960, 40
Time: 2.6493 seconds
Reverse query SQL:
SELECT * FROM `ABC` WHERE `batchid` = 123 ORDER BY InputDate DESC LIMIT 428775, 40
Time: 1.0035 seconds
Note: the reverse query returns rows in descending (DESC) order, and InputDate is the insert time of the record. You could also use the primary key as the sort index, but that is less convenient here.
4. Offset-limiting method
Cap the LIMIT offset at a fixed maximum; any request beyond that number simply returns no data. I remember an Alibaba DBA saying that this is what they did.
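A minimal sketch of the cap (the 100,000 maximum is an assumed value, and the check belongs in the application layer, shown here as SQL comments):
-- requested offset = (page - 1) * page size
-- if the requested offset exceeds 100000, return an empty page without querying
SELECT * FROM Member ORDER BY MemberID LIMIT 99960, 40;
-- the offset that actually reaches MySQL therefore never exceeds the cap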
5. Index-only query method
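This is the covering-index idea already used in the wl_tagindex example earlier: the inner query reads only indexed columns to locate the target IDs, and the outer query then fetches the full rows. A minimal sketch reusing the Member table from the experiment above (assuming MemberID is the primary key):
SELECT m.*
FROM (SELECT MemberID FROM Member ORDER BY MemberID LIMIT 100000, 100) t
JOIN Member m ON m.MemberID = t.MemberID;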