A Simple Paging Optimization Method for MySQL in Big Data Scenarios
Applications often need to page through the data in a table. When the data volume is large, this can cause performance problems:
root@sns 07:16:25>select count(*) from reply_0004 where thread_id = 5616385 and deleted = 0;
+----------+
| count(*) |
+----------+
|  1236795 |
+----------+
1 row in set (0.44 sec)

root@sns 07:16:30>select id from reply_0004 where thread_id = 5616385 and deleted = 0 order by id asc limit 1236785, 10;
+-----------+
| id        |
+-----------+
| 162436798 |
| 162438180 |
| 162440102 |
| 162442044 |
| 162479222 |
| 162479598 |
| 162514705 |
| 162832588 |
| 162863394 |
| 162899685 |
+-----------+
10 rows in set (1.32 sec)
Index: thread_id + deleted + id (gmt_Create)
These two statements implement paging for the last page: the first counts the matching rows, and the second fetches the final 10 of them. A page usually needs only a few rows, such as 10, but to satisfy a large offset MySQL has to scan and discard every row that comes before the page. The deeper the page, the more rows are scanned and the slower the query becomes.
Because the amount of data returned is fixed, the ideal result is a query speed that does not depend on how far the user has paged, or depends on it as little as possible, so that the last few pages are as fast as the first few.
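The cost of a deep offset can be made concrete with a little arithmetic. The sketch below is only a rough model, assuming that serving `limit offset, page_size` means walking `offset + page_size` matching index entries and that caching and row lookups are ignored. The function name `rows_read` is hypothetical.

```python
def rows_read(offset: int, page_size: int, total_rows: int) -> int:
    """Rough number of matching rows walked to serve LIMIT offset, page_size.

    Assumption: the server walks the index from one end, discards `offset`
    rows, then returns up to `page_size` rows.
    """
    return min(offset + page_size, total_rows)


total = 1_236_795                                   # count(*) from the example above
first_page = rows_read(0, 10, total)                # walks only the 10 rows returned
last_page = rows_read(total - 10, 10, total)        # walks every matching row
print(first_page, last_page)
```

Under this model the last page touches every matching row to return 10 of them, which is consistent with the slower timing shown for the deep offset.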
When paging, the rows are usually sorted in ascending order by a field that is part of the index. So can we use the ordering of the index to solve the problem above? The answer is yes. For example, if 10000 rows are to be paged, pages that fall in the first 5000 rows are read in asc order, pages that fall in the last 5000 rows are read in desc order, and the limit startnum and pagesize parameters are adjusted accordingly.
But this undoubtedly adds complexity to the application. The SQL here serves replies to forum posts, and users reading a post usually view only the first few pages and the last few pages. It is therefore enough to handle the last pages specially: querying them in desc order improves performance noticeably:
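The adjustment of the limit parameters can be sketched in application code as follows. This is a minimal illustration, not the author's implementation: the names `PagePlan` and `plan_limit` are hypothetical, and the total row count is assumed to come from a count query like the one shown earlier.

```python
from typing import NamedTuple


class PagePlan(NamedTuple):
    direction: str  # "asc" or "desc", to be used in ORDER BY
    offset: int     # first argument of LIMIT
    limit: int      # second argument of LIMIT
    reverse: bool   # True if the fetched rows must be re-sorted ascending


def plan_limit(asc_offset: int, page_size: int, total_rows: int) -> PagePlan:
    """Pick the scan direction that skips the fewest rows.

    asc_offset is the offset the page would have in a plain ascending query.
    """
    if asc_offset < 0 or page_size < 1:
        raise ValueError("asc_offset must be >= 0 and page_size >= 1")
    # The last page may hold fewer rows than page_size.
    count = max(0, min(page_size, total_rows - asc_offset))
    if count == 0:
        return PagePlan("asc", asc_offset, 0, False)
    rows_after = total_rows - (asc_offset + count)  # rows that follow this page
    if asc_offset <= rows_after:
        return PagePlan("asc", asc_offset, count, False)
    # Closer to the end: scan backward and skip only the rows after the page.
    return PagePlan("desc", rows_after, count, True)


print(plan_limit(1236785, 10, 1236795))
print(plan_limit(5000, 10, 10000))
```

For the last page of the example, the plan is `desc` with offset 0, so almost nothing is skipped. When `reverse` is true, the rows come back in descending order and have to be re-sorted, either in the application or with an outer `order by ... asc`.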
root@snsgroup 07:16:49>select * from (select id
    -> from group_thread_reply_0004 where thread_id = 5616385 and deleted = 0
    -> order by id desc limit 0, 10)t order by t.id asc;
+-----------+
| id        |
+-----------+
| 162436798 |
| 162438180 |
| 162440102 |
| 162442044 |
| 162479222 |
| 162479598 |
| 162514705 |
| 162832588 |
| 162863394 |
| 162899685 |
+-----------+
10 rows in set (0.87 sec)
We can see that the query time drops from 1.32 sec to 0.87 sec, so the query runs about 1.5 times as fast (roughly 34% less time).
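The two forms of the last-page query should return the same rows in the same order. The sketch below checks this on a small synthetic table, using Python's built-in `sqlite3` module as a stand-in for MySQL. The table name `reply`, the index name, and the generated data are assumptions made for illustration. It verifies only that the results match and says nothing about MySQL timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reply (id INTEGER PRIMARY KEY, thread_id INTEGER, deleted INTEGER)"
)
# Same column order as the index described in the article.
conn.execute("CREATE INDEX idx_thread ON reply (thread_id, deleted, id)")

# Synthetic rows: odd ids belong to the thread, every 7th id is marked deleted.
rows = [(i, 5616385 if i % 2 else 1, 1 if i % 7 == 0 else 0) for i in range(1, 2001)]
conn.executemany("INSERT INTO reply VALUES (?, ?, ?)", rows)

where = "thread_id = 5616385 AND deleted = 0"
total = conn.execute(f"SELECT COUNT(*) FROM reply WHERE {where}").fetchone()[0]

# Last page by ascending scan with a deep offset.
asc_ids = [r[0] for r in conn.execute(
    f"SELECT id FROM reply WHERE {where} ORDER BY id ASC LIMIT ?, 10",
    (total - 10,),
)]

# Last page by descending scan with offset 0, re-sorted ascending outside.
desc_ids = [r[0] for r in conn.execute(
    f"SELECT * FROM (SELECT id FROM reply WHERE {where} "
    "ORDER BY id DESC LIMIT 0, 10) t ORDER BY t.id ASC"
)]

same = asc_ids == desc_ids
print(same)
```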