When a table holds tens of thousands of records or more, fetching all results in a single query becomes slow, and the slowdown grows with the data volume, so paged queries are needed. There are a number of methods and optimization points for database paging queries; here is a brief look at some of the approaches I know.
Preparatory work
To test the optimizations listed below, the following table is used.
- Table name: orders_history
- Description: order history table for a business
- Main fields: id (unsigned int), type (tinyint(4))
- Field profile: the table has 37 fields and contains no large types such as text; the largest is a varchar field. The id field is the auto-increment primary key and is indexed.
- Data volume: 5709294
- MySQL version: 5.7.16
It is not easy to find a test table with millions of rows offline; if you need to run these tests yourself, you can write a shell script to insert test data.
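If you do not have such a table handy, a short script can generate one. Below is a minimal sketch using Python's standard-library sqlite3 module rather than MySQL, so it is self-contained; the table follows the article's orders_history name, but only two of its fields, and the row count is reduced for the demo:

```python
import sqlite3

def build_test_table(conn, rows=100000):
    """Create a minimal orders_history-like table and bulk-insert test rows."""
    conn.execute(
        "CREATE TABLE orders_history ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "type INTEGER NOT NULL)"
    )
    # executemany with a generator keeps memory use flat for large row counts
    conn.executemany(
        "INSERT INTO orders_history (type) VALUES (?)",
        ((i % 10,) for i in range(rows)),  # spread rows across 10 type values
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
build_test_table(conn, rows=100000)
print(conn.execute("SELECT COUNT(*) FROM orders_history").fetchone()[0])  # 100000
```

For a real MySQL test you would swap the connection for a MySQL client and scale `rows` up to the millions.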
All of the SQL statements below were executed in the same unchanged environment, and each timed query was run three times. The baseline test:

select count(*) from orders_history;

Result returned: 5709294
General Paging Query
A generic paging query can be implemented with a simple limit clause, which has the following syntax:

SELECT * FROM table LIMIT [offset,] rows | rows OFFSET offset
The limit clause can be used to specify the number of records returned by the SELECT statement. The following points are to be noted:
- The first parameter specifies the offset of the first returned row
- The second parameter specifies the maximum number of rows to return
- If only one parameter is given, it specifies the maximum number of rows to return
- A second parameter of -1 retrieves all rows from the offset to the end of the result set
- The offset of the initial row is 0 (not 1)
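Tying the two limit parameters to a page number is simple arithmetic: for 1-based page p of size n, the offset is (p - 1) * n. A small Python sketch (the function name is mine, not from the article):

```python
def page_to_limit(page, page_size):
    """Translate a 1-based page number into (offset, row_count) for LIMIT."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Page 1 starts at offset 0 (not 1), as noted above
    return ((page - 1) * page_size, page_size)

print(page_to_limit(1, 10))    # (0, 10)
print(page_to_limit(101, 10))  # (1000, 10)
```

Page 101 of 10 rows maps to `limit 1000,10`, which is exactly the example used next.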
Here is an example of an application:
select * from orders_history where type=8 limit 1000,10;

This statement queries the 10 rows after the 1000th row from the table orders_history, that is, rows 1001 through 1010.
Records in a data table are sorted by default by the primary key (usually the id), so the result above is equivalent to:

select * from orders_history where type=8 order by id limit 1000,10;
For this type of query, the following tests measure the impact of the number of fetched records on query time:

select * from orders_history where type=8 limit 10000,1;
select * from orders_history where type=8 limit 10000,10;
select * from orders_history where type=8 limit 10000,100;
select * from orders_history where type=8 limit 10000,1000;
select * from orders_history where type=8 limit 10000,10000;
The three timings for each query:
- Query 1 records: 3072ms 3092ms 3002ms
- Query 10 records: 3081ms 3077ms 3032ms
- Query 100 records: 3118ms 3200ms 3128ms
- Query 1000 records: 3412ms 3468ms 3394ms
- Query 10,000 records: 3749ms 3802ms 3696ms
I also ran ten additional queries. From the timings, we can basically conclude that when fewer than 100 records are fetched, the query times show essentially no gap; as the number of fetched records grows, the time spent grows accordingly.
Tests for the query offset:

select * from orders_history where type=8 limit 100,100;
select * from orders_history where type=8 limit 1000,100;
select * from orders_history where type=8 limit 10000,100;
select * from orders_history where type=8 limit 100000,100;
select * from orders_history where type=8 limit 1000000,100;
The three timings for each query:
- Query 100 offset: 25ms 24ms 24ms
- Query 1000 offset: 78ms 76ms 77ms
- Query 10000 offset: 3092ms 3212ms 3128ms
- Query 100000 offset: 3878ms 3812ms 3798ms
- Query 1000000 offset: 14608ms 14062ms 14700ms
As the query offset increases, the query time rises sharply, especially once the offset exceeds 100,000.

This method of paging scans from the first record in the table, so the larger the offset, the slower the query; fetching more rows slows it down further as well.
Using sub-query optimizations
This method first locates the id at the offset position and then queries from there; it is suitable when the id is monotonically increasing:

select * from orders_history where type=8 limit 100000,1;

select id from orders_history where type=8 limit 100000,1;

select * from orders_history where type=8 and id>=(select id from orders_history where type=8 limit 100000,1) limit 100;

select * from orders_history where type=8 limit 100000,100;
The query times for the four statements:

- 1st statement: 3674ms
- 2nd statement: 1315ms
- 3rd statement: 1327ms
- 4th statement: 3710ms
Note the following comparisons:

- Comparing the 1st and 2nd statements: using select id instead of select * is about 3 times faster
- Comparing the 2nd and 3rd statements: the speed differs by only a few dozen milliseconds
- Comparing the 3rd and 4th statements: thanks to the speed of select id, the 3rd statement is about 3 times faster

Compared to the plain paging query, this way is several times faster.
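The subquery technique can be reproduced end to end with SQLite, which accepts the same MySQL-style `LIMIT offset, count` syntax; a self-contained sketch (assuming only an id primary key and a type column, as in the article's examples, with a much smaller table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_history (id INTEGER PRIMARY KEY, type INTEGER)")
conn.executemany("INSERT INTO orders_history VALUES (?, ?)",
                 ((i, 8) for i in range(1, 10001)))

# Plain paging: the engine must walk past the first 5000 matching rows
plain = conn.execute(
    "SELECT * FROM orders_history WHERE type=8 LIMIT 5000, 10").fetchall()

# Subquery optimization: locate the boundary id first, then range-scan from it
optimized = conn.execute(
    "SELECT * FROM orders_history WHERE type=8 AND id >= "
    "(SELECT id FROM orders_history WHERE type=8 LIMIT 5000, 1) LIMIT 10"
).fetchall()

print(plain == optimized)  # True: both return the same page
```

The speed gap only shows up at MySQL scale, but the sketch confirms the two forms return identical pages when the id is increasing.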
Using ID-qualified optimizations
In this approach, if the id of the data table is continuously incremented, we can calculate the id range of the requested page from the page number and page size, and query it using id between ... and:

select * from orders_history where type=2 and id between 1000000 and 1000100 limit 100;
Query times: 15ms 12ms 9ms
This kind of query greatly improves speed and can basically complete within a few dozen milliseconds. The limitation is that it can only be used when the id range is explicitly known; however, since an auto-increment id field is generally added when a table is created, this approach brings a lot of convenience to paging queries.
There is another way to write this:

select * from orders_history where id >= 1000001 limit 100;
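This "start from the last seen id" style generalizes to walking a whole table one page at a time without ever paying an offset cost. A Python sketch with sqlite3 (the function name and table layout are illustrative):

```python
import sqlite3

def iterate_pages(conn, page_size=100):
    """Yield pages by id range instead of OFFSET: each page starts
    where the previous one ended, so no rows are re-scanned."""
    last_id = 0
    while True:
        rows = conn.execute(
            "SELECT id FROM orders_history WHERE id > ? "
            "ORDER BY id LIMIT ?", (last_id, page_size)).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # remember where this page ended

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_history (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO orders_history VALUES (?)",
                 ((i,) for i in range(1, 351)))
print([len(p) for p in iterate_pages(conn, 100)])  # [100, 100, 100, 50]
```

Each page costs one indexed range scan, so page 10,000 is as cheap as page 1; the trade-off is that you can only step forward from a known id, not jump to an arbitrary page.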
You can also page with an in query. This is often used in multi-table association: query the id collection from one table, then use it to fetch rows from another (the table and column names below are illustrative):

select * from orders_history where id in (select order_id from order_detail where goods='pen') limit 100;

Note for this in style: some versions of MySQL do not support using limit inside an in subquery.
Using temporary table optimization
This approach no longer belongs to query optimization proper, but it is mentioned here in passing.

For the problem above where ID-qualified optimization requires a continuously incremented id: in some scenarios, such as a history table, or a table with gaps from lost or deleted data, you can consider using a temporary table to record the paged ids and then use those ids for in queries. This can greatly improve traditional paging query speed, especially when the data volume reaches tens of millions.
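A minimal sketch of the idea with sqlite3 (the temp-table name and schema are mine; in MySQL you would use CREATE TEMPORARY TABLE the same way). Precomputing the page's ids into a table also sidesteps the MySQL restriction on limit inside an in subquery noted earlier:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders_history (id INTEGER PRIMARY KEY, type INTEGER)")
conn.executemany("INSERT INTO orders_history VALUES (?, 8)",
                 ((i,) for i in range(1, 1001)))

# Step 1: record the ids of the requested page in a temporary table
conn.execute("CREATE TEMP TABLE page_ids (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO page_ids "
             "SELECT id FROM orders_history WHERE type=8 LIMIT 500, 10")

# Step 2: fetch the full rows with an IN query against the stored ids
page = conn.execute(
    "SELECT * FROM orders_history WHERE id IN (SELECT id FROM page_ids)"
).fetchall()
print([r[0] for r in page])  # ids 501..510
```

The id-collection step pays the offset cost once on a narrow index, and the wide-row fetch then touches only the ten rows that are actually needed.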
About the data table ID description
In general, when tables are created in a database, an auto-increment id field is added to every table, which makes subsequent queries convenient.

If the data volume is very large, as with an order database, the data is generally split across multiple databases and tables. In that case it is not recommended to use the database id as the unique identifier; instead, use a distributed, high-concurrency unique-ID generator, and store the generated identifier in an additional field of the data table.

Using a range query to locate the id (via the index) first, and then using the id to locate the actual rows, can speed up the query several times: that is, select id first, then select *.
My knowledge is limited, so mistakes are inevitable; if you find any errors or omissions in this article, please point them out.
Original address: http://uusama.com/458.html
MySQL Paging query optimization