How to optimize MySQL LIMIT for large data sets (reprinted)
Here is a brief introduction to MySQL LIMIT. The LIMIT clause constrains the number of rows returned by a SELECT statement. It takes one or two numeric arguments: with two arguments, the first specifies the offset of the first row to return, and the second specifies the maximum number of rows to return.
The offset of the initial row is 0, not 1:
- mysql> SELECT * FROM table LIMIT 6, 10;
Returns rows 7-16.
If only one argument is given, it specifies the maximum number of rows to return.
- mysql> SELECT * FROM table LIMIT 5;
Returns the first 5 rows.
In other words, LIMIT n is equivalent to LIMIT 0, n. LIMIT is one of the most commonly used and most important targets of MySQL optimization: it makes paging very convenient, but when the data volume is large its performance drops sharply. Both of the following statements fetch 10 rows:
- SELECT * FROM yanxue8_visit LIMIT <large offset>, 10
and
- SELECT * FROM yanxue8_visit LIMIT 0, 10
Their execution times are not even in the same order of magnitude.
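The LIMIT semantics described above (LIMIT n versus LIMIT 0, n, and offset counting from 0) are easy to check outside MySQL. This is a minimal sketch using Python's sqlite3 standard-library module, since SQLite accepts the same MySQL-style `LIMIT offset, count` syntax; the table and column names are made up for illustration:

```python
import sqlite3

# In-memory SQLite database; SQLite accepts MySQL-style "LIMIT offset, count".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t (val) VALUES (?)",
                 [(f"row{i}",) for i in range(1, 21)])  # ids 1..20

# LIMIT n is the same as LIMIT 0, n.
plain = conn.execute("SELECT id FROM t ORDER BY id LIMIT 5").fetchall()
offset0 = conn.execute("SELECT id FROM t ORDER BY id LIMIT 0, 5").fetchall()
assert plain == offset0

# LIMIT 6, 10 skips the first 6 rows and returns rows 7-16.
rows = conn.execute("SELECT id FROM t ORDER BY id LIMIT 6, 10").fetchall()
print([r[0] for r in rows])  # rows 7 through 16
```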
There are many LIMIT optimization guidelines online, mostly translated from the MySQL manual; they are correct but not very practical. Today I found a quite good article on LIMIT optimization. Original address: http://www.zhenhua.org/article.asp?Id=200 (the original text is attached below).
Instead of using LIMIT with a large offset directly, the idea is to first obtain the boundary id with a subquery and then use LIMIT size to fetch the page. According to the author's data, this is much better than using LIMIT directly. Here I test both cases with my own data. (Test environment: Windows 2003 + P4 dual-core 3 GHz + 4 GB RAM + MySQL 5.0.19.)
1. When the offset value is small.
- SELECT * FROM yanxue8_visit LIMIT 10, 10
Run repeatedly, the time stays between 0.0004 and 0.0005 seconds.
- SELECT * FROM yanxue8_visit WHERE vid >= (
-     SELECT vid FROM yanxue8_visit ORDER BY vid LIMIT 10, 1
- ) LIMIT 10
Run repeatedly, the time stays between 0.0005 and 0.0006 seconds, mostly 0.0006. Conclusion: when the offset is small, it is better to use LIMIT directly; the small gap is clearly the cost of the subquery.
2. When the offset value is large
- SELECT * FROM yanxue8_visit LIMIT <large offset>, 10
Run repeatedly, the time stays around 0.0187 seconds.
- SELECT * FROM yanxue8_visit WHERE vid >= (
-     SELECT vid FROM yanxue8_visit ORDER BY vid LIMIT <large offset>, 1
- ) LIMIT 10
Run repeatedly, the time stays around 0.0061 seconds, only about a third of the former. Predictably, the larger the offset grows, the better the latter performs.
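The rewrite benchmarked above can be sketched end to end. This is an illustration under assumptions, not the author's setup: it uses SQLite in place of MySQL (the `LIMIT offset, count` syntax is the same, though timings will of course differ), a made-up stand-in for the yanxue8_visit table, and a concrete offset of 90,000 since the reprint lost the original value:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (vid INTEGER PRIMARY KEY, page TEXT)")
conn.executemany("INSERT INTO visits (page) VALUES (?)",
                 [(f"/p/{i}",) for i in range(100_000)])  # vids 1..100000

OFFSET = 90_000  # assumed large offset; the original article's value was lost

# Plain form: the server must step over OFFSET rows before returning 10.
plain = conn.execute(
    "SELECT * FROM visits ORDER BY vid LIMIT ?, 10", (OFFSET,)).fetchall()

# Rewritten form: a narrow subquery on the indexed vid column finds the
# boundary id, then the outer query ranges forward from it.
rewritten = conn.execute(
    """SELECT * FROM visits
       WHERE vid >= (SELECT vid FROM visits ORDER BY vid LIMIT ?, 1)
       ORDER BY vid LIMIT 10""", (OFFSET,)).fetchall()

assert plain == rewritten  # same page of results either way
```

The subquery only walks the primary-key index rather than full rows, which is where the speedup the author measured comes from.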
Attached Original:
SELECT * FROM table LIMIT 5, 10; # returns rows 6-15
SELECT * FROM table LIMIT 5; # returns the first 5 rows
SELECT * FROM table LIMIT 0, 5; # returns the first 5 rows
Performance Optimization:
The high performance of LIMIT in MySQL 5.0 gave me a new understanding of data paging.
- SELECT * FROM cyclopedia WHERE ID >= (
-     SELECT MAX(ID) FROM (
-         SELECT ID FROM cyclopedia ORDER BY ID LIMIT 90001
-     ) AS tmp
- ) LIMIT 100;
- SELECT * FROM cyclopedia WHERE ID >= (
-     SELECT MAX(ID) FROM (
-         SELECT ID FROM cyclopedia ORDER BY ID LIMIT 90000, 1
-     ) AS tmp
- ) LIMIT 100;
Both statements fetch the 100 records after the 90,000th. Which is faster?
The first statement fetches the first 90,001 records, takes the largest ID as the starting point, and can then quickly locate the 100 records that follow.
The second statement fetches only the single record after the first 90,000 and uses its ID as the starting marker to locate the same 100 records.
First statement: 100 rows in set (0.23 sec)
Second statement: 100 rows in set (0.19 sec)
Clearly the second statement wins. It seems that LIMIT does not, as I had imagined, scan the whole table and return offset + length records; so LIMIT performs much better than MS-SQL's TOP.
In fact, the second statement can be simplified further:
- SELECT * FROM cyclopedia WHERE ID >= (
-     SELECT ID FROM cyclopedia LIMIT 90000, 1
- ) LIMIT 100;
Using the 90,001st record's ID directly removes the need for the MAX() operation. In theory this is more efficient, but in practice the difference is barely measurable, because the subquery locates a single record and MAX() over one row costs almost nothing; still, this form is clearer, sparing a superfluous step.
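The claim that MAX() over a one-row derived table changes nothing can be confirmed by checking that both forms return an identical page. Another SQLite-based sketch (the cyclopedia table here is a synthetic stand-in, not the author's data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cyclopedia (ID INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO cyclopedia (title) VALUES (?)",
                 [(f"entry {i}",) for i in range(100_000)])  # IDs 1..100000

# Form with MAX() over a derived table of the first 90,001 IDs.
with_max = conn.execute(
    """SELECT * FROM cyclopedia
       WHERE ID >= (SELECT MAX(ID) FROM
                      (SELECT ID FROM cyclopedia ORDER BY ID LIMIT 90001)
                    AS tmp)
       ORDER BY ID LIMIT 100""").fetchall()

# Simplified form: the subquery already returns exactly one ID.
simplified = conn.execute(
    """SELECT * FROM cyclopedia
       WHERE ID >= (SELECT ID FROM cyclopedia ORDER BY ID LIMIT 90000, 1)
       ORDER BY ID LIMIT 100""").fetchall()

assert with_max == simplified  # both return records 90,001-90,100
```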
But since MySQL's LIMIT can position directly to the desired record, why not simply write SELECT * FROM cyclopedia LIMIT 90000, 1? Wouldn't that be more concise?
I thought so too, until I tried it. The result: 1 row in set (8.88 sec). How scary! It reminds me of the "high score" I got with MySQL 4.1 yesterday. Don't use SELECT * casually; following the principle of "select only what you need", the more fields you select and the larger their data, the slower the query. Either of the two paging methods above is far better than this single-statement form; although they appear to issue more queries, they trade a small cost for efficient performance, which is well worth it.
The first scheme can also be used with MS-SQL, and may well be the best there, because locating the starting point by the primary-key ID is always fastest:
- SELECT TOP 100 * FROM cyclopedia WHERE ID >= (
-     SELECT MAX(ID) FROM (
-         SELECT TOP 90001 ID FROM cyclopedia ORDER BY ID
-     ) AS tmp
- )
But whether it is implemented as a stored procedure or as inline code, the bottleneck is always that MS-SQL's TOP must return the first N records. With small amounts of data this does not hurt much, but at hundreds of thousands of rows the efficiency will certainly be low. MySQL's LIMIT, by contrast, has many more advantages. Execute:
- SELECT ID FROM cyclopedia LIMIT 90000
- SELECT ID FROM cyclopedia LIMIT 90000, 1
MS-SQL can only do SELECT TOP 90000 ID FROM cyclopedia, which takes 390 ms to execute, not matching even MySQL's 360 ms for the same operation.