SQL LIMIT Usage in MySQL

Source: Internet
Author: User
Tags: percona
In SQL LIMIT usage, the first parameter of LIMIT is the offset.

In MySQL, LIMIT by itself retrieves rows in order, so to obtain rows at random it must be combined with an ORDER BY on a random function.


Netezza uses the random() function:

SELECT setseed(random());

SELECT * FROM table ORDER BY random() LIMIT 100;


MySQL uses the RAND() function instead.
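For example, the MySQL equivalent of the Netezza query above (note that ORDER BY RAND() sorts the entire table, so it can be slow on large tables):

SELECT * FROM table ORDER BY RAND() LIMIT 100;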

----------------------------------------------------------------------------



SELECT * FROM table LIMIT [offset,] row_count | row_count OFFSET offset

mysql> SELECT * FROM table LIMIT 5, 10;  # retrieve rows 6-15

# To retrieve all rows from an offset through the end of the result set, the
# MySQL manual suggests using some very large number as the second parameter
# (a literal -1 is not accepted):
mysql> SELECT * FROM table LIMIT 95, 18446744073709551615;  # retrieve rows 96 through the last

# If only one parameter is given, it is the maximum number of rows returned:
mysql> SELECT * FROM table LIMIT 5;  # retrieve the first 5 rows

In other words, LIMIT n is equivalent to LIMIT 0, n:

SELECT * FROM table LIMIT 5, 10;  # returns rows 6-15
SELECT * FROM table LIMIT 5;      # returns the first five rows
SELECT * FROM table LIMIT 0, 5;   # returns the first five rows
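The OFFSET keyword form from the syntax line above is interchangeable with the comma form; for example:

SELECT * FROM table LIMIT 10 OFFSET 5;  # same rows as LIMIT 5, 10 (rows 6-15)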

1. When the offset value is small:

SELECT * FROM yanxue8_visit LIMIT 10, 10;

Run it multiple times; the time stays between 0.0004 and 0.0005 seconds.

SELECT * FROM yanxue8_visit WHERE vid >= (
    SELECT vid FROM yanxue8_visit ORDER BY vid LIMIT 10, 1
) LIMIT 10;


Run it multiple times; the time stays between 0.0005 and 0.0006 seconds, mostly 0.0006.
Conclusion: when the offset is small, plain LIMIT is better; the overhead of the subquery is clearly the cause of the difference.


2. When the offset value is large:

SELECT * FROM yanxue8_visit LIMIT 10000, 10;

Run it multiple times; the time stays at around 0.0187 seconds.

SELECT * FROM yanxue8_visit WHERE vid >= (
    SELECT vid FROM yanxue8_visit ORDER BY vid LIMIT 10000, 1
) LIMIT 10;

Run it multiple times; the time is around 0.0061 seconds, only about a third of the former. It is safe to predict that the larger the offset, the greater the latter's advantage.
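Why the subquery wins: it reads only vid values, which can typically be satisfied from the index alone, so skipping 10,000 index entries is cheap; the outer query then fetches just 10 full rows. A minimal sketch of what the benchmark table might look like (the original article does not show its DDL, so the payload columns here are hypothetical); the point is that vid is indexed:

CREATE TABLE yanxue8_visit (
    vid INT NOT NULL AUTO_INCREMENT,   # indexed: this is all the subquery scans
    visited_at DATETIME,               # hypothetical payload column
    referrer VARCHAR(255),             # hypothetical payload column
    PRIMARY KEY (vid)
);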


Performance Optimization:

Based on the high performance of LIMIT in MySQL 5.0, I have a new understanding of data paging.

1.
SELECT * FROM cyclopedia WHERE ID >= (
    SELECT MAX(ID) FROM (
        SELECT ID FROM cyclopedia ORDER BY ID LIMIT 90001
    ) AS tmp
) LIMIT 100;

2.
SELECT * FROM cyclopedia WHERE ID >= (
    SELECT MAX(ID) FROM (
        SELECT ID FROM cyclopedia ORDER BY ID LIMIT 90000, 1
    ) AS tmp
) LIMIT 100;

Which is faster at fetching the 100 records after the first 90,000: the 1st statement or the 2nd?
The 1st statement first fetches the first 90,001 records and takes the largest ID value as the starting ID, which can then be used to locate the 100 records quickly.
The 2nd statement selects only the single record after the first 90,000, then takes its ID value as the starting marker to locate the 100 records.
1st statement execution result: 100 rows in set (0.23 sec)
2nd statement execution result: 100 rows in set (0.19 sec)

Obviously, the 2nd statement wins. It seems LIMIT does not, as I had previously imagined, scan the full table and return offset + length records, so LIMIT already performs much better than MS-SQL's TOP.

In fact, the 2nd statement can be simplified further:

SELECT * FROM cyclopedia WHERE ID >= (
    SELECT ID FROM cyclopedia LIMIT 90000, 1
) LIMIT 100;

Using the ID of the 90,000th record directly removes the need for the MAX operation. In theory this is more efficient, but in practice the difference is almost invisible, because the positioning subquery returns a single row and MAX gets its result with essentially no work. Still, this way of writing it is clearer and spares us a superfluous step.

However, since MySQL's LIMIT can directly control the position from which records are retrieved, why not simply use SELECT * FROM cyclopedia LIMIT 90000, 1? Wouldn't that be more concise?
It would be wrong. Try it and you will see the result: 1 row in set (8.88 sec). Scary, isn't it? It reminds me of the "high score" I got with 4.1 yesterday. SELECT * is best not used casually. Following the principle of "select what you use, and only what you need", the more fields you select and the larger their data volume, the slower the query. Either of the two paging methods above is far better than this single-statement version: although they appear to issue more queries, the extra queries are cheap and buy efficient performance, which is well worth it.
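As an illustration of the "select what you use" principle, fetch only the fields the page actually displays (the column name title is hypothetical; the article never shows cyclopedia's schema):

SELECT ID, title FROM cyclopedia LIMIT 90000, 100;  # ships far less data per row than SELECT *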

The 1st scheme can also be used with MS-SQL, and it may be the best option there, because locating the starting point by the primary-key ID is always fastest:

SELECT TOP 100 * FROM cyclopedia WHERE ID >= (
    SELECT MAX(ID) FROM (
        SELECT TOP 90001 ID FROM cyclopedia ORDER BY ID
    ) AS tmp
)

But whether it is implemented as a stored procedure or as inline code, the bottleneck remains: MS-SQL's TOP always returns the first N records. With small data volumes this is barely felt, but at hundreds of thousands of rows, efficiency will definitely be low. By contrast, MySQL's LIMIT has many advantages. Execute:

SELECT ID FROM cyclopedia LIMIT 90000
SELECT ID FROM cyclopedia LIMIT 90000, 1

The results are as follows:
90000 rows in set (0.36 sec)
1 row in set (0.06 sec)
MS-SQL, meanwhile, can only use SELECT TOP 90000 ID FROM cyclopedia, which takes 390 ms to execute; MySQL performs the equivalent operation in 360 ms.

----------------------------------------------------------------------------

Thoughts on LIMIT

At the Percona Performance Conference 2009, several Yahoo engineers presented a report, "Efficient Pagination Using MySQL", with many highlights. This article is a further extension of that original. First, let's look at the basic principle of paging:

mysql> EXPLAIN SELECT * FROM message ORDER BY id DESC LIMIT 10000, 20\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: message
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 10020
        Extra:
1 row in set (0.00 sec)

LIMIT 10000, 20 means scanning 10,020 rows that satisfy the condition, throwing away the first 10,000, and returning the last 20. Here lies the problem: for LIMIT 100000, 100, 100,100 rows must be scanned, and in a highly concurrent application every query scans more than 100,000 rows, so performance is bound to suffer. A plain LIMIT n performs fine because only n rows are scanned.

The report introduces a "clue" approach: give the page-turning code some clues to work with. Suppose we page with SELECT * FROM message ORDER BY id DESC, 20 entries per page, and we are currently on page 10, where the maximum id on the page is 9527 and the minimum is 9500. If we only offer "previous page" and "next page" links (no jump to page N), the statements can be:

Previous page: SELECT * FROM message WHERE id > 9527 ORDER BY id ASC LIMIT 20;
Next page: SELECT * FROM message WHERE id < 9500 ORDER BY id DESC LIMIT 20;

No matter how many pages are turned, each query scans only 20 rows.

The disadvantage is that we can only provide "previous" and "next" links, but our product managers are very fond of links like "< Previous 1 2 3 4 5 6 7 8 9 Next >". What then? If LIMIT m, n is unavoidable, the way to optimize is to make m as small as possible, by extending the "clue" approach. As before: SELECT * FROM message ORDER BY id DESC, 20 entries per page, currently on page 10, maximum id 9527, minimum 9500. To jump to page 8, the statement can be written as:

SELECT * FROM message WHERE id > 9527 ORDER BY id ASC LIMIT 20, 20;

To jump to page 13:

SELECT * FROM message WHERE id < 9500 ORDER BY id DESC LIMIT 40, 20;

The principle is the same: record the maximum and minimum id of the current page and compute the relative offset between the target page and the current page. Because the pages are close together, this offset is small, so m stays small and the number of scanned rows drops sharply. With traditional LIMIT m, n, the offset is always relative to the first page, so the further you page, the worse the efficiency; the method above has no such problem.

Note the ASC and DESC in the statements: if the rows are fetched with ASC, remember to reverse them for display. This has been tested on a table with a total of 60 million rows, and the effect is very noticeable.
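A worked instance of the offset arithmetic, under the same assumptions as above (currently on page 10, max id 9527, min id 9500, 20 rows per page): jumping from page 10 to page 5 means pages 9 through 6 sit between us and the target, so we skip 4 * 20 = 80 rows above id 9527 and take the next 20:

SELECT * FROM message
WHERE id > 9527        # everything newer than the current page
ORDER BY id ASC        # walk upward from the current page
LIMIT 80, 20;          # skip pages 9-6 (4 * 20 rows), take page 5
# The rows come back in ascending order; reverse them before display.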
