MySQL LIMIT and Paging Optimization

SELECT * FROM table LIMIT 5, 10;  # returns rows 6-15
SELECT * FROM table LIMIT 5;      # returns the first 5 rows
SELECT * FROM table LIMIT 0, 5;   # also returns the first 5 rows

Performance Optimization:

With the improved performance of LIMIT in MySQL 5.0, I took a fresh look at data paging. Compare these two statements, both of which fetch the 100 records that come after the first 90,000:

1. SELECT * FROM cyclopedia WHERE ID >= (SELECT MAX(ID) FROM (SELECT ID FROM cyclopedia ORDER BY ID LIMIT 90001) AS tmp) LIMIT 100;
2. SELECT * FROM cyclopedia WHERE ID >= (SELECT MAX(ID) FROM (SELECT ID FROM cyclopedia ORDER BY ID LIMIT 90000, 1) AS tmp) LIMIT 100;

Which is faster? Statement 1 fetches the first 90,001 records and uses the largest ID among them as the starting point for locating the next 100 records. Statement 2 skips the first 90,000 records, fetches only the single record after them, and uses its ID as the starting point to locate the next 100 records. The execution results:

Statement 1: 100 rows in set (0.23 sec)
Statement 2: 100 rows in set (0.19 sec)

Statement 2 clearly wins. So LIMIT does not, as I had imagined, always scan the whole table and return offset + length records; in this respect it performs much better than TOP in MS SQL Server. In fact, statement 2 can be simplified further:

SELECT * FROM cyclopedia WHERE ID >= (SELECT ID FROM cyclopedia LIMIT 90000, 1) LIMIT 100;

This uses the located ID directly and drops the MAX() operation. In theory it is more efficient, but in practice the difference is negligible, because the positioning subquery returns a single record and MAX() obtains its result with almost no work; still, the simplified form is clearer and avoids gilding the lily.

But if MySQL's LIMIT can position directly at a record, why not simply write SELECT * FROM cyclopedia LIMIT 90000, 1? Wouldn't that be more concise? I thought so too, until I tried it:

1 row in set (8.88 sec)

Frighteningly slow; it reminded me of the even worse result I had gotten under MySQL 4.1 the day before. SELECT * should not be used casually. Following the principle of selecting only what you need, the more fields you select and the larger their data volume, the slower the query. Each of the two paging methods above is far better than the single-statement version: they appear to issue more queries, but the extra queries are cheap, and they buy efficient performance at a low cost, which is well worth it.
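The comparison above can be sketched in runnable form with SQLite standing in for MySQL (SQLite accepts the same MySQL-style `LIMIT offset, count` syntax). The table name cyclopedia comes from the article; the row contents are made up for the demo. The point is that the naive big-OFFSET query and the positioning-subquery form return the identical page, so the cheaper one can be substituted freely.

```python
import sqlite3

# In-memory stand-in for the article's cyclopedia table (contents made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cyclopedia (ID INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO cyclopedia (ID, body) VALUES (?, ?)",
                 [(i, f"row {i}") for i in range(1, 1001)])

# Naive paging: the engine walks past and discards the first 900 rows.
naive = conn.execute(
    "SELECT ID, body FROM cyclopedia ORDER BY ID LIMIT 900, 10").fetchall()

# Positioning subquery: find the starting ID on the primary key, then
# range-scan from it -- the simplified form of statement 2.
seek = conn.execute(
    """SELECT ID, body FROM cyclopedia
       WHERE ID >= (SELECT ID FROM cyclopedia ORDER BY ID LIMIT 900, 1)
       ORDER BY ID LIMIT 10""").fetchall()

assert naive == seek        # identical pages: rows 901-910
print(seek[0], seek[-1])    # (901, 'row 901') (910, 'row 910')
```

On a small in-memory table the timing difference is invisible; the benchmark numbers in the article only show up at real table sizes.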

The larger the LIMIT offset, the more rows must be read from disk. The goal, therefore, is to read as little from disk as possible. In general, there are the following methods:

1. Subquery optimization
First locate the id of the starting row; the rows whose id is greater than or equal to it are the data to fetch.
Disadvantage: the data must be continuous. In effect no WHERE condition can be used, because a WHERE condition filters rows and destroys the continuity of the ids.

2. Inverted-table optimization
The inverted-table method works like building an index: a separate table maintains the page numbers, and the data is fetched through an efficient join.
Disadvantage: only suitable when the row count is fixed and rows are never deleted; otherwise the page table is hard to maintain.
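A minimal sketch of the page-table idea, using SQLite with made-up names (items, page_index, first_id): each page number maps to the first id on that page, so fetching a page becomes an indexed seek plus a short range scan. The rebuild step makes the maintenance burden visible: any insert or delete shifts every later page boundary.

```python
import sqlite3

# All table and column names here are illustrative.
PAGE_SIZE = 10
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE page_index (page INTEGER PRIMARY KEY,"
             " first_id INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"item {i}") for i in range(1, 101)])

# Build the page table -- the maintenance cost the text warns about.
ids = [r[0] for r in conn.execute("SELECT id FROM items ORDER BY id")]
conn.executemany("INSERT INTO page_index VALUES (?, ?)",
                 [(p, ids[p * PAGE_SIZE])
                  for p in range(len(ids) // PAGE_SIZE)])

def fetch_page(page):
    # No OFFSET at all: a seek via page_index, then a short range scan.
    return conn.execute(
        """SELECT i.id, i.body FROM items AS i
           WHERE i.id >= (SELECT first_id FROM page_index WHERE page = ?)
           ORDER BY i.id LIMIT ?""", (page, PAGE_SIZE)).fetchall()

print(fetch_page(7)[0])   # page 7 (0-based) starts at (71, 'item 71')
```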

3. Reverse-scan optimization
When the offset exceeds half the number of records, sort in the opposite direction, so that the offset is counted from the other end.
Disadvantage: ORDER BY is hard to optimize; adding an index to help it slows down data modification; the total number of records must be known; and the trick only applies when the offset is greater than half the data.
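The offset arithmetic for the reverse scan is easy to get wrong, so here is a small helper, a sketch under the assumption that the total row count is known (the drawback noted above); all names are illustrative.

```python
# Translate a front-of-table window into an equivalent back-of-table one.
def reversed_window(offset, limit, total):
    """Return (use_desc, new_offset, new_limit) for an equivalent window."""
    if offset + limit > total:
        limit = max(0, total - offset)      # clamp a past-the-end request
    if offset <= total // 2:
        return False, offset, limit         # cheap enough: scan ascending
    # Rows [offset, offset + limit) from the front are rows
    # [total - offset - limit, total - offset) from the back.
    return True, total - offset - limit, limit

print(reversed_window(900, 10, 1000))   # (True, 90, 10)
print(reversed_window(100, 10, 1000))   # (False, 100, 10)
```

When use_desc is true, the caller runs the query with the opposite ORDER BY direction and reverses the fetched rows before displaying them.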

4. LIMIT-cap optimization
Cap the LIMIT offset at some fixed maximum; beyond that number, simply return no data. I remember an Alibaba DBA saying this is what they do.
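A minimal sketch of the cap, with an illustrative MAX_OFFSET value (the text does not say what bound was actually used):

```python
# Reject pages past a fixed bound instead of running an ever more
# expensive query. MAX_OFFSET is an assumed policy value.
MAX_OFFSET = 1000

def page_params(page, page_size=20):
    """Return (offset, limit) for a 1-based page, or None past the cap."""
    offset = (page - 1) * page_size
    if offset >= MAX_OFFSET:
        return None          # caller renders "no more results"
    return offset, page_size

print(page_params(3))    # (40, 20)
print(page_params(51))   # None
```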

Conclusion: every LIMIT optimization comes with constraints, so analyze your actual situation before choosing one. In practice, very few users look past the first few pages anyway...

================================================================

Four paging optimization methods

I read an article about pagination a long time ago; later I found it again and translated it. For more information, see "Four ways to optimize paginated displays".

Background: with large data volumes, if naive paging is not handled properly, you will find that paging requests consume a great deal of your database's time. If you have run into this problem, the article offers some good solutions; and if you are a beginner, it will guide you toward writing more scalable paging code.

Full text Overview: The paging method is summarized as follows:

  1. Cache all query results (for example, as a file cache or a static result page).
  2. Do not display the exact total number of pages. Two tips here. First, when computing the total, query at most 501 items and paginate only the first 500; if the 501st item exists, show a "view more..." button (this case is rare). Second, when listing a page, query one extra row: to display items 1-20 on the first page, query items 1-21; if the 21st item exists, show the "next page" button, and so on.

    In fact, Google does exactly this. When you view the first page of search results, Google displays links for only the first 10 pages (100 entries in total) and does not show the total number of result pages.

    When you view the second page, only one more page link is shown.
  3. Estimate the total number of results using the "rows" column of EXPLAIN output. This is how Google estimates its result counts.

    In actual development, paginated display is one of the most common optimization problems: search results, point lists, rankings, and so on. The general paging model: from a (large) sorted result set, display 20 consecutive entries along with "next page" and "previous page" links; sometimes the total number of entries and the total number of pages must be shown as well.

    Providing such a complete display is very expensive for the database. Sometimes the SQL executed to render one such page costs more than all the other SQL on the whole page combined.
    I once encountered such a case: in a slow-query-log analysis for one of our customers, the logged SQL totaled 6300 seconds, and the two main paging queries consumed about 2850 + 380 seconds, roughly 50% of the total slow-query time.
    That is how bad things get when paging is not handled well.

    Let's analyze the typical paging query:

    # A typical paging SQL statement looks like this:
    SELECT ... FROM ... ORDER BY ... LIMIT x, 20

    If the ORDER BY part cannot use an index (such cases are still common), MySQL performs a filesort. Suppose a million rows in total satisfy the WHERE condition: MySQL retrieves all million results, stores them temporarily, file-sorts them, then throws most of them away and keeps 20. When the user clicks "next page", the whole process runs again, just shifted back one page. To also display "how many entries in total, split into how many pages", one generally either (1) uses SQL_CALC_FOUND_ROWS or (2) executes a separate SQL statement to count the rows. If this happens on every request, you can imagine that as your data volume grows, the situation only gets worse.

    In fact, there are many ways to optimize the process above. (I previously wrote an article on optimizing ranked data, but the methods described there are hard to implement, so they are not worthwhile unless the problem is complicated or important enough.) What should we do in the general case? Beyond indexing, data restructuring, and SQL optimization, there are two main directions to consider. First, aggressively cache query results to reduce SQL execution. Second, rethink your paging architecture: not every part of the page needs to be fully rendered on every request. For example, if you render links for pages 1 through 50, users rarely click straight through to a deep page; the idea is to display the most important part first.

    In this way, we have the following four optimization suggestions to improve the performance.

    1. Cache the results of the first query. After that, everything is easy: the total number of results, the total number of pages, and the entries themselves.
    2. Do not display the total number of entries. Google's paginated search results use this approach: most of the time, users only read the first few pages. I can cap the result set at 500 entries (the larger the cap, the higher the resource consumption) and query 501 records each time; if the 501st record exists, display a link such as "showing the first 500 results".
    3. Do not display the total number of pages. Provide only a "next page" link, and only when a next page exists. (A user who wants the previous page can use the browser's back button.) You may ask: without the total page count, how do I know whether there is a next page? Here is a good trick: each time you render a page, query one more row than you display. For example, to display entries 11-20, retrieve entries 11-21 without showing the extra row; if the 21st record exists, show the "next page" link, otherwise this is the last page. This way you never need to compute the total number of pages, a figure that is especially hard to cache.
    4. Estimate the total number of results. Google does this, and it turns out to be very effective: run EXPLAIN on your SQL statement, and use the "rows" column of the EXPLAIN output as an approximate count. (This method does not work everywhere, but where it applies it works well.)
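    Tips 2 and 3 both rest on the same fetch-one-extra-row trick. A runnable sketch with SQLite and made-up names (results, fetch_page): ask for page_size + 1 rows; the extra row only tells us whether a next page exists and is never shown.

```python
import sqlite3

# Illustrative stand-in table of 50 search hits.
PAGE_SIZE = 20
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany("INSERT INTO results VALUES (?, ?)",
                 [(i, f"hit {i}") for i in range(1, 51)])

def fetch_page(offset):
    # Query one row more than we display.
    rows = conn.execute(
        "SELECT id, body FROM results ORDER BY id LIMIT ? OFFSET ?",
        (PAGE_SIZE + 1, offset)).fetchall()
    has_next = len(rows) > PAGE_SIZE   # the 21st row => show "next page"
    return rows[:PAGE_SIZE], has_next

page1, more1 = fetch_page(0)    # rows 1-20, a next page exists
page3, more3 = fetch_page(40)   # rows 41-50, last page
print(len(page1), more1, len(page3), more3)  # 20 True 10 False
```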

      These methods can greatly reduce the pressure on the database and have no impact on the user experience.


