MySQL tips: optimizing LIMIT

Source: Internet
Author: User

When querying data, we often need only a limited number of rows. For example, in a browser/server (B/S) application, each page may display only 30 records. To make the page display faster, the database should return only 30 records at a time; when the user moves to the next page, the next 30 records are fetched, and so on. This shortens the time needed to display the data, and the benefit grows with the size of the table being queried. The LIMIT keyword implements this requirement: a LIMIT clause restricts a SELECT statement to returning at most the specified number of rows.

The LIMIT keyword accepts one or two numeric arguments, and each must be a nonnegative integer constant. With two arguments, the first is the offset of the first row to return and the second is the maximum number of rows to return. With a single argument, the value is the maximum number of rows and the offset is 0. Note that the first row has offset 0, not 1; many users get this wrong.
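The offset arithmetic for paging can be sketched as follows. This is a minimal illustration, not a definitive implementation: it uses Python's built-in sqlite3 module as a stand-in for MySQL (SQLite also accepts the LIMIT offset, row_count form), and the table articles and the helpers limit_args and fetch_page are hypothetical names invented for the example.

```python
import sqlite3


def limit_args(page, page_size=30):
    """Translate a 1-based page number into (offset, row_count) for LIMIT."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # The first row has offset 0, so page 1 starts at offset 0, not 1.
    return (page - 1) * page_size, page_size


# Hypothetical sample table with 100 rows (assumption, not from the article).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles (id, title) VALUES (?, ?)",
    [(i, f"article {i}") for i in range(1, 101)],
)


def fetch_page(page, page_size=30):
    """Return one page of rows using the two-argument LIMIT form."""
    offset, row_count = limit_args(page, page_size)
    return conn.execute(
        "SELECT id, title FROM articles ORDER BY id LIMIT ?, ?",
        (offset, row_count),
    ).fetchall()


first_page = fetch_page(1)  # rows at offsets 0..29
last_page = fetch_page(4)   # only the remaining rows; LIMIT is a maximum
```

Because the second argument is a maximum, the last page simply comes back shorter when fewer rows remain. An ORDER BY is included so that consecutive pages are cut from a stable ordering.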

Using LIMIT to restrict the number of rows returned can make an application more efficient, but it can also hurt system performance in some cases, for example by causing a full table scan. The following recommendations for optimizing LIMIT are offered for reference.

Recommendation one: Flexible use of the LIMIT 0 clause

By definition, a LIMIT argument of 0 returns an empty result, which seems to have little use. In practice, however, LIMIT 0 can be quite helpful.

Suppose a database engineer wants to confirm that a query is valid. Running it directly means waiting for its rows to come back, which can take a long time when many rows are involved or the computation is complex. Adding LIMIT 0 to the SELECT statement makes the database return an empty set quickly, as long as the statement has no syntax errors, so the designer can judge its validity right away. In applications that use a MySQL API exposing result set metadata, the empty result also carries the data type of each result column, so LIMIT 0 can be used to inspect the structure of a table as well.

Used this way, LIMIT 0 brings real benefits. Note, however, that the trick does not work everywhere. In the MySQL Monitor (the mysql command-line client), the query simply displays Empty set rather than the column information we want; SHOW COLUMNS or DESCRIBE can be used there instead.
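A minimal sketch of the LIMIT 0 check from application code is shown here. It runs against Python's built-in sqlite3 module as a stand-in, on the assumption that a MySQL driver exposes result metadata in a similar way; the table scores and the helper check_query are hypothetical. SQLite's cursor.description reports column names only, whereas MySQL drivers generally include type information as well.

```python
import sqlite3

# Hypothetical sample table (assumption, not from the article).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student_id INTEGER, subject TEXT, score REAL)")
conn.execute("INSERT INTO scores VALUES (1, 'math', 91.5)")


def check_query(sql):
    """Run sql with LIMIT 0 appended.

    Returns (True, column_names) if the statement is valid,
    or (False, error_message) if the database rejects it.
    """
    try:
        cur = conn.execute(sql + " LIMIT 0")
    except sqlite3.Error as exc:
        return False, str(exc)
    cur.fetchall()  # always empty because of LIMIT 0
    # The empty result still carries metadata about its columns.
    return True, [col[0] for col in cur.description]


ok, columns = check_query(
    "SELECT student_id, SUM(score) AS total FROM scores GROUP BY student_id"
)
# A deliberately broken statement: GROUP BY with no column list.
bad, error = check_query("SELECT student_id FROM scores GROUP BY")
```

Appending the clause as a string only works for statements that do not already end in a LIMIT clause, so the helper is meant for ad hoc checks rather than general use.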

Recommendation two: Using LIMIT together with GROUP BY

The GROUP BY keyword is mainly used to compute subtotals. Before the subtotals are computed, the data often has to be sorted first, so when LIMIT specifies how many results to display, grouping and sorting usually come into play as well. For example, a school grade management system may need to rank students by total score: sum each student's scores and show the top 50. This requires both a GROUP BY clause and a LIMIT clause. As the example suggests, the two clauses often depend on each other, and because they are so often used together, combining them can improve the efficiency of a LIMIT query.

The main reason is that when the two are used together, LIMIT does not calculate any unnecessary GROUP BY values. In some cases, GROUP BY can be resolved by reading an index in order (or by sorting on the index key) and then calculating each summary until the key value changes. Work that both clauses need is then done only once, which improves application performance. Extracting a specific number of rows in a single query is more efficient than first building a view of the subtotals and then limiting it, because the latter applies the two clauses separately and loses the advantage of combining them.
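The student score example can be written as a single query that groups, sorts, and limits in one pass. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL; the table scores, its columns, the index name, and the generated sample data are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema and made-up data: 200 students, 3 subjects each.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (student_id INTEGER, subject TEXT, score INTEGER)")
conn.execute("CREATE INDEX idx_scores_student ON scores (student_id)")
rows = [
    (sid, subject, 50 + (sid * 7 + k * 13) % 50)
    for sid in range(1, 201)
    for k, subject in enumerate(("math", "physics", "history"))
]
conn.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)

# One statement computes the subtotals, ranks them, and keeps the top 50.
# The student_id tie-breaker makes the ranking deterministic.
top50 = conn.execute(
    """
    SELECT student_id, SUM(score) AS total
    FROM scores
    GROUP BY student_id
    ORDER BY total DESC, student_id
    LIMIT 50
    """
).fetchall()
```

One caveat about this particular query: because the ordering is on the aggregate itself, every group still has to be summed before the top 50 can be known. The early stop described above applies when the rows are wanted in the order of the grouping key, for example with ORDER BY student_id.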

Recommendation three: Use SQL_CALC_FOUND_ROWS to make the clause more flexible

By default, a LIMIT clause returns the number of rows the user specified. As soon as the database has sent that many rows to the client, it abandons the rest of the query. In the student score example, if the user only needs the top 50 total scores, the database returns 50 records and then stops.

In some situations, however, the user needs more than that. What if, besides retrieving specific records, the user also needs to know the total number of matching records? For example, the user may want the top 50 students and also the number of students whose total score exceeds 500 points. LIMIT alone cannot meet this need because it only deals with the first 50 records. To do so, LIMIT is usually combined with the SQL_CALC_FOUND_ROWS keyword.

This keyword tells the database to calculate how many rows match the WHERE condition, ignoring the LIMIT clause. The total is not returned with the query itself; the user obtains it by subsequently executing SELECT FOUND_ROWS(). Note that the keyword has side effects. The server must work through the full result set instead of stopping once the LIMIT is satisfied, and the follow-up SELECT FOUND_ROWS() statement itself cannot be served from the query cache, so query performance can drop in some cases. In general, the keyword is worth using only when the WHERE condition is fairly complex. This is a performance recommendation, not a technical restriction: the keyword is syntactically valid even with a simple WHERE condition, but the performance is not ideal.
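The two ways of getting a page plus a total can be compared side by side. In this sketch the MySQL statements are kept as strings only, because it runs on Python's built-in sqlite3 module, which has no SQL_CALC_FOUND_ROWS; the executed part uses a separate COUNT(*) query instead. The table totals and the sample data are assumptions. Note that MySQL deprecated SQL_CALC_FOUND_ROWS and FOUND_ROWS() as of version 8.0.17 in favor of the separate COUNT(*) form.

```python
import sqlite3

# Hypothetical table of per-student total scores (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totals (student_id INTEGER PRIMARY KEY, total INTEGER)")
conn.executemany(
    "INSERT INTO totals VALUES (?, ?)",
    [(sid, 300 + (sid * 37) % 400) for sid in range(1, 301)],
)

# MySQL form described in the text. Shown for reference, not executed here.
MYSQL_PAGE = (
    "SELECT SQL_CALC_FOUND_ROWS student_id, total FROM totals "
    "WHERE total > 500 ORDER BY total DESC LIMIT 50"
)
MYSQL_COUNT = "SELECT FOUND_ROWS()"

# Portable equivalent: one query for the page, one for the total.
page = conn.execute(
    "SELECT student_id, total FROM totals "
    "WHERE total > ? ORDER BY total DESC, student_id LIMIT 50",
    (500,),
).fetchall()
(matching,) = conn.execute(
    "SELECT COUNT(*) FROM totals WHERE total > ?", (500,)
).fetchone()
```

Both queries must use the same WHERE condition, otherwise the total no longer describes the set the page was cut from.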

Recommendation four: What happens when LIMIT is combined with the DISTINCT keyword

The DISTINCT keyword is mainly used to filter out duplicate records, and LIMIT is mainly used to specify the number of rows returned. What happens when the two are used together? Just as a literal reading suggests, the database returns the specified number of distinct records. With a LIMIT argument of 50, the database stops as soon as it has found 50 unique records. If the underlying data contains duplicates, the number of rows the database actually examines is often larger than the number given to LIMIT.

In practice this combination is very useful. Suppose there is a table of employee attendance records, and the database administrator needs a list of the top 20 for absences. Adding DISTINCT to the query filters out duplicate records so that no record appears twice, and the requirement can be met without running several queries.
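The effect of DISTINCT on a limited result can be seen by running the same query with and without it. This is a sketch on Python's built-in sqlite3 module as a stand-in for MySQL; the table absences, its columns, and the generated dates are made-up sample data.

```python
import sqlite3

# Hypothetical attendance table: one row per employee per day absent.
# Each of the 40 employees gets between 1 and 4 absence rows (made-up data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE absences (employee_id INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO absences VALUES (?, ?)",
    [
        (emp, f"2024-01-{d:02d}")
        for emp in range(1, 41)
        for d in range(1, emp % 4 + 2)
    ],
)

# With DISTINCT, LIMIT 20 yields 20 different employees.
with_distinct = conn.execute(
    "SELECT DISTINCT employee_id FROM absences ORDER BY employee_id LIMIT 20"
).fetchall()

# Without DISTINCT, the 20 rows are used up by repeated employees.
without_distinct = conn.execute(
    "SELECT employee_id FROM absences ORDER BY employee_id LIMIT 20"
).fetchall()
```

To produce the 20 distinct employees, the database has to read past the duplicate rows, which is why the number of rows examined exceeds the LIMIT value.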

Recommendation five: Relationship between LIMIT and indexes

If the database administrator decides to use a LIMIT clause to specify the number of records to display, it is best to make full use of indexes to avoid full table scans and improve efficiency. Conversely, when only a few rows are selected with LIMIT, the database will in some cases use an index where it would otherwise prefer a full table scan.

Suppose the database administrator combines LIMIT with an ORDER BY clause. Once the database has found the first row_count rows of the sorted result, it stops sorting rather than sorting the entire result. If ORDER BY is used alone, the entire result is sorted. Even with this shortcut, sorting takes time; when the ordering can be done by using an index, the query becomes much faster.

One point deserves emphasis. If a filesort must be done, all rows that match the query are selected, and most or all of them are sorted, before the first row_count rows are found. In any case, once those first rows have been found, the database does not sort the rest of the result set and ends the sort automatically.
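Whether an ORDER BY with LIMIT is served from an index can be checked from the query plan. The sketch below uses Python's built-in sqlite3 module and its EXPLAIN QUERY PLAN statement as a stand-in; in MySQL the corresponding tool is EXPLAIN, whose output looks different. The table totals, the index idx_totals_total, and the helper plan are hypothetical names, and the exact plan text depends on the SQLite version.

```python
import sqlite3

# Hypothetical table with unique totals (made-up data; 1009 is prime,
# so sid * 7 % 1009 is distinct for every sid in 1..300).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE totals (student_id INTEGER PRIMARY KEY, total INTEGER)")
conn.executemany(
    "INSERT INTO totals VALUES (?, ?)",
    [(sid, (sid * 7) % 1009) for sid in range(1, 301)],
)

QUERY = "SELECT student_id, total FROM totals ORDER BY total DESC LIMIT 10"


def plan(sql):
    """Return the query plan as one string (last column holds the detail)."""
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


# Without an index on the sort column, the rows must be sorted explicitly.
plan_without_index = plan(QUERY)
top_without_index = conn.execute(QUERY).fetchall()

# With an index on the sort column, rows can be read in index order
# and the scan can stop after row_count rows.
conn.execute("CREATE INDEX idx_totals_total ON totals (total)")
plan_with_index = plan(QUERY)
top_with_index = conn.execute(QUERY).fetchall()
```

The query returns the same rows either way; only the amount of work changes. Comparing the two plan strings shows whether the index is actually chosen for the ordering.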

The essential function of the LIMIT clause is to limit the number of records returned to the user, but it has other uses as well, such as quickly checking the validity of a query or inspecting the structure of a table. It can also have side effects that hurt system performance, so it is best to apply the measures above to keep performance up.
