MySQL explained: optimizing paged queries over massive data

Source: Internet
Author: User
Tags: dba, mysql, index

For a detailed explanation of pagination, see http://blog.csdn.net/u011225629/article/details/46775947

SELECT * FROM table ORDER BY id LIMIT 1000, 10;

There is nothing wrong with this statement in principle or in practice, but once the table grows past a few hundred thousand rows, a single run can take minutes, and the deeper the page, the longer it takes. At that point we need a faster way to do the same job.

Many optimizations have already been posted online, and they basically look like this:

SELECT * FROM table WHERE id >= (SELECT id FROM table ORDER BY id LIMIT 1000000, 1) LIMIT 10;

This is a good workaround: the paged query basically stays within one second. But experience shows there is an even better approach:

SELECT * FROM table WHERE id BETWEEN 1000001 AND 1000010;

Compared with the solution above, this is faster still, though it assumes the ids in the range are contiguous.
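Since the timings here depend entirely on the machine and engine, here is a minimal runnable sketch of the three approaches, using Python's sqlite3 standard library as a stand-in for MySQL (an assumption for testability; the table name t and the 2,000-row fixture are invented). All three queries return the same page when the ids have no gaps:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 ((i, f"row {i}") for i in range(1, 2001)))

# 1. Plain offset paging: the engine walks and discards the first 1000 rows.
a = conn.execute("SELECT id FROM t ORDER BY id LIMIT 1000, 10").fetchall()

# 2. Subquery trick: locate the boundary id via the index, then read 10 rows from it.
b = conn.execute(
    "SELECT id FROM t WHERE id >= (SELECT id FROM t ORDER BY id LIMIT 1000, 1) "
    "ORDER BY id LIMIT 10").fetchall()

# 3. BETWEEN: cheapest of all, but only valid while the ids are contiguous.
c = conn.execute("SELECT id FROM t WHERE id BETWEEN 1001 AND 1010 ORDER BY id").fetchall()
```

On a real table the BETWEEN form breaks as soon as rows are deleted and the id sequence gains holes, which is what the IN-list tip below addresses.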

Here are two more query tips:

(1) If the ids you need are not a contiguous range, the best approach is to find the ids first and then query with IN:

SELECT * FROM table WHERE id IN (10000, 100000, 1000000, ...);
(2) When the column being queried holds long strings, such as URLs, add an extra column at design time that stores a hash of the string, for example its CRC32 or MD5 value, and query on that instead. Matching long strings directly is inefficient.
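As a sketch of that hash-column tip, here is one way it might look, again with sqlite3 and Python's zlib.crc32 (the pages table and both helper functions are invented for illustration). The key points are indexing the small integer hash and re-checking the full string to rule out CRC32 collisions:

```python
import sqlite3
import zlib

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, url TEXT, url_crc INTEGER)")
conn.execute("CREATE INDEX idx_url_crc ON pages(url_crc)")

def insert_url(url):
    # Store the CRC32 of the URL alongside it so lookups hit the integer index.
    conn.execute("INSERT INTO pages (url, url_crc) VALUES (?, ?)",
                 (url, zlib.crc32(url.encode())))

def find_by_url(url):
    # Filter on the hash via the index, then re-check the full string in case
    # two different URLs share the same CRC32 value.
    return conn.execute("SELECT id FROM pages WHERE url_crc = ? AND url = ?",
                        (zlib.crc32(url.encode()), url)).fetchall()

insert_url("https://example.com/a")
insert_url("https://example.com/b")
```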

How do you optimize MySQL for high-speed paging at the tens-of-millions scale? What follows is excerpted from material found online; readers should test it for themselves.

How high can MySQL performance really go? After using PHP for more than half a year, I only started thinking about this question seriously the day before yesterday. There was pain and despair along the way; now I am full of confidence! MySQL is definitely a database that rewards DBA-level mastery. Anyone can write a little news system with ten thousand records, and with the XX framework you can even develop it quickly.

But when the data volume reaches 100,000, a million, tens of millions, can its performance still keep up? One small mistake may force the whole system to be rewritten, or stop it running properly at all. All right, enough talk. Let the facts speak; look at an example:

The data table is collect (id, title, info, vtype), four fields: title is fixed-length, info is TEXT, id is auto-increment, and vtype is a TINYINT with an index on it. This is a simple model of a typical news system.

Now fill it with data: 100,000 news records.

The finished collect table holds 100,000 records and occupies 1.6 GB on disk. OK, look at the following SQL statement:

SELECT id, title FROM collect LIMIT 1000, 10; Very fast, basically done in 0.01 seconds. Now look at the following:

SELECT id, title FROM collect LIMIT 90000, 10; Paging starting from row 90,000. The result?

It takes 8-9 seconds to complete. My god, what went wrong? To optimize this, I went looking for answers online.

Look at the following statement:

SELECT id FROM collect ORDER BY id LIMIT 90000, 10;

Very fast: it finishes in 0.04 seconds. Why? Because it walks the primary-key index on id, of course it is fast.
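The "why" can be made visible without a stopwatch. SQLite's EXPLAIN QUERY PLAN, used here as a stand-in for MySQL's EXPLAIN (an assumption for testability), shows that ordering by the integer primary key needs no separate sort step, while ordering by an unindexed column does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE collect (id INTEGER PRIMARY KEY, title TEXT)")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is the human-readable detail.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# Ordering by the primary key: rows already come back in id order, no sort needed.
by_id = plan("SELECT id FROM collect ORDER BY id LIMIT 90000, 10")

# Ordering by an unindexed column: the plan adds a "USE TEMP B-TREE FOR ORDER BY" step.
by_title = plan("SELECT id FROM collect ORDER BY title LIMIT 10")
```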

The fix suggested online is:

SELECT id, title FROM collect WHERE id >= (SELECT id FROM collect ORDER BY id LIMIT 90000, 1) LIMIT 10;

This gets its speed by letting the subquery use the id index. But the problem becomes a bit more complicated. Look at the following statement:

SELECT id FROM collect WHERE vtype = 1 ORDER BY id LIMIT 90000, 10; Very slow: it took 8-9 seconds!

At this point I believe many people will feel, like me, on the verge of a breakdown. Isn't vtype indexed? How can it be slow? An index on vtype alone is fine if you run

SELECT id FROM collect WHERE vtype = 1 LIMIT 1000, 10;

which is very fast, basically 0.05 seconds. But raise the offset 90-fold to start from 90,000, and even linear scaling would predict 0.05 * 90 = 4.5 seconds; the measured result of 8-9 seconds is nearly twice that. From here, some people proposed splitting the data into a separate table, the same idea the Discuz forum uses. The idea is as follows:

Build an index table T (id, title, vtype) with fixed-length rows, do the paging against it, and then go back to collect to fetch info. Is it feasible? An experiment will tell.

The 100,000 records copied into T (id, title, vtype) make a data table of about 20 MB. Using

SELECT id FROM t WHERE vtype = 1 ORDER BY id LIMIT 90000, 10;

it is very fast: basically 0.1-0.2 seconds to finish.

Why? I suspect it is because collect simply has too much data, so paging over it travels a long way; the cost of LIMIT is entirely tied to the size of the table. In truth this is still a full scan, only faster because the data volume is small, just 100,000 rows. OK, on to a crazy experiment: grow it to one million rows and test again. With ten times the data, the T table is now over 200 MB, and it is fixed-length. The same query as before still finishes in 0.1-0.2 seconds. So split-table performance is fine? Wrong!

That is because our LIMIT offset is still only 90,000. Let's make it big: start from 900,000.

SELECT id FROM t WHERE vtype = 1 ORDER BY id LIMIT 900000, 10;

Look at the result: 1-2 seconds!

Why?

Still that long. Very depressing! Some people claimed fixed-length rows would improve LIMIT performance, and at first I believed it: since every record has a fixed length, MySQL ought to be able to compute the position of row 900,000 directly. But we overestimated MySQL's cleverness; it is not a commercial database, and it turns out fixed versus variable length makes little difference to LIMIT. No wonder some people say Discuz becomes very slow past one million records. I believe it now; this comes down to database design!

Can't MySQL break through the one-million barrier? Is one million rows really the limit of paging?

The answer is no. The reason you cannot get past one million rows is poor schema design, nothing inherent to MySQL.

The no-split-table method is described below. Time for a crazy test!

One table, one million records, a 10 GB database. How do we page it at high speed?

Well, our test goes back to the collect table. The conclusions so far:

With 300,000 rows the split-table approach is feasible, but beyond 300,000 it slows down to the point of being unbearable!

Of course, split table plus my method together would be absolutely perfect; but my method alone solves the problem perfectly, with no split table at all.

The answer is: a composite index!

Once, while designing a MySQL index, I noticed in passing that you can name the index and choose several columns to include in it. What is that good for? It started from this:

SELECT id FROM collect ORDER BY id LIMIT 90000, 10;

This is fast because it walks the index, but add a WHERE clause and the index is no longer used. On a whim I added a composite index search(vtype, id) and tested:

SELECT id FROM collect WHERE vtype = 1 LIMIT 90000, 10; Fast! 0.04 seconds to complete.

Test again: SELECT id, title FROM collect WHERE vtype = 1 LIMIT 90000, 10; A pity: 8-9 seconds, because selecting title means the search index no longer covers the query.

Test again with the index built as search(id, vtype): even with the same SELECT id statement, it is a regrettable 0.5 seconds.

In summary: if there is a WHERE condition and you still want LIMIT to use an index, you must design a composite index that puts the WHERE column first and the primary key used by LIMIT second, and you must SELECT only the primary key!
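That rule can be checked mechanically. The sketch below (sqlite3 again as a stand-in for MySQL; the index name search follows the article) shows that with a (vtype, id) index, selecting only the primary key is answered entirely from the index, a covering index, while pulling in title forces table lookups:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collect (id INTEGER PRIMARY KEY, title TEXT, info TEXT, vtype INTEGER)")
conn.execute("CREATE INDEX search ON collect(vtype, id)")

def plan(sql):
    # Join the detail column of every EXPLAIN QUERY PLAN row into one string.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

# SELECT only the primary key: every needed value lives in the index itself.
covered = plan("SELECT id FROM collect WHERE vtype = 1 ORDER BY id LIMIT 90000, 10")

# Add title and the index can no longer cover the query on its own.
not_covered = plan(
    "SELECT id, title FROM collect WHERE vtype = 1 ORDER BY id LIMIT 90000, 10")
```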

The paging problem is perfectly solved.

If you can return the ids at high speed, there is hope of optimizing LIMIT; by this logic, a LIMIT into the millions should finish in 0.0x seconds.

Optimizing and indexing MySQL statements turns out to matter a great deal!

All right, back to the original question: how do we apply this research profitably to real development? With compound queries, my lightweight framework is useless; you have to build the paging strings yourself. How much trouble is that? Look at one more example, and the idea falls out:

SELECT * FROM collect WHERE id IN (9000, 12, 50, 7000); Checked in 0 seconds!

My god, MySQL's index works just as well with an IN clause!

It seems the claims online that IN cannot use an index are wrong!

With this conclusion it is easy to apply to the lightweight framework, with only a simple transformation. The idea is straightforward:

(1) Use the optimized index to find the ids, and join them into a string like "123,90000,12000".

(2) A second query then fetches the rows for those ids. A small index plus a small change to the code lets MySQL support efficient paging over millions, even tens of millions, of rows.
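The two-step recipe might be sketched like this (sqlite3 stand-in again; the fetch_page helper and the 1,000-row fixture are invented). Step one is the narrow, index-friendly id query; step two fetches the wide rows by primary key with an IN list, using placeholders rather than splicing the id string into the SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE collect (id INTEGER PRIMARY KEY, title TEXT, info TEXT, vtype INTEGER)")
conn.execute("CREATE INDEX search ON collect(vtype, id)")
conn.executemany("INSERT INTO collect VALUES (?, ?, ?, ?)",
                 ((i, f"title {i}", "body", i % 3) for i in range(1, 1001)))

def fetch_page(vtype, offset, size):
    # Step 1: index-only query returns just the ids for the requested page.
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM collect WHERE vtype = ? ORDER BY id LIMIT ? OFFSET ?",
        (vtype, size, offset))]
    if not ids:
        return []
    # Step 2: fetch the wide rows by primary key; one placeholder per id.
    marks = ",".join("?" * len(ids))
    return conn.execute(
        f"SELECT id, title FROM collect WHERE id IN ({marks}) ORDER BY id",
        ids).fetchall()

# Page 4 (rows 31-40) of the vtype=1 articles.
page = fetch_page(1, 30, 10)
```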

Through this example I came away with one reflection: for large systems, PHP must never use frameworks, especially the kind where you cannot even see the SQL statements!

My lightweight framework nearly made me collapse! It is only suited to rapid development of small applications. For ERP, OA, or large websites, the data layer, and the parts of the logic layer it touches, must not be buried in a framework. If the programmer loses control of the SQL statements, the project's risk grows geometrically!

Especially with MySQL: it takes a professional DBA to get its best performance out of it.

The performance difference a single index makes can run to a factor of thousands.

