MySQL performance optimization

Source: Internet
Author: User

How to extract a random efficiency from a data table in MySQL, at the same time to ensure the highest efficiency.

Method One

This is the most primitive and intuitive syntax. For example, the following:

SELECT * from Foo ORDER by RAND () LIMIT 1

This method works when the amount of data in the data table is small. But when the amount of data reaches a certain level, for example 1 million data or more. There is a very big performance problem.

Assuming that you parse this statement through explain, you will find that although MySQL is sorted by creating a temporary table, it is because of the nature of order by and limit itself. We are still unable to get the required records through limit until the sorting is complete. i.e. You must first sort the data by how many records you have.

Method Two

It seems that for the large data volume of random data extraction, the crux of the performance is on the order by, then how to avoid? Method two provides a scenario.



First, get the total number of records for the data table:

SELECT Count (*) as num_rows from Foo

Then, the total number of records (assumed to be num_rows) is recorded by the appropriate daemon.

Then runs:

SELECT * from foo LIMIT [0 to num_rows a random number],1

above this random number can be completed by the daemon. The premise of this method is that the ID of the table is continuous or self-growing.

This method has successfully avoided the creation of an order by.

Method Three

is not possible without an order by. Using an SQL statement to implement the method two? Yes, that's to join.

Select * from Bar B joins (SELECT ceil (MAX (ID) *rand ()) as ID from bar) as M on b.ID >= m.id LIMIT 1;

This method implements our purpose, at the same time, in the case of large amounts of data, also avoids the order by the whole record of the order process. Because the SELECT statement inside the join is actually only run once, not n times (n equals the num_rows in method two). Also, we can add the "greater than" symbol to the filter statement. It is also possible to avoid the occurrence of empty records due to the fact that the ID is not continuous.

To query 5 non-repeating data in MySQL, use the following:

SELECT * from ' table ' ORDER by RAND () LIMIT 5

will be able to. But the real test to find that this efficiency is very low. A library of more than 150,000 articles, query 5 data, unexpectedly more than 8 seconds

Searching for Google, the internet is basically querying max (ID) * RAND () to get data randomly.

SELECT *
From ' table ' as T1 joins (select ROUND (RAND () * (select MAX (ID) from ' table ')) as ID) as T2
WHERE t1.id >= t2.id
ORDER by T1.id ASC LIMIT 5;

However, this will produce 5 consecutive records. The solution is to query only one, query 5 times.

Even so it is worth it, because 150,000 of the tables, the query only need 0.01 seconds less than.

The above statement uses a join. MySQL on the forum someone used

SELECT *
From ' table '
WHERE ID >= (SELECT floor (MAX (ID) * RAND ()) from ' table ')
ORDER by ID LIMIT 1;

I tested it for 0.5 seconds and it was a good speed. But with the above statement there is still a very big gap. There is something wrong with the total sleep.

So I rewrote the sentence a bit.

SELECT * from ' table '
WHERE ID >= (SELECT Floor (RAND () * (select MAX (ID) from ' table ')))
ORDER by ID LIMIT 1;

This, the efficiency is improved, the query time only 0.01 seconds

At last. Put the statement in good condition again. Plus the inference of min (id). I was at the very beginning of the test because I did not add the min (id) inference. The result is that half of the time is always queried to the previous rows in the table.
The complete query statement is:

SELECT * from ' table '
WHERE ID >= (SELECT Floor (RAND () * ((select MAX (ID) from ' table ')-(select min (id) from ' table ') + (select min (id) FRO M ' table ')))
ORDER by ID LIMIT 1;

SELECT *
From ' table ' as T1 JOIN (select ROUND (RAND () * ((select MAX (ID) from ' table ')-(select min (id) from ' table ') + (select min (i d) from ' table ') as ID) as T2
WHERE t1.id >= t2.id
ORDER by T1.id LIMIT 1;

Finally, the two statements are queried 10 times respectively.
The former takes 0.147433 seconds
The latter takes time 0.015130 seconds
It seems that the syntax for join is much higher than the use of function directly in where.

MySQL performance optimization

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.