MySQL ORDER BY RAND() efficiency analysis

Source: Internet
Author: User

The usual way to write it is: SELECT * FROM content ORDER BY RAND() LIMIT 1.
[Drift Easy's note: on 30,000 records this query took 0.3745 seconds (the same table is used below); the MySQL slow query log shows that ORDER BY RAND() scanned the full table twice!]

Then I looked up the official MySQL manual. Its note on RAND() says, roughly, that RAND() cannot be used in an ORDER BY clause because the column would be evaluated multiple times. However, as of MySQL 3.23, rows can still be retrieved in random order with ORDER BY RAND().

But real testing showed this to be very inefficient. On a table of more than 150,000 rows, fetching 5 rows took more than 8 seconds. The official manual also says that RAND() is evaluated multiple times in the ORDER BY clause, so the poor performance is no surprise.

A Google search turned up an approach that uses a JOIN together with MAX(id) * RAND() to pick random rows.

The code is as follows:
SELECT *
FROM `content` AS t1 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `content`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id ASC LIMIT 1;

[Drift Easy's note: the query took 0.0008 seconds; this statement can be recommended!]

But with LIMIT 5 this returns 5 consecutive records. The only solution is to fetch one row per query and run the query 5 times. Even so, this is acceptable, because on the 150,000-row table each query needs only 0.01 seconds.
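
Running the single-row query several times can return the same row twice, so the loop has to skip duplicates. A minimal PHP sketch of such a loop follows. The names fetchDistinctRandomRows and $fetchOne are hypothetical, and an in-memory id list stands in for the database so that the example runs without a server.

```php
<?php
// Hypothetical helper (not from the original article): keeps calling
// $fetchOne, which should run the single-row random query and return one
// row as an array with an 'id' key (or null when nothing matched), until
// $count distinct rows are collected or $maxAttempts is reached.
function fetchDistinctRandomRows(callable $fetchOne, int $count, int $maxAttempts = 100): array
{
    $rows = [];
    for ($attempt = 0; $attempt < $maxAttempts && count($rows) < $count; $attempt++) {
        $row = $fetchOne();
        if ($row === null) {
            continue; // no row matched this time, try again
        }
        $rows[$row['id']] = $row; // keyed by id, so a repeated row is stored once
    }
    return array_values($rows);
}

// Stand-in for the database (an assumption for this sketch): emulates
// "WHERE id >= <random threshold> ORDER BY id LIMIT 1" over a sorted id list.
$ids = range(1, 1000);
$fetchOne = function () use ($ids) {
    $threshold = mt_rand(1, 1000);
    foreach ($ids as $id) {
        if ($id >= $threshold) {
            return ['id' => $id];
        }
    }
    return null;
};

$rows = fetchDistinctRandomRows($fetchOne, 5);
echo count($rows), " distinct rows fetched\n";
```

In real use, $fetchOne would execute the JOIN statement above and return the fetched row.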

There is another way:

The code is as follows:
SELECT * FROM `content` AS a JOIN (SELECT MAX(id) AS id FROM `content`) AS b ON (a.id >= FLOOR(b.id * RAND())) LIMIT 5;

The above method guarantees randomness within a certain range. The query took 0.4265 seconds, so it is also not recommended.

The following statement was posted by someone on the MySQL forum:

The code is as follows:
SELECT *
FROM `content`
WHERE id >= (SELECT FLOOR(MAX(id) * RAND()) FROM `content`)
ORDER BY id LIMIT 1;

[Drift Easy's note: the query took 1.2254 seconds; strongly not recommended! In testing, on a table of 30,000 rows this statement scanned 5 million rows!]

This is still much slower than the statements above. Something did not seem right, so I rewrote the statement.

The code is as follows:
SELECT * FROM `content`
WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM `content`)))
ORDER BY id LIMIT 1;

[Query took 0.0012 seconds]

This time the efficiency improved: the query took only 0.01 seconds.

Finally, I refined the statement by taking MIN(id) into account. At the start of my testing I had not included MIN(id), and as a result about half of the queries returned one of the first few rows in the table.
The full query statement is:

SELECT * FROM `content`
WHERE id >= (SELECT FLOOR(RAND() * ((SELECT MAX(id) FROM `content`) - (SELECT MIN(id) FROM `content`)) + (SELECT MIN(id) FROM `content`)))
ORDER BY id LIMIT 1;

[Query took 0.0012 seconds]

The code is as follows:
SELECT *
FROM `content` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `content`) - (SELECT MIN(id) FROM `content`)) + (SELECT MIN(id) FROM `content`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id LIMIT 1;

[Query took 0.0008 seconds]

Finally, in PHP, each of the two statements was run 10 times.
The former took 0.147433 seconds.
The latter took 0.015130 seconds.
It seems that the JOIN syntax is much more efficient than using functions directly in the WHERE clause. (via)
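
The effect of the MIN(id) adjustment described earlier can be checked without a database. The sketch below is an illustration under assumptions: the function names are hypothetical, mt_rand() stands in for MySQL's RAND(), and the table is assumed to hold ids from 5001 to 10000. It counts how often the random threshold falls at or below MIN(id), which is exactly when WHERE id >= threshold ORDER BY id LIMIT 1 returns the first row of the table.

```php
<?php
// Random fraction in [0, 1]; a stand-in for MySQL's RAND().
function randomFraction(): float
{
    return mt_rand() / mt_getrandmax();
}

// Threshold as in the statements that use only MAX(id).
function thresholdWithoutMin(int $maxId): int
{
    return (int) floor(randomFraction() * $maxId);
}

// Threshold with the MIN(id) adjustment: RAND() * (max - min) + min.
function thresholdWithMin(int $minId, int $maxId): int
{
    return (int) floor(randomFraction() * ($maxId - $minId) + $minId);
}

// Hypothetical table whose ids run from 5001 to 10000 (ids 1..5000 unused).
$minId = 5001;
$maxId = 10000;
$trials = 2000;
$firstRowHits = ['withoutMin' => 0, 'withMin' => 0];

for ($i = 0; $i < $trials; $i++) {
    // A threshold at or below MIN(id) always selects the first row.
    if (thresholdWithoutMin($maxId) <= $minId) {
        $firstRowHits['withoutMin']++;
    }
    if (thresholdWithMin($minId, $maxId) <= $minId) {
        $firstRowHits['withMin']++;
    }
}

printf("first-row share without MIN(id): %.2f\n", $firstRowHits['withoutMin'] / $trials);
printf("first-row share with MIN(id):    %.2f\n", $firstRowHits['withMin'] / $trials);
```

In this setup roughly half of the unadjusted thresholds land at or below MIN(id), which matches the behaviour reported above. With the adjustment, that only happens when the draw is almost exactly MIN(id).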


The first scenario, the original ORDER BY RAND() method:

The code is as follows:
// Note: the mysql_* functions were removed in PHP 7.0; on current PHP
// use mysqli_query($conn, $sql) and mysqli_fetch_array($result) instead.
$sql = "SELECT * FROM content ORDER BY RAND() LIMIT 12";
$result = mysql_query($sql, $conn);
$n = 1;
$rnds = '';
while ($row = mysql_fetch_array($result)) {
    // Append one numbered link per row
    $rnds = $rnds . $n . " <a href='show" . $row['id'] . "-" . strtolower(trim($row['title'])) . "'>" . $row['title'] . "</a><br />\n";
    $n++;
}

On 30,000 rows, fetching 12 random records takes 0.125 seconds, and the efficiency drops further as the data volume grows.
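
The article does not say how its timings were taken. One way to collect such numbers in PHP is microtime(true). A minimal sketch follows, in which timeRuns is a hypothetical helper and the closure is a stand-in for the real query so that the example runs without a database.

```php
<?php
// Hypothetical timing helper: runs $query $times times and returns the
// total elapsed wall-clock time in seconds.
function timeRuns(callable $query, int $times = 10): float
{
    $start = microtime(true);
    for ($i = 0; $i < $times; $i++) {
        $query();
    }
    return microtime(true) - $start;
}

// Stand-in workload (an assumption for this sketch); in a real comparison
// the closure would execute one of the SQL statements.
$elapsed = timeRuns(function () {
    usleep(1000); // pretend the query takes about 1 ms
}, 10);

printf("10 runs took %.6f seconds\n", $elapsed);
```

Running each candidate statement through the same helper keeps the comparison fair, since connection setup and result formatting stay outside the timed closure.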

The second scenario, the improved JOIN method:

The code is as follows:
$rnds = '';
for ($n = 1; $n <= 12; $n++) {
    // One query per random row, so the 12 rows are not consecutive
    $sql = "SELECT * FROM `content` AS t1
        JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `content`)) AS id) AS t2
        WHERE t1.id >= t2.id ORDER BY t1.id ASC LIMIT 1";
    $result = mysql_query($sql, $conn);
    $yi = mysql_fetch_array($result);
    $rnds = $rnds . $n . " <a href='show" . $yi['id'] . "-" . strtolower(trim($yi['title'])) . "'>" . $yi['title'] . "</a><br />\n";
}

On 30,000 rows, fetching 12 random records takes 0.004 seconds, a significant improvement: about 30 times faster than the first method. Disadvantage: multiple SELECT queries, so more I/O overhead.

The third scenario: generate the random ID sequence first, then fetch the rows with an IN query (Drift Easy recommends this usage: low I/O overhead and the fastest):

The code is as follows:
$sql = "SELECT MAX(id), MIN(id) FROM content";
$result = mysql_query($sql, $conn);
$yi = mysql_fetch_array($result);
$idmax = $yi[0];
$idmin = $yi[1];
$idlist = '';
// Generate 20 random IDs, more than the 12 needed, as a comma-separated list
for ($i = 1; $i <= 20; $i++) {
    if ($i == 1) { $idlist = mt_rand($idmin, $idmax); }
    else { $idlist = $idlist . ',' . mt_rand($idmin, $idmax); }
}
$idlist2 = "id," . $idlist;
$sql = "SELECT * FROM content WHERE id IN ($idlist) ORDER BY FIELD($idlist2) LIMIT 0,12";
$result = mysql_query($sql, $conn);
$n = 1;
$rnds = '';
while ($row = mysql_fetch_array($result)) {
    $rnds = $rnds . $n . ". <a href='show" . $row['id'] . "-" . strtolower(trim($row['title'])) . "'>" . $row['title'] . "</a><br />\n";
    $n++;
}

On 30,000 rows, fetching 12 random records takes 0.001 seconds, about 4 times faster than the second method and about 120 times faster than the first. Note that ORDER BY FIELD($idlist2) is used here so that the rows come back in the order of the generated ID list; without it, the result of the IN query is sorted automatically. Disadvantage: some of the generated IDs may belong to deleted rows, so you need to generate a few extra IDs.
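
The ID generation step can also be written so that it never produces the same ID twice and always oversamples to allow for deleted rows. This is a sketch under assumptions, not the article's code: randomIdList, buildRandomRowsQuery and the $extra parameter are hypothetical names, while the table name content and the id column follow the example above.

```php
<?php
// Returns $want + $extra distinct random ids in [$min, $max], in the order
// they were drawn. The extra ids leave room for ids of deleted rows.
function randomIdList(int $min, int $max, int $want, int $extra = 8): array
{
    // Never ask for more distinct ids than the range contains.
    $target = min($want + $extra, $max - $min + 1);
    $ids = [];
    while (count($ids) < $target) {
        $ids[mt_rand($min, $max)] = true; // array keys drop duplicates
    }
    return array_keys($ids);
}

// Builds the IN query; FIELD(id, ...) keeps the rows in the drawn order.
function buildRandomRowsQuery(array $ids, int $limit): string
{
    $list = implode(',', array_map('intval', $ids));
    return "SELECT * FROM content WHERE id IN ($list) ORDER BY FIELD(id,$list) LIMIT $limit";
}

$ids = randomIdList(1, 30000, 12);
echo buildRandomRowsQuery($ids, 12), "\n";
```

If many rows have been deleted, 8 spare IDs may not be enough to return 12 rows, so $extra should be sized to the expected share of gaps in the id range.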
