Author:flymorn Source: Easy to float
categories:php Programming Posttime:2011-1-14 15:35:07
Text:
Recently, due to the need to study the MySQL random extraction implementation method. For example, to randomly extract a record from the TableName table, the general notation is: SELECT * from Content ORDER by RAND () LIMIT 1.
"Floating Easy note: 30,000 records of the query took 0.3745 seconds (the same as); from MySQL slow query log See" ORDER by RAND () "Full table Scan 2 times! 】
Then I checked the official MySQL manual, and the Hints for rand () probably mean that the rand () function cannot be used in an ORDER BY clause because it causes the data column to be scanned multiple times. However, in MySQL version 3.23, the Order by RAND () can still be implemented randomly.
But the real test is that it's very inefficient. A library of more than 150,000, query 5 data, incredibly more than 8 seconds. Check out the official manual, which also says Rand () is executed several times in the ORDER BY clause, naturally efficient and very low.
Search Google, use join, query Max (ID) * RAND () to get the data randomly.
SELECT *
From ' content ' as T1 joins (select ROUND (RAND () * (select MAX (id) from ' content ')) as ID) as T2
WHERE t1.id >= t2.id
ORDER by T1.id ASC LIMIT 1;
"The query takes 0.0008 seconds, floating easily think it can be recommended to use this statement!! 】
However, this will produce 5 consecutive records. The solution can only be one query at a time, query 5 times. Even so it is worth it, because 150,000 of the tables, the query only need 0.01 seconds less than.
There is one way:
SELECT * from ' content ' as-a JOIN (select MAX (ID) as ID from ' content ') as-B on (a.id >= floor (b.id * RAND ())) LIMIT 5;
The above method guarantees a certain range of random, the query takes 0.4265 seconds, is not recommended.
The following statement is used by someone on the MySQL forum
SELECT *
From ' content '
WHERE ID >= (SELECT floor (MAX (ID) * RAND ()) from ' content ')
ORDER by ID LIMIT 1;
"The query takes 1.2254 seconds, it is not recommended to float easily!" Since the measured, 30,000 rows of the table, this statement unexpectedly will scan 5 million lines!! 】
There is still a big gap with the above statement. There is something wrong with the total sleep. So I rewrote the sentence a bit.
SELECT * from ' content '
WHERE ID >= (SELECT Floor (RAND () * (select MAX (id) from ' content ')))
ORDER by ID LIMIT 1;
"Query takes 0.0012 seconds"
This, the efficiency has improved, the query time only 0.01 seconds
Finally, the statement is refined, plus min (id) judgment. When I first tested it, it was because I didn't add the min (id) judgment, and half the time I always queried the previous rows in the table.
The complete query statement is:
SELECT * from ' content '
WHERE ID >= (SELECT Floor (RAND () * ((select MAX (id) from ' content ')-(select min (id) from ' content ') + (select min (id)) From ' content ')))
ORDER by ID LIMIT 1;
"Query takes 0.0012 seconds"
SELECT *
From ' content ' as T1 JOIN (select ROUND (RAND () * ([select MAX (ID) from ' content ')-(select MIN (id) from ' content ') + (select MIN (ID) from ' content ')) as ID) as T2
WHERE t1.id >= t2.id
ORDER by T1.id LIMIT 1;
"Query takes 0.0008 seconds"
Finally, the two statements were queried 10 times in PHP,
The former takes 0.147433 seconds
The latter takes time 0.015130 seconds
It seems that the syntax for join is much higher than the efficiency of using functions directly in the where. (VIA)
======================================
"Well, finally floating easy to summarize":
The first scenario, the original Order by Rand () method:
$sql = "SELECT * from Content ORDER by rand () LIMIT 12";
$result =mysql_query ($sql, $conn);
$n = 1;
$rnds = ";
while ($row =mysql_fetch_array ($result)) {
$rnds = $rnds. $n. ". <a href= ' show '. $row [' id ']. " -". Strtolower (Trim ($row [' title '])." > ". $row [' title ']." </a><br/>\n ";
$n + +;
}
30,000 data for 12 random records, it takes 0.125 seconds, with the increase of data volume, the efficiency is getting lower.
The second scenario, the improved JOIN method:
for ($n =1; $n <=12; $n + +) {
$sql = "SELECT * from ' content ' as T1
JOIN (select ROUND (RAND () * (select MAX (id) from ' content ')) as ID) as T2
WHERE t1.id >= t2.id ORDER by t1.id ASC LIMIT 1 ";
$result =mysql_query ($sql, $conn);
$yi =mysql_fetch_array ($result);
$rnds = $rnds. $n. ". <a href= ' show '. $yi [' id ']. " -". Strtolower (Trim ($yi [' title '])." > ". $yi [' title ']." </a><br/>\n ";
}
30,000 data for 12 random records, 0.004 seconds, a significant increase in efficiency, up to about 30 times times higher than the first option. Cons: Multiple select queries, high IO overhead.
The third scenario, the SQL statement first random good ID sequence, with in query (floating easy to recommend this usage, IO cost small, fastest):
$sql = "SELECT MAX (ID), MIN (ID) from content";
$result =mysql_query ($sql, $conn);
$yi =mysql_fetch_array ($result);
$idmax = $yi [0];
$idmin = $yi [1];
$idlist = ";
for ($i =1; $i <=20; $i + +) {
if ($i ==1) {$idlist =mt_rand ($idmin, $idmax);}
else{$idlist = $idlist. ', '. Mt_rand ($idmin, $idmax);}
}
$idlist 2= "id,". $idlist;
$sql = "SELECT * from the content where ID in ($idlist) Order By field ($idlist 2) LIMIT 0,12";
$result =mysql_query ($sql, $conn);
$n = 1;
$rnds = ";
while ($row =mysql_fetch_array ($result)) {
$rnds = $rnds. $n. ". <a href= ' show '. $row [' id ']. " -". Strtolower (Trim ($row [' title '])." > ". $row [' title ']." </a><br/>\n ";
$n + +;
}
30,000 data for 12 random records, which takes 0.001 seconds, the efficiency is 4 times times higher than the second method, and 120 times times higher than the first one. Note that the Order by field ($idlist 2) is used here to not sort, otherwise in is automatically sorted. Disadvantage: It is possible to encounter the case where the ID is deleted, so you need to select several more IDs.
Test method:
$t = Microtime (true);
EXECUTE statement
Echo Microtime (True)-$t;
Improved MySQL Order by Rand () inefficiency