MySQL Random data acquisition method, support large data volume

Source: Internet
Author: User
Tags rand time 0

Recent projects require a random fetch of data from the MySQL database.
It is always known that order by Rand will be dead. Because of my big data on the amount of only a little understanding, no solution, to find Baidu teacher. Search results are stereotyped. Special sent here, for everyone to learn.

In MySQL with a random data function, in MySQL we will have the rand () function, a lot of friends will be directly used, if hundreds of data is sure to be OK, if tens of thousands of or million when you will find that direct use is wrong. Let me introduce some optimization methods for random data fetching.

    1. SELECT * FROM table_name ORDER by Rand() LIMIT 5;
Copy Code

That's what Rand says in his handbook:
RAND ()
RAND (N)
Returns a random floating-point value within a range of 0 to 1.0. If an integer parameter n is specified, it is used as the seed value.

  1. MySQL> Select RAND();
  2. 0.5925
  3. MySQL> Select RAND(a);
  4. 0.1811
  5. MySQL> Select RAND(a);
  6. 0.1811
  7. MySQL> Select RAND();
  8. 0.2079
  9. MySQL> Select RAND();
  10. 0.7888
Copy Code

You cannot use a column with the rand () value in an ORDER BY clause because the order by will repeat the computed column multiple times. However in MySQL3.23, you can do: SELECT * FROM table_name ORDER by RAND (), which is advantageous to get a from SELECT * from Table1,table2 WHERE a=b and C<d O A random sample of the collection of Rder by RAND () LIMIT 1000. Note that a rand () in a WHERE clause will be re-evaluated each time the where is executed.

The internet is basically querying max (ID) * RAND () to get the data randomly.

  1. SELECT *
  2. From ' table ' as T1 JOIN (select ROUND(RAND() * ( Select MAX( ID) from ' table ') as ID) as T2
  3. WHERE T1. ID >= T2. ID
  4. ORDER by T1. ID ASC LIMIT 5;
Copy Code

However, this will produce 5 consecutive records. The solution can only be one query at a time, query 5 times. Even so it is worth it, because 150,000 of the tables, the query only need 0.01 seconds less than.

The above statement uses a join,mysql forum where someone uses

    1. SELECT *
    2. From ' table '
    3. WHERE ID >= (SELECT floor( MAX(ID) * RAND()) from ' table ' )
    4. ORDER by ID LIMIT 1;
Copy Code

I tested it, it takes 0.5 seconds, and it's a good speed, but there's a big gap with the above statement.
Later consulted Baidu, get the following code
The complete query statement is:

  1. SELECT * from ' table '
  2. WHERE ID>=(SELECT Floor( rand ()  * (select max (id  from  ' table ' )-( select min (id ' table ' )  +  (select min (id< Span class= "pun")  from  ' table ' &NBSP;
  3. ORDER by ID LIMIT 1;
Copy Code
  1. SELECT *
  2. From' Table 'as T1 JOIN(SELECT ROUND (rand ()  *  ((select max (id  from  ' table ' )-( select min (id ' table ' select min (id)  from  as id)  as  t2
  3. WHERE T1. ID >= T2. ID
  4. ORDER by T1. ID LIMIT 1;
Copy Code

Finally, the two statements were queried 10 times in PHP,
The former takes 0.147433 seconds
The latter takes time 0.015130 seconds

Execution efficiency requires 0.02 sec. Unfortunately, only MySQL 4.1.* above supports such subqueries.
Note Check the official manual, which also says Rand () is executed several times in the ORDER BY clause, naturally efficient and very low.

The last of the SQL statements above, I actually test through, 100W data, instant results.

MySQL Random data acquisition method, support large data volume

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.