MySQL random query data and performance analysis

Source: Internet
Author: User
Many people know that the RAND() function can be used to select rows at random in MySQL. Used directly, it is fine for a few hundred or a few thousand rows, but it becomes slow once a table holds tens of thousands of rows or more. This article covers random queries with RAND() in MySQL and their performance.

For example, suppose you want an SQL statement that returns a random integer from -5 to 5. In PHP you could simply write:
<?php
print rand(-5, 5);
?>
In MySQL, however, the RAND function accepts at most one parameter. Extract from the manual:
RAND() RAND(N)
Returns a random floating-point value v in the range 0 ≤ v < 1.0. If an integer argument N is specified, it is used as the seed value, which produces a repeatable sequence of values.
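As a small illustration of the seed behaviour described in the manual extract, the following sketch can be run in any MySQL client. The seed value 3 is an arbitrary choice made for this example.

```sql
-- Unseeded: a different value on (almost) every execution.
SELECT RAND();

-- Seeded with a constant: run this statement several times and
-- the same value should come back each time.
SELECT RAND(3);
```

Because a seeded call is repeatable, it is not suitable for picking a different value on every request; the rest of this article uses the unseeded form.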

There are two ways to achieve this in MySQL.
1. Create a table holding the numbers from -5 to 5, then use ORDER BY RAND() to pick one at random.
# Create a data table with a specified range
# Author: Xiaoqiang (fortune Manager)
# Date: 2008-03-31

CREATE TABLE randnumber
SELECT -1 AS number
UNION
SELECT -2
UNION
SELECT -3
UNION
SELECT -4
UNION
SELECT -5
UNION
SELECT 0
UNION
SELECT 1
UNION
SELECT 2
UNION
SELECT 3
UNION
SELECT 4
UNION
SELECT 5;

# Obtain a random number
# Author: Xiaoqiang (fortune Manager)
# Date: 2008-03-31

SELECT number
FROM randnumber ORDER BY RAND() LIMIT 1;

Advantage: the random numbers can be any chosen set of values and need not be consecutive.
Disadvantage: the table is tedious to create when the range is wide.
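If a permanent table is not wanted, the same idea can be sketched with a derived table. This is a variation on method 1 added for illustration, not part of the original statements; the values 2, 3 and 7 and the alias candidates are made up.

```sql
-- Pick one value at random from an arbitrary, non-consecutive set.
SELECT number
FROM (
    SELECT 2 AS number
    UNION ALL SELECT 3
    UNION ALL SELECT 7
) AS candidates
ORDER BY RAND()
LIMIT 1;
```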
2. Use the ROUND() and RAND() functions of MySQL.
# An SQL statement
# Author: Xiaoqiang (fortune Manager)
# Date: 2008-03-31

SELECT ROUND((0.5 - RAND()) * 2 * 5);

# Note
# 0.5 - RAND() gives a random number from -0.5 to +0.5
# (0.5 - RAND()) * 2 gives a random number from -1 to +1
# (0.5 - RAND()) * 2 * 5 gives a random number from -5 to +5
# ROUND((0.5 - RAND()) * 2 * 5) returns a random integer from -5 to +5

Advantage: when the range is wide, you only need to change the 5 in * 5, which is very convenient.
Disadvantage: the random numbers must be consecutive; you cannot restrict them to an arbitrary set of values.
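One caveat worth checking: with ROUND(), the two end values -5 and 5 each cover an interval only half as wide as the other integers, so they should turn up about half as often. The MySQL manual gives FLOOR(i + RAND() * (j - i)) for a random integer R in the range i <= R < j. The sketch below applies it with i = -5 and j = 6, a substitution made here for illustration rather than the original author's statement.

```sql
-- RAND() * 11 lies in [0, 11), so -5 + RAND() * 11 lies in [-5, 6).
-- FLOOR() then maps each unit-wide interval to one integer from -5 to 5,
-- giving all eleven values the same chance.
SELECT FLOOR(-5 + RAND() * 11) AS random_int;
```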


To fetch five distinct rows at random in MySQL, you can use:

SELECT * FROM `table` ORDER BY RAND() LIMIT 5;

However, testing shows that this is very inefficient: on a table with more than 150,000 rows, fetching 5 rows took more than 8 seconds.
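To see where the time goes, the statement can be prefixed with EXPLAIN. This is a sketch that assumes the placeholder table name `table` used above; the exact output depends on the MySQL version and the table definition.

```sql
-- Show the execution plan instead of running the query.
EXPLAIN SELECT * FROM `table` ORDER BY RAND() LIMIT 5;
```

If the Extra column reports "Using temporary; Using filesort", MySQL is computing a random value for every row and sorting the whole set just to keep five rows, which would explain why the cost grows with the table size.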

A search on the Internet shows that most suggestions pick a random row using MAX(id) * RAND():

SELECT *
FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id ASC LIMIT 5;

However, this returns five consecutive rows. The solution is to fetch only one row per query and run the query five times. Even so, it is worthwhile, because a query on the 150,000-row table takes less than 0.01 seconds.


The statement above uses JOIN. On the MySQL forum, someone uses this form instead:

SELECT *
FROM `table`
WHERE id >= (SELECT FLOOR(MAX(id) * RAND()) FROM `table`)
ORDER BY id LIMIT 1;

I tested it. It took 0.5 seconds, which is acceptable, but still far slower than the JOIN statement above. Something seemed off.

So I changed the statement.

SELECT * FROM `table`
WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM `table`)))
ORDER BY id LIMIT 1;

The query efficiency is improved, and the query time is only 0.01 seconds.

Finally, complete the statements by taking MIN(id) into account. When I first tested without the MIN(id) term, about half of the results were the first few rows of the table.
The complete query statements are:

SELECT * FROM `table`
WHERE id >= (SELECT FLOOR(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)))
ORDER BY id LIMIT 1;

SELECT *
FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id LIMIT 1;


Finally, each of the two statements was run 10 times:
The former took 0.147433 seconds.
The latter took 0.015130 seconds.
It seems that the JOIN syntax is much more efficient than calling the functions directly in the WHERE clause.
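One plausible reason for the gap, offered here as an assumption rather than a measured result: a subquery that calls RAND() cannot be cached, so in the WHERE form it may be re-evaluated for each row examined, while the derived table in the JOIN form is computed once. EXPLAIN can be used to check this on a given server. The sketch assumes the placeholder table `table` with an indexed id column.

```sql
-- WHERE form: look for UNCACHEABLE SUBQUERY in the select_type column.
EXPLAIN
SELECT * FROM `table`
WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM `table`)))
ORDER BY id LIMIT 1;

-- JOIN form: the random id is produced in the derived table t2.
EXPLAIN
SELECT t1.*
FROM `table` AS t1
JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id LIMIT 1;
```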
