MySQL optimization-RAND () optimization method _ MySQL

Source: Internet
Author: User
This article mainly introduces the RAND () optimization method for MySQL optimization. This article analyzes several optimization methods for RAND () in Mysql in detail, and finally draws a conclusion, if you want to directly order by rand () in MySQL, it is very inefficient because it will be executed multiple times. In fact, this is also true if the equivalent query uses RAND (). let's take a look at the different execution plans and execution time of the following SQL statements.

First, let's look at the table creation DDL. this is an InnoDB table without an explicitly auto-incrementing primary key:

The code is as follows:


[Yejr @ imysql]> show create table t_innodb_random \ G
* *************************** 1. row ***************************
Table: t_innodb_random
Create Table: create table 't_ innodb_random '(
'Id' int (10) unsigned not null,
'User' varchar (64) not null default '',
KEY 'idx _ id' ('id ')
) ENGINE = InnoDB default charset = latin1


Add some test data to this table, at least 0.1 million, and the id field is out of order.

The code is as follows:


[Yejr @ imysql]> select count (*) from t_innodb_random \ G
* *************************** 1. row ***************************
Count (*): 393216

1. constant equivalence retrieval:

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random where id = 13412 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: t_innodb_random
Type: ref
Possible_keys: idx_id
Key: idx_id
Key_len: 4
Ref: const
Rows: 1
Extra: Using index

[Yejr @ imysql]> select id from t_innodb_random where id = 13412;
1 row in set (0.00 sec)

We can see that the execution plan is very good, it is a constant equivalent query, the speed is very fast.

2. use the RAND () function multiplied by the constant to obtain the random number and then retrieve it:

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random where id = round (rand () * 13241324) \ G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index

[Yejr @ imysql]> select id from t_innodb_random where id = round (rand () * 13241324) \ G
Empty set (0.26 sec)

We can see that the execution plan is very bad. although it only scans the index, it performs a full index scan, which is very inefficient. Because the WHERE condition contains RAND (), MySQL treats it as a variable and cannot query it in constant equivalent mode, resulting in low efficiency.

Let's Change the constant to the maximum id value of the t_innodb_random table, multiply it by RAND () to obtain the random number, and then retrieve the result:

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random where id = round (rand () * (select max (id) from t_innodb_random) \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index
* *************************** 2. row ***************************
Id: 2
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random where id = round (rand () * (select max (id) from t_innodb_random) \ G
Empty set (0.27 sec)

We can see that the execution plan is still a full index scan, and the execution time is basically the same.

3. transform to the normal subquery mode. here there are two subqueries.

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random where id = (select round (rand () * (select max (id) from t_innodb_random) as nid) \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index
* *************************** 2. row ***************************
Id: 3
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random where id = (select round (rand () * (select max (id) from t_innodb_random) as nid) \ G
Empty set (0.27 sec)


We can see that the execution plan is not good, and the execution time is slow.

4. convert to JOIN query, but the maximum value is represented by a constant.

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random t1 join (select round (rand () * 13241324) as id2) as t2 where t1.id = t2.id2 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table:
Type: system
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 1
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: PRIMARY
Table: t1
Type: ref
Possible_keys: idx_id
Key: idx_id
Key_len: 4
Ref: const
Rows: 1
Extra: Using where; Using index
* *************************** 3. row ***************************
Id: 2
Select_type: DERIVED
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: No tables used

[Yejr @ imysql]> select id from t_innodb_random t1 join (select round (rand () * 13241324) as id2) as t2 where t1.id = t2.id2 \ G
Empty set (0.00 sec)


At this time, the execution plan is perfect, which is the same as the constant equivalent query at the beginning, and the execution time is also very fast.

This method is good, but it may not be able to query the record and find the transformation scope, but the result is LIMIT 1:

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random where id> (select round (rand () * (select max (id) from t_innodb_random) as nid) limit 1 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index
* *************************** 2. row ***************************
Id: 3
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random where id> (select round (rand () * (select max (id) from t_innodb_random) as nid) limit 1 \ G
* *************************** 1. row ***************************
Id: 1301
1 row in set (0.00 sec)

As you can see, although the execution plan is also a full index scan, because of LIMIT 1, you only need to find a record to terminate the scan, so the efficiency is very fast.

Summary:

When a random record is retrieved from the database, you can place the random number generated by RAND () in the JOIN subquery to improve efficiency.

5. let's take a look at how to get multiple random values at a time using the ordrr by rand () method:

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random order by rand () limit 1000 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using index; Using temporary; Using filesort

[Yejr @ imysql]> select id from t_innodb_random order by rand () limit 1000;
1000 rows in set (0.41 sec)


Full Index scan: it is too slow to generate a temporary sorting table.

6. Place the random number in the subquery:

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random where id> (select rand () * (select max (id) from t_innodb_random) as nid) limit 1000 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index
* *************************** 2. row ***************************
Id: 3
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random where id> (select rand () * (select max (id) from t_innodb_random) as nid) limit 1000 \ G
1000 rows in set (0.04 sec)


Well, the speed has increased a lot, and it looks pretty good :)

7. follow the above method and change to JOIN and random number subquery Association.

The code is as follows:


[Yejr @ imysql]> explain select id from t_innodb_random t1 join (select rand () * (select max (id) from t_innodb_random) as nid) t2 on t1.id> t2.nid limit 1000 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table:
Type: system
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 1
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: PRIMARY
Table: t1
Type: range
Possible_keys: idx_id
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 196672
Extra: Using where; Using index
* *************************** 3. row ***************************
Id: 2
Select_type: DERIVED
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: No tables used
* *************************** 4. row ***************************
Id: 3
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random t1 join (select rand () * (select max (id) from t_innodb_random) as nid) t2 on t1.id> t2.nid limit 1000 \ G
1000 rows in set (0.00 sec)


As you can see, the full index search method is the fastest in obtaining 1000 rows after it finds that the record meets the condition.

In summary, when you want to randomly retrieve one or N records from the MySQL database, it is best to place the random number generated by RAND () in the JOIN subquery to improve efficiency.
The above mentioned a lot of nonsense. In the end, simply put, the following SQL statement is used:

The code is as follows:


SELECT id FROM table order by rand () LIMIT n;


Transform to the following:

The code is as follows:


SELECT id FROM table t1 JOIN (select rand () * (select max (id) FROM table) AS nid) t2 ON t1.id> t2.nid LIMIT n;


You can enjoy directly obtaining random numbers in SQL without constructing a string of random numbers in the program for retrieval.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.