[MySQL optimization case] series-RAND () Optimization _ MySQL

Source: Internet
Author: User
[MySQL optimization case] series-RAND () optimization we all know that in MySQL, if order by rand () is used directly, the efficiency is very poor because it will be executed multiple times. In fact, this is also true if the equivalent query uses RAND (). let's take a look at the different execution plans and execution time of the following SQL statements. First, let's look at the table creation DDL. this is an InnoDB table without an explicitly auto-incrementing primary key:

[yejr@imysql]> show create table t_innodb_random/G*************************** 1. row ***************************Table: t_innodb_randomCreate Table: CREATE TABLE `t_innodb_random` (`id` int(10) unsigned NOT NULL,`user` varchar(64) NOT NULL DEFAULT '',KEY `idx_id` (`id`)) ENGINE=InnoDB DEFAULT CHARSET=latin1

Add some test data to this table, at least 0.1 million, and the id field is out of order.

[yejr@imysql]> select count(*) from t_innodb_random/G*************************** 1. row ***************************count(*): 393216

1. constant equivalence retrieval:

[yejr@imysql]> explain select id from t_innodb_random where id = 13412/G*************************** 1. row ***************************id: 1select_type: SIMPLEtable: t_innodb_randomtype: refpossible_keys: idx_idkey: idx_idkey_len: 4ref: constrows: 1Extra: Using index[yejr@imysql]> select id from t_innodb_random where id = 13412;1 row in set (0.00 sec)

We can see that the execution plan is very good, it is a constant equivalent query, the speed is very fast.

2. use the RAND () function multiplied by the constant to obtain the random number and then retrieve it:

[yejr@imysql]> explain select id from t_innodb_random where id = round(rand()*13241324)/G*************************** 1. row ***************************id: 1select_type: SIMPLEtable: t_innodb_randomtype: indexpossible_keys: NULLkey: idx_idkey_len: 4ref: NULLrows: 393345Extra: Using where; Using index[yejr@imysql]> select id from t_innodb_random where id = round(rand()*13241324)/GEmpty set (0.26 sec)

We can see that the execution plan is very bad. although it only scans the index, it performs a full index scan, which is very inefficient. Because the WHERE condition contains RAND (), MySQL treats it as a variable and cannot query it in constant equivalent mode, resulting in low efficiency.

Let's Change the constant to the maximum id value of the t_innodb_random table, multiply it by RAND () to obtain the random number, and then retrieve the result:

[yejr@imysql]> explain select id from t_innodb_random where id = round(rand()*(select max(id) from t_innodb_random))/G*************************** 1. row ***************************id: 1select_type: PRIMARYtable: t_innodb_randomtype: indexpossible_keys: NULLkey: idx_idkey_len: 4ref: NULLrows: 393345Extra: Using where; Using index*************************** 2. row ***************************id: 2select_type: SUBQUERYtable: NULLtype: NULLpossible_keys: NULLkey: NULLkey_len: NULLref: NULLrows: NULLExtra: Select tables optimized away[yejr@imysql]> select id from t_innodb_random where id = round(rand()*(select max(id) from t_innodb_random))/GEmpty set (0.27 sec)

We can see that the execution plan is still a full index scan, and the execution time is basically the same.

3. transform to the normal subquery mode. here there are two subqueries.

[yejr@imysql]> explain select id from t_innodb_random where id = (select round(rand()*(select max(id) from t_innodb_random)) as nid)/G*************************** 1. row ***************************id: 1select_type: PRIMARYtable: t_innodb_randomtype: indexpossible_keys: NULLkey: idx_idkey_len: 4ref: NULLrows: 393345Extra: Using where; Using index*************************** 2. row ***************************id: 3select_type: SUBQUERYtable: NULLtype: NULLpossible_keys: NULLkey: NULLkey_len: NULLref: NULLrows: NULLExtra: Select tables optimized away[yejr@imysql]> select id from t_innodb_random where id = (select round(rand()*(select max(id) from t_innodb_random)) as nid)/GEmpty set (0.27 sec)

We can see that the execution plan is not good, and the execution time is slow.

4. convert to JOIN query, but the maximum value is represented by a constant.

[yejr@imysql]> explain select id from t_innodb_random t1 join (select round(rand()*13241324) as id2) as t2 where t1.id = t2.id2/G*************************** 1. row ***************************id: 1select_type: PRIMARYtable: 
 
  type: systempossible_keys: NULLkey: NULLkey_len: NULL
  ref: NULLrows: 1Extra:*************************** 2. row ***************************id: 1select_type: PRIMARYtable: t1type: refpossible_keys: idx_idkey: idx_idkey_len: 4
  ref: constrows: 1Extra: Using where; Using index*************************** 3. row ***************************id: 2select_type: DERIVEDtable: NULLtype: NULLpossible_keys: NULLkey: NULLkey_len: NULLref: NULLrows: NULLExtra: No tables used[yejr@imysql]> select id from t_innodb_random t1 join (select round(rand()*13241324) as id2) as t2 where t1.id = t2.id2/GEmpty set (0.00 sec)
 

At this time, the execution plan is perfect, which is the same as the constant equivalent query at the beginning, and the execution time is also very fast. This method is good, but it may not be able to query the record and find the transformation scope, but the result is LIMIT 1:

[yejr@imysql]> explain select id from t_innodb_random where id > (select round(rand()*(select max(id) from t_innodb_random)) as nid) limit 1/G*************************** 1. row ***************************id: 1select_type: PRIMARYtable: t_innodb_randomtype: indexpossible_keys: NULLkey: idx_idkey_len: 4ref: NULLrows: 393345Extra: Using where; Using index*************************** 2. row ***************************id: 3select_type: SUBQUERYtable: NULLtype: NULLpossible_keys: NULLkey: NULLkey_len: NULLref: NULLrows: NULLExtra: Select tables optimized away[yejr@imysql]> select id from t_innodb_random where id > (select round(rand()*(select max(id) from t_innodb_random)) as nid) limit 1/G*************************** 1. row ***************************id: 13011 row in set (0.00 sec)

As you can see, although the execution plan is also a full index scan, because of LIMIT 1, you only need to find a record to terminate the scan, so the efficiency is very fast.

Summary: when a random record is retrieved from the database, you can place the random number generated by RAND () in the JOIN subquery to improve efficiency.

5. let's take a look at how to get multiple random values at a time using the ordrr by rand () method:

[yejr@imysql]> explain select id from t_innodb_random order by rand() limit 1000/G*************************** 1. row ***************************id: 1select_type: SIMPLEtable: t_innodb_randomtype: indexpossible_keys: NULLkey: idx_idkey_len: 4ref: NULLrows: 393345Extra: Using index; Using temporary; Using filesort[yejr@imysql]> select id from t_innodb_random order by rand() limit 1000;1000 rows in set (0.41 sec)

Full Index scan: it is too slow to generate a temporary sorting table.

6. Place the random number in the subquery:

[yejr@imysql]> explain select id from t_innodb_random where id > (select rand() * (select max(id) from t_innodb_random) as nid) limit 1000/G*************************** 1. row ***************************id: 1select_type: PRIMARYtable: t_innodb_randomtype: indexpossible_keys: NULLkey: idx_idkey_len: 4ref: NULLrows: 393345Extra: Using where; Using index*************************** 2. row ***************************id: 3select_type: SUBQUERYtable: NULLtype: NULLpossible_keys: NULLkey: NULLkey_len: NULLref: NULLrows: NULLExtra: Select tables optimized away[yejr@imysql]> select id from t_innodb_random where id > (select rand() * (select max(id) from t_innodb_random) as nid) limit 1000/G1000 rows in set (0.04 sec)

Well, the speed has increased a lot, and it looks pretty good :)

7. follow the above method and change to JOIN and random number subquery Association.

[yejr@imysql]> explain select id from t_innodb_random t1 join (select rand() * (select max(id) from t_innodb_random) as nid) t2 on t1.id > t2.nid limit 1000/G*************************** 1. row ***************************id: 1select_type: PRIMARYtable: 
 
  type: systempossible_keys: NULLkey: NULLkey_len: NULL
  ref: NULLrows: 1Extra:*************************** 2. row ***************************id: 1select_type: PRIMARYtable: t1type: rangepossible_keys: idx_idkey: idx_idkey_len: 4
  ref: NULLrows: 196672Extra: Using where; Using index*************************** 3. row ***************************id: 2select_type: DERIVEDtable: NULLtype: NULLpossible_keys: NULLkey: NULLkey_len: NULLref: NULLrows: NULLExtra: No tables used*************************** 4. row ***************************id: 3select_type: SUBQUERYtable: NULLtype: NULLpossible_keys: NULLkey: NULLkey_len: NULLref: NULLrows: NULLExtra: Select tables optimized away[yejr@imysql]> select id from t_innodb_random t1 join (select rand() * (select max(id) from t_innodb_random) as nid) t2 on t1.id > t2.nid limit 1000/G1000 rows in set (0.00 sec)
 

As you can see, the full index search method is the fastest in obtaining 1000 rows after it finds that the record meets the condition.

In summary, when you want to randomly retrieve one or N records from the MySQL database, it is best to place the random number generated by RAND () in the JOIN subquery to improve efficiency. The above mentioned a lot of nonsense. In the end, simply put, the following SQL statement is used:

SELECT id FROM table ORDER BY RAND() LIMIT n;

Transform to the following:

SELECT id FROM table t1, JOIN (SELECT RAND() * (SELECT MAX(id) FROM table) AS nid) t2 ON t1.id > t2.nid LIMIT n;

You can enjoy directly obtaining random numbers in SQL without constructing a string of random numbers in the program for retrieval.

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.