The RAND () Optimization Method for MySQL optimization and the rand Method for mysql Optimization

Source: Internet
Author: User

The RAND () Optimization Method for MySQL optimization and the rand Method for mysql Optimization

As we all know, in MySQL, if order by rand () is used directly, the efficiency is very poor because it will be executed multiple times. In fact, this is also true if the equivalent query uses RAND (). Let's take a look at the different execution plans and execution time of the following SQL statements.

First, let's look at the table creation DDL. This is an InnoDB table without an explicitly auto-incrementing primary key:
Copy codeThe Code is as follows:
[Yejr @ imysql]> show create table t_innodb_random \ G
* *************************** 1. row ***************************
Table: t_innodb_random
Create Table: create table 't_ innodb_random '(
'Id' int (10) unsigned not null,
'User' varchar (64) not null default '',
KEY 'idx _ id' ('id ')
) ENGINE = InnoDB default charset = latin1
Add some test data to this table, at least 0.1 million, and the id field is out of order.
Copy codeThe Code is as follows:
[Yejr @ imysql]> select count (*) from t_innodb_random \ G
* *************************** 1. row ***************************
Count (*): 393216

1. Constant equivalence retrieval:

Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random where id = 13412 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: t_innodb_random
Type: ref
Possible_keys: idx_id
Key: idx_id
Key_len: 4
Ref: const
Rows: 1
Extra: Using index

[Yejr @ imysql]> select id from t_innodb_random where id = 13412;
1 row in set (0.00 sec)

We can see that the execution plan is very good, it is a constant equivalent query, the speed is very fast.

2. Use the RAND () function multiplied by the constant to obtain the random number and then retrieve it:

Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random where id = round (rand () * 13241324) \ G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index

[Yejr @ imysql]> select id from t_innodb_random where id = round (rand () * 13241324) \ G
Empty set (0.26 sec)

We can see that the execution plan is very bad. Although it only scans the index, it performs a full index scan, which is very inefficient. Because the WHERE condition contains RAND (), MySQL treats it as a variable and cannot query it in constant equivalent mode, resulting in low efficiency.

Let's change the constant to the maximum id value of the t_innodb_random table, multiply it by RAND () to obtain the random number, and then retrieve the result:
Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random where id = round (rand () * (select max (id) from t_innodb_random) \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index
* *************************** 2. row ***************************
Id: 2
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random where id = round (rand () * (select max (id) from t_innodb_random) \ G
Empty set (0.27 sec)

We can see that the execution plan is still a full index scan, and the execution time is basically the same.

3. transform to the normal subquery mode. Here there are two subqueries.

Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random where id = (select round (rand () * (select max (id) from t_innodb_random) as nid) \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index
* *************************** 2. row ***************************
Id: 3
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random where id = (select round (rand () * (select max (id) from t_innodb_random) as nid) \ G
Empty set (0.27 sec)
We can see that the execution plan is not good, and the execution time is slow.

4. Convert to JOIN query, but the maximum value is represented by a constant.
Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random t1 join (select round (rand () * 13241324) as id2) as t2 where t1.id = t2.id2 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: <derived2>
Type: system
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 1
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: PRIMARY
Table: t1
Type: ref
Possible_keys: idx_id
Key: idx_id
Key_len: 4
Ref: const
Rows: 1
Extra: Using where; Using index
* *************************** 3. row ***************************
Id: 2
Select_type: DERIVED
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: No tables used

[Yejr @ imysql]> select id from t_innodb_random t1 join (select round (rand () * 13241324) as id2) as t2 where t1.id = t2.id2 \ G
Empty set (0.00 sec)
At this time, the execution plan is perfect, which is the same as the constant equivalent query at the beginning, and the execution time is also very fast.

This method is good, but it may not be able to query the record and find the transformation scope, but the result is LIMIT 1:
Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random where id> (select round (rand () * (select max (id) from t_innodb_random) as nid) limit 1 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index
* *************************** 2. row ***************************
Id: 3
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random where id> (select round (rand () * (select max (id) from t_innodb_random) as nid) limit 1 \ G
* *************************** 1. row ***************************
Id: 1301
1 row in set (0.00 sec)

As you can see, although the execution plan is also a full index scan, because of LIMIT 1, you only need to find a record to terminate the scan, so the efficiency is very fast.

Summary:

When a random record is retrieved from the database, you can place the random number generated by RAND () in the JOIN subquery to improve efficiency.

5. Let's take a look at how to get multiple random values at a time using the ordrr by rand () method:
Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random order by rand () limit 1000 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using index; Using temporary; Using filesort

[Yejr @ imysql]> select id from t_innodb_random order by rand () limit 1000;
1000 rows in set (0.41 sec)
Full index scan: It is too slow to generate a temporary sorting table.

6. Place the random number in the subquery:
Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random where id> (select rand () * (select max (id) from t_innodb_random) as nid) limit 1000 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: t_innodb_random
Type: index
Possible_keys: NULL
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 393345
Extra: Using where; Using index
* *************************** 2. row ***************************
Id: 3
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random where id> (select rand () * (select max (id) from t_innodb_random) as nid) limit 1000 \ G
1000 rows in set (0.04 sec)
Well, the speed has increased a lot, and it looks pretty good :)

7. Follow the above method and change to JOIN and random number subquery Association.
Copy codeThe Code is as follows:
[Yejr @ imysql]> explain select id from t_innodb_random t1 join (select rand () * (select max (id) from t_innodb_random) as nid) t2 on t1.id> t2.nid limit 1000 \ G
* *************************** 1. row ***************************
Id: 1
Select_type: PRIMARY
Table: <derived2>
Type: system
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: 1
Extra:
* *************************** 2. row ***************************
Id: 1
Select_type: PRIMARY
Table: t1
Type: range
Possible_keys: idx_id
Key: idx_id
Key_len: 4
Ref: NULL
Rows: 196672
Extra: Using where; Using index
* *************************** 3. row ***************************
Id: 2
Select_type: DERIVED
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: No tables used
* *************************** 4. row ***************************
Id: 3
Select_type: SUBQUERY
Table: NULL
Type: NULL
Possible_keys: NULL
Key: NULL
Key_len: NULL
Ref: NULL
Rows: NULL
Extra: Select tables optimized away

[Yejr @ imysql]> select id from t_innodb_random t1 join (select rand () * (select max (id) from t_innodb_random) as nid) t2 on t1.id> t2.nid limit 1000 \ G
1000 rows in set (0.00 sec)
As you can see, the full index search method is the fastest in obtaining 1000 rows after it finds that the record meets the condition.

In summary, when you want to randomly retrieve one or N records from the MySQL database, it is best to place the random number generated by RAND () in the JOIN subquery to improve efficiency.
The above mentioned a lot of nonsense. In the end, simply put, the following SQL statement is used:
Copy codeThe Code is as follows:
SELECT id FROM table order by rand () LIMIT n;
Transform to the following:
Copy codeThe Code is as follows:
SELECT id FROM table t1 JOIN (select rand () * (select max (id) FROM table) AS nid) t2 ON t1.id> t2.nid LIMIT n;
You can enjoy directly obtaining random numbers in SQL without constructing a string of random numbers in the program for retrieval.


Best mysql optimization skills

1. select the most suitable field attribute

MySQL can support access to large data volumes, but generally, the smaller the table in the database, the faster the query will be executed on it. Therefore, when creating a table, we can set the field width in the table as small as possible to achieve better performance. For example, if you set it to CHAR (255) when defining the zip code field, it is obvious that unnecessary space is added to the database, and even the VARCHAR type is redundant, because CHAR (6) can well complete the task. Similarly, if possible, we should use MEDIUMINT instead of BIGIN to define integer fields.

Another way to improve efficiency is to set the field to not null whenever possible, so that the database does NOT need to compare NULL values during future queries.

Some text fields, such as "Province" or "gender", can be defined as ENUM. In MySQL, The ENUM type is processed as the numeric data, and the numeric data is processed much faster than the text type. In this way, we can improve the database performance.

2. Use JOIN instead of Sub-Queries)

MySQL supports SQL subqueries from 4.1. This technique can use the SELECT statement to create a single column query result, and then use this result as a filter condition in another query. For example, if you want to delete a customer who has no orders in the basic customer information table, you can use the subquery to retrieve the customer IDs of all orders from the sales information table, then pass the result to the primary query, as shown below:

Delete from customerinfo
WHERE CustomerID NOT in (SELECT CustomerID FROM salesinfo)

Subqueries can be used to complete SQL operations that require multiple logical steps at a time. At the same time, transactions or tables can be prevented from being locked and can be easily written. However, in some cases, subqueries can be replaced by more efficient JOIN. For example, if we want to retrieve all users without order records, we can use the following query:

SELECT * FROM customerinfo
WHERE CustomerID NOT in (SELECT CustomerID FROM salesinfo)

If you use JOIN... to complete this query, the speed will be much faster. Especially when the salesinfo table has an index on CustomerID, the performance will be better. The query is as follows:

SELECT * FROM customerinfo
Left join salesinfoON customerinfo. CustomerID = salesinfo.
CustomerID
WHERE salesinfo. CustomerID IS NULL

JOIN... it is more efficient because MySQL does not need to create a temporary table in the memory to perform the query in two steps.

3. Use UNION instead of creating a temporary table manually

MySQL 4.0 and later versions support UNION queries. It can merge two or more SELECT queries in a temporary table. When the query Session on the client ends, the temporary table is automatically deleted to ensure the database is neat and efficient. When using UNION to create a query, we only need to use UNION as the keyword to connect multiple SELECT statements. Note that the number of fields in all SELECT statements must be the same. The following example demonstrates a query using UNION.

SELECT Name, Phone FROM client
UNION
SELECT Name, BirthDate FROM author ...... remaining full text>

How to optimize the MYSQL database?

(1 ). in terms of database design, this is the responsibility of DBA and impact ect. A database with a good design structure should be de-normalized when necessary (I don't know what the Chinese translation is ), some data redundancy is allowed to avoid JOIN operations to improve query efficiency.
(2 ). in terms of system architecture design, the table is hashed, and massive data is hashed into several different tables. fast and slow tables: only the latest data is retained. Slow tables are archived in history. cluster, Master server Read & write, slave server read only, or N servers, each machine is a Master
(3). (1) and (2) better than PHP Programmer's requirements. It doesn't matter. check whether there is any less index.
(4 ). write efficient SQL statements to see if there are any inefficient SQL statements, such as generating full connections to cartesian products, a large number of Group By and order by statements, and no limit. when necessary, encapsulate the database logic in the stored procedure of the DBMS. cache query results and explain each SQL statement
(5). All the results are required. Only necessary data is obtained from the database. For example, you can query the number of comments of an article, select count (*)... where article_id =? You can. Do not select *... where article_id =? Then msql_num_rows.
Send only required SQL statements. For example, if you modify only the title when modifying an article, update... set title =? Where article_id =? Do not set content =? (Large text)
(6). Use different storage engines when necessary. For example, InnoDB can reduce deadlocks. HEAP can increase the query speed by an order of magnitude.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.