MySQL performance optimization and mysql Optimization

Source: Internet
Author: User

This article looks at how to fetch a random row from a MySQL table as efficiently as possible.

Method 1

This is the most primitive and intuitive syntax, as shown below:

SELECT * FROM foo ORDER BY RAND() LIMIT 1;

This method works when the table is small. Once the table reaches a certain size, say 1 million rows or more, it performs very poorly. If you run EXPLAIN on this statement, you will see that MySQL builds a temporary table and sorts it. Because of how ORDER BY and LIMIT interact, LIMIT cannot return the requested row until the sort has finished, so every row in the table has to be sorted first, however many there are.

Method 2

ORDER BY appears to be the performance bottleneck when picking random rows from a large table. How can it be avoided? Method 2 offers one solution.

First, get the total number of rows in the table:

SELECT COUNT(*) AS num_rows FROM foo;

Then store that total in the application (back-end) code, here called num_rows.

Then execute:

SELECT * FROM foo LIMIT [a random integer from 0 to num_rows - 1], 1;

The random number can be generated by the application code. Because LIMIT offset, 1 counts rows rather than ID values, this works even when the IDs have gaps, although MySQL still has to read and discard the skipped rows, so large offsets are slower.

This method avoids ORDER BY altogether.
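
The two steps of method 2 can be sketched in application code roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for a MySQL connection so that it runs on its own, and the helper name random_row_by_offset is made up for this example. "LIMIT 1 OFFSET n" is the same request as MySQL's "LIMIT n, 1".

```python
import random
import sqlite3


def random_row_by_offset(conn, rng=random):
    """Pick one row from foo using COUNT(*) plus a random offset (method 2)."""
    # Step 1: count the rows; this is the num_rows value from the text.
    (num_rows,) = conn.execute("SELECT COUNT(*) AS num_rows FROM foo").fetchone()
    if num_rows == 0:
        return None
    # Step 2: valid offsets run from 0 to num_rows - 1 inclusive.
    offset = rng.randrange(num_rows)
    # Step 3: fetch exactly one row at that offset (MySQL: LIMIT offset, 1).
    return conn.execute("SELECT * FROM foo LIMIT 1 OFFSET ?", (offset,)).fetchone()


if __name__ == "__main__":
    # sqlite3 is an assumption here, standing in for a real MySQL connection.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT)")
    # The IDs deliberately contain gaps; the offset counts rows, not IDs.
    conn.executemany(
        "INSERT INTO foo VALUES (?, ?)",
        [(i, f"row-{i}") for i in (1, 2, 5, 9, 40)],
    )
    print(random_row_by_offset(conn))
```

The offset is passed as a bound parameter rather than pasted into the SQL string, which keeps the statement safe even if the number ever comes from outside the program.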

Method 3

Can method 2 be done in a single SQL statement, still without ORDER BY? Yes, by using JOIN.

SELECT * FROM Bar B JOIN (SELECT CEIL(MAX(ID) * RAND()) AS ID FROM Bar) AS m ON B.ID >= m.ID LIMIT 1;

This achieves our goal. With a large table it also avoids sorting every row, because the SELECT inside the JOIN is executed only once rather than N times (where N is num_rows from method 2). In addition, using "greater than or equal to" (>=) in the join condition instead of equality avoids an empty result when the IDs are not contiguous.
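
To see what the >= condition does when the IDs have gaps, here is a small pure-Python simulation. It is a sketch, not a MySQL run: pick_id is a hypothetical helper that mimics CEIL(MAX(ID) * RAND()) followed by "first ID >= threshold", and it assumes MySQL returns the lowest matching ID. With the sample IDs 1, 2, 3, 10, seven of the ten possible thresholds (4 through 10) land on ID 10, so that row should come back about 70% of the time.

```python
import bisect
import math
import random
from collections import Counter


def pick_id(sorted_ids, rng):
    """Mimic threshold = CEIL(MAX(ID) * RAND()), then take the first ID >= threshold."""
    threshold = math.ceil(sorted_ids[-1] * rng.random())
    # bisect_left finds the first position whose ID is >= threshold,
    # so a threshold that falls inside a gap still yields a row.
    return sorted_ids[bisect.bisect_left(sorted_ids, threshold)]


if __name__ == "__main__":
    ids = [1, 2, 3, 10]        # IDs 4..9 are missing (deleted rows, for example)
    rng = random.Random(42)    # fixed seed so repeated runs give the same counts
    counts = Counter(pick_id(ids, rng) for _ in range(10_000))
    for row_id in ids:
        print(row_id, counts[row_id])
```

So >= never returns an empty result, but a row that follows a large gap is picked more often than its neighbours. Whether that bias matters depends on how sparse the IDs are.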

To fetch five distinct random rows in MySQL, you can use the following:

SELECT * FROM `table` ORDER BY RAND() LIMIT 5;

This works, but testing shows it is very slow: on a table with more than 150,000 rows, fetching 5 rows took more than 8 seconds.

A Google search shows that most suggestions online pick random rows using MAX(id) * RAND().

SELECT *
FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id ASC LIMIT 5;

However, this returns five consecutive rows. The solution is to fetch only one row per query and run the query five times. Even so, it is worthwhile, because each query against the 150,000-row table takes less than 0.01 seconds.
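
The "one row per query, repeated five times" idea could look roughly like this in application code. Several things here are assumptions made for the sake of a runnable sketch: sqlite3 stands in for MySQL, the table is called items instead of `table`, id is the first column, and the random threshold is generated in Python rather than with RAND() inside the SQL. Duplicate picks are discarded so that the result contains distinct rows.

```python
import random
import sqlite3


def random_rows(conn, wanted=5, max_attempts=100, rng=random):
    """Collect `wanted` distinct rows, fetching one random row per query."""
    (max_id,) = conn.execute("SELECT MAX(id) FROM items").fetchone()
    if max_id is None:          # empty table
        return []
    picked = {}                 # id -> row, so repeated picks collapse
    for _ in range(max_attempts):
        if len(picked) == wanted:
            break
        # Plays the role of ROUND(RAND() * MAX(id)) in the SQL version.
        threshold = rng.randint(0, max_id)
        row = conn.execute(
            "SELECT * FROM items WHERE id >= ? ORDER BY id LIMIT 1",
            (threshold,),
        ).fetchone()
        picked[row[0]] = row
    return list(picked.values())


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [(i, f"item-{i}") for i in range(1, 1001)],
    )
    for row in random_rows(conn):
        print(row)
```

The max_attempts cap is there so the loop still ends if the table holds fewer rows than requested; in that case the function simply returns what it found.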

The statement above uses JOIN. On the MySQL forum, someone uses a subquery in the WHERE clause instead:

SELECT *
FROM `table`
WHERE id >= (SELECT FLOOR(MAX(id) * RAND()) FROM `table`)
ORDER BY id LIMIT 1;

In my test this took 0.5 seconds. That is acceptable, but still far slower than the JOIN statement above, which suggested that something was off.

So I changed the statement:

SELECT * FROM `table`
WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM `table`)))
ORDER BY id LIMIT 1;

The query efficiency is improved, and the query time is only 0.01 seconds.

Finally, complete the statement by taking MIN(id) into account. In my early tests, without the MIN(id) term, about half of the queries returned the first few rows of the table.
The complete query statements are:

SELECT * FROM `table`
WHERE id >= (SELECT FLOOR(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)))
ORDER BY id LIMIT 1;

SELECT *
FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id LIMIT 1;
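
The effect of the MIN(id) term can be checked with a little arithmetic. The sketch below is pure Python, not MySQL, and the ID range 5000 to 10000 is a made-up example. It compares the thresholds produced by FLOOR(RAND() * MAX(id)) with those produced by FLOOR(RAND() * (MAX(id) - MIN(id)) + MIN(id)), and counts how often the threshold is at or below MIN(id), which is when the query returns the very first row.

```python
import math
import random


def threshold_without_min(max_id, u):
    """FLOOR(RAND() * MAX(id)): thresholds start at 0, even if no ID is that small."""
    return math.floor(u * max_id)


def threshold_with_min(min_id, max_id, u):
    """FLOOR(RAND() * (MAX(id) - MIN(id)) + MIN(id)): thresholds start at MIN(id)."""
    return math.floor(u * (max_id - min_id) + min_id)


if __name__ == "__main__":
    min_id, max_id = 5000, 10000    # hypothetical table whose IDs start at 5000
    rng = random.Random(7)          # fixed seed so runs are repeatable
    draws = [rng.random() for _ in range(10_000)]
    # A threshold <= min_id makes "id >= threshold ORDER BY id LIMIT 1"
    # return the first row of the table.
    without = sum(threshold_without_min(max_id, u) <= min_id for u in draws)
    corrected = sum(threshold_with_min(min_id, max_id, u) <= min_id for u in draws)
    print("first row chosen without MIN(id):", without)
    print("first row chosen with MIN(id):   ", corrected)
```

In this example, roughly half of the uncorrected thresholds fall below the smallest ID, which is consistent with the "half of the time" behaviour described above. With the MIN(id) term the thresholds are spread over the range the IDs actually occupy.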

Finally, I ran each of the two statements 10 times.
The former took 0.147433 seconds.
The latter took 0.015130 seconds.
It seems that using the JOIN syntax is much more efficient than using functions directly in the WHERE clause.
