SQL Statement Optimization Principles and Optimization Solutions for Tables with Millions of Rows

Last Update:2013-11-17 Source: Internet

Author: User

Tags mysql tutorial


1. Use indexes to traverse tables faster.
The index created by default is a non-clustered index, but that is not always optimal. Under a non-clustered index, the data is physically stored on the data pages in no particular order. Reasonable index design should be based on analysis and prediction of the various queries. Generally speaking:
A. Columns that contain many repeated values and are frequently used in range queries (>, <, >=, <=), ORDER BY, or GROUP BY are candidates for a clustered index;
B. If multiple columns are frequently accessed together and each column contains duplicate values, consider creating a composite index;
C. A composite index should, as far as possible, cover the key queries (a covering index), and its leading column must be the most frequently used column.
Although indexes can improve performance, more indexes are not always better. On the contrary, too many indexes lower system efficiency, because every index added to a table is one more structure that has to be kept up to date.
2. In queries over massive amounts of data, use format conversions as little as possible.
3. ORDER BY and GROUP BY: when a query uses ORDER BY or GROUP BY clauses, any kind of index can help improve SELECT performance.
4. Any operation on a column, including database functions and calculation expressions, causes a table scan. When writing a query, move the operation to the right-hand side of the equals sign wherever possible.
5. IN and OR clauses often use work tables, which makes indexes ineffective. If splitting does not produce a large number of duplicate values, consider splitting the clause; each of the split clauses should use an index.
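
As a rough illustration of points 1, 4, and 5, the sketch below uses a hypothetical `orders` table. The table, column, and index names are assumptions made for this example and do not come from the original article. Whether MySQL actually picks an index depends on the server version and the data, so it is worth checking each query with EXPLAIN.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE orders (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer_id INT UNSIGNED NOT NULL,
    status      TINYINT UNSIGNED NOT NULL,
    created_at  DATETIME NOT NULL,
    PRIMARY KEY (id),
    -- Composite index: the most frequently filtered column leads.
    KEY idx_customer_created (customer_id, created_at),
    KEY idx_status (status)
);

-- Point 1C: only columns of idx_customer_created are read,
-- so the index alone can answer the query (index covering).
SELECT customer_id, created_at
FROM orders
WHERE customer_id = 42
ORDER BY created_at;

-- Point 4, form to avoid: arithmetic applied to the indexed column.
SELECT id FROM orders WHERE customer_id + 1 = 43;

-- Point 4, preferred form: the calculation sits on the right of the equals sign.
SELECT id FROM orders WHERE customer_id = 43 - 1;

-- Point 5, original form: OR across two differently indexed columns.
SELECT id FROM orders WHERE customer_id = 42 OR status = 3;

-- Point 5, split form: each branch can use its own index;
-- UNION removes rows that match both conditions.
SELECT id FROM orders WHERE customer_id = 42
UNION
SELECT id FROM orders WHERE status = 3;
```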

MySQL Optimization Principle 2:
1. Use the smallest data type that meets your needs: for example, use MEDIUMINT instead of INT.
2. Define columns as NOT NULL wherever possible. If you want to store NULL, set it explicitly instead of making it the default value.
3. Use VARCHAR, TEXT, and BLOB types as little as possible.
4. If a column only holds a small, known set of values, it is best to use the ENUM type.
5. Create indexes as described by graymice.
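
A table definition that follows points 1 to 4 might look like the sketch below. The `member` table and all of its columns are hypothetical and chosen only to show the data types; the right sizes depend on the real value ranges.

```sql
-- Hypothetical table; names and value ranges are assumptions for illustration.
CREATE TABLE member (
    id      MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,  -- 3 bytes, up to 16,777,215
    age     TINYINT UNSIGNED NOT NULL DEFAULT 0,         -- 1 byte, 0 to 255
    gender  ENUM('unknown', 'female', 'male') NOT NULL DEFAULT 'unknown',
    country CHAR(2) NOT NULL DEFAULT '',                 -- fixed width instead of VARCHAR
    PRIMARY KEY (id)
);
```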

Method 2

Before optimization (table A contains redundant data):
SELECT `T`.`img_id`, `T`.`thumb_path`
FROM `gallery_photofiles` P
LEFT JOIN `gallery_thumbs` T ON `T`.`img_id` = `P`.`img_id` AND T.thumb_type = '11'
WHERE `P`.`owner_user_id` = '1'
AND P.img_id IN (SELECT A.img_id FROM `gallery_album_img_link` A WHERE A.img_id)
After optimization (using COUNT(*) greatly increases the speed):
SELECT `T`.`img_id`, `T`.`thumb_path`
FROM `gallery_photofiles` P
LEFT JOIN `gallery_thumbs` T ON `T`.`img_id` = `P`.`img_id` AND T.thumb_type = '11'
WHERE `P`.`owner_user_id` = '1'
AND (SELECT COUNT(*) FROM `gallery_album_img_link` A WHERE A.img_id = P.img_id) < 1
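
The correlated COUNT(*) < 1 test asks whether no matching link row exists, so the same condition can also be written with NOT EXISTS. The sketch below is an alternative phrasing, not the author's statement, and the index name `idx_album_link_img` is an assumption; it is only useful if `gallery_album_img_link`.`img_id` is not already indexed.

```sql
-- Optional supporting index for the correlated lookup (hypothetical name).
CREATE INDEX idx_album_link_img ON `gallery_album_img_link` (`img_id`);

-- Same filter as the COUNT(*) < 1 version: keep photos with no link row.
SELECT T.`img_id`, T.`thumb_path`
FROM `gallery_photofiles` P
LEFT JOIN `gallery_thumbs` T ON T.`img_id` = P.`img_id` AND T.thumb_type = '11'
WHERE P.`owner_user_id` = '1'
AND NOT EXISTS (SELECT 1 FROM `gallery_album_img_link` A WHERE A.img_id = P.img_id);
```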

I always thought that fetching a few random rows from MySQL was as simple as:

SELECT * FROM `table` ORDER BY RAND() LIMIT 5

However, testing shows that this is very inefficient: on a table with more than 150,000 rows, fetching 5 records takes more than 8 seconds.

According to the official manual, RAND() in an ORDER BY clause is evaluated multiple times, which is naturally inefficient:

You cannot use a column with RAND() values in an ORDER BY clause, because ORDER BY would evaluate the column multiple times.

A Google search shows that almost every solution on the Internet fetches random rows by querying MAX(id) * RAND():

SELECT *
FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id ASC LIMIT 5;
However, this returns five consecutive records. The solution is to fetch only one row per query and run the query five times. Even so, it is worthwhile, because on the table with 150,000 rows each query takes less than 0.01 seconds.
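
One way to run the single-row query several times in one round trip is to combine parenthesized copies with UNION ALL, since each copy evaluates its own RAND(). This is only a sketch of the "one row per query" idea with two branches; extend it to five branches for five rows. Because the branches are independent, the same row can be returned more than once.

```sql
-- Each parenthesized branch draws its own random starting id
-- and returns the first row at or after it.
(SELECT t1.*
 FROM `table` AS t1
 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2
 WHERE t1.id >= t2.id
 ORDER BY t1.id ASC LIMIT 1)
UNION ALL
(SELECT t1.*
 FROM `table` AS t1
 JOIN (SELECT ROUND(RAND() * (SELECT MAX(id) FROM `table`)) AS id) AS t2
 WHERE t1.id >= t2.id
 ORDER BY t1.id ASC LIMIT 1);
```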

The statement above uses JOIN. Someone on the MySQL forum uses this instead:

SELECT *
FROM `table`
WHERE id >= (SELECT FLOOR(MAX(id) * RAND()) FROM `table`)
ORDER BY id LIMIT 1;
I tested it: it took 0.5 seconds. That is not bad, but it is still far behind the statement above. Something seemed off.

So I rewrote the statement:

SELECT * FROM `table`
WHERE id >= (SELECT FLOOR(RAND() * (SELECT MAX(id) FROM `table`)))
ORDER BY id LIMIT 1;
This is more efficient: the query takes only 0.01 seconds.

Finally, complete the statement by adding the MIN(id) term. In my first tests I had left out MIN(id), and as a result about half of the queries returned the first few rows of the table.
The complete query statements are:

SELECT * FROM `table`
WHERE id >= (SELECT FLOOR(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)))
ORDER BY id LIMIT 1;

SELECT *
FROM `table` AS t1 JOIN (SELECT ROUND(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)) AS id) AS t2
WHERE t1.id >= t2.id
ORDER BY t1.id LIMIT 1;
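
To see why the MIN(id) term matters, the sketch below compares the two threshold formulas using made-up bounds. The values 1000 and 2000 are assumptions for illustration, standing in for MIN(id) and MAX(id) of a table whose ids do not start near zero.

```sql
-- Hypothetical id bounds.
SET @min_id = 1000, @max_id = 2000;

SELECT
    -- Ranges over 0 .. 1999: roughly half of the draws fall below @min_id,
    -- and every such draw selects the first row of the table.
    FLOOR(RAND() * @max_id) AS threshold_without_min,
    -- Ranges over 1000 .. 1999: every draw lies inside the actual id range.
    FLOOR(RAND() * (@max_id - @min_id) + @min_id) AS threshold_with_min;
```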

