MySQL SQL optimization

Source: Internet
Author: User
Tags rand

--mysql SQL optimization

-------------------2014/07/23

Several cases of ORDER BY
At first glance the problem seems a bit complicated, so we start with the simplest case.
The examples use a single table, tb, holding 100,000 rows of data.

1. The simplest case: ORDER BY an indexed field

Judging from the Extra column of the EXPLAIN output, this statement does not sort at all, because index a is already in order. MySQL reads the primary key values in the order of index a (here the primary key is the hidden system column) and then fetches the rows from the clustered index.

2. A little more complicated: ORDER BY a non-indexed field

Here the Extra column shows Using filesort. Filesort does not literally mean "file sort"; it means that, unlike the case above, the sort is performed in the server layer. Whether a file is actually used depends on whether there is enough memory during sorting: a temporary file is needed only when there is not.

Before going further, let's think about how the server layer does the sort.

A simple idea is to read the whole table into memory and then sort it. Once everything is in memory, finishing the job is straightforward. But this approach is very memory-intensive: it needs as much memory as the table itself.

Another approach reads only field b and its corresponding primary key id. Imagine a struct made of these two fields, sorted by the value of b. Once sorting is complete, the rows are fetched by primary key id in the order of field b, and the result is returned.

In fact, the second approach is what actually happens in this example. The structure that holds the field values used for sorting is called sort_keys.

As for ORDER BY b, c, it works the same way as ORDER BY b; think of the structure above as carrying one more field.
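
The second approach can be sketched in Python as a rough illustration. This is a simulation of the idea only, not MySQL's actual code; the toy table, its values and the helper name order_by are assumptions made up for the example.

```python
# Hypothetical toy table: primary key id -> full row (columns a, b, c, d).
table = {
    1: {"a": 10, "b": "pear", "c": 3, "d": "x"},
    2: {"a": 20, "b": "apple", "c": 1, "d": "y"},
    3: {"a": 30, "b": "fig", "c": 2, "d": "z"},
}

def order_by(table, *columns):
    """Sort using compact (sort columns..., primary key) entries only."""
    # Step 1: read only the sort columns and the primary key of each row.
    sort_keys = [
        tuple(row[col] for col in columns) + (pk,)
        for pk, row in table.items()
    ]
    # Step 2: sort the small entries (ties fall back to the primary key).
    sort_keys.sort()
    # Step 3: fetch the full rows by primary key, in sorted order.
    return [table[key[-1]] for key in sort_keys]

rows_by_b = order_by(table, "b")        # like ORDER BY b
rows_by_b_c = order_by(table, "b", "c")  # like ORDER BY b, c: one more field per entry
```

Only the small entries are sorted; the full rows are touched once, at the end.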

3. Sorting by a function of a field

With the process above in mind, this case is simple: every value of field b still has to be read, but what goes into sort_keys is the length of b rather than b itself.
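
A minimal Python sketch of the same idea, assuming the function is the length of b. The sample values and the helper name sort_keys_for_length are hypothetical.

```python
# Hypothetical toy data: primary key id -> value of column b.
b_by_id = {1: "pear", 2: "fig", 3: "banana"}

def sort_keys_for_length(b_by_id):
    # Every value of b is still read, but only len(b) and the id are kept.
    keys = [(len(b), pk) for pk, b in b_by_id.items()]
    keys.sort()
    return keys

# Primary keys in the order the rows would be returned.
sorted_ids = [pk for _, pk in sort_keys_for_length(b_by_id)]
```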

4. ORDER BY RAND()

Following the same idea, one would expect ORDER BY RAND() to work as described above: for each row, the generated RAND() value is put into sort_keys. But the actual result is as follows:

There is a Using temporary in the Extra column, which means that a temporary table is used. So what does the Using temporary flow look like?
a) Create a temporary table with the HEAP engine, with fields "", a, b, c, d; the first field is anonymous;
b) Read the rows of table tb into the temporary table one by one, filling the first field with a random real number in [0, 1);
c) Sort by the first field and return the result;
d) When the query completes, drop the temporary table.
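
Steps a) to d) can be mimicked in Python as a rough sketch. It only models the flow described above; the function name order_by_rand and the sample rows are made up for illustration.

```python
import random

def order_by_rand(rows, rng=random):
    # a) Create an in-memory "temporary table"; each entry gets an extra,
    #    anonymous first field in front of the original row.
    temp_table = []
    # b) Copy the source rows one by one, filling the first field with a
    #    random real number in [0, 1).
    for row in rows:
        temp_table.append((rng.random(), row))
    # c) Sort by the first field and build the result.
    temp_table.sort(key=lambda entry: entry[0])
    result = [row for _, row in temp_table]
    # d) The query is done: drop the temporary table.
    del temp_table
    return result

tb = [(1, "a1"), (2, "a2"), (3, "a3"), (4, "a4")]  # hypothetical rows
shuffled = order_by_rand(tb, random.Random(0))
```

Every row is copied once before any sorting starts, which is where the extra memory and copying cost comes from.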

Analyzing this process: the data is read from the InnoDB table into the temporary table, and the InnoDB table itself has already been read into memory. So, unless memory runs short and data has to be written to a file, there are two copies of the table in memory during the query, plus the cost of copying the data row by row from memory into the temporary table.

--ORDER BY RAND() optimization

The basic fix is to use MAX(id) * RAND() to pick a random row instead of sorting the whole table.

For example:

SELECT *
FROM `table`
WHERE id >= (SELECT FLOOR(MAX(id) * RAND()) FROM `table`)
ORDER BY id LIMIT 1;

This can be refined with MIN(id). Without the MIN(id) term, every random value that falls below the smallest id in the table resolves to the same first row, so that row is returned more often than it should be.

SELECT * FROM `table`
WHERE id >= (SELECT FLOOR(RAND() * ((SELECT MAX(id) FROM `table`) - (SELECT MIN(id) FROM `table`)) + (SELECT MIN(id) FROM `table`)))
ORDER BY id LIMIT 1;
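
The logic of this query can be followed in a small Python sketch. It is an illustration only: a sorted list of ids stands in for the primary key index, and the name pick_random_row_id and the sample ids are assumptions.

```python
import bisect
import math
import random

def pick_random_row_id(sorted_ids, rng=random):
    lo, hi = sorted_ids[0], sorted_ids[-1]  # MIN(id), MAX(id)
    # FLOOR(RAND() * (MAX(id) - MIN(id)) + MIN(id))
    threshold = math.floor(rng.random() * (hi - lo) + lo)
    # WHERE id >= threshold ORDER BY id LIMIT 1: first id not below threshold.
    pos = bisect.bisect_left(sorted_ids, threshold)
    return sorted_ids[pos]

ids = [5, 6, 9, 20]  # hypothetical ids, already sorted
rng = random.Random(1)
picks = {pick_random_row_id(ids, rng) for _ in range(1000)}
```

Because the threshold never falls below MIN(id), no random value is wasted on the empty range in front of the first row.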

--Drawback of the optimization: if the id values are unevenly distributed, the resulting randomness is poor. For example:

Data distribution 1-----3-------------------------------------------------2000000

With this method, the row with id 2000000 is returned well over 90% of the time, because almost every random value lands in the large gap between 3 and 2000000.
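
A quick Python simulation makes the skew visible. It assumes the table holds exactly the three ids drawn above (1, 3 and 2000000) and applies the first query's FLOOR(MAX(id) * RAND()) rule; the helper name pick and the sample size are arbitrary choices.

```python
import bisect
import math
import random
from collections import Counter

ids = [1, 3, 2000000]  # the sparse distribution drawn above, sorted

def pick(sorted_ids, rng):
    # FLOOR(MAX(id) * RAND())
    threshold = math.floor(sorted_ids[-1] * rng.random())
    # WHERE id >= threshold ORDER BY id LIMIT 1
    return sorted_ids[bisect.bisect_left(sorted_ids, threshold)]

rng = random.Random(42)
trials = 10000
counts = Counter(pick(ids, rng) for _ in range(trials))
share_of_last = counts[2000000] / trials  # fraction of picks that hit id 2000000
```

Only thresholds 0 to 3 can select ids 1 or 3, which is 4 values out of 2,000,000, so nearly every pick is the last row.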
