Mysql-optimized filesort

Source: Internet
Author: User

Content transferred from Sohu gulei PPT

Gains from viewing MySQL source code

1. provide theoretical basis for Optimization
2. Learn the algorithms and ideas for solving the problem

Filesort algorithm (Sorting Algorithm) 1. Read all data to be sorted
2. Each row of data
A. Algorithm 1 (original): stores the sort key and row pointer
B. algorithm 2 (modified): stores the fields in the SORT key and select.
3. Each sorting sort_buffer_size can accommodate the number of rows. The sorting result is written to the io_cache object (io1). The location information of the sorting result is written to another io_cache object (io2 );
4. When io_cache exceeds 64 KB, write a temporary file.
5. When order by has limit N, you only need to write the first n sorting results to io_cache;
6. Use radixsort when the sorting key length is <= 20 and the number of sorting keys is between one thousand and 100,000. Otherwise, use quicksort.
7. Merge Buffer
8. read the sorting result (algorithm 2 reads the result directly from the temporary file; algorithm 1 reads the row pointer from the temporary file, and then reads the data from the table) Merge buffer (merge buffer) filesort algorithm select bgid from bigt order by bgname;
Create Table: Create Table 'bigint '(
'Bgid' int (10) unsigned not null auto_increment,
'Bgname' varchar (100) default null,
'Status' tinyint (4) default '0 ',
Primary Key ('bgid ')
) Engine = InnoDB default charset = Latin1
Bgid (4 bytes), bgname (102 bytes), (null_fields + 7)/8 = 1
Here, null_fields is 1 and bgname is a field that can be empty.
Length = 4 + 102 + 1 = 107
Sort_length = 101 (bgname length)

1. Select the original algorithm when the following two conditions are met
A. There are text or BLOB fields
B. ength + sortlength> max_length_for_sort_data
3. Otherwise, select the modified algorithm.
4. In this example, the modified algorithm is selected.
5. No text or BLOB fields
6. Length + sortlength = 208
7. max_length_for_sort_data = 1024
Sort buffer memory usage 1. Keys = sort_buff_size/(rec_length + sizeof (char *))
2. rec_length = Length + sortlength
In this example
A. rec_length = 208.
B. sizeof (char *) = 4
C. sort_buff_size = 2097116
D. Keys = 9892.
That is, 9892 keys can be sorted in the memory at a time.

Implementation in inverted Order 1. Not implemented when comparing key values
A. Positive and inverted orders are found, which are not treated differently in functions that compare the key value.
B. I almost thought the entire sorting process was wrong.
2. implemented when the key value is written to the sorting area
A. in descending order of tracking Character Types
B. Reverse each byte in make_sortkey
C. In this way, the subsequent Positive Sorting is equivalent to reverse sorting.

In the positive sort merge buffer example, the actual MySQL source code combines every seven buffers. This example is simplified and only five buffers are merged.
Buffer is a Sort result and is stored in a temporary file (io_cache.
Five buffers are the five segments in the temporary file. Each segment stores the sorting result.
The merge buffer algorithm is the mergesort implemented by heapsort.
First, take the first sorting key for each segment and add it to heap to ensure heap sorting.
Merge buffer summarizes the MySQL source code and merges them repeatedly.
Merge 7 buffers each time until all buffers are merged.
The sort buffer memory is still used during merge.
During the last merge, no sort key is written into the sorting result, and only the required field value is written.

Merge buffer is mainly used to obtain the minimum values of each buffer. The minimum values of all buffer data are obtained together.
In addition to the minimum value obtained, calculate the minimum value of the Current Buffer value, and so on to obtain all the sorted buffer data.
Mergesort implemented using heapsort.

Why heapsort? When several buffers are merged each time, all the data in the buffer cannot be obtained.
It is meaningless to sort all the data that can be obtained in the SORT buffer in full order.
For example, in ordered sorting, only the minimum values of the Current Buffer values can be the minimum values of all buffer values. If the minimum values are obtained in sequence, all data in full sorting is obtained.
Heapsort also happens to be not fully ordered, only to ensure that root is the smallest

O & M considerations: computing whether an SQL can be sorted in memory
Calculate the filesort algorithm used by an SQL statement.
Merge buffer cost?
How does filesort evaluate the resource consumption of old and new algorithms?

The essence of optimization is the high throughput of small resources.
High-frequency efficient resources
Low-frequency use of inefficient resources
Efficient resources are usually relatively scarce.
Algorithms help achieve optimization goals

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.