The following describes two sorting implementation methods and implementation diagrams through examples:
Assume that two tables, Table A and Table B, have the following structures:
Sky @ localhost: example 01:48:21> show create table AG
* *************************** 1. row ***************************
Table:
Create Table: create table 'A '(
'C1 'int (11) not null default '0 ′,
'C2 'char (2) default NULL,
'C3' varchar (16) default NULL,
'C4 'datetime default NULL,
Primary key ('c1 ')
) ENGINE = MyISAM default charset = utf8
Sky @ localhost: example 01:48:32> show create table BG
* *************************** 1. row ***************************
Table: B
Create Table: create table 'B '(
'C1 'int (11) not null default '0 ′,
'C2 'char (2) default NULL,
'C3' varchar (16) default NULL,
Primary key ('c1 '),
KEY 'B _ c2_ind' ('c2 ')
) ENGINE = MyISAM default charset = utf8
1. Using ordered indexes for sorting means that when the order by condition of our Query is exactly the same as the Index key used in the Query execution plan (or the previous Index keys, when the index access mode is rang, ref, or index, MySQL can directly obtain sorted data by using the index order. The order by method is basically the Optimal Sorting method, because MySQL does not need to perform the actual sorting operation.
Suppose we execute the following SQL statement on Table A and B:
Sky @ localhost: example 01:44:28> explain select a. * from a, B
-> Where a. c1> 2 and a. c2 <5 and a. c2 = B. c2 ORDER BY A. c1G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table:
Type: range
Possible_keys: PRIMARY
Key: PRIMARY
Key_len: 4
Ref: NULL
Rows: 3
Extra: Using where
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: B
Type: ref
Possible_keys: B _c2_ind
Key: B _c2_ind
Key_len: 7
Ref: example. A. c2
Rows: 2
Extra: Using where; Using index
We can see from the execution plan that MySQL does not actually sort data, as shown in the following figure:
2. Sort the obtained data in the memory using the corresponding sorting algorithm. MySQL needs to sort the data in the memory, the memory used is the sorting area we set through the sort_buffer_size system variable. This sorting area is exclusive to each Thread, So multiple sort buffer memory areas may exist in MySQL at the same time.
The second method is called filesort in the execution plan provided by MySQL Query Optimizer (viewed by the EXPLAIN command. In this method, the primary reason is that there is no available ordered index to obtain ordered data. MySQL can only sort the obtained data in the memory and then return the data to the client. In MySQL, there are actually two filesort implementation algorithms. One is to first retrieve the corresponding sorting fields based on the corresponding conditions and the row pointer information that can be directly located in the row data, then sort in sort buffer. The other is to retrieve all fields that satisfy the condition row at a time and sort them in sort buffer.
Only the first sorting algorithm was available before mysql. The second algorithm was an improved algorithm starting with MySQL. The main purpose was to reduce the I/O operations required to access table data twice in the first algorithm, it is converted to one time twice, but it also consumes more sort buffer space. Of course, All Versions later than MySQL4.1 also support the first algorithm, mySQL mainly compares the size of the system parameter max_length_for_sort_data and the sum of the Field Types retrieved by the Query statement to determine which sort algorithm to use. If max_length_for_sort_data is larger, the second optimized algorithm is used. Otherwise, the first algorithm is used. Therefore, if you want the order by operation to be as efficient as possible, you must set the max_length_for_sort_data parameter. A large number of sort waits have occurred in a colleague's database, resulting in a high system load and a long response time. Finally, it was discovered that MySQL used the traditional first sort algorithm, after the value of the max_length_for_sort_data parameter is increased, the system load is quickly mitigated and the response is much faster.
Let's take a look at the instances that MySQL needs to use filesort for sorting.
Assume that we change our Query to A. c2 sort, and then look at the situation:
Sky @ localhost: example 01:54:23> explain select a. * from a, B
-> Where a. c1> 2 and a. c2 <5 and a. c2 = B. c2 ORDER BY A. c2G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table:
Type: range
Possible_keys: PRIMARY
Key: PRIMARY
Key_len: 4
Ref: NULL
Rows: 3
Extra: Using where; Using filesort
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: B
Type: ref
Possible_keys: B _c2_ind
Key: B _c2_ind
Key_len: 7
Ref: example. A. c2
Rows: 2
Extra: Using where; Using index
MySQL extracts Qualified Data from Table A. Because the obtained data does not meet the order by condition, MySQL performs the filesort operation. The entire execution process is shown in:
In MySQL, The filesort operation has a strange restriction, that is, the data source must be from a Table. Therefore, if we sort two or more data items) table obtained through Join, MySQL must first create a Temporary Table and then sort the data of the Temporary Table, as shown in the following example:
Sky @ localhost: example 02:46:15> explain select A. * from A, B
-> Where A. c1> 2 and A. c2 <5 and A. c2 = B. c2 order by B. c3G
* *************************** 1. row ***************************
Id: 1
Select_type: SIMPLE
Table:
Type: range
Possible_keys: PRIMARY
Key: PRIMARY
Key_len: 4
Ref: NULL
Rows: 3
Extra: Using where; Using temporary; Using filesort
* *************************** 2. row ***************************
Id: 1
Select_type: SIMPLE
Table: B
Type: ref
Possible_keys: B _c2_ind
Key: B _c2_ind
Key_len: 7
Ref: example. A. c2
Rows: 2
Extra: Using where
The output of this execution plan is still A bit strange. I don't know why, MySQL Query Optimizer displays the "Using temporary" process in the first row's operation on Table, is it just to reduce the output of the execution plan?
The actual execution process is shown in: