Implementation of MySQL order

Source: Internet
Author: User

The MySQL version used in this article is 5.1.48

Select SQL _no_cache * from one where ID <20 order by ID; (ID No key)

#0 my_qsort2 (base_ptr=0x1b2176c0, count=4, size=8, cmp=0x869220<ptr_compare_1>,cmp_argument=0x4670ea38) at mf_qsort.c:115#1 0x00000000008690ea in my_string_ptr_sort (base=0x1b2176c0"Xw!\033", items=4, size=5)at mf_sort.c:37#2 0x00000000005c12bb in filesort (thd=0x1b1ef980, table=0x1b23a880,sortorder=0x1b23ccf0,   s_length=1,select=0x1b23cb50, max_rows=18446744073709551615, sort_positions=false,     examined_rows=0x4670ef18) atfilesort.cc:1035#3 0x000000000053c648 in create_sort_index (thd=0x1b1ef980, join=0x1b2198e0,order=0x1b23bfa0,filesort_limit=18446744073709551615, select_limit=18446744073709551615,is_order_by=false)    atsql_select.cc:13832#4 0x000000000054846a in JOIN::exec (this=0x1b2198e0) at sql_select.cc:2257#5 0x00000000005492be in mysql_select (thd=0x1b1ef980,rref_pointer_array=0x1b1f1a68,   tables=0x1b23b868, wild_num=1, fields=<value optimized out>,conds=0x1b23bd30, og_num=1,order=0x1b23bfa0, group=0x0, having=0x0,proc_param=0x0, select_options=2147764736, result=0x1b23c070, unit=0x1b1f1470,select_lex=0x1b1f1898) at sql_select.cc:2509#6 0x00000000005497e9 in handle_select (thd=0x1b1ef980, lex=0x1b1f13d0,result=0x1b23c070,    setup_tables_done_option=0) at sql_select.cc:269#7 0x00000000004c97b7 in execute_sqlcom_select (thd=0x1b1ef980,all_tables=0x1b23b868)   at sql_parse.cc:5075#8 0x00000000004d516a in mysql_execute_command (thd=0x1b1ef980) atsql_parse.cc:2271#9 0x00000000004d5dc4 in mysql_parse (thd=0x1b1ef980, inBuf=0x1b23b650 "select sql_no_cache * from one where id < 20order by id", length=56,found_semicolon=0x46710c30)at sql_parse.cc:5994

Through this bt, we can know that filesort is used in the join process, but before do_select [for not using temptabel], the qsort sorting algorithm is used.

1. Order by for a single table

Perform sorting in the create_sort_index of join: exec. First, it determines whether the required ordered data (create_sort_index) can be obtained directly through the index. If not, it performs filesort, that is, the real sorting process. In this function, we first call find_all_keys to obtain all the data that meets the conditions (test cond through create_sort_index). The obtained data is first saved in sort_keys, this size is affected by the system environment variable sort_buff_size. When the data to be sorted is greater than the buff, it is written to tempfile by write_keys (an io_cache structure is not necessarily an object file ), in this case, the data in each io_cache is sorted first. When all data can be saved to a sort_keys (that is, sort_buff_size is large enough), filesort will call save_index to sort the data after find_all_keys, in the previous case (split to multiple io_cache), merge_many_buff is called to complete the final merge. The real Sorting Algorithm (function) is implemented by my_string_ptr_sort. It has two policies:
<= 20 & items> = 1000 & items <100000 (the size of each record is smaller than 20 bytes, and the record data is []) use radixsort_for_str_ptr [radixsort for pointers to fixed length strings. A very quick sort for not to long (<20 char) strings. neads a extra buffers ofnumber_of_elements pointers
But is 2-3 times faster than quicksort] This is a comment in the source code.
In other cases, the my_qsort2 algorithm is used to determine the content. When the number of records is smaller than threshold_for_insert_sort (10 by default), insert sorting is used. Otherwise, the sorting is fast.
The above is the entire sorting process. They are finally saved to table-> sort. record_pointers or sort. io_cache. Output after sorting.

#0 evaluate_join_record (join=0x1b214c10, join_tab=0x1b23cb78, error=0) atsql_select.cc:11414#1 0x000000000052dc39 in sub_select (join=0x1b214c10, join_tab=0x1b23cb78,end_of_records=<value optimized out>) at sql_select.cc:11384#2 0x0000000000542f8a in do_select (join=0x1b214c10, fields=0x1b1f19a0,table=0x0, procedure=0x0) at sql_select.cc:11140#3 0x0000000000548613 in JOIN::exec (this=0x1b214c10) at sql_select.cc:2314#4 0x00000000005492be in mysql_select (thd=0x1b1ef980,rref_pointer_array=0x1b1f1a68, tables=0x1b2146e0, wild_num=1, fields=<valueoptimized out>, conds=0x0, og_num=1, order=0x1b214b20, group=0x0,having=0x0, proc_param=0x0, select_options=2147764736, result=0x1b214bf0,unit=0x1b1f1470, select_lex=0x1b1f1898) at sql_select.cc:2509

From this bt, we can see that after order by (2257) 2314, The do_select operation we analyzed in mysqljoin is called. At this time, there is only one layer of sub_select and evaluate_join_record, because it is a single table. Here, the data obtained by sub_select is not read from plugin, but calls rr_unpack_from_buffer; rr_unpack_from_tempfile (read aresult set record from a buffer after sorting) from record_pointers or io_cache, in addition, test (select_cond-> val_int () is not performed in evaluate_join_record. Because select_cond is set to null, [these operations are assigned after filesort]. finally, evaluate_join_record calls end_send and sends it to the client.

2. Join two tables, and order by is the driving table

Explain select * from titles t joinemployees e on T. emp_no = E. emp_no where T. Title = 'engineer 'order bye. first_name;

In this case, the operations performed are the same as those performed on a table, except that when do_select is executed for sending data (SQL _select.cc:2314In this case, the two tables are joined, that is, the two-layer sub_select ---- evaluate_join_record. Only when the driver table is in sub_select, the data is obtained from sort. record_pointers or sort. io_cache. The other data is the same as the MySQL join process.

3. Join two tables with orderNonDriver table

Explain select * from titles t joinemployees e on T. emp_no = E. emp_no where T. Title = 'engineer 'order by T. from_date;

Compared with the using temporary, [there is an article on the internet saying why this is displayed in the first line. In fact, it is not suitable for any line because of using temporary, using filesort is after the two joins], that is to say, in this case, a temporary table is required for implementation. Next we will analyze the role of the temp tab. The code here is as follows:

#0 sub_select (join=0x1b2270a0, join_tab=0x1b21a618,end_of_records=<value optimized out>)   at sql_select.cc:11401#1 0x000000000052da0e in evaluate_join_record (join=0x1b2270a0,join_tab=0x1b21a3c0, error=<value optimized out>) at sql_select.cc:11511#2 0x000000000052dc39 in sub_select (join=0x1b2270a0, join_tab=0x1b21a3c0, end_of_records=<value optimized out>) at sql_select.cc:11384#3 0x0000000000542f8a in do_select (join=0x1b2270a0, fields=0x0,table=0x1b23d670, procedure=0x0)    atsql_select.cc:11140#4 0x0000000000546e8c in JOIN::exec (this=0x1b2270a0) at sql_select.cc:1907#5 0x00000000005492be in mysql_select (thd=0x1b1ef980,rref_pointer_array=0x1b1f1a68,   tables=0x1b23b900, wild_num=1, fields=<value optimized out>,conds=0x1b23c978, og_num=1,order=0x1b23cc18, group=0x0, having=0x0,proc_param=0x0, select_options=2147764736, result=0x1b23ccf0, unit=0x1b1f1470,select_lex=0x1b1f1898) at sql_select.cc:2509

We can see that do_select (1907) The operation is in2257: Before create_sort_index (), This is different from the analysis process in our previous join document, that is, the join process is in advance, because there is an IF (need_tmp) Judgment in front, this need_tmp is used for order or group. And when you join a statement that meets the conditions, run join_tab-> next_select: end_write [two layers after evaluate_join_recordtest (select_cond-> val_int ()]. At this time, the write position is join-> tmp_table; its initialization starts with do_select, that is, assigning the temp_table value of need_tmp specified when entering do_select to it. Save all records after join to a temp
Table [join-> tmp_table].
Run change_to_use_tmp_fields to make order by use temp fields. Copy the temp table information to join.
Then execute create_sort_index and execute the sort operation. Because the select passed to filesort is empty, the cond will not be judged when find_all_keys is passed. (That is, select-> skip_record will not be executed) and then send data. At this time, do_select will be executed again. [If tmp_table is not available, this do_select will be executed only once, for example, if order by is set for a single table, join only has one temp table. Therefore, there is only one layer of sub_select and evaluate_join_record. The last join_tab-> next_select is end_send.

4. Why does MySQL think that using temp table is better?

First, let's look at the following data:
Select count (*) from e-mapreduce: 300024
Select count (*) from titles: 443308
Select count (*) from titles wheretitle = 'engine': 115003
Table E is smaller than table t, so it will be faster during full table scan, but table t has a where condition, after test cond, only 115003 is left, therefore, it may be faster for MS to use the T table to join the E table based on this data. In addition, the former uses temp table, which is hard to understand why MySQL does this. Therefore, we will compare the performance of the two through the following experiment.

A: Explain select SQL _no_cache * From titlest join employees e on T. emp_no = E. emp_no wheret. Title = 'engineer 'order by T. from_date;

Execution result: 115003 rows in SET (3.09 Sec)

B: Explain select SQL _no_cache * From titlest straight_join employees e on T. emp_no = E. emp_nowhere T. Title = 'engineer 'order by T. from_date;

Execution result: 115003 rows in SET (1.40 Sec)
The experimental results are as shown in the above analysis: using t as the driving table has better performance than using E (the default optimization result of MySQL.
In fact, this should be the same for situations where order by is not used, so we have tested the two cases below:
Select SQL _no_cache * from titles t join employees e on T. emp_no = E. emp_no wheret. Title = 'engine ';
115003 rows in SET (2.50 Sec)
Select SQL _no_cache * from titles t straight_join employees e on T. emp_no = E. emp_no wheret. Title = 'engine ';
Execution result: 115003 rows in SET (1.05 Sec)

Through the above experiment, we can see that MySQL optimization is not necessarily reliable, one reason may be that it cannot determine the data information after where cond. This information may have a significant impact on the join operation.

5. Summary

The entire process can be simply described as if an order can be performed on a single table, order by will be performed on its filter set first. If not, order by is performed only after join, at this time, an intermediate temporary table will be generated as the final order by object (it can also be said that order by will only operate on one table, but this table may be an entity table or a temporary table generated after multiple tables are joined ). In addition, there are two sorting algorithms: one is to retrieve only the field to order and one pointer that can return a record, and put them in sort_buff for sorting, after sorting, the corresponding fields of each record are retrieved in order; the other is to retrieve the order field and all the fields to be returned at one time. The two policies are determined by the system environment variable max_length_for_sort_data (get_addon_fields). When the sum of the field types to be obtained is greater than max_length_for_sort_data, the first is used. Otherwise, the second is used. Obviously, the first method requires one more Io, but the memory usage is less. So when order
When the by performance is poor, you can consider improving max_length_for_sort_data or sort_buffer_size appropriately. The former reduces Io and the latter reduces meger.

References:

Jane Sun: http://isky000.com/database/mysql_order_by_implement

Http://www.cnblogs.com/phper7/archive/2010/05/26/1744062.html

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.