Transfer from http://blog.csdn.net/tonyxf121/article/details/7796657
The implementation principle of join
Join is implemented by using the nested loop join algorithm, which is to use the result set of the driver table as the circular base data, then query the data in the next table with the data in the result set as a filter, then merge the results. If there are multiple joins, the preceding result set is used as the looping data, once again as a loop condition to query the data in the latter table.
Next, a three-table join query is followed to illustrate how MySQL's nested Loop join is implemented.
[SQL]View PlainCopy
- Select M.subject msg_subject, C.content msg_content
- From User_group g,group_message m,group_message_content C
- where g.user_id = 1
- and m.group_id = g.group_id
- and c.group_msg_id = M.id
Use explain to see the execution plan:
[SQL]View PlainCopy
- Explain select M.subject msg_subject, C.content msg_content from user_group g,group_message m,
- Group_message_content C where g.user_id = 1 and m.group_id = g.group_id and c.group_msg_id = m.id\g;
The results are as follows:
[Plain]View PlainCopy
- 1. Row ***************************
- Id:1
- Select_type:simple
- Table:g
- Type:ref
- Possible_keys:user_group_gid_ind,user_group_uid_ind,user_group_gid_uid_ind
- Key:user_group_uid_ind
- Key_len:4
- Ref:const
- Rows:2
- Extra:
- 2. Row ***************************
- Id:1
- Select_type:simple
- Table:m
- Type:ref
- Possible_keys:primary,idx_group_message_gid_uid
- Key:idx_group_message_gid_uid
- Key_len:4
- ref:g.group_id
- Rows:3
- Extra:
- 3. Row ***************************
- Id:1
- Select_type:simple
- Table:c
- Type:ref
- possible_keys:idx_group_message_content_msg_id
- key:idx_group_message_content_msg_id
- Key_len:4
- Ref:m.id
- Rows:2
- Extra:
As can be seen from the results, explain select User_group as the driver table, first through the index user_group_uid_ind to the const condition index ref lookup, and then use the User_group table filtered result set Group_ ID field as the query criteria, the Group_message loop query, and then with the filtered result set of Group_message ID as a condition and group_message_content group_msg_id loop comparison query, Get the final result.
This process can be represented by the following code:
For each record g_rec on table user_group that g_rec.user_id=1{
For the record M_rec in Group_message that m_rec.group_id=g_rec.group_id{
For the record C_rec in group_message_content that c_rec.group_msg_id=m_rec.id
Pass the (g_rec.user_id, M_rec.subject, c_rec.content) row
combination to output;
}
}
If you remove the index of the group_msg_id field above the Group_message_content table, the execution plan will vary.
[SQL]View PlainCopy
- Drop index idx_group_message_content_msg_id on group_message_content;
- Explain select M.subject msg_subject, C.content msg_content from user_group g,group_message m,
- Group_message_content C where g.user_id = 1 and m.group_id = g.group_id and c.group_msg_id = m.id\g;
The resulting implementation plan is as follows:
[Plain]View PlainCopy
- 1. Row ***************************
- Id:1
- Select_type:simple
- Table:g
- Type:ref
- Possible_keys:user_group_uid_ind
- Key:user_group_uid_ind
- Key_len:4
- Ref:const
- Rows:2
- Extra:
- 2. Row ***************************
- Id:1
- Select_type:simple
- Table:m
- Type:ref
- Possible_keys:primary,idx_group_message_gid_uid
- Key:idx_group_message_gid_uid
- Key_len:4
- ref:g.group_id
- Rows:3
- Extra:
- 3. Row ***************************
- Id:1
- Select_type:simple
- Table:c
- Type:all
- Possible_keys:null
- Key:null
- Key_len:null
- Ref:null
- rows:96
- Extra:using where; Using Join Buffer
Because the index was deleted, Group_message_content's access changed from ref to All,keys and the Null,extra information became the using where and using join buffer, That is to say, you need to get content that can only be obtained by where filtering the data in the entire table. The using join buffer refers to the use of the cache, which uses the join buffer only if the join type is All,index,rang or Index_merge, and its use can be expressed in the following code:
For each record g_rec in table user_group{
For the record M_rec in Group_message that m_rec.group_id=g_rec.group_id{
Put (G_rec, M_rec) into the buffer
if (buffer is full)
Flush_buffer ();
}
}
Flush_buffer () {
For the record C_rec in group_message_content that c_rec.group_msg_id = c_rec.id{
For each record in the buffer
Pass (g_rec.user_id, M_rec.subject, c_rec.content) row combination to output;
}
Empty the buffer;
}
In the implementation process, you can see that the result set of User_group and Group_message is placed in the join buffer without having to associate the User_group and group_message with each other immediately and Group_message_ Content correlation, which is also not necessary; It is important to note that only the column values that appear in the query results are preserved in the join buffer, and that the size does not depend on the size of the table, and we see in the pseudocode that when the join buffer is filled, MySQL will flush buffer.
Optimization of JOIN statements
1. Drive large result sets with small result sets to minimize the total number of nested loops in the join statement;
2. Optimize the inner loop of the nested loop, as the inner loop is the most executed in the loop, and the performance of each cycle can be improved very much in the whole cycle.
3. Index the Join field on the drive table;
4. When the index cannot be indexed on the join field of the drive table, set enough join Buffer Size.
Pack Gadgets
Http://www.linuxidc.com/Linux/2014-03/98553.htm
1110Nested Loop Join algorithm