Optimization of straight_join in MySQL: mysqlstraight_join
There are many hint types that can be specified for table join in oracle: ordered hint indicates that oracle is connected according to the table sequence following the from keyword; leading hint indicates that the query optimizer uses the specified table as the first table to be connected, that is, the driver table. use_nl hint indicates that the query optimizer uses the nested loops method to connect the specified table to other row sources, the specified table is forced as the inner table.
In mysql, there is a corresponding straight_join. because mysql only supports the connection mode of nested loops, The straight_join here is similar to use_nl hint in oracle. When the mysql optimizer is dealing with multi-Table associations, it is very likely that the wrong driver table will be selected for association, resulting in an increase in the number of associations, which makes SQL statement execution very slow, in this case, an experienced DBA is required to make a judgment and select the correct driver table. In this case, straight_join is used. Let's take a look at the case of using straight_join for optimization:
1. user instance: the execution of an SQL statement of spxxxxxx is very slow. The SQL statement is as follows:
73871 | root | 127.0.0.1:49665 | user_app_test | Query | 500 | Sorting result |SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rowsFROM test_log a,USER bWHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4GROUP BY DATE(practicetime)
2. view the execution plan:
mysql> explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rowsFROM test_log a,USER bWHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4GROUP BY DATE(practicetime);mysql> explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rows-> FROM test_log a,USER b-> WHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4-> GROUP BY DATE(practicetime)\G;*************************** 1. row ***************************id: 1select_type: SIMPLEtable: atype: ALLpossible_keys: ix_test_log_useridkey: NULLkey_len: NULLref: NULLrows: 416782Extra: Using filesort*************************** 2. row ***************************id: 1select_type: SIMPLEtable: btype: eq_refpossible_keys: PRIMARYkey: PRIMARYkey_len: 96ref: user_app_testnew.a.useridrows: 1Extra: Using where2 rows in set (0.00 sec)
3. view the index:
mysql> show index from test_log;+————–+————+————————-+————–+————-+———–+————-+———-++| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |+————–+————+————————-+————–+————-+———–+————-+———-++| test_log | 0 | ix_test_log_unique_ | 1 | unitid | A | 20 | NULL | NULL | | BTREE | || test_log | 0 | ix_test_log_unique_ | 2 | paperid | A | 20 | NULL | NULL | | BTREE | || test_log | 0 | ix_test_log_unique_ | 3 | qtid | A | 20 | NULL | NULL | | BTREE | || test_log | 0 | ix_test_log_unique_ | 4 | userid | A | 400670 | NULL | NULL | | BTREE | || test_log | 0 | ix_test_log_unique_ | 5 | serial | A | 400670 | NULL | NULL | | BTREE | || test_log | 1 | ix_test_log_unit | 1 | unitid | A | 519 | NULL | NULL | | BTREE | || test_log | 1 | ix_test_log_unit | 2 | paperid | A | 2023 | NULL | NULL | | BTREE | || test_log | 1 | ix_test_log_unit | 3 | qtid | A | 16694 | NULL | NULL | | BTREE | || test_log | 1 | ix_test_log_serial | 1 | serial | A | 133556 | NULL | NULL | | BTREE | || test_log | 1 | ix_test_log_userid | 1 | userid | A | 5892 | NULL | NULL | | BTREE | |+————–+————+————————-+————–+————-+———–+————-+———-+——–+——+——-+
4. Adjust the index. Table A adopts the covered index for optimization:
mysql>alter table test_log drop index ix_test_log_userid,add index ix_test_log_userid(userid,practicetime)
5. view the execution plan:
mysql> explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rowsFROM test_log a,USER bWHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4GROUP BY DATE(practicetime)\G*************************** 1. row ***************************id: 1select_type: SIMPLEtable: atype: indexpossible_keys: ix_test_log_useridkey: ix_test_log_useridkey_len: 105ref: NULLrows: 388451Extra: Using index; Using filesort*************************** 2. row ***************************id: 1select_type: SIMPLEtable: btype: eq_refpossible_keys: PRIMARYkey: PRIMARYkey_len: 96ref: user_app_test.a.useridrows: 1Extra: Using where2 rows in set (0.00 sec)
After the adjustment, the execution has a slight effect, but it is not obvious, and the key points are not found:
SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rowsFROM test_log a,USER bWHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4GROUP BY DATE(practicetime);……………….143 rows in set (1 min 12.62 sec)
6. the execution time still takes a long time, and the time consumption is mainly in Using filesort. The data volume involved in sorting is as large as, so the driver table needs to be converted. Try to use the user table as the driver table: use straight_join to force the join order:
mysql> explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rowsFROM USER b straight_join test_log aWHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4GROUP BY DATE(practicetime)\G;*************************** 1. row ***************************id: 1select_type: SIMPLEtable: btype: ALLpossible_keys: PRIMARYkey: NULLkey_len: NULLref: NULLrows: 42806Extra: Using where; Using temporary; Using filesort*************************** 2. row ***************************id: 1select_type: SIMPLEtable: atype: refpossible_keys: ix_test_log_useridkey: ix_test_log_useridkey_len: 96ref: user_app_test.b.useridrows: 38Extra: Using index2 rows in set (0.00 sec)
The execution time has changed to 2.56 seconds;
mysql>SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rowsFROM USER b straight_join test_log aWHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4GROUP BY DATE(practicetime);……..143 rows in set (2.56 sec)
7. In the first step of analyzing the execution plan: Using where; Using temporary; Using filesort, the user table can also use overwrite indexes to avoid the emergence of using where, so continue to adjust the index:
mysql> show index from user;+——-+————+——————+————–+————-+———–+————-+———-+——–+——+————+———+| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |+——-+————+——————+————–+————-+———–+————-+———-+——–+——+————+———+| user | 0 | PRIMARY | 1 | userid | A | 43412 | NULL | NULL | | BTREE | || user | 0 | ix_user_email | 1 | email | A | 43412 | NULL | NULL | | BTREE | || user | 1 | ix_user_username | 1 | username | A | 202 | NULL | NULL | | BTREE | |+——-+————+——————+————–+————-+———–+————-+———-+——–+——+————+———+3 rows in set (0.01 sec)mysql>alter table user drop index ix_user_username,add index ix_user_username(username,isfree);Query OK, 42722 rows affected (0.73 sec)Records: 42722 Duplicates: 0 Warnings: 0mysql>explain SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rowsFROM USER b straight_join test_log aWHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4GROUP BY DATE(practicetime);*************************** 1. row ***************************id: 1select_type: SIMPLEtable: btype: indexpossible_keys: PRIMARYkey: ix_user_usernamekey_len: 125ref: NULLrows: 42466Extra: Using where; Using index; Using temporary; Using filesort*************************** 2. row ***************************id: 1select_type: SIMPLEtable: atype: refpossible_keys: ix_test_log_useridkey: ix_test_log_useridkey_len: 96ref: user_app_test.b.useridrows: 38Extra: Using index2 rows in set (0.00 sec)
8. The execution time is reduced to 1.43 seconds:
mysql>SELECT DATE(practicetime) date_time,COUNT(DISTINCT a.userid) people_rowsFROM USER b straight_join test_log aWHERE a.userid=b.userid AND b.isfree=0 AND LENGTH(b.username)>4GROUP BY DATE(practicetime);。。。。。。。143 rows in set (1.43 sec)