Some optimizations that MySQL can make when the association is complex.

Last Update:2015-05-10 Source: Internet

Author: User

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

Yesterday, an optimization of complex associated SQL statements was processed. The optimization of such SQL statements usually takes into account the following four points:

1. The result set returned by the query. Generally, there are few returned result sets in the query, and they are confident to be optimized;

2. The selection of the driver table is crucial. by viewing the execution plan, you can see the driver table selected by the optimizer. the rows in the execution plan can roughly reflect the problem;

3. Sort out the associations between tables and check whether appropriate indexes exist in the associated fields;

4. Use the straight_join keyword to strengthen the Association Sequence between tabulation, so that we can easily verify some conjecture;

SQL:
Execution time:

mysql> select c.yh_id,-> c.yh_dm,-> c.yh_mc,-> c.mm,-> c.yh_lx,-> a.jg_id,-> a.jg_dm,-> a.jg_mc,-> a.jgxz_dm,-> d.js_dm yh_js-> from a, b, c-> left join d on d.yh_id = c.yh_id-> where a.jg_id = b.jg_id-> and b.yh_id = c.yh_id-> and a.yx_bj = ‘Y'-> and c.sc_bj = ‘N'-> and c.yx_bj = ‘Y'-> and c.sc_bj = ‘N'-> and c.yh_dm = '006939748XX' ;1 row in set (0.75 sec)

In this SQL query, only one row of data is returned, but the execution takes 750 ms. view the execution plan:

mysql> explain-> select c.yh_id,-> c.yh_dm,-> c.yh_mc,-> c.mm,-> c.yh_lx,-> a.jg_id,-> a.jg_dm,-> a.jg_mc,-> a.jgxz_dm,-> d.js_dm yh_js-> from a, b, c-> left join d on d.yh_id = c.yh_id-> where a.jg_id = b.jg_id-> and b.yh_id = c.yh_id-> and a.yx_bj = ‘Y'-> and c.sc_bj = ‘N'-> and c.yx_bj = ‘Y'-> and c.sc_bj = ‘N'-> and c.yh_dm = '006939748XX' ;+—-+————-+——-+——–+——————+———+———+————–+——-+————-+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+—-+————-+——-+——–+——————+———+———+————–+——-+————-+| 1 | SIMPLE | a | ALL | PRIMARY,INDEX_JG | NULL | NULL | NULL | 52616 | Using where || 1 | SIMPLE | b | ref | PRIMARY | PRIMARY | 98 | test.a.JG_ID | 1 | Using index || 1 | SIMPLE | c | eq_ref | PRIMARY | PRIMARY | 98 | test.b.YH_ID | 1 | Using where || 1 | SIMPLE | d | index | NULL | PRIMARY | 196 | NULL | 54584 | Using index |+—-+————-+——-+——–+——————+———+———+————–+——-+————-+

There are two prominent performance bottlenecks in the execution plan:

| 1 | SIMPLE | a | ALL | PRIMARY,INDEX_JG | NULL | NULL | NULL | 52616 | Using where || 1 | SIMPLE | d | index | NULL | PRIMARY | 196 | NULL | 54584 | Using index |

Because d is a left join table, the drive table does not select d. Let's take a look at the size of Table a, B, and Table c:

mysql> select count(*) from c;+———-+| count(*) |+———-+| 53731 |+———-+mysql> select count(*) from a;+———-+| count(*) |+———-+| 53335 |+———-+mysql> select count(*) from b;+———-+| count(*) |+———-+| 105809 |+———-+

Because table B has a larger data volume than the other two tables, and table B has no query and filtering conditions, B may be excluded from the drive table;

The optimizer actually chooses Table a as the driving table, but why not table c as the driving table? Let's analyze:

Stage 1: Table a as the driving table
A-> B-> c-> d:
(1): a. jg_id = B. jg_id-> (B Index: PRIMARY KEY ('jg _ id', 'yh _ id '))

(2): B. yh_id = c. yh_id-> (c Index: PRIMARY KEY ('yh _ id '))

(3): c. yh_id = d. yh_id-> (d index: primary key ('js _ dm', 'yh _ id '))
Because there is no yh_id index on Table d, the index is added to table d:

alter table d add index ind_yh_id(yh_id);

Execution Plan:

+—-+————-+——-+——–+——————+———–+———+————–+——-+————-+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+—-+————-+——-+——–+——————+———–+———+————–+——-+————-+| 1 | SIMPLE | a | ALL | PRIMARY,INDEX_JG | NULL | NULL | NULL | 52616 | Using where || 1 | SIMPLE | b | ref | PRIMARY | PRIMARY | 98 | test.a.JG_ID | 1 | Using index || 1 | SIMPLE | c | eq_ref | PRIMARY | PRIMARY | 98 | test.b.YH_ID | 1 | Using where || 1 | SIMPLE | d | ref | ind_yh_id | ind_yh_id | 98 | test.b.YH_ID | 272 | Using index |+—-+————-+——-+——–+——————+———–+———+————–+——-+————-+

Execution time:

1 row in set (0.77 sec)

After adding an index to Table d, the number of rows scanned in table d dropped to 272 (initially: 54584)

| 1 | SIMPLE | d | ref | ind_yh_id | ind_yh_id | 98 | test.b.YH_ID | 272 | Using index |

Stage 2: Drive Table c

D
^
|
C-> B->
Because yh_dm has high filtering conditions on Table c, we create an index on yh_dm:

mysql> select count(*) from c where yh_dm = '006939748XX';+———-+| count(*) |+———-+| 2 |+———-+

Add an index:

alter table c add index ind_yh_dm(yh_dm)

View the execution plan:

+—-+————-+——-+——–+——————-+———–+———+————–+——-+————-+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+—-+————-+——-+——–+——————-+———–+———+————–+——-+————-+| 1 | SIMPLE | a | ALL | PRIMARY,INDEX_JG | NULL | NULL | NULL | 52616 | Using where || 1 | SIMPLE | b | ref | PRIMARY | PRIMARY | 98 | test.a.JG_ID | 1 | Using index || 1 | SIMPLE | c | eq_ref | PRIMARY,ind_yh_dm | PRIMARY | 98 | test.b.YH_ID | 1 | Using where || 1 | SIMPLE | d | ref | ind_yh_id | ind_yh_id | 98 | test.b.YH_ID | 272 | Using index |+—-+————-+——-+——–+——————-+———–+———+————–+——-+————-+

Execution time:

1 row in set (0.74 sec)

After the index is added to Table c, the index is still not taken up, and the execution plan is still using Table a as the driving table. So let's analyze why Table a is used as the driving table?

1): c. yh_id = B. yh_id-> (PRIMARY KEY ('jg _ id', 'yh _ id '))

A. if table c is used as the driving table, table c is associated with table B because table B does not have the index of the yh_id field, because table B has a large amount of data, therefore, the optimizer considers that if Table c is used as the driving table, it will have a large association with table B (here we can use straight_join to force Table c as the driving table );
B. if table a is the driving table, table a is associated with table B because of the index of the jg_id field in table B, therefore, the optimizer considers that the cost of using a as the driving table is less than that of using c as the driving board;
Therefore, if we want to use table C as the driving table, we only need to add the yh_id index on Table B:

alter table b add index ind_yh_id(yh_id);

2): B. jg_id = a. jg_id-> (PRIMARY KEY ('jg _ id '))

3): c. yh_id = d. yh_id-> (KEY 'ind _ yh_id '('yh _ id '))
Execution Plan:

+—-+————-+——-+——–+——————-+———–+———+————–+——+————-+| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |+—-+————-+——-+——–+——————-+———–+———+————–+——+————-+| 1 | SIMPLE | c | ref | PRIMARY,ind_yh_dm | ind_yh_dm | 57 | const | 2 | Using where || 1 | SIMPLE | d | ref | ind_yh_id | ind_yh_id | 98 | test.c.YH_ID | 272 | Using index || 1 | SIMPLE | b | ref | PRIMARY,ind_yh_id | ind_yh_id | 98 | test.c.YH_ID | 531 | Using index || 1 | SIMPLE | a | eq_ref | PRIMARY,INDEX_JG | PRIMARY | 98 | test.b.JG_ID | 1 | Using where |+—-+————-+——-+——–+——————-+———–+———+————–+——+————-+

Execution time:

1 row in set (0.00 sec)

We can see that the rows in the execution plan has been greatly reduced, and the execution time has also been reduced from ms to 0 ms;

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

Some optimizations that MySQL can make when the association is complex.

Contact Us

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support