一個複雜子查詢SQL最佳化，複雜子SQL最佳化

最後更新：2015-05-04 來源：互聯網

上載者：User

創建阿里雲帳戶，並獲得超過 40 款產品的免費試用版；而企業帳戶則可以享有總值 $1200 的免費試用版。立即註冊！

select *  from test.vmark vk where id in (select v.id                from usr_center.vmark_degree_update_log v, (select min(id) id                        from usr_center.vmark_degree_update_log                       where degree_update_cause = 0                         and degree_update_type = 0                       group by user_id) log               where v.id = log.id                 and v.degree_update_type = 0                 and v.degree_update_before between '2015-01-01 00:00:00' and                     '2015-01-10 00:00:00');
+----+--------------------+-------------------------+--------+----------------+---------+---------+------+----------+----------------------------------------------+| id | select_type        | table                   | type   | possible_keys  | key     | key_len | ref  | rows     | Extra                                        |+----+--------------------+-------------------------+--------+----------------+---------+---------+------+----------+----------------------------------------------+|  1 | PRIMARY            | vk                      | ALL    | NULL           | NULL    | NULL    | NULL | 66051213 | Using where                                  ||  2 | DEPENDENT SUBQUERY | <derived3>              | ALL    | NULL           | NULL    | NULL    | NULL |    82947 | Using where                                  ||  2 | DEPENDENT SUBQUERY | v                       | eq_ref | PRIMARY,idx_dd | PRIMARY | 4       | func |        1 | Using where                                  ||  3 | DERIVED            | vmar_degree_update_log | ref    | idx_dd         | idx_dd  | 1       |      | 12508106 | Using where; Using temporary; Using filesort |+----+--------------------+-------------------------+--------+----------------+---------+---------+------+----------+----------------------------------------------+4 rows in set (8.10 sec)

怎麼解讀這個執行計畫呢？1.首先執行id=3的標識為DERIVED的步驟，這個關鍵詞是衍生，出現在from後的子查詢會有這個標識。索引idx_dd(degree_update_type,degree_update_after)將被用於這一步，ref表名這是非唯一索引掃描，預計掃描12287942行。2.執行id=2的第一條計劃，也就是將第id=4的結果，group by成的82946行，對這個暫存資料表做一個全表掃描。這裡DEPENDENT SUBQUERY的意思是，這裡要掃描的行數或執行次數，取決於其他步驟，即依賴於第4步。這裡掃描82946行。3.得到log.id資訊後，根據v.id=log.id，去驅動v表，走的是primary，eq_ref也反映這是唯一索引掃描，從rows可以看出，每執行一次返回1行，執行多少次呢，執行82946次，因為是"DEPENDENT SUBQUERY"，所以執行次數或掃描行數依賴於第2步，也就是第2步每掃出一條，第3步就走一次索引。這裡我們也看出來了，MySQL的執行計畫並不是反映最終返回幾行，不是反映這步驟總共掃描幾行，也不告訴你執行多少次，而只是返回執行一次返回多少行。這裡掃描82946行。4.全表掃描test.vmar表，66050840行，然後Nest loop join之前步驟返回的結果集（實際有8萬多行）。這一步驟最耗時，讀66050840*80000行。這裡掃描行數成本是：12287942+82946*2+66050840*80000=7263409280000，7千億行，我假設1億行讀20分鐘，也需要2300多個小時，這個執行計畫很恐怖。結論：放在from的子查詢是非關聯子查詢，沒關係。但是放在where後的，卻要緊。
問題出在第4步，test.vmark表實際上id上有主鍵索引，我們如果能用前面三步的結果集，獲得id值，再去驅動test.vmark的id值，那麼就很容易得到想要的結果。
更改SQL如下，將in轉變成join。select *  from test.vmar vk join (select v.id                from usr_center.vmar_degree_update_log v,                     (select min(id) id                        from usr_center.vmar_degree_update_log                       where degree_update_cause = 0                         and degree_update_type = 0                       group by user_id) log               where v.id = log.id                 and v.degree_update_type = 0                 and v.degree_update_before between '2015-01-01 00:00:00' and                     '2015-01-10 00:00:00') child                  where vk.id = child.id;
+----+-------------+-------------------------+--------+----------------+---------+---------+----------+----------+----------------------------------------------+| id | select_type | table                   | type   | possible_keys  | key     | key_len | ref      | rows     | Extra                                        |+----+-------------+-------------------------+--------+----------------+---------+---------+----------+----------+----------------------------------------------+|  1 | PRIMARY     | <derived2>              | ALL    | NULL           | NULL    | NULL    | NULL     |    82371 |                                              ||  1 | PRIMARY     | vk                      | eq_ref | PRIMARY        | PRIMARY | 4       | child.id |        1 |                                              ||  2 | DERIVED     | <derived3>              | ALL    | NULL           | NULL    | NULL    | NULL     |    82948 |                                              ||  2 | DERIVED     | v                       | eq_ref | PRIMARY,idx_dd | PRIMARY | 4       | log.id   |        1 | Using where                                  ||  3 | DERIVED     |  vmar_degree_update_log | ref    | idx_dd         | idx_dd  | 1       |          | 11901368 | Using where; Using temporary; Using filesort |+----+-------------+-------------------------+--------+----------------+---------+---------+----------+----------+----------------------------------------------+
這個執行計畫的解釋，前面幾步，也就是對應這個執行計畫的第3,4,5行都比較像，就不再累述。得到的結果集有id值，第一行的執行計畫就讀這個結果集，有82371行，讀出的每一行取出id值去驅動test.vmark表的主鍵id。這裡讀取行數的成本是：11901368+82948*2+82371*2=48102006行。

本文章原先以中文撰寫並發佈於 aliyun.com，亦設英文版本，僅作資訊用途。本網站不對文章的準確性，完整性或可靠性或其任何翻譯作出任何明示或暗示的陳述或保證。如對該文章有任何疑慮或投訴，請傳送電郵至 info-contact@alibabacloud.com 並提供相關疑慮或投訴的詳細說明。職員會於 5 個工作天內與您聯絡，一經驗證之後，即會刪除該侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More