discuz熱帖翻頁的資料庫mysql最佳化

最後更新：2017-01-13 來源：互聯網

上載者：User

創建阿里雲帳戶，並獲得超過 40 款產品的免費試用版；而企業帳戶則可以享有總值 $1200 的免費試用版。立即註冊！

在我們的環境中，使用的是 MySQL-5.6.6 版本。

在查看文章並翻頁過程中，會產生類似下面這樣的SQL：

mysql> desc SELECT * FROM pre_forum_post WHERE
tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline DESC LIMIT 15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: ref
possible_keys: tid,displayorder,first
key: displayorder
key_len: 3
ref: const
rows: 593371
Extra: Using index condition; Using where; Using filesort

這個SQL執行的代價是：

-- 根據索引訪問行記錄次數，總體而言算是比較好的狀態
| Handler_read_key           | 16     |

-- 根據索引順序訪問下一行記錄的次數，通常是因為根據索引的範圍掃描，或者全索引掃描，總體而言也算是比較好的狀態
| Handler_read_next          | 329881 |

-- 按照一定順序讀取行記錄的總次數。如果需要對結果進行排序，該值通常會比較大。當發生全表掃描或者多表join無法使用索引時，該值也會比較大
| Handler_read_rnd           | 15     |

而當遇到熱帖需要往後翻很多頁時，例如：

mysql> desc SELECT * FROM pre_forum_post WHERE
tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860, 15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: ref
possible_keys: displayorder
key: displayorder
key_len: 3
ref: const
rows: 593371
Extra: Using where; Using filesort

這個SQL執行的代價則變成了（可以看到Handler_read_key、Handler_read_rnd大了很多）：

| Handler_read_key           | 129876 | -- 因為前面需要跳過很多行記錄
| Handler_read_next          | 329881 | -- 同上
| Handler_read_rnd           | 129875 | -- 因為需要先對很大一個結果集進行排序

可見，遇到熱帖時，這個SQL的代價會非常高。如果該熱帖被大量的訪問曆史回複，或者被搜素引擎一直反覆請求並且曆史回複頁時，很容易把資料庫伺服器直接壓垮。

小結：這個SQL不能利用 `displayorder` 索引排序的原因是，索引的第二個列 `invisible` 採用範圍查詢（RANGE），導致沒辦法繼續利用聯合索引完成對 `dateline` 欄位的排序需求（而如果是 WHERE tid =? AND invisible IN(?, ?) AND dateline =? 這種情況下是完全可以用到整個聯合索引的，注意下二者的區別）。

知道了這個原因，相應的最佳化解決辦法也就清晰了：
建立一個新的索引 idx_tid_dateline，它只包括 tid、dateline 兩個列即可（根據其他索引的統計資訊，item_type 和 item_id 的基數太低，所以沒包含在聯合索引中。當然了，也可以考慮一併加上）。

我們再來看下採用新的索引後的執行計畫：

mysql> desc SELECT * FROM pre_forum_post WHERE
tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: ref
possible_keys: tid,displayorder,first,idx_tid_dateline
key: idx_tid_dateline
key_len: 3
ref: const
rows: 703892
Extra: Using where

可以看到，之前存在的 Using filesort 消失了，可以通過索引直接完成排序了。

不過，如果該熱帖翻到較舊的曆史回複時，相應的SQL還是不能使用新的索引：

mysql> desc SELECT * FROM pre_forum_post WHERE
tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860,15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: ref
possible_keys: tid,displayorder,first,idx_tid_dateline
key: displayorder
key_len: 3
ref: const
rows: 593371
Extra: Using where; Using filesort

對比下如果建議最佳化器使用新索引的話，其執行計畫是怎樣的：

mysql> desc SELECT * FROM pre_forum_post use index(idx_tid_dateline) WHERE
tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860,15\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: pre_forum_post
type: ref
possible_keys: idx_tid_dateline
key: idx_tid_dateline
key_len: 3
ref: const
rows: 703892
Extra: Using where

可以看到，因為查詢最佳化工具認為後者需要掃描的行數遠比前者多了11萬多，因此認為前者效率更高。

事實上，在這個例子裡，排序的代價更高，因此我們要優先消除排序，所以應該強制使用新的索引，也就是採用後面的執行計畫，在相應的程式中指定索引。

最後，我們來看下熱帖翻到很老的曆史回複時，兩個執行計畫分別的profiling統計資訊對比：

1、採用舊索引（displayorder）：

mysql> SELECT * FROM pre_forum_post WHERE
tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860,15;

#查看profiling結果
| starting             | 0.020203 |
| checking permissions | 0.000026 |
| Opening tables       | 0.000036 |
| init                 | 0.000099 |
| System lock          | 0.000092 |
| optimizing           | 0.000038 |
| statistics           | 0.000123 |
| preparing            | 0.000043 |
| Sorting result       | 0.000025 |
| executing            | 0.000023 |
| Sending data         | 0.000045 |
| Creating sort index | 0.941434 |
| end                  | 0.000077 |
| query end            | 0.000044 |
| closing tables       | 0.000038 |
| freeing items        | 0.000056 |
| cleaning up          | 0.000040 |

2、如果是採用新索引（idx_tid_dateline）：

mysql> SELECT * FROM pre_forum_post use index(idx_tid_dateline) WHERE
tid=8201301 AND `invisible` IN('0','-2') ORDER BY dateline LIMIT 129860,15;

#對比查看profiling結果
| starting             | 0.000151 |
| checking permissions | 0.000033 |
| Opening tables       | 0.000040 |
| init                 | 0.000105 |
| System lock          | 0.000044 |
| optimizing           | 0.000038 |
| statistics           | 0.000188 |
| preparing            | 0.000044 |
| Sorting result       | 0.000024 |
| executing            | 0.000023 |
| Sending data         | 0.917035 |
| end                  | 0.000074 |
| query end            | 0.000030 |
| closing tables       | 0.000036 |
| freeing items        | 0.000049 |
| cleaning up          | 0.000032 |

可以看到，效率有了一定提高，不過不是很明顯，因為確實需要掃描的資料量更大，所以 Sending data 階段耗時更多。

然後可以將這個SQL改寫成下面這樣：

mysql> EXPLAIN SELECT * FROM pre_forum_post t1 INNER JOIN (
SELECT id FROM pre_forum_post use index(idx_tid_dateline) WHERE
tid=8201301 AND `invisible` IN('0','-2') ORDER BY
dateline LIMIT 129860,15) t2
USING (id)\G
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table:
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 129875
Extra: NULL
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: t1
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: t2.id
rows: 1
Extra: NULL
*************************** 3. row ***************************
id: 2
select_type: DERIVED
table: pre_forum_post
type: ref
possible_keys: idx_tid_dateline
key: idx_tid_dateline
key_len: 3
ref: const
rows: 703892
Extra: Using where

再看下這個SQL的 profiling 統計資訊：

| starting             | 0.000209 |
| checking permissions | 0.000026 |
| checking permissions | 0.000026 |
| Opening tables       | 0.000101 |
| init                 | 0.000062 |
| System lock          | 0.000049 |
| optimizing           | 0.000025 |
| optimizing           | 0.000037 |
| statistics           | 0.000106 |
| preparing            | 0.000059 |
| Sorting result       | 0.000039 |
| statistics           | 0.000048 |
| preparing            | 0.000032 |
| executing            | 0.000036 |
| Sending data         | 0.000045 |
| executing            | 0.000023 |
| Sending data         | 0.225356 |
| end                  | 0.000067 |
| query end            | 0.000028 |
| closing tables       | 0.000023 |
| removing tmp table   | 0.000029 |
| closing tables       | 0.000044 |
| freeing items        | 0.000048 |
| cleaning up          | 0.000037 |

可以看到，效率提升了1倍以上，還是挺不錯的。

最後說明下，這個問題只會在熱帖翻頁時才會出現，一般只有1,2頁回複的文章如果還採用原來的執行計畫，也沒什麼問題。

因此，建議discuz!官方修改或增加下新索引，並且在代碼中判斷是否熱帖翻頁，是的話，就強制使用新的索引，以避免效能問題。

本文章原先以中文撰寫並發佈於 aliyun.com，亦設英文版本，僅作資訊用途。本網站不對文章的準確性，完整性或可靠性或其任何翻譯作出任何明示或暗示的陳述或保證。如對該文章有任何疑慮或投訴，請傳送電郵至 info-contact@alibabacloud.com 並提供相關疑慮或投訴的詳細說明。職員會於 5 個工作天內與您聯絡，一經驗證之後，即會刪除該侵權內容。

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More

discuz熱帖翻頁的資料庫mysql最佳化

聯繫我們

A Free Trial That Lets You Build Big!

Sales Support

After-Sales Support