Written in the front: discuz! as a leading community system, for the majority of webmasters to provide a one-stop web site solutions, but also open source (although part of the code is encrypted), it for the vertical field of the industry has made a great contribution. However, discuz! system source code, or more or less some pits. One of the most famous is the default use of MyISAM engine, and based on MyISAM engine Rob building function, the session table using memory engine, can refer to the following several historical articles. This time we are going to talk about another problem with discuz! in coping with the logic of the pages of hot posts.
In our environment, we are using the MySQL-5.6.6 version.
When you view a post and turn the page, you have SQL that resembles the following:
Mysql> desc SELECT * from Pre_forum_post WHERE
tid=8201301 and ' Invisible ' "(' 0 ', '-2 ') Order BY dateline Desc T 15\g
*************************** 1 row ***************************
id:1
select_type:simple
Table:pre_forum_post
type:ref
possible_keys:tid,displayorder,first
key:displayorder
Key_len:3
ref:const
rows:593371
extra:using index condition; The Using where; Using Filesort
The cost of this SQL execution is:
--based on the number of index access line records, generally a better state
--the number of times the next line of records is accessed based on the index order, usually because of the range of the index scan, or the full index scan, which is generally a better state
| Handler_read_next | 329881 |
--Reads the total number of row records in a certain order. This value is usually larger if you need to sort the results. This value is also larger when a full table scan occurs or a multiple table join cannot use the index
And when a hot post needs to be flipped back a lot of pages, for example:
Mysql> desc SELECT * from Pre_forum_post WHERE
tid=8201301 and ' invisible ' (' 0 ', '-2 ') Order by Dateline LIMIT 129 860, 15\g
*************************** 1. Row ***************************
id:1
select_type:simple
table:pre_forum_post
type:ref
possible_keys:displayorder
key:displayorder
Key_len:3
ref:const
rows:593371
extra:using where; Using Filesort
The cost of this SQL execution becomes (you can see Handler_read_key, handler_read_rnd a lot):
| Handler_read_key | 129876 | --because you need to skip a lot of line records before
| Handler_read_next | 329881 | --ditto
| Handler_read_rnd | 129875 | --because you need to sort a large set of results first
It can be seen that the cost of this SQL is very high when hot posts are encountered. If the hot post is answered by a large amount of access history, or if the search engine has been repeatedly requesting and historically replying to pages, it is easy to overwhelm the database server directly.
Summary: The reason this SQL cannot take advantage of the ' Displayorder ' index is that the second column ' invisible ' in the index takes a range query (range), which leads to the inability to continue to use the federated index to complete the sort requirements for the ' Dateline ' field (and if where Tid =? and invisible in (?,?) and Dateline =? In this case, the entire federated index can be used entirely, noting the difference between the two.
Knowing the reason, the corresponding optimization solution is clear:
Create a new index Idx_tid_dateline, which includes only Tid, Dateline two columns (statistics from other indexes, Item_type and item_id are too low, so they are not included in the Federated Index.) Of course, you may also consider adding them together.
Let's take a look at the implementation plan following the new index:
Mysql> desc SELECT * from Pre_forum_post WHERE
tid=8201301 and ' Invisible ' "(' 0 ', '-2 ') Order by Dateline LIMIT G
*************************** 1. Row ***************************
id:1
select_type:simple
table: Pre_forum_post
type:ref
possible_keys:tid,displayorder,first,idx_tid_dateline
key:idx_tid_dateline
key_len:3
ref:const
rows:703892
extra:using where
As you can see, the previously existing Using Filesort disappears and can be sorted directly through the index.
However, if the hot post turns to an older historical reply, the corresponding SQL will not be able to use the new index:
Mysql> desc SELECT * from Pre_forum_post WHERE
tid=8201301 and ' invisible ' (' 0 ', '-2 ') Order by Dateline LIMIT 129 860,15\g
*************************** 1 row ***************************
id:1
select_type:simple
Table:pre_forum_post
type:ref
possible_keys:tid,displayorder,first,idx_tid_dateline
key: Displayorder
key_len:3
ref:const
rows:593371
extra:using where; Using Filesort
Compare the implementation plan of the optimizer if it is recommended to use the new index:
Mysql> desc SELECT * FROM pre_forum_post Use index (idx_tid_dateline) WHERE
tid=8201301 and ' invisible ' in (' 0 ', '-2 ') Order by Dateline LIMIT 129860,15\g
*************************** 1. Row ***************************
id:1< C18/>select_type:simple
table:pre_forum_post
type:ref
possible_keys:idx_tid_dateline
key: Idx_tid_dateline
key_len:3
ref:const
rows:703892
extra:using where
As you can see, because the query optimizer believes that the latter needs to scan more than 110,000 rows, the former is more efficient.
In fact, in this case, the ordering is more expensive, so we have to prioritize the elimination of the order, so we should force the use of the new index, which is the following execution plan, in the appropriate program to specify the index.
Finally, when the hot post turns to the very old historical reply, two execution plans are compared to the profiling statistics:
1, using the old Index (displayorder):
Mysql> SELECT * from Pre_forum_post WHERE
tid=8201301 and ' invisible ' (' 0 ', '-2 ') Order by Dateline LIMIT 5;
#查看profiling结果 |
starting | 0.020203 |
| checking permissions | 0.000026 |
| Opening Tables | 0.000036 |
| | 0.000099 | |
System lock | 0.000092 |
| optimizing | 0.000038 |
| statistics | 0.000123 |
| preparing< c17/>| 0.000043 |
| Sorting result | 0.000025 |
| executing | 0.000023 |
| Sending Data | 0.000045 |
| Creating Sort Index | 0.941434 |
| End | 0.000077 |
| | | 0.000044 |
| closing tables | 0.000038 |
| freeing items
| 0.000056 |
| Cleaning up | 0.000040 |
2. If the new index is adopted (idx_tid_dateline):
Mysql> SELECT * from pre_forum_post Use index (idx_tid_dateline) WHERE
tid=8201301 and ' invisible ' in (' 0 ', '-2 ') ORD ER by Dateline LIMIT 129860,15;
#对比查看profiling结果 |
starting | 0.000151 |
| checking permissions | 0.000033 |
| Opening Tables | 0.000040 |
| | 0.000105 | |
System lock | 0.000044 |
| optimizing | 0.000038 |
| statistics | 0.000188 |
| preparing< c17/>| 0.000044 |
| Sorting result | 0.000024 |
| executing | 0.000023 |
| Sending Data | 0.917035 |
| | end | 0.000074 | | | |
closing Tables | 0.000036 |
| Freeing Items | 0.000049 |
| | cleaning up | 0.000032 |
As you can see, there's a certain increase in efficiency, but it's not obvious, because the amount of data you really need to scan is greater, so the sending data phase takes more time.
At this point, we can refer to an earlier optimization scenario: [MySQL Optimization case] Series-page optimization
You can then rewrite the SQL to do the following:
Mysql> EXPLAIN SELECT * from Pre_forum_post T1 INNER JOIN (
select ID to pre_forum_post use index (Idx_tid_datelin e) WHERE
tid=8201301 and ' invisible ' in (' 0 ', '-2 ') Order by
Dateline LIMIT 129860,15) T2
USING (ID) \g
* * * 1. Row ***************************
id:1
select_type:primary
table:
type:all
Possible_keys: NULL
key:null
key_len:null
ref:null
rows:129875
extra:null
********************* 2. Row ***************************
id:1
select_type:primary
table:t1
type:eq_ref
possible _keys:primary
key:primary
key_len:4
ref:t2.id
rows:1
extra:null 3. Row ***************************
id:2
select_type:derived
table:pre_forum_post
type:ref
possible_keys:idx_tid_dateline
key:idx_tid_dateline
key_len:3
ref:const
rows:703892
extra:using where
Take a look at this SQL profiling statistic:
| Starting | 0.000209 |
| checking permissions | 0.000026 |
| checking permissions | 0.000026 |
| Opening Tables | 0.000101 |
| | 0.000062 | |
System lock | 0.000049 |
| optimizing | 0.000025 |
| optimizing | 0.000037 |
| statistics< c15/>| 0.000106 |
| Preparing | 0.000059 |
| Sorting result | 0.000039 |
| statistics | 0.000048 |
Preparing | 0.000032 |
| executing | 0.000036 |
| Sending Data | 0.000045 |
| executing | 0.000023 |
| Sending Data | 0.225356 |
| | end | 0.000067 | | | |
closing Tables | 0.000023 |
| removing TMP table | 0.000029 |
| Closing Tables | 0.000044 |
| freeing items | 0.000048 |
| cleaning up | 0.000037 |
As you can see, efficiency has increased by more than 1 time times, or it's pretty good.
Finally, the problem will only appear when the hot post page, generally only 1, 2 pages reply to the Post if the original implementation plan, also no problem.
Therefore, it is recommended that the discuz! official modify or add a new index, and in the code to determine whether the hot post page, is the case, forcing the use of new indexes to avoid performance problems.