Optimize discuz! 's hot post-flip technique via Mysql _mysql

Source: Internet
Author: User

Written in the front: discuz! as a leading community system, for the majority of webmasters to provide a one-stop web site solutions, but also open source (although part of the code is encrypted), it for the vertical field of the industry has made a great contribution. However, discuz! system source code, or more or less some pits. One of the most famous is the default use of MyISAM engine, and based on MyISAM engine Rob building function, the session table using memory engine, can refer to the following several historical articles. This time we are going to talk about another problem with discuz! in coping with the logic of the pages of hot posts.

In our environment, we are using the MySQL-5.6.6 version.

When you view a post and turn the page, you have SQL that resembles the following:

Mysql> desc SELECT * from Pre_forum_post WHERE
 tid=8201301 and ' Invisible ' "(' 0 ', '-2 ') Order BY dateline Desc T 15\g
 *************************** 1 row ***************************
 id:1
 select_type:simple
 Table:pre_forum_post
 type:ref
 possible_keys:tid,displayorder,first
 key:displayorder
 Key_len:3
 ref:const
 rows:593371
 extra:using index condition; The Using where; Using Filesort

The cost of this SQL execution is:

--based on the number of index access line records, generally a better state

| Handler_read_key |  

--the number of times the next line of records is accessed based on the index order, usually because of the range of the index scan, or the full index scan, which is generally a better state

| Handler_read_next   | 329881 |

--Reads the total number of row records in a certain order. This value is usually larger if you need to sort the results. This value is also larger when a full table scan occurs or a multiple table join cannot use the index

| Handler_read_rnd |  

And when a hot post needs to be flipped back a lot of pages, for example:

Mysql> desc SELECT * from Pre_forum_post WHERE
 tid=8201301 and ' invisible ' (' 0 ', '-2 ') Order by Dateline LIMIT 129 860, 15\g
 *************************** 1. Row ***************************
 id:1
 select_type:simple
 table:pre_forum_post
 type:ref
 possible_keys:displayorder
 key:displayorder
 Key_len:3
 ref:const
 rows:593371
 extra:using where; Using Filesort

The cost of this SQL execution becomes (you can see Handler_read_key, handler_read_rnd a lot):

| Handler_read_key | 129876 | --because you need to skip a lot of line records before
| Handler_read_next | 329881 | --ditto
| Handler_read_rnd | 129875 | --because you need to sort a large set of results first

It can be seen that the cost of this SQL is very high when hot posts are encountered. If the hot post is answered by a large amount of access history, or if the search engine has been repeatedly requesting and historically replying to pages, it is easy to overwhelm the database server directly.

Summary: The reason this SQL cannot take advantage of the ' Displayorder ' index is that the second column ' invisible ' in the index takes a range query (range), which leads to the inability to continue to use the federated index to complete the sort requirements for the ' Dateline ' field (and if where Tid =? and invisible in (?,?) and Dateline =? In this case, the entire federated index can be used entirely, noting the difference between the two.

Knowing the reason, the corresponding optimization solution is clear:
Create a new index Idx_tid_dateline, which includes only Tid, Dateline two columns (statistics from other indexes, Item_type and item_id are too low, so they are not included in the Federated Index.) Of course, you may also consider adding them together.

Let's take a look at the implementation plan following the new index:

Mysql> desc SELECT * from Pre_forum_post WHERE
 tid=8201301 and ' Invisible ' "(' 0 ', '-2 ') Order by Dateline LIMIT G
 *************************** 1. Row ***************************
 id:1
 select_type:simple
 table: Pre_forum_post
 type:ref
 possible_keys:tid,displayorder,first,idx_tid_dateline
 key:idx_tid_dateline
 key_len:3
 ref:const
 rows:703892
 extra:using where

As you can see, the previously existing Using Filesort disappears and can be sorted directly through the index.

However, if the hot post turns to an older historical reply, the corresponding SQL will not be able to use the new index:

Mysql> desc SELECT * from Pre_forum_post WHERE
 tid=8201301 and ' invisible ' (' 0 ', '-2 ') Order by Dateline LIMIT 129 860,15\g
 *************************** 1 row ***************************
 id:1
 select_type:simple
 Table:pre_forum_post
 type:ref
 possible_keys:tid,displayorder,first,idx_tid_dateline
 key: Displayorder
 key_len:3
 ref:const
 rows:593371
 extra:using where; Using Filesort

Compare the implementation plan of the optimizer if it is recommended to use the new index:

Mysql> desc SELECT * FROM pre_forum_post Use index (idx_tid_dateline) WHERE
 tid=8201301 and ' invisible ' in (' 0 ', '-2 ') Order by Dateline LIMIT 129860,15\g
 *************************** 1. Row ***************************
 id:1< C18/>select_type:simple
 table:pre_forum_post
 type:ref
 possible_keys:idx_tid_dateline
 key: Idx_tid_dateline
 key_len:3
 ref:const
 rows:703892
 extra:using where

As you can see, because the query optimizer believes that the latter needs to scan more than 110,000 rows, the former is more efficient.

In fact, in this case, the ordering is more expensive, so we have to prioritize the elimination of the order, so we should force the use of the new index, which is the following execution plan, in the appropriate program to specify the index.

Finally, when the hot post turns to the very old historical reply, two execution plans are compared to the profiling statistics:

1, using the old Index (displayorder):

Mysql> SELECT * from Pre_forum_post WHERE
 tid=8201301 and ' invisible ' (' 0 ', '-2 ') Order by Dateline LIMIT 5;

#查看profiling结果 |
 starting |    0.020203 |
 | checking permissions | 0.000026 |
 | Opening Tables  | 0.000036 |
 | |     0.000099 | |
 System lock   | 0.000092 |
 | optimizing |   0.000038 |
 | statistics   | 0.000123 |
 | preparing< c17/>| 0.000043 |
 | Sorting result  | 0.000025 |
 | executing   | 0.000023 |
 | Sending Data   | 0.000045 |
 | Creating Sort Index | 0.941434 |
 | End     | 0.000077 |
 | | |   0.000044 |
 | closing tables  | 0.000038 |
 | freeing items
   
    | 0.000056 |
 | Cleaning up   | 0.000040 |


   

2. If the new index is adopted (idx_tid_dateline):

Mysql> SELECT * from pre_forum_post Use index (idx_tid_dateline) WHERE
 tid=8201301 and ' invisible ' in (' 0 ', '-2 ') ORD ER by Dateline LIMIT 129860,15;

#对比查看profiling结果 |
 starting |    0.000151 |
 | checking permissions | 0.000033 |
 | Opening Tables  | 0.000040 |
 | |     0.000105 | |
 System lock   | 0.000044 |
 | optimizing |   0.000038 |
 | statistics   | 0.000188 |
 | preparing< c17/>| 0.000044 |
 | Sorting result  | 0.000024 |
 | executing   | 0.000023 |
 | Sending Data   | 0.917035 |
 | | end     | 0.000074 | | | |
 closing Tables  | 0.000036 |
 | Freeing Items  | 0.000049 |
 | | cleaning up   | 0.000032 |

As you can see, there's a certain increase in efficiency, but it's not obvious, because the amount of data you really need to scan is greater, so the sending data phase takes more time.

At this point, we can refer to an earlier optimization scenario: [MySQL Optimization case] Series-page optimization

You can then rewrite the SQL to do the following:

Mysql> EXPLAIN SELECT * from Pre_forum_post T1 INNER JOIN (
 select ID to pre_forum_post use index (Idx_tid_datelin e) WHERE
 tid=8201301 and ' invisible ' in (' 0 ', '-2 ') Order by
 Dateline LIMIT 129860,15) T2
 USING (ID) \g
 * * * 1. Row ***************************
 id:1
 select_type:primary
 table: 
 type:all
 Possible_keys: NULL
 key:null
 key_len:null
 ref:null
 rows:129875
 extra:null
 ********************* 2. Row ***************************
 id:1
 select_type:primary
 table:t1
 type:eq_ref
 possible _keys:primary
 key:primary
 key_len:4
 ref:t2.id
 rows:1
 extra:null 3. Row ***************************
 id:2
 select_type:derived
 table:pre_forum_post
 type:ref
 possible_keys:idx_tid_dateline
 key:idx_tid_dateline
 key_len:3
 ref:const
 rows:703892
 extra:using where

Take a look at this SQL profiling statistic:

| Starting    | 0.000209 |
| checking permissions | 0.000026 |
| checking permissions | 0.000026 |
| Opening Tables  | 0.000101 |
| |     0.000062 | |
System lock   | 0.000049 |
| optimizing |   0.000025 |
| optimizing   | 0.000037 |
| statistics< c15/>| 0.000106 |
| Preparing   | 0.000059 |
| Sorting result  | 0.000039 |
| statistics |   0.000048 |
Preparing | 0.000032 |
| executing   | 0.000036 |
| Sending Data   | 0.000045 |
| executing   | 0.000023 |
| Sending Data   | 0.225356 |
| | end     | 0.000067 | | | |
closing Tables  | 0.000023 |
| removing TMP table | 0.000029 |
| Closing Tables  | 0.000044 |
| freeing items  | 0.000048 |
| cleaning up   | 0.000037 |

As you can see, efficiency has increased by more than 1 time times, or it's pretty good.

Finally, the problem will only appear when the hot post page, generally only 1, 2 pages reply to the Post if the original implementation plan, also no problem.

Therefore, it is recommended that the discuz! official modify or add a new index, and in the code to determine whether the hot post page, is the case, forcing the use of new indexes to avoid performance problems.

Related Article

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.