An interesting phenomenon: innodb_io_capacity

Source: Internet
Author: User


Some time ago, a customer in our company was running a sysbench stress test and hit an odd phenomenon during the run, shown below.

The customer's sysbench output was as follows:

[1310s] threads: 600, tps: 2176.70, reads: 1087.10, writes: 1089.60, response time: 1076.07ms (95%), errors: 0.00, reconnects:  0.00
[1320s] threads: 600, tps: 2292.10, reads: 1144.30, writes: 1147.80, response time: 805.14ms (95%), errors: 0.00, reconnects:  0.00
[1330s] threads: 600, tps: 2205.90, reads: 1103.30, writes: 1102.60, response time: 969.33ms (95%), errors: 0.00, reconnects:  0.00
[1340s] threads: 600, tps: 2038.20, reads: 1015.80, writes: 1022.40, response time: 920.41ms (95%), errors: 0.00, reconnects:  0.00
[1350s] threads: 600, tps: 2002.90, reads: 998.90, writes: 1004.00, response time: 1096.88ms (95%), errors: 0.00, reconnects:  0.00
[1360s] threads: 600, tps: 2002.90, reads: 1000.10, writes: 1002.80, response time: 1108.77ms (95%), errors: 0.00, reconnects:  0.00
[1370s] threads: 600, tps: 2114.90, reads: 1057.60, writes: 1057.30, response time: 930.94ms (95%), errors: 0.00, reconnects:  0.00
[1380s] threads: 600, tps: 2073.30, reads: 1033.90, writes: 1039.40, response time: 967.59ms (95%), errors: 0.00, reconnects:  0.00
[1390s] threads: 600, tps: 2314.09, reads: 1153.99, writes: 1160.09, response time: 1016.58ms (95%), errors: 0.00, reconnects:  0.00
[1400s] threads: 600, tps: 1850.91, reads: 924.21, writes: 926.71, response time: 1543.45ms (95%), errors: 0.00, reconnects:  0.00
[1410s] threads: 600, tps: 2493.41, reads: 1243.81, writes: 1249.91, response time: 1124.14ms (95%), errors: 0.00, reconnects:  0.00
[1420s] threads: 600, tps: 1628.29, reads: 815.40, writes: 812.60, response time: 1302.12ms (95%), errors: 0.00, reconnects:  0.00
[1430s] threads: 600, tps: 1737.90, reads: 865.30, writes: 872.60, response time: 1128.86ms (95%), errors: 0.00, reconnects:  0.00
[1440s] threads: 600, tps: 1576.90, reads: 787.60, writes: 789.30, response time: 1375.44ms (95%), errors: 0.00, reconnects:  0.00
[1450s] threads: 600, tps: 1773.60, reads: 884.00, writes: 889.60, response time: 1374.20ms (95%), errors: 0.00, reconnects:  0.00
[1460s] threads: 600, tps: 1845.71, reads: 922.71, writes: 923.01, response time: 1252.42ms (95%), errors: 0.00, reconnects:  0.00
[1470s] threads: 600, tps: 2229.28, reads: 1111.89, writes: 1117.39, response time: 1001.47ms (95%), errors: 0.00, reconnects:  0.00
[1480s] threads: 600, tps: 2510.32, reads: 1254.31, writes: 1256.71, response time: 918.75ms (95%), errors: 0.00, reconnects:  0.00
[1490s] threads: 600, tps: 1908.09, reads: 951.79, writes: 955.59, response time: 1148.29ms (95%), errors: 0.00, reconnects:  0.00
[1500s] threads: 600, tps: 2327.93, reads: 1161.71, writes: 1166.41, response time: 1395.34ms (95%), errors: 0.00, reconnects:  0.00
[1510s] threads: 600, tps: 2329.08, reads: 1162.89, writes: 1165.99, response time: 988.08ms (95%), errors: 0.00, reconnects:  0.00
[1520s] threads: 600, tps: 2036.43, reads: 1017.81, writes: 1018.61, response time: 938.21ms (95%), errors: 0.00, reconnects:  0.00
[1530s] threads: 600, tps: 787.59, reads: 393.19, writes: 394.39, response time: 1060.72ms (95%), errors: 0.00, reconnects:  0.00
[1540s] threads: 600, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects:  0.00
[2120s] threads: 600, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects:  0.00
[2130s] threads: 600, tps: 0.00, reads: 0.00, writes: 0.00, response time: 0.00ms (95%), errors: 0.00, reconnects:  0.00
[2140s] threads: 600, tps: 219.00, reads: 108.30, writes: 110.70, response time: 615414.74ms (95%), errors: 0.00, reconnects:  0.00
[2150s] threads: 600, tps: 2046.80, reads: 1023.90, writes: 1022.90, response time: 1158.65ms (95%), errors: 0.00, reconnects:  0.00
[2160s] threads: 600, tps: 2560.12, reads: 1275.81, writes: 1284.31, response time: 854.55ms (95%), errors: 0.00, reconnects:  0.00
[2170s] threads: 600, tps: 3093.08, reads: 1542.49, writes: 1550.59, response time: 783.97ms (95%), errors: 0.00, reconnects:  0.00
[2180s] threads: 600, tps: 3234.00, reads: 1616.00, writes: 1618.00, response time: 698.42ms (95%), errors: 0.00, reconnects:  0.00
[2190s] threads: 600, tps: 3709.84, reads: 1851.62, writes: 1858.62, response time: 772.09ms (95%), errors: 0.00, reconnects:  0.00
[2200s] threads: 600, tps: 3492.39, reads: 1741.19, writes: 1750.79, response time: 762.67ms (95%), errors: 0.00, reconnects:  0.00
[2210s] threads: 600, tps: 3282.96, reads: 1639.88, writes: 1643.08, response time: 889.00ms (95%), errors: 0.00, reconnects:  0.00
[2220s] threads: 600, tps: 3922.43, reads: 1958.12, writes: 1964.32, response time: 690.32ms (95%), errors: 0.00, reconnects:  0.00
[2230s] threads: 600, tps: 3949.69, reads: 1972.60, writes: 1977.10, response time: 836.58ms (95%), errors: 0.00, reconnects:  0.00
[2240s] threads: 600, tps: 4091.38, reads: 2042.09, writes: 2049.29, response time: 617.39ms (95%), errors: 0.00, reconnects:  0.00


In the middle of the run, TPS drops to zero (roughly from 1540s to 2130s in the output above). Why? Too many dirty pages had accumulated, and MySQL had to flush them to disk before it could continue accepting writes.

To understand the relationship between dirty pages and redo logs, see http://blog.csdn.net/yaoqinglin/article/details/46646267

When dirty pages are flushed more slowly than transactions commit, they accumulate. Once too many have piled up, MySQL's protection mechanism kicks in: write operations are stopped and InnoDB does nothing but flush to disk until it decides the backlog is back under control.
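A simple way to watch the dirty pages pile up while the test is running is to poll the stock InnoDB status counters every few seconds (a minimal sketch using standard MySQL commands, nothing specific to this setup):

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_dirty';
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total';
-- dirty / total is the dirty-page ratio that innodb_max_dirty_pages_pct refers to

If the dirty ratio keeps climbing toward innodb_max_dirty_pages_pct during the run, flushing is falling behind the commit rate.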


The configuration file is as follows:

innodb_log_file_size = 1000M
innodb_log_files_in_group = 4
innodb_max_dirty_pages_pct = 75
innodb_io_capacity = 200
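The same values can be confirmed on the running instance (a hypothetical check using the standard SHOW VARIABLES syntax):

SHOW GLOBAL VARIABLES WHERE Variable_name IN
  ('innodb_log_file_size', 'innodb_log_files_in_group',
   'innodb_max_dirty_pages_pct', 'innodb_io_capacity');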

Now that the cause has been found, how can the problem be solved?

In my opinion, the most important thing is to reduce innodb_io_capacity.


Principle Analysis:

First, look at how InnoDB behaves as the redo log fills up:


When the un-checkpointed redo (the redo log space whose corresponding dirty pages have not yet been flushed) exceeds roughly 75% of the total redo log size, MySQL asynchronously flushes those dirty pages to disk. Transaction commits and redo log writes are not stopped.

When the un-checkpointed redo reaches roughly 90% of the redo log, MySQL stops all write operations and flushes dirty pages until enough redo space has been reclaimed.
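Taking the author's 75%/90% figures together with the configuration above gives a rough picture of where those points fall for this server (a back-of-the-envelope sketch only; the exact internal thresholds depend on the MySQL version):

-- total redo capacity = innodb_log_file_size * innodb_log_files_in_group
SELECT 1000 * 4        AS total_redo_mb,   -- 4000 MB of redo log in total
       1000 * 4 * 0.75 AS async_flush_mb,  -- ~3000 MB: background flushing of dirty pages starts
       1000 * 4 * 0.90 AS sync_flush_mb;   -- ~3600 MB: writes stall until the checkpoint advances

The amount of un-checkpointed redo at any moment (the checkpoint age) can be read from the LOG section of SHOW ENGINE INNODB STATUS as "Log sequence number" minus "Last checkpoint at".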

The immediate reason is that flushing could not keep up with the transaction commit rate. However, monitoring showed that disk I/O was nowhere near saturated when the stall occurred. So why didn't MySQL use that idle disk I/O to start flushing earlier and relieve the pressure before the problem hit?


The reason is that MySQL controls the whole flushing process with an adaptive flushing mechanism (innodb_adaptive_flushing): it combines innodb_io_capacity, innodb_max_dirty_pages_pct, and the redo log size to decide when to start flushing dirty pages and how aggressively. So how can this problem be solved? In general, MySQL relies on innodb_io_capacity to judge how much flushing the disk can absorb. If innodb_io_capacity is set too high, MySQL overestimates the disk's capability, flushes too lazily, dirty pages pile up, and the stall described in this article appears. If it is set too low, MySQL underestimates the disk, and the flushing rate limits the number of transactions (TPS) the database can commit per unit of time.
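One way to see this in practice is to sample how many pages InnoDB actually flushes per interval and compare that with what the disk could sustain (a hedged sketch using a standard status counter):

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_flushed';
-- sample the counter twice, e.g. 10 seconds apart; the difference is the number of
-- pages flushed in that window, a rate the adaptive algorithm scales around innodb_io_capacity

If that rate stays low while the disk is mostly idle, the flushing pace, not the hardware, is the bottleneck.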


Our server uses 7200 RPM disks, which are fairly low-end. Following the MySQL documentation's recommendation, innodb_io_capacity should be lowered to 100.

[root@t1 bin]# ./sg_vpd /dev/sda --page=0xb1
Block device characteristics VPD page (SBC):
  Nominal rotation rate: 7200 rpm
  Product type: Not specified
  WABEREQ=0
  WACEREQ=0
  Nominal form factor: 3.5 inch
  HAW_ZBC=0
  FUAB=0
  VBULS=0

Official suggestions:

For systems with individual 5400 RPM or 7200 RPM drives, you might lower the value to the former default of 100.
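innodb_io_capacity is a dynamic variable, so the change can be tried at runtime first and then persisted in the configuration file (a minimal sketch; the value 100 is simply the official suggestion quoted above):

SET GLOBAL innodb_io_capacity = 100;
-- then add innodb_io_capacity = 100 to my.cnf so the setting survives a restart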

After the modification we re-ran the test, and the problem no longer appeared. A clean fix.

In MySQL 5.6 you can also set an appropriate innodb_max_dirty_pages_pct_lwm so that flushing starts earlier, as soon as the dirty-page percentage crosses this low water mark.
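For example (the value below is purely illustrative; choose a low water mark that fits the workload):

SET GLOBAL innodb_max_dirty_pages_pct_lwm = 10;
-- flushing now starts once dirty pages exceed 10% of the buffer pool,
-- instead of waiting until innodb_max_dirty_pages_pct (75%) is approached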


Copyright Disclaimer: This article is an original article by the blogger and cannot be reproduced without the permission of the blogger.
