Introduction to the principles and parameters of Linux IO kernel tuning

Source: Internet
Author: User

1. Page Cache

By default, writes on Linux go through a write cache; direct IO is the way to bypass the operating system's write cache. When you write data, the system sets aside an area of memory to cache it. This is the page cache (the operating system's cache of pages). Common commands for viewing system memory include vmstat, free, and top. You can run cat /proc/meminfo to see detailed memory usage. In the author's example output, Cached is around 140 MB (the page cache), and the line Dirty: 24 kB means that 24 kB of data currently sits in the page cache, waiting for a background thread to flush it to disk. As more data is written, this value grows.

2. Writeback

With the page cache comes the writeback write mode: a write IO completes once the data is in the page cache, and the background pdflush threads later flush the dirty data from the page cache to disk. If the system loses power before the data reaches the disk, whatever was only in the page cache is lost, so some high-reliability scenarios disable this write cache. Writeback is the general-purpose write mode provided by the Linux operating system. It gives better throughput, and the cache shortens IO response time. It also has drawbacks:

(1) Data can be lost on power failure (data safety).
(2) For systems that do their own caching, such as databases, it adds a redundant layer of IO caching, because the database already caches at the application level. Such applications can use direct IO to avoid the cost of copying data between user space and the page cache.
(3) If the page cache is too large, too much data is cached, and flushing it to disk causes an IO peak and a bottleneck that noticeably affects user IO for the duration. To flatten this peak, set the page cache capacity to a smaller size, so that pdflush writes dirty data out more evenly over time.

3. Pdflush

pdflush is a kernel thread that runs in the background on a Linux system and periodically writes dirty page cache data to disk. A system runs several pdflush threads; cat /proc/sys/vm/nr_pdflush_threads shows how many are currently running. (Newer kernels replaced pdflush with per-device flusher threads, so this file may not exist there.) When a pdflush thread has not been in the working state for a period of time (typically 1 s), the system removes it. The minimum and maximum number of pdflush threads are set by configuration, but these settings are rarely modified.

4. Several important IO write-related parameters

4.1 dirty_writeback_centisecs

Run cat /proc/sys/vm/dirty_writeback_centisecs to view this value. The default is typically 500 (in units of 1/100 second), which means pdflush is woken every 5 s to write out dirty data. No official documentation states that reducing this value makes more pdflush threads take part in flushing. For example, in 2.6 and earlier kernels, the source file mm/page-writeback.c contains a comment to this effect: "If pdflush takes longer than this configured interval to write out dirty data, it sleeps for 1 s after the flush completes." This congestion-protection mechanism is described only in the source code, not in official documentation or any specification, so its behavior may differ between kernel versions. Changing dirty_writeback_centisecs therefore does not necessarily bring much performance improvement and may instead cause unexpected problems. Users are generally advised to keep the default.

4.2 dirty_expire_centisecs

Run cat /proc/sys/vm/dirty_expire_centisecs to view this value. The default is 3000 (in units of 1/100 second). It specifies how long data must have been dirty in the page cache before pdflush treats it as expired and eligible to be written out.

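
Both interval parameters in this section are stored in hundredths of a second. As a minimal sketch, assuming Python is available and the files under /proc/sys/vm are readable, the conversion to seconds could look like the following. The helper names centisecs_to_seconds and read_vm_param are made up for this illustration.

```python
from pathlib import Path
from typing import Optional

# Directory holding the VM tunables discussed in this article.
VM_DIR = Path("/proc/sys/vm")


def centisecs_to_seconds(value: int) -> float:
    """Convert a value given in 1/100 second units to seconds."""
    return value / 100.0


def read_vm_param(name: str, vm_dir: Path = VM_DIR) -> Optional[int]:
    """Return the integer stored in vm_dir/name, or None if it cannot be read.

    Hypothetical helper: the file may be missing on some kernels or platforms.
    """
    try:
        return int((vm_dir / name).read_text().split()[0])
    except (OSError, ValueError, IndexError):
        return None


if __name__ == "__main__":
    for name in ("dirty_writeback_centisecs", "dirty_expire_centisecs"):
        raw = read_vm_param(name)
        if raw is None:
            print(f"{name}: not readable on this system")
        else:
            print(f"{name} = {raw} ({centisecs_to_seconds(raw):g} s)")
```

With the defaults quoted in this article, 500 converts to 5 s and 3000 converts to 30 s.
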
With the default of 3000, data is written out at the next pdflush wakeup only after it has been dirty for 30 seconds. Data written by a user may therefore take 30 seconds or more to reach the disk, and it is lost if power fails during that time. To make pdflush write data out sooner, reduce this value. For example, echo 1000 > /proc/sys/vm/dirty_expire_centisecs sets the expiry time to 10 s.

4.3 dirty_background_ratio

Run cat /proc/sys/vm/dirty_background_ratio to view this value. The default is 10 (a percentage; different kernel versions may have different defaults). Many documents describe this value as the maximum amount of dirty cached data, expressed as a percentage of total memory. According to the source code, however, it is actually a percentage of (MemFree + Cached - Mapped). When this limit is reached, pdflush is woken to write dirty data to disk in the background. Writing processes are not blocked at this threshold (that happens at dirty_ratio), but if the value is too large, a lot of dirty data builds up before flushing starts, and the resulting write IO peak lasts a long time and slows user write IO while it runs. In some business scenarios it is therefore necessary to lower this value, so that the write peak is split into many smaller writes. For example, echo 5 > /proc/sys/vm/dirty_background_ratio lowers the percentage to 5%.

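
The threshold calculation described for this parameter can be sketched in Python. This is an illustration of the article's reading of the source code, not the kernel's actual implementation. The function names parse_meminfo and background_threshold_kb are hypothetical, and the sample figures are invented for the example; only Cached (about 140 MB) and Dirty (24 kB) echo the values mentioned in section 1.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def background_threshold_kb(meminfo: dict, ratio_percent: int) -> float:
    """Dirty-data level at which background flushing starts.

    Uses the basis given in the article: ratio% of (MemFree + Cached - Mapped).
    """
    base = meminfo["MemFree"] + meminfo["Cached"] - meminfo["Mapped"]
    return base * ratio_percent / 100


# Invented sample; on a real system you could pass open("/proc/meminfo").read().
SAMPLE = """\
MemTotal:        1024000 kB
MemFree:          500000 kB
Cached:           140000 kB
Mapped:            40000 kB
Dirty:                24 kB
"""

info = parse_meminfo(SAMPLE)
for ratio in (10, 5):
    limit = background_threshold_kb(info, ratio)
    print(f"ratio={ratio}%: background flush starts near {limit:.0f} kB of dirty data")
```

With these sample figures the basis is 600000 kB, so the limit is 60000 kB at 10% and 30000 kB at 5%. Halving the ratio halves the amount of dirty data that can pile up before flushing begins.
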
4.4 dirty_ratio

Run cat /proc/sys/vm/dirty_ratio to view this value. The default is 20 (a percentage; different kernel versions may have different defaults). It means that once dirty data exceeds 20% of total memory, the kernel blocks write operations and waits for pdflush to flush dirty data to disk before normal write IO resumes. Note that when this happens, all write operations are blocked. This creates a serious problem: a long-running large IO takes up most of the write capacity and can starve other, small IOs. Because the large IO produces dirty data quickly and soon reaches the threshold, the system blocks all write IO, and the small IOs cannot be written either.
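
The interaction of the two ratio thresholds can be illustrated with a toy model in Python. It is a deliberate simplification and not how the kernel accounts for dirty pages: it assumes a single writer at a constant rate, a constant flush rate that only applies once the background threshold is reached, and made-up numbers throughout. The names writeback_state and seconds_until_blocked are hypothetical.

```python
def writeback_state(dirty_kb, base_kb, background_ratio=10, dirty_ratio=20):
    """Classify the current amount of dirty data against both thresholds."""
    background_limit = base_kb * background_ratio / 100
    hard_limit = base_kb * dirty_ratio / 100
    if dirty_kb >= hard_limit:
        return "writers blocked"
    if dirty_kb >= background_limit:
        return "background flush"
    return "idle"


def seconds_until_blocked(write_kb_s, flush_kb_s, base_kb,
                          background_ratio=10, dirty_ratio=20):
    """Time for a steady writer to reach the dirty_ratio limit in the toy model.

    Returns None when flushing keeps up with the writer once it has started.
    """
    background_limit = base_kb * background_ratio / 100
    hard_limit = base_kb * dirty_ratio / 100
    net = write_kb_s - flush_kb_s
    if net <= 0:
        return None
    # Phase 1: no flushing yet. Phase 2: dirty data grows at the net rate.
    return background_limit / write_kb_s + (hard_limit - background_limit) / net


BASE_KB = 1_000_000  # invented basis for the percentages
print(writeback_state(150_000, BASE_KB))
print(seconds_until_blocked(100_000, 50_000, BASE_KB))  # large writer
print(seconds_until_blocked(10_000, 50_000, BASE_KB))   # small writer alone
```

In this model the large writer reaches the blocking threshold after 3 seconds, while the small writer on its own never would. Once the large writer has pushed the system over the limit, the small writer is blocked along with it, which is the starvation effect described above.
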
