Better Linux Disk Caching & Performance with vm.dirty_ratio & vm.dirty_background_ratio

Source: Internet
Author: User

In previous posts on vm.swappiness and using RAM disks we talked about how the memory on a Linux guest is used for the OS itself (the kernel, buffers, etc.), applications, and also for file cache. File caching is an important performance improvement, and read caching is a clear win in most cases, balanced against applications using the RAM directly. Write caching is trickier. The Linux kernel stages disk writes into cache, and over time asynchronously flushes them to disk. This has the nice effect of speeding up disk I/O, but it's risky. While data isn't yet written to disk there is an increased chance of losing it.

There is also the chance that a lot of I/O will overwhelm the cache. Ever written a lot of data to disk all at once, and seen large pauses on the system while it tries to deal with all that data? Those pauses are a result of the cache deciding that there's too much data to be written asynchronously (as a non-blocking background operation, letting the application process continue), and switching to writing synchronously (blocking and making the process wait until the I/O is committed to disk). Of course, a filesystem also has to preserve write order, so when it starts writing synchronously it first has to destage the cache. Hence the long pause.

The nice thing is that these are controllable options, and based on your workloads and data you can decide how you want to set them up. Let's take a look:

 $ sysctl -a | grep dirty
 vm.dirty_background_ratio = 10
 vm.dirty_background_bytes = 0
 vm.dirty_ratio = 20
 vm.dirty_bytes = 0
 vm.dirty_writeback_centisecs = 500
 vm.dirty_expire_centisecs = 3000

vm.dirty_background_ratio is the percentage of system memory that can be filled with "dirty" pages (memory pages that still need to be written to disk) before the pdflush/flush/kdmflush background processes kick in to write them to disk. My example is 10%, so if my virtual server has 32 GB of memory that's 3.2 GB of data that can be sitting in RAM before something is done.

vm.dirty_ratio is the absolute maximum amount of system memory that can be filled with dirty pages before everything must get committed to disk. When the system gets to this point all new I/O blocks until dirty pages have been written to disk. This is often the source of long I/O pauses, but is a safeguard against too much data being cached unsafely in memory.

vm.dirty_background_bytes and vm.dirty_bytes are another way to specify these parameters. If you set the _bytes version the _ratio version will become 0, and vice versa.
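To make the arithmetic concrete, here is a minimal Python sketch of how a ratio or a _bytes value turns into a threshold. The helper name dirty_threshold_bytes is made up for illustration, and the base is taken to be total RAM, as in the 32 GB example above; the notes at the end of this article argue the real base is smaller, so treat the numbers as upper bounds.

```python
def dirty_threshold_bytes(base_bytes: int, ratio: int, fixed_bytes: int = 0) -> int:
    """Return a dirty-page threshold in bytes.

    A non-zero fixed_bytes (the *_bytes sysctl) wins; otherwise the
    threshold is `ratio` percent of base_bytes.
    """
    if fixed_bytes:
        return fixed_bytes
    return base_bytes * ratio // 100


GIB = 1024 ** 3
memory = 32 * GIB  # assumption: the example server, read as 32 GiB of RAM

background = dirty_threshold_bytes(memory, ratio=10)  # vm.dirty_background_ratio = 10
blocking = dirty_threshold_bytes(memory, ratio=20)    # vm.dirty_ratio = 20

print(f"background flushing starts near {background / GIB:.1f} GiB of dirty data")
print(f"writers start blocking near {blocking / GIB:.1f} GiB of dirty data")
```

Passing fixed_bytes mimics setting vm.dirty_bytes or vm.dirty_background_bytes, in which case the ratio is ignored.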

vm.dirty_expire_centisecs is how long something can be in the cache before it needs to be written. In this case it's 30 seconds. When the pdflush/flush/kdmflush processes kick in they will check to see how old a dirty page is, and if it's older than this value it'll be written asynchronously to disk. Since holding a dirty page in memory is unsafe this is also a safeguard against data loss.

vm.dirty_writeback_centisecs is how often the pdflush/flush/kdmflush processes wake up and check to see if work needs to be done.
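Since both timers are in hundredths of a second, a tiny conversion helps when reading them. The sketch below uses the values from the listing above; the "rough maximum age" is my own simplification (expiry time plus one wake-up interval) that ignores the ratio thresholds and the time the I/O itself takes.

```python
def centisecs_to_seconds(centisecs: int) -> float:
    """Convert a *_centisecs sysctl value to seconds."""
    return centisecs / 100


expire = centisecs_to_seconds(3000)    # vm.dirty_expire_centisecs
writeback = centisecs_to_seconds(500)  # vm.dirty_writeback_centisecs

# Simplified model: a page expires, then waits at most one more wake-up
# of the flusher before it is picked up for writing.
rough_max_dirty_age = expire + writeback

print(f"pages expire after {expire:.0f} s; flusher wakes every {writeback:.0f} s")
print(f"rough maximum age before writeback starts: {rough_max_dirty_age:.0f} s")
```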

You can also see statistics on the page cache in /proc/vmstat:

 $ cat /proc/vmstat | egrep "dirty|writeback"
 nr_dirty 878
 nr_writeback 0
 nr_writeback_temp 0

In my case I had 878 dirty pages waiting to be written to disk.
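If you want those counters in a script rather than on the command line, something like the following Python sketch works. The function names are illustrative, the sample text mirrors the 878 dirty pages above (the nr_free_pages line is invented filler), and the 4096-byte page size is an assumption; on a real system you could pass open("/proc/vmstat").read() and os.sysconf("SC_PAGE_SIZE") instead.

```python
import re
from typing import Dict


def dirty_counters(vmstat_text: str) -> Dict[str, int]:
    """Pick out /proc/vmstat counters whose names mention dirty or writeback."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and re.search(r"dirty|writeback", parts[0]):
            counters[parts[0]] = int(parts[1])
    return counters


def pages_to_bytes(pages: int, page_size: int = 4096) -> int:
    """Convert a page count to bytes (page_size is an assumed default)."""
    return pages * page_size


# Sample input in /proc/vmstat format; only nr_dirty comes from the article.
sample = """nr_free_pages 909527
nr_dirty 878
nr_writeback 0
nr_writeback_temp 0
"""

counters = dirty_counters(sample)
print(counters)
print(f"{pages_to_bytes(counters['nr_dirty'])} bytes waiting to be written")
```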

Approach 1: Decreasing the Cache

As with most things in the computer world, how you adjust these depends on what you're trying to do. In many cases we have fast disk subsystems with their own big, battery-backed NVRAM caches, so keeping things in the OS page cache is risky. Let's try to send I/O to the array in a more timely fashion and reduce the chance our local OS will, to borrow a phrase from the service industry, be "in the weeds." To do this we lower vm.dirty_background_ratio and vm.dirty_ratio by adding new numbers to /etc/sysctl.conf and reloading with "sysctl -p":

 vm.dirty_background_ratio = 5
 vm.dirty_ratio = 10

This is a typical approach on virtual machines, as well as Linux-based hypervisors. I wouldn't suggest setting these parameters to zero, as some background I/O is nice to decouple application performance from short periods of higher latency on your disk array & SAN ("spikes").
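If you generate these settings from a script, a small sanity check can catch typos before they reach /etc/sysctl.conf. This is only a sketch: render_dirty_settings is a hypothetical helper, and the rule it enforces (non-zero values, background ratio below the blocking ratio) is my reading of the advice above rather than something the kernel demands.

```python
def render_dirty_settings(background_ratio: int, ratio: int) -> str:
    """Return sysctl.conf lines for the two ratios after a basic sanity check."""
    if not 0 < background_ratio < ratio <= 100:
        raise ValueError(
            "expected 0 < vm.dirty_background_ratio < vm.dirty_ratio <= 100, "
            f"got {background_ratio} and {ratio}"
        )
    return (
        f"vm.dirty_background_ratio = {background_ratio}\n"
        f"vm.dirty_ratio = {ratio}\n"
    )


# Example values for a "decrease the cache" setup.
print(render_dirty_settings(5, 10), end="")
```

The output can be appended to /etc/sysctl.conf and loaded with "sysctl -p" as described above.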

Approach 2: Increasing the Cache

There are scenarios where raising the cache dramatically has positive effects on performance. These situations are where the data contained on a Linux guest isn't critical and can be lost, and usually where an application is writing to the same files repeatedly or in repeatable bursts. In theory, by allowing more dirty pages to exist in memory you'll rewrite the same blocks over and over in cache, and just need to do one write every so often to the actual disk. To do this we raise the parameters:

 vm.dirty_background_ratio = 50
 vm.dirty_ratio = 80

Sometimes folks also increase the vm.dirty_expire_centisecs parameter to allow more time in cache. Beyond the increased risk of data loss, you also run the risk of long I/O pauses if that cache gets full and needs to destage, because on large VMs there will be a lot of data in cache.
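To get a feel for how long such a pause could be, divide the amount of dirty data by what the disk can sustain. The figures in this Python sketch are assumptions chosen for illustration (a 32 GiB guest allowed to fill 80% of RAM with dirty pages, and a disk sustaining 200 MiB/s); real destage times also depend on the I/O pattern.

```python
def destage_seconds(dirty_bytes: int, disk_bytes_per_sec: int) -> float:
    """Estimate the time to flush dirty_bytes at a steady write throughput."""
    if disk_bytes_per_sec <= 0:
        raise ValueError("disk throughput must be positive")
    return dirty_bytes / disk_bytes_per_sec


GIB = 1024 ** 3
MIB = 1024 ** 2

memory = 32 * GIB             # assumed guest size
dirty = memory * 80 // 100    # assumed: dirty pages at 80% of RAM
throughput = 200 * MIB        # assumed sustained write speed

print(f"worst-case destage: about {destage_seconds(dirty, throughput):.0f} s")
```

Under these assumptions the flush takes a little over two minutes, which is the kind of pause the paragraph above warns about.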

Approach 3: Both Ways

There are also scenarios where a system has to deal with infrequent, bursty traffic to slow disk (batch jobs at the top of the hour, midnight, writing to an SD card on a Raspberry Pi, etc.). In this case an approach might be to allow all that write I/O to be deposited in the cache so that the background flush operations can deal with it asynchronously over time:

 vm.dirty_background_ratio = 5
 vm.dirty_ratio = 80

Here the background processes will start writing right away when dirty data hits that 5% ceiling, but the system won't force synchronous I/O until it gets to 80% full. From there you just size your system RAM and vm.dirty_ratio to be able to consume all the written data. Again, there are tradeoffs with data consistency on disk, which translates into risk to data. Buy a UPS and make sure you can destage cache before the UPS runs out of power. :)
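The sizing step can be sketched as a one-line calculation: the smallest whole-percent ratio whose threshold covers the burst. The helper name and the example sizes below are hypothetical, and the base is again taken to be total RAM, so leave yourself some headroom.

```python
def min_dirty_ratio(burst_bytes: int, base_bytes: int) -> int:
    """Smallest whole-percent ratio whose threshold is at least burst_bytes.

    A result above 100 means the burst cannot fit in this much memory.
    """
    if base_bytes <= 0:
        raise ValueError("base_bytes must be positive")
    return -(-burst_bytes * 100 // base_bytes)  # ceiling division


GIB = 1024 ** 3

# Hypothetical example: a 3 GiB nightly batch write on a 4 GiB machine.
print(min_dirty_ratio(3 * GIB, 4 * GIB))
```

If the result is uncomfortably close to 100, adding RAM or shrinking the burst is probably a better answer than raising vm.dirty_ratio further.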

No matter the route you choose you should always be gathering hard data to support your changes and help you determine if you are improving things or making them worse. In this case you can get data from many different places, including the application itself, /proc/vmstat, /proc/meminfo, iostat, vmstat, and many of the things in /proc/sys/vm. Good luck!
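One cheap habit is to record the active settings next to every measurement, so you know which configuration produced which numbers. Here is a minimal Python sketch; read_dirty_settings is a made-up name, and it assumes each dirty_* entry under /proc/sys/vm holds a single integer.

```python
from pathlib import Path
from typing import Dict


def read_dirty_settings(vm_dir: str = "/proc/sys/vm") -> Dict[str, int]:
    """Return {name: value} for every readable dirty_* entry in vm_dir."""
    settings = {}
    for path in sorted(Path(vm_dir).glob("dirty_*")):
        try:
            settings[path.name] = int(path.read_text().strip())
        except (OSError, ValueError):
            continue  # skip entries that are unreadable or not a plain integer
    return settings


# On a Linux host this prints the current values; elsewhere it prints {}.
print(read_dirty_settings())
```

The vm_dir parameter exists mainly so the function can be pointed at a directory of test files.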

Note:

1. vm.dirty_background_ratio is the proportion of the cache that dirty pages may occupy before background flushing starts. When this value is reached, the pdflush/flush/kdmflush background processes are invoked to write dirty pages to disk. The write performed at this point is asynchronous and does not block the application's writes.

2. vm.dirty_ratio is the proportion of the cache that dirty pages may occupy before newly generated I/O is blocked until the dirty pages have been written to disk. The write at this point is synchronous and blocks the application's writes. Because vm.dirty_background_ratio already exists, some people think vm.dirty_ratio is unnecessary. In fact, when a program writes faster than the operating system can flush to disk, the dirty page ratio is likely to reach vm.dirty_ratio. At that point writes become synchronous in order to limit how much data could be lost if the system goes down.

3. vm.dirty_background_bytes and vm.dirty_bytes have the same meaning as the two ratios above, except that the threshold is given as an absolute value in bytes instead of a percentage.

4. The two ratio parameters above are relative to the cache, not to the operating system's physical memory, so the calculation in the 32 GB memory example above is wrong.

The cache is calculated as:

 vmsize = MemFree + Cached - Mapped

Actually, a couple things. The article has the intent correct, but some of the details aren't. The total size of the page cache, the last I looked, was MemFree + Cached - Mapped, which isn't equal to the size of the physical memory in the system. You can find these amounts in /proc/meminfo, which also lists a number of interesting things about the dirty cache behavior in a system.

 /proc# grep -i free meminfo
 MemFree: 3638108 kB
 SwapFree: 3999996 kB
 HugePages_Free: 0
 /proc# grep -i dirty meminfo
 Dirty: [value illegible] kB
 /proc# grep -i write meminfo
 Writeback: 0 kB
 WritebackTmp: 0 kB

I have read several times that the dirty cache behavior is used by ext2+ to make index and data writes more efficient. As I understand it, the longer the data is held in cache before being written to the media, the fewer inode/index updates it'll actually do. So, for flash type media, holding writes does make sense if you are interested in increasing the longevity of the flash memory. Also, in Linux, the cache can always be assumed to be the size of MemFree + Cached - Mapped; it'll never be any bigger or smaller than that. The only tuning option is how much of that memory you want holding data waiting for writeback (dirty). These are fairly minor; the intent of the article seems sound. For the sake of discussion, here are the dirty_* parameters off of that server.

 /proc/sys/vm# grep . dirty_*
 dirty_background_bytes:0
 dirty_background_ratio:1
 dirty_bytes:0
 dirty_expire_centisecs:360000
 dirty_ratio:[value illegible]
 dirty_writeback_centisecs:[value illegible]

Years ago, I did quite a bit of custom tuning for each of my systems. I've somewhat settled on the above. These work well in nearly every situation, from Raspis/Odroids and systems running off USB to the largest general purpose DC boxes. There are a few situations that this wouldn't work well for, database servers in particular, but these settings are safer than the default, and perform better. Take it for what you will.

5. In addition, vm.dirty_writeback_centisecs and vm.dirty_expire_centisecs are expressed in hundredths of a second (centiseconds).

6. Everything above concerns the operating system's disk cache. A database manages its own buffers, but because the database cache and the operating system cache together form a double cache, you should pay attention to operating system cache tuning even when the database parameters are reasonable. Since a database writes its log first, whether the cached data reaches disk promptly is less important for it. The operating system parameters will still affect how disk I/O is used, however.
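The cache-size estimate quoted in note 4 (MemFree + Cached - Mapped) can be computed from /proc/meminfo along these lines. This is a sketch of that commenter's formula, not a statement of how the kernel accounts for memory; the function names are made up, and apart from MemFree the sample values are invented for illustration.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style text into {field: number} (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def cache_size_kb(meminfo: Dict[str, int]) -> int:
    """Estimate from note 4: MemFree + Cached - Mapped, in kB."""
    return meminfo["MemFree"] + meminfo["Cached"] - meminfo["Mapped"]


# MemFree is taken from the quoted output; the other values are hypothetical.
sample = """MemFree:         3638108 kB
Cached:          2000000 kB
Mapped:           100000 kB
Dirty:               340 kB
"""

info = parse_meminfo(sample)
print(f"estimated cache size: {cache_size_kb(info)} kB")
print(f"a 20% dirty ratio would then allow {cache_size_kb(info) * 20 // 100} kB dirty")
```

On a real system, pass open("/proc/meminfo").read() instead of the sample text.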

Reference:

https://lonesysadmin.net/2013/12/22/better-linux-disk-caching-performance-vm-dirty_ratio/

https://www.reddit.com/r/linux/comments/3h7w8f/better_linux_disk_caching_performance_with/

http://www.cnblogs.com/xiaotengyi/p/6907190.html

https://www.kernel.org/doc/Documentation/sysctl/vm.txt
