Linux I/O optimized disk read/write parameter settings

Source: Internet
Author: User

Reprint: http://wlservers.blog.163.com/blog/static/120622304201241715945256/

For information about the page cache, see the output of
cat /proc/meminfo
Cached is the amount of memory used for the page cache (disk cache minus swap cache). As cached pages are written to, the value of Dirty increases.
Once cached pages start being written to the hard disk, the value of Writeback increases until the write finishes.
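To watch the three counters mentioned above, a small helper can pull a single field out of /proc/meminfo. This is only a sketch: meminfo_field is a hypothetical function name, not a standard command, and it assumes the usual "Name: value kB" layout of the file.

```shell
# meminfo_field NAME [FILE]
# Prints the numeric value (in kB) of one field from a meminfo-style file.
# FILE defaults to /proc/meminfo. The function name is hypothetical.
meminfo_field() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Show the counters discussed above, if /proc/meminfo is readable here.
if [ -r /proc/meminfo ]; then
    for f in Cached Dirty Writeback; do
        echo "$f: $(meminfo_field "$f") kB"
    done
fi
```

Running the loop repeatedly during a large write should show Dirty rising first and Writeback rising once flushing starts.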

Linux uses pdflush processes to write data from cached pages to the hard disk. To see how many pdflush processes are running:
cat /proc/sys/vm/nr_pdflush_threads

The behavior of pdflush is controlled by parameters in /proc/sys/vm.
/proc/sys/vm/dirty_writeback_centisecs (default 500):
Unit: 1/100 second. How often pdflush wakes up to write cached page data to the hard disk. By default, 2 (or more) threads wake up every 5 seconds.
If a writeback takes longer than dirty_writeback_centisecs, there may be a problem.

The first thing pdflush does is read
/proc/sys/vm/dirty_expire_centisecs (default 3000)
Unit: 1/100 second. The age at which data in a cached page counts as expired (old data) and is written to the hard disk in the next cycle. The default of 30 seconds is a very long time.

The second thing it does is decide whether memory use has reached the limit at which data must be written to the hard disk, as determined by:
/proc/sys/vm/dirty_background_ratio (default 10)
Percentage. The maximum share of memory that dirty page cache may occupy before it is written out. It is calculated against MemFree + Cached - Mapped.

When writing to the hard disk, pdflush checks two conditions:
1 whether data has been in the page cache for more than 30 seconds; if so, it is treated as expired dirty page cache and written out;
2 whether the dirty page cache has reached 10% of working memory.
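The second condition can be worked through numerically. The sketch below applies a percentage to MemFree + Cached - Mapped, the base given in the text; the exact calculation differs between kernel versions, and dirty_threshold_kb is a hypothetical helper name with made-up input figures.

```shell
# dirty_threshold_kb RATIO MEMFREE_KB CACHED_KB MAPPED_KB
# Prints RATIO percent of (MemFree + Cached - Mapped), in kB.
# Hypothetical helper; the base formula is the one quoted in the text.
dirty_threshold_kb() {
    echo $(( ($2 + $3 - $4) * $1 / 100 ))
}

# Made-up example: 4 GB free, 2 GB cached, 1 GB mapped, ratio 10.
dirty_threshold_kb 10 4194304 2097152 1048576   # prints 524288 (512 MB)
```

With these figures, background writeback would start once about 512 MB of dirty pages have accumulated.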

The following parameter also affects pdflush
/proc/sys/vm/dirty_ratio (default 40)
The maximum percentage of total memory that dirty page cache may occupy. Above this value, pdflush is triggered to write to the hard disk. If the cache fills faster than pdflush can drain it, the whole system hits an I/O bottleneck at 40%, and all I/O waits until the cache has been flushed to the hard disk before resuming.

For systems with heavy write activity
dirty_background_ratio: the main tuning parameter. If you want the cache flushed steadily rather than in large bursts of writes to the hard disk, lower this value.
dirty_ratio: the secondary tuning parameter.

Swapping parameters
/proc/sys/vm/swappiness
By default, Linux tends to move pages from physical memory out to disk so that the disk cache can be kept as large as possible. Unused pages are placed in the swap area.
A value of 0 avoids swapping as far as possible.
A value of 100 swaps as aggressively as possible.
Swapping less improves application responsiveness; swapping more improves overall system availability.

If you have a large number of write operations, to avoid long waits for I/O, you can set:
$ echo 5 >/proc/sys/vm/dirty_background_ratio
$ echo >/proc/sys/vm/dirty_ratio

File system data buffering requires frequent memory allocations. Increasing the amount of reserved memory can improve the speed and stability of the system. With less than 8G of memory, reserve 64M; with more than 8G, reserve 256M:
$ echo 65536 >/proc/sys/vm/min_free_kbytes
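The sizing rule above can be expressed as a small helper. This is a sketch only: recommended_min_free_kb is a hypothetical name, and treating a machine with exactly 8G as the larger tier is an assumption, since the text does not cover that case.

```shell
# recommended_min_free_kb TOTAL_KB
# Rule of thumb from the text: reserve 64 MB below 8 GB of RAM, 256 MB above.
# Assumption: exactly 8 GB falls into the 256 MB tier.
recommended_min_free_kb() {
    if [ "$1" -lt $((8 * 1024 * 1024)) ]; then
        echo 65536      # 64 MB expressed in kB
    else
        echo 262144     # 256 MB expressed in kB
    fi
}

# Possible usage as root, reading MemTotal from /proc/meminfo:
# total_kb=$(awk '$1 == "MemTotal:" { print $2 }' /proc/meminfo)
# recommended_min_free_kb "$total_kb" > /proc/sys/vm/min_free_kbytes
```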


I/O Scheduler
cat /sys/block/[disk]/queue/scheduler

There are 4 scheduling algorithms:
noop anticipatory deadline [cfq]
Deadline: the deadline algorithm aims to keep the latency of any given I/O request to a minimum.
Anticipatory: after an I/O completes, if another process requests I/O, the scheduler waits for a default anticipation time of 6 ms, guessing what the next I/O request will be. This can cause large delays for random reads. It is bad for database applications, but performs well for web servers.
CFQ: a separate I/O queue is maintained for each process, and CFQ serves the queues in round-robin fashion, so every I/O request is treated fairly. Suitable for applications with scattered reads.
NOOP: all I/O requests are processed in a FIFO queue. It assumes by default that I/O performance is not a problem.
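In the scheduler file, the algorithm currently in use is the one shown in square brackets. A minimal sketch for extracting it follows; active_scheduler is a hypothetical helper name, and it assumes the single-line bracketed format shown above.

```shell
# active_scheduler LINE
# Prints the bracketed (currently active) name from a scheduler line
# such as "noop anticipatory deadline [cfq]". Hypothetical helper.
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

active_scheduler "noop anticipatory deadline [cfq]"   # prints: cfq
```

On a real system the argument would come from the file itself, for example active_scheduler "$(cat /sys/block/sdx/queue/scheduler)", where sdx stands for the disk name.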

Change the scheduler
$ echo deadline >/sys/block/sdx/queue/scheduler
For database servers, the deadline algorithm is recommended.

To increase the scheduler request queue
$ echo 4096 >/sys/block/sdx/queue/nr_requests

With a large number of read requests, the default request queue may not keep up, and this value can be increased. The drawback is that it costs some memory.
To increase the throughput of sequential reads, you can increase the amount of read-ahead data. The actual read-ahead amount is adaptive, so a higher value does not degrade the performance of small random accesses.
$ echo 4096 >/sys/block/sdx/queue/read_ahead_kb
If Linux determines that a process is reading a file sequentially, it will read the data of the files required by the process in advance and put it in the cache.



The server experienced spikes of disk write activity that made request-processing latency very large (more than 3 seconds). By adjusting kernel parameters, the write peak can be flattened into many frequent writes, each writing less data. This is less efficient, because the kernel has less opportunity to merge write operations. But on a busy server, writes become more consistent and interactive performance improves greatly.

/proc/sys/vm/dirty_ratio

Controls the size of the file system's write buffer, as a percentage of system memory: when the write buffer reaches this share of system memory, data starts being written out to disk. Raising the value uses more system memory for disk write buffering and can greatly improve write performance. However, lower the value when you need sustained, steady writes.

/proc/sys/vm/dirty_background_ratio

Controls when the file system's pdflush process flushes to disk. The unit is a percentage of system memory. pdflush synchronizes in-memory content with the file system: for example, when a file is modified in memory, pdflush is responsible for writing it back to the hard disk. Whenever dirty pages in memory exceed 10%, pdflush writes these pages back to the hard disk. Raising the value uses more system memory for disk write buffering and can greatly improve write performance. However, when you need sustained, steady writes, you should lower it.

/proc/sys/vm/dirty_writeback_centisecs

Controls how often the kernel's dirty-data flushing process, pdflush, runs. The unit is 1/100 second. The default value is 500, which is 5 seconds. If your system writes continuously, it is actually better to lower this value, so that write spikes are flattened into multiple smaller writes.
Raise this value if your system's writes are short bursts, the amount of data is small (tens of MB per burst), and memory is plentiful.
This parameter should be set lower than dirty_expire_centisecs, but not too low: if it is too small, I/O becomes too frequent and system performance degrades. You may need to test in a production environment. A ratio of 6:1 (dirty_expire_centisecs:dirty_writeback_centisecs), which is also the ratio of the defaults, is said to work well.

/proc/sys/vm/dirty_expire_centisecs

Declares how old data in the Linux kernel write buffer must be before pdflush starts considering it for writing to disk. The unit is 1/100 second. The default is 3000, which means data older than 30 seconds is considered old and will be flushed to disk. For exceptionally heavy write loads, it helps to reduce this value somewhat, but not by too much, because reducing it too far makes I/O increase too quickly.
Of course, if your system has plenty of memory, writes are intermittent, and each write is small (for example tens of MB), then a larger value is better.

/proc/sys/vm/vfs_cache_pressure

Controls how readily the kernel reclaims memory used by the directory and inode caches. The default value of 100 means the kernel keeps the directory and inode caches at a reasonable proportion relative to pagecache and swapcache. Lowering the value below 100 makes the kernel tend to retain the directory and inode caches; raising it above 100 makes the kernel tend to reclaim them.

/proc/sys/vm/min_free_kbytes

Represents the minimum amount of free memory (Kbytes) that the Linux VM is forced to keep.
Default setting: 724 (512M physical memory)

/proc/sys/vm/nr_pdflush_threads

Indicates the number of pdflush processes currently running. Under high I/O load, the kernel automatically starts more pdflush processes.

/proc/sys/vm/overcommit_memory

Specifies the kernel's memory allocation policy. The value can be 0, 1, or 2.

0: the kernel checks heuristically whether enough memory is available for the process; if so, the request is allowed; otherwise the request fails and an error is returned to the application.

1: the kernel allows allocation of all physical memory, regardless of the current memory state.

2: the kernel refuses allocations that would take total committed memory beyond swap space plus a percentage of physical memory (see overcommit_ratio).

Default setting: 0

/proc/sys/vm/overcommit_ratio

If overcommit_memory=2, this is the percentage of physical memory that counts toward the commit limit. The total memory the system can allocate is calculated with the following formula:
Allocatable memory = swap space + physical memory * overcommit_ratio / 100
Default setting: 50 (%)
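The formula can be checked with a short calculation. The sketch below uses a hypothetical helper name, commit_limit_kb, and made-up sizes; it only reproduces the arithmetic given in the text.

```shell
# commit_limit_kb SWAP_KB PHYS_KB RATIO
# Prints swap + physical memory * RATIO / 100, in kB.
# Hypothetical helper that mirrors the formula in the text.
commit_limit_kb() {
    echo $(( $1 + $2 * $3 / 100 ))
}

# Made-up example: 2 GB swap, 8 GB RAM, default ratio 50.
commit_limit_kb 2097152 8388608 50   # prints 6291456 (6 GB)
```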

/proc/sys/vm/page-cluster

The number of pages written to the swap area at once, as a power of two: 0 means 1 page, 1 means 2 pages, and 2 means 4 pages.
Default setting: 3 (2 to the power of 3, i.e. 8 pages)
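Since the setting is an exponent rather than a page count, a quick table may help. A minimal sketch, with pages_per_swap_io as a hypothetical helper name:

```shell
# pages_per_swap_io PAGE_CLUSTER
# Prints 2 raised to the power PAGE_CLUSTER, the number of pages per swap write.
# Hypothetical helper for illustration.
pages_per_swap_io() {
    echo $(( 1 << $1 ))
}

for n in 0 1 2 3; do
    echo "page-cluster=$n -> $(pages_per_swap_io "$n") pages"
done
```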

/proc/sys/vm/swappiness

Indicates how aggressively the system swaps. The higher the value (0-100), the more likely swapping to disk becomes.

Change:
/etc/sysctl.conf

vm.dirty_ratio =

sysctl -p

View:

find /proc/sys/ -name 'dirty*' -print | while read -r name; do echo "$name"; cat "$name"; done
