Selection of I/O scheduling in Linux systems

The I/O scheduling algorithm plays the role of a referee when processes compete for disk I/O. It arranges the order and timing of requests so that overall I/O performance is as good as possible. Linux offers four scheduling algorithms:

CFQ (Completely Fair Queuing) (elevator=cfq): This is the default algorithm and is usually the best choice for a general-purpose server. It tries to distribute I/O bandwidth evenly among processes. In multimedia applications it guarantees that audio and video are always read from disk in time, but it also performs well for other kinds of applications. There is one queue per process, and requests in each queue are merged and sorted. The scheduler round-robins between processes, executing four requests per process at a time.

Deadline (elevator=deadline): This algorithm attempts to minimize the latency of each request, reordering requests to improve performance.

NOOP (elevator=noop): This algorithm implements a simple FIFO queue. It assumes that I/O requests are already optimized or reordered by the driver or the device (as a smart controller does). In some SAN environments this may be the best choice. It is also suitable for randomly addressable devices with no seek cost, i.e. non-mechanical disks.

Anticipatory (elevator=as): This algorithm defers I/O requests in the hope of sorting them for maximum efficiency. It differs from deadline in that after servicing a read request it does not return immediately but waits a few milliseconds, during which any request for a neighboring region is executed at once. After the timeout, normal processing resumes. This rests on the assumption that within those few milliseconds the program has a good chance of submitting another request. The scheduler tracks per-process read and write I/O statistics to make the best possible predictions.

How to view and set the IO scheduling method in Linux

View the current IO scheduler:
cat /sys/block/{device-name}/queue/scheduler
cat /sys/block/sd*/queue/scheduler
Example output:

noop anticipatory deadline [cfq]

Set the current IO scheduler:
echo {scheduler-name} > /sys/block/{device-name}/queue/scheduler
echo noop > /sys/block/hda/queue/scheduler
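
A value written through /sys takes effect immediately but is lost on reboot. One common way to make a per-device choice persistent is a udev rule; this is a minimal sketch, and the rule file name and the sd[a-z] match pattern are illustrative assumptions:

# /etc/udev/rules.d/60-io-scheduler.rules (file name is an assumption)
# Set the deadline scheduler on every sd* disk as it is detected
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/scheduler}="deadline"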
Recommendations for choosing an IO scheduler

The deadline I/O scheduler uses a polling design that is simple and compact and provides minimal read latency, with good throughput, especially in read-heavy environments such as databases (Oracle 10g, etc.).

The anticipatory I/O scheduler assumes the block device has only one physical seek head (for example a single SATA disk). It merges multiple small random write streams into one large write stream, trading write latency for maximum write throughput. It works well in most environments, particularly write-heavy ones such as file servers and Web/application servers, where anticipatory scheduling is a good choice.

The CFQ I/O scheduler uses a QoS policy to allocate the same amount of bandwidth to every task, avoiding process starvation while achieving lower latency; it can be regarded as a compromise between the two schedulers above and suits multi-user systems with a large number of processes.

I tested a machine in a production environment: traffic was only 350M, and sometimes it would not rise even under load. Because the workload was read-heavy, I switched to deadline; traffic rose by 50M, and the traffic graphs also became visibly more stable.

Setting the default IO scheduler at Linux startup

To make the system boot with a default IO scheduler, just add a line similar to the following to the grub.conf file:
kernel /vmlinuz-2.6.24 ro root=/dev/sda1 elevator=deadline
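
On newer distributions that boot with GRUB 2 there is no grub.conf; a rough equivalent, assuming the usual file locations for such systems, is to append the option to the kernel command line in /etc/default/grub and regenerate the configuration:

# in /etc/default/grub, add elevator=deadline to the kernel command line
GRUB_CMDLINE_LINUX="elevator=deadline"
# then regenerate the GRUB configuration (output path varies by distribution)
grub2-mkconfig -o /boot/grub2/grub.cfg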
Several kernel parameters related to IO

/proc/sys/vm/dirty_ratio

This parameter controls the size of the file system write buffer, expressed as a percentage of system memory: how much memory may be used to hold data waiting to be written to disk. Increasing it devotes more system memory to disk write buffering and can greatly improve write performance. However, under a continuous, steady write load you should lower its value; the default is generally 10. Here is how to increase it:

echo 40 > /proc/sys/vm/dirty_ratio

/proc/sys/vm/dirty_background_ratio

This parameter controls when the file system's pdflush process flushes data to disk. The unit is a percentage of system memory: when the write buffer reaches this share of memory, pdflush starts writing data to the disk. Increasing it devotes more system memory to disk write buffering and can greatly improve write performance. However, under a continuous, steady write load you should lower its value; the default is generally 5. Here is how to increase it:

echo 20 > /proc/sys/vm/dirty_background_ratio
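
Whether these thresholds are actually being reached can be checked at any time: /proc/meminfo reports how much dirty data is currently waiting to be written back:

grep -E 'Dirty|Writeback' /proc/meminfo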

/proc/sys/vm/dirty_writeback_centisecs

This parameter controls the run interval of pdflush, the kernel's dirty-data flush process. The unit is 1/100 second. The default value is 500, i.e. 5 seconds. If your system writes continuously, it is better to lower this value so that spiky write operations are flattened into multiple smaller writes. Set it as follows:

echo 200 > /proc/sys/vm/dirty_writeback_centisecs

If your system's writes come in short spikes, the amount of data written each time is small (a few dozen MB), and memory is plentiful, this value should be increased instead:

echo 1000 > /proc/sys/vm/dirty_writeback_centisecs

/proc/sys/vm/dirty_expire_centisecs

This parameter declares how "old" data in the Linux kernel write buffer must be before pdflush starts to consider writing it to disk. The unit is 1/100 second. The default is 3000, meaning data 30 seconds old is considered expired and will be flushed to disk. For especially heavy write loads it is fine to shrink this value somewhat, but not too far, because flushing too frequently drives IO up sharply. The recommended setting is 1500, i.e. data is considered old after 15 seconds.

echo 1500 > /proc/sys/vm/dirty_expire_centisecs

Of course, if your system memory is large, the write pattern is intermittent, and only a small amount of data (a few dozen MB) is written each time, it is better to increase this value.
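
Note that the echo commands above take effect immediately but do not survive a reboot. A minimal sketch for making them persistent, assuming the example values used in this article suit your workload, is to add them to /etc/sysctl.conf and reload:

# /etc/sysctl.conf -- example values from this article; tune to your workload
vm.dirty_ratio = 40
vm.dirty_background_ratio = 20
vm.dirty_writeback_centisecs = 200
vm.dirty_expire_centisecs = 1500

# apply without rebooting
sysctl -p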

Transferred from: http://blog.chinaunix.net/uid-20618535-id-70900.html
