[Compiled] Linux I/O Scheduling

Source: Internet
Author: User

1) Overview of the I/O Scheduler

1) When a data block is written to or read from a device, the request is placed in a queue to await completion.
2) Each block device has its own queue.
3) The I/O scheduler maintains the order of these queues so that the medium is used more efficiently; it turns unordered I/O operations into ordered ones.
4) Before scheduling starts, the kernel must first determine how many requests are in the queue.

 

2) Four I/O Scheduling Algorithms

1) CFQ (Completely Fair Queuing)

Features:
In recent kernels and distribution releases (as of this writing), CFQ is the default I/O scheduler, and it is also the best choice for general-purpose servers.
CFQ tries to distribute I/O bandwidth evenly so that no process is starved and latency stays low. It is a compromise between the deadline and anticipatory (AS) schedulers.
CFQ is the best choice for multimedia applications (video, audio) and desktop systems.
CFQ assigns each I/O request a priority. I/O priority is managed separately from process priority, so reads and writes from a high-priority process do not automatically get a high I/O priority.

Working principle:
CFQ creates a separate queue for each process/thread to manage the requests that process generates. In other words, each process has its own queue, and the queues are scheduled against one another using time slices, so that every process gets a fair share of the I/O bandwidth. On each turn the scheduler dispatches four requests from a process's queue.
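
The turn-taking described above can be sketched as a toy model in shell. This is not how the kernel implements CFQ (the real scheduler works with time slices); it only illustrates per-process queues being served in turn, four requests at a time. The function name cfq_dispatch_order and the request labels are made up for this illustration.

```shell
#!/bin/sh
# Toy model: each argument is one process's queue, given as a
# space-separated list of request labels. The queues are visited in turn
# and up to four requests are taken from each queue per visit.
cfq_dispatch_order() {
    quantum=4
    n=0
    # Copy the queues into numbered variables q1, q2, ...
    for q in "$@"; do
        n=$((n + 1))
        eval "q$n=\$q"
    done
    order=""
    remaining=1
    while [ "$remaining" -eq 1 ]; do
        remaining=0
        i=1
        while [ "$i" -le "$n" ]; do
            eval "cur=\$q$i"
            # Split the queue into positional parameters (splitting intended).
            set -- $cur
            taken=0
            while [ "$#" -gt 0 ] && [ "$taken" -lt "$quantum" ]; do
                order="$order $1"
                shift
                taken=$((taken + 1))
            done
            # Store whatever is left back into the queue variable.
            eval "q$i=\$*"
            if [ "$#" -gt 0 ]; then
                remaining=1
            fi
            i=$((i + 1))
        done
    done
    echo "${order# }"
}

cfq_dispatch_order "a1 a2 a3 a4 a5 a6" "b1 b2"
```

With the two sample queues, process a gets four requests, then process b gets its two, then a finishes: a1 a2 a3 a4 b1 b2 a5 a6.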

2) Noop (elevator scheduler)

Features:
Linux 2.4 and earlier kernels had only this I/O scheduling algorithm.
Noop implements a simple FIFO queue and organizes I/O requests the way an elevator works: when a new request arrives, it is merged with the nearest adjacent request, so that requests to the same area of the medium are served together.
Noop tends to starve reads in favor of writes.
Noop is the best choice for flash memory devices, RAM disks, and embedded systems.

Why the elevator algorithm starves read requests:
Write requests are easier to service than read requests.
Writes are cached by the file system, so the next write can start without waiting for the previous one to complete. Write requests are merged and pile up in the I/O queue.
A read, by contrast, can be issued only after the previous reads have completed. There are gaps of several milliseconds between reads, and write requests arrive in those gaps, starving the reads that follow.

 

3) Deadline (deadline scheduler)

Features:
Requests are sorted by time (their deadline) and by disk region. The sorting and merging work much as they do in Noop.
Deadline ensures that each request is serviced within a deadline. The deadline is adjustable, and by default the read deadline is shorter than the write deadline. This prevents reads from being starved by writes.
Deadline is the best choice for database environments (such as Oracle RAC and MySQL).
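
As a rough illustration of "the deadline is adjustable": while deadline is the active scheduler, the kernel exposes the expiry times as the read_expire and write_expire tunables (in milliseconds) under the device's queue/iosched directory. The sketch below only reads them. The function name deadline_expiry is made up for this example, and the sda default path is an assumption carried over from the rest of this article.

```shell
#!/bin/sh
# Print the deadline scheduler's read and write expiry times (milliseconds).
# $1: directory holding the tunables; defaults to sda's iosched directory.
deadline_expiry() {
    dir=${1:-/sys/block/sda/queue/iosched}
    for f in read_expire write_expire; do
        if [ -r "$dir/$f" ]; then
            echo "$f=$(cat "$dir/$f")"
        else
            # Another scheduler is active, or the device does not exist.
            echo "$f=unavailable"
        fi
    done
}

deadline_expiry
```

On a system where deadline is active, the read value is typically much smaller than the write value (commonly 500 versus 5000), which is the "read deadline shorter than write deadline" default described above. Writing a new number into either file as root changes the deadline.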

4) AS (anticipatory I/O scheduler)

Features:
It is essentially the same as deadline, except that after the last read operation it waits 6 ms before scheduling other I/O requests.
This lets it anticipate a new read request from the application, which improves read performance at the cost of some write operations.
It inserts new I/O operations in each 6 ms window and merges small write streams into one large write stream, trading write latency for maximum write throughput.
AS suits write-heavy environments, such as file servers.
AS performs poorly in database environments.

 

3) View and Set the I/O Scheduler

1) View the I/O scheduler currently in use

[root@test1 tmp]# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]
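
The active scheduler is the entry shown in square brackets. For scripting, a small helper can pull that name out of the line. This is a minimal sketch; the function name active_scheduler is made up for this example.

```shell
#!/bin/sh
# Print the scheduler name enclosed in [brackets] in a scheduler line.
# $1: the text read from /sys/block/<device>/queue/scheduler
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

active_scheduler "noop anticipatory deadline [cfq]"
```

On a live system the argument would come from the file itself, for example active_scheduler "$(cat /sys/block/sda/queue/scheduler)".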

2) Temporarily change the I/O scheduler
For example, to switch to the noop elevator scheduling algorithm:
echo noop > /sys/block/sda/queue/scheduler
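
A slightly more defensive version of the same step checks that the requested name is actually offered by the device before writing it. This is only a sketch: the function name set_scheduler is made up for this example, the sda path is the same assumption as above, and writing to the real sysfs file requires root.

```shell
#!/bin/sh
# Switch scheduler only if the requested name appears in the scheduler file.
# $1: scheduler name   $2: scheduler file (defaults to sda's)
set_scheduler() {
    name=$1
    file=${2:-/sys/block/sda/queue/scheduler}
    # Remove the brackets around the active entry, then match a whole word.
    case " $(sed 's/[][]//g' "$file") " in
        *" $name "*)
            echo "$name" > "$file"
            ;;
        *)
            echo "scheduler '$name' is not offered by $file" >&2
            return 1
            ;;
    esac
}
```

For example, set_scheduler deadline switches sda to deadline, while a misspelled name is rejected with an error instead of being written to the kernel.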

3) Permanently change the I/O scheduler
Modify the kernel boot parameters, adding elevator=<scheduler name>:
[root@test1 tmp]# vi /boot/grub/menu.lst
Change the kernel line to the following:
kernel /boot/vmlinuz-2.6.18-8.el5 ro root=LABEL=/ elevator=deadline rhgb quiet

After rebooting, check the scheduler:
[root@test1 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq
It is now deadline.

4) Test the I/O Scheduler

The test has three parts: read-only, write-only, and simultaneous read and write. Each part transfers a single 600 MB file in 2 MB blocks, for a total of 300 reads or writes.
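
The sizes work out as follows, assuming dd's bs=2M means 2 × 1024 × 1024 bytes (which is what the byte counts in the dd output below imply). This is a small arithmetic check, not part of the original test:

```shell
#!/bin/sh
# 300 blocks of 2 MiB each, as used by the dd commands in this test.
block_bytes=$((2 * 1024 * 1024))
count=300
total_bytes=$((block_bytes * count))

echo "$total_bytes bytes"
echo "$((total_bytes / 1024 / 1024)) MiB, about $((total_bytes / 1000000)) MB"
```

The total is 629145600 bytes, which is 600 when counted in units of 1024 × 1024 bytes and 629 when counted in units of 1000000 bytes. That is why the text says 600 MB while dd reports 629 MB.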

1) Test disk reads
[root@test1 tmp]# echo deadline > /sys/block/sda/queue/scheduler
[root@test1 tmp]# time dd if=/dev/sda1 of=/dev/null bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 6.81189 seconds, 92.4 MB/s

real    0m6.833s
user    0m0.001s
sys     0m4.556s

[root@test1 tmp]# echo noop > /sys/block/sda/queue/scheduler
[root@test1 tmp]# time dd if=/dev/sda1 of=/dev/null bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 6.61902 seconds, 95.1 MB/s

real    0m6.645s
user    0m0.002s
sys     0m4.540s

[root@test1 tmp]# echo anticipatory > /sys/block/sda/queue/scheduler
[root@test1 tmp]# time dd if=/dev/sda1 of=/dev/null bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 8.00389 seconds, 78.6 MB/s

real    0m8.021s
user    0m0.002s
sys     0m4.586s

[root@test1 tmp]# echo cfq > /sys/block/sda/queue/scheduler
[root@test1 tmp]# time dd if=/dev/sda1 of=/dev/null bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 29.8 seconds, 21.1 MB/s

real    0m29.826s
user    0m0.002s
sys     0m28.606s

Test results:
First, noop: it takes 6.61902 seconds, at 95.1 MB/s.
Second, deadline: it takes 6.81189 seconds, at 92.4 MB/s.
Third, anticipatory: it takes 8.00389 seconds, at 78.6 MB/s.
Fourth, CFQ: it takes 29.8 seconds, at 21.1 MB/s.
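
The speeds that dd reports can be re-derived from the byte count and the elapsed time. The sketch below assumes that MB here means 1000000 bytes, which is what matches the figures above; the function name throughput_mbs is made up for this example.

```shell
#!/bin/sh
# Compute throughput in MB/s (1 MB = 1000000 bytes), to one decimal place.
# $1: bytes transferred   $2: elapsed seconds
throughput_mbs() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

throughput_mbs 629145600 6.81189   # deadline read test
throughput_mbs 629145600 29.8      # CFQ read test
```

The two calls print 92.4 and 21.1, agreeing with the dd output for the deadline and CFQ read runs.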

2) Test disk writes
[root@test1 tmp]# echo cfq > /sys/block/sda/queue/scheduler
[root@test1 tmp]# time dd if=/dev/zero of=/tmp/test bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 6.93058 seconds, 90.8 MB/s

real    0m7.002s
user    0m0.001s
sys     0m3.525s

[root@test1 tmp]# echo anticipatory > /sys/block/sda/queue/scheduler
[root@test1 tmp]# time dd if=/dev/zero of=/tmp/test bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 6.79441 seconds, 92.6 MB/s

real    0m6.964s
user    0m0.003s
sys     0m3.489s

[root@test1 tmp]# echo noop > /sys/block/sda/queue/scheduler
[root@test1 tmp]# time dd if=/dev/zero of=/tmp/test bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 9.49418 seconds, 66.3 MB/s

real    0m9.855s
user    0m0.002s
sys     0m4.075s

[root@test1 tmp]# echo deadline > /sys/block/sda/queue/scheduler
[root@test1 tmp]# time dd if=/dev/zero of=/tmp/test bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 6.84128 seconds, 92.0 MB/s

real    0m6.937s
user    0m0.002s
sys     0m3.447s

Test results:
First, anticipatory: it takes 6.79441 seconds, at 92.6 MB/s.
Second, deadline: it takes 6.84128 seconds, at 92.0 MB/s.
Third, CFQ: it takes 6.93058 seconds, at 90.8 MB/s.
Fourth, noop: it takes 9.49418 seconds, at 66.3 MB/s.

3) Test simultaneous reads and writes

[root@test1 tmp]# echo deadline > /sys/block/sda/queue/scheduler
[root@test1 tmp]# dd if=/dev/sda1 of=/tmp/test bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 15.1331 seconds, 41.6 MB/s

[root@test1 tmp]# echo cfq > /sys/block/sda/queue/scheduler
[root@test1 tmp]# dd if=/dev/sda1 of=/tmp/test bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 36.9544 seconds, 17.0 MB/s

[root@test1 tmp]# echo anticipatory > /sys/block/sda/queue/scheduler
[root@test1 tmp]# dd if=/dev/sda1 of=/tmp/test bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 23.3617 seconds, 26.9 MB/s

[root@test1 tmp]# echo noop > /sys/block/sda/queue/scheduler
[root@test1 tmp]# dd if=/dev/sda1 of=/tmp/test bs=2M count=300
300+0 records in
300+0 records out
629145600 bytes (629 MB) copied, 17.508 seconds, 35.9 MB/s

Test results:
First, deadline: it takes 15.1331 seconds, at 41.6 MB/s.
Second, noop: it takes 17.508 seconds, at 35.9 MB/s.
Third, anticipatory: it takes 23.3617 seconds, at 26.9 MB/s.
Fourth, CFQ: it takes 36.9544 seconds, at 17.0 MB/s.

 

5) ionice

ionice changes a task's I/O scheduling class and priority, but it only takes effect with the CFQ scheduler.
Three examples illustrate what ionice does:
Real-time scheduling under CFQ, with a priority of 7
ionice -c1 -n7 dd if=/dev/sda1 of=/tmp/test bs=2M count=300 &
Default (best-effort) disk I/O scheduling, with a priority of 3
ionice -c2 -n3 dd if=/dev/sda1 of=/tmp/test bs=2M count=300 &
Idle disk scheduling, with a priority of 0 (the idle class has no priority levels, so the -n value has no effect)
ionice -c3 -n0 dd if=/dev/sda1 of=/tmp/test bs=2M count=300 &

Of ionice's three scheduling classes, real-time has the highest priority, followed by the default (best-effort) class, with idle last.
ionice has eight disk scheduling priority levels, from 0 (highest) to 7 (lowest).
Note that the disk scheduling priority is not the same thing as the nice priority of a process.
One is the priority of the process's I/O; the other is the priority of the process's CPU time.
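
The class and priority rules above can be captured in a small helper that builds the ionice options and rejects values out of range. This is a sketch only; the function name ionice_args is made up for this example. Following ionice(1), it omits the priority for the idle class, which takes no priority argument.

```shell
#!/bin/sh
# Print ionice options for a class (1=real-time, 2=best-effort, 3=idle)
# and a priority level (0 = highest ... 7 = lowest).
ionice_args() {
    class=$1
    level=$2
    case "$class" in
        1|2|3) ;;
        *) echo "class must be 1, 2 or 3" >&2; return 1 ;;
    esac
    case "$level" in
        [0-7]) ;;
        *) echo "priority must be between 0 and 7" >&2; return 1 ;;
    esac
    if [ "$class" -eq 3 ]; then
        # The idle class has no priority levels.
        echo "-c3"
    else
        echo "-c$class -n$level"
    fi
}

ionice_args 2 3
```

The output can be passed straight to ionice, for example ionice $(ionice_args 2 3) dd if=/dev/sda1 of=/tmp/test bs=2M count=300 &.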
