Linux I/O Introduction

Source: Internet
Author: User
Article Directory

    • 4 Memory
    • 5 I/O

4 Memory

4.1 Virtual Memory

The Linux kernel uses a virtual memory mechanism to extend memory onto disk. The kernel writes memory pages that are not currently in use out to disk to free up memory; when the data is needed again, it is reloaded into memory. The disk space used as virtual memory is called the swap space.

The read/write speed of a hard disk is much slower than that of memory, so using virtual memory slows programs down.
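On a Linux system, the configured swap areas and overall swap usage can be inspected directly from /proc; a minimal sketch:

```shell
# List the active swap areas (device, type, size, usage, priority)
cat /proc/swaps

# Total and free swap as seen by the memory subsystem
grep -E '^Swap(Total|Free):' /proc/meminfo
```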

The use of virtual memory is often regarded as a sign of a memory bottleneck.

Question: does the use of swap space indicate a memory bottleneck?

 

Kswapd and the Page Frame Reclaim Algorithm

When the system's available memory falls below a threshold (the page_low watermark), the kswapd daemon scans for memory that can be swapped out and tries to swap out 32 pages at a time. This process repeats until available memory reaches the page_high watermark. The swapped-out pages are stored in the swap space.
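The page_low/page_high thresholds mentioned above correspond to the per-zone min, low, and high watermarks, which can be read (in pages) from /proc/zoneinfo:

```shell
# Show each memory zone together with its min/low/high watermarks;
# kswapd wakes up when free pages fall below "low" and stops
# reclaiming once they reach "high"
grep -E '^Node|^ +(min|low|high) ' /proc/zoneinfo
```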

The algorithm kswapd uses to reclaim memory is called the page frame reclaim algorithm. The following memory types can be reclaimed:

    • Swappable - anonymous memory pages
    • Syncable - pages backed by a disk file
    • Discardable - static pages, discarded pages

 

Memory reclaim uses an LRU policy: pages that have not been used recently are reclaimed first.

Now let's answer the above question:

The use of swap space reflects reasonable memory management by Linux and does not by itself indicate a memory bottleneck.

The rate at which pages move in and out of the swap space, however, is an important indicator of a memory bottleneck.
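One way to watch that rate is the pswpin/pswpout counters in /proc/vmstat, which count pages swapped in and out since boot; sampling them twice gives the swap rate. A minimal sketch:

```shell
# Pages swapped in/out since boot
grep -E '^pswp(in|out) ' /proc/vmstat

sleep 1

# A second sample; the difference between samples is the per-second swap rate
grep -E '^pswp(in|out) ' /proc/vmstat
```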

5 I/O

I/O subsystem architecture

 

5.1 Page Cache

The page cache is the main disk-caching technology used by the Linux kernel. A disk cache is a software mechanism that lets the system keep in memory some of the data stored on disk, so that re-accessing that data no longer requires going to the disk.

When the kernel reads data from disk, if the data page is not yet cached, it reads the data from disk and fills it into the page cache. Because the page is then held in the cache, the process no longer needs to access the disk the next time it uses that page.

Before writing a page of data to disk, the kernel first checks whether the page is already in the cache; if not, it first fills the page data into the cache. The corresponding disk I/O is not performed immediately but is delayed slightly, giving the process a chance to modify the data further. This is the kernel's delayed-write mechanism.

Dirty Data Synchronization

After a process modifies data in the page cache, the page is marked as dirty, i.e. its PG_dirty flag is set. Linux allows writes of dirty data to the underlying block device to be deferred, a mechanism considered to significantly improve system I/O throughput.

Dirty data is written to the disk under the following conditions:

    1. Page cache space is insufficient.
    2. The data has been dirty for too long without being written back.
    3. A process forces updates to be synchronized to disk through a system call: sync(), fsync(), or fdatasync(). The msync() system call flushes dirty data in a memory mapping to disk.
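The amount of dirty data currently waiting for write-back can be seen in /proc/meminfo, and the sync command (which wraps the sync() system call) forces it out; a minimal sketch:

```shell
# Dirty: data waiting for write-back; Writeback: data being written right now
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Force all dirty pages out to disk
sync

# Dirty should now be at or near zero
grep -E '^(Dirty|Writeback):' /proc/meminfo
```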

 

Pdflush

The pdflush kernel threads are responsible for periodically scanning the dirty data in the cache and writing it back to disk at appropriate times. The timer interval defaults to 500 centiseconds (5 seconds); you can adjust this value through the /proc/sys/vm/dirty_writeback_centisecs file.

The number of pdflush threads is dynamically adjusted as needed:

    1. There are at least 2 and at most 8 pdflush threads; the current count can be read from /proc/sys/vm/nr_pdflush_threads.
    2. If no pdflush thread has been idle during the last second, a new one is created.
    3. If a pdflush thread has been idle for more than one second, one is removed.
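These values can be checked from /proc. Note that on kernels from 2.6.32 onward pdflush has been replaced by per-device flusher threads, so nr_pdflush_threads may no longer exist; the sketch below guards for that:

```shell
# Number of pdflush threads (absent on newer kernels)
cat /proc/sys/vm/nr_pdflush_threads 2>/dev/null \
    || echo "nr_pdflush_threads not present on this kernel"

# Wake-up interval for dirty-data write-back, in centiseconds (default 500)
cat /proc/sys/vm/dirty_writeback_centisecs
```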

 

Buffer/Cache

Cache - the page cache. When a process reads from or writes to disk, the page cache holds the disk data in memory, and pdflush is responsible for synchronizing it back to disk.

Buffer - the block buffer, which stores bio structures. The bio structure is the interface between the VFS and the block layer. In general, the block buffer is a cache layer sitting between the page cache and the disk driver.

The usage of system page cache can be analyzed through the monitoring data of buffer/cache.

From the perspective of file reads and writes, the buffer mostly caches file-management metadata, such as directory entries and inode information, while the cache caches file contents.

Since the CPU cannot directly process data on peripheral devices, the buffer is used to hold descriptive metadata such as where files are located, while the cache improves transfer performance by caching file contents.
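The current sizes of the two can be read from /proc/meminfo (free reports the same numbers in its buff/cache column):

```shell
# Buffers: block-device/metadata buffering; Cached: page cache (file contents)
grep -E '^(Buffers|Cached):' /proc/meminfo
```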

In terms of operating-system principles, the purpose of the buffer/cache is to reduce major page faults (MPFs) and turn them into minor page faults (MnPFs).

MPF

When the kernel needs to read data, it first looks in the CPU cache and then in physical memory. If the data is found in neither, a major page fault (MPF) is raised. An MPF can be seen as a request by the kernel to load disk data into memory.

MnPF

When the disk data has already been loaded into memory and the kernel accesses it again, a minor page fault (MnPF) is raised.

The following example shows the MPFs and MnPFs generated by the first and the second invocation of the same program.

First run:

    /usr/bin/time -v java
    Major (requiring I/O) page faults: 103
    Minor (Reclaiming a frame) page faults: 2356

Second run:

    /usr/bin/time -v java
    Major (requiring I/O) page faults: 0
    Minor (Reclaiming a frame) page faults: 2581


Forcibly flushing the page cache

This section describes how to force cached data to be synchronized to disk and dropped from memory; the result can be observed in the output of free afterwards.

Free the page cache:

    sync; echo 1 > /proc/sys/vm/drop_caches

Free dentries and inodes:

    sync; echo 2 > /proc/sys/vm/drop_caches

Free the page cache, dentries, and inodes:

    sync; echo 3 > /proc/sys/vm/drop_caches
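A minimal before/after sketch of the third variant (the write to drop_caches requires root, so it is guarded here):

```shell
# Page cache usage before dropping
grep -E '^(Buffers|Cached):' /proc/meminfo

# Write dirty data back first, then drop the caches (root only)
sync
sh -c 'echo 3 > /proc/sys/vm/drop_caches' 2>/dev/null \
    || echo "dropping caches requires root"

# Buffers/Cached should shrink noticeably if the drop succeeded
grep -E '^(Buffers|Cached):' /proc/meminfo
```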

Direct I/O

Some more complex programs (usually called self-caching applications) prefer to control the I/O transfer themselves. By opening files with the O_DIRECT flag set, their I/O data transfers bypass the page cache.

From the point of view of such applications, the page cache is harmful for the following reasons:

    1. The extra instructions needed to maintain the page cache reduce the efficiency of read() and write().
    2. read() and write() do not transfer data directly between the disk and user space; the transfer happens in two steps: between the disk and kernel space, and then between kernel space and user space.
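The effect can be tried from the shell with dd, whose oflag=direct opens the output file with O_DIRECT. The direct write fails on filesystems such as tmpfs that do not support O_DIRECT, so it is guarded here:

```shell
# Buffered write: goes through the page cache
dd if=/dev/zero of=./buffered.bin bs=1M count=16 status=none

# Direct write: bypasses the page cache via O_DIRECT
# (fails on filesystems that do not support O_DIRECT)
dd if=/dev/zero of=./direct.bin bs=1M count=16 oflag=direct status=none \
    || echo "O_DIRECT not supported on this filesystem"

rm -f ./buffered.bin ./direct.bin
```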

 

System parameters related to the page cache

    1. /proc/sys/vm/dirty_background_ratio
       The percentage of reclaimable memory ((cache + free) - mapped) that may be dirty before background write-back starts. The default is 10.
    2. /proc/sys/vm/dirty_ratio
       The percentage of total system memory that dirty data generated by a process may reach before pdflush is triggered to write the data back to disk. The default is 40.
    3. /proc/sys/vm/dirty_expire_centisecs
       Dirty data that has been resident in memory for longer than this value (in centiseconds) is written back by pdflush. The default is 3000 (30 seconds).
    4. /proc/sys/vm/dirty_writeback_centisecs
       The interval at which pdflush wakes up to process dirty-data write-back. The default is 500 (5 seconds).
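All four parameters can be read in one pass from /proc; a minimal sketch (writing new values requires root):

```shell
# Current write-back tuning values
for f in dirty_background_ratio dirty_ratio \
         dirty_expire_centisecs dirty_writeback_centisecs; do
    printf '%-26s %s\n' "$f" "$(cat /proc/sys/vm/$f)"
done
```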

