Linux I/O Introduction

Source: Internet
Author: User
Article Directory

    • 4 Memory
    • 5 I/O

4 Memory

4.1 Virtual Memory

The Linux kernel uses a virtual memory mechanism to extend memory onto disk. The kernel writes memory pages that are not currently in use out to disk to free up memory; when the data is needed again, it is reloaded into memory. The disk space used as virtual memory is called the swap space.

The read/write speed of a hard disk is much slower than that of memory, so using virtual memory slows programs down.
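On a Linux system, the configured swap areas and overall swap usage can be inspected directly from /proc; a minimal sketch:

```shell
# List the active swap areas (device, type, size, usage, priority)
cat /proc/swaps

# Total and free swap as seen by the memory subsystem
grep -E '^Swap(Total|Free):' /proc/meminfo
```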

The use of virtual memory is often regarded as a sign of a memory bottleneck.

Question: does the use of swap space indicate a memory bottleneck?

 

Kswapd and the Page Frame Reclaim Algorithm

When the system's available memory falls below a threshold (the page_low watermark), the kswapd daemon scans for memory that can be swapped out and tries to swap out 32 pages at a time. This process repeats until available memory reaches the page_high watermark. The swapped-out pages are stored in the swap space.
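The page_low/page_high thresholds mentioned above correspond to the per-zone min, low, and high watermarks, which can be read (in pages) from /proc/zoneinfo:

```shell
# Show each memory zone together with its min/low/high watermarks;
# kswapd wakes up when free pages fall below "low" and stops
# reclaiming once they reach "high"
grep -E '^Node|^ +(min|low|high) ' /proc/zoneinfo
```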

The algorithm kswapd uses to reclaim memory is called the page frame reclaim algorithm. The following memory types can be reclaimed:

    • Swappable - anonymous memory pages
    • Syncable - pages backed by a disk file
    • Discardable - static pages, discarded pages

 

Memory reclaim uses an LRU policy: pages that have not been used recently are reclaimed first.

Now let's answer the above question:

The use of swap space reflects reasonable memory management by Linux and does not by itself indicate a memory bottleneck.

The rate at which pages move in and out of the swap space, however, is an important indicator of a memory bottleneck.
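One way to watch that rate is the pswpin/pswpout counters in /proc/vmstat, which count pages swapped in and out since boot; sampling them twice gives the swap rate. A minimal sketch:

```shell
# Pages swapped in/out since boot
grep -E '^pswp(in|out) ' /proc/vmstat

sleep 1

# A second sample; the difference between samples is the per-second swap rate
grep -E '^pswp(in|out) ' /proc/vmstat
```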

5 I/O

I/O subsystem architecture

 

5.1 Page Cache

The page cache is the main disk-caching technology used by the Linux kernel. A disk cache is a software mechanism that lets the system keep in memory some of the data stored on disk, so that re-accessing that data no longer requires going to the disk.

When the kernel reads data from disk, if the data page is not yet cached, it reads the data from disk and fills it into the page cache. Because the page is then held in the cache, the process no longer needs to access the disk the next time it uses that page.

Before writing a page of data to disk, the kernel first checks whether the page is already in the cache; if not, it first fills the page data into the cache. The corresponding disk I/O is not performed immediately but is delayed slightly, giving the process a chance to modify the data further. This is the kernel's delayed-write mechanism.

Dirty Data Synchronization

After a process modifies data in the page cache, the page is marked as dirty, i.e. its PG_dirty flag is set. Linux allows writes of dirty data to the underlying block device to be deferred, a mechanism considered to significantly improve system I/O throughput.

Dirty data is written to the disk under the following conditions:

    1. Page cache space is insufficient.
    2. The data has been dirty for too long without being written back.
    3. A process forces updates to be synchronized to disk through a system call: sync(), fsync(), or fdatasync(). The msync() system call flushes dirty data in a memory mapping to disk.
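The amount of dirty data currently waiting for write-back can be seen in /proc/meminfo, and the sync command (which wraps the sync() system call) forces it out; a minimal sketch:

```shell
# Dirty: data waiting for write-back; Writeback: data being written right now
grep -E '^(Dirty|Writeback):' /proc/meminfo

# Force all dirty pages out to disk
sync

# Dirty should now be at or near zero
grep -E '^(Dirty|Writeback):' /proc/meminfo
```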

 

Pdflush

The pdflush kernel threads are responsible for periodically scanning the dirty data in the cache and writing it back to disk at appropriate times. The timer interval defaults to 500 centiseconds (5 seconds); you can adjust this value through the /proc/sys/vm/dirty_writeback_centisecs file.

The number of pdflush threads is dynamically adjusted as needed:

    1. There are at least 2 and at most 8 pdflush threads; the current count can be read from /proc/sys/vm/nr_pdflush_threads.
    2. If no pdflush thread has been idle during the last second, a new one is created.
    3. If a pdflush thread has been idle for more than one second, one is removed.
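These values can be checked from /proc. Note that on kernels from 2.6.32 onward pdflush has been replaced by per-device flusher threads, so nr_pdflush_threads may no longer exist; the sketch below guards for that:

```shell
# Number of pdflush threads (absent on newer kernels)
cat /proc/sys/vm/nr_pdflush_threads 2>/dev/null \
    || echo "nr_pdflush_threads not present on this kernel"

# Wake-up interval for dirty-data write-back, in centiseconds (default 500)
cat /proc/sys/vm/dirty_writeback_centisecs
```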

 

Buffer/Cache

Cache - the page cache. When a process reads from or writes to disk, the page cache holds the disk data in memory, and pdflush is responsible for synchronizing it back to disk.

Buffer - the block buffer, which stores bio structures. The bio structure is the interface between the VFS and the block layer. In general, the block buffer is a cache layer sitting between the page cache and the disk driver.

The usage of system page cache can be analyzed through the monitoring data of buffer/cache.

From the perspective of file reads and writes, the buffer mostly caches file-management metadata, such as directory entries and inode information, while the cache caches file contents.

Since the CPU cannot directly process data on peripheral devices, the buffer is used to hold descriptive metadata such as where files are located, while the cache improves transfer performance by caching file contents.
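The current sizes of the two can be read from /proc/meminfo (free reports the same numbers in its buff/cache column):

```shell
# Buffers: block-device/metadata buffering; Cached: page cache (file contents)
grep -E '^(Buffers|Cached):' /proc/meminfo
```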

In terms of operating-system principles, the purpose of the buffer/cache is to reduce major page faults (MPFs) and turn them into minor page faults (MnPFs).

MPF

When the kernel needs to read data, it first looks in the CPU cache and then in physical memory. If the data is found in neither, a major page fault (MPF) is raised. An MPF can be seen as a request by the kernel to load disk data into memory.

MnPF

When the disk data has already been loaded into memory and the kernel accesses it again, a minor page fault (MnPF) is raised.

The following example shows the MPFs and MnPFs generated by the first and the second invocation of the same program.

First run:

    /usr/bin/time -v java
    Major (requiring I/O) page faults: 103
    Minor (Reclaiming a frame) page faults: 2356

Second run:

    /usr/bin/time -v java
    Major (requiring I/O) page faults: 0
    Minor (Reclaiming a frame) page faults: 2581


Forcibly flushing the page cache

This section describes how to force cached data to be synchronized to disk and dropped from memory; the result can be observed in the output of free afterwards.

Free the page cache:

    sync; echo 1 > /proc/sys/vm/drop_caches

Free dentries and inodes:

    sync; echo 2 > /proc/sys/vm/drop_caches

Free the page cache, dentries, and inodes:

    sync; echo 3 > /proc/sys/vm/drop_caches
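A minimal before/after sketch of the third variant (the write to drop_caches requires root, so it is guarded here):

```shell
# Page cache usage before dropping
grep -E '^(Buffers|Cached):' /proc/meminfo

# Write dirty data back first, then drop the caches (root only)
sync
sh -c 'echo 3 > /proc/sys/vm/drop_caches' 2>/dev/null \
    || echo "dropping caches requires root"

# Buffers/Cached should shrink noticeably if the drop succeeded
grep -E '^(Buffers|Cached):' /proc/meminfo
```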

Direct I/O

Some more complex programs (usually called self-caching applications) prefer to control the I/O transfer themselves. By opening files with the O_DIRECT flag set, their I/O data transfers bypass the page cache.

From the point of view of such applications, the page cache is harmful for the following reasons:

    1. The extra instructions needed to maintain the page cache reduce the efficiency of read() and write().
    2. read() and write() do not transfer data directly between the disk and user space; the transfer happens in two steps: between the disk and kernel space, and then between kernel space and user space.
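The effect can be tried from the shell with dd, whose oflag=direct opens the output file with O_DIRECT. The direct write fails on filesystems such as tmpfs that do not support O_DIRECT, so it is guarded here:

```shell
# Buffered write: goes through the page cache
dd if=/dev/zero of=./buffered.bin bs=1M count=16 status=none

# Direct write: bypasses the page cache via O_DIRECT
# (fails on filesystems that do not support O_DIRECT)
dd if=/dev/zero of=./direct.bin bs=1M count=16 oflag=direct status=none \
    || echo "O_DIRECT not supported on this filesystem"

rm -f ./buffered.bin ./direct.bin
```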

 

System parameters related to the page cache

    1. /proc/sys/vm/dirty_background_ratio
       The percentage of reclaimable memory ((cache + free) - mapped) that may be dirty before background write-back starts. The default is 10.
    2. /proc/sys/vm/dirty_ratio
       The percentage of total system memory that dirty data generated by a process may reach before pdflush is triggered to write the data back to disk. The default is 40.
    3. /proc/sys/vm/dirty_expire_centisecs
       Dirty data that has been resident in memory for longer than this value (in centiseconds) is written back by pdflush. The default is 3000 (30 seconds).
    4. /proc/sys/vm/dirty_writeback_centisecs
       The interval at which pdflush wakes up to process dirty-data write-back. The default is 500 (5 seconds).
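All four parameters can be read in one pass from /proc; a minimal sketch (writing new values requires root):

```shell
# Current write-back tuning values
for f in dirty_background_ratio dirty_ratio \
         dirty_expire_centisecs dirty_writeback_centisecs; do
    printf '%-26s %s\n' "$f" "$(cat /proc/sys/vm/$f)"
done
```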

