Low-Level Computer Knowledge Supplement (VII): Page Cache Data Synchronization and Page Recycling Mechanism

Source: Internet
Author: User

This article covers the Linux page cache data synchronization and page recycling mechanisms. Data synchronization and page recycling are two independent concepts. Data synchronization addresses consistency between data in memory (the page cache) and data on the backing device. Page recycling (page reclaim, in kernel terminology) takes back physical pages that have already been allocated when memory runs short, so that enough clean pages are available to allocate to higher-priority work. Data synchronization can be triggered at any time, whereas page recycling is triggered when physical memory usage reaches a certain threshold.


Data synchronization means writing the dirty pages held in physical memory and the page cache back to the files on the backing device. Data synchronization can be invoked in two ways:

1. Periodic calls, mainly through the pdflush mechanism

2. Forced calls, such as the sync and fsync system calls. When the number of dirty pages grows too large, the kernel also forces data synchronization to keep the number of dirty pages under control, so that the I/O caused by data synchronization is as smooth as possible
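The amount of dirty data the kernel is currently tracking can be observed from user space. The following is a minimal sketch, assuming a Linux system whose /proc/meminfo exposes the Dirty and Writeback fields; the helper names parse_meminfo_kb and read_dirty_stats are made up for illustration and are not part of any kernel or library API.

```python
def parse_meminfo_kb(text, keys=("Dirty", "Writeback")):
    """Extract selected fields (in kB) from /proc/meminfo-style text."""
    result = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or name not in keys:
            continue
        # Lines look like "Dirty:   1234 kB"; keep the numeric field only.
        result[name] = int(rest.split()[0])
    return result


def read_dirty_stats(path="/proc/meminfo"):
    """Return the Dirty/Writeback counters, or {} if the file is unavailable."""
    try:
        with open(path) as f:
            return parse_meminfo_kb(f.read())
    except OSError:
        return {}


if __name__ == "__main__":
    print(read_dirty_stats())
```

Sampling these counters while a large file is being written gives a rough picture of dirty pages accumulating and then draining as writeback runs.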


pdflush is a set of kernel threads: in effect, the kernel maintains a pdflush thread pool and allocates pdflush threads according to the data synchronization load. One pdflush thread can serve one block device, so multiple pdflush threads serve multiple block devices. This prevents an excessive I/O load on one block device from delaying data synchronization for the others.

On kernels that still ship pdflush, running cat /proc/sys/vm/nr_pdflush_threads shows the number of pdflush threads currently running on the system


Two call paths lead into the kernel's synchronization code: data-integrity synchronization triggered by system calls such as sync (that is, synchronizing all dirty pages), and periodic flush synchronization triggered by pdflush. Two observations follow from the kernel functions they call:

1. The targets of data synchronization are mainly file system objects: the file system superblock, file inode metadata, and file inode data blocks.

2. Whether it is data-integrity synchronization or flush synchronization, the call paths eventually converge on the sync_sb_inodes function, which synchronizes all dirty inodes of a given superblock


Synchronizing all dirty inodes of a superblock would be quite inefficient if it meant traversing the full inode list each time to filter out the dirty ones. Instead, the kernel maintains a list of dirty inodes, reachable from the superblock through the super_block->s_dirty pointer, and synchronizes the inodes on that list in turn.
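The benefit of a separate dirty list can be illustrated with a toy model. This is only a sketch of the idea, not the kernel's data structures: the class and method names (SuperBlock, mark_inode_dirty, sync_inodes) are hypothetical, and "writeback" is reduced to clearing a flag.

```python
from collections import deque


class Inode:
    def __init__(self, ino):
        self.ino = ino
        self.dirty = False


class SuperBlock:
    def __init__(self):
        self.inodes = {}        # every inode, keyed by inode number
        self.s_dirty = deque()  # only the inodes waiting for writeback

    def get_inode(self, ino):
        if ino not in self.inodes:
            self.inodes[ino] = Inode(ino)
        return self.inodes[ino]

    def mark_inode_dirty(self, inode):
        # Queue the inode once; repeated writes do not add duplicates.
        if not inode.dirty:
            inode.dirty = True
            self.s_dirty.append(inode)

    def sync_inodes(self):
        """Write back every queued inode; return the inode numbers visited."""
        visited = []
        while self.s_dirty:
            inode = self.s_dirty.popleft()
            inode.dirty = False  # stand-in for the real writeback work
            visited.append(inode.ino)
        return visited


if __name__ == "__main__":
    sb = SuperBlock()
    for ino in range(1000):
        sb.get_inode(ino)
    sb.mark_inode_dirty(sb.get_inode(7))
    sb.mark_inode_dirty(sb.get_inode(3))
    print(sb.sync_inodes())
```

In this model the cost of a sync depends on the number of dirty inodes, not on the total number of inodes the superblock knows about.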


Synchronizing an inode consists of two parts, metadata synchronization and data block synchronization. The kernel provides a number of flags to control the details of the synchronization operation.


A comparison of the system calls that force synchronization:

sync: Synchronizes all dirty pages; this is data-integrity synchronization. POSIX allows it to return once the I/O requests have been placed on the request queue, without waiting for the disk operations to complete, so data can be lost if the disk fails in the meantime. The Linux man page notes that the Linux implementation does wait for I/O completion.

fsync: Synchronizes the metadata and data blocks of a single file, and returns only after the disk operation is complete, which ensures the reliability of the data

fdatasync: Synchronizes the data blocks of a single file (plus only the metadata needed to read them back), and returns only after the disk operation is complete, which ensures the reliability of the data

msync: Synchronizes dirty pages produced through mmap mappings
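These calls are reachable from Python's standard library, which makes for a compact demonstration. The sketch below assumes a Unix-like system; os.fsync, os.fdatasync, os.sync and mmap.flush are real standard-library wrappers, while the helper names write_durably and patch_via_mmap are made up for illustration.

```python
import mmap
import os
import tempfile


def write_durably(path, payload, data_only=False):
    """Write payload to path and block until the kernel reports it on disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)  # small payloads only; no partial-write loop
        if data_only and hasattr(os, "fdatasync"):
            os.fdatasync(fd)   # data blocks, plus metadata needed to read them
        else:
            os.fsync(fd)       # data blocks and inode metadata
    finally:
        os.close(fd)


def patch_via_mmap(path, offset, patch):
    """Modify a non-empty file through a shared mapping and flush the pages."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(patch)] = patch
            m.flush()          # wraps msync() on Unix


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "demo.bin")
        write_durably(p, b"hello world")
        patch_via_mmap(p, 0, b"HELLO")
        os.sync()              # system-wide: schedule all dirty pages
        with open(p, "rb") as f:
            print(f.read())
```

fdatasync is usually the cheaper choice when only the file contents matter, since it can skip metadata updates such as timestamps.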


The page recycling mechanism consists of three parts: flush, swap, and release.

Flush is similar to data synchronization: page cache pages that have a backing file are written back to disk so that the pages can be recycled

Swap mainly applies to pages with no backing file, such as anonymous mappings, private mappings, and memory allocated dynamically with malloc. These pages are written to a swap area on disk so that they can be recycled

Release mainly applies to read-only pages on the LRU lists. Under heavy memory pressure they are freed directly so that the pages can be recycled
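The three cases above can be summarized as a small decision function. This is a deliberately simplified sketch: the Page fields and the reclaim_action helper are hypothetical, and treating every clean file-backed page as releasable is an assumption made for illustration, not a description of the kernel's actual policy.

```python
from dataclasses import dataclass


@dataclass
class Page:
    file_backed: bool  # True if the page has a backing file on disk
    dirty: bool        # True if memory differs from the backing store


def reclaim_action(page):
    """Pick the recycling path for a page, following the three cases above."""
    if not page.file_backed:
        return "swap"     # anonymous memory: nowhere to write except swap
    if page.dirty:
        return "flush"    # write back to the backing file first
    return "release"      # clean file page: can be dropped and re-read later


if __name__ == "__main__":
    for page in (Page(False, True), Page(True, True), Page(True, False)):
        print(page, "->", reclaim_action(page))
```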


The kernel's page recycling mechanism mainly has to solve several problems:

1. Which recycling algorithm to use to get the most benefit

2. Which pages to recycle

3. How to organize the swap area, and how to access pages in the swap area

4. How to avoid page thrashing under heavy recycling pressure

