Computer Basics Supplement (VII): Page Cache Data Synchronization and Page Reclaim Mechanisms

Source: Internet
Author: User

This installment covers how the Linux page cache synchronizes data and reclaims pages. Data synchronization and page reclaim are two separate concepts. Data synchronization deals with consistency between the data cached in memory and the backing device. Page reclaim deals with how to free physical memory pages when there is not enough memory left to satisfy an allocation, so that enough clean pages are available for higher-priority work. Data synchronization can be triggered at various moments, while page reclaim is triggered when physical memory usage reaches a certain threshold.


Data synchronization means writing the dirty pages of the page cache in physical memory back to the files on the backing device. It can be invoked in two ways:

1. Periodically, mainly through the pdflush mechanism

2. Forcibly, for example through the sync or fsync system calls

When the number of dirty pages grows very large, the kernel also forces data synchronization to keep the number of dirty pages under control, so that the I/O caused by synchronization stays as smooth as possible.
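As a rough illustration of how the dirty-page load can be observed from user space, the sketch below parses the Dirty and Writeback fields of /proc/meminfo. The helper name parse_dirty_kb and the SAMPLE text are assumptions made for this example; only the two field names are taken from the kernel's meminfo output.

```python
def parse_dirty_kb(meminfo_text):
    """Return {'Dirty': kB, 'Writeback': kB} parsed from /proc/meminfo-style text."""
    wanted = {"Dirty", "Writeback"}
    result = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and name in wanted:
            # Lines look like "Dirty:   812 kB"; the first token is the size in kB.
            result[name] = int(rest.split()[0])
    return result


# Hypothetical sample, shaped like a few lines of /proc/meminfo.
SAMPLE = """MemTotal:       16308524 kB
Dirty:               812 kB
Writeback:             0 kB
WritebackTmp:          0 kB
"""

if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:  # present on Linux only
            text = f.read()
    except OSError:
        text = SAMPLE  # fall back to the sample on other systems
    print(parse_dirty_kb(text))
```

Dirty is the amount of memory waiting to be written back, and Writeback is the amount currently being written. Sampling both over time gives a feel for how smoothly the kernel drains dirty pages.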


pdflush is a set of kernel threads; in effect, the kernel maintains a pdflush thread pool and allocates pdflush threads according to the synchronization load. One pdflush thread can serve one block device, so multiple pdflush threads serve multiple block devices. This prevents a heavy I/O load on a single block device from holding up data synchronization for the other block devices.

Running cat /proc/sys/vm/nr_pdflush_threads shows the number of pdflush threads the system currently has running.
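The same value can be read programmatically. The following is a minimal sketch: the helper name read_vm_counter is an assumption for this example, and the function returns None when the file is missing, since kernels that no longer use pdflush may not provide /proc/sys/vm/nr_pdflush_threads at all.

```python
def read_vm_counter(path="/proc/sys/vm/nr_pdflush_threads"):
    """Return the integer stored in `path`, or None if it is missing or malformed."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # OSError: file absent or unreadable; ValueError: content is not an integer.
        return None


if __name__ == "__main__":
    count = read_vm_counter()
    if count is None:
        print("nr_pdflush_threads is not available on this system")
    else:
        print("pdflush threads:", count)
```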


Consider the kernel functions called for data-integrity synchronization triggered by system calls such as sync (that is, synchronizing all dirty pages) and for the periodic flush synchronization triggered by pdflush. Two points stand out:

1. The targets of data synchronization are mainly file system objects: the file system's super block, the inode metadata of files, and the data blocks of files.

2. Whether it is data-integrity synchronization or periodic flush synchronization, the call paths ultimately converge on the sync_sb_inodes function, which synchronizes all dirty inodes of a given super block.



If synchronizing all dirty inodes of a super block meant traversing the entire inode list each time and filtering out the dirty ones, it would be quite inefficient. In fact, the kernel maintains a dedicated list of dirty inodes, reachable through the s_dirty field of the super block (super_block->s_dirty). Synchronization only needs to walk this list and synchronize each inode on it in turn.
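The benefit of a separate dirty list can be shown with a toy model. This is not kernel code: the Python classes below are hypothetical and only borrow the names s_dirty and sync_sb_inodes from the text to show that synchronization touches the dirty inodes alone, however many inodes the super block holds.

```python
from collections import deque


class Inode:
    def __init__(self, ino):
        self.ino = ino
        self.dirty = False


class SuperBlock:
    def __init__(self):
        self.inodes = []        # every inode of the file system
        self.s_dirty = deque()  # only the dirty inodes, in the order they were dirtied

    def new_inode(self, ino):
        inode = Inode(ino)
        self.inodes.append(inode)
        return inode

    def mark_inode_dirty(self, inode):
        # Queue the inode once; dirtying it again does not add a duplicate.
        if not inode.dirty:
            inode.dirty = True
            self.s_dirty.append(inode)

    def sync_sb_inodes(self):
        """Write back every queued inode and return the inode numbers visited."""
        written = []
        while self.s_dirty:
            inode = self.s_dirty.popleft()
            inode.dirty = False  # stands in for writing metadata and data blocks
            written.append(inode.ino)
        return written


if __name__ == "__main__":
    sb = SuperBlock()
    nodes = [sb.new_inode(i) for i in range(1000)]
    sb.mark_inode_dirty(nodes[7])
    sb.mark_inode_dirty(nodes[42])
    print(sb.sync_sb_inodes())  # visits two inodes, not all 1000
```

The cost of a sync pass in this model is proportional to the number of dirty inodes rather than to the total number of inodes, which is the point the paragraph above makes.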


Synchronizing an inode consists of two parts: metadata synchronization and data block synchronization. The kernel provides many flag bits to fine-tune the details of the synchronization.


A few system calls that force synchronization:

sync: Synchronizes all dirty pages; this is data-integrity synchronization. As specified by POSIX, it may return once the I/O requests have been submitted to the request queue, without waiting for the disk operations to complete, so data can still be lost if the disk fails in the meantime.

fsync: Synchronizes the metadata and data blocks of a single file and returns only after the disk operation has completed, which ensures the data is reliable.

fdatasync: Synchronizes the data blocks of a single file, along with only the metadata needed to read them back, and returns only after the disk operation has completed, which ensures the data is reliable.

msync: Synchronizes the dirty pages produced through mmap mappings.
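These calls are exposed in Python's standard library, so their use can be sketched from user space. The helper names durable_write and mmap_update are assumptions for this example. os.fsync and os.fdatasync wrap the system calls of the same name (fdatasync is not available on every platform, hence the hasattr check), and mmap.flush wraps msync on Unix.

```python
import mmap
import os


def durable_write(path, data, data_only=False):
    """Write `data` to `path` and return only after it has been forced to disk."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        total = 0
        while total < len(data):
            # os.write may write fewer bytes than requested, so loop until done.
            total += os.write(fd, data[total:])
        if data_only and hasattr(os, "fdatasync"):
            os.fdatasync(fd)  # data blocks, plus metadata needed to read them
        else:
            os.fsync(fd)      # data blocks and file metadata
    finally:
        os.close(fd)


def mmap_update(path, offset, data):
    """Modify an existing, non-empty file through mmap and flush the dirty pages."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(data)] = data
            m.flush()  # msync on Unix: write the modified mapping back to the file


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "demo.bin")
        durable_write(p, b"hello world")
        mmap_update(p, 0, b"HELLO")
        with open(p, "rb") as f:
            print(f.read())
```

os.sync() is also available on Unix and corresponds to sync, but it affects every file system on the machine, so per-file calls are usually the better fit for application code.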


The page reclaim mechanism consists of three parts: flushing data out (flush), swapping (swap), and releasing (release).

Flushing is similar to data synchronization: page cache pages that have a backing file are written back to disk so that those pages can be reclaimed.

Swapping is mainly for memory pages that have no backing file, such as anonymous mappings, private mappings, and memory dynamically allocated with malloc. These pages are written out to a swap area on disk so that they can be reclaimed.

Releasing mainly applies to read-only pages on the LRU lists, which are freed directly under high memory pressure so that the pages can be reclaimed.
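The three cases can be summarized as a small decision function. This is a deliberate simplification for illustration, not the kernel's actual reclaim logic; the function name reclaim_action and its two boolean parameters are assumptions for this example.

```python
def reclaim_action(file_backed, dirty):
    """Pick the reclaim path for a page in this simplified three-way model."""
    if not file_backed:
        # No backing file (anonymous memory): contents must go to the swap area.
        return "swap"
    if dirty:
        # Backed by a file but modified: write it back before reusing the page.
        return "flush"
    # Backed by a file and unmodified: the page can simply be dropped.
    return "release"


if __name__ == "__main__":
    for file_backed, dirty in [(True, True), (True, False), (False, True)]:
        print(file_backed, dirty, "->", reclaim_action(file_backed, dirty))
```

In this model a clean file-backed page is the cheapest to reclaim because nothing has to be written anywhere; the other two paths both cost disk I/O.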


The kernel's page reclaim mechanism mainly has to solve several problems:

1. Which reclaim algorithm yields the greatest benefit

2. Which pages to reclaim

3. How to organize the swap area and how to access pages within it

4. How to avoid page thrashing when reclaim pressure is high


Copyright notice: This is an original article by the blog author and may not be reproduced without consent.
