lock_page on node pages in F2FS

Last Update:2016-05-14 Source: Internet

Author: User

lock_page on a node page first of all protects changes to the state of the page: set_page_dirty, and also the set_nid operation, which sets a nid in the parent node. One may still ask whether a whole node page is the right granularity for such a small update!

To begin with, no user-space process operates on a node page directly, because nodes are invisible to user space. The race conditions that lock_page on a node page has to handle are therefore the following:

Nodes fall into two categories: direct nodes (dnodes) and indirect nodes. For a dnode, the two paths that have to be dealt with are GC and f2fs_write_data_page.

lock_page on a dnode provides the mutual exclusion between GC and f2fs_write_data_page.

GC looks like a very simple process: find the file's node, then move the data directly (move_data). In fact, GC turns out to be the final boss of the whole F2FS file system! It hides one very important exclusion requirement: GC and truncate must be mutually exclusive!

The whole GC process is driven by the SSA. For each block, the SSA records the nid of the dnode that owns it, and through this nid GC finds the ino that the nid belongs to.

Since an SSA entry is not modified again once it has been written back, the most important companion to the SSA when reclaiming data is the SIT bitmap. If the file has not been touched since, all is well. Once the file is written, the write causes an out-of-place update and the old block becomes invalid, but the SIT check catches that case. The other problem is truncation, whether of part of the data or of the whole file: the block address then no longer means anything, yet the SSA entry still holds the old data!
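To make the out-of-place (offsite) update concrete, here is a minimal toy model in Python. It is not F2FS code: the class `ToyLog`, its fields, and the use of block 0 as `NULL_ADDR` are assumptions made for illustration. It only shows that a rewrite clears the validity bit of the old block while the old summary entry stays as it was written.

```python
NULL_ADDR = 0  # assumed sentinel for "no block yet" in this toy model


class ToyLog:
    """Append-only log with a SIT-like validity bitmap and SSA-like summaries."""

    def __init__(self, nblocks):
        self.valid = [False] * nblocks    # SIT-like bitmap: is the block live?
        self.summary = [None] * nblocks   # SSA-like entry: (nid, ofs_in_node)
        self.next_free = 1                # block 0 is reserved as NULL_ADDR

    def write(self, nid, ofs, old_addr):
        """Write one block out of place and invalidate the old copy."""
        new_addr = self.next_free
        self.next_free += 1
        self.valid[new_addr] = True
        self.summary[new_addr] = (nid, ofs)
        if old_addr != NULL_ADDR:
            # Only the bitmap changes; the old summary entry is left as it is.
            self.valid[old_addr] = False
        return new_addr
```

After two writes to the same (nid, ofs), the first block is invalid in the bitmap but its summary entry still names the same owner, which is exactly why GC cannot trust the summary alone.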

How do you tell whether the inode still owns this data?

First, use the nid in the summary to get the node page, which holds rich information (that is a topic for another time). Then use the nid to look it up in the NAT entry radix tree (nat_root). From this tree we get richer information about the nid, including its blkaddr, ino, version, and flag. The flag here refers to things like the checkpointed state, which is used by the fsync and recovery mechanisms! In other words, starting from the nid in the SSA, all the basic information about the file is within reach!
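The lookup described above can be sketched as follows. This is a toy model, not the kernel's data structure: a plain dict stands in for the radix tree, and the names `NodeInfo`, `nat_cache`, `get_node_info`, and the flag value `IS_CHECKPOINTED` are assumptions chosen for the example.

```python
from dataclasses import dataclass

IS_CHECKPOINTED = 0x1  # assumed flag bit, for illustration only


@dataclass
class NodeInfo:
    nid: int       # node id
    ino: int       # inode number that owns this node
    blk_addr: int  # block address where the node currently lives
    version: int   # bumped when the node is deleted
    flag: int      # state bits such as "checkpointed"


# A dict keyed by nid stands in for the radix tree of NAT entries.
nat_cache = {}


def get_node_info(nid):
    """Return the cached NodeInfo for nid, or None if the nid is unknown."""
    return nat_cache.get(nid)
```

Given only the nid taken from a summary entry, `get_node_info(nid)` yields the owning ino, the current address of the node, and the version used by the liveness checks discussed next.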

So here are the cases to consider:

1) The file has been updated out of place. The nid is still valid and still in use, but the index stored in the node page has changed, so we need to check whether node_page[ofs] is equal to blk_addr.

2) The file has been truncated. There are two sub-cases, depending on whether the dnode of the data block has been deleted. If not, this reduces to case 1). If it has been deleted, the version of this nat_entry is incremented by 1 and the dnode is gone, so there is not even an address left to compare! [So lock_page seems to be right after all, and the long-standing mystery of what version is for has finally been solved!]

3) The entire file has been truncated away. This reduces to case 2).

There should be no other cases, because the F2FS code itself only mentions 1) and 2). I would rather not think about it any further, it is exhausting!
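The cases above can be written down as one small liveness check. This is a hedged sketch in Python, not the kernel's is_alive: the function signature, the dict-based `nat` and `node_pages`, and the tuple layout of `summary` are assumptions for illustration. It assumes the summary stores the dnode version seen when the block was written.

```python
def is_alive(nat, node_pages, summary, blkaddr):
    """Toy liveness check for one data block.

    nat:        nid -> {"version": int}         (assumed shape)
    node_pages: nid -> list of block addresses  (assumed shape)
    summary:    (nid, version, ofs_in_node) recorded when blkaddr was written
    """
    nid, sum_version, ofs = summary
    entry = nat.get(nid)
    if entry is None:
        return False                 # no such node any more
    if entry["version"] != sum_version:
        return False                 # cases 2) and 3): the dnode was deleted
    node_page = node_pages.get(nid)
    if node_page is None:
        return False
    # Case 1): the dnode exists, so compare its index with the block address.
    return node_page[ofs] == blkaddr
```

A block is reported live only when the version still matches and the dnode index still points at it; a changed index or a bumped version both make the block dead.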

But we are not done! A tougher boss is still waiting: how the mutual exclusion is actually handled.

The code related to mutual exclusion has two parts. First, look at the GC part:

The GC code starts by taking the dnode lock (lock_page). This is important, because out-of-place update and truncate also begin by trying to take the dnode lock before carrying out their work (out-of-place update: do_write_data_page; truncate: truncate_dnode)! The data update is done between these two parts!

The code as a whole is still debatable, though. Inside the is_alive function, the inode check is done while the dnode is held under lock_page, but the function returns, releasing the lock, as soon as the check is finished. Suppose do_write_data_page or truncate_dnode happens right at that moment: the result of the is_alive check is then already stale. In practice, the is_alive step is there to filter out as much useless work as possible up front!
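The pattern discussed here (check under the lock, drop the lock, then act later) can be modelled in a few lines. This is a toy sketch, not F2FS code: `threading.Lock` stands in for lock_page on the dnode page, and `Dnode`, `check_alive`, `truncate_block`, and `move_block` are hypothetical names. The recheck inside `move_block` is an assumption about how a stale pre-check can be made harmless.

```python
import threading


class Dnode:
    def __init__(self, addrs):
        self.lock = threading.Lock()  # stands in for lock_page on the dnode page
        self.addrs = list(addrs)      # block address index stored in the dnode


def check_alive(dnode, ofs, blkaddr):
    """Pre-filter: the answer may be stale once the lock is released."""
    with dnode.lock:
        return dnode.addrs[ofs] == blkaddr


def truncate_block(dnode, ofs):
    """Truncate path: takes the same dnode lock before touching the index."""
    with dnode.lock:
        dnode.addrs[ofs] = 0


def move_block(dnode, ofs, blkaddr, new_addr):
    """GC move: recheck under the lock, then update the index."""
    with dnode.lock:
        if dnode.addrs[ofs] != blkaddr:
            return False              # a write or truncate won the race
        dnode.addrs[ofs] = new_addr
        return True
```

If a truncate slips in between `check_alive` and `move_block`, the move returns False and the index is left alone, so the early check only saves work and is never relied on for correctness.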

-----------------

At this point, the study of essentially all the metadata in the F2FS file system is complete!

So, looking at locking in the file system from a high level, here are some notes:

1) The file system exposes external interfaces for users: write/read/truncate/mkdir/unlink/fallocate and so on. These operations are protected by inode->i_mutex. This lock is very coarse-grained and takes care of many of our concerns. inode->i_mutex provides atomicity at the level of user file operations. This is very important: these operations update file metadata, so two of them must never enter this shared region at the same time!

2) Many important metadata changes are triggered through these external file system interfaces. In F2FS, for example, truncate involves wide-ranging changes to the node data in the nat_entry tree and to the invalidated indexes in node pages.

3) The write_back process involves the following points. ① Metadata is updated: the SIT, the SSA, and global data such as the number of valid blocks all change, so f2fs_lock_op is needed to guarantee mutual exclusion with write_checkpoint! ② The data page is already held under lock_page, so the truncate operation we fear most is not a concern, because truncate will not be able to lock this page. At this moment no other process will touch the page, so nothing will touch the page's dnode either, and the page is safe. By this time the dnode page has already been created, and the dnode should be safe as well. [Could there be a state where it is not? Doesn't truncate always get the dnode first and then truncate the data inside it? It should not be a problem: truncate first truncates all the pages in the page cache!]

4) A truncate may well be triggered immediately after write_back. That is rather awkward, because what was just written back is now invalid, but it does not affect the behavior of the file system!
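Point 3) says that f2fs_lock_op keeps ordinary operations and write_checkpoint apart. One way to picture this is a readers-writer lock: many operations may hold it together, while the checkpoint needs it alone. The sketch below is a toy Python model under that assumption, not kernel code. The class name `CpLock` and its `blocking` parameter are hypothetical, and the simple model does not try to be fair to waiting writers.

```python
import threading


class CpLock:
    """Toy readers-writer lock: operations share it, a checkpoint owns it alone."""

    def __init__(self):
        self._cond = threading.Condition()
        self._ops = 0            # number of operations currently inside
        self._checkpoint = False

    def lock_op(self, blocking=True):
        with self._cond:
            while self._checkpoint:
                if not blocking:
                    return False
                self._cond.wait()
            self._ops += 1
            return True

    def unlock_op(self):
        with self._cond:
            self._ops -= 1
            if self._ops == 0:
                self._cond.notify_all()

    def lock_all(self, blocking=True):
        with self._cond:
            while self._checkpoint or self._ops:
                if not blocking:
                    return False
                self._cond.wait()
            self._checkpoint = True
            return True

    def unlock_all(self):
        with self._cond:
            self._checkpoint = False
            self._cond.notify_all()
```

With this shape, two write-back paths can update SIT and SSA state concurrently, but a checkpoint cannot start until both have left, and no new operation can enter while the checkpoint runs.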

One more thing suddenly comes to mind, so let me ramble about it here: during write_checkpoint, how is the consistency of the file system guaranteed? First, all dirty dentries are written back, then the dirty node pages, then some basic metadata in the same way, and then the remaining basic data, including the NAT, SIT, SSA, and other such information. But at this point there is a very awkward problem: for the nodes recorded in the NAT, many of the indexes in the dnodes are NEW_ADDR (-1), which means the data has not been written back yet!

First, the NAT is not the problem: during write_checkpoint, node pages are forcibly flushed, so every allocated nid has its corresponding node on disk, and this part is fine. A dnode, however, may still contain NEW_ADDR entries. That means the power was cut before the last data was written back!
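As a small illustration of the last point, the sketch below scans a dnode's index for NEW_ADDR entries. It is a toy model in Python: `classify` and `unwritten_offsets` are hypothetical helpers, NEW_ADDR is written as the unsigned 32-bit view of -1, and treating 0 as `NULL_ADDR` (a hole) is an assumption added for the example.

```python
NULL_ADDR = 0           # assumed: no block allocated (a hole)
NEW_ADDR = 0xFFFFFFFF   # -1 viewed as an unsigned 32-bit block address


def classify(addr):
    """Describe what one dnode index entry means in this toy model."""
    if addr == NULL_ADDR:
        return "hole"
    if addr == NEW_ADDR:
        return "reserved, data not written back"
    return "on disk"


def unwritten_offsets(dnode_addrs):
    """Return the offsets whose data never reached the disk."""
    return [ofs for ofs, addr in enumerate(dnode_addrs) if addr == NEW_ADDR]
```

For a dnode index such as `[100, NEW_ADDR, 0, NEW_ADDR]`, offsets 1 and 3 are the ones whose data was still pending when the node page was flushed.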

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only.
