Cache Mechanism of Network File System

Source: Internet
Author: User

Caching in network file systems
OSR staff | published: 09-may-03 | modified: 09-may-03

Typically, it is possible (and quite easy) for a file system filter driver to determine the caching policy of a local file system such as NTFS or FAT by simply examining the state of the I/O Request Packet (IRP). The IRP_NOCACHE bit in the Flags field tells the file system (and, of course, the filter) that the file I/O in question is not to be cached. Normally, this is the clue to the file system driver that this data should not be cached.
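As a rough illustration of the flag test described above, the following user-mode sketch mimics the check a filter would make. IRP_NOCACHE carries the value defined in the WDK headers, but the SIM_IRP structure and the is_noncached_request helper are hypothetical stand-ins; a real filter would read the Flags field of the kernel's IRP.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Value of IRP_NOCACHE as defined in wdm.h. */
#define IRP_NOCACHE 0x00000001u

/* Hypothetical stand-in for the kernel IRP; only the Flags field is modeled. */
typedef struct SIM_IRP {
    uint32_t Flags;
} SIM_IRP;

/* Returns true when the request asks the file system to bypass the cache. */
static bool is_noncached_request(const SIM_IRP *irp)
{
    assert(irp != NULL);
    return (irp->Flags & IRP_NOCACHE) != 0;
}
```

For a local file system this single test is usually enough; the paragraphs that follow explain why it is not sufficient for network redirectors.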

Network file systems are a bit more complex than this. While they also use the IRP_NOCACHE bit, they may also need to disable caching as a result of their own internal policy, perhaps directed by the state of the remote file on the file server, as well as by other clients in the network that might be using the file. rdbss.sys, which implements part of the "mini-redirector" model, allows the redirector (for example mrxsmb.sys, the driver that implements CIFS or LAN Manager functionality in Windows 2000 and more recent) to change the caching policy on a per-file basis. In this case, a normal IRP_MJ_READ IRP, which would normally be cached, may be treated as non-cached.

For a filter driver that is modifying the data (for example, a transparent encryption/decryption driver for file contents), the usual technique is to look for and operate on non-cached I/O operations. This captures both paging I/O operations and user-level non-cached I/O operations. However, if the filter also wishes to filter any of the mini-redirectors (two ship in Windows XP, for example), it needs to look at the fields of the file control block (FCB).

For most file systems, the format of this structure is mostly under the control of the file system (except for the common header structure), but for mini-redirectors the format of the file control block is defined by the mini-redirector model. See mrxfcb.h in the IFS kit for the full definition. The key data structure here (for a filter) is the MRX_FCB. Its FcbState field indicates whether the current state of the file is cached or non-cached. If the file allows caching, the FCB_STATE_READCACHEING_ENABLED bit will be set. Otherwise, I/O to the given file will be treated as non-cached.
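A minimal sketch of how the two signals combine for a mini-redirector file, written as ordinary user-mode C so it can be compiled anywhere. It assumes a heavily reduced model of MRX_FCB that keeps only FcbState, and the bit value used for the read-caching flag is a placeholder chosen for illustration; the real definition of FCB_STATE_READCACHING_ENABLED must come from mrxfcb.h. The names SIM_MRX_FCB and read_will_be_cached are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IRP_NOCACHE 0x00000001u  /* value from wdm.h */

/* Placeholder bit for illustration only; take the real definition of
 * FCB_STATE_READCACHING_ENABLED from mrxfcb.h in the IFS kit. */
#define SIM_FCB_STATE_READCACHING_ENABLED 0x00000001u

/* Hypothetical, heavily reduced model of MRX_FCB: only FcbState is kept. */
typedef struct SIM_MRX_FCB {
    uint32_t FcbState;
} SIM_MRX_FCB;

/* A read is served from the cache only if the IRP does not ask for
 * non-cached I/O and the redirector currently allows read caching. */
static bool read_will_be_cached(uint32_t irp_flags, const SIM_MRX_FCB *fcb)
{
    assert(fcb != NULL);
    if (irp_flags & IRP_NOCACHE) {
        return false;
    }
    return (fcb->FcbState & SIM_FCB_STATE_READCACHING_ENABLED) != 0;
}
```

Note that the result is only a snapshot: without holding the FCB resource, the state can change between this check and the moment the redirector processes the request.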

Note: In the Windows Server 2003 IFS kit the spelling of this flag has been changed so that it is now FCB_STATE_READCACHING_ENABLED.

While this allows a filter to determine the current state of the file, there does not appear to be any simple way for a filter to ensure that the state of this field does not change between the time the filter checks it and the time the call is actually processed by the file system. Thus, it is possible that the file state might change to disallow caching after this check is made. Similarly, if the check is done after the I/O has been processed, it is possible the file state might change to indicate that caching is now allowed once again. Sample code in the IFS kit (see smbmrx/wnet/sys/openclos.c) demonstrates one potential implementation model.

To prevent the state from changing, the caller must acquire the FCB resource; in order to avoid deadlocks while calling the redirector, it must be owned exclusive (using the ERESOURCE in the FCB itself). Again, doing this requires relying upon the implementation and published interface available in the IFS kit.

Note: This synchronization is only needed for user-level cached requests, since paging I/O and user-level non-cached requests will not be cached as a matter of course. This is important because this lock cannot be safely acquired when processing paging I/O: doing so would violate the existing lock hierarchy and introduce the possibility of deadlocks.
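The rule in the note can be written down as a small predicate. This is only a sketch of the decision, not of the locking itself: the flag values are those defined in wdm.h, while the function name must_lock_fcb_for_check is hypothetical, and a real filter would go on to acquire the ERESOURCE in the FCB exclusively when the predicate returns true.

```c
#include <stdbool.h>
#include <stdint.h>

#define IRP_NOCACHE   0x00000001u  /* value from wdm.h */
#define IRP_PAGING_IO 0x00000002u  /* value from wdm.h */

/* Decides whether the filter needs to hold the FCB resource while it
 * inspects the caching state for this request. */
static bool must_lock_fcb_for_check(uint32_t irp_flags)
{
    /* Paging I/O: never take the lock here, since doing so would violate
     * the lock hierarchy. The request is not cached anyway. */
    if (irp_flags & IRP_PAGING_IO) {
        return false;
    }
    /* User-level non-cached I/O: already non-cached, nothing to protect. */
    if (irp_flags & IRP_NOCACHE) {
        return false;
    }
    /* User-level cached request: the caching state can change underneath
     * the filter, so the check must be made under the lock. */
    return true;
}
```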

Eventually I figured out this was because network redirectors like to set an internal flag called SRVOPEN_FLAG_DONTUSE_WRITE_CACHEING when a file is opened for write-only, which causes the redirector to send all writes across the network as soon as it gets them, bypassing the NT cache. This means any layered filter will see the ordinary write request, but never a corresponding paging I/O request. To get around this, my filter now has to forcibly turn every write-only network file open into a read/write open.

The following describes how to convert a write-only file into a read/write file:
The reason why I'm ranting in public is that it seems I can never know a priori whether I will see a read or write request as both paging and non-paging I/O, or one or the other, for a given file system. Instead, I must special-case my code for each file system and pray that I've covered every scenario that can result in my not handling a read/write or handling it twice. The only alternative I can come up with is to force all reads/writes to a filtered file to be non-cached, with the corresponding performance penalties. Is there an elegant way out of this mess?

I think I sent a message about this before. The long and the short of it is that you cannot force the redirector to cache writes for files that are open write-only. You must instead tweak the file permissions to 'convert' a write-only open of a network file into a read/write open. Use code like the following to do so:

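/* Write-only open: neither read nor execute access requested, but write or append is. */
if ((0 == (DesiredAccess & (FILE_EXECUTE | FILE_READ_DATA))) &&
    (0 != (DesiredAccess & (FILE_WRITE_DATA | FILE_APPEND_DATA))))
{
    /* Also request read access so the open becomes read/write. */
    pIrpSp->Parameters.Create.SecurityContext->DesiredAccess |= FILE_READ_DATA;
}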
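The same access-mask adjustment can be tried outside the kernel as a pure function. The sketch below assumes the standard access-right values from winnt.h; the function name widen_write_only_access is hypothetical, and in a real filter the result would be written back to the DesiredAccess field of the create request's security context, as in the snippet above.

```c
#include <stdint.h>

/* File access rights with the values defined in winnt.h. */
#define FILE_READ_DATA   0x0001u
#define FILE_WRITE_DATA  0x0002u
#define FILE_APPEND_DATA 0x0004u
#define FILE_EXECUTE     0x0020u

/* Returns the access mask to request: a write-only mask gains
 * FILE_READ_DATA, every other mask is returned unchanged. */
static uint32_t widen_write_only_access(uint32_t desired_access)
{
    int wants_read  = (desired_access & (FILE_EXECUTE | FILE_READ_DATA)) != 0;
    int wants_write = (desired_access & (FILE_WRITE_DATA | FILE_APPEND_DATA)) != 0;

    if (!wants_read && wants_write) {
        desired_access |= FILE_READ_DATA;
    }
    return desired_access;
}
```

For example, a request for FILE_WRITE_DATA alone comes back as FILE_WRITE_DATA | FILE_READ_DATA, while a request that already includes FILE_READ_DATA or FILE_EXECUTE is left alone. Whether the server grants the extra read access is a separate question that this sketch does not address.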

The following explains why write-only files cannot be cached.
Write-only handle in redirector doesn't cache

--------------------------------------------------------------------------------

I got bitten by this a few months ago, so here is my take on the situation...

This is because the NT cache needs read access to the file in order to work. The cache works on page-sized chunks. If you open a file and write 1 byte at location 0, and caching is enabled, the NT cache will page in the memory page representing the first 4096 bytes of the file, which requires it to issue a paging I/O read for the first 4096 bytes. Then it will paste in your new byte and mark the page dirty, which causes the page to be written out later by the lazy writer.
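The arithmetic behind that example can be sketched as follows. The code assumes the 4096-byte page size used in the text and simply computes the page-aligned byte range that has to be read before a small cached write can be merged in; PAGE_RANGE and pages_touched are hypothetical names, and the real cache manager is considerably more involved.

```c
#include <assert.h>
#include <stdint.h>

#define SIM_PAGE_SIZE 4096u  /* page size used in the example above */

/* Page-aligned byte range covering a write. */
typedef struct PAGE_RANGE {
    uint64_t start;   /* offset of the first page touched */
    uint64_t length;  /* total bytes in all pages touched */
} PAGE_RANGE;

/* Computes the pages that must be resident (and therefore possibly read
 * from the file) for a write of `length` bytes at `offset`. */
static PAGE_RANGE pages_touched(uint64_t offset, uint64_t length)
{
    PAGE_RANGE range;
    uint64_t first_page;
    uint64_t last_page;

    assert(length > 0);
    first_page = offset / SIM_PAGE_SIZE;
    last_page = (offset + length - 1) / SIM_PAGE_SIZE;

    range.start = first_page * SIM_PAGE_SIZE;
    range.length = (last_page - first_page + 1) * SIM_PAGE_SIZE;
    return range;
}
```

Calling pages_touched(0, 1) yields a range starting at 0 with a length of 4096, which matches the single paging read described above; a 2-byte write at offset 4095 straddles two pages and yields 8192 bytes.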

For local files, paging I/O is allowed to bypass all security checks (since only trusted kernel components can issue paging I/O requests), so paging reads are allowed on all opens for all files. For network files, there is no reason the remote PC should 'trust' your PC and grant it read access if you only have write access to the file. Therefore the NT cache cannot be used on write-only remote files.

 
