Introduction to Windows 2000/XP File Cache implementation (http://webcrazy.yeah.net)

WebCrazy (http://webcrazy.yeah.net)

Disclaimer: This article describes cache management in Windows NT/2000/XP, and much of what it covers is undocumented by Microsoft. Although I analyzed a large amount of material, the content may contain errors, possibly serious ones. I am posting these notes here (http://webcrazy.yeah.net) for the purpose of learning and discussion, and I welcome feedback (tsu00@263.net) so that I can correct any errors later.

I remember being deeply impressed by how effective smartdrv.exe was under DOS, though I no longer remember when I became interested in that small program and studied it. That was my first impression of a file cache. The cache management component of Windows NT/2000/XP gave me an overall picture of how a modern operating system implements a file cache. Both "Inside Windows 2000" and "Windows NT File System Internals" left a deep impression on me, and I do not intend to dwell on what they already cover. I found that much of their treatment does not go into depth (although some points do), perhaps because there is little need to discuss implementation specifics that change with each new version of Windows. I have benefited from these two books and from many other materials, and I suggest you read them carefully before reading this article.

The main job of the cache is to speed up access to files (including remote network files) and other time-consuming I/O by keeping data in relatively fast memory. Since the objects being operated on are mainly files (many objects in Windows are handled as file objects at the lowest level), our description starts with the file object. Each file object carries a pointer to a SECTION_OBJECT_POINTERS structure (for the detailed definition, see FILE_OBJECT in ntddk.h or ntifs.h). I gave its definition in "Exploring the Windows NT/2000 Copy on Write Mechanism":

typedef struct _SECTION_OBJECT_POINTERS {
    PVOID DataSectionObject;   // Control Area
    PVOID SharedCacheMap;
    PVOID ImageSectionObject;  // Control Area
} SECTION_OBJECT_POINTERS;

Both DataSectionObject and ImageSectionObject point to an internal structure called the control area, and the control area in turn points to an internal structure called the segment. A segment consists of one or more prototype PTEs (PPTEs). A PPTE is a software structure: I mentioned PPTEs at the end of "Windows NT/2000 Paging Mechanism", where I also noted that a PPTE differs from the conventional hardware PTE introduced there. One purpose of the PPTE is to share pages. One or more PPTEs form a subsection. These structures can be examined through the output of the kernel debugger's !ca command.

SharedCacheMap is the structure the file cache actually uses. Unlike PrivateCacheMap, another member of FILE_OBJECT, SharedCacheMap has the same value in every FILE_OBJECT created by opening a handle to the same file, which is what makes it possible to share the file's cached data. Let's look at two FILE_OBJECT structures that pointed to shdocvw.dll at one moment on a running system (at addresses 80d37550 and 80db4860 respectively):

// First FILE_OBJECT pointing to shdocvw.dll:
:fobj 80d37550
Deviceobject *: 80ee2888
VPB *: 80ee2800
Fscontext *: e14d2d90
Fscontext2 *: e14d2ee8
Secobjpointer *: 80dd15f4
Privatecachemap *: 80ea70a0
.
.
.
Filename: \Windows\system32\shdocvw.dll
.
.
.

// Second FILE_OBJECT pointing to shdocvw.dll:
:fobj 80db4860
Deviceobject *: 80ee2888
VPB *: 80ee2800
Fscontext *: e14d2d90
Fscontext2 *: e1183678
Secobjpointer *: 80dd15f4
Privatecachemap *: 80d6f2d8
.
.
.
Filename: \Windows\system32\shdocvw.dll
.
.
.

From the SoftICE output above, it is easy to see that the SectionObjectPointer members point to the same memory, so SharedCacheMap must be the same value, while the two PrivateCacheMap values are completely different. Looking further, FsContext also holds the same value in both, while FsContext2 differs. In FSD implementations, the first of these two members generally points to the File Control Block (FCB; see the definition of FSRTL_COMMON_FCB_HEADER in ntifs.h) and the second to the Context Control Block (CCB). For specific usage, refer to the fastfat or cdfs sources in the IFS kit, which this article does not cover.
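The per-file versus per-open split seen in the two dumps can be modeled in user mode. This is only a sketch: the struct names `SectionObjectPointers` and `FileObject` and the helpers `same_shared_cache` and `shared_cache_map` are hypothetical stand-ins that keep only the members discussed here, not the real kernel layouts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for SECTION_OBJECT_POINTERS (one per file). */
typedef struct {
    void *DataSectionObject;
    void *SharedCacheMap;
    void *ImageSectionObject;
} SectionObjectPointers;

/* Simplified stand-in for FILE_OBJECT (one per open handle).
   Only the members compared in the dumps are modeled. */
typedef struct {
    void *FsContext;                             /* per file */
    void *FsContext2;                            /* per open */
    SectionObjectPointers *SectionObjectPointer; /* per file */
    void *PrivateCacheMap;                       /* per open */
} FileObject;

/* Returns the shared cache map reachable from a file object, or NULL. */
static void *shared_cache_map(const FileObject *fo)
{
    assert(fo != NULL);
    return fo->SectionObjectPointer != NULL
               ? fo->SectionObjectPointer->SharedCacheMap
               : NULL;
}

/* Two opens share cached data when they reference the same
   per-file section pointers, as both dumps of shdocvw.dll do. */
static bool same_shared_cache(const FileObject *a, const FileObject *b)
{
    assert(a != NULL && b != NULL);
    return a->SectionObjectPointer != NULL &&
           a->SectionObjectPointer == b->SectionObjectPointer;
}
```

Filling two `FileObject` values with the same `SectionObjectPointer` but different `PrivateCacheMap` pointers reproduces the pattern in the SoftICE output: `same_shared_cache` reports true while the private maps stay distinct.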

From this analysis of FILE_OBJECT we can conclude the following: through a FILE_OBJECT we can reach the file's one or two control areas; through a control area we can reach the segment and the subsections and PPTEs beneath it; and through a FILE_OBJECT we can also reach the SharedCacheMap shared by all open instances of the file. In fact, the SharedCacheMap also holds a pointer back to a FILE_OBJECT, as I explain below.

Some time ago, in "Analysis of Windows NT/2000 Heap Memory and Virtual Memory Organization", I defined this structure:

typedef struct vad {
    void *StartingAddress;
    void *EndingAddress;
    struct vad *ParentLink;
    struct vad *LeftLink;
    struct vad *RightLink;
    ULONG Flags;
    ULONG MMCI;
    ULONG ProtoPTE;
} VAD, *PVAD;

The MMCI member is in fact the control area, which I explained in "Exploring the Windows NT/2000 Copy on Write Mechanism". ProtoPTE is the PPTE.
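The parent, left, and right links make the VAD a binary tree node. A minimal sketch of looking up the VAD that covers an address follows, assuming the tree is ordered by address range, that ranges never overlap, and that EndingAddress is inclusive. The functions `vad_insert` and `vad_find` are hypothetical helpers, addresses are held as integers for easy comparison, and the Flags, MMCI, and ProtoPTE members are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Trimmed-down VAD node: only the range and the tree links. */
typedef struct vad {
    uintptr_t StartingAddress;
    uintptr_t EndingAddress;   /* assumed inclusive */
    struct vad *ParentLink;
    struct vad *LeftLink;
    struct vad *RightLink;
} VAD;

/* Unbalanced binary-search-tree insert keyed on the address range.
   Returns the (possibly new) root. */
static VAD *vad_insert(VAD *root, VAD *node)
{
    assert(node != NULL && node->StartingAddress <= node->EndingAddress);
    node->ParentLink = node->LeftLink = node->RightLink = NULL;
    if (root == NULL)
        return node;
    VAD *cur = root;
    for (;;) {
        VAD **next;
        if (node->EndingAddress < cur->StartingAddress) {
            next = &cur->LeftLink;
        } else {
            /* ranges must not overlap in this sketch */
            assert(node->StartingAddress > cur->EndingAddress);
            next = &cur->RightLink;
        }
        if (*next == NULL) {
            *next = node;
            node->ParentLink = cur;
            return root;
        }
        cur = *next;
    }
}

/* Find the VAD whose range contains addr, or NULL if none does. */
static const VAD *vad_find(const VAD *root, uintptr_t addr)
{
    const VAD *node = root;
    while (node != NULL) {
        if (addr < node->StartingAddress)
            node = node->LeftLink;
        else if (addr > node->EndingAddress)
            node = node->RightLink;
        else
            return node;
    }
    return NULL;
}
```

Once the matching node is found, its control area and PPTE members are what a fault handler would consult next; the sketch stops at the lookup itself.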

With these concepts, it is easy to see how PPTEs achieve page sharing. When the system first accesses a page described by a PPTE, bit 0 (the valid bit) is clear, so a page fault occurs. The fault is raised as int 0Eh, so control passes to KiTrap0E in ntoskrnl.exe, which hands it to MmAccessFault. By lookup, the handler then uses the control area and the PPTE to locate the entry in the PFN database, reads the content from disk, and updates the PTE so that the page is shared.
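The sharing step can be illustrated with a toy model. Everything here is an assumption made for illustration: the PTE encoding (frame number in the upper 20 bits, valid flag in bit 0), the `ProtoPte` and `ProcessPage` types, the frame counter, and the `touch_page` function. The real handler also goes through the VAD, the control area, and the PFN database, none of which is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_VALID 0x1u   /* bit 0: page is present */

/* Toy prototype PTE shared by every mapping of one file page. */
typedef struct {
    uint32_t pte;
} ProtoPte;

/* Toy per-process view of that page. */
typedef struct {
    uint32_t pte;        /* toy hardware PTE, initially invalid (0) */
    ProtoPte *proto;     /* prototype PTE backing this page */
} ProcessPage;

/* Hypothetical frame allocator: hands out increasing frame numbers. */
static uint32_t next_free_frame = 0x100;

static uint32_t frame_of(uint32_t pte)
{
    return pte >> 12;
}

/* Simulates an access. Returns true if a (simulated) fault was taken. */
static bool touch_page(ProcessPage *page)
{
    assert(page != NULL && page->proto != NULL);
    if (page->pte & PTE_VALID)
        return false;                       /* already mapped: no fault */
    if (!(page->proto->pte & PTE_VALID)) {
        /* First access system-wide: pretend to read the page from
           disk into a fresh frame and record it in the prototype. */
        page->proto->pte = (next_free_frame++ << 12) | PTE_VALID;
    }
    page->pte = page->proto->pte;           /* map the shared frame */
    return true;
}
```

Two `ProcessPage` values that reference one `ProtoPte` each fault once on first access and end up with the same frame number, which is the sharing effect described above.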

Having covered FILE_OBJECT in detail, we turn to the file cache. Windows 2000/XP reserves two regions of system virtual memory for the system cache, starting at the addresses held in MmSystemCacheStart and MiSystemCacheStartExtra. This dedicated space is divided into views of size VACB_MAPPING_GRANULARITY (defined in ntifs.h as 0x40000, that is, 256 KB). The use of each view is tracked by an internal structure called a VACB, so the system keeps an array of VACBs, pointed to by CcVacbs. "Inside Windows 2000" lists four members of the VACB, one of which is the important SharedCacheMap. The output of the !dso command:

kd> !dso vacb   // kernel debug extension build 2167 free
structure VACB - size: 0x18
000 BaseAddress     004 SharedCacheMap
008 Overlay         010 LruList

However, my analysis shows that a VACB in Windows XP actually consists of six DWORDs, the second of which is SharedCacheMap. Windows XP probably uses six DWORDs in order to support larger mapped files (the file offset is a LARGE_INTEGER). "Inside Windows 2000" also mentions a FileOffset member, but I do not know why the !dso output does not list it explicitly; perhaps it is a difference between versions. The following is an analysis under XP:

kd> dd CcVacbs l 1
80542fec 80ecc000
kd> dd 80ecc000 // dd VACB
80ecc000 cb440000 80eaac78 00300000 00000000
80ecc010 80eceda0 80ecf598 d4040000 80eaac78
80ecc020 01000000 00000000 80ecd6a8 80ecede8
80ecc030 c1100000 80eefed0 00000000 00000000
80ecc040 80ece050 80542fe0 deb40000 ffa55538
80ecc050 00000000 00000000 80ecd558 80ecc958

kd> dd 80eaac78 l 12 // dd VACB->SharedCacheMap
80eaac78 013002ff 00000001 01a95400 00000000
80eaac88 80eaac68 80eaaac0 01b00000 00000000
80eaac98 ffffffff 7fffffff ffffffff 7fffffff
80eaaca8 00000000 00000000 00000000 00000000
80eaacb8 80eaa910 80ee2a80
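The six-DWORD reading of the dump can be checked with a short sketch. The `Vacb32` layout below is an assumption that follows the !dso offsets (BaseAddress, SharedCacheMap, an 8-byte Overlay treated here purely as a file offset, and a two-pointer LruList); the type name, the member names `OverlayLow`, `OverlayHigh`, `LruFlink`, and `LruBlink`, and all helper functions are hypothetical. The data is the first two entries of the `dd 80ecc000` output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VACB_MAPPING_GRANULARITY 0x40000u   /* 256 KB per view */

/* Assumed 32-bit VACB layout: six DWORDs, 0x18 bytes. */
typedef struct {
    uint32_t BaseAddress;     /* start of the view in the system cache */
    uint32_t SharedCacheMap;
    uint32_t OverlayLow;      /* treated as the low half of the file offset */
    uint32_t OverlayHigh;     /* treated as the high half of the file offset */
    uint32_t LruFlink;
    uint32_t LruBlink;
} Vacb32;

/* First two VACBs from the dump at 80ecc000. */
static const uint32_t xp_vacb_dump[12] = {
    0xcb440000u, 0x80eaac78u, 0x00300000u, 0x00000000u, 0x80eceda0u, 0x80ecf598u,
    0xd4040000u, 0x80eaac78u, 0x01000000u, 0x00000000u, 0x80ecd6a8u, 0x80ecede8u,
};

/* Decode the index-th VACB from a flat array of dumped DWORDs. */
static Vacb32 vacb_at(const uint32_t *dwords, size_t index)
{
    assert(dwords != NULL);
    const uint32_t *d = dwords + index * 6;
    Vacb32 v = { d[0], d[1], d[2], d[3], d[4], d[5] };
    return v;
}

static uint64_t vacb_file_offset(const Vacb32 *v)
{
    return ((uint64_t)v->OverlayHigh << 32) | v->OverlayLow;
}

/* Which 256 KB view of the file a given offset falls in. */
static uint64_t view_index(uint64_t file_offset)
{
    return file_offset / VACB_MAPPING_GRANULARITY;
}

static int is_view_aligned(uint32_t address)
{
    return (address % VACB_MAPPING_GRANULARITY) == 0;
}
```

Under these assumptions the first two VACBs carry the same SharedCacheMap (80eaac78), so both views belong to one file, mapped at file offsets 0x300000 and 0x1000000, and both base addresses fall on 256 KB boundaries.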

As mentioned above, SharedCacheMap also holds a pointer to a FILE_OBJECT. I found it in the 18th DWORD; below is a dump of that FILE_OBJECT.
kd> dd 80ee2a80 l 6 // dd FILE_OBJECT
80ee2a80 00700005 80ee2888 80ee2800 80e7f9b0
80ee2a90 00000000 80ee23a4

The definition of FILE_OBJECT is given in ntddk.h, so its SECTION_OBJECT_POINTERS is easy to find.
kd> dd 80ee23a4 l 3 // dd FILE_OBJECT->SectionObjectPointer
80ee23a4 80e7f710 80eaac78 00000000
kd> !ca 80e7f710 // !ca SECTION_OBJECT_POINTERS->DataSectionObject

Controlarea @ 80e7f710
Segment: e127a548 flink 0 blink 0
Section Ref 1 PFN ref 275 mapped views 5f
User ref 0 waitfordel 0 flush count 0
File object 80ee2a80 modwritecount 0 system views 5f
Flags (8088) nomodifiedwriting file waspurged

File: \$MFT
.
.
.
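The pointer chase performed by hand above (SharedCacheMap to FILE_OBJECT to SECTION_OBJECT_POINTERS to control area) can be replayed over the captured dumps. This is a sketch under stated assumptions: the tiny "memory" holds only the three dumped regions, the `DumpRegion` type and both functions are hypothetical, the offset of the FILE_OBJECT pointer (18th DWORD) is the value observed on this XP system, and 0x14 is the x86 offset of SectionObjectPointer implied by the six-DWORD FILE_OBJECT dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One captured region of kernel memory: base address plus DWORDs. */
typedef struct {
    uint32_t base;
    size_t count;
    const uint32_t *dwords;
} DumpRegion;

static const uint32_t shared_cache_map_dump[18] = {
    0x013002ffu, 0x00000001u, 0x01a95400u, 0x00000000u,
    0x80eaac68u, 0x80eaaac0u, 0x01b00000u, 0x00000000u,
    0xffffffffu, 0x7fffffffu, 0xffffffffu, 0x7fffffffu,
    0x00000000u, 0x00000000u, 0x00000000u, 0x00000000u,
    0x80eaa910u, 0x80ee2a80u,
};
static const uint32_t file_object_dump[6] = {
    0x00700005u, 0x80ee2888u, 0x80ee2800u, 0x80e7f9b0u,
    0x00000000u, 0x80ee23a4u,
};
static const uint32_t section_pointers_dump[3] = {
    0x80e7f710u, 0x80eaac78u, 0x00000000u,
};

static const DumpRegion regions[] = {
    { 0x80eaac78u, 18, shared_cache_map_dump },
    { 0x80ee2a80u, 6, file_object_dump },
    { 0x80ee23a4u, 3, section_pointers_dump },
};

/* Offsets observed in the dumps (assumptions, not documented values). */
#define SCM_FILE_OBJECT_OFFSET      (17u * 4u)  /* 18th DWORD */
#define FO_SECTION_POINTER_OFFSET   0x14u       /* 6th DWORD */

/* Read one aligned DWORD from the captured regions; 0 if not captured. */
static uint32_t read_dword(uint32_t address)
{
    for (size_t i = 0; i < sizeof regions / sizeof regions[0]; i++) {
        const DumpRegion *r = &regions[i];
        if (address < r->base)
            continue;
        uint32_t delta = address - r->base;
        if (delta % 4u == 0 && delta / 4u < r->count)
            return r->dwords[delta / 4u];
    }
    return 0;
}

/* SharedCacheMap -> FILE_OBJECT -> SectionObjectPointer -> DataSectionObject */
static uint32_t data_section_from_shared_cache_map(uint32_t scm)
{
    assert(scm != 0);
    uint32_t file_object = read_dword(scm + SCM_FILE_OBJECT_OFFSET);
    uint32_t sop = read_dword(file_object + FO_SECTION_POINTER_OFFSET);
    return read_dword(sop);   /* DataSectionObject is the first member */
}
```

Starting from 80eaac78 the walk arrives at 80e7f710, the control area that !ca reports, and the second DWORD at 80ee23a4 leads back to the same SharedCacheMap, which closes the loop.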

This completes the dump of the first VACB in the system cache, the one CcVacbs points to. From the !ca output we can see that the first view of the system cache is used by the NTFS metadata file $MFT. The kernel debugger also provides a !filecache command, which the documentation I am using (WinDbg 6.0.20.7.0) describes as follows:

Each line of this extension's output represents a virtual address control block (VACB). When named files are mapped into the VACB, the names of these files are displayed. If "no name for file" is specified, this means that this VACB is being used to cache metadata.

From this description, I expected !filecache to traverse the VACB array and dump each control area, just as in the VACB analysis above. In fact, the command outputs more than that. Analyzing the implementation of !filecache in kdexts.dll (Windows XP) shows that it uses PsLoadedModuleList to produce its dump (the details need further study).

Finally, the system uses the CcIsFileCached macro to determine whether a file is cached. It is defined in ntifs.h and is easy to understand after the analysis above:

#define CcIsFileCached(FO) (                                                         \
    ((FO)->SectionObjectPointer != NULL) &&                                          \
    (((PSECTION_OBJECT_POINTERS)(FO)->SectionObjectPointer)->SharedCacheMap != NULL) \
)
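The macro's two checks can be exercised outside the kernel with mock types. In this sketch the structures are minimal user-mode stand-ins (a FILE_OBJECT reduced to the single member the macro reads), and `is_cached` is a hypothetical wrapper added only to give the macro a function-call form.

```c
#include <assert.h>
#include <stddef.h>

typedef void *PVOID;

/* User-mode stand-in with the same member names as the real structure. */
typedef struct MockSectionObjectPointers {
    PVOID DataSectionObject;
    PVOID SharedCacheMap;
    PVOID ImageSectionObject;
} SECTION_OBJECT_POINTERS, *PSECTION_OBJECT_POINTERS;

/* Mock FILE_OBJECT: only the member the macro touches. */
typedef struct MockFileObject {
    PSECTION_OBJECT_POINTERS SectionObjectPointer;
} FILE_OBJECT, *PFILE_OBJECT;

/* Same logic as the ntifs.h macro: a file is cached only when it has
   section pointers AND a shared cache map has been set up for it. */
#define CcIsFileCached(FO) (                                                         \
    ((FO)->SectionObjectPointer != NULL) &&                                          \
    (((PSECTION_OBJECT_POINTERS)(FO)->SectionObjectPointer)->SharedCacheMap != NULL) \
)

/* Hypothetical wrapper returning 1 for cached, 0 otherwise. */
static int is_cached(PFILE_OBJECT fo)
{
    assert(fo != NULL);
    return CcIsFileCached(fo) ? 1 : 0;
}
```

A file object with no section pointers, or with section pointers whose SharedCacheMap is still NULL, is reported as not cached; only once SharedCacheMap is non-NULL does the check succeed.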

This article does not cover another very important data structure, the PFN database, which is pointed to by MmPfnDatabase. Entries for shared pages in the PFN database record the PPTE. In fact, after a page fault on a page backed by a PPTE, once the VAD has been found, the system completes the remaining steps using the PPTE recorded in the PFN database.

What interests me most are the precise definitions of the control area, subsection, PPTE, VACB, SharedCacheMap, and so on, but so far I have only a rough idea of them. That is why I wrote this article, and I hope these modest notes prompt better contributions from others (tsu00@263.net).
