Using memory-mapped files to modify large files

Source: Internet
Author: User
Overview

This article describes how to use a memory-mapped file to modify a large file: specifically, how to add a piece of data in front of the existing contents of a large file. To use a memory-mapped file, perform the following steps:

  1. Create or open a file kernel object that identifies the file on disk that you want to use as a memory-mapped file;
  2. Create a file-mapping kernel object that tells the system the size of the file and how you plan to access it;
  3. Tell the system to map all or part of the file-mapping object into your process's address space.

When you have finished using the memory-mapped file, perform the following steps to clean up:

  1. Tell the system to unmap the file-mapping kernel object's view from your process's address space;
  2. Close the file-mapping kernel object;
  3. Close the file kernel object.

The following sections walk through the procedure using an example. (The goal of the example is to write some new content, followed by the contents of file A, into file B. You are likely to run into this situation in program development.)

Open the kernel object of file A and create the kernel object of file B

To create or open a file kernel object, always call the CreateFile function:

HANDLE CreateFile(
   PCTSTR pszFileName,
   DWORD dwDesiredAccess,
   DWORD dwShareMode,
   PSECURITY_ATTRIBUTES psa,
   DWORD dwCreationDisposition,
   DWORD dwFlagsAndAttributes,
   HANDLE hTemplateFile);

The CreateFile function has several parameters. Here we focus only on the first three: pszFileName, dwDesiredAccess, and dwShareMode. As you might guess, the first parameter, pszFileName, specifies the name of the file to create or open (including an optional path). The second parameter, dwDesiredAccess, specifies how you want to access the file. You can set one of the four values listed in the following table.

Value                          Description
0                              You cannot read from or write to the file's contents. Specify 0 when you only want to obtain the file's attributes.
GENERIC_READ                   You can read from the file.
GENERIC_WRITE                  You can write to the file.
GENERIC_READ | GENERIC_WRITE   You can read from the file and write to the file.

When you create or open a file to use as a memory-mapped file, select the access flag or flags that best describe how you plan to access the file's data. For memory-mapped files, the file must be opened for read-only access or for read/write access, so specify either GENERIC_READ or GENERIC_READ | GENERIC_WRITE.

The third parameter, dwShareMode, tells the system how you want to share the file. You can set one of the four values listed in the following table for dwShareMode:

Value                                Description
0                                    Any other attempt to open the file fails.
FILE_SHARE_READ                      Other attempts to open the file using GENERIC_WRITE fail.
FILE_SHARE_WRITE                     Other attempts to open the file using GENERIC_READ fail.
FILE_SHARE_READ | FILE_SHARE_WRITE   Other attempts to open the file succeed.

If CreateFile successfully creates or opens the specified file, it returns a handle to the file kernel object. Otherwise, it returns INVALID_HANDLE_VALUE. Note that most Windows functions that return a handle return NULL when they fail. CreateFile, however, returns INVALID_HANDLE_VALUE, which is defined as ((HANDLE) -1).

HANDLE hFile = CreateFile(".\\first.txt", GENERIC_READ, FILE_SHARE_READ, NULL,
   OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
HANDLE hmyfile = CreateFile("E:\\my.txt", GENERIC_READ | GENERIC_WRITE, 0, NULL,
   OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

Next, we need to create a file-mapping kernel object for each of the two files.

Calling CreateFile tells the operating system where the physical storage for the file mapping is located. The path name you pass identifies the exact location, on disk (or on the network or an optical disc), of the physical storage that backs the file mapping. Now you must tell the system how much physical storage the file-mapping object requires. To do this, call the CreateFileMapping function:

HANDLE CreateFileMapping(
   HANDLE hFile,
   PSECURITY_ATTRIBUTES psa,
   DWORD fdwProtect,
   DWORD dwMaximumSizeHigh,
   DWORD dwMaximumSizeLow,
   PCTSTR pszName);

The first parameter, hFile, identifies the handle of the file you want mapped into the process's address space. This is the handle returned by the earlier call to CreateFile. The psa parameter is a pointer to a SECURITY_ATTRIBUTES structure for the file-mapping kernel object. Usually NULL is passed (which provides default security and means the returned handle cannot be inherited).

As mentioned at the beginning of this chapter, creating a memory-mapped file is like reserving a region of address space and then committing physical storage to the region. The difference is that the physical storage for a memory-mapped file comes from a file on disk rather than from space allocated from the system's paging file. When you create a file-mapping object, the system does not reserve a region of address space and map the file's storage to that region (the next section describes how to do this). However, when the system does map the storage into your process's address space, it must know which protection attributes to assign to the pages of physical storage. The fdwProtect parameter of CreateFileMapping lets you specify these protection attributes. In most cases, you set one of the three protection attributes listed in the following table.

Protection attributes that can be set with the fdwProtect parameter

Protection attribute   Description
PAGE_READONLY          When the file-mapping object is mapped, you can read the file's data. You must have passed GENERIC_READ to CreateFile.
PAGE_READWRITE         When the file-mapping object is mapped, you can read and write the file's data. You must have passed GENERIC_READ | GENERIC_WRITE to CreateFile.
PAGE_WRITECOPY         When the file-mapping object is mapped, you can read and write the file's data. Writing causes a private copy of the page to be created. You must have passed either GENERIC_READ or GENERIC_READ | GENERIC_WRITE to CreateFile.

In Windows 98, you can pass the PAGE_WRITECOPY flag to CreateFileMapping, which tells the system to commit storage from the paging file. This paging-file storage is reserved for a copy of the data file's data, and only modified pages are actually written to the paging file. Changes you make to the data are never written back to the original data file. The end result is that the PAGE_WRITECOPY flag has the same effect in Windows 2000 and Windows 98.

In addition to the page protection attributes above, there are four section attributes that you can bitwise OR into the fdwProtect parameter of CreateFileMapping. A section is just another term for a memory mapping.

The first section attribute, SEC_NOCACHE, tells the system that none of the file's memory-mapped pages are to be cached. As a result, when you write data to the file, the system updates the file's data on disk more often than it normally would. This flag, like the PAGE_NOCACHE protection attribute, exists for device driver developers and is usually not used by applications. Windows 98 ignores the SEC_NOCACHE flag.

The second section attribute, SEC_IMAGE, tells the system that the file you are mapping is a portable executable (PE) file image. When the system maps this file into your process's address space, it examines the file's contents to determine which protection attributes to assign to the various pages of the mapped image. For example, a PE file's code section (.text) is usually mapped with the PAGE_EXECUTE_READ attribute, whereas the PE file's data section (.data) is usually mapped with the PAGE_READWRITE attribute. Specifying SEC_IMAGE tells the system to map the file's image and to set the appropriate page protections. Windows 98 ignores the SEC_IMAGE flag.

The last two section attributes, SEC_RESERVE and SEC_COMMIT, are mutually exclusive and do not apply when you are mapping a data file. These two flags are described later in this chapter. When you create a memory-mapped data file, do not specify either of them; CreateFileMapping ignores them.

The next two parameters of CreateFileMapping, dwMaximumSizeHigh and dwMaximumSizeLow, are the most important ones. The main purpose of CreateFileMapping is to ensure that enough physical storage is available for the file-mapping object, and these two parameters tell the system the maximum size of the file in bytes. Two 32-bit values are required because Windows supports file sizes that are expressed as a 64-bit value: dwMaximumSizeHigh specifies the high 32 bits, and dwMaximumSizeLow specifies the low 32 bits. For files smaller than 4 GB, dwMaximumSizeHigh is always 0.
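The split of a 64-bit size into its high and low halves can be sketched in portable C. The helper names split_size and join_size are hypothetical (they are not Windows APIs), and the fixed-width types stand in for DWORD and a 64-bit integer; this is a minimal sketch rather than production code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 64-bit byte count into the two 32-bit halves that
   CreateFileMapping expects (dwMaximumSizeHigh, dwMaximumSizeLow).
   split_size is a hypothetical helper, not a Windows API. */
void split_size(uint64_t size, uint32_t *high, uint32_t *low)
{
    assert(high != NULL && low != NULL);
    *high = (uint32_t)(size >> 32);          /* upper 32 bits */
    *low  = (uint32_t)(size & 0xFFFFFFFFu);  /* lower 32 bits */
}

/* Recombine the halves, for example after a size query that reports
   the low and high parts separately. */
uint64_t join_size(uint32_t high, uint32_t low)
{
    return ((uint64_t)high << 32) | low;
}
```

For a size of 0x140000000 bytes (5 GB) the helper yields a high half of 1 and a low half of 0x40000000. For any size below 2^32 bytes the high half is 0.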

Using a 64-bit value means that Windows can process files as large as 16 EB (exabytes; 1 EB is about 10^18 bytes). If you want to create the file-mapping object so that it reflects the current size of the file, pass 0 for both parameters. If you intend only to read from the file, or to access it without changing its size, pass 0 for both parameters. If you intend to append data to the file, choose a maximum file size that leaves you some breathing room. If the file on disk currently contains 0 bytes, you cannot pass two zeros for CreateFileMapping's dwMaximumSizeHigh and dwMaximumSizeLow parameters. Doing so tells the system that you want a file-mapping object with 0 bytes of storage in it, which is an error, and CreateFileMapping returns NULL.

If you have been paying close attention, you may think that something is seriously wrong here. It is nice that Windows supports files and file-mapping objects of up to 16 EB, but how can a file that large be mapped into the address space of a 32-bit process (a 32-bit address space tops out at 4 GB)? The next section describes how to solve this problem. Of course, a 64-bit process has a 16 EB address space, so it can work with much larger file mappings, but similar problems still arise if the file is extremely large.

To really understand how CreateFile and CreateFileMapping work, we recommend the following experiment. Build the code below and run it in a debugger. As you single-step through each statement, switch to a command shell and run the "dir" command in the C:\ directory, and watch how the directory listing changes after each statement executes.

int WINAPI _tWinMain(HINSTANCE hinstExe, HINSTANCE,
   PTSTR pszCmdLine, int nCmdShow) {

   // Before executing the line below, C:\ does not have
   // a file called "MMFTest.Dat."
   HANDLE hfile = CreateFile("C:\\MMFTest.dat",
      GENERIC_READ | GENERIC_WRITE,
      FILE_SHARE_READ | FILE_SHARE_WRITE, NULL, CREATE_ALWAYS,
      FILE_ATTRIBUTE_NORMAL, NULL);

   // Before executing the line below, the MMFTest.Dat
   // file does exist but has a file size of 0 bytes.
   HANDLE hfilemap = CreateFileMapping(hfile, NULL, PAGE_READWRITE,
      0, 100, NULL);

   // After executing the line above, the MMFTest.Dat
   // file has a size of 100 bytes.

   // Cleanup
   CloseHandle(hfilemap);
   CloseHandle(hfile);

   // When the process terminates, MMFTest.Dat remains
   // on the disk with a size of 100 bytes.
   return(0);
}

If you call CreateFileMapping and pass the PAGE_READWRITE flag, the system makes sure that the data file on disk is at least as large as the size specified in the dwMaximumSizeHigh and dwMaximumSizeLow parameters. If the file is smaller than the specified size, CreateFileMapping makes the file on disk larger. This growth is required so that the physical storage already exists when the file is later used as a memory-mapped file. If the file-mapping object is created with the PAGE_READONLY or PAGE_WRITECOPY flag, the size passed to CreateFileMapping must be no larger than the physical size of the disk file, because you will not be able to append any data to the file.
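The sizing rules described so far can be collected into a small model. This is only an illustration of the behaviour as the text states it, not the actual Windows implementation: the enum values and the mapping_size function are hypothetical names, and the return value 0 simply marks the cases where the text says CreateFileMapping fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the three page protections discussed above. */
enum model_protect { MODEL_READONLY, MODEL_READWRITE, MODEL_WRITECOPY };

/* Model of the sizing rules described in the text (not a Windows API).
   Returns 1 and stores the resulting mapping size in *mapped on success;
   returns 0 where the text says the call would fail. */
int mapping_size(enum model_protect prot, uint64_t file_size,
                 uint64_t requested, uint64_t *mapped)
{
    assert(mapped != NULL);
    if (requested == 0) {
        if (file_size == 0)
            return 0;            /* 0-byte file with a 0/0 size: error */
        *mapped = file_size;     /* 0/0 means "use the current file size" */
        return 1;
    }
    if (requested > file_size && prot != MODEL_READWRITE)
        return 0;                /* read-only and copy-on-write cannot grow the file */
    *mapped = requested;         /* with read/write access the file grows if needed */
    return 1;
}
```

With this model, a 0-byte file opened for read/write and a requested size of 100 gives a 100-byte mapping, which matches the MMFTest.dat experiment, while the same file with a requested size of 0 is rejected.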

The last parameter of CreateFileMapping, pszName, is a zero-terminated string that assigns a name to the file-mapping object. The name is used to share the object with other processes. (An example appears later in this chapter, and Chapter 3 describes kernel object sharing in detail.) A memory-mapped data file usually does not need to be shared, so this parameter is usually NULL.

The system creates the file-mapping object and returns a handle identifying the object to the calling thread. If the system cannot create the file-mapping object, a NULL handle value is returned. Remember that when CreateFile fails, it returns INVALID_HANDLE_VALUE (defined as -1), whereas when CreateFileMapping fails, it returns NULL. Do not confuse these error values.
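One way to keep the two failure values apart is to wrap each check in its own helper. The sketch below is an assumption-laden illustration: file_handle_ok and mapping_handle_ok are hypothetical names, and outside Windows the HANDLE type and INVALID_HANDLE_VALUE are replaced by stand-ins that mirror the definitions quoted in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the sketch also compiles outside Windows (assumption:
   they mirror the definitions quoted in the text). */
typedef void *HANDLE;
#define INVALID_HANDLE_VALUE ((HANDLE)(intptr_t)-1)
#endif

/* Hypothetical helpers: each one checks against the failure value of
   the API that produced the handle. */
int file_handle_ok(HANDLE h)    { return h != INVALID_HANDLE_VALUE; } /* CreateFile */
int mapping_handle_ok(HANDLE h) { return h != NULL; }                 /* CreateFileMapping */
```

A mapping handle compared against INVALID_HANDLE_VALUE would pass the check even when CreateFileMapping had failed with NULL, which is exactly the confusion the text warns about.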

The code that creates the file-mapping kernel objects in this example is as follows:

HANDLE hFileMapping = CreateFileMapping(hFile, NULL, PAGE_READONLY, 0, 0, NULL);

DWORD dwFileSizeHigh;
__int64 qwFileSize = GetFileSize(hFile, &dwFileSizeHigh);
qwFileSize += (((__int64) dwFileSizeHigh) << 32);

// sinf supplies the allocation granularity. Its declaration is not shown
// in the original excerpt; it is assumed to be filled in like this.
SYSTEM_INFO sinf;
GetSystemInfo(&sinf);

char AddMsg[] = "Girl, you love me ?, I love you very much! ";   // Content to add
__int64 myFilesize = qwFileSize + sinf.dwAllocationGranularity;  // Size of the merged file

// File-mapping object for the merged file
HANDLE hmyfilemap = CreateFileMapping(hmyfile, NULL, PAGE_READWRITE,
   (DWORD) (myFilesize >> 32),
   (DWORD) (myFilesize & 0xFFFFFFFF),
   NULL);

Map the file data into the address space

After you have created a file-mapping object, you still need to have the system reserve a region of address space for the file's data and commit the file's data as the physical storage mapped to that region. You do this by calling the MapViewOfFile function:

PVOID MapViewOfFile(
   HANDLE hFileMappingObject,
   DWORD dwDesiredAccess,
   DWORD dwFileOffsetHigh,
   DWORD dwFileOffsetLow,
   SIZE_T dwNumberOfBytesToMap);

The hFileMappingObject parameter identifies the handle of the file-mapping object, which was returned by a previous call to CreateFileMapping or OpenFileMapping (described later in this chapter). The dwDesiredAccess parameter identifies how the data can be accessed. That is right: you must specify once again how you intend to access the file's data. You can set one of the four values listed in the following table.

Value                 Description
FILE_MAP_WRITE        You can read and write the file's data. CreateFileMapping must have been called with the PAGE_READWRITE flag.
FILE_MAP_READ         You can read the file's data. CreateFileMapping can have been called with any of the protection attributes PAGE_READONLY, PAGE_READWRITE, or PAGE_WRITECOPY.
FILE_MAP_ALL_ACCESS   Same as FILE_MAP_WRITE.
FILE_MAP_COPY         You can read and write the file's data. Writing causes a private copy of the page to be created. In Windows 2000, CreateFileMapping can have been called with any of the protection attributes PAGE_READONLY, PAGE_READWRITE, or PAGE_WRITECOPY. In Windows 98, CreateFileMapping must have been called with PAGE_WRITECOPY.

It seems strange and annoying that Windows requires all these protection attributes to be set over and over again. I assume this was done to give an application as much control as possible over data protection.

The remaining three parameters have to do with reserving the region of address space and mapping the physical storage to it. When you map a file into your process's address space, you do not have to map the entire file at once. Instead, you can map only a small portion of the file into the address space. A portion of a file that is mapped into your process's address space is called a view, which explains how MapViewOfFile got its name.

When you map a view of a file into your process's address space, you must specify two things. First, you must tell the system which byte in the data file should be mapped as the first byte in the view. You do this with the dwFileOffsetHigh and dwFileOffsetLow parameters. Because Windows supports files of up to 16 EB, a 64-bit value is needed to specify this byte offset: the high 32 bits are passed in the dwFileOffsetHigh parameter, and the low 32 bits are passed in the dwFileOffsetLow parameter. Note that the offset in the file must be a multiple of the system's allocation granularity (so far, all implementations of Windows have an allocation granularity of 64 KB). Chapter 14 describes how to obtain the allocation granularity of a system.

Second, you must tell the system how much of the data file to map into the address space. This is the same as specifying how large a region of address space to reserve. You specify this size with the dwNumberOfBytesToMap parameter. If you specify 0, the system attempts to map a view that starts at the specified offset within the file and extends to the end of the file.
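Because the view offset has to be a multiple of the allocation granularity, reaching an arbitrary byte usually means mapping from the preceding boundary and indexing into the view. The portable sketch below shows the arithmetic only; view_offset is a hypothetical helper, and on a real system the granularity would be read from the system rather than hard-coded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a desired file offset down to a multiple of the allocation
   granularity. *delta receives how far into the view the desired byte
   sits. view_offset is a hypothetical helper, not a Windows API. */
uint64_t view_offset(uint64_t wanted, uint32_t granularity, uint32_t *delta)
{
    assert(granularity != 0 && delta != NULL);
    uint64_t aligned = wanted - (wanted % granularity);
    *delta = (uint32_t)(wanted - aligned);
    return aligned;
}
```

To reach byte 100000 with a 64 KB granularity, the view would start at offset 65536 and the byte would sit at index 34464 within it. The number of bytes to map then has to cover that index plus however many bytes are needed after it.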

In Windows 98, if MapViewOfFile cannot find a region large enough to contain the entire file-mapping object, it returns NULL regardless of the size of the view requested. In Windows 2000, MapViewOfFile needs to find only a region large enough for the requested view, regardless of the size of the entire file-mapping object.

If you specify the FILE_MAP_COPY flag when calling MapViewOfFile, the system commits physical storage from the system's paging file. The amount of storage committed is determined by the dwNumberOfBytesToMap parameter. As long as you do nothing more than read from the file's mapped view, the system never uses these committed pages in the paging file. However, the first time any thread in your process writes to any memory address within the file's mapped view, the system grabs one of the committed pages from the paging file, copies the original page of data to this paging-file page, and maps the copied page into your process's address space. From this point on, the threads in your process access a local copy of the data and cannot read or modify the original data.

When the system creates a copy of the original page, the system changes the page protection attribute from PAGE_WRITECOPY to PAGE_READWRITE. The following code snippet illustrates this situation:

// Open the file that we want to map.
HANDLE hFile = CreateFile(pszFileName, GENERIC_READ | GENERIC_WRITE, 0, NULL,
   OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

// Create a file-mapping object for the file.
HANDLE hFileMapping = CreateFileMapping(hFile, NULL, PAGE_WRITECOPY,
   0, 0, NULL);

// Map a copy-on-write view of the file; the system will commit
// enough physical storage from the paging file to accommodate
// the entire file. All pages in the view will initially have
// PAGE_WRITECOPY access.
PBYTE pbFile = (PBYTE) MapViewOfFile(hFileMapping, FILE_MAP_COPY,
   0, 0, 0);

// Read a byte from the mapped view.
BYTE bSomeByte = pbFile[0];
// When reading, the system does not touch the committed pages in
// the paging file. The page keeps its PAGE_WRITECOPY attribute.

// Write a byte to the mapped view.
pbFile[0] = 0;
// When writing for the first time, the system grabs a committed
// page from the paging file, copies the original contents of the
// page at the accessed memory address, and maps the new page
// (the copy) into the process's address space. The new page has
// an attribute of PAGE_READWRITE.

// Write another byte to the mapped view.
pbFile[1] = 0;
// Because this byte is now in a PAGE_READWRITE page, the system
// simply writes the byte to the page (backed by the paging file).

// When finished using the file's mapped view, unmap it.
// UnmapViewOfFile is discussed in the next section.
UnmapViewOfFile(pbFile);
// The system decommits the physical storage from the paging file.
// Any writes to the pages are lost.

// Clean up after ourselves.
CloseHandle(hFileMapping);
CloseHandle(hFile);

As mentioned earlier, Windows 98 must commit storage in the paging file for the memory-mapped file up front. However, it writes modified pages to the paging file only when necessary.

Unmap the file data from the process's address space

When you no longer need to retain the file data mapped to your process address space, you can release it by calling the following function:

BOOL UnmapViewOfFile(PVOID pvBaseAddress);

The function's only parameter, pvBaseAddress, specifies the base address of the region to release. This value must be identical to the value returned from a call to MapViewOfFile. Be sure to call UnmapViewOfFile. If you do not, the reserved region is not released until your process terminates. Whenever you call MapViewOfFile, the system always reserves a new region within your process's address space; previously reserved regions are not released.

In the interest of speed, the system caches the pages of the file's data and does not update the disk image of the file immediately while you work with the file's mapped view. If you need to ensure that your updates have been written to disk, you can force the system to write a portion or all of the modified data back to the disk image by calling FlushViewOfFile:

BOOL FlushViewOfFile(PVOID pvAddress, SIZE_T dwNumberOfBytesToFlush);

The first parameter is the address of a byte contained within a view of a memory-mapped file. The function rounds down the address you pass to a page boundary. The second parameter indicates the number of bytes that you want flushed. The system rounds this number up so that the total number of bytes is an integral number of pages. If you call FlushViewOfFile and none of the data has been changed, the function simply returns without writing anything to disk.
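The rounding just described can be modelled with a few lines of address arithmetic. This is a sketch of one plausible reading of the text, not a description of how FlushViewOfFile is implemented: flush_range is a hypothetical helper, and the page size is passed in as a parameter that is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Model of the rounding described above: the start address is rounded
   down to a page boundary and the length is rounded up so that the
   range covers whole pages. flush_range is hypothetical; page_size
   must be a nonzero power of two. */
void flush_range(uintptr_t addr, size_t bytes, size_t page_size,
                 uintptr_t *start, size_t *length)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    assert(start != NULL && length != NULL);
    uintptr_t mask  = (uintptr_t)page_size - 1;
    uintptr_t first = addr & ~mask;                  /* round down */
    uintptr_t end   = (addr + bytes + mask) & ~mask; /* round up   */
    *start  = first;
    *length = (size_t)(end - first);
}
```

With 4096-byte pages, a request for 100 bytes at address 0x1234 covers the single page starting at 0x1000, while a request for 0x20 bytes at 0x1FF0 straddles a boundary and covers two pages.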

For a memory-mapped file whose storage is on a network, FlushViewOfFile guarantees that the file's data has been written out from the workstation. However, FlushViewOfFile cannot guarantee that the server sharing the file has written the data to the remote disk, because the server might be caching the file's data. To ensure that the server writes the file's data, pass the FILE_FLAG_WRITE_THROUGH flag to CreateFile whenever you create a file-mapping object for the file and map a view of that object. If you open the file with this flag, FlushViewOfFile returns only when all of the file's data has been stored on the server's disk drive.

Keep in mind one special characteristic of the UnmapViewOfFile function. If the view was originally mapped using the FILE_MAP_COPY flag, any changes you made to the file's data were actually made to a copy of the data stored in the system's paging file. In this case, when you call UnmapViewOfFile, the function has nothing to update in the disk file. It simply frees the pages in the paging file, and the data is lost.

If you want to preserve the changed data, you must take additional measures yourself. For example, you can create another file-mapping object (using PAGE_READWRITE) from the same file and map this new file-mapping object into your process's address space using the FILE_MAP_WRITE flag. You can then scan the first view looking for pages with the PAGE_READWRITE protection attribute. Whenever you find a page with this attribute, you can examine its contents and decide whether to write the changed data to the file. If you do not want to update the file with the new data, keep scanning the remaining pages in the view until you reach the end. If you do want to save a changed page of data, call MoveMemory to copy that page from the first view to the second view. Because the second view is mapped with PAGE_READWRITE protection, the MoveMemory call updates the actual contents of the file on disk. You can use this method to determine which changes were made and to preserve your file's data.

Windows 98 does not support the copy-on-write protection attribute. Therefore, when scanning the first view of the memory-mapped file, you cannot test for pages marked with PAGE_READWRITE. You must devise your own method for determining which pages in the first view have been modified.

Close the file-mapping object and the file object

It goes without saying that you should always close any kernel objects you open. If you forget, resources leak for as long as your process continues to run. Of course, when your process terminates, the system automatically closes any objects that your process opened but forgot to close. Until then, however, the unclosed handles accumulate, so you should always write clean, "proper" code that closes every object it opens. To close the file-mapping object and the file object, simply call the CloseHandle function twice, once for each handle.

Let's take a closer look at this process. The following pseudocode shows an example of using a memory-mapped file:

HANDLE hFile = CreateFile(...);
HANDLE hFileMapping = CreateFileMapping(hFile, ...);
PVOID pvFile = MapViewOfFile(hFileMapping, ...);

// Use the memory-mapped file.

UnmapViewOfFile(pvFile);
CloseHandle(hFileMapping);
CloseHandle(hFile);

The code above shows the "expected" way of working with memory-mapped files. What it does not show is that the system increments the usage counts of the file object and the file-mapping object when you call MapViewOfFile. This side effect is significant because it means that the code fragment above can be rewritten as follows:

HANDLE hFile = CreateFile(...);
HANDLE hFileMapping = CreateFileMapping(hFile, ...);
CloseHandle(hFile);
PVOID pvFile = MapViewOfFile(hFileMapping, ...);
CloseHandle(hFileMapping);

// Use the memory-mapped file.

UnmapViewOfFile(pvFile);

When you work with memory-mapped files, you typically open the file, create the file-mapping object, and then use the file-mapping object to map a view of the file's data into the process's address space. Because the system increments the internal usage counts of the file object and the file-mapping object, you can close these objects early in your code and eliminate potential resource leaks. If you will be creating additional file-mapping objects from the same file, or mapping multiple views of the same file-mapping object, you cannot call CloseHandle early, because you will need the handles later to make the additional calls to CreateFileMapping and MapViewOfFile, respectively. The code for Steps 3 to 6 in this example is as follows:

CloseHandle(hFile);    // We no longer need access to the file object's handle.
CloseHandle(hmyfile);

// Map a view at the start of the merged file and write the new content.
PBYTE pbmyfile = (PBYTE) MapViewOfFile(hmyfilemap, FILE_MAP_WRITE,
   0,                   // Starting byte
   0,                   // in file
   sizeof(AddMsg));
memcpy(pbmyfile, AddMsg, sizeof(AddMsg));   // Add the content
UnmapViewOfFile(pbmyfile);

__int64 qwFileOffset = 0;                                // Offset of the source file view
__int64 qwmyFileOffset = sinf.dwAllocationGranularity;   // Offset of the merged file view

while (qwFileSize > 0) {

   // Determine the number of bytes to be mapped in this view.
   DWORD dwBytesInBlock = sinf.dwAllocationGranularity;
   if (qwFileSize < sinf.dwAllocationGranularity)   // Less than one allocation granule remains
      dwBytesInBlock = (DWORD) qwFileSize;          // Map only the remaining bytes

   PBYTE pbFile = (PBYTE) MapViewOfFile(hFileMapping, FILE_MAP_READ,
      (DWORD) (qwFileOffset >> 32),           // Starting byte
      (DWORD) (qwFileOffset & 0xFFFFFFFF),    // in file
      dwBytesInBlock);

   pbmyfile = (PBYTE) MapViewOfFile(hmyfilemap, FILE_MAP_WRITE,
      (DWORD) (qwmyFileOffset >> 32),         // Starting byte
      (DWORD) (qwmyFileOffset & 0xFFFFFFFF),  // in file
      dwBytesInBlock);

   memcpy(pbmyfile, pbFile, dwBytesInBlock);

   // Unmap the views; we don't want multiple views
   // in our address space.
   UnmapViewOfFile(pbFile);
   UnmapViewOfFile(pbmyfile);

   // Skip to the next set of bytes in the file.
   qwmyFileOffset += dwBytesInBlock;
   qwFileOffset += dwBytesInBlock;
   qwFileSize -= dwBytesInBlock;
}

CloseHandle(hFileMapping);
CloseHandle(hmyfilemap);
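The block-size bookkeeping in the copy loop can be checked in isolation. The portable sketch below reproduces only the arithmetic, with no file access: next_block and count_blocks are hypothetical helper names, and the granularity is passed in as a plain parameter.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the next view to map: a full granule, or whatever is left. */
uint32_t next_block(uint64_t remaining, uint32_t granularity)
{
    assert(granularity != 0);
    return remaining < granularity ? (uint32_t)remaining : granularity;
}

/* Number of map/copy/unmap rounds needed for a file of file_size bytes. */
uint64_t count_blocks(uint64_t file_size, uint32_t granularity)
{
    uint64_t blocks = 0;
    while (file_size > 0) {
        file_size -= next_block(file_size, granularity);
        ++blocks;
    }
    return blocks;
}
```

For a 150000-byte source file and a 64 KB granularity this gives two full 65536-byte views followed by a final view of 18928 bytes, so the loop runs three times. Because every view except the last is a whole granule, each offset passed to MapViewOfFile stays a multiple of the granularity.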
