Http://blog.csdn.net/edwardlulinux/article/details/8604400
Many articles have analyzed how mmap is implemented. Reading the code, however, I always felt that reads and writes to a mapped region had nothing to do with ordinary read/write calls. That raises some questions:
1. What is the difference between ordinary read/write and reading or writing a region mapped with mmap?
2. Why would one sometimes choose mmap and give up ordinary read/write?
3. If anything in this article is incorrect or inappropriate, corrections are welcome.
Thinking about these two questions inevitably touches many other system mechanisms. Although the topic is mmap, a good deal of background is needed to make things clear, and this background is also the tedious part of Linux: one application often interacts with several mechanisms in the system. This article tries to keep references to and analysis of the source code to a minimum and leaves the detailed analysis for later. Even so, the theoretical basis for much of the discussion comes from the source code, which shows how important the source is.
Basic knowledge:
1. On every process switch, the page table base address of the incoming process is loaded into the page table base register (CR3 on x86; loosely called the "TLB base" in this article). In the kernel, the current macro refers to the process that is currently running on the CPU. This part of the implementation depends on the hardware architecture. To avoid ambiguity, hardware details in this article refer to x86, since much of the available material and analysis targets x86. ARM and other chips work in a similar way: as long as an MMU is supported, there is a comparable base address register.
2. Before a process runs, the system gives it its own address space. What is valid in this space depends on the page tables that the base register points to. On a 32-bit system the address space is 4 GB, and within it the process is "free". "Free" does not mean that every address in the 4 GB can be accessed. An access must still be valid, that is, the address must translate to a physical address through the page tables that the base register points to. The validity check covers bounds, permissions, and so on.
3. Every user process runs in the space that the system allocated to it. This space is described by VMAs (struct vm_area_struct), and together the VMAs describe the whole runtime space. A user process is divided into segments such as text and data, and the location of each segment within the 4 GB is described by its own VMA. VMA management relies on further mechanisms that involve algorithms and physical memory management. Take a look at two figures:
Figure 1:
Figure 2:
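The VMAs of a running process can be observed from user space. On Linux, each line of /proc/self/maps corresponds to one mapped region as the kernel describes it. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name find_vma is made up for this illustration and is not a kernel or libc API.

```c
#include <stdint.h>
#include <stdio.h>

/* Scan /proc/self/maps (Linux-specific) and report whether some mapping,
 * i.e. one VMA as seen from user space, contains the given address.
 * Returns 1 if found, 0 if not found, -1 if the file cannot be opened.
 * When found, perms_out (at least 5 bytes) receives e.g. "r-xp". */
int find_vma(const void *addr, char *perms_out)
{
    FILE *fp = fopen("/proc/self/maps", "r");
    if (fp == NULL)
        return -1;

    uintptr_t a = (uintptr_t)addr;
    char line[512];
    int found = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        unsigned long start, end;
        char perms[5];

        /* Each entry begins with "start-end perms" in hexadecimal. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
            continue;
        if (a >= start && a < end) {
            snprintf(perms_out, 5, "%s", perms);
            found = 1;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

Passing the address of a local variable should land in a readable and writable region (the stack), while the address of a function should land in an executable region (the text segment).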
read and write as system calls:
No particular file system is chosen as the object of analysis. The kernel looks up the system call by its number and then dispatches to the file operations supplied by the specific file system. Each file system has its own set of operation functions, and read and write are among them.
Figure 3:
Before user data reaches a disk or other storage device, the kernel holds it in the page cache. These pages let the kernel manage user data and improve read/write efficiency. User data does not travel directly between the application and the disk or storage medium. Instead it passes through several layers, each with its own function, and the actual disk operation is triggered at the most suitable moment, when an I/O driver writes the data to the disk or storage medium. The point to stress here is the page cache: it is managed in units of pages, and before the I/O takes place the data stays there, in kernel space, rather than being written straight to the disk or storage medium.
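The gap between "write() has returned" and "the data is on the device" can be made explicit from user space. The following is a minimal sketch, assuming a POSIX system; write_durably is a name made up for this illustration. write() normally returns once the data has been copied into the page cache, and fsync() is the call that waits for the device.

```c
#define _DEFAULT_SOURCE /* expose fsync() under strict -std=c11 */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write msg to path and force it to stable storage.
 * Returns 0 on success, -1 on any error. */
int write_durably(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);

    /* write() normally returns as soon as the data sits in the page cache;
     * the kernel schedules the device I/O later. */
    ssize_t n = write(fd, msg, len);
    if (n < 0 || (size_t)n != len) {
        close(fd);
        return -1;
    }

    /* fsync() blocks until the dirty pages of this file reach the device. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Without the fsync() call the program would still work, and the kernel would flush the dirty pages at a time of its own choosing, which is the behavior the paragraph above describes.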
mmap as a system call:
When a process is created or switched in, the system information of the current process is loaded, and that information includes the address space of the process. After the user program calls mmap, the function finds a suitable VMA in the address space of the current process to describe the region to be mapped. This region maps the contents of the file referred to by the file descriptor passed to mmap.
The principle is this: calling mmap only establishes, inside the kernel, the correspondence between the file and a range of virtual memory. At that point the page table has no entries for the range. When the user program first accesses the mapped range, a page fault occurs, and the kernel loads the file piece by piece by handling these faults. The loading itself goes through the page cache, the same page cache that read and write manage. The VMA structure carries its own set of operations, and through them it reaches the page cache operations of the file. So although read/write and mmap are different system calls with different call paths, both end up operating on file contents through the page cache.
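The lazy loading described above can be observed by counting page faults before and after touching a mapping. The following is a minimal sketch, assuming Linux or another POSIX system that fills in ru_minflt and ru_majflt; the names page_faults and count_mmap_faults and the scratch file path are made up for this illustration.

```c
#define _DEFAULT_SOURCE /* expose ftruncate/getrusage under strict -std=c11 */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <unistd.h>

/* Page faults (minor + major) this process has taken so far, or -1. */
static long page_faults(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt + ru.ru_majflt;
}

/* Create a zero-filled scratch file of npages pages at path, map it
 * read-only, and read one byte from every page. *at_map receives the
 * faults taken across the mmap() call, *at_touch the faults taken while
 * reading. Returns 0 on success, -1 on error. */
int count_mmap_faults(const char *path, size_t npages,
                      long *at_map, long *at_touch)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0)
        return -1;
    size_t len = npages * (size_t)page;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }

    long before = page_faults();
    char *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    long mapped = page_faults();
    if (map == MAP_FAILED) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* The first access to a page without a page table entry traps into
     * the kernel's fault handler. Linux may map several neighbouring
     * pages per fault, so the count can be lower than npages. */
    volatile char sink = 0;
    for (size_t off = 0; off < len; off += (size_t)page)
        sink += map[off];
    long touched = page_faults();

    *at_map = mapped - before;
    *at_touch = touched - mapped;

    munmap(map, len);
    close(fd);
    unlink(path);
    return 0;
}
```

On a typical Linux system one would expect *at_map to be zero or close to it and *at_touch to be positive, although the exact numbers depend on the kernel and its settings.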
PS:
Management of the file page cache also deserves a closer look. It involves the address space operations, many of which are related to the file operations.
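That both paths work on the same cached pages can be checked from user space: a store through a shared mapping should be visible to a later read on the same file without any explicit flush. The following is a minimal sketch, assuming Linux, where shared file mappings and read/write use the same page cache; the function name and the scratch path are made up for this illustration.

```c
#define _DEFAULT_SOURCE /* expose pread() under strict -std=c11 */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create a small file, modify it through a MAP_SHARED mapping, then read
 * it back with pread(). Returns 1 if pread() sees the change, 0 if it
 * does not, -1 on error. */
int mmap_write_visible_to_read(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "aaaa", 4) != 4) {
        close(fd);
        unlink(path);
        return -1;
    }

    char *map = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* The store faults the cached page into this process and dirties it.
     * No msync() is issued before reading back. */
    memcpy(map, "bbbb", 4);

    char buf[4];
    ssize_t n = pread(fd, buf, sizeof buf, 0);
    int same = (n == 4 && memcmp(buf, "bbbb", 4) == 0);

    munmap(map, 4);
    close(fd);
    unlink(path);
    return same;
}
```

With MAP_PRIVATE instead of MAP_SHARED the store would go to a private copy of the page, and pread() would still return the original bytes.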
Efficiency Comparison:
This section borrows an earlier article from the internet and reproduces its test here for analysis.
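mmap:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>      /* open(), O_RDONLY */
#include <unistd.h>
#include <sys/mman.h>
int main(void)
{
    int fd = open("test.file", O_RDONLY);
    struct stat statbuf;
    char *start;
    char buf[2] = {0};
    int ret = 0;
    /* mmap() rejects a zero length, so an empty file is treated as an error. */
    if (fd < 0 || fstat(fd, &statbuf) != 0 || statbuf.st_size == 0)
        return 1;
    start = mmap(NULL, statbuf.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (start == MAP_FAILED)
        return 1;
    /* Copy the file one byte at a time straight out of the mapping. */
    do {
        *buf = start[ret++];
    } while (ret < statbuf.st_size);
    return 0;
}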
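fread:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    FILE *pf = fopen("test.file", "r");
    char buf[2] = {0};
    size_t ret = 0;
    if (pf == NULL)
        return 1;
    /* One byte per fread() call; stdio buffers the underlying read(). */
    do {
        ret = fread(buf, 1, 1, pf);
    } while (ret);
    fclose(pf);
    return 0;
}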
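The running result that follows also times a ./read binary, whose source is not reproduced in this article. The following is a sketch of what such a test might look like, on the assumption that it issues one read() system call per byte; the function name read_bytewise is made up here, and this is not the original author's code.

```c
#include <fcntl.h>
#include <unistd.h>

/* Read the file at path one byte per read() system call.
 * Returns the number of bytes read, or -1 on error. */
long read_bytewise(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char buf[1];
    long total = 0;
    ssize_t n;

    /* Every iteration crosses into the kernel, unlike fread(), which
     * serves most calls from a user-space buffer. */
    while ((n = read(fd, buf, 1)) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}
```

Calling read_bytewise("test.file") from main would give a program comparable to the two listings above.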
Running result:
[test_read]$ time ./fread
real 0m0.901s
user 0m0.892s
sys 0m0.010s
[test_read]$ time ./mmap
real 0m0.112s
user 0m0.106s
sys 0m0.006s
[test_read]$ time ./read
real 0m15.549s
user 0m3.933s
sys 0m11.566s
[test_read]$ ll test.file
-rw-r--r-- 1 xiangy svx8004 23955531 Sep 24 test.file
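As the results show, with mmap the time spent in system calls (sys 0m0.006s) is far less than with ordinary read (sys 0m11.566s). fread also spends little time in the kernel (sys 0m0.010s) because stdio buffers the data in user space, but it still pays for one library call per byte (user 0m0.892s).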
MMAP implementation principle and application