Java: a deep understanding of memory-mapped file principles (Android)

Source: Internet
Author: User

Memory-Mapped file principle

First, what questions does this article set out to answer?
1. The difference and connection between virtual memory and memory-mapped files.
2. The principle behind memory-mapped files.
3. Why memory-mapped files are efficient.
4. An efficiency comparison between traditional IO and memory mapping.

The difference and connection between virtual memory and memory-mapped file

The connection between the two

Virtual memory and memory-mapped files are both mechanisms that keep part of a program's content on disk and load it on demand. Both underpin dynamic loading in applications, and both are transparent to the user because of the virtualization involved.

Virtual memory is actually backed by part of the hard drive; it is a data-exchange area between the computer's RAM and the disk. Because physical memory may be much smaller than a process's address space, data that is temporarily not needed in memory must be stored in a special area of the hard disk. When requested data is not in memory, the system raises a page fault, and the memory manager then moves the corresponding page from the hard disk back into physical memory.

A memory-mapped file maps a single file to a region of memory, allowing an application to access a file on disk through a memory pointer, just as if it were accessing memory into which the file had been loaded. This makes memory-mapped files ideal for managing large files.
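As a minimal sketch of this idea in Java (not from the original article; the class name and temporary file are illustrative), `FileChannel.map()` exposes a file as a `MappedByteBuffer`, so indexing into the buffer is effectively reading the file through memory:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapDemo {
    // Maps the whole file and returns its first byte, touching the file
    // through the mapping rather than through a read() call.
    static byte firstByteViaMapping(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return buf.get(0); // looks like a memory access, backed by the file
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mapdemo", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[] {42, 7, 9});
        System.out.println(firstByteViaMapping(tmp)); // prints 42
    }
}
```

Note there is no explicit `read()` in `firstByteViaMapping`; the page fault machinery described below does the actual loading.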

The difference between the two

1. The disk area used by virtual memory can only be the paging (swap) file, while the disk portion of a memory mapping can be any file on disk.

2. Their architectures, or rather their usage scenarios, differ. Virtual memory is built on top of physical memory: it was introduced because physical memory often falls short of the space running programs require, and even as computers gain more and more physical memory, programs keep growing; loading every running program entirely into memory would be uneconomical and unrealistic. Memory-mapped files, by contrast, are built on the process's address space. A 32-bit address space is only 4 GB, while some files can be far larger than that, so mapping a section of the file into a region of the address space solves the large-file problem; on a 64-bit system, a memory-mapped file can in principle address files up to 2^64 bytes. Besides handling large files, memory-mapped files can also be used for interprocess communication.
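The interprocess-communication point can be sketched in Java within a single process (an assumption made here for a self-contained demo): two independent mappings of the same file are backed by the same kernel pages, so a write through one mapping is visible through the other, which is exactly what makes shared file mappings usable for IPC. Note the JDK documents cross-mapping visibility as platform-dependent; mainstream OSes keep shared file mappings coherent.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SharedMapDemo {
    // Writes a value through one mapping and reads it back through a second,
    // independent mapping of the same file region -- standing in for two
    // cooperating processes sharing the mapped file.
    static int writeThenReadViaSeparateMappings(Path file, int value) throws IOException {
        try (FileChannel writer = FileChannel.open(file,
                     StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileChannel reader = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer w = writer.map(FileChannel.MapMode.READ_WRITE, 0, 4);
            MappedByteBuffer r = reader.map(FileChannel.MapMode.READ_ONLY, 0, 4);
            w.putInt(0, value); // "process A" writes through its mapping
            w.force();          // flush to the backing file for good measure
            return r.getInt(0); // "process B" observes the write
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("shared", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[4]); // file must cover the mapped region
        // 1234 on platforms with a coherent page cache (Linux, macOS, Windows)
        System.out.println(writeThenReadViaSeparateMappings(tmp, 1234));
    }
}
```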

The principle of memory-mapped files

"Mapping" means establishing a correspondence; here it mainly refers to a one-to-one correspondence between the location of a file on the hard disk and an equally sized region of the process's logical address space. This relationship is purely a logical concept and does not exist physically, because the process's logical address space itself does not physically exist. During the mapping process no actual data is copied; the file is not loaded into memory but only logically placed there. In code terms, mapping just creates and initializes the relevant data structures, and it is implemented by the mmap() system call, which is why establishing the mapping is so efficient.


Memory Mapping principle
As mentioned above, establishing a memory mapping does not actually copy any data. How, then, can the process eventually access a file on the hard disk through memory operations?
See the figure:

1. Calling mmap() is equivalent to allocating virtual memory for the memory-mapped file. It returns a pointer ptr that points to a logical address. To operate on the data through this address, the MMU must translate the logical address into a physical address, as shown in process 2 of Figure 1.

2. Since establishing the mapping copies no data, the MMU cannot at this point find a physical address corresponding to ptr in its address map; that is, the MMU lookup fails and a page fault is raised. The page-fault handler first looks for the corresponding page in swap. If it is not found there (meaning the file has never been read into memory), the handler reads the file from the hard disk into physical memory via the mapping established by mmap(), as shown in process 3 of Figure 1.

3. If physical memory turns out to be insufficient while copying the data, the virtual memory mechanism (swap) swaps temporarily unused physical pages out to the hard disk, as shown in process 4 of Figure 1.

Efficiency of memory-mapped files

As you may know, memory-mapped files read and write data much faster than traditional IO. Why is that? At the code level, reading a file from the hard disk into memory always involves a data copy, and that copy is performed by the file system and the hardware driver, so in theory the raw efficiency of copying data is the same either way. The real reason lies elsewhere. read() is a system call that copies data twice: it first copies the file's contents from the hard disk into a buffer in kernel space (process 1 in Figure 2), and then copies that data into user space (process 2 in Figure 2). mmap() is also a system call, but as described earlier, mmap() itself copies no data; the actual copy happens in the page-fault handler. Because mmap() maps the file directly into user space, the handler can use that mapping to copy the file from the hard disk straight into user space, so the data is copied only once. This is why memory mapping is more efficient than read()/write().

Principle of the read() system call

An efficiency comparison between traditional IO and memory mapping

Here we read 10 MB of data three ways in Java: traditional IO, IO with a buffer, and a memory-mapped file. The code is as follows:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MapBufDelete {
    public static void main(String[] args) {
        // 1. Traditional IO without a buffer: one read() call per byte.
        try (FileInputStream fis = new FileInputStream("./largefile.txt")) {
            long t1 = System.currentTimeMillis();
            int n;
            while ((n = fis.read()) >= 0) {
                // data processing
            }
            long t = System.currentTimeMillis() - t1;
            System.out.println("Traditional IO read, no buffer, time: " + t);
        } catch (IOException e) {
            e.printStackTrace();
        }

        // 2. Traditional IO through a BufferedInputStream.
        try (FileInputStream fis = new FileInputStream("./largefile.txt");
             BufferedInputStream bis = new BufferedInputStream(fis)) {
            long t1 = System.currentTimeMillis();
            int n;
            while ((n = bis.read()) >= 0) {
                // data processing
            }
            long t = System.currentTimeMillis() - t1;
            System.out.println("Traditional IO read, with buffer, time: " + t);
        } catch (IOException e) {
            e.printStackTrace();
        }

        // 3. Memory-mapped file.
        try (RandomAccessFile raf = new RandomAccessFile("./largefile.txt", "rw");
             FileChannel channel = raf.getChannel()) {
            // Map the full 10 MB so the loop below stays within bounds.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, 1024 * 1024 * 10);
            long t1 = System.currentTimeMillis();
            for (int i = 0; i < 1024 * 1024 * 10; i++) {
                buffer.get(i); // data processing
            }
            long t = System.currentTimeMillis() - t1;
            System.out.println("Memory-mapped file read, time: " + t);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Run results

Traditional IO read, no buffer, time: 4739

Traditional IO read, with buffer, time: 59

Memory-mapped file read, time: 11

Finally, an explanation of why reading a file with a buffer is faster than reading it without one:

Every IO operation switches from user mode into kernel mode, reads data from the disk into a kernel buffer, and then copies it from the kernel buffer into the user buffer. Without a buffer, every single read() requires this user-to-kernel switch, and the switch is time-consuming. With a buffer, thanks to the principle of locality, a larger block of data is read at once via prefetching and kept in the buffer, which greatly reduces the number of IO operations.
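The effect can be illustrated without timing anything by counting read() invocations, which stand in for user/kernel switches. The `ByteArrayInputStream` here is a hypothetical stand-in for a real file stream, chosen so the sketch is self-contained:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ChunkedReadDemo {
    // Counts how many read() calls are needed to consume a stream when
    // pulling chunkSize bytes per call.
    static int countCalls(InputStream in, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        int calls = 0;
        while (in.read(chunk) >= 0) {
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[64 * 1024]; // 64 KB of data
        int oneByte = countCalls(new ByteArrayInputStream(data), 1);
        int chunked = countCalls(new ByteArrayInputStream(data), 8192);
        System.out.println(oneByte); // prints 65536
        System.out.println(chunked); // prints 8
    }
}
```

With a real file, each avoided read() call is an avoided system call, which is where BufferedInputStream's 4739 ms vs. 59 ms gap above comes from.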

Thank you for reading; I hope this article was helpful.
