Principles of the ARM MMU

Source: Internet
Author: User

Category: Embedded

Original article: Embedded Linux: an analysis of the working principle of the ARM MMU, by y307921462

I. The origin of the MMU

Many years ago, when people were still using DOS or even older operating systems, computer memory was very small, usually measured in kilobytes. Programs were correspondingly small, so even that little memory could hold the programs of the day. With the rise of graphical interfaces and growing user demands, however, applications expanded until programmers faced a problem: the application was too large to fit in memory. The usual solution was to divide the program into many pieces called overlays. Overlay 0 ran first, and when it finished it called another overlay. Although the operating system swapped the overlays in and out, the programmer had to split the program by hand first, which was laborious and tedious. A more fundamental solution was needed, and it was soon found: virtual memory. The basic idea of virtual memory is that the combined size of a program's code, data, and stack may exceed the size of physical memory. The operating system keeps the parts currently in use in memory and leaves the unused parts on disk. For example, given a 16MB program and a machine with only 4MB of memory, the operating system decides at each moment which 4MB of the program to keep in memory, and swaps program fragments between memory and disk when needed. The 16MB program can thus run on a machine with only 4MB of memory, and the programmer does not have to split it up beforehand.


At any moment there is a set of addresses that a program on the computer can generate, which we call the address range. The size of this range is determined by the number of address bits of the CPU: for a 32-bit CPU the range is 0~0xffffffff (4G), and for a CPU with 64-bit addresses it is 0~0xffffffffffffffff (2^64 bytes, or 16E). This is the range of addresses that a program can generate. We call it the virtual address space, and an address within it is called a virtual address. Corresponding to the virtual address space and the virtual address are the physical address space and the physical address. Most of the time the physical address space of a system is only a subset of the virtual address space. A simple example illustrates the two: for a 32-bit x86 host with 256MB of memory, the virtual address space is 0~0xffffffff (4G), while the physical address space is 0x00000000~0x0fffffff (256MB).


On a machine that does not use virtual memory, the virtual address is sent directly to the memory bus, and the physical memory at that address is read or written. When virtual memory is used, the virtual address is not sent directly to the memory address bus but to the memory management unit, the MMU (the protagonist finally appears). The MMU consists of one chip or a group of chips and is generally implemented as part of a coprocessor. Its function is to map virtual addresses to physical addresses.

II. The working process of the MMU

Most systems that use virtual memory use a technique called paging. The virtual address space is divided into units called pages, and the physical address space is divided correspondingly into units called page frames. Pages and page frames must be the same size. The following example shows how pages are mapped to page frames under the control of the MMU:

In this example we have a machine that generates 16-bit addresses, so its virtual address range is 0x0000~0xFFFF (64K), but it has only 32K of physical memory. It can therefore run a 64K program, but the program cannot be loaded into memory all at once. The machine must have external storage (such as a disk or flash) large enough to hold the 64K program, so that program fragments can be brought in when needed. The page size in this example is 4K, and the page frame size is the same (this must be guaranteed, because transfers between memory and external storage are always made in units of pages). The 64K of virtual address space and the 32K of physical memory therefore contain 16 pages and 8 page frames respectively.


Let us first explain a few terms used in paging. We have already met pages and page frames. In the figure, the green part is the physical space, where each box represents a physical page frame. The orange part is the virtual space, where each box represents a page. Each page entry consists of two parts: the frame index and the P (present) bit. The frame index indicates which physical page frame the page is mapped to. The P bit indicates whether the mapping of the page is valid: when a page is not mapped (or the mapping is invalid, shown as a frame index of X), the bit is 0, and when the mapping is valid, the bit is 1.


We now execute the following instructions (they do not belong to any particular instruction set; they are pseudo-instructions).
Example 1:
MOVE reg,0 // load the value at address 0 into register reg
Virtual address 0 is sent to the MMU. The MMU sees that this virtual address falls within page 0 (page 0 covers 0 to 4095) and that page 0 is mapped to page frame 2 (page frame 2 covers 8192 to 12287), so it translates the virtual address into physical address 8192 and sends 8192 onto the address bus. Memory knows nothing about the MMU's mapping; it just sees a read request for address 8192 and carries it out. In this way the MMU maps virtual addresses 0 to 4095 to physical addresses 8192 to 12287.


Example 2:
MOVE reg,8192
is translated to
MOVE reg,24576
because virtual address 8192 is in page 2, and page 2 is mapped to page frame 6 (physical addresses 24576 to 28671).


Example 3:
MOVE reg,20500
is translated to
MOVE reg,12308
Virtual address 20500 lies 20 bytes from the start of virtual page 5 (virtual addresses 20480 to 24575). Virtual page 5 is mapped to page frame 3 (physical addresses 12288 to 16383), so the address is mapped to physical address 12288+20=12308.
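The three examples above can be reproduced with a small lookup table. The following is a minimal sketch in C, not any real MMU interface: the names `page_entry`, `page_table` and `translate` are hypothetical, and only the three mappings named in the examples are filled in.

```c
#include <assert.h>

#define PAGE_SIZE 4096   /* 4K pages, as in the example */
#define NUM_PAGES 16     /* 64K of virtual space / 4K per page */

/* One entry per virtual page: the frame index and the P (present) bit. */
struct page_entry {
    int frame;
    int present;
};

/* Only the mappings named in Examples 1 to 3 are filled in (assumption);
 * every other entry is zero-initialised, i.e. present = 0. */
static const struct page_entry page_table[NUM_PAGES] = {
    [0] = { 2, 1 },   /* page 0 -> page frame 2 */
    [2] = { 6, 1 },   /* page 2 -> page frame 6 */
    [5] = { 3, 1 },   /* page 5 -> page frame 3 */
};

/* Translate a virtual address; returns -1 if the page is not mapped. */
static long translate(long vaddr)
{
    long page   = vaddr / PAGE_SIZE;
    long offset = vaddr % PAGE_SIZE;

    assert(vaddr >= 0 && page < NUM_PAGES);
    if (!page_table[page].present)
        return -1;
    return (long)page_table[page].frame * PAGE_SIZE + offset;
}
```

With this table, `translate(0)`, `translate(8192)` and `translate(20500)` give 8192, 24576 and 12308, matching Examples 1 to 3.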


By setting up the MMU appropriately, we can map the 16 virtual pages onto any of the 8 page frames, but this alone does not solve the problem that the virtual address space is larger than the physical address space. We have only 8 page frames (physical addresses) but 16 pages (virtual addresses), so only 8 of the 16 pages can have a valid mapping at any one time. Let's see what happens in Example 4.

MOVE reg,32780
Virtual address 32780 falls within page 8, and from the figure we see that page 8 has no valid mapping (it is marked X). What happens then? The MMU notices that the page is not mapped and notifies the CPU that a page fault has occurred. The operating system must handle the page fault: it picks a rarely used page frame from the 8 physical page frames and writes its contents out to external storage (this step is called page copy), maps the referenced page (page 8 in Example 4) to the page frame just freed (this step is called modifying the mapping), and then re-executes the faulting instruction (MOVE reg,32780). Suppose the operating system decides to free page frame 1. It loads virtual page 8 at physical addresses 4K~8K and makes two changes. First, it marks virtual page 1 as unmapped (virtual page 1 was previously mapped to page frame 1), so that any later access to virtual addresses 4K~8K causes a page fault and the operating system responds as just described. Second, it changes the frame index of virtual page 8 from X to 1. When MOVE reg,32780 is re-executed, the MMU maps 32780 to 4108.
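The page fault handling in Example 4 can be sketched as follows. This is only an illustration under stated assumptions: the choice of the victim frame is passed in as a parameter rather than computed by a replacement policy, the two transfer functions are empty stand-ins for real disk I/O, and all names are hypothetical.

```c
#include <assert.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 16

struct page_entry {
    int frame;     /* page frame index */
    int present;   /* P bit: 1 = mapping valid */
};

/* State before Example 4, reduced to what the text states:
 * virtual page 1 sits in page frame 1 and page 8 is unmapped. */
static struct page_entry page_table[NUM_PAGES] = {
    [1] = { 1, 1 },
};

/* Stand-ins for the transfers to and from external storage (assumption). */
static void write_frame_out(int frame, int page) { (void)frame; (void)page; }
static void read_page_in(int page, int frame)    { (void)page; (void)frame; }

/* Handle a fault on `page` by reusing `victim_frame`.
 * Returns the page that was evicted, or -1 if the frame was free. */
static int handle_page_fault(int page, int victim_frame)
{
    int evicted = -1;

    assert(page >= 0 && page < NUM_PAGES);
    for (int p = 0; p < NUM_PAGES; p++) {
        if (page_table[p].present && page_table[p].frame == victim_frame) {
            write_frame_out(victim_frame, p);   /* page copy */
            page_table[p].present = 0;          /* old page is now unmapped */
            evicted = p;
        }
    }
    read_page_in(page, victim_frame);
    page_table[page].frame = victim_frame;      /* modify the mapping */
    page_table[page].present = 1;
    return evicted;
}

/* Translate a virtual address; returns -1 to signal a page fault. */
static long translate(long vaddr)
{
    long page = vaddr / PAGE_SIZE;

    assert(vaddr >= 0 && page < NUM_PAGES);
    if (!page_table[page].present)
        return -1;
    return (long)page_table[page].frame * PAGE_SIZE + vaddr % PAGE_SIZE;
}
```

Before the fault is handled, `translate(32780)` reports a fault. After `handle_page_fault(8, 1)`, the same address translates to 4108, and addresses in the old page 1 (4K~8K) now fault instead.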


We now have a rough idea of the role the MMU plays in our machines and of its basic job. Next we look at how it does that job (note that the MMU in this example is not any particular model; it is an abstraction of how MMUs work in general).


First, to be clear, the main task of the MMU is to map virtual addresses to physical addresses.
As we already know, most systems that use virtual memory use a technique called paging. The virtual address space is divided into a set of pages of the same size, and each page is identified by a page number (the page number is usually the page's index in the set, much like an array index in C/C++). In the example above, the page covering 0~4K has page number 0, the page covering 4~8K has page number 1, the page covering 8~12K has page number 2, and so on. A virtual address (note: a specific address, not a space) is split by the MMU into 2 parts: the first part is the page index, and the second part is the offset from the start of the page. We again use the 16-bit machine as an example, in which virtual address 8196 is sent to the MMU to be mapped to a physical address. A 16-bit CPU can generate addresses in the range 0~64K, which at 4K per page gives 16 pages. The first part of the virtual address must therefore be able to take 16 different values (so that it can index every page in the set), which means it needs at least 4 bits. The size of a page is 4K (4096), which means the offset part must be 12 bits long (2^12=4096, enough to reach every address in a page). The binary form of 8196 is 0010 000000000100.

The page index of this address is 0010 (binary), which selects page 2, and the second part is 000000000100 (binary), so the offset is 4. Page 2 is mapped to page frame 6 (see the figure), and the physical address range of page frame 6 is 24~28K. So the MMU maps virtual address 8196 to physical address 24580 (page frame base + offset = 24576+4 = 24580). Similarly, if we read virtual address 1026, whose binary form is 0000010000000010, the page index is 0000 = 0 and the offset is 010000000010 = 1026. Page 0 is mapped to page frame 2, whose physical address range is 8192~12287, so the MMU maps virtual address 1026 to physical address 9218 (page frame base + offset = 8192+1026 = 9218). This is the working process of the MMU.
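The split into page index and offset is just a shift and a mask. A minimal sketch in C for the 16-bit example machine (the function and macro names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_BITS 12                           /* 2^12 = 4096 bytes per page */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u)   /* low 12 bits */

/* Upper 4 bits of the 16-bit virtual address: the page index. */
static unsigned page_index(uint16_t vaddr)
{
    return (unsigned)vaddr >> OFFSET_BITS;
}

/* Lower 12 bits: the offset from the start of the page. */
static unsigned page_offset(uint16_t vaddr)
{
    return (unsigned)vaddr & OFFSET_MASK;
}

/* Physical address = page frame base + offset. */
static unsigned physical_address(unsigned frame, unsigned offset)
{
    assert(offset <= OFFSET_MASK);
    return (frame << OFFSET_BITS) | offset;
}
```

For virtual address 8196 this gives page index 2 and offset 4, and with page frame 6 the physical address 24580. For 1026 it gives page index 0 and offset 1026, and with page frame 2 the physical address 9218.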

III. The working process of the S3C24xx MMU


Below we use the MMU of the S3C2410 as our example (note 1).
The S3C2410 has 4 memory mapping modes in total:
1. Fault (no mapping)
2. Coarse Page (coarse page table)
3. Section (segment)
4. Fine Page (fine page table)
We use Section mode for the explanation.
The ARM920T is a 32-bit CPU with a virtual address space of 2^32=4G. In Section mode, this 4G virtual space is divided into units called sections (segments), which are essentially the same as the pages discussed above, except that each section is 1M long (the pages we used earlier were 4K). The 4G of virtual memory is divided into 4096 sections (1M*4096=4G), so 4096 descriptors are needed to describe them. Each descriptor occupies 4 bytes, so the set of descriptors takes 16KB (4 bytes*4096). These 4096 descriptors form a table, which we call the translation table.

The structure of a descriptor is as follows:
Section Base Address: the base address of the section (equivalent to the page frame number)
AP: access permission bits
Domain: index into the domain access control register. Domain is used together with AP to check access rights
C: cacheable bit. With C=1 and B=0 the section is cached in write-through (WT) mode
B: bufferable bit. With C=1 and B=1 the section is cached in write-back (WB) mode
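The fields listed above can be packed into a 32-bit descriptor with a few shifts. The sketch below is an illustration, not code from the original article: the bit positions (base in bits [31:20], AP in [11:10], domain in [8:5], bit 4 written as 1, C in bit 3, B in bit 2, type `10` in bits [1:0]) follow the ARM920T first-level section descriptor layout, and all macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_TYPE      0x2u         /* bits [1:0] = 10: section descriptor */
#define SECTION_BIT4      (1u << 4)    /* bit 4 is written as 1 */
#define SECTION_B         (1u << 2)    /* bufferable */
#define SECTION_C         (1u << 3)    /* cacheable */
#define SECTION_BASE_MASK 0xFFF00000u  /* bits [31:20]: section base address */

/* Build a section descriptor from its fields. */
static uint32_t make_section_descriptor(uint32_t phys_base, unsigned ap,
                                        unsigned domain, int cacheable,
                                        int bufferable)
{
    uint32_t d;

    assert((phys_base & ~SECTION_BASE_MASK) == 0);   /* must be 1M aligned */
    assert(ap < 4 && domain < 16);
    d  = phys_base | SECTION_BIT4 | SECTION_TYPE;
    d |= (uint32_t)ap << 10;
    d |= (uint32_t)domain << 5;
    if (cacheable)
        d |= SECTION_C;
    if (bufferable)
        d |= SECTION_B;
    return d;
}

/* Field accessors, the inverse of the packing above. */
static uint32_t descriptor_base(uint32_t d)   { return d & SECTION_BASE_MASK; }
static unsigned descriptor_ap(uint32_t d)     { return (d >> 10) & 0x3u; }
static unsigned descriptor_domain(uint32_t d) { return (d >> 5) & 0xFu; }
```

For example, a section at 0x30000000 with AP=11, domain 0, and neither C nor B set packs to 0x30000C12.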
The following is a memory map of the S3C2410:

The SDRAM configured on my S3C2410 board is 64M in size, with a physical address range of 0x3000 0000~0x33ff ffff (in bank 6). Because one section is 1M, this physical space can be divided into 64 physical sections (page frames).


In Section mode, the virtual address sent to the MMU (note 1) is divided into two parts, just as in the example above: the descriptor index (equivalent to the page index above) and the offset. The descriptor index is 12 bits long (2^12=4096, the number of descriptors), and the offset is 20 bits long (2^20=1M, the size of a section). Now look at the Section Base Address field of a descriptor. It is 12 bits long and holds the upper 12 bits of the physical address of the physical section (page frame) to which the virtual section (page) is mapped. Because each physical section is 1M long and aligned on a 1M boundary, the low 20 bits of its base address are always 0x00000. A physical address is therefore determined as physical page frame base address + offset part of the virtual address = (Section Base Address<<20) + offset. If this is a little confusing, a practical example should help.

Suppose we now execute the instruction MOV reg,0x30000012. The binary form of the virtual address is 00110000 00000000 00000000 00010010. The first 12 bits are the descriptor index = 0011 0000 0000 = 768, so descriptor 768 is looked up in the translation table. Its Section Base Address is 0x300, which means that the physical section (page frame) mapped to this virtual section (page) starts at 0x3000 0000 (physical section base address = Section Base Address shifted left by 20 bits = 0x300<<20 = 0x3000 0000). The offset is 0000 00000000 00010010 = 0x12, so virtual address 0x30000012 is mapped to physical address 0x3000 0000+0x12 = 0x3000 0012 (physical page frame base address + offset in the virtual address). You might ask why the virtual address is the same as the physical address it maps to. This is determined by the mapping rule we define; in this example, the rule maps each virtual address to the physical address equal to it. The code that sets up this mapping looks like this:
/* mmu_tlb_base (unsigned long *, the base of the translation table) and
 * MMU_OTHER_SECDESC (the AP, domain, C, B and type bits of a section
 * descriptor) are assumed to be defined elsewhere in the project. */
void mem_mapping_linear(void)
{
    unsigned long descriptor_index, section_base, sdram_base, sdram_size;

    sdram_base = 0x30000000;    /* start of SDRAM */
    sdram_size = 0x4000000;     /* 64M */
    for (section_base = sdram_base, descriptor_index = section_base >> 20;
         section_base < sdram_base + sdram_size;
         descriptor_index += 1, section_base += 0x100000)
    {
        /* The section base address occupies bits [31:20] of the descriptor. */
        *(mmu_tlb_base + descriptor_index) = (section_base & 0xFFF00000) | MMU_OTHER_SECDESC;
    }
}


The code above maps the virtual space 0x3000 0000~0x33ff ffff to the physical space 0x3000 0000~0x33ff ffff. Because the virtual space coincides with the physical space, each virtual address has the same value as its physical address. Once the translation table is filled in, remember to load its base address (the address of descriptor 0) into register 2 of coprocessor CP15, which is called the translation table base (TTB) register.
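The linear mapping and the section lookup described above can be tried out on an ordinary host machine by modelling the translation table as an array. This is a sketch under assumptions, not the board code: `translation_table`, `map_sdram_linear`, `section_translate` and the attribute value `SECDESC_FLAGS` are hypothetical names and values chosen for the illustration, and only section entries are modelled.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SECTIONS  4096u
#define SECTION_SHIFT 20
#define SECTION_SIZE  (1u << SECTION_SHIFT)   /* 1M */
#define OFFSET_MASK   (SECTION_SIZE - 1u)     /* low 20 bits */
#define BASE_MASK     (~OFFSET_MASK)          /* high 12 bits */
#define SECDESC_FLAGS 0xC12u   /* assumed attribute bits; low bits 10 = section */

/* 4096 descriptors * 4 bytes = 16KB */
static uint32_t translation_table[NUM_SECTIONS];

/* Map the 64M of SDRAM at 0x30000000 onto itself; returns sections mapped. */
static unsigned map_sdram_linear(void)
{
    const uint32_t sdram_base = 0x30000000u;
    const uint32_t sdram_size = 0x04000000u;   /* 64M */
    unsigned count = 0;

    for (uint32_t base = sdram_base; base < sdram_base + sdram_size;
         base += SECTION_SIZE) {
        translation_table[base >> SECTION_SHIFT] = (base & BASE_MASK) | SECDESC_FLAGS;
        count++;
    }
    return count;
}

/* Physical address = section base from the descriptor + offset. */
static uint32_t section_translate(uint32_t vaddr)
{
    uint32_t descriptor = translation_table[vaddr >> SECTION_SHIFT];

    assert((descriptor & 0x3u) == 0x2u);   /* only section entries are modelled */
    return (descriptor & BASE_MASK) | (vaddr & OFFSET_MASK);
}
```

After `map_sdram_linear()`, descriptor 768 holds the base 0x30000000 and `section_translate(0x30000012)` returns 0x30000012, as in the worked example.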


The discussion above covered the Section Base Address in the descriptor and the mapping between virtual and physical addresses, but the MMU has another important function: access control (permission checking). Simply put, access control is the means by which the CPU determines whether the current program's access to memory is legitimate (whether it has permission to access that memory). If the current program does not have permission for the memory area it is trying to access, the CPU raises an exception. The S3C2410 calls this exception a permission fault; the x86 architecture calls it a general protection fault. What causes a permission fault? For example, a user-level program that writes to a system-level memory area oversteps its authority and should cause a permission fault. Readers familiar with the x86 architecture will have heard of protected mode, which works on the same idea, so we can also say that the access control mechanism of the S3C2410 is really a protection mechanism. Which elements take part in access control on the S3C2410, and how do they work together? The elements are the following:
1. Coprocessor CP15 register 3: the domain access control register
2. The AP bits and domain bits in the section descriptor
3. The S bit and R bit in control register 1 of coprocessor CP15
4. Coprocessor CP15 register 5 (fault status register)
5. Coprocessor CP15 register 6 (fault address register)
The domain access control register has 32 valid bits, divided into 16 fields of two bits each. Each field describes the level of access checking for the corresponding memory domain, as shown in the figure:


Each field can hold one of 4 values, 00, 01, 10, or 11 (binary), with the following meanings:


00: no access to the memory area is allowed; any access causes a domain fault
01: accesses to the memory area are checked against the AP bits in the section descriptor of that area
10: reserved (it is best not to use this value, to avoid unpredictable behavior)
11: accesses to the memory area are not checked for permission
Now look at the domain field in the descriptor. It is 4 bits wide, and its value is the index of one of the 16 fields in the domain access control register. The AP bits, together with the S bit and R bit, describe the access permission of the memory area that the descriptor covers; they combine as shown in the figure:


The AP field also has four possible values, which I illustrate with examples.
In the following examples, the domain access control register is initialized to 0xFFFF BDCF.
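Extracting the 2-bit field of a given domain from the register value is a shift and a mask. A minimal sketch in C (the enum and function names are hypothetical; "client" and "manager" are the names ARM uses for the values 01 and 11):

```c
#include <assert.h>
#include <stdint.h>

enum domain_access {
    DOMAIN_NO_ACCESS = 0,   /* 00: any access causes a domain fault */
    DOMAIN_CLIENT    = 1,   /* 01: check the AP bits in the descriptor */
    DOMAIN_RESERVED  = 2,   /* 10: reserved */
    DOMAIN_MANAGER   = 3    /* 11: no permission check */
};

/* Return the 2-bit field of `domain` (0..15) in the register value `dacr`. */
static enum domain_access domain_access_of(uint32_t dacr, unsigned domain)
{
    assert(domain < 16);
    return (enum domain_access)((dacr >> (2u * domain)) & 0x3u);
}
```

For the value 0xFFFFBDCF, field 0 decodes to 11 (no check) and field 4 decodes to 01 (check the AP bits), which are the two cases the examples rely on.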




Example 1:
Suppose we want to access the memory area described by a descriptor with domain=4.
Because domain=4 and the value of field 4 in the domain access control register is 01, the system checks permissions on the access.
If the CPU is in supervisor mode, the program can read and write the memory area described by the descriptor.
If the CPU is in user mode, the program can read the memory described by the descriptor. A write to it generates a permission fault.

Example 2:
Descriptor with domain=0, AP=10 (in this case the S bit and R bit are ignored)
Because domain=0 and the value of field 0 in the domain access control register is 11, the system does not check permissions on accesses to this memory area.
Since no access is checked, the program can read and write the memory described by the descriptor whichever mode the CPU is in (supervisor mode or user mode).

Example 3:
Descriptor with domain=4, AP=11 (in this case the S bit and R bit are ignored)
Because domain=4 and the value of field 4 in the domain access control register is 01, the system checks permissions on the access.
Because AP=11, the program can read and write the memory described by the descriptor whichever mode the CPU is in (supervisor mode or user mode).


Example 4:
Descriptor with domain=4, AP=00, S bit=0, R bit=1
Because domain=4 and the value of field 4 in the domain access control register is 01, the system checks permissions on the access.
Because AP=00, S bit=0 and R bit=1, the program can only read the memory described by the descriptor, whichever mode the CPU is in (supervisor mode or user mode); a write causes a permission fault. (With AP=00, S bit=0 and R bit=0, no access at all is permitted in either mode.)

In summary:
1. Whether an access to a memory area requires a permission check is determined by the domain field in the descriptor of that area, together with the corresponding field of the domain access control register.
2. Whether an access to a memory area is permitted is determined by the AP bits in the descriptor of that area and by the S bit and R bit in control register 1 of coprocessor CP15.
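The two rules can be combined into one check function and run against the four examples. This is a sketch under assumptions rather than a description of the hardware: the AP/S/R combinations follow the access permission table in the ARM920T manual (the article's figure is not available), the reserved domain value 10 is simply treated as no access, Example 1 is assumed to use AP=10 because that is the setting that gives supervisor read/write and user read-only, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum access_result { ACCESS_OK, DOMAIN_FAULT, PERMISSION_FAULT };

/* Check one access. `privileged` is nonzero in supervisor mode,
 * `is_write` is nonzero for a write; `s` and `r` are the S and R bits. */
static enum access_result check_access(uint32_t dacr, unsigned domain,
                                       unsigned ap, int s, int r,
                                       int privileged, int is_write)
{
    unsigned field;
    int can_read, can_write;

    assert(domain < 16 && ap < 4);
    field = (dacr >> (2u * domain)) & 0x3u;

    if (field == 0x3u)
        return ACCESS_OK;        /* 11: no permission check */
    if (field != 0x1u)
        return DOMAIN_FAULT;     /* 00: no access; 10 is treated the same here */

    switch (ap) {
    case 0:                      /* depends on the S and R bits */
        can_read  = (!s && r) || (s && !r && privileged);
        can_write = 0;
        break;
    case 1:                      /* supervisor read/write, user no access */
        can_read = can_write = privileged;
        break;
    case 2:                      /* supervisor read/write, user read-only */
        can_read  = 1;
        can_write = privileged;
        break;
    default:                     /* AP=11: read/write in every mode */
        can_read = can_write = 1;
        break;
    }
    return (is_write ? can_write : can_read) ? ACCESS_OK : PERMISSION_FAULT;
}
```

With the register value 0xFFFFBDCF, a user-mode write to a domain 4, AP=10 section gives `PERMISSION_FAULT`, while the same write to a domain 0 section succeeds because field 0 is 11 and no check is made.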

Original Address http://blog.ednchina.com/LHDDSHL/292841/message.aspx
