Embedded Linux: An Analysis of ARM MMU Principles

Source: Internet
Author: User

In this article I introduce and summarize each step of my study of embedded Linux. I want to sum up my own experience, and I hope it helps anyone who wants to get started with embedded Linux. If you find any errors, please point them out.
  • Resources are shared and reprinting is welcome: http://hbhuanggang.cublog.cn

I. Origin of the MMU

Many years ago, when people were still using DOS or even older operating systems, computer memory was very small and was generally measured in kilobytes. Programs were small too, so the limited memory could still hold them. With the rise of graphical interfaces and growing user demands, applications grew as well, and programmers eventually faced a problem: the application was too large to fit in memory. The common solution was to split the program into many fragments called overlays. Overlay 0 runs first and, when it finishes, calls another overlay. Although swapping the overlays in and out is done by the OS, the programmer must first split the program by hand, which is time-consuming and tedious. A better way was needed to solve the problem at its root, and it was soon found: virtual memory.

The basic idea of virtual memory is that the total size of the program, data, and stack may exceed the size of physical memory. The operating system keeps the parts currently in use in memory and stores the unused parts on disk. For example, given a 16 MB program and a machine with only 4 MB of memory, the OS decides at each moment which 4 MB of content stays in memory and swaps program fragments between memory and disk as needed. In this way the 16 MB program can run on a machine with only 4 MB of memory, and the programmer does not have to split it beforehand.

At any time there is a set of addresses that a program on the computer can generate, which we call the address range. The size of this range is determined by the bit width of the CPU. For example, a 32-bit CPU has an address range of 0 ~ 0xffffffff (4 GB), and a 64-bit CPU has an address range of 0 ~ 0xffffffffffffffff (16 EB). This is the range of addresses our programs can generate. We call it the virtual address space, and a specific address in it is called a virtual address. Corresponding to the virtual address space and virtual addresses are the physical address space and physical addresses. In most cases the physical address space of a system is only a subset of the virtual address space. Here is a simple example to illustrate the two: for a 32-bit x86 host with 256 MB of memory, the virtual address space is 0 ~ 0xffffffff (4 GB) and the physical address space is 0x00000000 ~ 0x0fffffff (256 MB).

On a machine that does not use virtual memory, the virtual address is sent directly to the memory bus, and the physical memory location with the same address is read or written. When virtual memory is used, the virtual address is not sent directly to the memory bus but to the memory management unit, the MMU (the main character has finally appeared). Historically the MMU consisted of one or more separate chips; on ARM it is built into the processor and controlled through coprocessor CP15. Its function is to map virtual addresses to physical addresses.

 

II. How the MMU Works

Most systems that use virtual memory use a technique called paging. The virtual address space is divided into units called pages, and the physical address space is divided correspondingly into units called page frames. Pages and page frames must be the same size. Next I will use an example to show how pages are mapped to page frames under the control of the MMU:

In this example we have a machine that can generate 16-bit addresses. Its virtual address range is 0x0000 ~ 0xFFFF (64 K), but it has only 32 K of physical memory. It can run a 64 K program, but the program cannot be loaded into memory all at once. The machine must therefore have external storage (such as a disk or flash) large enough to hold the 64 K program, so that program fragments can be brought in as needed. In this example the page size is 4 K and the page frame size is the same (this must be guaranteed, because transfers between memory and external storage are always done in units of pages). The 64 K virtual address space and the 32 K physical memory thus contain 16 pages and 8 page frames respectively.

Of the terms that come with paging, we have already met pages and page frames. In the figure, the green part is the physical space, and each cell represents a physical page frame. The orange part is the virtual space, and each cell represents a page. The entry for each page consists of two parts: the frame index (page frame index) and P (the present bit). The meaning of the frame index is obvious: it indicates the physical page frame to which the page is mapped. The P bit indicates whether the mapping of the page is valid. When a page is not mapped (or the mapping is invalid, in which case the frame index is shown as X) the bit is 0, and when the mapping is valid the bit is 1.

We execute the following instructions (the instructions in this example are pseudo-instructions and do not belong to any specific machine).
Example 1:
Move Reg, 0 // load the value at address 0 into register Reg
Virtual address 0 is sent to the MMU. The MMU sees that the address falls within page 0 (page 0 covers 0 to 4095), and from the figure we can see that page 0 is mapped to page frame 2 (page frame 2 covers 8192 to 12287). The MMU therefore converts the virtual address to physical address 8192 and puts 8192 on the address bus. The memory knows nothing about the MMU mapping. It only sees a read request for address 8192 and carries it out. In this way the MMU maps virtual addresses 0 to 4095 to physical addresses 8192 to 12287.

Example 2:
Move Reg, 8192
is converted to
Move Reg, 24576
because virtual address 8192 lies in page 2, and page 2 is mapped to page frame 6 (physical addresses 24576 to 28671).

Example 3:
Move Reg, 20500
is converted to
Move Reg, 12308
Virtual address 20500 lies 20 bytes from the start of page 5 (page 5 covers virtual addresses 20480 to 24575). Virtual page 5 is mapped to page frame 3 (page frame 3 covers 12288 to 16383), so the address is mapped to physical address 12288 + 20 = 12308.
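
The three translations above can be reproduced with a small table lookup. The following C code is only a sketch of the idea, not the behavior of any real MMU: the names `struct pte`, `page_table`, and `translate` are made up for illustration, and only the mappings quoted in the text (page 0 to frame 2, page 1 to frame 1, page 2 to frame 6, page 5 to frame 3) are filled in. Treating every other page as unmapped is an assumption of the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* 4 K pages, as in the example */
#define NUM_PAGES 16u     /* 64 K virtual space / 4 K per page */

/* One entry per virtual page: frame index plus present bit (P). */
struct pte {
    uint8_t frame;    /* page frame index, meaningful only if present == 1 */
    uint8_t present;  /* P bit: 1 = mapped, 0 = not mapped */
};

/* Only the mappings quoted in the text are filled in; all other
 * entries stay unmapped, which is an assumption of this sketch. */
static struct pte page_table[NUM_PAGES] = {
    [0] = { 2, 1 },
    [1] = { 1, 1 },
    [2] = { 6, 1 },
    [5] = { 3, 1 },
};

/* Returns 0 and stores the physical address in *paddr on success,
 * or -1 if the page is not present (a real MMU would raise a page fault). */
static int translate(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t page   = vaddr / PAGE_SIZE;
    uint32_t offset = vaddr % PAGE_SIZE;

    assert(page < NUM_PAGES);          /* address must fit in 16 bits */
    if (!page_table[page].present)
        return -1;
    *paddr = page_table[page].frame * PAGE_SIZE + offset;
    return 0;
}
```

Under these assumptions, `translate(0, &pa)`, `translate(8192, &pa)`, and `translate(20500, &pa)` store 8192, 24576, and 12308 in `pa`, matching Examples 1 to 3.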

By setting up the MMU properly, we can map the 16 virtual pages onto any of the 8 page frames. However, this alone does not solve the problem of the virtual address space being larger than the physical address space. We have only 8 page frames (physical) but 16 pages (virtual), so only 8 of the 16 pages can be mapped at any one time. Let's see what happens in Example 4.

Example 4:
Move Reg, 32780
Virtual address 32780 falls within page 8. From the figure we can see that page 8 is not mapped (it is marked with X). What happens then? The MMU notices that the page is not mapped and notifies the CPU of a page fault. The operating system must handle the fault: it picks a rarely used page frame from the eight physical page frames and writes its contents out to external storage (this action is called a page copy), then maps the page being referenced (page 8 in Example 4) to the page frame just released (this action is called modifying the mapping), and finally re-executes the faulting instruction (Move Reg, 32780). Suppose the operating system decides to release page frame 1. It loads virtual page 8 into physical addresses 4 K ~ 8 K and makes two modifications. First, it marks virtual page 1 as unmapped (virtual page 1 was originally mapped to page frame 1), so that any later access to virtual addresses 4 K ~ 8 K causes a page fault and the operating system responds in the way just described. Second, it changes the frame index of page 8 from X to 1 and sets its present bit. When Move Reg, 32780 is re-executed, the MMU maps 32780 to 4108 (4096 + 12).
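
The page fault sequence described above can be sketched in C as follows. This is a simplified illustration under stated assumptions, not how any particular operating system does it: `write_frame_to_disk` and `read_page_from_disk` are hypothetical placeholders for the real disk I/O, the victim is hard-coded to page 1 to mirror the text (a real OS would choose it with a replacement policy), and pages not mentioned in the text are assumed unmapped.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NUM_PAGES 16u

struct pte { uint8_t frame; uint8_t present; };

/* Mappings quoted in the text; other entries left unmapped (assumption). */
static struct pte page_table[NUM_PAGES] = {
    [0] = { 2, 1 }, [1] = { 1, 1 }, [2] = { 6, 1 }, [5] = { 3, 1 },
};

/* Hypothetical hooks standing in for the disk I/O a real OS performs. */
static void write_frame_to_disk(unsigned frame, unsigned page) { (void)frame; (void)page; }
static void read_page_from_disk(unsigned page, unsigned frame) { (void)page; (void)frame; }

/* Evict victim_page and hand its page frame to faulting_page. */
static void handle_page_fault(unsigned faulting_page, unsigned victim_page)
{
    unsigned frame;

    assert(faulting_page < NUM_PAGES && victim_page < NUM_PAGES);
    assert(page_table[victim_page].present);
    frame = page_table[victim_page].frame;

    write_frame_to_disk(frame, victim_page);            /* the "page copy" */
    page_table[victim_page].present = 0;                /* first modification */
    read_page_from_disk(faulting_page, frame);
    page_table[faulting_page].frame = (uint8_t)frame;   /* second modification */
    page_table[faulting_page].present = 1;
}

/* Translate an address, handling a page fault first if needed.
 * The victim is fixed to page 1 only to mirror the example in the text. */
static uint32_t mmu_access(uint32_t vaddr)
{
    unsigned page = vaddr / PAGE_SIZE;

    assert(page < NUM_PAGES);
    if (!page_table[page].present)
        handle_page_fault(page, 1);
    return page_table[page].frame * PAGE_SIZE + vaddr % PAGE_SIZE;
}
```

With this table, `mmu_access(32780)` evicts page 1, maps page 8 to page frame 1, and returns 4108.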

We now have a general idea of what role the MMU plays in our machine and what its basic job is. The following example shows how it actually works (note that the MMU in this example is not that of a specific machine; it is an abstraction of how MMUs work in general).

First of all, let us be clear that the MMU has only one main task: to map virtual addresses to physical addresses.
As we already know, most systems that use virtual memory use a technique called paging. As in the example above, the virtual address space is divided into a group of pages of equal size, and each page has a page number (generally its index within the group, much like an array index in C/C++). In the example above, the page number of 0 ~ 4 K is 0, that of 4 K ~ 8 K is 1, that of 8 K ~ 12 K is 2, and so on. A virtual address (note: a single address, not a range) is split by the MMU into two parts: the first part is the page index and the second part is the offset from the start of the page. We again use the 16-bit machine as an example. Virtual address 8196 is sent to the MMU, which maps it to a physical address. The total address range of a 16-bit CPU is 0 ~ 64 K, so at 4 K per page the space is divided into 16 pages. The first part of the virtual address must therefore be able to take 16 different values (so that every page in the group can be indexed), which requires at least 4 bits. A page is 4 K (4096) bytes, so the offset needs 12 bits (2^12 = 4096, so that every address in the page can be reached). The binary form of 8196 is 0010 000000000100 (page index followed by offset).

The page index of this address is 0010 (binary), that is, page 2, and the second part is 000000000100 (binary), that is, an offset of 4. The page frame for page 2 is 6 (page 2 is mapped to page frame 6, see the figure), and page frame 6 covers physical addresses 24 K ~ 28 K. The MMU therefore maps virtual address 8196 to physical address 24580 (start address of the page frame + offset = 24576 + 4 = 24580). Similarly, if we read virtual address 1026, whose binary form is 0000010000000010, then the page index is 0000 = 0 and the offset is 010000000010 = 1026. The page number is 0, the page frame mapped to this page is 2, and page frame 2 covers physical addresses 8192 ~ 12287, so the MMU maps virtual address 1026 to physical address 9218 (start address of the page frame + offset = 8192 + 1026 = 9218). That is how the MMU works.
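
The split into page index and offset is just a shift and a mask. The C sketch below illustrates this for the 16-bit example machine. The function names are made up for illustration, and the frame index is passed in by the caller rather than looked up, since the lookup table is not the point here.

```c
#include <assert.h>
#include <stdint.h>

#define OFFSET_BITS 12u                        /* 2^12 = 4096 bytes per page */
#define OFFSET_MASK ((1u << OFFSET_BITS) - 1u) /* 0x0FFF */
#define NUM_FRAMES  8u                         /* 32 K physical / 4 K per frame */

/* Upper 4 bits of the 16-bit virtual address: the page index. */
static unsigned page_index(uint16_t vaddr)
{
    return (unsigned)vaddr >> OFFSET_BITS;
}

/* Lower 12 bits of the virtual address: the offset within the page. */
static unsigned page_offset(uint16_t vaddr)
{
    return (unsigned)vaddr & OFFSET_MASK;
}

/* Combine a frame index (as found by the MMU) with the offset. */
static unsigned physical_address(unsigned frame, uint16_t vaddr)
{
    assert(frame < NUM_FRAMES);
    return (frame << OFFSET_BITS) | page_offset(vaddr);
}
```

For 8196, `page_index` gives 2 and `page_offset` gives 4, and `physical_address(6, 8196)` gives 24580. For 1026, the index is 0, the offset is 1026, and `physical_address(2, 1026)` gives 9218.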

III. The MMU of the S3C24xx

Next we will look at the MMU of the S3C2410 (Note 1).
The S3C2410 has four memory mapping modes in total:
1. Fault (no mapping)
2. Coarse page
3. Section
4. Fine page
Here we will describe the section mode.
The ARM920T is a 32-bit CPU, so its virtual address space is 2^32 = 4 GB. In section mode, this 4 GB virtual space is divided into units called sections (essentially the same thing as the pages discussed above). Each section is 1 MB long (the pages we used earlier were 4 K long). The 4 GB virtual space is thus divided into 4096 sections (1 MB × 4096 = 4 GB), and we need 4096 descriptors to describe them. Each descriptor occupies 4 bytes, so the whole set of descriptors takes 16 KB (4 bytes × 4096). These 4096 descriptors form a table called the translation table.

The descriptor structure is shown in the figure. Its fields are:
Section base address: the base address of the section (equivalent to the start address of a page frame)
AP: access permission
Domain: the index of a domain in the domain access control register; the domain is used together with AP to check access permissions
C: cacheable bit; the section is cached when C is set to 1
B: bufferable bit; with C = 1, B = 0 selects write-through (WT) mode and B = 1 selects write-back (WB) mode
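
To make the field layout concrete, the C sketch below packs these fields into a 32-bit section descriptor. The bit positions used here (base address in bits 31:20, AP in bits 11:10, domain in bits 8:5, bit 4 written as 1, C in bit 3, B in bit 2, and 10 in bits 1:0 to mark a section) are my reading of the ARM920T first-level descriptor format and should be checked against the reference manual. The function and macro names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_TYPE_SECTION 0x2u        /* bits [1:0] = 10 marks a section */
#define DESC_BIT4         (1u << 4)   /* written as 1 in a section descriptor */
#define DESC_C            (1u << 3)   /* cacheable */
#define DESC_B            (1u << 2)   /* bufferable */

/* Build a section descriptor from its fields.
 * phys_base must be aligned to 1 MB, ap is 0..3, domain is 0..15. */
static uint32_t make_section_desc(uint32_t phys_base, unsigned ap,
                                  unsigned domain, int c, int b)
{
    assert((phys_base & 0x000FFFFFu) == 0);  /* 1 MB alignment */
    assert(ap < 4 && domain < 16);

    return (phys_base & 0xFFF00000u)   /* section base address, bits [31:20] */
         | ((uint32_t)ap << 10)        /* AP, bits [11:10] */
         | ((uint32_t)domain << 5)     /* domain, bits [8:5] */
         | DESC_BIT4
         | (c ? DESC_C : 0u)
         | (b ? DESC_B : 0u)
         | DESC_TYPE_SECTION;
}
```

Under this layout, a write-back cached section at 0x30000000 with AP = 11 in domain 0 is `make_section_desc(0x30000000, 3, 0, 1, 1)`, which evaluates to 0x30000C1E.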
The following figure shows the memory mapping of the S3C2410:

The SDRAM fitted to my S3C2410 board is 64 MB, and its physical address range is 0x3000 0000 ~ 0x33ff FFFF (it belongs to Bank 6). Since each section is 1 MB, this physical space can be divided into 64 physical sections (page frames).

In section mode, the virtual address sent to the MMU (Note 1) is divided into two parts, just as in the earlier example: the descriptor index (equivalent to the page index above) and the offset. The descriptor index is 12 bits long (2^12 = 4096, exactly the number of descriptors in the translation table), and the offset is 20 bits long (2^20 = 1 MB, exactly the size of a section). Now look at the section base address field of a descriptor. It is 12 bits long, and its value identifies the physical section (page frame) to which the virtual section is mapped. Because each physical section is 1 MB long and aligned to 1 MB, the low 20 bits of its base address are always 0x00000. A physical address is therefore determined as: base address of the physical section + offset part of the virtual address = (section base address << 20) + offset. This may seem confusing, so let's work through an example.

Assume that the following instruction is executed: mov Reg, 0x30000012. The binary form of the virtual address is 00110000 00000000 00000000 00010010. The upper 12 bits are the descriptor index = 0011 0000 0000 = 0x300 = 768, so descriptor No. 768 is looked up in the translation table. Its section base address is 0x300, which means that the physical section (page frame) mapped to the virtual section (page) described by this descriptor starts at 0x3000 0000 (base address of the physical section = section base address shifted left by 20 bits = 0x300 << 20 = 0x3000 0000). The offset is the lower 20 bits = 0000 00000000 00010010 = 0x12, so the physical address mapped to virtual address 0x30000012 is 0x3000 0000 + 0x12 = 0x3000 0012 (base address of the physical section + offset in the virtual address). You may ask why the virtual address is the same as the physical address it maps to. This is determined by the mapping rule we define. In this example the rule maps each virtual address to the physical address with the same value. The mapping code can be written like this:
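/* Map the 64 MB of SDRAM one-to-one using 1 MB section descriptors.
 * mmu_tlb_base (unsigned long *, start of the 16 KB translation table) and
 * mmu_other_secdesc (the AP, domain, C, B and type bits) are assumed to be
 * defined elsewhere. */
void mem_mapping_linear(void)
{
    unsigned long descriptor_index, section_base, sdram_base, sdram_size;

    sdram_base = 0x30000000;
    sdram_size = 0x4000000;    /* 64 MB */
    for (section_base = sdram_base, descriptor_index = section_base >> 20;
         section_base < sdram_base + sdram_size;
         descriptor_index += 1, section_base += 0x100000)
    {
        /* The section base address occupies bits [31:20] of the descriptor. */
        *(mmu_tlb_base + descriptor_index) = (section_base & 0xFFF00000) | mmu_other_secdesc;
    }
}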

The code above maps the virtual space 0x3000 0000 ~ 0x33ff FFFF to the physical space 0x3000 0000 ~ 0x33ff FFFF. Because the virtual space and the physical space coincide, each virtual address has the same value as its physical address. After the translation table has been initialized, remember to load its start address (the address of descriptor No. 0) into control register 2 of coprocessor CP15. This register is called the translation table base (TTB) register.
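
To check the mapping rule without hardware, the lookup the MMU performs in section mode can be imitated in software. The C sketch below is an illustration under stated assumptions: `translation_table` is an ordinary array standing in for the real 16 KB table (which on hardware must be aligned to 16 KB), `other_bits` stands for the AP, domain, C and B bits, and all names are made up for this example. Entries that are not section descriptors are simply reported as untranslated.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_SECTIONS      4096u
#define SECTION_SHIFT     20u
#define SECTION_BASE_MASK 0xFFF00000u  /* bits [31:20] */
#define SECTION_OFF_MASK  0x000FFFFFu  /* bits [19:0]  */
#define DESC_TYPE_MASK    0x3u
#define DESC_TYPE_SECTION 0x2u         /* bits [1:0] = 10 */

/* Software stand-in for the translation table (all entries start as fault). */
static uint32_t translation_table[NUM_SECTIONS];

/* Map virtual 0x30000000 ~ 0x33FFFFFF onto the same physical range. */
static void map_sdram_linear(uint32_t other_bits)
{
    uint32_t base;

    for (base = 0x30000000u; base < 0x34000000u; base += 0x100000u)
        translation_table[base >> SECTION_SHIFT] =
            (base & SECTION_BASE_MASK) | other_bits | DESC_TYPE_SECTION;
}

/* Imitate the section lookup: returns 0 and stores the physical address,
 * or -1 if the descriptor is not a section descriptor. */
static int section_translate(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t desc = translation_table[vaddr >> SECTION_SHIFT];

    assert(paddr != 0);
    if ((desc & DESC_TYPE_MASK) != DESC_TYPE_SECTION)
        return -1;
    *paddr = (desc & SECTION_BASE_MASK) | (vaddr & SECTION_OFF_MASK);
    return 0;
}
```

After `map_sdram_linear(0)`, `section_translate(0x30000012, &pa)` uses descriptor No. 0x300 and stores 0x30000012 in `pa`, while an address outside the SDRAM range, such as 0x00000000, is not translated.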

So far we have discussed the section base address in the descriptor and the mapping between virtual and physical addresses. The MMU has another important function: access permission control. Simply put, access control means that the CPU uses some mechanism to decide whether the current program's access to memory is legal (whether it has permission to access that memory). If the current program does not have permission for the memory area it is trying to access, the CPU raises an exception. The S3C2410 calls this exception a permission fault; on the X86 architecture it is called a general protection fault. What causes a permission fault? For example, if a user-level program tries to write to a system-level memory area, the operation is unauthorized and should cause a permission fault. Readers who have worked on the X86 architecture will have heard of protected mode, which is based on the same idea, so we can also say that the access control mechanism of the S3C2410 is really a protection mechanism. Which elements does the access control mechanism of the S3C2410 involve, and how do they work together? The elements are:
1. Control register 3 of coprocessor CP15: the domain access control register
2. The AP bits and domain bits in the section descriptor
3. The S bit and R bit in control register 1 of coprocessor CP15
4. Control register 5 of coprocessor CP15 (the fault status register)
5. Control register 6 of coprocessor CP15 (the fault address register)
The domain access control register controls access checking. All 32 bits of this register are valid, and they are divided into 16 fields (domains) of two bits each. Each field gives the level of access permission checking for memory in that domain, as shown in the figure:


Each field can hold one of four values: 00, 01, 10, and 11 (binary). Their meanings are as follows:


00: access to memory in this domain is not allowed; any access causes a domain fault
01: access to memory in this domain is checked against the AP bits in the section descriptor of that memory
10: reserved (it is best not to use this value, to avoid unpredictable behavior)
11: access to memory in this domain is not checked
Now look at the domain field in the descriptor. It is 4 bits wide, and its value is an index selecting one of the 16 fields in the domain access control register. The AP bits work together with the S bit and R bit to describe the access permission for the memory area covered by the descriptor. Their combined meaning is shown in the figure:

The AP field also has four possible values. I will explain them with examples.
In the following examples, our domain access control register is initialized to 0xFFFFBDCF, as shown in the figure:
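
The two-bit field of any domain can be pulled out of this register value with a shift and a mask. The C sketch below shows the decoding. The enumerator names are made up for illustration (the text only gives the four binary values), and the function is a plain calculation on a number, not an access to the real CP15 register.

```c
#include <assert.h>
#include <stdint.h>

/* The four possible values of a two-bit domain field. */
enum domain_access {
    DOMAIN_NO_ACCESS = 0,  /* 00: any access causes a domain fault */
    DOMAIN_CHECK_AP  = 1,  /* 01: access is checked against the AP bits */
    DOMAIN_RESERVED  = 2,  /* 10: reserved */
    DOMAIN_NO_CHECK  = 3   /* 11: access is not checked */
};

/* Extract the field for domain 0..15 from a domain access control
 * register value: domain n occupies bits [2n+1 : 2n]. */
static enum domain_access domain_field(uint32_t dacr, unsigned domain)
{
    assert(domain < 16);
    return (enum domain_access)((dacr >> (2u * domain)) & 0x3u);
}
```

For the value 0xFFFFBDCF, `domain_field(0xFFFFBDCF, 0)` gives 11 (no check) and `domain_field(0xFFFFBDCF, 4)` gives 01 (check AP), which are the two domains used in the examples that follow.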

Example 1:
In the descriptor, domain = 4 and AP = 10 (in this case the S bit and R bit are ignored).
Suppose I want to access the memory area described by this descriptor:
Because domain = 4, and the value of field 4 in the domain access control register is 01, the system checks the access permission.
If the CPU is currently in supervisor mode, the program can both read and write the memory area described by this descriptor.
If the CPU is currently in user mode, the program can read the memory area described by this descriptor, but a write causes a permission fault.

Example 2:
In the descriptor, domain = 0 and AP = 10 (in this case the S bit and R bit are ignored).
Because domain = 0, and the value of field 0 in the domain access control register is 11, the system does not check access permissions for any memory area in this domain.
Since no permission check is made, the program can read and write the memory area described by this descriptor whichever mode the CPU is in (supervisor mode or user mode).

Example 3:
In the descriptor, domain = 4 and AP = 11 (in this case the S bit and R bit are ignored).
Because domain = 4, and the value of field 4 in the domain access control register is 01, the system checks the access permission.
Because AP = 11, the program can read and write the memory area described by this descriptor whichever mode the CPU is in (supervisor mode or user mode).

Example 4:
In the descriptor, domain = 4 and AP = 00, with S bit = 0 and R bit = 1.
Because domain = 4, and the value of field 4 in the domain access control register is 01, the system checks the access permission.
Because AP = 00, S bit = 0 and R bit = 1, the program can only read the memory area described by this descriptor, whichever mode the CPU is in (supervisor mode or user mode). A write causes a permission fault. (With AP = 00 and both S and R set to 0, no access at all would be allowed.)
From the four examples above we can draw two conclusions:
1. Whether access to a memory area requires a permission check is determined by the domain field in the descriptor of that memory area, together with the corresponding field of the domain access control register.
2. The access permission for a memory area is determined by the AP bits in its descriptor together with the S bit and R bit in control register 1 of coprocessor CP15.
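
The two conclusions can be combined into one decision function. The C sketch below is only my own summary of the rules, not code from any kernel or boot loader. It assumes the AP/S/R combinations as I understand them from the ARM920T documentation (AP = 01: supervisor read/write, user no access; AP = 10: supervisor read/write, user read only; AP = 11: read/write for both; AP = 00: depends on S and R), and it makes two further assumptions of its own: the reserved domain value 10 and the combination S = 1, R = 1 are treated as denied. Please check the table against the reference manual before relying on it.

```c
#include <assert.h>
#include <stdint.h>

enum access_result { ACCESS_OK, DOMAIN_FAULT, PERMISSION_FAULT };

/* Decide whether an access is allowed.
 * dacr:       value of the domain access control register
 * domain, ap: fields from the section descriptor
 * s, r:       S and R bits from CP15 control register 1
 * privileged: 1 for supervisor mode, 0 for user mode
 * is_write:   1 for a write, 0 for a read */
static enum access_result check_access(uint32_t dacr, unsigned domain,
                                       unsigned ap, int s, int r,
                                       int privileged, int is_write)
{
    unsigned field;
    int can_read, can_write;

    assert(domain < 16 && ap < 4);
    field = (dacr >> (2u * domain)) & 0x3u;

    if (field == 0x3u)
        return ACCESS_OK;        /* 11: no permission check */
    if (field != 0x1u)
        return DOMAIN_FAULT;     /* 00: no access; 10: reserved, denied here (assumption) */

    /* 01: check the AP bits, with S and R consulted only when AP = 00. */
    switch (ap) {
    case 0:
        if (!s && r)      { can_read = 1;          can_write = 0; }
        else if (s && !r) { can_read = privileged; can_write = 0; }
        else              { can_read = 0;          can_write = 0; } /* S=R=0: no access; S=R=1 denied (assumption) */
        break;
    case 1:  can_read = privileged; can_write = privileged; break;
    case 2:  can_read = 1;          can_write = privileged; break;
    default: can_read = 1;          can_write = 1;          break;
    }
    return (is_write ? can_write : can_read) ? ACCESS_OK : PERMISSION_FAULT;
}
```

With the register value 0xFFFFBDCF used in the examples, a user-mode write with domain = 4 and AP = 10 gives a permission fault, while the same write with domain = 0 is allowed because field 0 is 11 and no check is made.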
