SOC embedded software architecture design II: virtual memory management principle, MMU hardware design and code block management

Source: Internet
Author: User

Most of a program's code only needs to be loaded into memory when it is about to execute; once it has run, it can be discarded or overwritten by other code. On a PC we run many applications at the same time, and the virtual address space available to each application is almost the entire linear address space (apart from a region reserved for the operating system). We can therefore think of each application as owning the whole virtual address space (4 GB on a 32-bit CPU), even though physical memory may be only 1 GB or 2 GB. When multiple applications compete for this physical memory, only part of each program can be resident at any given moment; in other words, all program code and data time-share the physical memory space. Supporting this is the core function of the memory management unit (MMU).

Generally, processor-class CPU chips (such as x86, ARM, and MIPS application processors) have an MMU, which works together with the operating system to implement virtual memory management; an MMU is also a hardware requirement for Linux, WinCE, and similar operating systems. Controller-class system chips (for low-end control applications, built around cores such as the ARM7, the MIPS M series, or the 80251), however, generally have no MMU at all, or only a simple linear mapping mechanism.

This article discusses the hardware design of the memory management unit for an SOC in the controller field. The key idea is to time-multiplex physical memory among code and data as they are used; the goal is to save as much physical memory as possible while still guaranteeing system function and performance. Related articles include "SOC Software Architecture Design: System Memory Requirement Evaluation" and "Memory-Saving Software Design Skills".

 

I. Memory Management Unit (MMU) Working Mechanism

Before looking at memory management in the controller field, let us first review the virtual memory management mechanism in the processor field, since the former borrows heavily from the core mechanism of the latter. Virtual memory management is coordinated by the following modules: the CPU, the MMU, the operating system, and physical memory (for simplicity, assume the chip has no cache):

 

Let's walk through a CPU memory access. Assume the address is 0x10000008 and the page size is 4 KB (12 bits of offset). The virtual address is split into two parts: a virtual page number (the upper 20 bits, 0x10000) and a page offset (the lower 12 bits, 0x8). The CPU drives the address (0x10000008) onto the bus to the MMU, and the MMU takes the page-number part (20 bits) and matches it against the TLB.
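The address split described above can be sketched in C. This is a minimal illustration, assuming the 4 KB page size from the text; the function names are invented for the example.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                        /* 4 KB page -> 12 offset bits */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

/* Upper 20 bits: the virtual page number matched against the TLB. */
uint32_t vpn_of(uint32_t vaddr)    { return vaddr >> PAGE_SHIFT; }

/* Lower 12 bits: the offset within the 4 KB page, passed through unchanged. */
uint32_t offset_of(uint32_t vaddr) { return vaddr & PAGE_MASK; }
```

For the example address 0x10000008, `vpn_of` yields 0x10000 and `offset_of` yields 0x8, matching the split in the text.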

What is the TLB? The translation lookaside buffer is commonly rendered as "translation backup buffer", a translation that says little about what the unit actually does. It acts as a cache of the page table, so I prefer to call it the page-table cache. Its structure is as follows:

 

You can picture the TLB as a small array of index entries, each holding a virtual page address and a physical page address, implemented as registers in the chip. The registers are typically 32 bits wide, and the page addresses stored in the TLB are 32-bit values as well, but only the upper 20 bits take part in the index comparison; the lower 12 bits are free to carry flag bits. For example, one bit can mark an entry as resident, meaning the entry is always valid and may not be replaced; this is usually reserved for code with strict performance requirements, such as codec algorithms. A non-resident entry, by contrast, may be evicted when a new page address is accessed while the TLB is already full.
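A TLB entry of this shape, with the linear match on the upper 20 bits, might look like the sketch below. The field names, flag positions, and 16-entry size are illustrative, not any real chip's layout.

```c
#include <stdint.h>

/* Hypothetical TLB entry: upper 20 bits of each word are the page number,
   the low 12 bits of the physical word carry flags, as described in the text. */
typedef struct {
    uint32_t vpage;   /* bits 31..12: virtual page number                     */
    uint32_t ppage;   /* bits 31..12: physical page; low bits: flag bits      */
} tlb_entry_t;

#define TLB_SIZE     16
#define K_BIT        (1u << 0)   /* code page loaded into memory              */
#define RESIDENT_BIT (1u << 1)   /* entry locked, never replaced              */
#define VALID_BIT    (1u << 2)   /* entry in use                              */

tlb_entry_t tlb[TLB_SIZE];

/* Match the upper 20 bits of the address against every valid entry;
   returns the entry index on a hit, -1 on a miss. */
int tlb_lookup(uint32_t vaddr) {
    uint32_t vpn = vaddr >> 12;
    for (int i = 0; i < TLB_SIZE; i++) {
        if ((tlb[i].ppage & VALID_BIT) && (tlb[i].vpage >> 12) == vpn)
            return i;
    }
    return -1;
}
```

In hardware the comparison is done in parallel across all entries; the for loop here only models the matching logic.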

1) If the upper 20 bits of 0x10000008 hit index M in the TLB, the virtual page has already been assigned physical memory and recorded in the page table. Whether the code page for this virtual address has actually been loaded from external storage (flash, card, hard disk) into memory requires an additional mark: one of the low 12 flag bits of the TLB entry mentioned above, call it K. K = 1 means the code or data has been loaded into memory; K = 0 means it has not. If K is 1, the physical page address in entry M supplies the upper 20 bits and the page offset 0x8 supplies the lower 12 bits, forming the physical address that is sent to memory, and the access completes.

2) If K is 0, the code or data has not yet been loaded into memory. The MMU signals the interrupt management module, triggering an interrupt into kernel state; the operating system loads the corresponding code page into memory, then sets the K bit of the matching page table entry and of the matching TLB entry to 1.

3) If the upper 20 bits of 0x10000008 miss in every TLB index, the MMU likewise signals the interrupt management module and the CPU enters kernel state. The operating system shifts 0x10000008 right by 12 bits (divides by 4 K) to index the page table and reads the corresponding physical page value. If that value is non-zero, the code has already been loaded into memory, and the page table entry is simply written into a free TLB entry. If the value is 0, the virtual page has not yet been assigned physical memory: the OS allocates a physical page, writes the corresponding page table entry (with K still 0 at this point), and finally writes that entry into one of the TLB slots.
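The three cases can be combined into one translation sketch. Everything here is illustrative: it re-declares a minimal TLB, uses a flat page table indexed by the virtual page number (as the text introduces below), always refills slot 0 (a real design needs an eviction policy that respects resident entries), and returns fault codes where the real hardware would raise an interrupt for the OS.

```c
#include <stdint.h>

#define K_BIT  (1u << 0)   /* page loaded from external storage into RAM */

enum { XLATE_OK, FAULT_NOT_LOADED, FAULT_NOT_MAPPED };

typedef struct { uint32_t vpn, ppn, flags, valid; } tlb_entry;

static tlb_entry tlb[16];
static uint32_t  page_table[1u << 20];   /* ppn<<12 | flags; 0 = unmapped */

int translate(uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn = vaddr >> 12;
    /* Cases 1 and 2: TLB hit. */
    for (unsigned i = 0; i < 16; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            if (!(tlb[i].flags & K_BIT))
                return FAULT_NOT_LOADED;      /* case 2: OS loads the page  */
            *paddr = (tlb[i].ppn << 12) | (vaddr & 0xFFFu);
            return XLATE_OK;                  /* case 1: access completes   */
        }
    }
    /* Case 3: TLB miss -> index the flat page table by the page number.    */
    uint32_t pte = page_table[vpn];
    if (pte == 0)
        return FAULT_NOT_MAPPED;              /* OS must allocate a page    */
    tlb[0].vpn = vpn;  tlb[0].ppn = pte >> 12;
    tlb[0].flags = pte & 0xFFFu;  tlb[0].valid = 1;
    return translate(vaddr, paddr);           /* retry after the refill     */
}
```

With a mapped, loaded page at virtual page 0x10000 and physical page 0x00300, translating 0x10000008 yields the physical address 0x00300008, exactly as in the worked example above.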

Steps 2) and 3) are both completed in the interrupt (kernel) state. Why are they not merged into one? Mainly because a single interrupt should not do too much work, which would increase interrupt latency and hurt system performance; that said, a chip that handles both in one interrupt is understandable. Now let's look at the structure of the page table. The page table could also be organized as index pairs, like the TLB, but that has two disadvantages:

1) The page table maps every virtual page, and maintaining it already costs a lot of memory. With a 4 K page size, mapping 4 GB requires 4 GB / 4 KB = 1 M indexes; at 4 * 2 = 8 bytes per index (virtual page plus physical page), that is 8 MB of memory.

2) If we used the TLB-style structure, matching an index in software would be a for-loop scan, which is inefficient, and remember that this work is done in the interrupt state.

Therefore the page table is generally designed as a one-dimensional array: the entire linear virtual address space, taken page by page in order, serves as the array subscript. The first word (4 bytes) of the page table maps the lowest 4 K of the virtual address space, the second word maps the next 4 K, and so on; the Nth word maps the Nth 4 K block, i.e., the address range (N-1) * 4 K to N * 4 K. The page table then occupies 1 M * 4 bytes = 4 MB, and finding an entry is just an offset calculation, which is very fast.
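The size arithmetic from the two paragraphs above, and the offset-only lookup of the flat layout, can be checked directly; the macro names are invented for this sketch.

```c
#include <stdint.h>

/* Sizes implied by the text: 4 GB linear space, 4 KB pages. */
#define VSPACE   (1ull << 32)        /* 4 GB virtual address space          */
#define PAGE     (1ull << 12)        /* 4 KB page                           */
#define ENTRIES  (VSPACE / PAGE)     /* 1 M page-table entries              */
#define FLAT_SZ  (ENTRIES * 4)       /* flat array, one 32-bit word: 4 MB   */
#define PAIR_SZ  (ENTRIES * 8)       /* TLB-style vpage+ppage pairs: 8 MB   */

/* With the flat layout, a lookup is a single offset calculation:
   the virtual page number is the array subscript. */
uint32_t pte_lookup(const uint32_t *table, uint32_t vaddr) {
    return table[vaddr >> 12];
}
```

No loop, no comparison: the page number itself is the index, which is why the miss handler in the interrupt state stays fast.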

 

Two concepts need to be clarified before moving on to Part II:

1. A bank is a code block, similar to the page concept described above.

2. Memory reuse across different code: "different code" means code at different virtual addresses (the addresses produced by the linker are all virtual addresses), while "memory" means physical memory. That is, code blocks of a given size, at different virtual addresses, run in the same physical memory region at different times. Each such block of code is a different code bank.
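The bank concept can be sketched as a copy-on-demand loader: several banks, linked to run at the same virtual address, time-share one physical RAM region. This is a minimal illustration with invented names; the arrays stand in for external flash and the shared code RAM.

```c
#include <stdint.h>
#include <string.h>

#define BANK_SIZE 4096                     /* one bank, sized like a page   */

static uint8_t flash_bank_a[BANK_SIZE];    /* stand-ins for external storage */
static uint8_t flash_bank_b[BANK_SIZE];
static uint8_t code_ram[BANK_SIZE];        /* the shared physical region     */
static int     current_bank = -1;

/* Copy a bank into the shared region only if a different one is resident;
   after the copy, code linked for this region can execute from code_ram. */
void bank_switch(int bank) {
    if (bank == current_bank)
        return;
    memcpy(code_ram, bank == 0 ? flash_bank_a : flash_bank_b, BANK_SIZE);
    current_bank = bank;
}
```

Switching back and forth between banks is exactly the time-based reuse of physical memory described above, with the copy replacing what the MMU's page-fault path does in the processor field.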

 

II. Hardware Design of the SOC Memory Management Unit in the Controller Field

Here we consider SOC designs whose CPU has no memory management unit. To reduce cost, if a 16-bit or 24-bit CPU is fast enough for the job, a 32-bit CPU is generally not chosen, unless raw computing performance demands it or the 32-bit CPU license happens to be cheaper (rarely the case). As long as the design achieves effective memory management, that is, time-sharing of physical memory, it can be called successful.

There are two ways to implement code-bank management in the absence of an MMU hardware unit:

1) The tool chain is used to implement the memory time-sharing mechanism.

2) Design a new memory management unit, including its hardware working mechanism, software design, and key mechanisms, drawing on the MMU and on the bank handling implemented by the toolchain.

Because the content of 2) is involved in a patent review, it is temporarily withheld and will be made public later; sorry!

When an SOC integrates a CPU without an MMU, memory management has to be provided by designing a separate memory management module that implements the core MMU function, namely code paging (bank) mapping. The design should be simplified for maximum efficiency, and the code banks must appear directly in the link script. For efficiency, the executable produced by compiling and linking is also reorganized offline into a simpler execution file: unnecessary sections are stripped and the bank code is laid out in logical order, so the operating system can load code more quickly when needed. The operating system's code memory management must, of course, cooperate with the memory management hardware and be able to parse the repackaged executable files. Implementing memory management therefore requires the architect to consider hardware and software together, simplifying the circuit and the design as much as possible while keeping the core functions. The modules involved include: the hardware mechanism design, physical memory allocation, the code bank principle, linker script definition, executable packing, and operating system customization.
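As one way banks can "appear directly in the link script", GNU ld's OVERLAY command places several sections at the same run address with distinct load addresses. The fragment below is a hypothetical sketch; the addresses, section names, and object-file paths are examples only, not taken from any real project.

```
/* Two code banks overlaid at one run address in RAM (0x20004000),
   stored back to back in flash starting at the load address (0x08010000). */
OVERLAY 0x20004000 : AT (0x08010000)
{
  .bank0 { bank0/*.o(.text) }
  .bank1 { bank1/*.o(.text) }
}
```

The linker also emits load-address symbols for each overlaid section, which a small runtime copy routine (like the `bank_switch` idea sketched earlier) can use to pull the requested bank from flash into the shared RAM region before jumping into it.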

