1. Preface
This series of articles on memory management is based mainly on the memory management lectures given by Chen Lijun.
The lectures cover three topics: the hardware foundations of memory management, management of the virtual address space, and management of the physical address space.
This article uses the x86 architecture as an example to introduce the segment mechanism in Linux memory management.
2. Segment mechanism
- A segment is the basic unit of the virtual address space
- The segment mechanism translates an address in the virtual address space into a linear address in the linear address space
- A segment is described by three attributes:
Base address: the starting address of the segment in the linear address space;
Limit: the maximum offset that can be used within the segment in the virtual address space;
Protection attributes: the properties of the segment, for example whether it can be read or written, whether it can be executed as a program, and the privilege level of the segment.
Figure: General format of the segment descriptor
- Segment descriptor table
Figure: Segment descriptor table (segment table)
The segment number identifies a segment of the virtual address space; the base address is the starting address of that segment in the linear address space.
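The relationship between a segment number and its descriptor can be sketched as a simple lookup table. This is a minimal illustration, not the hardware descriptor layout: the class name `SegmentDescriptor`, the field names, and the example base addresses and limits are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentDescriptor:
    """Hypothetical model of the three things that describe a segment."""
    base: int         # starting address in the linear address space
    limit: int        # largest offset that is valid inside the segment
    readable: bool    # protection attributes
    writable: bool
    executable: bool
    dpl: int          # privilege level of the segment (0 = highest)


# Hypothetical segment table: segment number -> descriptor.
# The base and limit values below are made up for illustration.
segment_table = {
    0: SegmentDescriptor(base=0x00000000, limit=0x0FFFF,
                         readable=True, writable=False,
                         executable=True, dpl=0),
    1: SegmentDescriptor(base=0x00100000, limit=0x3FFFF,
                         readable=True, writable=True,
                         executable=False, dpl=3),
}


def segment_base(segment_number: int) -> int:
    """Return the linear base address of the numbered segment."""
    return segment_table[segment_number].base
```

Looking up a segment number yields the base address that is later added to the offset.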
- Other descriptor tables in protected mode
Global Descriptor Table (GDT)
Interrupt Descriptor Table (IDT)
Local Descriptor Table (LDT)
Note: To speed up access to these descriptor tables, Intel provides dedicated registers (GDTR, IDTR, and LDTR) that hold the base address and limit of each table.
- Segment registers in protected mode
The segment register holds an index into the segment descriptor table (the segment number, also called the selector); that is, it selects a segment from the segment descriptor table.
The selector consists of:
Index: 13 bits that specify the segment number;
TI (Table Indicator): 1 bit that indicates whether to use the global descriptor table or the local descriptor table;
RPL (Requested Privilege Level): 2 bits that specify one of 4 privilege levels, used to distinguish kernel mode from user mode.
Figure: Segment registers in protected mode
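The three selector fields can be pulled apart with shifts and masks. The sketch below assumes the standard 16-bit selector layout (RPL in bits 0-1, TI in bit 2, index in bits 3-15); the function name `decode_selector` and the example selector values are chosen only for illustration.

```python
def decode_selector(selector: int) -> tuple:
    """Split a 16-bit segment selector into (index, ti, rpl)."""
    selector &= 0xFFFF            # a selector is 16 bits wide
    index = selector >> 3         # upper 13 bits: segment number
    ti = (selector >> 2) & 0x1    # 0 = global table, 1 = local table
    rpl = selector & 0x3          # lower 2 bits: requested privilege level
    return index, ti, rpl


# Example values picked for illustration only.
kernel_like = decode_selector(0x10)   # index 2, GDT, RPL 0
user_like = decode_selector(0x73)     # index 14, GDT, RPL 3
```

Because the index is 13 bits wide, one descriptor table can hold at most 8192 entries.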
- Privilege levels in protected mode
Protected mode provides four privilege levels, numbered 0 to 3.
Operating systems generally use only two of them: level 0, the highest privilege level, for kernel mode, and level 3, the lowest privilege level, for user mode.
In protected mode, code at a higher privilege level can freely access segments at a lower privilege level, but code at a lower privilege level cannot access segments at a higher privilege level.
- Address Translation and protection
Figure: Address translation
A virtual address in a program can be written as "selector:offset". It is converted to a linear address by the following steps:
(1) A segment selector is loaded into a segment register, and a 32-bit offset is loaded into a register (such as ESI or EDI).
(2) Using the index, TI, and RPL values in the selector, together with the base address and limit of the corresponding descriptor table, the processor performs a series of validity checks (such as the privilege level check and the bounds check). If they pass, the corresponding descriptor is loaded into the segment descriptor cache register.
(3) The 32-bit segment base address in the descriptor is added to the 32-bit effective address held in ESI, EDI, etc., to form a 32-bit linear address.
Note that during this address translation the segment is protected in two ways:
(1) Within a segment, if the offset exceeds the segment limit, the virtual address is invalid and the system raises an exception.
(2) When a segment is accessed, the system checks the protection attributes of the segment to see whether the accessor has the required rights; if not, an exception is raised. For example, an attempt to write to a read-only segment is detected as a violation from the attributes of that segment, and an exception is raised.
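The translation steps and the two protection checks above can be sketched as follows. This is a simplified model under stated assumptions, not the processor's full algorithm: the names `Descriptor`, `ProtectionFault`, and `translate` are hypothetical, the privilege check compares only the selector's RPL with the segment's privilege level, and the tables are plain Python lists.

```python
from dataclasses import dataclass


class ProtectionFault(Exception):
    """Stands in for the exception raised when a check fails."""


@dataclass
class Descriptor:
    base: int        # segment base in the linear address space
    limit: int       # largest valid offset within the segment
    writable: bool   # protection attribute
    dpl: int         # privilege level of the segment


def translate(selector, offset, gdt, ldt, write=False):
    """Convert "selector:offset" to a 32-bit linear address."""
    # Step 2: pick the table and descriptor from the selector fields.
    index, ti, rpl = selector >> 3, (selector >> 2) & 1, selector & 3
    table = ldt if ti else gdt
    if index >= len(table) or table[index] is None:
        raise ProtectionFault("selector does not name a valid descriptor")
    desc = table[index]
    # Simplified privilege check: a numerically larger RPL is less privileged.
    if rpl > desc.dpl:
        raise ProtectionFault("insufficient privilege")
    # Protection 1: the offset must not exceed the segment limit.
    if offset > desc.limit:
        raise ProtectionFault("offset beyond segment limit")
    # Protection 2: the access must match the segment attributes.
    if write and not desc.writable:
        raise ProtectionFault("write to read-only segment")
    # Step 3: base + offset, truncated to 32 bits.
    return (desc.base + offset) & 0xFFFFFFFF


# Entry 0 is left empty; entry 1 is a made-up read-only user segment.
example_gdt = [None, Descriptor(base=0x1000, limit=0xFFF, writable=False, dpl=3)]
linear = translate(0x0B, 0x10, example_gdt, [])   # selector 0x0B: index 1, GDT, RPL 3
```

With this example table, an offset past 0xFFF or any write through selector 0x0B raises `ProtectionFault` instead of returning an address.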
3. How Linux handles segments
Linux treats the entire virtual address space as a single segment with a base address of 0 and a limit of 4 GB.
Since many architectures have no segment mechanism at all, this approach formally uses the segment mechanism while neatly neutralizing it.
- You cannot avoid using the segment mechanism when designing an operating system on IA32
Any address issued on IA32 is a virtual address; that is, every address is given in "selector:offset" form, which is the basic feature of memory access under the segment mechanism.
Therefore, an operating system designed for IA32 cannot avoid the segment mechanism: IA32 specifies that the segment mechanism cannot be disabled, so it is impossible to bypass it and supply an address in the linear address space directly.
A virtual address will eventually be converted to a linear address by the "segment base address + offset" method.
- Why bypass the segment mechanism
Most hardware platforms do not support the segment mechanism and support only the paging mechanism, so to keep Linux portable the segment mechanism needs to be taken out of the picture, leaving only the paging mechanism in use.
- How Linux bypasses the segment mechanism
The Linux designers simply set the base address of every segment to 0 and the segment limit to 4 GB. For any given offset, the translation then becomes "0 + offset = linear address", that is, "offset = linear address".
In addition, because the segment mechanism requires "offset < 4GB", offsets range over 0H~FFFFFFFFH, which is exactly the range of the linear address space. In other words, a virtual address maps directly to the linear address of the same value.
From here on, "virtual address" and "linear address" refer to the same address.
In this way, Linux effectively bypasses the segment mechanism without actually disabling it.
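The identity mapping that results from a base of 0 and a limit of 4 GB can be checked with a few lines. This is a minimal sketch; the constant and function names are hypothetical, and the sample offset is arbitrary.

```python
FLAT_BASE = 0x00000000     # segment base chosen by the flat model
FLAT_LIMIT = 0xFFFFFFFF    # largest offset, i.e. 4 GB - 1


def flat_linear_address(offset: int) -> int:
    """Apply "base + offset" for a segment with base 0 and limit 4 GB."""
    if not 0 <= offset <= FLAT_LIMIT:
        raise ValueError("offset outside the 32-bit range")
    return FLAT_BASE + offset


sample_offset = 0xC0100000   # arbitrary example offset
same_address = flat_linear_address(sample_offset) == sample_offset
```

Every 32-bit offset passes the limit check and comes back unchanged, so the segment step no longer alters or rejects any address.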
- Which segments Linux must create
Because the IA32 segment mechanism also requires separate segments for code and data, Linux must create one segment descriptor for the code segment and one for the data segment, each with a base address of 0 and a segment limit of 4 GB.
In addition, the Linux kernel runs at privilege level 0 and user programs run at privilege level 3. Under the IA32 segment protection mechanism, a program at privilege level 3 cannot access a segment at privilege level 0, so Linux must create separate code and data segments for the kernel and for user programs.
This means that Linux must create 4 segment descriptors: a code segment and a data segment at privilege level 0, and a code segment and a data segment at privilege level 3.
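The four descriptors can be sketched by packing them into the 8-byte IA32 descriptor format. The sketch assumes the standard field layout (limit bits 0-15, base bits 16-39, access byte bits 40-47, limit bits 48-51, flags bits 52-55, base bits 56-63) with 4 KB granularity, so a limit field of 0xFFFFF covers 4 GB. The function names and dictionary keys are hypothetical, and the code does not claim to reproduce the table a particular Linux version builds.

```python
def pack_descriptor(base: int, limit: int, access: int, flags: int) -> int:
    """Pack base, 20-bit limit, access byte, and flags into 64 bits."""
    return ((limit & 0xFFFF)
            | ((base & 0xFFFFFF) << 16)
            | ((access & 0xFF) << 40)
            | (((limit >> 16) & 0xF) << 48)
            | ((flags & 0xF) << 52)
            | (((base >> 24) & 0xFF) << 56))


def unpack_base(desc: int) -> int:
    """Recover the 32-bit base address from a packed descriptor."""
    return ((desc >> 16) & 0xFFFFFF) | (((desc >> 56) & 0xFF) << 24)


def unpack_dpl(desc: int) -> int:
    """Recover the 2-bit privilege level from the access byte."""
    return (desc >> 45) & 0x3


FLAGS_4K_32BIT = 0xC   # granularity = 4 KB pages, 32-bit segment
PAGES_4GB = 0xFFFFF    # 0x100000 pages of 4 KB = 4 GB

# Access byte: present | DPL | code-or-data | type
flat_segments = {
    "kernel_code": pack_descriptor(0, PAGES_4GB, 0x9A, FLAGS_4K_32BIT),
    "kernel_data": pack_descriptor(0, PAGES_4GB, 0x92, FLAGS_4K_32BIT),
    "user_code": pack_descriptor(0, PAGES_4GB, 0xFA, FLAGS_4K_32BIT),
    "user_data": pack_descriptor(0, PAGES_4GB, 0xF2, FLAGS_4K_32BIT),
}
```

The four packed values differ only in the access byte: every segment has base 0 and spans the full 4 GB, while the type and privilege bits separate code from data and kernel from user.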
- Linux protects memory through the paging mechanism
If segments are defined this way, the first protection function of segments described in the previous section is lost, because all of these segments cover exactly the same linear address space (0~4GB) and overlap completely.
One can imagine that, without paging, the linear address space would map directly onto physical memory, so modifying data in any one segment would also modify data in the other segments.
The segment mechanism's way of protecting segments, which divides the linear address space by "base address: limit" so that segments are completely isolated from one another, no longer works at all.
So, does this mean that a user program is free to modify kernel data? Obviously not. On the one hand, user segments and kernel segments have different privilege levels; on the other hand, Linux defines its segments this way precisely to obtain pure paging, and the paging mechanism provides the protection we need.
4. References
[1] Linux memory management lecture ppt-Chen Li
Linux Memory Management 2---segment mechanism