2.1 Memory Address
2.2 Segmentation in Hardware
- Intel processors perform address translation in two modes:
- Real mode: exists mainly to maintain compatibility with earlier processor models and to let the operating system bootstrap
- Protected mode
2.3 Segment Selectors and Segment Registers
- A logical address consists of two parts: a segment identifier and an offset that specifies the relative address within the segment
- The segment identifier is a 16-bit field called the Segment Selector
- The offset is a 32-bit field
- There are 6 segment registers, three of which have specific purposes:
- CS: code segment register, points to the segment containing program instructions
- CS also contains a 2-bit field that indicates the Current Privilege Level (CPL) of the CPU: 0 is the highest privilege level and 3 the lowest
- SS: stack segment register, points to the segment containing the current program stack
- DS: data segment register, points to the segment containing static or global data
- The other three (ES, FS, and GS) are general-purpose segment registers and can point to any data segment
2.4 Segment Descriptors
- Each segment is represented by an 8-byte Segment Descriptor that describes the segment's characteristics
- Segment Descriptors are stored either in the Global Descriptor Table (GDT) or in a Local Descriptor Table (LDT)
- Usually only one GDT is defined, while each process may have its own LDT if it needs to create segments beyond those stored in the GDT
- GDTR: holds the address and size of the GDT in main memory
- LDTR: holds the address and size of the LDT currently in use
- Segment Descriptor fields (Table 2-1, p. 43):

| Field name | Description |
| --- | --- |
| Base | Contains the linear address of the first byte of the segment |
| G | Granularity flag: if 0, the segment size is expressed in bytes; otherwise, in multiples of 4 KB |
| Limit | Holds the offset of the last memory cell in the segment, which determines the segment length |
| S | System flag: if cleared (0), the segment is a system segment; otherwise, it is a normal code or data segment |
| Type | Characterizes the segment type and its access rights |
| DPL | Descriptor Privilege Level: a segment with DPL 0 is accessible only when the CPL is 0, while a segment with DPL 3 is accessible with any CPL value |
| P | Segment-Present flag: 0 if the segment is not currently in main memory. Linux always sets this flag (bit 47) to 1, because it never swaps whole segments out to disk |
| D or B | Called D or B depending on whether the segment contains code or data. It is basically set to 1 if offsets are 32 bits long and cleared to 0 if they are 16 bits long |
| AVL | May be used by the operating system, but is ignored by Linux |

- There are several types of segments and corresponding Segment Descriptors. The types widely used in Linux are:
- Code Segment Descriptor: the descriptor refers to a code segment; it may be placed in either the GDT or an LDT, and its S flag is set to 1 (non-system segment)
- Data Segment Descriptor: the descriptor refers to a data segment; it may be placed in either the GDT or an LDT, and its S flag is set to 1. Stack segments are implemented by means of generic data segments
- Task State Segment Descriptor (TSSD): the descriptor refers to a Task State Segment, that is, a segment used to save the contents of the processor registers; it can appear only in the GDT
- The Type field of a TSSD is 11 or 9, depending on whether the corresponding process is currently executing on a CPU; its S flag is set to 0
- Local Descriptor Table Descriptor (LDTD): the descriptor refers to a segment containing an LDT; it can appear only in the GDT, with the Type field set to 2 and the S flag set to 0
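
To make the field table concrete, here is a minimal sketch that unpacks an 8-byte descriptor into the fields listed above. The notes only give the position of the P flag (bit 47); the other bit positions are taken from the standard 80x86 descriptor layout and should be treated as an assumption relative to these notes. The function names and the sample value `FLAT_CODE` (a flat, 4 GB code segment) are illustrative only.

```python
def decode_descriptor(raw: int) -> dict:
    """Split a 64-bit Segment Descriptor into its fields.

    Bit positions follow the usual 80x86 layout (assumed, not from the notes).
    """
    # Limit is split in two pieces: bits 0-15 and bits 48-51.
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    # Base is split in two pieces: bits 16-39 and bits 56-63.
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    return {
        "base": base,
        "limit": limit,
        "type": (raw >> 40) & 0xF,
        "s": (raw >> 44) & 1,
        "dpl": (raw >> 45) & 0x3,
        "p": (raw >> 47) & 1,
        "avl": (raw >> 52) & 1,
        "d_b": (raw >> 54) & 1,
        "g": (raw >> 55) & 1,
    }


def segment_size(fields: dict) -> int:
    """Segment length in bytes: Limit counts bytes (G=0) or 4 KB units (G=1)."""
    unit = 4096 if fields["g"] else 1
    return (fields["limit"] + 1) * unit


# Illustrative descriptor: base 0, limit 0xfffff, G=1, S=1, P=1, DPL=0.
FLAT_CODE = 0x00CF9A000000FFFF
fields = decode_descriptor(FLAT_CODE)
print(fields)
print(hex(segment_size(fields)))
```

With G set, the 20-bit Limit is counted in 4 KB units, which is how a descriptor can cover the whole 32-bit address space.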
2.5 Fast Access to Segment Descriptors

| Field name | Description |
| --- | --- |
| Index | Identifies the Segment Descriptor entry in the GDT or in the LDT |
| TI | Table Indicator: 0 if the Segment Descriptor is in the GDT; 1 if it is in the LDT |
| RPL | Requestor Privilege Level: specifies the Current Privilege Level of the CPU when the Segment Selector is loaded into the CS register; it can also be used to selectively weaken the processor's privilege level when accessing data segments |

- Because a Segment Descriptor is 8 bytes long, its relative address inside the GDT or LDT is obtained by multiplying the 13-bit Index field (the most significant 13 bits of the Segment Selector) by 8.
For example, if the GDT is at 0x00020000 and the index specified by the Segment Selector is 2, the address of the corresponding Segment Descriptor is 0x00020000 + (2 x 8), or 0x00020010
(The base address of the GDT is stored in the GDTR register)
- About the GDT:
- The first entry is always set to 0, which ensures that logical addresses with a null Segment Selector are considered invalid and therefore cause a processor exception
- The maximum number of Segment Descriptors that can be stored in the GDT is 8191, that is, 2^13 - 1
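
The arithmetic above can be checked with a short sketch. It assumes the usual selector layout (Index in the top 13 bits, TI in bit 2, RPL in bits 0-1); the function names are hypothetical and chosen only for this illustration.

```python
DESCRIPTOR_SIZE = 8  # each Segment Descriptor is 8 bytes long


def split_selector(selector: int) -> tuple:
    """Return (index, ti, rpl) for a 16-bit Segment Selector."""
    index = selector >> 3        # most significant 13 bits
    ti = (selector >> 2) & 1     # 0 = GDT, 1 = LDT
    rpl = selector & 0x3         # Requestor Privilege Level
    return index, ti, rpl


def descriptor_address(table_base: int, selector: int) -> int:
    """Address of the descriptor: table base + index * 8."""
    index, _ti, _rpl = split_selector(selector)
    return table_base + index * DESCRIPTOR_SIZE


selector = 2 << 3  # index 2, TI = 0 (GDT), RPL = 0
print(hex(descriptor_address(0x00020000, selector)))  # 0x20010

# A 13-bit index gives 2**13 slots; entry 0 is reserved, leaving 8191.
print(2 ** 13 - 1)  # 8191
```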
2.6 Segmentation Unit
- The segmentation unit performs the following operations:
- Examines the TI field of the Segment Selector to determine which descriptor table holds the Segment Descriptor
- Computes the address of the Segment Descriptor from the Index field of the Segment Selector: the index is multiplied by 8 and the result is added to the contents of the GDTR or LDTR register
- Adds the offset of the logical address to the Base field of the Segment Descriptor, which yields the linear address
This is the process of translating a logical address into a linear address.
Thanks to the nonprogrammable registers associated with the segment registers, the first two operations need to be performed only when the contents of a segment register change.
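
The three steps, and the effect of the nonprogrammable registers, can be sketched as a toy model. Everything here is hypothetical: the descriptor tables are reduced to plain lists of base addresses, limit and privilege checks are omitted, and `SegmentRegister` is not a real kernel or hardware interface.

```python
class SegmentRegister:
    """Toy segment register with a cached (nonprogrammable) base."""

    def __init__(self, gdt_bases, ldt_bases):
        self.gdt_bases = gdt_bases
        self.ldt_bases = ldt_bases
        self.cached_base = None
        self.table_lookups = 0  # how often steps 1 and 2 were performed

    def load(self, selector: int) -> None:
        """Steps 1 and 2: pick the table via TI, then fetch the descriptor."""
        ti = (selector >> 2) & 1
        index = selector >> 3
        if ti == 0 and index == 0:
            raise ValueError("null Segment Selector")
        table = self.ldt_bases if ti else self.gdt_bases
        self.cached_base = table[index]
        self.table_lookups += 1

    def to_linear(self, offset: int) -> int:
        """Step 3: linear address = cached Base + offset."""
        return (self.cached_base + offset) & 0xFFFFFFFF


# Hypothetical tables: entry 0 of the GDT is the unused null entry.
gdt = [None, 0x00000000, 0x00100000]
ldt = [0x00400000]

reg = SegmentRegister(gdt, ldt)
reg.load(2 << 3)                      # index 2 in the GDT
print(hex(reg.to_linear(0x1234)))     # 0x101234
print(hex(reg.to_linear(0x2000)))     # 0x102000
print(reg.table_lookups)              # 1
```

The two translations share a single table lookup, which mirrors the point that the descriptor is fetched only when the segment register is reloaded.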
2.7 Segmentation in Linux
- Segmentation on 80x86 microprocessors encourages programmers to split their applications into logically related entities
- Examples: subroutines, or global and local data areas
- In fact, segmentation and paging are somewhat redundant, because both can be used to separate the physical address spaces of processes
- Segmentation can assign a different linear address space to each process
- Paging can map the same linear address space onto different physical address spaces
Linux prefers paging to segmentation because:
- Memory management is simpler when all processes use the same segment register values, that is, when they share the same set of linear addresses
- One of the design goals of Linux is portability to most popular processor architectures, and RISC architectures in particular have limited support for segmentation
- Linux 2.6 uses segmentation only when the 80x86 architecture requires it
- All Linux processes running in User Mode use the same pair of segments to address instructions and data: the user code segment and the user data segment
- All Linux processes running in Kernel Mode likewise use the same pair of segments to address instructions and data: the kernel code segment and the kernel data segment
- Values of the Segment Descriptor fields for the four main Linux segments
The corresponding Segment Selectors are defined by the macros __USER_CS, __USER_DS, __KERNEL_CS, and __KERNEL_DS, respectively.
To address the kernel code segment, for instance, the kernel just loads the value yielded by the __KERNEL_CS macro into the CS segment register.
All processes, whether in User Mode or in Kernel Mode, can use the same logical addresses,
because the linear addresses associated with these segments all start at 0 and reach the addressing limit of 2^32 - 1.
In Linux, logical addresses coincide with linear addresses: the value of the offset field of a logical address is always equal to the value of the corresponding linear address,
because all segments start at 0x00000000.
Whenever the CPL changes, some segment registers must be updated accordingly, for example the DS register (and likewise the SS register):
- When CPL=3 (User Mode), the DS register must contain the Segment Selector of the user data segment
- When CPL=0 (Kernel Mode), the DS register must contain the Segment Selector of the kernel data segment
- When saving a pointer to an instruction or to a data structure, the kernel does not need to store the Segment Selector component of the logical address, because the CS register already contains the current Segment Selector
- For example:
- When the kernel invokes a function, it executes a call assembly language instruction that specifies only the offset component of the logical address; the Segment Selector is implicitly the one held in the CS register.
- Because there is only one segment of type "executable in Kernel Mode", namely the code segment identified by __KERNEL_CS, it is sufficient to load __KERNEL_CS into CS whenever the CPU switches to Kernel Mode.
- The same reasoning applies to pointers to kernel data structures and to pointers to user data structures
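
As an illustration of the rule that DS must follow the CPL, the sketch below picks the data-segment selector for a given privilege level and checks that its RPL matches. The numeric selector values are an assumption: they are the values commonly quoted for Linux 2.6 on 32-bit 80x86 and do not appear in these notes, so verify them against the kernel headers of the version you are studying. The helper names are hypothetical.

```python
# Assumed selector values for Linux 2.6 on i386 (not taken from the notes).
SELECTORS = {
    "__KERNEL_CS": 0x60,
    "__KERNEL_DS": 0x68,
    "__USER_CS": 0x73,
    "__USER_DS": 0x7B,
}


def rpl(selector: int) -> int:
    """Requestor Privilege Level: the two least significant bits."""
    return selector & 0x3


def table_indicator(selector: int) -> int:
    """TI bit: 0 means the descriptor lives in the GDT."""
    return (selector >> 2) & 1


def data_selector_for(cpl: int) -> int:
    """Selector that DS must hold for the given Current Privilege Level."""
    if cpl == 3:
        return SELECTORS["__USER_DS"]
    if cpl == 0:
        return SELECTORS["__KERNEL_DS"]
    raise ValueError("only privilege levels 0 and 3 are handled here")


for cpl in (0, 3):
    sel = data_selector_for(cpl)
    print(cpl, hex(sel), sel >> 3, table_indicator(sel), rpl(sel))
```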
2.8 Linux GDT
Linux LDT
- Most Linux User Mode applications do not use a Local Descriptor Table, so the kernel defines a default LDT to be shared by most processes
- The default LDT is stored in the default_ldt array, which contains five entries, only two of which are effectively used by the kernel:
- A call gate for iBCS executables
- A call gate for Solaris/x86 executables
- Call gates are a mechanism provided by 80x86 microprocessors to change the privilege level of the CPU while invoking a predefined function; see the Intel documentation for details
- In some cases, a process still needs to set up its own LDT
- This is useful for applications, such as Wine, that execute segment-oriented Microsoft Windows applications
- The modify_ldt() system call allows a process to set up its own LDT
- Any custom LDT created by modify_ldt() also requires its own segment
- When a processor starts executing a process that has a custom LDT, the LDT entry in the CPU-specific copy of the GDT is changed accordingly
- User Mode applications can also allocate new segments by means of modify_ldt(), but the kernel never uses these segments and does not need to keep track of their Segment Descriptors, because they are included in the custom LDT of the process
Paging in Hardware
- The paging unit translates linear addresses into physical addresses
- One of its key tasks is to check the requested access type against the access rights of the linear address; if the memory access is not valid, it generates a Page Fault exception
- A few important concepts:
- Page: linear addresses are grouped into fixed-length intervals called pages. The term "page" usually refers both to a set of linear addresses and to the data contained in that set of addresses
- Contiguous linear addresses within a page are mapped to contiguous physical addresses
- The kernel can therefore specify the physical address and access rights of a page instead of those of every linear address it contains
- Page frame: the paging unit divides all RAM into fixed-length page frames (physical pages); a page frame has the same length as a page
- A page frame is a constituent of main memory and hence a storage area, whereas a page is just a block of data that can be stored in any page frame or on disk
- Page table: the data structures that map linear addresses to physical addresses are called page tables; they are stored in main memory and must be properly initialized by the kernel before the paging unit is enabled
- Starting with the 80386, all 80x86 processors support paging, which is enabled by setting the PG flag of the CR0 register
- When PG is 0, linear addresses are interpreted as physical addresses
- When PG is 1, paging is enabled
Regular Paging
Extended Paging
Hardware Protection Scheme
- Only two privilege levels are associated with pages and page tables, because privileges are controlled by the User/Supervisor flag mentioned in the "Regular Paging" section
- If the flag is 0, the page can be addressed only when the CPL is less than 3 (for Linux, this means the processor is in Kernel Mode)
- If the flag is 1, the page can always be addressed
- Only two access rights are associated with pages: Read and Write
- If the Read/Write flag of a page directory or page table entry is 0, the corresponding page table or page is read-only
- Otherwise, it can be both read and written
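
The two flags can be combined into a single check, sketched below. This follows the simplified two-flag model in these notes and looks at one entry only; real 80x86 hardware adds refinements (for example, whether Kernel Mode writes to read-only pages fault depends on the WP flag of CR0), which are deliberately left out. The function name and parameters are hypothetical.

```python
def access_allowed(cpl: int, write: bool,
                   user_supervisor: int, read_write: int) -> bool:
    """Simplified page protection check based on the two flags.

    user_supervisor: 0 = addressable only when CPL < 3, 1 = always.
    read_write:      0 = read-only, 1 = readable and writable.
    """
    if user_supervisor == 0 and cpl >= 3:
        return False  # User Mode may not touch a supervisor page
    if write and read_write == 0:
        return False  # writes to a read-only page are rejected
    return True


# User Mode read of a user, read-only page: allowed.
print(access_allowed(cpl=3, write=False, user_supervisor=1, read_write=0))
# User Mode write to the same page: rejected.
print(access_allowed(cpl=3, write=True, user_supervisor=1, read_write=0))
# User Mode access to a supervisor page: rejected.
print(access_allowed(cpl=3, write=False, user_supervisor=0, read_write=1))
```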
An Example of Regular Paging
- Assume that the kernel has assigned the linear address space from 0x20000000 to 0x2003ffff to a running process; this space consists of exactly 64 pages
- Analysis of the linear address space:
- The 10 most significant bits (the Directory field) of every address in the range start with a 2 followed by zeros, so they all have the same value, 0x80 (128 decimal), which selects the 129th entry of the page directory
- That entry must contain the physical address of the page table assigned to the process
- If no other linear addresses are assigned to the process, the remaining 1023 entries of the page directory are filled with zeros
- The middle 10 bits (the Table field) range from 0 to 0x03f, so only the first 64 entries of the page table are meaningful; the remaining 960 entries are filled with zeros
- Suppose a thread needs to read the byte at linear address 0x20021406
- The Directory field, 0x80, selects entry 0x80 of the page directory, which points to the page table of the process
- The Table field, 0x21, selects entry 0x21 of the page table, which points to the page frame containing the desired page
- The Offset field, 0x406, selects the byte at offset 0x406 in that page frame
- If the Present flag of entry 0x21 of the page table is 0, the page is not in main memory, and a Page Fault exception is generated
- A Page Fault exception is also generated whenever the process accesses a linear address outside the interval 0x20000000 to 0x2003ffff
- This happens because the page table entries not assigned to the process are filled with zeros; in particular, their Present flags are cleared
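
The worked example can be reproduced with a small sketch that splits a 32-bit linear address into its Directory, Table, and Offset fields and walks a toy two-level table. The page frame addresses (starting at 0x00300000) are made up for illustration, a `None` entry stands in for a cleared Present flag, and the names `split_linear`, `translate`, and `PageFault` are hypothetical.

```python
class PageFault(Exception):
    """Raised when an entry with a cleared Present flag is reached."""


def split_linear(addr: int) -> tuple:
    """Return (directory, table, offset): 10 + 10 + 12 bits."""
    return addr >> 22, (addr >> 12) & 0x3FF, addr & 0xFFF


def translate(addr: int, page_directory: list) -> int:
    """Walk the two-level table; None models Present = 0."""
    directory, table, offset = split_linear(addr)
    page_table = page_directory[directory]
    if page_table is None:
        raise PageFault(hex(addr))
    frame = page_table[table]
    if frame is None:
        raise PageFault(hex(addr))
    return frame + offset


# One page table with 64 meaningful entries out of 1024.
page_table = [None] * 1024
for i in range(64):
    page_table[i] = 0x00300000 + i * 0x1000  # made-up page frame addresses

# One page directory with a single meaningful entry, number 0x80.
page_directory = [None] * 1024
page_directory[0x80] = page_table

print([hex(f) for f in split_linear(0x20021406)])  # ['0x80', '0x21', '0x406']
print(hex(translate(0x20021406, page_directory)))  # 0x321406
```

Any address outside 0x20000000 to 0x2003ffff reaches a `None` entry and raises `PageFault`, matching the last two bullets above.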
Notes on Understanding the Linux Kernel, Chapter 2: Memory Addressing