Linux Segmentation Mechanism

Last Update:2018-12-03 Source: Internet

Author: User


I. Memory Address Classification

For programmers, a memory address can be understood simply as a way to access the contents of a storage unit. On 80x86 microprocessors, we need to distinguish three kinds of addresses:

(1) Logical Address

This address is used in machine language instructions to specify the address of an operand or of an instruction. This type of address reflects the well-known 80x86 segmented architecture, which forced both DOS and Windows programmers to divide their programs into segments (such as code segments, data segments, and stack segments). Each logical address consists of a segment and an offset (also called a displacement), which is the distance from the start of the segment to the actual address.

(2) Linear Address (also known as virtual address)

A single 32-bit unsigned integer can address up to 4 GB of memory (2^32 bytes), that is, 4,294,967,296 memory units. Linear addresses are usually written in hexadecimal notation, ranging from 0x00000000 to 0xffffffff.
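The arithmetic behind the 4 GB figure can be checked with a few lines of Python. This is only a sketch; the helper name address_space_bytes is made up for illustration.

```python
def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses an unsigned integer of `bits` bits can name."""
    return 2 ** bits


size = address_space_bytes(32)
print(size)                    # 4294967296
print(size // 2 ** 30, "GB")   # 4 GB
print(hex(size - 1))           # 0xffffffff, the highest linear address
```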

(3) Physical Address

This type of address is used to address memory cells in the memory chips. Physical addresses correspond to the electrical signals sent along the address pins of the microprocessor to the memory bus. A physical address is represented as a 32-bit or 36-bit unsigned integer.

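A memory management unit (MMU) converts a logical address to a linear address through a hardware circuit called the segmentation unit, and then converts the linear address to a physical address through a second circuit called the paging unit:

logical address -> segmentation unit -> linear address -> paging unit -> physical address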

In a multiprocessor system, all CPUs share the same memory, so RAM chips may be accessed concurrently by independent CPUs. Because read and write operations on a RAM chip must be performed serially, a hardware circuit called a memory arbiter is inserted between the bus and every RAM chip. It grants access to a CPU if the chip is free (that is, not being accessed by another CPU) and delays the access if the chip is busy servicing a request from another processor.

Even uniprocessor systems use memory arbiters, because they include a specialized processor, the DMA controller, that operates concurrently with the CPU. In multiprocessor systems the arbiter is more complex because it has more input ports. The dual Pentium, for example, maintains a two-port arbiter at the entrance of each chip and requires the two CPUs to exchange synchronization messages before they use the common bus. From the programming point of view, the arbiter is invisible because it is managed entirely by hardware circuits.

II. Conversion from Logical Address to Linear Address

1. Segment Selectors and Segmentation Registers

A logical address consists of two parts: a segment identifier and an offset that specifies the relative address within the segment. The segment identifier is a 16-bit field called the segment selector, and the offset is a 32-bit field.

To make segment selectors quick to retrieve, the processor provides six segmentation registers whose only purpose is to hold segment selectors: CS, SS, DS, ES, FS, and GS. Although there are only six of them, a program can reuse the same register for different purposes by saving its contents to memory and restoring them later. Three of these registers have specific purposes:

CS: code segment register, pointing to the segment that contains program instructions. The CS register has another important function: it includes a 2-bit field that specifies the Current Privilege Level (CPL) of the CPU, which is how user mode and kernel mode are told apart. The value 0 denotes the highest privilege level and the value 3 the lowest. Linux uses only levels 0 and 3, for kernel mode and user mode respectively.

SS: stack segment register, pointing to the segment that contains the current program stack.

DS: data segment register, pointing to the segment that contains global and static data.

The other three registers, ES, FS, and GS, are general-purpose segmentation registers that can point to any type of segment.
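As a rough illustration of how the privilege level is read out of CS, the sketch below masks the two least significant bits of a selector value. The constants KERNEL_CS and USER_CS are the values commonly reported for 32-bit Linux 2.6. They do not come from this article and should be treated as assumptions.

```python
# Selector values commonly reported for 32-bit Linux 2.6.
# They are assumptions for illustration, not taken from this article.
KERNEL_CS = 0x60
USER_CS = 0x73


def current_privilege_level(cs: int) -> int:
    """Return the CPL, held in the two least significant bits of CS."""
    return cs & 0b11


def in_kernel_mode(cs: int) -> bool:
    """Linux uses only level 0 (kernel mode) and level 3 (user mode)."""
    return current_privilege_level(cs) == 0


print(current_privilege_level(KERNEL_CS), in_kernel_mode(KERNEL_CS))
print(current_privilege_level(USER_CS), in_kernel_mode(USER_CS))
```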

2. Segment Descriptors

Each segment is represented by an 8-byte segment descriptor, which describes the characteristics of the segment. Segment descriptors are stored either in the Global Descriptor Table (GDT) or in a Local Descriptor Table (LDT). Usually only one GDT is defined, while each process may have its own LDT if it needs to create segments beyond those described in the GDT. The address and size of the GDT in main memory are held in the GDTR control register, and the address and size of the LDT currently in use are held in the LDTR control register.
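One way to picture a table register is as a base address plus a size. The sketch below is a hypothetical model: the class name TableRegister, the base address, and the table size of 32 entries are all chosen for illustration. It also includes the bounds check the processor applies to a descriptor index, which this article does not discuss.

```python
from dataclasses import dataclass

DESCRIPTOR_SIZE = 8  # bytes per segment descriptor


@dataclass
class TableRegister:
    """Hypothetical model of GDTR or LDTR: where the table starts and how big it is."""
    base: int   # linear address of the first descriptor
    limit: int  # offset of the table's last valid byte (size in bytes minus 1)

    def entry_count(self) -> int:
        return (self.limit + 1) // DESCRIPTOR_SIZE

    def contains(self, index: int) -> bool:
        """True if descriptor number `index` lies entirely inside the table."""
        last_byte = index * DESCRIPTOR_SIZE + DESCRIPTOR_SIZE - 1
        return index >= 0 and last_byte <= self.limit


# Illustrative values: a table of 32 descriptors at an arbitrary address.
gdtr = TableRegister(base=0x00020000, limit=32 * DESCRIPTOR_SIZE - 1)
print(gdtr.entry_count(), gdtr.contains(31), gdtr.contains(32))
```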

Base: The linear address of the first byte of the segment.

G: granularity flag. If it is 0, the segment size is expressed in bytes; otherwise, it is expressed in multiples of 4096 bytes.

Limit: holds the offset of the last storage unit in the segment and therefore determines the segment length. If G is 0, the segment size ranges from 1 byte to 1 MB; otherwise, it ranges from 4 KB to 4 GB.

S: system flag. If it is 0, the segment is a system segment, which stores critical data structures such as an LDT. Otherwise, it is an ordinary code or data segment.

Type: describes the segment type and its access rights.

DPL: Descriptor Privilege Level, used to restrict access to the segment.

P: Segment-Present flag, indicating whether the segment is currently present in main memory.
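The fields listed above can be unpacked from a raw 8-byte descriptor. The following sketch assumes the standard 80x86 bit layout, which this article does not spell out. The names SegmentDescriptor and decode_descriptor, and the sample value, are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class SegmentDescriptor:
    base: int    # linear address of the segment's first byte
    limit: int   # 20-bit offset of the last unit in the segment
    g: int       # granularity: 0 = bytes, 1 = 4096-byte units
    s: int       # 0 = system segment, 1 = code or data segment
    type: int    # 4-bit segment type
    dpl: int     # descriptor privilege level, 0..3
    p: int       # 1 = segment present in main memory

    def size_in_bytes(self) -> int:
        unit = 4096 if self.g else 1
        return (self.limit + 1) * unit


def decode_descriptor(raw: int) -> SegmentDescriptor:
    """Unpack a 64-bit descriptor value (assumed standard 80x86 layout)."""
    # Limit: bits 0-15 and 48-51. Base: bits 16-39 and 56-63.
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    return SegmentDescriptor(
        base=base, limit=limit,
        g=(raw >> 55) & 1, s=(raw >> 44) & 1, type=(raw >> 40) & 0xF,
        dpl=(raw >> 45) & 0x3, p=(raw >> 47) & 1,
    )


# Illustrative value: a flat 4 GB code segment at privilege level 0.
d = decode_descriptor(0x00CF9A000000FFFF)
print(hex(d.base), hex(d.limit), d.g, d.dpl, d.size_in_bytes())
```

With G set to 1 and a limit of 0xfffff, size_in_bytes() yields 2^32, which matches the 4 GB upper bound given for the Limit field.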

3. Fast Access to Segment Descriptors

As noted above, a logical address consists of a 16-bit segment selector and a 32-bit offset, and the segmentation registers store only segment selectors. The segment selector introduced in subsection 1 has the following fields:

Index: identifies the entry of the segment descriptor within the GDT or LDT. Because a segment descriptor is 8 bytes long, its address relative to the start of the GDT or LDT is the value of the 13-bit index field (bits 3-15 of the selector) multiplied by 8. For example, if the GDT is located at 0x00020000 (the value stored in the GDTR control register) and the index field of the segment selector is 2, the segment descriptor address is:

0x00020000 + (2 x 8), that is, 0x00020010.

TI: table indicator. TI = 0 means the segment descriptor is in the GDT; TI = 1 means it is in the LDT.

RPL: Requestor Privilege Level. When the selector is loaded into the CS register, this field gives the current privilege level of the CPU.
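The selector fields and the address calculation above can be expressed in a few lines. This is a minimal sketch; the function names are made up for illustration, and the table base 0x00020000 is the example value from the text.

```python
from typing import Tuple

DESCRIPTOR_SIZE = 8  # bytes per segment descriptor


def decode_selector(selector: int) -> Tuple[int, int, int]:
    """Split a 16-bit segment selector into (index, ti, rpl)."""
    index = (selector >> 3) & 0x1FFF  # bits 3-15
    ti = (selector >> 2) & 1          # bit 2: 0 = GDT, 1 = LDT
    rpl = selector & 0b11             # bits 0-1
    return index, ti, rpl


def descriptor_address(table_base: int, selector: int) -> int:
    """Address of the descriptor that `selector` refers to in its table."""
    index, _ti, _rpl = decode_selector(selector)
    return table_base + index * DESCRIPTOR_SIZE


selector = 2 << 3  # index 2, TI = 0, RPL = 0
print(decode_selector(selector))                      # (2, 0, 0)
print(hex(descriptor_address(0x00020000, selector)))  # 0x20010
```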

To speed up the conversion from logical addresses to linear addresses, the 80x86 provides an additional nonprogrammable register (one that the programmer cannot set) for each of the six programmable segmentation registers.

Each time a segment selector is loaded into a segmentation register, the corresponding segment descriptor is also loaded from memory into the matching nonprogrammable CPU register. From then on, translating logical addresses that refer to that segment does not require access to the GDT or LDT in main memory; the processor consults only the nonprogrammable register that holds the segment descriptor. The GDT or LDT must be accessed only when the contents of a segmentation register change.
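The caching behaviour described above can be modelled with a small simulation. Everything here is hypothetical: the class names, the dict-based descriptors, and the read counter exist only to show that the descriptor table is touched once per selector load rather than once per address translation.

```python
class DescriptorTable:
    """Stand-in for a GDT held in main memory."""

    def __init__(self, entries):
        self.entries = entries
        self.reads = 0  # counts simulated main-memory accesses

    def read(self, index):
        self.reads += 1
        return self.entries[index]


class SegmentRegister:
    """A visible selector plus a hidden copy of its descriptor."""

    def __init__(self):
        self.selector = None
        self.cached_descriptor = None  # the nonprogrammable part

    def load(self, selector, table):
        """Loading a selector is the only moment the table is read."""
        self.selector = selector
        self.cached_descriptor = table.read(selector >> 3)

    def base(self):
        """Address translation consults only the cached copy."""
        return self.cached_descriptor["base"]


# Entry 0 is left unused in this model.
gdt = DescriptorTable([None, {"base": 0x0}, {"base": 0x1000}])
ds = SegmentRegister()
ds.load(2 << 3, gdt)
linear = [ds.base() + offset for offset in (0x10, 0x20, 0x30)]
print(gdt.reads)  # 1
```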

4. Segmentation Unit

As described earlier, the MMU converts a logical address to a linear address through a hardware circuit called the segmentation unit; a second circuit, the paging unit, then converts the linear address to a physical address.

How does the segmentation unit carry out this translation?

First, it examines the TI field of the segment selector to determine which descriptor table stores the segment descriptor. If the descriptor is in the GDT, the segmentation unit reads the linear base address of the GDT from the GDTR register; otherwise, it reads the linear base address of the LDT from the LDTR register.

Second, it computes the address of the segment descriptor from the table base address obtained in the previous step and the index field of the segment selector. The calculation is described in the previous subsection.

Through the above two steps, we can locate the segment descriptor we need.

Finally, the offset of the logical address is added to the Base field of the segment descriptor located above to obtain the linear address. This completes the conversion from logical address to linear address.

[Figure: the process of address translation]
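The three steps above can be sketched as a single function. This is a simplified model under stated assumptions: the dict named memory stands in for main memory and maps a descriptor address directly to that descriptor's Base field. All addresses are invented for illustration, and limit and privilege checks are left out.

```python
DESCRIPTOR_SIZE = 8  # bytes per segment descriptor


def translate(selector, offset, gdtr_base, ldtr_base, memory):
    """Turn a logical address (selector, offset) into a linear address.

    `memory` is a hypothetical dict mapping a descriptor's address
    to that descriptor's Base field.
    """
    # Step 1: the TI bit picks the descriptor table.
    ti = (selector >> 2) & 1
    table_base = ldtr_base if ti else gdtr_base
    # Step 2: the index field, times 8, locates the descriptor in that table.
    index = selector >> 3
    descriptor_addr = table_base + index * DESCRIPTOR_SIZE
    segment_base = memory[descriptor_addr]
    # Step 3: add the offset to the Base field (32-bit result).
    return (segment_base + offset) & 0xFFFFFFFF


memory = {
    0x00020010: 0x00400000,  # GDT entry 2
    0x00030008: 0x08000000,  # LDT entry 1
}
print(hex(translate(2 << 3, 0x1234, 0x00020000, 0x00030000, memory)))
print(hex(translate((1 << 3) | 0b100, 0x10, 0x00020000, 0x00030000, memory)))
```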

Note: In this article, linear address and virtual address mean the same thing.

III. Linux Segmentation

Segmentation was built into 80x86 processors to encourage programmers to divide their applications into logically related entities, such as subroutines or global and local data areas. Linux, however, uses segmentation only in a very limited way. Segmentation and paging are partly redundant: segmentation can assign a different linear address space to each process, while paging can map the same linear address space onto different physical address spaces. (The often-quoted limit of roughly 1 GB of RAM that a 32-bit kernel can address directly comes from paging and the layout of the kernel address space, not from segmentation.) In Linux 2.6, all processes running in user mode use the same pair of segments to address instructions and data, called the user code segment and the user data segment. Likewise, all processes running in kernel mode use the same pair, called the kernel code segment and the kernel data segment.

Note: The linear addresses associated with these segments all start at 0 and reach the addressing limit of 2^32 - 1. This means that all processes, whether in user mode or kernel mode, can use the same logical addresses. In Linux, therefore, logical addresses coincide with linear addresses: the offset of a logical address always equals the corresponding linear address.
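A short sketch makes the flat layout concrete. The field values below are the ones commonly documented for 32-bit Linux 2.6. This article does not list them, so they are assumptions for illustration, as are the helper names and the sample offset.

```python
# Field values as commonly documented for 32-bit Linux 2.6.
# They are assumptions for illustration; this article does not list them.
LINUX_SEGMENTS = {
    "user code":   {"base": 0, "limit": 0xFFFFF, "g": 1, "dpl": 3},
    "user data":   {"base": 0, "limit": 0xFFFFF, "g": 1, "dpl": 3},
    "kernel code": {"base": 0, "limit": 0xFFFFF, "g": 1, "dpl": 0},
    "kernel data": {"base": 0, "limit": 0xFFFFF, "g": 1, "dpl": 0},
}


def last_linear_address(seg):
    """Highest linear address reachable through the segment."""
    unit = 4096 if seg["g"] else 1
    return seg["base"] + (seg["limit"] + 1) * unit - 1


def to_linear(segment_name, offset):
    """With a base of 0, the linear address equals the offset."""
    return LINUX_SEGMENTS[segment_name]["base"] + offset


offset = 0x08048000  # hypothetical offset used by some process
for name, seg in LINUX_SEGMENTS.items():
    print(name, hex(to_linear(name, offset)), hex(last_linear_address(seg)))
```

Because every base is 0, the segmentation step leaves the offset unchanged, and the four segments differ only in type and privilege level.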

This article is an English version of an article originally written in Chinese on aliyun.com.
