Assembly tutorial: the segmentation mechanism

Source: Internet
Author: User
This article introduces how segments are defined in protected mode and how a two-dimensional virtual address, consisting of a segment selector and an offset within the segment, is converted into a one-dimensional linear address.

<1> Segment definition and conversion of virtual addresses to linear addresses

Segments are the basis of the mechanism that converts virtual addresses to linear addresses. In protected mode, each segment is defined by three parameters: the segment base address (Base), the segment limit (Limit), and the segment attributes (Attributes).

The segment base address specifies where the segment starts in the linear address space. In 80386 protected mode, the segment base address is 32 bits long. Because the base address is as long as a linear address, a segment can start at any byte in the 32-bit linear address space, unlike in real mode, where a segment must start on a boundary divisible by 16.

The segment limit specifies the size of the segment. In 80386 protected mode, the segment limit is 20 bits long and can be expressed in units of 1 byte or 4K bytes. The unit is selected by one bit of the segment attributes, called the granularity bit and denoted G. G=0 means the limit is in units of bytes, so the 20-bit limit covers a range of 1 byte to 1M bytes in steps of 1 byte. G=1 means the limit is in units of 4K bytes, so the 20-bit limit covers a range of 4K bytes to 4G bytes in steps of 4K bytes. When the limit is in units of 4K bytes, the actual segment limit LIMIT is computed from the 20-bit limit field by the following formula:

LIMIT = limit*4K + 0FFFH = (limit SHL 12) + 0FFFH

So when the granularity bit is 1, the segment limit effectively expands to 32 bits. Thus, in 80386 protected mode, a segment can be much longer than 64K bytes.
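The scaling rule above can be sketched in a few lines of Python. The function name `effective_limit` is chosen for this illustration only and does not come from the article.

```python
def effective_limit(raw_limit: int, g: int) -> int:
    """Return the byte-granular limit for a 20-bit limit field and the G bit."""
    if not 0 <= raw_limit <= 0xFFFFF:
        raise ValueError("the limit field must fit in 20 bits")
    if g == 0:
        # G=0: the field already counts bytes.
        return raw_limit
    # G=1: scale by 4K and fill the low 12 bits with ones.
    return (raw_limit << 12) + 0xFFF


print(hex(effective_limit(0xFFFFF, 0)))  # 0xfffff    (last byte of a 1M segment)
print(hex(effective_limit(0xFFFFF, 1)))  # 0xffffffff (last byte of a 4G segment)
```

With G=1 the largest limit field reaches the top of the 32-bit address space, which is why the limit is said to expand to 32 bits.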

The base address and limit define the range of linear addresses to which the segment maps. The base address Base is the linear address corresponding to the virtual address at offset 0 in the segment, and the virtual address at offset X in the segment corresponds to the linear address Base+X. The virtual addresses at offsets 0 through Limit in the segment correspond to the linear addresses Base through Base+Limit.

The following figure shows how segments map from the virtual address space to the linear address space. In the figure, BaseA and so on denote segment base addresses, and LimitA and so on denote segment limits. In addition, segment C immediately follows segment A, that is, BaseC = BaseA + LimitA.

For example, if the base address of segment A is 00012345H, the segment limit is 5678H, and the limit is in units of bytes (G=0), then segment A corresponds to the region 00012345H to 000179BDH of the linear address space. If the limit is in units of 4K bytes (G=1), segment A corresponds to the region 00012345H to 0568B344H (= 00012345H + 5678000H + 0FFFH) of the linear address space.
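The arithmetic in this example can be checked with a short Python sketch. The helper name `linear_range` is hypothetical, and masking the result to 32 bits is an assumption of the sketch.

```python
def linear_range(base: int, raw_limit: int, g: int) -> tuple[int, int]:
    """Return (first, last) linear address covered by an expand-up segment."""
    # Scale the 20-bit limit field when the granularity bit is set.
    limit = (raw_limit << 12) + 0xFFF if g else raw_limit
    # Linear addresses are 32 bits wide, so keep the result in that range.
    return base, (base + limit) & 0xFFFFFFFF


for g in (0, 1):
    first, last = linear_range(0x00012345, 0x5678, g)
    print(f"G={g}: {first:08X}H to {last:08X}H")
# G=0: 00012345H to 000179BDH
# G=1: 00012345H to 0568B344H
```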

The capacity of a segment can be expanded by increasing its limit. This works for ordinary data segments, which grow toward higher addresses in memory, but not for stack segments: because the bottom of the stack is at the high address, the stack grows toward lower addresses as data is pushed. To accommodate ordinary data segments and stack segments growing in opposite directions, the attributes of a data segment include an expansion direction bit, denoted ED. ED=0 means the segment expands upward (toward higher addresses); ED=1 means it expands downward (toward lower addresses). Typically only stack segments use the expand-down attribute, although a stack segment can also be an expand-up segment. Expand-down segments are intended for the following two situations:

First, the stack segment is defined as a separate segment, that is, DS and SS contain different selectors.

Second, the stack is enlarged by copying it to a larger segment (rather than by adding pages to its existing segment). A designer who does not intend to implement the stack in this way does not need to define expand-down segments.

Note that only data segment attributes include the expansion direction bit ED. That is, only data segments (with stack segments treated as a special kind of data segment) distinguish between upward and downward expansion; all other segments expand upward.

The expansion direction and the limit of a data segment together determine the valid range of offsets within it. In an expand-up segment, offsets from 0 to Limit are valid, and offsets above Limit are invalid. In an expand-down segment the opposite holds: offsets from 0 to Limit are invalid, and offsets from Limit+1 up to the top of the segment are valid, where the top is 64K-1 or 4G-1 depending on the D bit of the descriptor. Note that the offset equal to Limit itself is valid in an expand-up segment but invalid in an expand-down segment. In other words, in an expand-down segment every offset must be greater than the limit, because the limit is the lower bound and the segment grows down from high addresses. Conversely, in an expand-up segment every offset must be less than or equal to the limit, because the limit is the upper bound and the segment starts at its low end. Because linear addresses wrap around, an expand-down segment can be placed at any linear address and given any size.

In the process of converting a virtual address to a linear address, the offset is checked. If the offset is not in a valid range, it causes an exception.
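A minimal sketch of this offset check follows, assuming a byte-granular limit and a single-byte access. The name `offset_is_valid` and the `upper_bound` parameter are hypothetical; `upper_bound` stands for the top of an expand-down segment, which the D bit of the descriptor fixes at 64K-1 or 4G-1.

```python
def offset_is_valid(offset: int, limit: int, expand_down: bool = False,
                    upper_bound: int = 0xFFFFFFFF) -> bool:
    """Check one byte offset against a byte-granular segment limit."""
    if expand_down:
        # Expand-down: the limit is the lower bound and is itself invalid.
        return limit < offset <= upper_bound
    # Expand-up: the limit is the upper bound and is itself valid.
    return 0 <= offset <= limit


print(offset_is_valid(0x0FFF, 0x0FFF))                 # True
print(offset_is_valid(0x1000, 0x0FFF))                 # False
print(offset_is_valid(0x0FFF, 0x0FFF, True, 0xFFFF))   # False
print(offset_is_valid(0x1000, 0x0FFF, True, 0xFFFF))   # True
print(offset_is_valid(0x10000, 0x0FFF, True, 0xFFFF))  # False
```

A real processor also checks that the last byte of a multi-byte access is in range; the sketch leaves that out to keep the comparison itself visible.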

Segment attributes specify the main characteristics of the segment. For example, the granularity bit G mentioned above is one of the segment attributes. Whenever a segment is accessed, the legality of the access is checked, mainly against the segment attributes. For example, an attempt to write to a read-only segment not only fails to write but also raises an exception. The definition and role of each attribute bit are described below.

<2> Storage segment descriptors

The data structure that holds the three parameters defining a segment is called a descriptor. Each descriptor is 8 bytes long. In protected mode, every segment has a corresponding descriptor. By the kind of object they describe, descriptors fall into three categories: storage segment descriptors, system segment descriptors, and gate descriptors (control descriptors). The storage segment descriptor is described below.

1. Format of the storage segment descriptor

A storage segment is a segment that holds code or data that a program can access directly. A storage segment descriptor describes a storage segment, so it is also called a code and data segment descriptor. The format of the storage segment descriptor is shown below. The first part shows how the 8 bytes of the descriptor are used: the byte at the lowest address (assume the address is m) is at the far right, and the remaining bytes follow to the left, up to the highest byte (address m+7). The second part shows the layout of the attribute field.

Storage segment descriptor

  Byte      m+7             m+6, m+5    m+4, m+3, m+2          m+1, m+0
  Contents  Base (31...24)  Attributes  Segment base (23...0)  Segment limit (15...0)

Storage segment descriptor attributes

  Byte m+6
  Bit    7   6   5   4     3...0
  Field  G   D   0   AVL   Limit (19...16)

  Byte m+5
  Bit    7   6...5   4      3...0
  Field  P   DPL     DT=1   TYPE

From the layout above, the 32-bit segment base address (the starting address of the segment) is split across two fields of the descriptor: bits 0 to 23 occupy bytes 2 to 4 of the descriptor, and bits 24 to 31 occupy byte 7. The 20-bit segment limit is likewise split across two fields: bits 0 to 15 occupy bytes 0 and 1, and bits 16 to 19 occupy the low 4 bits of byte 6.

The reason the base address and limit are each stored in two fields goes back to the 80286. In 80286 protected mode, the segment base address is only 24 bits long and the segment limit only 16 bits long. An 80286 storage segment descriptor is also 8 bytes long, but it uses only the low 6 bytes; the high 2 bytes must be set to 0. The 80386 layout therefore lets 80286 storage segment descriptors remain valid on the 80386.
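The split fields can be illustrated with a small Python sketch that packs and unpacks the base address and limit in the 8-byte layout described above. The function names and the sample attribute values (92H for byte m+5, 4H for the high nibble of byte m+6) are assumptions made for this example.

```python
def pack_descriptor(base: int, limit: int, attr_m5: int, attr_m6_high: int) -> bytes:
    """Build the 8 descriptor bytes, index 0 being the lowest address m."""
    return bytes([
        limit & 0xFF,                                        # m+0: limit 7...0
        (limit >> 8) & 0xFF,                                 # m+1: limit 15...8
        base & 0xFF,                                         # m+2: base 7...0
        (base >> 8) & 0xFF,                                  # m+3: base 15...8
        (base >> 16) & 0xFF,                                 # m+4: base 23...16
        attr_m5 & 0xFF,                                      # m+5: P, DPL, DT, TYPE
        ((attr_m6_high & 0xF) << 4) | ((limit >> 16) & 0xF), # m+6: G, D, 0, AVL, limit 19...16
        (base >> 24) & 0xFF,                                 # m+7: base 31...24
    ])


def unpack_base_limit(desc: bytes) -> tuple[int, int]:
    """Recover (base, 20-bit limit field) from the 8 descriptor bytes."""
    if len(desc) != 8:
        raise ValueError("a descriptor is exactly 8 bytes long")
    limit = desc[0] | (desc[1] << 8) | ((desc[6] & 0x0F) << 16)
    base = desc[2] | (desc[3] << 8) | (desc[4] << 16) | (desc[7] << 24)
    return base, limit


desc = pack_descriptor(0x00012345, 0x5678, 0x92, 0x4)
print(desc.hex())  # 7856452301924000
base, limit = unpack_base_limit(desc)
print(f"{base:08X}H {limit:05X}H")  # 00012345H 05678H
```

Because the sample base fits in 24 bits and the sample limit in 16 bits, the top two bytes carry no base or limit bits, which is the 80286-compatible case described above.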

The segment attributes in the 80386 descriptor are also split across two fields. Their definitions and meanings are as follows.

(1) The P bit is the present (Present) bit. P=1 indicates that the descriptor is valid for address translation, that is, the segment it describes is present in memory. P=0 indicates that the descriptor is not valid for address translation, that is, the segment is not present; using such a descriptor for a memory access causes an exception.

(2) DPL is the descriptor privilege level (Descriptor Privilege Level), 2 bits in total. It specifies the privilege level of the described segment and is used in privilege checks to decide whether the segment may be accessed.

(3) The DT bit indicates the kind of descriptor. DT=1 for a storage segment descriptor, which distinguishes it from system segment descriptors and gate descriptors (DT=0).

(4) TYPE describes the specific properties of the storage segment described by the descriptor.

Bit 0 of TYPE indicates whether the descriptor has been accessed (Accessed) and is denoted A. A=0 indicates that the segment has not been accessed; A=1 indicates that it has. When the selector for the descriptor is loaded into a segment register, the 80386 sets this bit to 1. The operating system can test the bit to determine whether the descriptor has been accessed.

Bit 3 of TYPE indicates whether the described segment is a code segment or a data segment and is denoted E. E=0 indicates a data segment, and the descriptor is a data segment (including stack segment) descriptor. Data segments are not executable but are always readable. E=1 indicates an executable segment, that is, a code segment, and the descriptor is a code segment descriptor. Code segments are never writable; to write to a code segment you must use the alias technique, that is, describe the same memory with a writable data segment descriptor and write through that data segment.

In a data segment descriptor (E=0), bit 1 of TYPE indicates whether the data segment is writable and is denoted W. W=0 indicates that the data segment is not writable; W=1 indicates that it is writable. Note that a data segment is always readable. Bit 2 of TYPE is the ED bit, which indicates the expansion direction of the data segment. ED=0 indicates that the segment expands upward and that offsets within it must be less than or equal to the segment limit. ED=1 indicates that the segment expands downward and that offsets within it must be greater than the segment limit.

In a code segment descriptor (E=1), bit 1 of TYPE indicates whether the code segment is readable and is denoted R. R=0 indicates that the code segment cannot be read and can only be executed. R=1 indicates that the code segment can be both read and executed. Note that a code segment is never writable; to write to it you must use the alias technique. Bit 2 of TYPE indicates whether the code segment is a conforming code segment and is denoted C. C=0 indicates an ordinary (non-conforming) code segment; C=1 indicates a conforming code segment. Conforming code segments are covered in more detail in a later article.

The attributes expressed by the TYPE field of a storage segment descriptor are summarized in the following tables:

Data segment types

  TYPE value  Description
  0           Read-only
  1           Read-only, accessed
  2           Read/write
  3           Read/write, accessed
  4           Read-only, expand-down
  5           Read-only, expand-down, accessed
  6           Read/write, expand-down
  7           Read/write, expand-down, accessed

Code segment types

  TYPE value  Description
  8           Execute-only
  9           Execute-only, accessed
  A           Execute/Read
  B           Execute/Read, accessed
  C           Execute-only, conforming code segment
  D           Execute-only, conforming code segment, accessed
  E           Execute/Read, conforming code segment
  F           Execute/Read, conforming code segment, accessed
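The sixteen TYPE values follow directly from the four bits E, ED or C, W or R, and A, so the summary can be derived rather than memorized. The sketch below is illustrative only; the function name `describe_type` and the exact wording of the returned strings are this example's own.

```python
def describe_type(type_value: int) -> str:
    """Describe a 4-bit storage segment TYPE value from its individual bits."""
    if not 0 <= type_value <= 0xF:
        raise ValueError("TYPE is a 4-bit field")
    if type_value & 0x8:
        # E=1: code segment. Bit 1 is R (readable), bit 2 is C (conforming).
        parts = ["Execute/Read" if type_value & 0x2 else "Execute-only"]
        if type_value & 0x4:
            parts.append("conforming")
    else:
        # E=0: data segment. Bit 1 is W (writable), bit 2 is ED (expand-down).
        parts = ["Read/write" if type_value & 0x2 else "Read-only"]
        if type_value & 0x4:
            parts.append("expand-down")
    if type_value & 0x1:
        # Bit 0 is A (accessed) for both kinds of segment.
        parts.append("accessed")
    return ", ".join(parts)


for value in (0x2, 0x7, 0xA, 0xD):
    print(f"{value:X}: {describe_type(value)}")
# 2: Read/write
# 7: Read/write, expand-down, accessed
# A: Execute/Read
# D: Execute-only, conforming, accessed
```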

(5) G is the segment limit granularity (Granularity) bit. G=0 indicates that the limit is in units of bytes; G=1 indicates that the limit is in units of 4K bytes. Note that the granularity applies only to the segment limit, not to the segment base address, which is always in units of bytes.

(6) The D bit is a special bit whose meaning differs among three kinds of descriptor: one describing an executable segment, one describing an expand-down data segment, and one describing a segment addressed through the SS register (usually the stack segment).

In a descriptor for an executable segment, the D bit determines the default size of the addresses and operands used by instructions. D=1 means that instructions use 32-bit addresses and 32-bit or 8-bit operands by default; such a code segment is called a 32-bit code segment. D=0 means that instructions use 16-bit addresses and 16-bit or 8-bit operands by default; such a code segment is called a 16-bit code segment and is compatible with the 80286. The address-size prefix and the operand-size prefix can be used to override the default address size or operand size, respectively.

In a descriptor for an expand-down data segment, the D bit determines the upper bound of the segment. D=1 means the upper bound is 4G; D=0 means the upper bound is 64K, for compatibility with the 80286.

In a descriptor for a segment addressed through the SS register, the D bit determines which stack pointer register is used by instructions that access the stack implicitly, such as PUSH and POP. D=1 means the 32-bit stack pointer ESP is used; D=0 means the 16-bit stack pointer SP is used, for compatibility with the 80286.
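To tie the attribute bits together, here is a small sketch that splits the two attribute bytes of a descriptor into the fields described in items (1) to (6). The function name `decode_attributes` and the sample bytes 9AH and CFH are assumptions for illustration.

```python
def decode_attributes(byte_m5: int, byte_m6: int) -> dict[str, int]:
    """Split descriptor bytes m+5 and m+6 into their attribute fields."""
    return {
        "P": (byte_m5 >> 7) & 0x1,     # present bit
        "DPL": (byte_m5 >> 5) & 0x3,   # descriptor privilege level
        "DT": (byte_m5 >> 4) & 0x1,    # 1 for a storage segment descriptor
        "TYPE": byte_m5 & 0xF,         # E, ED/C, W/R, A
        "G": (byte_m6 >> 7) & 0x1,     # limit granularity
        "D": (byte_m6 >> 6) & 0x1,     # default size / upper bound / stack pointer
        "AVL": (byte_m6 >> 4) & 0x1,   # bit 4 of byte m+6
        "LIMIT_19_16": byte_m6 & 0xF,  # high 4 bits of the limit
    }


fields = decode_attributes(0x9A, 0xCF)
print(fields["P"], fields["DPL"], fields["DT"], hex(fields["TYPE"]))  # 1 0 1 0xa
print(fields["G"], fields["D"], hex(fields["LIMIT_19_16"]))           # 1 1 0xf
```

The sample bytes decode as a present, privilege level 0, execute/read code segment with 4K granularity and D=1, that is, a 32-bit code segment.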
