Process-memory segment mechanism
Before you start reading, take a look at The Curse of Segments.
1. The x86 hardware segment mechanism
Wiki.osdev-segmentation
Modes of Memory addressing on x86
1.1 Introduction to the segment mechanism
Before the 8086 appeared, address buses were 16 bits wide (64 KB). Segments were first introduced to solve the problem of the address bus being wider than the registers. The 8086 has 16-bit registers but a 20-bit address bus (1 MB). To let programs use the full 1 MB of physical memory without lengthening the registers (more memory capacity at essentially unchanged cost), Intel introduced the segment mechanism in the 8086.
The 8086 provides four segment registers: CS, DS, SS, and ES. As the abbreviations suggest, they hold the code segment (CS), data segment (DS), stack segment (SS), and an extra segment that the programmer can use freely (ES) of the currently running process, so the operating system can track four segments of a process at the same time.
1.2 The segment mechanism in assembly
This section discusses the effect of the segment mechanism on assembly programmers from a purely hardware perspective.
x86 addressing modes: register indirect addressing
The 80x86 CPUs let you access memory indirectly through a register using the register indirect addressing modes. There are four forms of this addressing mode on the 8086, best demonstrated by the following instructions:
    mov al, [bx]    ; copy the byte at DS:BX into AL
    mov al, [bp]    ; copy the byte at SS:BP into AL
    mov al, [si]    ; copy the byte at DS:SI into AL
    mov al, [di]    ; copy the byte at DS:DI into AL
As with the x86 [bx] addressing mode, these four addressing modes reference the byte at the offset found in the bx, bp, si, or di register, respectively. The [bx], [si], and [di] modes use the DS segment by default. The [bp] addressing mode uses the stack segment (SS) by default.
You can use the segment override prefix symbols if you wish to access data in different segments. The following instructions demonstrate the use of these overrides:
    mov al, cs:[bx]
    mov al, ds:[bp]
    mov al, ss:[si]
    mov al, es:[di]
Intel refers to [bx] and [bp] as base addressing modes and bx and bp as base registers (in fact, bp stands for base pointer). Intel refers to the [si] and [di] addressing modes as indexed addressing modes (si stands for source index, di stands for destination index). However, these addressing modes are functionally equivalent. This text will call these forms register indirect modes to be consistent.
For more addressing modes, refer to Art of Assembly: Chapter Four - The 80x86 Addressing Modes.
Clearly, when no operating system manages memory, the hardware segment mechanism serves mainly to extend the addressable memory. No memory access instruction is checked for validity, there is no protection mechanism, and segment sharing cannot be implemented.
1.3 How does the hardware compute the address in real mode?
1.3.1 Assembly Code
When assembly code is assembled, the program is organized into different segments. The addresses of data within each segment are offsets relative to the start of that segment, so they can be relocated together with the segment. The address space of a program is therefore two-dimensional (x, y): one dimension selects the segment (0 to x) and the other gives the offset within the segment (0 to y).
Under the x86 hardware segment mechanism, an address reference that uses a segment register (e.g. mov al, [bx]) can be written as a <segment, offset> pair, where segment names the segment and offset is the offset within it. In real mode, segment is the value of the segment register, so the pair <segment:offset> directly determines the physical address.
The segment is 16 bits and the offset is 16 bits. The address operation segment:offset yields a 20-bit address, computed as follows:
    physical address = segment × 16 + offset
As you can see, the segment base is always a multiple of 16 (the segment value shifted left by 4 bits) and can take 2^16 different values; the offset within a segment can also take 2^16 different values. Segments therefore clearly overlap.
Because of this overlap, the same physical memory address can be produced by many different <segment, offset> pairs. The mapping from <segment, offset> to physical memory address is therefore many-to-one.
1.3.2 C/C++ compilers for DOS in those days
1. Process
Thanks to the advent of the operating system, and especially to the concept of the process and the technologies around it, programmers no longer have to program directly against physical RAM or think about how to lay a program out in physical memory so that it runs correctly.
The importance of the process cannot be overemphasized. If the operating system is an interface between application software and computer hardware, then the process is the virtual view of the physical hardware that the operating system presents to programmers. Programmers no longer need to be overly concerned with the hardware; they think in terms of the process view instead. Each process gives the programmer the view of a single-user, single-task virtual machine in which only one program is running. Programmers working at the process level only need to write a single-user, single-task program, assuming that their program is the only task running, without worrying about being overwritten by other tasks. Memory view of the process:
2. DOS and the Turbo C compiler
DOS is a single-user, single-process PC operating system for x86 computers. It runs in real mode and never enters protected mode.
C compiler based on DOS operating system:
3. far pointers in C: extending the address range a program can access
1. The far pointer type was used to reach memory in the extended area (or other space) in the days when early DOS and OS/2 personal operating systems were popular.
2. In a segmented architecture computer, a far pointer is a pointer which includes a segment selector, making it possible to point to addresses outside of the default segment. -far pointer
3. Our view, as programmers, of a C program's address space (as arranged by a C compiler) looks like this:
Understanding C by Learning Assembly
4. On C compilers targeting the 8086 processor family, far pointers were declared using a non-standard far qualifier.
The size of a far pointer is 4 bytes:
- Upper 16 bits store: segment selector
- Lower 16 bits store: offset address (or effective address)
With the far pointer's help, we can tell the C compiler to use two registers to calculate the address, one for the segment selector (16 bits) and one for the offset address (16 bits) within that segment.
For more information about far pointers, see Load Far Pointer and the references below. Among them is a forum thread on near, far and huge pointers.
Example of Far pointer (targeting the 8086 processor):
    // What are the segment number and the offset address?
    #include <stdio.h>
    int main() {
        int x = 100;
        int far *ptr;
        ptr = &x;
        printf("%Fp", ptr);
        return 0;
    }
Output: 8FD8:FFF4 (assumed)
Here 8FD8 is the segment address and FFF4 is the offset address, both in hexadecimal.
Note: %Fp is used in printf to print the segment and offset address of a far pointer in hexadecimal.
The header file dos.h provides three macros for getting the offset address and segment address from a far pointer and vice versa:
- FP_OFF(): gets the offset address from a far address.
- FP_SEG(): gets the segment address from a far address.
- MK_FP(): makes a far address from a segment and an offset address.
    // What will be the output of the following C program?
    #include <dos.h>
    #include <stdio.h>
    int main() {
        int i = 25;
        int far *ptr = &i;
        unsigned int s, o;
        s = FP_SEG(ptr);
        o = FP_OFF(ptr);
        printf("%Fp", MK_FP(s, o));
        return 0;
    }
Output: 8FD9:FFF4 (assumed)
Note: we cannot guess the offset address, segment address, or far address held in any far pointer. These addresses are decided by the compiler. Also note that the output of printf (the address of i) is an address in the process's view, not the actual physical address in RAM, because we are programming against the view provided by the process rather than against RAM. Where the contents of a process actually reside is decided by the operating system's memory management (different operating systems may have different memory management systems and use different mapping strategies).
-far-pointer-in-c-programming
1.4 The segment mechanism in protected mode
Later x86 CPUs introduced protected mode, which gives the operating system hardware support for implementing segmented and paged virtual memory management. Reference: Protected mode
Programs written in high-level languages such as C and C++ are compiled into assembly language. To understand the system better, we may as well assume that we are assembly programmers and then think about the operating system's segment-based memory management strategy.
1.4.1 Hardware segment mechanism in x86 series CPUs
Looking at the x86 addressing modes, you will find that every addressing mode uses a segment register: either the instruction names a segment register explicitly, or the hardware picks a default segment register based on the form of the instruction. In short, the hardware segment mechanism is woven into the x86 instruction set and cannot be avoided.
When operating in protected mode, some form of segmentation must be used. There is no mode bit to disable segmentation. The use of paging, however, is optional. -64-ia-32-architectures-software-developer-system-programming-manual-chapter-3
1.4.2 Logically disabling the segment mechanism
Anyone who has programmed on (80x86-based) Windows or Linux knows that the virtual address space (process address space) of the programs we write does not use segments at all; equivalently, everything (data, stack, code, ...) lives in one segment. This layout of a program's virtual memory is called the flat memory model. To bypass the x86 hardware segment mechanism, it is enough to set the base of the segments that the relevant segment selectors refer to to 0 and their length to 4 GB.
Memory layout of a process in Linux:
Reference anatomy of a program in Memory
1.4.3 Using the segment mechanism to manage process memory
Although mainstream PC operating systems such as Linux and Windows do not use the segment mechanism, x86 CPUs do provide hardware support for operating systems that implement segmented memory management, provided such an operating system runs in protected mode.
Wiki-x86 Memory Segmentation
2. Operating system segmentation mechanism
Multics-vm-slides
Multics-vm
- The memory management of the Multics operating system uses a segmented paging mechanism.
OS/2 1.x programming
-OS/2 is the only operating system which made full use of segmentation features
Segmentation and the Design of multiprogrammed computer Systems
2.1 The segmentation mechanism
Under segmentation, the process view of a program is logically organized into different segments. Each segment is a separate object, or a separate view, so programmers working with a segment mechanism can divide a programming task into segments and hand different segments to different people to write (perhaps with the help of the compiler and other software tools). A program can also create a new segment at run time to extend its memory. For example, in C system programming on early OS/2, one could write:
    SEL selArray;
    PCH pchArray;
    USHORT i;
    DosAllocSeg(512, &selArray, 0);   // creates a 512-byte segment and stores its segment selector in selArray
    pchArray = MAKEP(selArray, 0);    // returns a far pointer to the start of the selArray segment
    for (i = 0; i < 512; i++)
        pchArray[i] = 0;              // accesses the new segment's memory across segments through the far pointer
Although implementing segmented memory management is the operating system's job (with hardware support), programmers who want its benefits (higher programming productivity, segment sharing, extended memory, ...) must also divide their programming task logically into segments.
Leaving the specific implementation aside, a logical address of a process is in principle a <segment_number, offset> pair. segment_number selects one segment of the process and gives access to all the information about that segment; offset is the offset within the segment. The logical address is converted into a linear address, and if paging is not enabled, the linear address is used as the physical address and sent to the address bus.
After dividing the segment, programmers can program in the segment view of the segment they are responsible for, without thinking too much about the process view, not to mention the RAM physical memory view.
Here's the problem:
Since a logical address of the process is a <segment_number, offset> pair, how is the physical memory address determined when the process accesses memory during execution?
From the assembly programmer's point of view, when an instruction that addresses memory executes, the operating system must be able to use the hardware's features to translate the address in that instruction into an exact linear address and then output it to the address bus. One solution to this problem follows.
2.2 Virtual Address segmentation
Treat the entire process address space as one contiguous address space (the virtual address space), then use the top few bits of a virtual address to divide that space into segments. In other words, a virtual address is split into two parts: the segment number bits and the bits that give the offset within the segment.
After the process view of a program has been divided into segments, the programmer responsible for a segment obtains its segment number, sets that segment number once before programming against the process, and from then on does not have to think about it. For example, one instruction loads the relevant segment selector with the segment number, and every later memory access instruction is automatically offset, with hardware support, from the segment chosen by that selector. Alternatively, setting segment numbers, sharing segments, and so on can be left to the compiler tools, so that programmers only program within their own segment view. How the operating system maps the segment chosen by the segment selector onto physical memory is entirely the operating system's business and does not concern the programmer of the segment.
The following is an example of address translation under a segment mechanism implemented in this straightforward way:
This is the logical address space of a program:
The address space of this process is 16 KB and contains three segments: the code segment, the heap segment, and the stack segment.
This is the layout of the process in physical memory:
Under the segment mechanism, the operating system loads the process into physical memory in a way that is transparent to the user.
This is the segment descriptor table maintained by the operating system:
Each process has a local segment table that is created together with the process and maintained by the operating system. Each table entry is a segment descriptor that records all the information about one segment of the process. When the operating system maps the program above into main memory, it fills in the local descriptor table of the process that runs the program, giving four main pieces of information for each entry: segment base, segment size, growth direction of the segment, and protection information. A register can be used to point to the local segment table of the current process, and its value is updated on every process switch.
The following examples are based on the diagrams above:
- Suppose an instruction references logical address 100 of the process. When the instruction executes, the hardware automatically decodes the logical address 100 (00 0000 0110 0100) into segment_number = 00 = 0 and offset = 0000 0110 0100 = 100. The hardware first compares offset with the segment size to check the bound: 100 < 2K, so the access is in bounds. The resulting physical address is not 100 but base of segment 0 + offset = 32K + 100 = 32868.
- Suppose an instruction references logical address 4200 of the process. The hardware decodes 4200 (01 0000 0110 1000) into segment_number = 01 = 1 and offset = 0000 0110 1000 = 104. It first compares offset with the segment size: 104 < 2K, so the access is in bounds. The resulting physical address is not 4200 but base of segment 1 + offset = 34K + 104 = 34920.
- Suppose an instruction references logical address 7KB of the process (01 1100 0000 0000). The hardware obtains segment_number = 01 = 1 and offset = 1100 0000 0000 = 3KB. It compares offset with the segment size: 3KB > 2KB, so the hardware detects that the access to the heap segment is out of bounds.
- Suppose an instruction references logical address 15KB of the process. The hardware decodes 15KB (11 1100 0000 0000) into segment_number = 11 = 3 and finds that the grows-positive field of this segment is 0, which means the segment grows from high addresses toward low addresses. The offset is therefore 1100 0000 0000 - 4KB = 3KB - 4KB = -1KB. The hardware compares the magnitude of the offset with the segment size: 1K < 2K, so the access is in bounds. The resulting physical address is not 15KB but base of the stack segment + offset = 28K + (-1K) = 27K.
Seen this way, the segment mechanism is very similar to the page mechanism. The biggest differences between them are:
- Both a segment and a page are mapped to a contiguous region of physical memory, and each is the unit in which a process is given contiguous physical memory. A segment is usually much larger than a page, and its size is not fixed: it can be as small as 1 byte or as large as the maximum value the offset can represent.
- The size of a page is determined by the system and is fixed; the size of a segment is determined by the user and is not fixed.
- A segment is a collection of logically related data, for example a code segment, a data segment, or a stack segment. A page is simply whatever data in the process's virtual memory space happens to fall within the same virtual page, so protecting a segment is more meaningful than protecting a page.
- Segments are better suited to sharing.
The main functions of the segmentation mechanism:
- Running a program (perhaps larger than main memory) without regard to the actual size of main memory, as long as memory can hold the largest segment of the process.
- Permission control: permission bits set on each segment control how the program may access it.
- Shared segments.
- The program does not have to be entirely in memory; segments can be swapped in and out.
References:
Wiki-memory segmentation
Segmented Memory Management (operating systems)
Morgan_david/cs40/segmentation
The Curse of segments
Segmentation
How are the segment registers (FS, GS, CS, SS, DS, ES) used in Linux?
Art of Assembly: Chapter Four - The 80x86 Addressing Modes
Virtual Memory Organization
Memory Layout and Access
Removing the Mystery from Segment:offset addressing
What's a far pointer
Far pointer in C with examples
Load far Pointer
Load effective Address
Ssce-intel-memory
Compiler, Assembler, Linker and Loader: A Brief Story