[Linux] Process (ii)--process address space

Source: Internet
Author: User

7. The concept of Linux process namespaces

For more information about namespaces, see section 2.3.2 of Professional Linux Kernel Architecture.

Linux namespaces are a form of operating-system-level virtualization. Namespaces are isolated from one another, so processes in different namespaces can have the same PID; seen from the system as a whole, the same UID or PID may therefore appear more than once. Namespaces can also be nested: a parent namespace can contain child namespaces. A child namespace is unaware of its parent, but the parent namespace knows about its children.


How namespaces are implemented:
To keep task_struct simple, the kernel introduces struct nsproxy to manage, in one place, the namespaces a process belongs to; task_struct itself only holds a pointer to a struct nsproxy.

sched.h

struct task_struct {
    //...
    struct pid_link pids[PIDTYPE_MAX];
    struct nsproxy *nsproxy;    /* Multiple processes can share namespaces */
    //...
};

How new namespaces are created:
A process created with the fork or clone system call shares its parent's namespaces by default. clone accepts flags that request a new namespace instead of sharing the parent's, and each kind of namespace has its own flag:

sched.h

#define CLONE_NEWUTS   0x04000000
#define CLONE_NEWIPC   0x08000000
#define CLONE_NEWUSER  0x10000000
#define CLONE_NEWPID   0x20000000
#define CLONE_NEWNET   0x40000000
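The default sharing behaviour can be observed from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; the helper names ns_identity and child_shares_namespace are made up for this illustration. It reads the namespace identity exposed under /proc/self/ns and checks that a child created by a plain fork() sees the same one.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read the identity of one of the caller's namespaces, for example
 * "uts:[4026531838]", from the symlink /proc/self/ns/<name>.
 * Returns 0 on success, -1 on failure. */
int ns_identity(const char *name, char *buf, size_t len)
{
    char path[64];
    ssize_t n;

    assert(len > 1);
    snprintf(path, sizeof(path), "/proc/self/ns/%s", name);
    n = readlink(path, buf, len - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return 0;
}

/* fork() without any CLONE_NEW* flag and compare the child's namespace
 * with the parent's. Returns 1 if shared, 0 if different, -1 on error. */
int child_shares_namespace(const char *name)
{
    char parent_ns[64], child_ns[64];
    int status;
    pid_t pid;

    if (ns_identity(name, parent_ns, sizeof(parent_ns)) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: exit status 0 means "same namespace as the parent". */
        if (ns_identity(name, child_ns, sizeof(child_ns)) != 0)
            _exit(2);
        _exit(strcmp(parent_ns, child_ns) == 0 ? 0 : 1);
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

Calling child_shares_namespace("uts") should normally return 1. Requesting a new namespace, for example by passing CLONE_NEWUTS to clone(), usually requires extra privileges, which is why the sketch only demonstrates the shared case.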


8. The virtual address space of a process

On a system with a 32-bit CPU, each process can address 4 GB of virtual address space. With the usual 3G/1G split, the highest 1 GB is the kernel's virtual address space, and the lower 3 GB (0-3 GB) is the user space of the process, which is not visible to other processes. Because the MMU translates addresses through each process's own page tables, one process accessing an address in its user space does not conflict with another process accessing the same address.
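The isolation described above can be sketched in user space. In the example below (the function and variable names are illustrative only), parent and child see a global variable at the same virtual address after fork(), yet a write in the child never becomes visible to the parent.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int counter = 1;   /* same virtual address in parent and child */

/* Returns 1 if the child saw `counter` at the same virtual address as the
 * parent while its write stayed invisible to the parent, 0 otherwise,
 * and -1 on error. */
int address_spaces_are_private(void)
{
    uintptr_t child_addr = 0;
    int fds[2], status;
    ssize_t got;
    pid_t pid;

    assert(counter == 1);
    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        uintptr_t addr = (uintptr_t)&counter;
        close(fds[0]);
        counter = 2;   /* changes only the child's copy of the page */
        if (write(fds[1], &addr, sizeof(addr)) != (ssize_t)sizeof(addr))
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    got = read(fds[0], &child_addr, sizeof(child_addr));
    close(fds[0]);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0 || got != (ssize_t)sizeof(child_addr))
        return -1;

    return child_addr == (uintptr_t)&counter && counter == 1;
}
```

The address is sent back through a pipe because ordinary memory writes in the child cannot reach the parent, which is exactly the property being demonstrated.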


The user portion of the virtual address space can be divided into at least three parts: heap space, stack space, and mmap space.
Linux represents a virtual address space with the memory descriptor mm_struct. Each process descriptor points to an mm_struct that describes the virtual address space of that process:

struct task_struct {
    //...
    void *stack;
    struct mm_struct *mm, *active_mm;
    unsigned brk_randomized:1;
    //...
};

struct mm_struct {
    /* Linux manages linear region objects with both a linked list and a
       red-black tree, for reasons of efficiency. In general, the red-black
       tree is used to find the linear region containing a given address,
       and the list is used to scan the whole set of linear regions. */
    struct vm_area_struct *mmap;    /* Head of the list of linear region objects; the list is sorted by ascending linear address */
    struct rb_root mm_rb;           /* Root of the red-black tree of linear region objects */
    //...
    unsigned long mmap_base;        /* Base address of the area available for mappings */
    unsigned long task_size;        /* Size of the virtual address space of the process */
    //...
    pgd_t *pgd;                     /* Page global directory (the page table used by the MMU) */
    atomic_t mm_users;              /* Number of users of the user space described by this mm_struct; > 1 with multiple threads */
    atomic_t mm_count;              /* Number of references to the mm_struct inside the kernel; it is freed when this reaches 0 */
    int map_count;                  /* Number of VMAs, i.e. nodes in the mmap list */
    //...
    unsigned long start_code, end_code, start_data, end_data;   /* Start and end addresses of the code segment; start and end addresses of the initialized data segment */
    unsigned long start_brk, brk, start_stack;                  /* Start address of the heap, current end address of the heap, start address of the stack */
    unsigned long arg_start, arg_end, env_start, env_end;       /* Start and end addresses of the command-line arguments; start and end addresses of the environment variables */
    //...
};

The C library function malloc() ultimately obtains heap memory through the brk() system call. In the Linux implementation, brk() simply changes the brk value (the heap end address) in the mm_struct. When free() is called, the memory is usually not returned to the OS right away and the brk value in the mm_struct is left unchanged, so that the next malloc() does not need to call brk() again. As a result, the virtual address space used by a program generally grows rather than shrinks.
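The effect of moving the heap end address can be sketched with sbrk(), the C library wrapper around brk(). This is only an illustration under the assumption of a glibc-style environment where sbrk() is available; grow_program_break is a made-up helper name, and real allocators may also use mmap() for large requests.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() in glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Move the program break (the heap end address, brk in mm_struct) up by
 * `bytes` and report how far it actually moved.
 * Returns -1 if the break could not be moved. */
long grow_program_break(size_t bytes)
{
    void *before, *after;

    before = sbrk(0);                       /* current heap end address */
    if (before == (void *)-1)
        return -1;
    if (sbrk((intptr_t)bytes) == (void *)-1)
        return -1;                          /* the kernel refused to grow the heap */
    after = sbrk(0);
    assert(after != (void *)-1);
    return (long)((char *)after - (char *)before);
}
```

For example, grow_program_break(4096) is expected to return 4096 when nothing else moves the break in between. No physical memory needs to be touched for this to succeed, because only the boundary of the heap region changes.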

How to view the virtual address space of a process from the command line:

pmap -d <pid>

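Each column of the output has the following meaning:

Address: the start address of the mapping
Kbytes: the size of the virtual segment, in kilobytes
Mode: permissions: r=read, w=write, x=execute, s=shared, p=private (copy on write)
Offset: the offset into the mapped file
Device: the device number (major:minor)
Mapping: the name of the mapped file, for example bash

The inode number of the mapped file (0 means that no inode is associated with the memory region) is not printed by pmap -d, but it appears in /proc/<pid>/maps. The RSS, Anon, and Locked columns belong to the extended listing (pmap -x, as on Solaris): RSS is the part of the segment resident in memory, Anon is anonymous memory, and Locked is memory that is locked and cannot be swapped out.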

Resident: the part of a segment that resides in memory.
Shared: memory allocated to the process that is also shared with other processes in the system.
Private: memory that can only be used by this process. Note that shared space has no private attribute.
The output of prstat -LP has the following meaning:
Size: the address space occupied by the process.
RSS: the amount of memory actually allocated.
The Resident value shown above is not the same as RSS: RSS is the amount of memory the process actually occupies, and it can be at most the Resident value.
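On Linux, the information that pmap prints comes from /proc/<pid>/maps, so the same listing can be read directly. The sketch below assumes /proc is mounted; struct mapping and mapping_of are names invented for this example. It looks up the mapping that contains a given address of the calling process.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

struct mapping {
    unsigned long start, end;  /* [start, end) in the virtual address space */
    char perms[5];             /* for example "rw-p" */
    char name[256];            /* mapped file, "[heap]", "[stack]", or "" */
};

/* Find the mapping of the calling process that contains `addr` by scanning
 * /proc/self/maps. Returns 0 on success, -1 if not found or on error. */
int mapping_of(const void *addr, struct mapping *out)
{
    FILE *fp = fopen("/proc/self/maps", "r");
    char line[512];
    int found = -1;

    assert(out != NULL);
    if (fp == NULL)
        return -1;

    while (fgets(line, sizeof(line), fp) != NULL) {
        struct mapping m = {0};
        /* Line format: start-end perms offset dev inode [pathname] */
        int n = sscanf(line, "%lx-%lx %4s %*s %*s %*s %255[^\n]",
                       &m.start, &m.end, m.perms, m.name);
        if (n < 3)
            continue;   /* not a mapping line we understand */
        if ((uintptr_t)addr >= m.start && (uintptr_t)addr < m.end) {
            *out = m;
            found = 0;
            break;
        }
    }
    fclose(fp);
    return found;
}
```

Passing the address of a local variable typically reports a readable and writable mapping named [stack], while the address of a function usually falls inside the executable's own r-x mapping.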


Supplement: a range of addresses that a process can access contiguously is called a linear region.
Linux implements linear regions (VMAs) as objects of type vm_area_struct. When an executable is mapped into the virtual address space of a process, a set of vm_area_struct structures is created. Each vm_area_struct represents one part of the executable image, such as the executable code, the initialized data, or the uninitialized data. The map_count field of the mm_struct that the process descriptor points to records the number of linear regions the process has.

struct vm_area_struct {
    struct mm_struct *vm_mm;                    /* The memory descriptor (mm_struct instance) this region belongs to */
    unsigned long vm_start, vm_end;             /* Address range: start and end addresses of the region in user space */
    struct vm_area_struct *vm_next, *vm_prev;   /* The list of all vm_area_struct instances of the process is linked through vm_next; the red-black tree is linked through vm_rb */
    pgprot_t vm_page_prot;                      /* Access permissions of the virtual memory region */
    unsigned long vm_flags;                     /* Access rights and other flags */
    struct rb_node vm_rb;                       /* This region's node in the red-black tree */
    //...
    const struct vm_operations_struct *vm_ops;  /* Set of pointers to the standard operations on the VMA */
    unsigned long vm_pgoff;                     /* Offset within the mapped file, in units of PAGE_SIZE */
    struct file *vm_file;                       /* Mapped file; can be NULL */
    void *vm_private_data;                      /* was vm_pte (shared mem) */
};
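Looking up the linear region for an address can be modelled in user space. The sketch below is not kernel code: toy_vma and toy_find_vma are hypothetical names, and a sorted array with binary search stands in for the red-black tree. It returns the first region whose end lies above the address, which either contains the address or is the next region after it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for vm_area_struct: [start, end). */
struct toy_vma {
    unsigned long start;
    unsigned long end;
};

/* Return the first region whose end is above `addr`, or NULL if there is
 * none. `vmas` must be sorted by address and must not overlap. */
const struct toy_vma *toy_find_vma(const struct toy_vma *vmas, size_t n,
                                   unsigned long addr)
{
    size_t lo = 0, hi = n;   /* search window [lo, hi) */

    assert(vmas != NULL || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (vmas[mid].end > addr)
            hi = mid;        /* candidate; an earlier one may still exist */
        else
            lo = mid + 1;    /* region lies entirely below addr */
    }
    return lo < n ? &vmas[lo] : NULL;
}

/* A returned region only contains addr if addr is not below its start. */
int toy_vma_contains(const struct toy_vma *vma, unsigned long addr)
{
    return vma != NULL && vma->start <= addr && addr < vma->end;
}
```

The second check matters because an address that falls into a hole between two regions still yields a result from the lookup, namely the region above the hole.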

The mmap() system call maps all or part of a file on disk into user space, turning the file reads and writes of the process into direct memory accesses. mmap() itself merely stores the file structure in the vm_file member of the vm_area_struct; it does not bring the actual data into memory. When the process first touches the mapped content, a page fault occurs and the data is loaded at that point.
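This lazy behaviour can be observed by counting page faults around the first access to a mapping. The following is a rough sketch rather than a precise measurement: it assumes a writable /tmp, the helper names are invented for the example, and the exact number of faults depends on the kernel (several pages may be mapped by a single fault, and some sandboxes do not report fault counters at all).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <unistd.h>

/* Minor plus major page faults taken by this process so far. */
static long page_faults(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt + ru.ru_majflt;
}

/* Map a freshly written temporary file of `npages` pages and read one byte
 * from every page. Returns the number of page faults taken while reading,
 * or -1 on error. */
long faults_when_reading_mapping(size_t npages)
{
    char path[] = "/tmp/mmap-demo-XXXXXX";   /* scratch file for the demo */
    long page = sysconf(_SC_PAGESIZE);
    long before, after;
    size_t len, i;
    void *map;
    int fd, ok = 1;

    assert(npages > 0);
    if (page <= 0)
        return -1;
    len = npages * (size_t)page;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                            /* removed once unmapped */
    for (i = 0; i < npages; i++)             /* mark the start of each page */
        if (pwrite(fd, "A", 1, (off_t)(i * (size_t)page)) != 1)
            ok = 0;
    if (!ok || ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return -1;
    }

    /* mmap() only sets up the region; no file data is mapped in yet. */
    map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                               /* the mapping keeps the file */
    if (map == MAP_FAILED)
        return -1;

    before = page_faults();
    for (i = 0; i < npages; i++)             /* first touch of each page */
        if (((volatile const char *)map)[i * (size_t)page] != 'A')
            ok = 0;
    after = page_faults();
    munmap(map, len);

    if (!ok || before < 0 || after < 0)
        return -1;
    return after - before;
}
```

On a typical Linux system the returned count is at least 1, since the first read of the mapping has to go through the page fault handler before the data becomes accessible.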

struct vm_area_struct {
    union {
        struct {
            struct list_head list;
            void *parent;    /* aligns with prio_tree_node parent */
            struct vm_area_struct *head;
        } vm_set;
        struct raw_prio_tree_node prio_tree_node;
    } shared;
};

Given an interval of a file, the kernel sometimes needs to know every process that has that interval mapped; such a mapping is called a shared mapping. To provide this information, all vm_area_struct instances are additionally managed in a priority tree, through the shared member shown above.

Addendum: the meaning of heap and stack:
The memory used by a compiled C/C++ program is divided into the following areas:
1. Stack area (stack): allocated and released automatically by the compiler. It holds function parameters, local variables, and so on, and it operates like the stack data structure.
2. Heap area (heap): normally allocated and released by the programmer. If the programmer does not release it, the OS may reclaim it when the program ends. Note that it is not the same thing as the heap data structure; its allocation scheme is closer to a linked list.
3. Global area (static area): also called the static data area, it holds global variables and static variables, which are stored together. Initialized global and static variables are kept in one area, and uninitialized ones in another, adjacent area. It is released by the system when the program ends.
4. Literal constant area: constant strings are placed here. It is released by the system when the program ends.
5. Program code area: holds the binary code of the function bodies.
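The areas listed above can be made concrete by taking one address from each. This is a small sketch with invented names (region_sample, sample_regions); where exactly each address ends up depends on the compiler, the linker, and the platform.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

int initialized_global = 42;   /* global area, initialized part */
int uninitialized_global;      /* global area, uninitialized part (zero-filled) */

struct region_sample {
    uintptr_t constant;  /* address of a string literal (constant area) */
    uintptr_t data;      /* address of initialized_global */
    uintptr_t bss;       /* address of uninitialized_global */
    uintptr_t stack;     /* address of a local variable */
    int *heap;           /* malloc()ed block; the caller must free() it */
};

/* Record one address per memory area. Returns 0 on success, -1 if the
 * heap allocation fails. */
int sample_regions(struct region_sample *out)
{
    static const char *literal = "constant string";
    int local = initialized_global;          /* lives on the stack */

    assert(out != NULL);
    out->heap = malloc(sizeof(*out->heap));  /* lives on the heap */
    if (out->heap == NULL)
        return -1;
    *out->heap = local;

    out->constant = (uintptr_t)literal;
    out->data = (uintptr_t)&initialized_global;
    out->bss = (uintptr_t)&uninitialized_global;
    out->stack = (uintptr_t)&local;          /* recorded as a number only */
    return 0;
}
```

The heap block stays valid after sample_regions returns and has to be released with free(), whereas the local variable is gone by then, which is why only its numeric address is kept.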


Differences between heap and stack
1. How they are requested
  The stack is allocated automatically by the system. The heap must be allocated and released by the programmer: C uses malloc() to allocate and free() to release, and C++ uses the new and delete operators.
2. How the system responds to a request
  Stack: as long as the remaining stack space is larger than the requested space, the system provides the memory; otherwise it reports a stack overflow.
  Heap: the operating system keeps a linked list of free memory blocks. When it receives a request from a program, it walks the list, finds the first heap node whose space is larger than the requested space, removes that node from the free list, and hands its space to the program. On most systems the size of the allocation is recorded at the start of the block, so that delete or free in the code can release the memory correctly. Because the node that was found is not necessarily the same size as the request, the system puts the unused remainder back on the free list.
3. Limits on request size
  The stack grows toward lower addresses; the heap grows toward higher addresses.
4. Allocation efficiency
  Stack: allocated by the system, fast.
  Heap: requested by the programmer, slower.
5. What is stored
  Heap: usually a header at the start of the block records its size; the rest of the content is arranged by the programmer.
  Stack: on a function call, the first thing pushed is the address of the instruction that follows the call in the calling function (for example main), then the arguments of the function (most C compilers push them from right to left), and then the local variables. The exact layout depends on the calling convention. When the call ends, the local variables are popped first, then the arguments, and finally the stack pointer points at the address saved first, which is the next instruction in main, where execution continues.
6. Lifetime
  Stack space: automatic storage. The size of the data is determined at compile time, the compiler arranges for the data to be allocated and released when the function is entered and exited, and the lifetime of the data matches that of the function.
  Heap space: dynamic (manual) storage. The size and initial value of the data are determined at run time, and the lifetime of the data is variable.
The ulimit -s command shows the maximum stack size; on Linux the default is typically 8192 KB (8 MB).
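The same limit can be queried from a program with getrlimit(). A minimal sketch follows; stack_limit_kb is a name chosen for this example, and the value it reports is the soft limit, which is what ulimit -s prints by default.

```c
#include <assert.h>
#include <sys/resource.h>

/* Soft stack size limit in kilobytes, as `ulimit -s` reports it.
 * Returns -1 if the limit is unlimited and -2 on error. */
long stack_limit_kb(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -2;
    if (rl.rlim_cur == RLIM_INFINITY)
        return -1;
    return (long)(rl.rlim_cur / 1024);   /* the limit is stored in bytes */
}
```

On a system with the default settings this is expected to return 8192, although administrators and shells can change the limit.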


ASLR (Address Space Layout Randomization): a security technique against buffer overflow attacks. By randomizing the layout of linear regions such as the heap, the stack, and shared library mappings, it makes it harder for an attacker to predict target addresses and prevents the attacker from directly locating the attack code, which helps to defeat overflow attacks.
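On Linux the current ASLR setting is exposed through /proc/sys/kernel/randomize_va_space. The sketch below simply reads that file; aslr_mode is a hypothetical helper name, and the function returns -1 where the file is unavailable.

```c
#include <assert.h>
#include <stdio.h>

/* Read the system-wide ASLR setting.
 * 0: randomization disabled
 * 1: stack, mmap base, and shared library addresses are randomized
 * 2: as 1, and the heap (brk) start is randomized as well
 * Returns -1 if the setting cannot be read. */
int aslr_mode(void)
{
    FILE *fp = fopen("/proc/sys/kernel/randomize_va_space", "r");
    int mode = -1;

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%d", &mode) != 1)
        mode = -1;
    fclose(fp);
    return mode;
}
```

With a non-zero mode, running the same program twice and printing the address of a local variable will usually show two different values.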
