Linux Container (LXC) container isolation implementation mechanism

Source: Internet
Author: User

One: LXC Overview

LXC (Linux Containers) is an open source project hosted on SourceForge. It provides Linux users with a set of user-space tools for creating and managing containers. An operating system environment can be created and run inside each container, which effectively isolates multiple operating system environments from one another and achieves OS-level virtualization. Early Docker container technology was built on LXC; Docker later dropped LXC in favor of its own core.

Two: Introduction to LXC Commands

The Linux kernel has supported LXC since version 2.6.27, so it is only necessary to install the corresponding user-space tool library, liblxc, on the Linux system. Table 3-1 below lists the common LXC command interface:

[Table 3-1: common LXC commands (two table images, 1.png and 2.png, not reproduced)]

Three: LXC Container Isolation Implementation Mechanism

As the introduction above shows, LXC can create containers that virtualize Linux systems. As a user-level management tool, LXC mainly provides interfaces for managing containers and hides how the containers themselves are implemented. This article analyzes the implementation mechanism of LXC containers.

Internally, LXC relies on two features of the Linux kernel, namespaces and cgroups. The following two subsections cover these mechanisms in turn.

3.1、Namespace Implementation Mechanism

The namespace mechanism provides a lightweight form of virtualization, namely operating-system-level virtualization. It is similar to FreeBSD's jail mechanism and to OpenVZ. Traditionally, every process in a Linux system is identified by a PID, the kernel only needs to manage one PID list, and every user obtains the same system information through the uname system call. Users are managed in the same way through UID numbers, that is, they are identified by one globally unique UID list. Global IDs let the kernel manage the system easily and decide whether to allow or deny particular privileges. For example, the root user with UID 0 is allowed to do anything, while users with other UIDs are restricted: user X cannot kill a process belonging to user Y, but user X can still see user Y. This global visibility is unsuitable for some scenarios, such as services with strict privacy requirements.

To solve problems of this kind, the namespace mechanism provides a solution with minimal resource overhead for isolating process IDs, user IDs, and a series of other resources. Other virtualization approaches typically need to run multiple kernels on one physical machine to isolate these resources, whereas namespaces run a single kernel on one physical machine and abstract the global resources described above. A set of processes can be placed in a container; containers are isolated from each other, and resources can also be shared between containers where desired. From the user's perspective, namespaces partition global resources among containers: a process in a container can see only the members of its own container and cannot see the members of other containers.
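One way to observe this isolation from user space is to look at which namespaces a process belongs to. The following is a minimal sketch, assuming a Linux system with /proc mounted; the /proc/<pid>/ns directory is a kernel interface not mentioned in the original text, and the helper names parse_ns_link and list_namespaces are made up for this illustration.

```python
import os
import re


def parse_ns_link(target):
    """Parse a /proc/<pid>/ns symlink target such as 'pid:[4026531836]'."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", target)
    if match is None:
        raise ValueError(f"unexpected namespace link: {target!r}")
    return match.group(1), int(match.group(2))


def list_namespaces(pid="self"):
    """Return {entry name: namespace inode number}; {} if /proc is unavailable."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        return result  # not Linux, or /proc is not mounted
    for entry in entries:
        try:
            _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        except (OSError, ValueError):
            continue  # skip entries we cannot read or do not understand
        result[entry] = inode
    return result


if __name__ == "__main__":
    for name, inode in list_namespaces().items():
        print(f"{name:20s} {inode}")
```

Two processes that report the same inode number for an entry are in the same namespace of that type; processes in different containers would normally report different numbers.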

[Figure: Namespace hierarchy relationships]

In the figure above there are three namespaces. The parent namespace manages 6 PID values and has two child namespaces. The parent namespace knows that the child namespaces exist, but the two child namespaces are unaware of each other. A change that a process makes to properties inside one child namespace cannot propagate to other namespaces, including the parent namespace, so placing two systems in two child namespaces of one Linux kernel isolates them well.
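The visibility rule of this hierarchy can be illustrated with a toy model. This is only a sketch of the rule described above, not kernel code; the class PidNamespace and its methods are hypothetical names invented for the example. It relies on the fact that a process in a child PID namespace also has a PID in each ancestor namespace.

```python
class PidNamespace:
    """Toy model of a PID namespace: it captures only the visibility rule."""

    def __init__(self, parent=None):
        self.parent = parent
        self.next_pid = 1
        self.pids = {}  # local PID -> process name

    def _allocate(self, name):
        pid = self.next_pid
        self.next_pid += 1
        self.pids[pid] = name
        return pid

    def spawn(self, name):
        """Create a process; it gets a PID here and in every ancestor namespace."""
        ids = []
        ns = self
        while ns is not None:
            ids.append(ns._allocate(name))
            ns = ns.parent
        return ids  # [PID in this namespace, PID in parent, ...]

    def visible(self):
        """Names of the processes that can be seen from inside this namespace."""
        return sorted(self.pids.values())


if __name__ == "__main__":
    parent = PidNamespace()
    child_a = PidNamespace(parent)
    child_b = PidNamespace(parent)
    parent.spawn("init")
    print(child_a.spawn("web"))   # PIDs of "web" in child_a and in parent
    print(child_b.spawn("db"))    # PIDs of "db" in child_b and in parent
    print(parent.visible(), child_a.visible(), child_b.visible())
```

In this model the parent sees all three processes, while each child sees only its own process, and each child numbers its first process as PID 1 independently of the other.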

    • Representation of namespaces in the Linux kernel

Namespaces encapsulate six kinds of properties: UTS, interprocess communication (IPC), the file system view, process PIDs, UIDs, and the network.

The task structure (task_struct) of every process contains an nsproxy structure, so each created process has its corresponding namespaces, and the relationship between a process and its namespaces is established through this structure.

Namespace support has to be enabled in the configuration when the Linux kernel is compiled. If it is not enabled, the system default namespace is used, i.e. the entire kernel is one globally visible namespace.

The implementation of each specific kind of namespace is not expanded on further here.

    • Representation of namespaces in user space

At the user level, the namespace mechanism is used through the clone system call. The biggest difference between the clone system call and the fork system call is that clone inherits the parent process's resources selectively, according to the flags passed in, whereas fork copies the resources of the parent process.

You can create a new namespace by setting the flags parameter, optionally inheriting the resources of the parent process.

CLONE_NEWPID creates a new PID namespace. The first process in it gets PID 1 and plays the role of the Linux init process for that namespace; if this process ends, all processes in the namespace are ended. PID namespaces are hierarchical: processes in a parent namespace can create child namespaces, and child namespaces are visible to the parent namespace.

CLONE_NEWIPC creates a new IPC namespace, which isolates System V IPC objects and POSIX message queues, the interprocess communication mechanisms of a Linux system.

When flags contains CLONE_NEWUTS, a new UTS namespace is created, and its initial structure is initialized from the UTS structure of the calling process. The setdomainname and sethostname functions set the domain name and host name respectively, and the uname function retrieves the UTS information.

When flags contains CLONE_NEWNS, a new mount namespace is created. A mount namespace is the view of the file system structure that is visible to a process, so file systems can be isolated through mount namespaces.

When flags contains CLONE_NEWNET, a new network namespace is created. A network namespace isolates the view of the network stack, including the IPv4 and IPv6 protocol stacks, IP routing tables, firewall rules, and so on. A physical network device can belong to only one network namespace at a time, but a virtual network device can be created to form a tunnel to the actual physical network device, which allows multiple network devices to communicate.

The flags described above can be combined. By calling this interface with the corresponding parameters, LXC can easily create an independent runtime environment and achieve operating-system-level virtualization and isolation.
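Combining flags simply means OR-ing their bit values together. The sketch below shows this with the flag values defined in the kernel header <linux/sched.h>; the helper names combine and decode are hypothetical, and CLONE_NEWUSER is included for the UID namespace even though the text above does not describe it. The code only builds and inspects the bitmask. Actually passing it to clone or unshare is not attempted here, since that normally requires elevated privileges.

```python
import functools
import operator

# Namespace-related clone flags, with values as defined in <linux/sched.h>.
CLONE_FLAGS = {
    "CLONE_NEWNS": 0x00020000,    # mount namespace
    "CLONE_NEWUTS": 0x04000000,   # host name and domain name
    "CLONE_NEWIPC": 0x08000000,   # System V IPC, POSIX message queues
    "CLONE_NEWUSER": 0x10000000,  # user and group IDs
    "CLONE_NEWPID": 0x20000000,   # process IDs
    "CLONE_NEWNET": 0x40000000,   # network stack
}


def combine(*names):
    """OR together the named flags into one value for the flags argument."""
    return functools.reduce(operator.or_, (CLONE_FLAGS[n] for n in names), 0)


def decode(flags):
    """List the namespace flags that are set in a combined value."""
    return [name for name, bit in CLONE_FLAGS.items() if flags & bit]


if __name__ == "__main__":
    flags = combine("CLONE_NEWPID", "CLONE_NEWUTS", "CLONE_NEWNET")
    print(hex(flags), decode(flags))
```

A container-like environment would typically request several of these at once, so that the new process gets its own PID list, host name, mount view, and network stack in a single call.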

3.2、Cgroup Implementation Mechanism

LXC isolates resources through the namespace mechanism, while limits on physical resources are enforced by the cgroup (control group) mechanism.

The Cgroup system defines the following concepts:

1) Control group: a control group is a set of processes and is the basic unit of control in the cgroup system.

2) Subsystem: a subsystem is a resource controller, such as CPU. It is the means by which the processes in a cgroup are controlled, and a subsystem can apply limits to each cgroup. The subsystems currently supported by the cgroup mechanism are shown in the following table:

[Table: subsystems supported by the cgroup mechanism (table image not reproduced)]

3) Hierarchy: a hierarchy arranges control groups in a tree, and a child node inherits the control attributes of its parent node. Every process in a Linux system is in exactly one control group of a given hierarchy. A hierarchy corresponds to one instance of the cgroup virtual file system.

The hierarchical relationships between them are as follows:

[Figure: Cgroup hierarchy relationship]

In a Linux system, the first cgroup created is the root cgroup, which contains all the processes in the system and sits at the first level. The root cgroup is divided into two child cgroups, cgroup0 and cgroup1, which sit at the second level of the cgroup system. A subsystem can be attached at only one level: the cpuset subsystem attached at the first level cannot also be added to cgroup0 and cgroup1 at the second level, because, as stated above, the child nodes of a hierarchy inherit the attributes of their parent node, and attaching the subsystem at two levels would duplicate it. Each level can have multiple subsystems; the second level, for example, has the CPU and Memory subsystems. A single process can be in cgroups at different levels.
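The cgroups a process belongs to can be inspected from user space. The following is a minimal sketch, assuming a Linux system where /proc/<pid>/cgroup is available; that file is not mentioned in the original text, and it lists one line per hierarchy in the form hierarchy-ID:subsystem-list:cgroup-path. The function names are hypothetical.

```python
def parse_cgroup_line(line):
    """Split one /proc/<pid>/cgroup line into its three fields."""
    hierarchy_id, subsystems, path = line.rstrip("\n").split(":", 2)
    return {
        "hierarchy_id": int(hierarchy_id),
        "subsystems": subsystems.split(",") if subsystems else [],
        "path": path,
    }


def read_memberships(pid="self"):
    """Return one parsed entry per hierarchy the process is a member of."""
    try:
        with open(f"/proc/{pid}/cgroup") as handle:
            return [parse_cgroup_line(line) for line in handle if line.strip()]
    except (OSError, ValueError):
        return []  # not Linux, no cgroup support, or an unexpected format


if __name__ == "__main__":
    for entry in read_memberships():
        print(entry["hierarchy_id"], entry["subsystems"], entry["path"])
```

Each entry names the subsystems attached to one hierarchy and the path of the control group that holds the process within that hierarchy.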

    • Representation of cgroups in the Linux kernel

The cgroup mechanism requires the CONFIG_CGROUPS option to be enabled in the kernel configuration. The css_set structure stores the cgroup-related information of a process, and cg_list links together the processes that share the same css_set. A css_set contains an array of cgroup_subsys_state pointers. There is no direct link between a process and a cgroup; the process reaches its cgroup indirectly through cgroup_subsys_state, which points to the cgroup.

Subsystems are managed through the cgroup_subsys data structure. This structure defines a set of function pointers for the operations interface, which is comparable to a base class in C++: the concrete behavior of each interface function has to be defined by each subsystem according to its own characteristics. For example, the information returned through the cgroup_subsys_state interface differs from subsystem to subsystem and needs a separate implementation in each. This design accommodates many types of subsystem well.
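The base-class analogy can be sketched in Python with an abstract class. This is only an illustration of the design pattern, not a rendering of the kernel's cgroup_subsys structure; every class, method, and field name below is hypothetical, and the state values are placeholders.

```python
from abc import ABC, abstractmethod


class Subsystem(ABC):
    """Hypothetical stand-in for the interface role that cgroup_subsys plays."""

    name = "abstract"

    @abstractmethod
    def create_state(self, cgroup_name):
        """Return this subsystem's own per-cgroup state."""


class CpuSubsystem(Subsystem):
    name = "cpu"

    def create_state(self, cgroup_name):
        # Illustrative field only; a real CPU controller tracks far more.
        return {"cgroup": cgroup_name, "shares": 1024}


class MemorySubsystem(Subsystem):
    name = "memory"

    def create_state(self, cgroup_name):
        # None stands for "no limit configured" in this toy model.
        return {"cgroup": cgroup_name, "limit_in_bytes": None}


def attach_all(cgroup_name, subsystems):
    """Ask every subsystem for its state through the common interface."""
    return {s.name: s.create_state(cgroup_name) for s in subsystems}


if __name__ == "__main__":
    print(attach_all("newcgroup", [CpuSubsystem(), MemorySubsystem()]))
```

The caller only knows the shared interface, while each subsystem returns state with a different shape, which is the property the function-pointer table gives the kernel.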

    • Representation of cgroups in Linux user space

At the Linux user level, cgroups are represented through the cgroup file system. For example, the following command can be executed in user space: mount -t cgroup -o cpu,memory cpu_mem /mycgroup/cpu_mem. It creates a hierarchy named cpu_mem with two subsystems attached, cpu and memory, and mounts it at the directory /mycgroup/cpu_mem. Creating a cgroup hierarchy is therefore much like creating a new folder, except that what is created is a special cgroup file system. To create a new cgroup in the cpu_mem hierarchy, enter the cpu_mem directory and execute mkdir newcgroup. This creates a control group called newcgroup, which belongs to the cpu_mem hierarchy. Inside the newcgroup directory there are control files for the attached subsystems, and these files can be read or modified. For example, writing the PID of a process in the current system to the tasks file adds that process to the newly created control group.
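The mkdir and tasks steps can be sketched as ordinary file operations. This is a minimal sketch with hypothetical function names; on a real system hierarchy_root would be the mounted directory /mycgroup/cpu_mem and the calls would need root privileges. The demo function uses a temporary directory as a stand-in, so it only shows the file operations and does not actually control any process.

```python
import os
import tempfile


def create_cgroup(hierarchy_root, name):
    """Create a control group as a subdirectory of a mounted hierarchy."""
    path = os.path.join(hierarchy_root, name)
    os.makedirs(path, exist_ok=True)
    return path


def add_task(cgroup_path, pid):
    """Add a process to the control group by writing its PID to 'tasks'."""
    with open(os.path.join(cgroup_path, "tasks"), "w") as handle:
        handle.write(f"{pid}\n")


def demo(pid):
    """Run both steps against a temporary directory instead of a real mount."""
    fake_root = tempfile.mkdtemp(prefix="cpu_mem_")  # stand-in for the mount point
    group = create_cgroup(fake_root, "newcgroup")
    add_task(group, pid)
    with open(os.path.join(group, "tasks")) as handle:
        return handle.read()


if __name__ == "__main__":
    print(demo(os.getpid()))
```

On a real cgroup file system the kernel creates the tasks file itself when the directory is made, and reading it back lists the PIDs currently in the group.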


This article is from "I take fleeting chaos" blog, please be sure to keep this source http://tasnrh.blog.51cto.com/4141731/1760061
