Understanding Docker (3): Docker uses Linux namespaces to isolate the container's runtime environment


1. Basics: The concept of Linux namespaces

The Linux kernel has supported namespaces since version 2.4.19. The purpose of each namespace is to wrap a particular global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of that global resource. The Linux kernel implements six types of namespace, listed below in the order in which they were introduced:

| Namespace | Kernel version introduced | Global system resource isolated | Isolation effect in the context of a container |
|---|---|---|---|
| Mount namespaces | Linux 2.4.19 | File system mount points | Each container can see a different file system hierarchy. |
| UTS namespaces | Linux 2.6.19 | nodename and domainname | Each container can have its own hostname and domain name. |
| IPC namespaces | Linux 2.6.19 | Specific inter-process communication resources, including System V IPC and POSIX message queues | Each container has its own System V IPC objects and POSIX message queue file system, so only processes in the same IPC namespace can communicate with each other. |
| PID namespaces | Linux 2.6.24 | Process ID number space | Processes in different PID namespaces can have the same PID, so each container can have its own PID 1 root process. Containers can also be migrated between hosts, because the PIDs inside the namespace are independent of those on the host. As a result, each process in a container has two PIDs: one inside the container and one on the host. |
| Network namespaces | Started in Linux 2.6.24, completed in Linux 2.6.29 | Network-related system resources | Each container has its own network devices, IP addresses, IP routing tables, /proc/net directory, port numbers, and so on. The same application can therefore bind to port 80 in several containers on one host. |
| User namespaces | Started in Linux 2.6.23, completed in Linux 3.8 | User and group ID number spaces | The user and group IDs of a process inside a user namespace can differ from those on the host, each container can have its own user and group IDs, and an unprivileged user on the host can be a privileged user inside the user namespace. |

The concept of a Linux namespace is both simple and complex. The simple part: a process inside a namespace sees its own isolated instance of certain system resources. The complex part is how the Linux kernel implements namespaces. Plenty of documentation on that is available online, so it is not repeated here.

2. Docker containers use Linux namespaces to isolate the runtime environment

When Docker creates a container, it creates a new instance of each of the six namespaces above and places all of the container's processes into them, so that the processes in the container can see only the isolated system resources.

2.1 PID Namespace

The same process has different PIDs inside and outside the container:

    • Inside the container, the PID is 1 and the PPID is 0.
    • Outside the container, the PID is 2198 and the PPID is 2179, which is the docker-containerd-shim process.

root@devstack:/home/sammy# ps -ef | grep python
root 2198 2179 0 00:06 ? 00:00:00 python app.py

root@devstack:/home/sammy# ps -ef | grep 2179
root 2179 765 0 00:06 ? 00:00:00 docker-containerd-shim 8b7dd09fbcae00373207f01e2acde45740871c9e3b98286b5458b4ea09f41b3e /var/run/docker/libcontainerd/8b7dd09fbcae00373207f01e2acde45740871c9e3b98286b5458b4ea09f41b3e docker-runc
root 2198 2179 0 00:06 ? 00:00:00 python app.py
root 2249 1692 0 00:06 pts/0 00:00:00 grep --color=auto 2179

root@devstack:/home/sammy# docker exec -it web31 ps -ef
root 1 0 0 16:06 ? 00:00:00 python app.py

The relationship between containerd, containerd-shim, and the container can be summarized as follows:

    • The Docker engine manages images and then hands them over to containerd, which in turn uses runc to run the container.
    • containerd is a simple daemon that uses runc to manage containers and exposes its other functions over gRPC. It manages starting, stopping, pausing, and destroying containers. Because containers run detached from the engine, the engine can be restarted and upgraded without restarting the containers.
    • runc is a lightweight tool that does one thing, running containers, and does it well. It is essentially a small command-line tool that can run containers directly, without going through the Docker engine.

Therefore, on the host, the parent process of the container's main application is containerd-shim, and the application itself is started by runc.

This also shows that the PID namespace maps PIDs on the host to PIDs inside the container, so that the processes inside the container appear to have their own separate PID space.
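The two PIDs of a containerized process can also be read from a single place. On kernels from Linux 4.1 onward, /proc/<pid>/status contains an NSpid line that lists the PID in each nested PID namespace, outermost first. The following is a minimal sketch, not part of the original walkthrough: the function name parse_nspid and the sample text are illustrative assumptions, with the sample shaped after the PIDs 2198 and 1 shown above.

```python
from __future__ import annotations


def parse_nspid(status_text: str) -> list[int]:
    """Return the PIDs on the NSpid line of a /proc/<pid>/status dump.

    The first value is the PID as seen from the PID namespace of the procfs
    instance being read (the host, when read on the host); the last value is
    the PID inside the process's own, innermost PID namespace.
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(field) for field in line.split()[1:]]
    return []  # no NSpid line, e.g. on kernels older than 4.1


# Hypothetical excerpt, shaped like /proc/2198/status read on the host.
sample_status = "Name:\tpython\nPid:\t2198\nPPid:\t2179\nNSpid:\t2198\t1\n"

pids = parse_nspid(sample_status)
print("PID on host:", pids[0], "PID in container:", pids[-1])
```

On a real host, the same function could be fed the contents of /proc/2198/status instead of the sample string.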

2.2 UTS Namespace

Similarly, a container can have its own hostname and domain name:

root@devstack:/home/sammy# hostname
devstack
root@devstack:/home/sammy# docker exec -it web31 hostname
8b7dd09fbcae

2.3 User Namespace

Docker did not support user namespaces before the 1.10 release. That is, by default, the user running the processes in a container is the root user of the host. When a file or directory on the host is mounted into the container as a volume, the processes in the container therefore have almost all of root's permissions to modify that directory on the host, which is a serious security issue.

    • Start a container: docker run -d -v /bin:/host/bin --name web34 training/webapp python app.py
    • At this point, the user running the process is root both inside and outside the container, and from inside the container it can make arbitrary changes to the /bin directory on the host:
root@devstack:/home/sammy# docker exec -ti web34 id
uid=0(root) gid=0(root) groups=0(root)
root@devstack:/home/sammy# id
uid=0(root) gid=0(root) groups=0(root)

The user namespace support introduced in Docker 1.10 gives the container a "fake" root user: it is root inside the container but a non-root user outside it. In other words, the user namespace implements a mapping between host users and container users.

Steps to enable it:

    1. Edit the /etc/default/docker file and add the line DOCKER_OPTS="--userns-remap=default"
    2. Restart the Docker service. The dockerd process now runs as /usr/bin/dockerd --userns-remap=default --raw-logs
    3. Then create a container: docker run -d -v /bin:/host/bin --name web35 training/webapp python app.py
    4. Check the user running the process inside and outside the container:
root@devstack:/home/sammy# ps -ef | grep python
231072    1726  1686  0 01:44 ?        00:00:00 python app.py
root@devstack:/home/sammy# docker exec web35 ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 17:44 ?        00:00:00 python app.py
    • The files /etc/subuid and /etc/subgid show that the subordinate UID and GID ranges of the dockremap user on the host start at 231072:
root@devstack:/home/sammy# cat /etc/subuid
sammy:100000:65536
stack:165536:65536
dockremap:231072:65536

root@devstack:/home/sammy# cat /etc/subgid

    • The file /proc/1726/uid_map describes the user mapping between the inside and the outside of the container. It maps user 231072 on the host to user 0 (that is, root) inside the container:
root@devstack:/home/sammy# cat /proc/1726/uid_map
         0     231072      65536
    • If we now try to modify the host's /bin directory from inside the container, the attempt fails with insufficient permissions:
root@<container-id>:/host/bin# touch test2
touch: cannot touch 'test2': Permission denied

This shows that with a user namespace, the process in the container runs as a non-root user on the host, and we have successfully restricted the permissions of the process in the container.
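The mapping in /proc/<pid>/uid_map can be applied by hand to see which host user a container user corresponds to. Each line has three fields: the first ID inside the namespace, the first ID outside it, and the length of the range. The sketch below is illustrative only; the names IdMapEntry, parse_id_map, and to_host_id are assumptions made for this example, and the sample text is the uid_map line shown above.

```python
from __future__ import annotations

from typing import NamedTuple


class IdMapEntry(NamedTuple):
    inside_start: int   # first ID inside the user namespace
    outside_start: int  # first ID in the parent (host) namespace
    length: int         # number of consecutive IDs covered


def parse_id_map(text: str) -> list[IdMapEntry]:
    """Parse the contents of a uid_map or gid_map file."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            entries.append(IdMapEntry(*(int(f) for f in fields)))
    return entries


def to_host_id(container_id: int, id_map: list[IdMapEntry]) -> int | None:
    """Translate an ID inside the namespace to the ID seen on the host."""
    for entry in id_map:
        offset = container_id - entry.inside_start
        if 0 <= offset < entry.length:
            return entry.outside_start + offset
    return None  # the ID has no mapping in this namespace


# Contents of /proc/1726/uid_map as shown in the walkthrough above.
uid_map = parse_id_map("         0     231072      65536\n")

print(to_host_id(0, uid_map))      # container root -> 231072
print(to_host_id(1000, uid_map))   # container UID 1000 -> 232072
print(to_host_id(70000, uid_map))  # outside the mapped range -> None
```

This is why the container's root shows up as user 231072 in the host's ps output: that host user has no special permissions on /bin.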

The remaining namespaces, such as the network and mount namespaces, are relatively straightforward and need little explanation. In summary, the Docker daemon creates an instance of each of the six namespaces for every container, so that the processes in the container run in an isolated environment:

root@devstack:/proc/1726/ns# ls -l
total 0
lrwxrwxrwx 1 231072 231072 0 Sep 01:45 ipc -> ipc:[4026532210]
lrwxrwxrwx 1 231072 231072 0 Sep 01:45 mnt -> mnt:[4026532208]
lrwxrwxrwx 1 231072 231072 0 Sep 01:44 net -> net:[4026532213]
lrwxrwxrwx 1 231072 231072 0 Sep 01:45 pid -> pid:[4026532211]
lrwxrwxrwx 1 231072 231072 0 Sep 01:45 user -> user:[4026532207]
lrwxrwxrwx 1 231072 231072 0 Sep 01:45 uts -> uts:[4026532209]
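
Each entry under /proc/<pid>/ns is a symbolic link whose target has the form type:[inode]. Two processes are in the same namespace exactly when the inode numbers match. The sketch below shows one way to compare two processes; the function names are assumptions made for this example, and the host-side inode numbers in the demo are hypothetical, while the container-side numbers are taken from the listing above.

```python
from __future__ import annotations

import os
import re

_NS_LINK = re.compile(r"^(\w+):\[(\d+)\]$")


def parse_ns_link(target: str) -> tuple[str, int]:
    """Split a link target such as 'pid:[4026532211]' into (type, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"unexpected namespace link target: {target!r}")
    return match.group(1), int(match.group(2))


def read_namespaces(pid: int | str = "self") -> dict[str, int]:
    """Read /proc/<pid>/ns and return a mapping of entry name to inode."""
    ns_dir = f"/proc/{pid}/ns"
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            _, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, name)))
        except (OSError, ValueError):
            continue  # skip entries we cannot read or parse
        result[name] = inode
    return result


def shared_namespaces(a: dict[str, int], b: dict[str, int]) -> list[str]:
    """Return the namespace types for which both processes share an instance."""
    return sorted(name for name in a if name in b and a[name] == b[name])


# Container values come from the listing above; host values are hypothetical.
container_ns = {"ipc": 4026532210, "mnt": 4026532208, "net": 4026532213,
                "pid": 4026532211, "user": 4026532207, "uts": 4026532209}
host_ns = {"ipc": 4026531839, "mnt": 4026531840, "net": 4026531957,
           "pid": 4026531836, "user": 4026531837, "uts": 4026531838}

print(shared_namespaces(container_ns, host_ns))  # nothing shared
```

On a live system, read_namespaces(1726) and read_namespaces(1) could replace the two hard-coded dictionaries, given sufficient privileges to read both directories.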

2.4 Network namespace

By default, after Docker creates a container, the ip netns command does not show the container's network namespace. This is because ip netns reads its entries from the /var/run/netns directory, and Docker does not create an entry there. The following steps make the namespace visible:

    1. Find the container's main process ID
root@devstack:/home/sammy# docker inspect --format '{{.State.Pid}}' web5
2704
    2. Create the /var/run/netns directory and a symbolic link
root@devstack:/home/sammy# mkdir /var/run/netns
root@devstack:/home/sammy# ln -s /proc/2704/ns/net /var/run/netns/web5
    3. The ip netns command can now be used
root@devstack:/home/sammy# ip netns
web5
root@devstack:/home/sammy# ip netns exec web5 ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
15: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu qdisc noqueue state UP group default
    link/ether 02:42:ac:11:00:03 brd ff:ff:ff:ff:ff:ff
    inet scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:acff:fe11:3/64 scope link
       valid_lft forever preferred_lft forever
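
The mkdir and ln -s steps above can be wrapped in a small helper. This is a sketch under stated assumptions: the function name expose_netns and its netns_dir parameter are hypothetical, and the demo deliberately writes into a temporary directory so that it runs without root. On a real host the default /var/run/netns would be used, and the PID would come from docker inspect.

```python
import os
import tempfile


def expose_netns(pid: int, name: str, netns_dir: str = "/var/run/netns") -> str:
    """Link a process's network namespace into the directory ip netns reads.

    Mirrors 'mkdir /var/run/netns' followed by
    'ln -s /proc/<pid>/ns/net /var/run/netns/<name>'.
    Raises FileExistsError if an entry called <name> already exists.
    """
    os.makedirs(netns_dir, exist_ok=True)
    link_path = os.path.join(netns_dir, name)
    os.symlink(f"/proc/{pid}/ns/net", link_path)
    return link_path


# Demo in a scratch directory, using the PID and name from the steps above.
with tempfile.TemporaryDirectory() as scratch:
    link = expose_netns(2704, "web5", netns_dir=os.path.join(scratch, "netns"))
    print(os.path.basename(link), "->", os.readlink(link))
```

Because the link simply points into /proc, it becomes dangling once the container's main process exits, so it should be removed when the container stops.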

3. Namespace-related parameters of the docker run command

The docker run command has several parameters related to namespaces:

    • --ipc string      IPC namespace to use
    • --pid string      PID namespace to use
    • --userns string   User namespace to use
    • --uts string      UTS namespace to use

--userns: specifies the user namespace used by the container

    • 'host': use the Docker host's user namespace
    • '': use the Docker daemon's user namespace specified by the '--userns-remap' option

Even with user namespaces enabled, you can force a container to run in the host's user namespace:

root@devstack:/proc/2835# docker run -d -v /bin:/host/bin --name web37 --userns host training/webapp python app.py
9c61e9a233abef7badefa364b683123742420c58d7a06520f14b26a547a9476c
root@devstack:/proc/2835# ps -ef | grep python
root      2962  2930  1 02:17 ?        00:00:00 python app.py

Otherwise, by default, the container runs in its own separate user namespace.

Similarly, you can make the container use the Docker host's PID namespace, so that the processes in the container can see all the processes on the host. Note that the user namespace cannot be enabled for the container in this case.

root@devstack:/proc/2962# docker run -d -v /bin:/host/bin --name web38 --pid host --userns host training/webapp python app.py
f40f6702b61e3028a6708cdd7b167474ddf2a98e95b6793a1326811fc4aa161d
root@devstack:/proc/2962#
root@devstack:/proc/2962# docker exec -it web38 bash
root@f40f6702b61e:/opt/webapp# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  33480  2768 ?        Ss   17:40   0:01 /sbin/init
root         2  0.0  0.0      0     0 ?        S    17:40   0:00 [kthreadd]
root         3  0.0  0.0      0     0 ?        S    17:40   0:00 [ksoftirqd/0]
root         5  0.0  0.0      0     0 ?        S<   17:40   0:00 [kworker/0:0H]
root         6  0.0  0.0      0     0 ?        S    17:40   0:00 [kworker/u2:0]
root         7  0.0  0.0      0     0 ?        S    17:40   0:00 [rcu_sched]

Likewise, a container can use the Docker host's UTS namespace. The most obvious effect is that the container's hostname is the same as the Docker host's.

root@devstack:/proc/2962# docker run -d -v /bin:/host/bin --name web39 --uts host training/webapp python app.py
38e8b812e7020106bf8d3952b88085028fc87f4427af0c3b0a29b6a69c979221
root@devstack:/proc/2962# docker exec -it web39 bash
root@devstack:/opt/webapp# hostname
devstack
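
The commands in this section differ only in their namespace flags, which can be captured in a small command builder. The following is a sketch, not an official Docker API: the function docker_run_args and its daemon_userns_remap parameter are assumptions made for this example. The check it performs encodes the note above that a container sharing the host PID namespace cannot have the user namespace enabled.

```python
from __future__ import annotations


def docker_run_args(
    image: str,
    command: list[str],
    name: str | None = None,
    ipc: str | None = None,
    pid: str | None = None,
    userns: str | None = None,
    uts: str | None = None,
    daemon_userns_remap: bool = False,  # hypothetical: is --userns-remap set on the daemon?
) -> list[str]:
    """Build an argument list for 'docker run -d' with namespace options."""
    # Assumption taken from the text: with remapping enabled on the daemon,
    # sharing the host PID namespace also requires '--userns host'.
    if daemon_userns_remap and pid == "host" and userns != "host":
        raise ValueError("--pid host requires --userns host when remapping is enabled")

    args = ["docker", "run", "-d"]
    if name:
        args += ["--name", name]
    for flag, value in (("--ipc", ipc), ("--pid", pid),
                        ("--userns", userns), ("--uts", uts)):
        if value:
            args += [flag, value]
    return args + [image] + list(command)


# Rebuild the web38 command from above (without the -v volume option).
print(" ".join(docker_run_args(
    "training/webapp", ["python", "app.py"],
    name="web38", pid="host", userns="host", daemon_userns_remap=True,
)))
```

The resulting list can be passed to subprocess.run on a machine where Docker is installed.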

Reference links

    • http://lwn.net/Articles/531114/
    • Docker basic technology: Linux Namespace (Part 1)
    • Docker basic technology: Linux Namespace (Part 2)
    • https://github.com/crosbymichael/dockercon-2016/blob/master/Creating%20Containerd.pdf
    • https://events.linuxfoundation.org/sites/events/files/slides/User%20Namespaces%20-%20ContainerCon%202015%20-%2016-9-final_0.pdf
    • https://blog.yadutaf.fr/2016/04/14/docker-for-your-users-introducing-user-namespace/
    • https://success.docker.com/Datacenter/Apply/Introduction_to_User_Namespaces_in_Docker_Engine
