Understanding Docker (3): Docker containers use Linux namespaces for runtime environment isolation

Source: Internet
Author: User



This series of articles covers the following Docker topics:

(1) Docker installation and basic usage

(2) Docker images

(3) Isolation of Docker containers: environment isolation using namespaces

(4) Isolation of Docker containers: resource isolation using cgroups

(5) Networking of Docker containers

(6) Storage of Docker containers





1. Basics: The concept of Linux namespaces


The Linux kernel has supported namespaces since version 2.4.19. The purpose of each namespace is to wrap a particular global system resource in an abstraction that makes it appear to the processes within the namespace that they have their own isolated instance of that global resource. The Linux kernel implements six types of namespaces (at the time of writing), listed below in the order in which they were introduced:


| Namespace | Kernel version introduced | Global system resource isolated | Isolation effect in the context of a container |
| --- | --- | --- | --- |
| Mount namespaces | Linux 2.4.19 | File system mount points | Each container can see a different file system hierarchy. |
| UTS namespaces | Linux 2.6.19 | nodename and domainname | Each container can have its own hostname and domain name. |
| IPC namespaces | Linux 2.6.19 | Specific inter-process communication resources, including System V IPC and POSIX message queues | Each container has its own System V IPC objects and POSIX message queue file system, so only processes in the same IPC namespace can communicate with each other. |
| PID namespaces | Linux 2.6.24 | Process ID number space | Processes in each PID namespace have their own PIDs, and each container can have its own PID 1 root process. This also allows a container to migrate between hosts, because the PIDs inside the namespace are independent of the host. As a result, each process in a container has two PIDs: one inside the container and one on the host. |
| Network namespaces | Started in Linux 2.6.24, completed in Linux 2.6.29 | Network-related system resources | Each container has its own network devices, IP addresses, IP routing tables, /proc/net directory, port numbers, and so on. This also allows the same application in several containers on one host to bind to port 80 in its own container. |
| User namespaces | Started in Linux 2.6.23, completed in Linux 3.8 | User and group ID number spaces | The user and group IDs of a process inside a user namespace can differ from those on the host, each container can have its own user and group IDs, and an unprivileged user on the host can be a privileged user inside the user namespace. |


The concept of Linux namespaces is both simple and complex. Put simply, a process in a given namespace sees its own isolated instance of certain system resources. For the complex side, you can study how the Linux kernel implements namespaces; plenty of documentation is available online, so it is not repeated here.
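
As a rough illustration of the idea, the kernel exposes the namespaces a process belongs to as symbolic links under /proc/<pid>/ns, and two processes are in the same namespace when the corresponding links have the same target. The Python sketch below reads those links. The function name read_namespaces and its proc_root parameter are made up for this example; they are not part of Docker or of any library.

```python
import os


def read_namespaces(pid="self", proc_root="/proc"):
    """Return {namespace_name: link_target} for one process.

    Each entry under <proc_root>/<pid>/ns is a symbolic link whose target
    looks like 'pid:[4026532211]'; the number identifies the namespace.
    """
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    result = {}
    for name in sorted(os.listdir(ns_dir)):
        path = os.path.join(ns_dir, name)
        if os.path.islink(path):
            result[name] = os.readlink(path)
    return result


if __name__ == "__main__":
    # Only meaningful on a Linux host; do nothing elsewhere.
    if os.path.isdir("/proc/self/ns"):
        for name, target in read_namespaces().items():
            print(f"{name:8s} {target}")
```

Run on a Linux host, this prints one line per namespace of the current process. Reading the links of another user's process usually requires root.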


2. Docker containers use Linux namespaces for runtime environment isolation


When Docker creates a container, it creates new instances of the six namespaces above and then places all the processes of the container into them, so that the processes in the Docker container can see only the isolated system resources.


2.1 PID Namespace


For the same process, the PID inside the container differs from the PID outside it:


    • Inside the container, the PID is 1 and the PPID is 0.
    • Outside the container, the PID is 2198 and the PPID is 2179, which is the docker-containerd-shim process.


root@devstack:/home/sammy# ps -ef | grep python
root      2198  2179  0 00:06 ?        00:00:00 python app.py

root@devstack:/home/sammy# ps -ef | grep 2179
root      2179   765  0 00:06 ?        00:00:00 docker-containerd-shim 8b7dd09fbcae00373207f01e2acde45740871c9e3b98286b5458b4ea09f41b3e /var/run/docker/libcontainerd/8b7dd09fbcae00373207f01e2acde45740871c9e3b98286b5458b4ea09f41b3e docker-runc
root      2198  2179  0 00:06 ?        00:00:00 python app.py
root      2249  1692  0 00:06 pts/0    00:00:00 grep --color=auto 2179


root@devstack:/home/sammy# docker exec -it web31 ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 16:06 ?        00:00:00 python app.py



As for the relationship between containerd, containerd-shim, and the container: simply put, containerd (started by the Docker daemon) starts containerd-shim, which starts runc, which in turn launches the container. runc exits after the container is created, and the shim remains as the container's parent process. The shim allows the container to be daemonless, meaning that it does not depend on the Docker daemon, so the container keeps running as usual after the Docker daemon dies. [Note: this part needs further study; the description is not necessarily correct.]
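
One way to check this parent/child chain on a host is to follow the PPid field of /proc/<pid>/status upward from the container process. The following Python sketch does that. The helper names read_ppid and ancestor_chain and the proc_root parameter are assumptions made for this example.

```python
import os


def read_ppid(pid, proc_root="/proc"):
    """Return the parent PID recorded in <proc_root>/<pid>/status."""
    with open(os.path.join(proc_root, str(pid), "status")) as f:
        for line in f:
            if line.startswith("PPid:"):
                return int(line.split()[1])
    raise ValueError(f"no PPid field for pid {pid}")


def ancestor_chain(pid, proc_root="/proc"):
    """Return [pid, parent, grandparent, ...], stopping at PID 1 or PPid 0."""
    chain = [int(pid)]
    while chain[-1] > 1:
        parent = read_ppid(chain[-1], proc_root)
        if parent == 0:
            break
        chain.append(parent)
    return chain
```

On the host shown above, ancestor_chain(2198) would be expected to begin with 2198, 2179 (the shim), and 765. The listing does not show which process 765 is or what comes after it.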



This also shows that the PID namespace maps PIDs on the host to PIDs inside the container, so that the processes inside the container appear to have their own separate PID space.
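
On kernels that provide it (Linux 4.1 and later), the NSpid field of /proc/<pid>/status lists a process's PID in each nested PID namespace, starting with the namespace of the procfs instance being read. The sketch below parses that field. parse_nspid is a hypothetical helper name, and the sample text is built from the PID values in the example above rather than captured from a real system.

```python
def parse_nspid(status_text):
    """Extract the NSpid values from the text of /proc/<pid>/status.

    Returns a list of ints, or an empty list when the field is absent
    (for example on kernels older than 4.1).
    """
    for line in status_text.splitlines():
        if line.startswith("NSpid:"):
            return [int(value) for value in line.split()[1:]]
    return []


# Illustrative input only: constructed from the PIDs in the example above.
sample_status = "Name:\tpython\nPid:\t2198\nPPid:\t2179\nNSpid:\t2198\t1\n"
host_pid, container_pid = parse_nspid(sample_status)
print(host_pid, container_pid)  # prints: 2198 1
```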


2.2 UTS Namespace


Similarly, a container can have its own hostname and domain name:


root@devstack:/home/sammy# hostname
devstack
root@devstack:/home/sammy# docker exec -it web31 hostname
8b7dd09fbcae


2.3 User Namespace


Docker did not support user namespaces before the Docker 1.10 release. That is, by default, the process inside a container runs as the root user of the host. When a file or directory on the host is mapped into the container as a volume, the process in the container therefore has almost all of root's permissions to modify that directory on the host, which is a serious security issue.



Example:


    • Start a container: docker run -d -v /bin:/host/bin --name web34 training/webapp python app.py
    • At this point the user of the process is root both inside and outside the container, and from within the container it can make arbitrary modifications to the /bin directory on the host:
root@devstack:/home/sammy# docker exec -ti web34 id
uid=0(root) gid=0(root) groups=0(root)
root@devstack:/home/sammy# id
uid=0(root) gid=0(root) groups=0(root)


User namespace support, introduced in Docker 1.10, allows a container to have a "fake" root user: it is root inside the container but a non-root user outside it. In other words, the user namespace implements a mapping between host users and container users.



Steps to enable it:


    1. Modify the /etc/default/docker file to add the line DOCKER_OPTS="--userns-remap=default"
    2. Restart the Docker service, at which point the dockerd process is /usr/bin/dockerd --userns-remap=default --raw-logs
    3. Then create a container: docker run -d -v /bin:/host/bin --name web35 training/webapp python app.py
    4. View the process user inside and outside the container:
root@devstack:/home/sammy# ps -ef | grep python
231072    1726  1686  0 01:44 ?        00:00:00 python app.py
root@devstack:/home/sammy# docker exec web35 ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 17:44 ?        00:00:00 python app.py
    • In the files /etc/subuid and /etc/subgid, you can see that the subordinate UID and GID ranges of the dockremap user on the host start at 231072:
root@devstack:/home/sammy# cat /etc/subuid
sammy:100000:65536
stack:165536:65536
dockremap:231072:65536


root@devstack:/home/sammy# cat /etc/subgid
sammy:100000:65536
stack:165536:65536
dockremap:231072:65536


    • The file /proc/1726/uid_map describes the mapping of users inside and outside the container: UIDs starting at 231072 on the host are mapped to UIDs starting at 0 (that is, root) inside the container.
root@devstack:/home/sammy# cat /proc/1726/uid_map
0 231072 65536
    • Now, when we try to modify the /bin directory on the host from inside the container, we get a permission-denied error:
[Email protected]:/host/'test2': Permission denied


This shows that, with the user namespace in place, the process inside the container runs as a non-root user on the host, so we have successfully restricted the permissions of the process within the container.
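
The arithmetic behind the mapping can be sketched in a few lines of Python. Each line of uid_map has three fields: the first ID inside the namespace, the first ID outside it, and the length of the range. The helper names parse_subid, parse_id_map, and to_host_id are invented for this illustration, and the input strings are copied from the listings above.

```python
def parse_subid(text, user):
    """Return (start, count) for `user` from /etc/subuid or /etc/subgid text."""
    for entry in text.split():
        name, start, count = entry.split(":")
        if name == user:
            return int(start), int(count)
    return None


def parse_id_map(text):
    """Parse uid_map/gid_map text into (inside, outside, length) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            rows.append(tuple(int(f) for f in fields))
    return rows


def to_host_id(container_id, id_map):
    """Translate an ID seen inside the namespace to the ID seen on the host."""
    for inside, outside, length in id_map:
        if inside <= container_id < inside + length:
            return outside + (container_id - inside)
    return None  # the ID has no mapping in this namespace


subuid = "sammy:100000:65536\nstack:165536:65536\ndockremap:231072:65536\n"
uid_map = parse_id_map("0 231072 65536\n")
print(parse_subid(subuid, "dockremap"))  # prints: (231072, 65536)
print(to_host_id(0, uid_map))            # prints: 231072
```

Under this mapping, root in the container (UID 0) is UID 231072 on the host, which matches the owner of the python app.py process in the ps output above.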



The remaining namespaces, such as the network and mount namespaces, are relatively simple, so there is not much to add about them. In summary, the Docker daemon creates an instance of each of the six namespaces for every container, so that the processes in the container run in an isolated environment:


root@devstack:/proc/1726/ns# ls -l
total 0
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 ipc -> ipc:[4026532210]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 mnt -> mnt:[4026532208]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:44 net -> net:[4026532213]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 pid -> pid:[4026532211]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 user -> user:[4026532207]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 uts -> uts:[4026532209]
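
A listing like this can be compared with that of another process to see which namespaces the two share. The following is a minimal Python sketch of such a comparison. The container listing is copied from above, while the identifiers in hypothetical_host are made-up placeholders for a host process, since the article does not show them. Both function names are also assumptions.

```python
def parse_ns_listing(listing):
    """Parse `ls -l /proc/<pid>/ns` output into {name: namespace_id}."""
    namespaces = {}
    for line in listing.splitlines():
        if " -> " not in line:
            continue  # skip the 'total 0' line and blank lines
        target = line.split(" -> ")[1].strip()  # e.g. 'ipc:[4026532210]'
        name, _, ident = target.partition(":")
        namespaces[name] = int(ident.strip("[]"))
    return namespaces


def shared_namespaces(a, b):
    """Names of the namespaces that two processes have in common."""
    return sorted(name for name in a if a[name] == b.get(name))


container_listing = """\
total 0
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 ipc -> ipc:[4026532210]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 mnt -> mnt:[4026532208]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:44 net -> net:[4026532213]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 pid -> pid:[4026532211]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 user -> user:[4026532207]
lrwxrwxrwx 1 231072 231072 0 Sep 18 01:45 uts -> uts:[4026532209]
"""
container = parse_ns_listing(container_listing)

# Placeholder values standing in for a process on the host (not from the article).
hypothetical_host = {"ipc": 1, "mnt": 2, "net": 3, "pid": 4, "user": 5, "uts": 6}
print(shared_namespaces(container, hypothetical_host))
```

With these placeholder values the two processes share nothing, which is what is expected for a fully isolated container. A container started with one of the host options described in the next section would share the corresponding namespace with the host.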


3. Namespace-related parameters of the docker run command


The docker run command has several parameters related to namespaces:


    • --ipc string IPC namespace to use
    • --pid string PID namespace to use
    • --userns string User namespace to use
    • --uts string UTS namespace to use


3.1 --userns


--userns: Specifies the user namespace used by the container


    • 'host': Use the Docker host's user namespace
    • '': Use the user namespace specified by the Docker daemon's '--userns-remap' option


With user namespaces enabled, you can still force a container to run in the host user namespace:


root@devstack:/proc/2835# docker run -d -v /bin:/host/bin --name web37 --userns host training/webapp python app.py
9c61e9a233abef7badefa364b683123742420c58d7a06520f14b26a547a9476c
root@devstack:/proc/2835# ps -ef | grep python
root      2962  2930  1 02:17 ?        00:00:00 python app.py


Otherwise, by default, the container runs in the remapped user namespace.


3.2 --pid


Similarly, you can have the container use the Docker host's PID namespace, so that the processes in the container can see all the processes on the host. Note that a remapped user namespace cannot be used in this case, which is why the command below also passes --userns host.


root@devstack:/proc/2962# docker run -d -v /bin:/host/bin --name web38 --pid host --userns host training/webapp python app.py
f40f6702b61e3028a6708cdd7b167474ddf2a98e95b6793a1326811fc4aa161d
root@devstack:/proc/2962#
root@devstack:/proc/2962# docker exec -it web38 bash
root@f40f6702b61e:/opt/webapp# ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root 1 0.0 0.1 33480 2768 ?        Ss 17:40 0:01 /sbin/init
root 2 0.0 0.0 0 0 ?        S 17:40 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ?        S 17:40 0:00 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ?        S< 17:40 0:00 [kworker/0:0H]
root 6 0.0 0.0 0 0 ?        S 17:40 0:00 [kworker/u2:0]
root 7 0.0 0.0 0 0 ?        S 17:40 0:00 [rcu_sched]
......


3.3 --uts


Similarly, a container can use the Docker host's UTS namespace. The most obvious effect is that the container's hostname is the same as the Docker host's.


root@devstack:/proc/2962# docker run -d -v /bin:/host/bin --name web39 --uts host training/webapp python app.py
38e8b812e7020106bf8d3952b88085028fc87f4427af0c3b0a29b6a69c979221
root@devstack:/proc/2962# docker exec -it web39 bash
root@devstack:/opt/webapp# hostname
devstack
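

When containers are started from a script, the namespace options used in this section can be assembled into an argument list. The Python sketch below only builds the list and does not start anything. docker_run_args is a hypothetical helper written for this example; the flags it emits (--pid, --uts, --userns, --ipc, -v, --name, -d) are the docker run options used above.

```python
def docker_run_args(image, command, name=None, volumes=(), pid=None,
                    uts=None, userns=None, ipc=None):
    """Build an argument list for `docker run -d` with namespace options.

    Each of pid/uts/userns/ipc is passed through unchanged, e.g. "host".
    """
    args = ["docker", "run", "-d"]
    for volume in volumes:
        args += ["-v", volume]
    if name:
        args += ["--name", name]
    for flag, value in (("--pid", pid), ("--uts", uts),
                        ("--userns", userns), ("--ipc", ipc)):
        if value:
            args += [flag, value]
    return args + [image] + list(command)


# Rebuild the web38 command from section 3.2.
web38 = docker_run_args("training/webapp", ["python", "app.py"],
                        name="web38", volumes=["/bin:/host/bin"],
                        pid="host", userns="host")
print(" ".join(web38))
```

The resulting list could be handed to subprocess.run on a machine where Docker is installed; that step is left out here so that the sketch has no side effects.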








Reference links


    • http://lwn.net/Articles/531114/
    • Docker basic technology: Linux Namespace (Part 1)
    • Docker basic technology: Linux Namespace (Part 2)
    • https://github.com/crosbymichael/dockercon-2016/blob/master/Creating%20Containerd.pdf
    • https://events.linuxfoundation.org/sites/events/files/slides/User%20Namespaces%20-%20ContainerCon%202015%20-%2016-9-final_0.pdf
    • https://blog.yadutaf.fr/2016/04/14/docker-for-your-users-introducing-user-namespace/
    • https://success.docker.com/Datacenter/Apply/Introduction_to_User_Namespaces_in_Docker_Engine

