Docker Source Analysis (IX): Docker Image

Source: Internet
Author: User
Tags: docker, registry

1. Preface

Looking back on 2014, Docker set off a wave of "container fever" around the world, and the industry's exploration and practice of Docker kept reaching new heights. Now, in 2015 and looking ahead, Docker does not look like a flash-in-the-pan technology that fades once the excitement passes. Instead, after being practiced and evaluated by the industry, it has shown remarkable development potential.

Few people would disagree that, in essence, "Docker provides container services". So, since the service Docker provides is a "container" technology, what can we learn about the nature and history of that technology? As mentioned in earlier articles, the "container" technology Docker uses is based mainly on Linux kernel features such as cgroup and namespace, which keep a process or process group in an isolated, secure environment. The first version of Docker was released in March 2013, while cgroup's official debut dates back to the second half of 2007, when cgroup was merged into Linux kernel 2.6.24. The roughly six years in between were not a vacuum for "container" technology. LXC (Linux Containers) was born in 2008 and simplified the creation and management of containers. Some PaaS platforms in the industry also made early attempts to use container technology as the runtime environment for their cloud applications. In the same year that Docker was released, Google released the open source container management tool lmctfy. Beyond Linux, other operating systems such as FreeBSD and Solaris have "container" technologies that play a similar role, and their history goes back to the beginning of the millennium.

Clearly, "container" technology has a long history, yet few of its contemporaries can match Docker's influence. Whether the cloud computing boom gave rise to Docker, or Docker simply arrived in time for the cloud computing era, there is no doubt that Docker, as the new favorite in this field, will remain popular with the industry. In the era of cloud computing, distributed applications are becoming common, and the way they are built, delivered, and operated differs from traditional requirements. The cgroup and namespace features of the Linux kernel make resource isolation and application deployment possible, but these kernel features alone cannot package a container's complete runtime environment. Docker's design takes this into account: in addition to cgroup and namespace, it uses "image" technology to manage the file system and runtime environment. In the author's view, Docker's flexible "image" technology is one of the most important factors in its popularity.

2. Docker Image Introduction

The first question is naturally: what is a Docker image?

According to the technical documentation on the Docker website, an image is a Docker term that denotes a read-only layer. A layer, in turn, is a portion of the Docker container file system that can be stacked on top of other layers.

Even with this description, many Docker enthusiasts probably still find the Docker image hard to grasp. To make it clearer, let's first look at 4 concepts related to Docker images: rootfs, union mount, image, and layer.

2.1 Rootfs

Rootfs: the file system view visible to the internal processes of a Docker container when it starts (as opposed to after it has been running), that is, the root directory of the Docker container. This directory contains the system files, tools, and container files that the Docker container needs.

Traditionally, when a Linux kernel boots, it first mounts a read-only rootfs. After the system checks its integrity, it either switches the rootfs to read-write mode, or mounts another file system on top of the rootfs and ignores the rootfs. The Docker architecture keeps the Linux idea of a rootfs. When Docker daemon mounts the rootfs for a Docker container, it sets the rootfs to read-only mode, just as the traditional Linux kernel does. After the rootfs is mounted, however, Docker daemon does not switch the container's file system to read-write mode. Instead, it uses union mount technology to mount a read-write file system on top of the read-only rootfs. At mount time this read-write file system is completely empty.

Take starting an Ubuntu container as an example. Assume that the user has pulled the ubuntu:14.04 image from Docker registry and started it with the command docker run -it ubuntu:14.04 /bin/bash. The rootfs that Docker daemon creates for it, together with the file system that the container can read and write, is shown in Figure 2.1:

Figure 2.1 Ubuntu 14.04 Container Rootfs

As the terms read-only and read-write suggest, processes in the container have only read access to the content in the rootfs, and both read and write access to the content in the read-write file system. Figure 2.1 shows that the container sees a single file system, but that file system consists of two layers: the read-write file system and the read-only file system. This already hints at the idea of a layer.

In short, the Docker container file system can be divided into two parts, and as mentioned above, Docker daemon uses union mount technology to mount the two together. So what kind of technology is union mount?

2.2 Union Mount

Union mount: a way of mounting file systems that allows several file systems to be mounted at the same mount point, and presents the merged directory contents of all of them as a single file system.

Normally, when a file system is mounted at a mount point, the existing contents of the mount point directory are hidden. A union mount does not hide those contents. Instead, it merges the contents of the mount point directory with the contents being mounted, and provides a unified, independent file system view of the merged result. Usually only one of the merged file systems is mounted in read-write mode, while the others are mounted read-only. A file system that implements this union mount technique is generally called a union file system. Common examples are UnionFS, AUFS, and OverlayFS.

To implement the union mount of the container file system, Docker supports several concrete file systems, such as AUFS, which has been used since early versions of Docker, and OverlayFS, support for which was added in Docker 1.4.0.

For a deeper understanding of union mount, the AUFS file system can be used to further illustrate the ubuntu:14.04 container file system example above, as shown in Figure 2.2:
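
The merging behavior described above can be illustrated with a small in-memory model. This is only a sketch, not how AUFS or OverlayFS is implemented: each layer is represented as a Python dict from path to content, and the function name union_view and the sample paths are hypothetical.

```python
# Toy model of a union mount: each layer is a dict mapping path -> content.
# Layers are listed from bottom (read-only) to top (read-write).
# This is an illustration only; real union file systems work inside the kernel.

def union_view(layers):
    """Return the merged view; on a path conflict the upper layer wins."""
    merged = {}
    for layer in layers:      # bottom layer first, top layer last
        merged.update(layer)  # a later (upper) layer shadows a lower one
    return merged

# Hypothetical contents of a read-only rootfs.
rootfs = {"/bin/bash": "bash binary", "/etc/hosts": "127.0.0.1 localhost"}
# The read-write layer is empty when it is first mounted.
read_write = {}

view = union_view([rootfs, read_write])
print(sorted(view))
```

With an empty read-write layer, the merged view is identical to the rootfs, which is why the user initially sees no difference between the two.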

Figure 2.2 Aufs Mount Ubuntu 14.04 File system

For a container created from the image ubuntu:14.04, the container's entire rootfs can, for now, be treated as a single file system. As mentioned above, the read-write file system is empty when mounted. From the user's point of view, therefore, the container file system and the rootfs are exactly the same: the user can work with everything in the file system as usual and notice no difference. From the kernel's point of view, however, the difference is significant. To trace its root cause, we have to look at the COW (copy-on-write) feature of file systems such as AUFS.

The biggest difference between a COW file system and other file systems is that a COW file system never overwrites content that already exists in the underlying file system. When the two file systems (rootfs and the read-write file system) are merged by a COW file system, the end user sees the merged file system with all of its content, while the Linux kernel can still logically tell the two apart. That is, the content in the original rootfs is read-only, and the content in the read-write file system can be both read and written.

The user is completely unaware of which content is read-only and which is writable; only the kernel keeps track. Suppose the user needs to update the file /etc/hosts, and that file happens to belong to the read-only rootfs. Will the kernel throw an error or refuse the request? The answer is no. In this case the COW file system does not overwrite the file in the read-only file system, so /etc/hosts in the rootfs is left untouched. Instead, it first copies the file into the read-write file system, creating /etc/hosts there, and then updates that copy. Although both the rootfs and the read-write file system now contain /etc/hosts, a COW file system such as AUFS ensures that only /etc/hosts in the read-write file system, that is, the updated content, is visible to the user.

This feature also supports other operations, such as deleting files that are in the rootfs. For example, if a user installs Golang through the apt-get package management tool, all Golang-related content is installed in the read-write file system, not in the rootfs. Suppose the user then wants to remove everything related to MySQL through apt-get, and that content happens to be in the rootfs. The delete operation does not remove the actual MySQL files from the rootfs. Instead, the deletion is recorded in the read-write file system, so that the MySQL content in the rootfs is no longer visible or accessible to the container user.

With the concepts of rootfs and union mount in Docker in hand, the Docker image becomes easy to understand.
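
The copy-up sequence for /etc/hosts can be sketched as follows. This is a simplified in-memory illustration under stated assumptions, not AUFS code: the class name CowFS, its methods, and the file contents are all hypothetical, and files are modeled as strings in dicts.

```python
class CowFS:
    """Toy copy-on-write file system with one read-only and one read-write layer."""

    def __init__(self, lower):
        self.lower = dict(lower)  # read-only layer (rootfs), never modified
        self.upper = {}           # read-write layer, empty at mount time

    def read(self, path):
        # The upper layer shadows the lower layer.
        if path in self.upper:
            return self.upper[path]
        return self.lower[path]

    def append(self, path, text):
        if path not in self.upper:
            # Copy-up: bring the file into the read-write layer first.
            self.upper[path] = self.lower.get(path, "")
        # Only the copy in the read-write layer is ever changed.
        self.upper[path] += text


fs = CowFS({"/etc/hosts": "127.0.0.1 localhost\n"})
fs.append("/etc/hosts", "10.0.0.2 db\n")  # hypothetical host entry

print(fs.read("/etc/hosts"))   # the user sees the updated copy
print(fs.lower["/etc/hosts"])  # the rootfs copy is unchanged
```

After the update, both layers hold a version of /etc/hosts, but reads always resolve to the one in the read-write layer.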
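
Deletion can be modeled in the same toy style. AUFS records the removal of a lower-layer file with a "whiteout" marker in the read-write layer; the sketch below stands in for that with a plain Python set, which is an assumption made for brevity. The class name UnionFS and the file paths are hypothetical.

```python
class UnionFS:
    """Toy union file system that hides deleted lower-layer files with whiteouts."""

    def __init__(self, lower):
        self.lower = dict(lower)  # read-only rootfs, never modified
        self.upper = {}           # read-write layer
        self.whiteouts = set()    # paths marked as deleted in the read-write layer

    def exists(self, path):
        if path in self.whiteouts:
            return False          # hidden, even though rootfs still has it
        return path in self.upper or path in self.lower

    def write(self, path, content):
        self.whiteouts.discard(path)  # re-creating a file clears its whiteout
        self.upper[path] = content

    def delete(self, path):
        if not self.exists(path):
            raise FileNotFoundError(path)
        self.upper.pop(path, None)
        if path in self.lower:
            # The rootfs cannot be changed, so only record the deletion.
            self.whiteouts.add(path)


fs = UnionFS({"/usr/bin/mysql": "mysql binary"})
fs.write("/usr/local/go/bin/go", "go binary")  # installed into the read-write layer
fs.delete("/usr/bin/mysql")                    # hidden, not physically removed

print(fs.exists("/usr/bin/mysql"))     # False
print("/usr/bin/mysql" in fs.lower)    # True
```

The file is still physically present in the read-only layer, which is what allows other containers that share the same rootfs to keep using it.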

2.3 Image

In Docker, the rootfs is the cornerstone of the container file system, and its read-only nature is not hard to understand. In practice, though, Docker's design and implementation of the rootfs is considerably more subtle than described so far.

Continuing with Ubuntu 14.04 as the example: the rootfs and the read-write file system can be merged through AUFS, but the rootfs itself takes up nearly 200MB of disk space. Creating and migrating containers at the granularity of a whole rootfs would be cumbersome and would greatly reduce the flexibility of images. Moreover, if a user wanted an Ubuntu 14.10 rootfs, a brand-new rootfs would have to be created, even though the Ubuntu 14.10 and Ubuntu 14.04 rootfs share a lot of identical content.

The concept of an image in Docker solves this problem neatly. The simplest explanation is that an image is one part of the read-only rootfs of a Docker container. In other words, the rootfs of a Docker container can actually be composed of multiple images, and these images are still combined into the rootfs through union mount.

How multiple images make up the rootfs is shown in Figure 2.3 (in the figure, the content assigned to each layer of the rootfs only serves to show that the rootfs is composed of multiple images, and does not represent how content is actually divided in a real rootfs):

Figure 2.3 Container Rootfs multiple image composition diagram

As the figure shows, the rootfs of the example container contains 4 images, each of which holds part of the file system seen from the user's perspective. The 4 images are stacked: except for the bottommost one, each image sits on top of another image. In addition, each image has an image ID that uniquely identifies it.

From this, Docker derives two further concepts: the parent image and the base image. Apart from the bottommost image of the container rootfs, every image depends on one or more images below it, and in Docker the image directly below a given image is called its parent image. In Figure 2.3, for example, imageid_0 is the parent image of imageid_1, imageid_2 is the parent image of imageid_3, and imageid_0 has no parent image. The bottommost image, which has no parent image, is conventionally called the base image.

Through images, the previously bloated rootfs is broken up into several lightweight layers. Besides being lightweight, an image is also read-only, as mentioned above, so the same image can be reused across different containers and different rootfs.

The organization and reuse relationships among multiple images are shown in Figure 2.4 (the image names in the figure only serve to illustrate the relationships between images, and do not represent the actual relationships between the images with those names):
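
The parent relationship can be sketched as a simple lookup table. The image IDs below are the ones used in Figure 2.3; the dict layout and the function name layer_chain are hypothetical and only illustrate how walking parent links leads from the topmost image to the base image.

```python
# Parent links for the 4 images in Figure 2.3; None marks the base image.
parents = {
    "imageid_3": "imageid_2",
    "imageid_2": "imageid_1",
    "imageid_1": "imageid_0",
    "imageid_0": None,
}

def layer_chain(image_id, parents):
    """Follow parent links from the given image down to the base image."""
    chain = []
    current = image_id
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain

chain = layer_chain("imageid_3", parents)
base_image = chain[-1]  # the image with no parent

print(chain)
print(base_image)
```

The resulting list is ordered from the topmost image to the base image, which is the reverse of the order in which the layers are stacked by the union mount.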

Figure 2.4 Multiple Image organization relationships

Figure 2.4 shows a total of 11 images, whose relationships form a forest. The forest contains two trees: the left tree has 5 nodes, that is, 5 images, and the right tree has 6 nodes, that is, 6 images. Some images in the figure carry a red label, which means that the image is the topmost image in the rootfs of a container image. Take ubuntu:14.04: its label indicates that imageid_3 is the topmost layer of the rootfs for containers of that image. Following the path from that node to the root of the tree, we find imageid_2, imageid_1, and imageid_0. Notably, imageid_2 is both the parent image of imageid_3 and the topmost layer in the rootfs of the container image ubuntu:12.04. So the image ubuntu:14.04 simply adds one more layer on top of the image ubuntu:12.04. As a result, when both ubuntu:12.04 and ubuntu:14.04 are downloaded, only one copy of imageid_2, imageid_1, and imageid_0 is downloaded, and the images are reused. The images mysql:5.6, mongo:2.2, debian:wheezy, and debian:jessie in the right tree show the same kind of relationship.
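
The download saving can be made concrete with a small sketch. It uses the tag-to-image mapping from the left tree of Figure 2.4; the functions layers_of and pull, and the idea of a local store modeled as a set, are hypothetical simplifications of what Docker daemon does when pulling from a registry.

```python
# Parent links and tags taken from the left tree of Figure 2.4.
parents = {
    "imageid_3": "imageid_2",
    "imageid_2": "imageid_1",
    "imageid_1": "imageid_0",
    "imageid_0": None,
}
tags = {"ubuntu:14.04": "imageid_3", "ubuntu:12.04": "imageid_2"}

def layers_of(tag):
    """Return every image on the path from the tagged image to the base image."""
    chain, current = [], tags[tag]
    while current is not None:
        chain.append(current)
        current = parents[current]
    return chain

def pull(tag, local_store):
    """Download only the layers that are not yet in the local store."""
    downloaded = [layer for layer in layers_of(tag) if layer not in local_store]
    local_store.update(downloaded)
    return downloaded

local_store = set()
first = pull("ubuntu:12.04", local_store)   # nothing local yet: 3 layers
second = pull("ubuntu:14.04", local_store)  # shared layers are reused: 1 layer

print(first)
print(second)
```

Pulling ubuntu:14.04 after ubuntu:12.04 transfers only imageid_3, because the three lower images are already present locally.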

2.4 Layer

In Docker terminology, layer is a word with a meaning close to that of image. The rootfs of a container image is the container's read-only file system, and it is composed of multiple read-only images. Each read-only image in the rootfs can therefore be called a layer.

In addition to the read-only images, when Docker daemon creates a container it mounts a read-write file system on top of the container's rootfs. This file system is also a layer of the container, and is usually called the top layer.

To summarize, each read-only image in a Docker container, as well as the topmost writable file system, is called a layer. The term layer is therefore slightly broader than image, because it also covers the topmost read-write file system.

The concept of a layer raises a question. The container file system is divided into a read-only rootfs and a read-write top layer. If content is written to the top layer while the container is running, can that content be persisted and reused by other containers?

The analysis of images above mentioned that images can be reused. So let's make a bolder assumption: can the top layer of a container be converted into an image?

The answer is yes. In Docker's design, converting the top layer into an image (called a commit operation in Docker) greatly increases the flexibility of the container rootfs. A Docker developer can create a container from an image, do development work in it, and at any point in the development cycle commit the container, which turns all the content in the top layer into a new image. After the commit completes, the user can develop, distribute, test, and deploy based on the new image. This principle applies not only to docker commit. docker build, which is driven by a Dockerfile, follows the same core idea: it repeatedly turns the top layer of a container into an image.
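
The commit operation can be sketched with the same kind of toy model. Everything here is a hypothetical illustration rather than Docker's implementation: images are stored in a dict, the Container class and the functions commit and rootfs_of are invented for this example, and new image IDs are generated with a simple counter.

```python
# Hypothetical image store: each image has a parent link and its own files.
images = {
    "imageid_0": {"parent": None, "files": {"/bin/bash": "bash binary"}},
}

class Container:
    """A container: a reference to an image plus an empty read-write top layer."""

    def __init__(self, image_id):
        self.image_id = image_id
        self.top_layer = {}

    def write(self, path, content):
        self.top_layer[path] = content

def commit(container, images):
    """Freeze the container's top layer into a new read-only image."""
    new_id = "imageid_%d" % len(images)  # toy ID scheme, not Docker's
    images[new_id] = {
        "parent": container.image_id,
        "files": dict(container.top_layer),
    }
    return new_id

def rootfs_of(image_id, images):
    """Merge all layers from the base image up to the given image."""
    chain = []
    while image_id is not None:
        chain.append(images[image_id]["files"])
        image_id = images[image_id]["parent"]
    merged = {}
    for files in reversed(chain):  # base image first, topmost image last
        merged.update(files)
    return merged

dev = Container("imageid_0")
dev.write("/app/main.go", "package main")  # hypothetical development work
new_image = commit(dev, images)

print(new_image)
print(sorted(rootfs_of(new_image, images)))
```

A new container started from the committed image sees the development work as part of its read-only rootfs and gets a fresh, empty top layer. In this model, docker build amounts to running one step per Dockerfile instruction and calling commit after each step.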

3. Summary

It is no coincidence that Docker has taken the world by storm. In today's cloud computing era, the combination of lightweight container technology and flexible image technology seems set to upend previous software delivery models, and has brought new opportunities to the development of continuous integration (CI) and continuous delivery (CD).

Understanding Docker's "image" technology helps Docker enthusiasts better use, create, and deliver Docker images. To that end, this article started from 4 important concepts related to Docker images, and introduced what a Docker image contains, the technologies involved, and its important features. By introducing its "image" technology, Docker has made containers easier to use and broadened the range of scenarios in which Docker can be applied. At the same time, we should also consider soberly whether the introduction of image technology brings side effects of its own. Further reflections on image technology will be covered in later articles of the Docker Source Analysis series.

4. References

http://www.csdn.net/article/2014-09-24/2821832

http://en.wikipedia.org/wiki/Cgroups

http://www.infoq.com/cn/articles/docker-future

https://docs.docker.com/terms/image/

https://docs.docker.com/terms/layer/#layer

http://en.wikipedia.org/wiki/Union_mount

https://www.usenix.org/legacy/publications/library/proceedings/neworl/full_papers/mckusick.a

http://www.qnx.com/developers/docs/660/index.jsp?topic=%2Fcom.qnx.doc.neutrino.sys_arch%2Ftopic%2Ffsys_COW_filesystem.html
