Understanding images, containers, and storage drivers

Source: Internet
Author: User

To use storage drivers effectively, you must understand how Docker builds and stores images. Next, you need to understand how containers use these images. Finally, you need a short technical introduction to the mechanisms that both image and container operations rely on.

Images and layers

Each Docker image references a list of read-only layers that represent filesystem differences. The layers are stacked from the bottom up to form the root filesystem of a container. The following figure shows an Ubuntu image made up of 4 layers:


Docker's storage driver is responsible for stacking these layers and providing a single unified view.

When you create a new container, you add a new, thin, writable layer on top of the underlying stack. This layer is often referred to as the "container layer". All changes made to the running container, such as writing a new file, modifying an existing file, or deleting a file, are written to this layer. The following figure shows a container based on the Ubuntu image.
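The stacking described above can be pictured with a small Python model. This is only a sketch under stated assumptions: each layer is a plain dict from file path to content, the paths and contents are made up, and `collections.ChainMap` stands in for the unified view that a storage driver provides. File deletion is not modeled.

```python
from collections import ChainMap

# Hypothetical read-only image layers, ordered from the base layer up.
# Each dict maps a file path to its content; all names and values are made up.
image_layers = [
    {"/bin/sh": "shell binary", "/etc/motd": "base message"},
    {"/usr/bin/tool": "tool binary"},
    {"/etc/motd": "newer message"},  # overrides the base layer's file
]

# The thin writable container layer starts out empty.
container_layer = {}

# ChainMap looks keys up in its maps from left to right, so the container
# layer comes first, followed by the image layers from newest to oldest.
unified_view = ChainMap(container_layer, *reversed(image_layers))

# Reads fall through the stack until some layer has the file.
print(unified_view["/bin/sh"])    # found in the base layer
print(unified_view["/etc/motd"])  # the newest layer wins

# Writes through a ChainMap only ever touch its first map,
# which here plays the role of the container layer.
unified_view["/tmp/notes.txt"] = "written by the container"
print(sorted(container_layer))
```

The image layers are never modified by the write; only the dict standing in for the container layer changes.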


Content-addressable storage

Docker 1.10 introduced a new content-addressable storage model. This is a completely new way to address image and layer data on disk. In earlier versions, image and layer data was referenced and stored using a randomly generated UUID. In the new model, the UUID is replaced with a secure content hash.

The new model improves security, provides a built-in way to avoid ID collisions, and guarantees data integrity after pull, push, load, and save operations. It also lets images share layers freely, even when they did not come from the same build.
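The idea of deriving an ID from content can be illustrated in a few lines of Python. This is a sketch of the general technique, not of Docker's on-disk format: exactly which bytes Docker hashes is not covered here, and the helper names `layer_id` and `verify` are hypothetical.

```python
import hashlib


def layer_id(content: bytes) -> str:
    """Return a content-derived ID: the SHA-256 hex digest of the bytes."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify(content: bytes, expected_id: str) -> bool:
    """Check that the content still matches the ID it was stored under."""
    return layer_id(content) == expected_id


# Made-up bytes standing in for a layer's data.
layer = b"hypothetical layer data"
lid = layer_id(layer)

# Identical content always yields the identical ID, so two images that
# contain the same layer data naturally refer to the same ID.
same_again = layer_id(b"hypothetical layer data")

# Any change to the content produces a different ID, which is what makes
# an integrity check after a transfer possible.
tampered_ok = verify(layer + b"!", lid)

print(lid == same_again, verify(layer, lid), tampered_ok)
```

Because the ID is a function of the content, a random-ID collision between unrelated layers cannot occur in the way it could with generated UUIDs.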

The following figure shows the update for the Docker 1.10 release.


As you can see, all image layer IDs are cryptographic hashes, whereas the container ID is still a randomly generated UUID.

There are a few things to note about the new model:

1. Migration of existing images

2. Image and layer filesystem structures

(Note: the migration process is not covered here, since current installations all run the newer version.)

Containers and layers

The main difference between a container and an image is the top writable layer. All writes to the container, whether they add new data or modify existing data, are stored in this writable layer. When a container is deleted, the writable layer is also deleted. The underlying image remains unchanged.

Because each container has its own thin writable layer, and all changes to the container are written to this layer, multiple containers can share the same underlying image and still have their own data state. The following figure shows multiple containers using the same image.


Copy-on-write strategy

Sharing is a good way to optimize resources. People do this instinctively in daily life. For example, twins Jane and Joseph attend math classes at different times with different teachers, so they can share one exercise book. Now suppose Jane is assigned the homework on page 11 of the book. Jane copies page 11, completes the homework, and hands in her copy. The original exercise book is unchanged, and only Jane has a modified version of page 11.

Copy-on-write is a similar strategy of sharing and copying. In this strategy, processes that need the same data share a single instance of that data rather than each keeping its own copy. Only when a process needs to modify or write to the data does the operating system make a copy for that process to use. Only the writing process has access to the copy; all other processes continue to use the original data.
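The exercise-book analogy maps directly onto a few lines of Python. This is a minimal sketch: the class name `BookReader`, the page texts, and the way an answer is appended are all made up for illustration.

```python
class BookReader:
    """One reader of a shared book; pages are copied only when written to."""

    def __init__(self, shared_book):
        self._book = shared_book   # shared with other readers, never modified
        self._own_pages = {}       # private copies, created on first write

    def read(self, page):
        # Prefer the private copy if one exists, else read the shared page.
        return self._own_pages.get(page, self._book[page])

    def write(self, page, answer):
        # Copy the page on the first write only, then modify the copy.
        if page not in self._own_pages:
            self._own_pages[page] = self._book[page]
        self._own_pages[page] += " | answer: " + answer


# Hypothetical shared exercise book: page number -> printed text.
book = {10: "exercise 10", 11: "exercise 11"}

jane = BookReader(book)
joseph = BookReader(book)

jane.write(11, "x = 3")

print(jane.read(11))    # Jane sees her modified copy of page 11
print(joseph.read(11))  # Joseph still sees the original page
print(book[11])         # the shared book itself is unchanged
```

Only page 11 is ever duplicated, and only for Jane; every other page continues to exist once.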

Docker uses copy-on-write for both images and containers. The copy-on-write (CoW) strategy reduces the disk space that images consume and the time it takes for a container to start. The next sections look at how images and containers use CoW.

Sharing promotes smaller images

This section looks at image layers and copy-on-write. All image and container layers live in the Docker host's local storage area and are managed by the storage driver. On Linux hosts this area is usually located under /var/lib/docker/.

The Docker client reports on image layers during pull and push operations. The following shows the output of pulling an image:

$ docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
43db9dbdcb30: Pull complete
85a9cd1fcca2: Pull complete
c23af8496102: Pull complete
e88c36ca55d8: Pull complete
Digest: sha256:7ce82491d6e35d3aa7458a56e470a821baecee651fba76957111402591d20fc1
Status: Downloaded newer image for ubuntu:latest

From this output you can see that the command actually pulled 4 image layers. Each line lists an image layer and its UUID or cryptographic hash. Together, these 4 layers make up the Ubuntu image.


Each layer is stored in its own directory in the host's local storage area.

Prior to Docker 1.10, each layer was stored in a directory named after the image layer ID. Images pulled with Docker 1.10 and later are not stored this way. For example, the following shows an image being pulled from Docker Hub on a host running version 1.9.1, followed by a listing of the corresponding host directory:

$ docker pull ubuntu:15.04
15.04: Pulling from library/ubuntu
47984b517ca9: Pull complete
df6e891a3ea9: Pull complete
e65155041eed: Pull complete
c8be1ac8145a: Pull complete
Digest: sha256:5e279a9df07990286cce22e1b0f5b0490629ca6d187698746ae5e28e604a640e
Status: Downloaded newer image for ubuntu:15.04

$ ls /var/lib/docker/aufs/layers
47984b517ca9ca0312aced5c9698753ffa964c2015f2a5f18e5efa9848cf30e2
c8be1ac8145a6e59a55667f573883749ad66eaeef92b4df17e5ea1260e2d7356
df6e891a3ea9cdce2a388a2cf1b1711629557454fd120abd5be6d32329a0e0ac
e65155041eed7ec58dea78d90286048055ca75d41ea893c7246e794389ecf203

Note that the names of the four directories match the layer IDs of the pulled image. Now compare this with the same operations on a host running Docker 1.10.

$ docker pull ubuntu:15.04
15.04: Pulling from library/ubuntu
1ba8ac955b97: Pull complete
f157c4e5ede7: Pull complete
0b7e98f84c4c: Pull complete
a3ed95caeb02: Pull complete
Digest: sha256:5e279a9df07990286cce22e1b0f5b0490629ca6d187698746ae5e28e604a640e
Status: Downloaded newer image for ubuntu:15.04

$ ls /var/lib/docker/aufs/layers/
1d6674ff835b10f76e354806e16b950f91a191d3b471236609ab13a930275e24
5dbb0cbe0148cf447b9464a358c1587be586058d9a4c9ce079320265e2bb94e7
bef7199f2ed8e86fa4ada1309cfad3089e0542fec8894690529e4c04a7ca2d73
ebf814eccfe98f2704660ca1d844e4348db3b5ccc637eb905d4818fbfb00a06a

You can see that the names of the four directories do not match the image layer IDs.

Although image management differs between version 1.10 and earlier versions, all versions of Docker allow images to share layers. For example, if you pull an image that shares some layers with an image that has already been pulled, Docker recognizes this and downloads only the layers that are not available locally. After the pull, the two images share the common layers.
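The "download only what is missing" behavior amounts to a set difference that preserves layer order. The sketch below assumes layers are identified by opaque strings; the IDs `layer-a` through `layer-d` and the helper names are hypothetical, and no network transfer is modeled.

```python
def layers_to_download(image_layers, local_store):
    """Return the image's layers that are not stored locally, in order."""
    return [layer for layer in image_layers if layer not in local_store]


def pull(image_layers, local_store):
    """Simulate a pull: fetch the missing layers and record them locally."""
    missing = layers_to_download(image_layers, local_store)
    local_store.update(missing)
    return missing


# Hypothetical layer IDs; a derived image reuses the base image's layers
# and adds one layer of its own.
base_image = ["layer-a", "layer-b", "layer-c"]
derived_image = ["layer-a", "layer-b", "layer-c", "layer-d"]

local_store = set()

first = pull(base_image, local_store)      # nothing is local yet
second = pull(derived_image, local_store)  # only the new layer is missing

print(first)
print(second)
```

After both pulls the local store holds four layers in total, three of which are referenced by both images.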

You can see this for yourself. Starting with the Ubuntu image you just pulled, make a change to it and build a new image based on that change. One way to do this is with a Dockerfile and the docker build command.

The detailed procedure is omitted here.

Copying makes containers efficient

In the previous sections you saw that a container is a Docker image plus a thin writable container layer. The following figure shows a container based on an Ubuntu image.


All writes made in a container are stored in the thin writable container layer. The other layers are read-only image layers and cannot be modified. This means that multiple containers can safely share a single underlying image. The following figure shows multiple containers sharing an Ubuntu image. Each container has its own writable layer, but they all share the same Ubuntu image.

When a file that exists in the image is modified in a container, Docker uses the storage driver to perform a copy-on-write operation. The details of the operation depend on the storage driver. For the AUFS and overlay storage drivers, the copy-on-write operation works roughly as follows:

1. Search through the image layers for the file to update. The search starts at the top, newest layer and works down to the base layer.

2. Perform a copy-up operation on the first copy of the file that is found. The copy-up operation copies the file to the container's own thin writable layer.

3. Modify the copy of the file in the thin writable layer.
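The three steps above can be sketched as a small Python model. It is a simplification under stated assumptions: layers are dicts from path to content, the class name `Container` and the file names are made up, and nothing here reflects how AUFS or overlay implement the operation internally. The `copy_ups` counter exists only to make the behavior observable.

```python
class Container:
    """A container as read-only image layers plus one thin writable layer."""

    def __init__(self, image_layers):
        self.image_layers = image_layers  # ordered base -> top, shared, read-only
        self.rw_layer = {}                # this container's own writable layer
        self.copy_ups = 0                 # how many copy-up operations ran

    def read(self, path):
        if path in self.rw_layer:
            return self.rw_layer[path]
        for layer in reversed(self.image_layers):
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def modify(self, path, new_content):
        if path not in self.rw_layer:
            # Step 1: search the image layers from the newest down to the base.
            for layer in reversed(self.image_layers):
                if path in layer:
                    # Step 2: copy the first match up into the writable layer.
                    self.rw_layer[path] = layer[path]
                    self.copy_ups += 1
                    break
            else:
                raise FileNotFoundError(path)
        # Step 3: modify the copy that lives in the writable layer.
        self.rw_layer[path] = new_content


# Hypothetical image shared by two containers.
image = [{"/etc/app.conf": "mode=default"}, {"/usr/bin/app": "binary"}]
c1 = Container(image)
c2 = Container(image)

c1.modify("/etc/app.conf", "mode=custom")
c1.modify("/etc/app.conf", "mode=custom-2")  # file is already in c1's layer

print(c1.read("/etc/app.conf"), c2.read("/etc/app.conf"), c1.copy_ups)
```

In this model the second modification finds the file in the container's own layer, so no further copy-up takes place, and the other container keeps reading the image's original file.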

Btrfs, ZFS, and other drivers handle copy-on-write differently. You can find detailed descriptions of these drivers in later documents.

A container that writes a lot of data consumes more space than a container that does not. This is because most write operations consume new space in the container's thin writable top layer. If a container needs to write a lot of data, consider using a data volume.

A copy-up operation can incur a noticeable performance overhead. The size of the overhead depends on the storage driver in use. However, large files, many layers, and deep directory trees make the impact more noticeable. Fortunately, the operation only occurs the first time a particular file is modified. Subsequent modifications to the same file do not trigger a copy-up; they operate directly on the copy that already exists in the container layer.

Let's look at what happens when 5 containers are started from the changed-ubuntu image:


1. On a Docker host, run the docker run command 5 times.

This launches 5 containers based on the changed-ubuntu image. As each container is created, Docker adds a writable layer and assigns it a random UUID. This UUID is the value returned by the docker run command.

2. Run the docker ps command to confirm that the 5 containers are running.

The output shows 5 running containers, all sharing the same image. The container ID of each container is the leading part of the UUID assigned at creation.

3. List the contents of the local storage area.

Docker's copy-on-write strategy not only reduces the space consumed by containers, but also reduces the time it takes to start a container. At start time, Docker only needs to create the thin writable layer for each container. The following figure shows 5 containers sharing a single read-only image.
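The space saving is simple arithmetic, worked through below. The sizes are hypothetical round numbers chosen only for illustration (a 130 MB image and 1 MB written per container); they are not measurements of any real image.

```python
def total_disk_mb(image_mb, writable_mb, containers, shared):
    """Disk used by N containers, with or without a shared read-only image."""
    if shared:
        # One copy of the image plus one thin writable layer per container.
        return image_mb + containers * writable_mb
    # Without sharing, every container would carry a full image copy.
    return containers * (image_mb + writable_mb)


# Hypothetical sizes, for illustration only.
IMAGE_MB = 130
WRITABLE_MB = 1
CONTAINERS = 5

with_sharing = total_disk_mb(IMAGE_MB, WRITABLE_MB, CONTAINERS, shared=True)
without_sharing = total_disk_mb(IMAGE_MB, WRITABLE_MB, CONTAINERS, shared=False)

print(with_sharing)     # 135
print(without_sharing)  # 655
```

The same reasoning explains the faster start: creating a container only requires setting up its small writable layer, not copying the image.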

Data volumes and storage drivers

When a container is deleted, any data written to the container that is not stored in a data volume is deleted along with the container.

A data volume is a directory or file in the Docker host's filesystem that is mounted directly into a container. Data volumes are not controlled by the storage driver. Reads and writes to a data volume bypass the storage driver and operate directly on the host. You can mount any number of data volumes into a container. Multiple containers can also share one or more data volumes.

The following figure shows a Docker host running two containers. Each container has its own address space within the host's local storage area, under /var/lib/docker/... The host also has a shared /data directory that serves as a data volume and is mounted into both containers.

The data volume lives outside the part of the host's local storage that the storage driver controls. When a container is deleted, any data stored in the data volume persists on the Docker host.
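The difference in lifetime between the container layer and a data volume can be modeled in a few lines. This is a sketch with made-up names: a plain dict stands in for the host's /data directory, the class `Container` is hypothetical, and deleting the Python object stands in for deleting the container.

```python
class Container:
    """A container with a private writable layer and optional volume mounts."""

    def __init__(self, name, mounts=None):
        self.name = name
        self.rw_layer = {}                # goes away with the container
        self.mounts = dict(mounts or {})  # mount point -> host-side storage

    def write(self, path, content):
        # Paths under a mount point go straight to the host-side volume.
        for mount_point, volume in self.mounts.items():
            if path.startswith(mount_point + "/"):
                volume[path] = content
                return
        # Everything else lands in the container's own writable layer.
        self.rw_layer[path] = content

    def read(self, path):
        for mount_point, volume in self.mounts.items():
            if path.startswith(mount_point + "/"):
                return volume[path]
        return self.rw_layer[path]


# A dict standing in for the host's shared /data directory.
host_data = {}

c1 = Container("c1", {"/data": host_data})
c2 = Container("c2", {"/data": host_data})

c1.write("/data/report.txt", "kept on the host")
c1.write("/tmp/scratch.txt", "lives in c1's writable layer")

visible_in_c2 = c2.read("/data/report.txt")
scratch_in_layer = "/tmp/scratch.txt" in c1.rw_layer

# Deleting the container discards its writable layer; the volume remains.
del c1

print(visible_in_c2)
print(host_data)
```

The file written under /data is visible to the second container and survives the deletion, while the scratch file disappears together with the first container's writable layer.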



Summary

A Docker image is composed of multiple read-only layers, and these layers can be shared by multiple images, which saves space.

When a Docker container starts, Docker creates a thin writable layer on top of the image, and changes are made to it using the copy-on-write strategy. Copy-on-write is implemented by the storage driver; common drivers include AUFS and Device Mapper.

Data volumes bypass the storage driver, so they are unaffected by it and are more efficient.
