Docker in Practice: Storage structures for images and containers

I. The relationship between images, containers, and storage drivers

As has been said before, an image is a collection of programs and files, and a container is a running instance of an image.

To save storage space, Docker layers images and containers and shares data between them: different images can share the same layers, and each container gets a read-write layer on top of its image, which also speeds up container startup.

1. Image layers and the container layer

Each image consists of multiple image layers, which are stacked from bottom to top to form the container's root filesystem. Docker's storage driver manages these layers and presents them as a single unified filesystem.

When a container starts, Docker creates a thin, read-write container layer on top of the image. It is initially allocated no storage space; space is allocated from the storage pool only when files are created or modified.

Each running container has its own container layer, which holds the container's runtime data (all file changes). Because the image layers are read-only, multiple containers can share the same image.

When you delete a container, the Docker daemon deletes only the container layer; the image layers are preserved.
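A quick way to see this (a minimal sketch; the container name test is ours):

docker run --name test -d ubuntu sleep 3600
docker rm -f test        # deletes the container and its container layer
docker images ubuntu     # the ubuntu image and its layers are still present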

2. How images are stored

To distinguish image layers, Docker computes an ID for each one. Before Docker 1.10, this was a randomly generated UUID; from Docker 1.10 onward, the ID is a cryptographic hash computed from the layer's content. This avoids ID collisions and guarantees data integrity in operations such as pull, push, and load. Container layers are still identified by randomly generated IDs.
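On Docker 1.10 or later you can list an image's content-addressed layer digests with docker inspect (the digests shown here are illustrative placeholders):

docker inspect --format '{{json .RootFS.Layers}}' ubuntu
# ["sha256:aaaa...","sha256:bbbb...","sha256:cccc..."]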

Before Docker 1.10, image layers could not be shared between different images. From Docker 1.10 onward, any layers with identical content can be shared across images. Images downloaded before an upgrade may still use the old storage format, so the first time the Docker 1.10 daemon starts it automatically migrates them, recomputing each layer's ID with the hashing algorithm. This recalculation can take a long time, and the daemon cannot respond to other Docker commands while the migration runs, so relying on the automatic migration is not recommended in practice.

Instead, you can run Docker's official migration tool ahead of time, so that the Docker daemon can respond to commands immediately after the upgrade.

Process description:
Download the migration tool image: docker pull docker/v1.10-migrator
Run it: docker run --rm -v /var/lib/docker:/var/lib/docker docker/v1.10-migrator
Tip: when running the image, you must mount the host directory that stores the images, /var/lib/docker, into the container.

3. The copy-on-write strategy

Docker's storage drivers manage the image and container layers. Different drivers use different algorithms and management methods, but they all rely on two techniques: layered stacking and copy-on-write.

Copy-on-write combines sharing and copying: the system keeps only one copy of a given piece of data, and all operations access that copy. When an operation needs to modify or add data, the operating system first copies the affected data to a new location and modifies it there, while other operations continue to access the original. This technique saves image storage space and speeds up startup.

(1) Sharing reduces image storage space

All image and container layers are stored in the host's filesystem under /var/lib/docker/ and are managed by the storage driver.

When pulling the ubuntu image, you see output like the following, showing that this version of Ubuntu is composed of five image layers; downloading the image really means downloading those five layers. The layers live in the host's filesystem, each with its own directory. In versions before Docker 1.10, the directory name matched the layer's UUID; Docker 1.10 uses a new storage scheme.
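A representative pull transcript (the layer IDs and digest below are illustrative placeholders, not real values):

$ docker pull ubuntu
Using default tag: latest
latest: Pulling from library/ubuntu
aa11aa11aa11: Pull complete
bb22bb22bb22: Pull complete
cc33cc33cc33: Pull complete
dd44dd44dd44: Pull complete
ee55ee55ee55: Pull complete
Digest: sha256:ffff...
Status: Downloaded newer image for ubuntu:latest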

You can see that the directory names no longer match the digests of the downloaded layers.

Although the storage formats differ, image layers can be shared in all Docker versions. When an image is downloaded, the Docker daemon compares the layers in the image against the layers already present in the host's filesystem and downloads only those that are missing.

After building a changed-ubuntu image from a Dockerfile, you can list its layers and their sizes with docker history. The newly added layer occupies 0 B, which means the changed-ubuntu image consumes only a trivial amount of extra space and needs no new storage for the five ubuntu layers it shares.
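A sketch of what that looks like (the image ID, Dockerfile step, and layout are illustrative):

$ docker history changed-ubuntu
IMAGE          CREATED BY                             SIZE
ab12cd34ef56   /bin/sh -c #(nop) CMD ["/bin/bash"]    0 B
...            (the five shared ubuntu layers)        ...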

As you can see, sharing layers is how Docker saves storage space.

(2) Copying makes container startup faster

All writes in a container happen in the container layer; the image layers below are read-only. When a container modifies a file, the storage driver performs a copy-on-write operation.

If a container performs many modifications or creates many files, it occupies more storage space (when a container modifies existing content, the unit of copying differs between storage drivers). For write-heavy workloads it is best to use mounted volumes, whose reads and writes bypass the storage driver.

However, the overhead of this copy is incurred only once; subsequent operations work on the copy in the container layer with no additional overhead.

When a container is started, the Docker daemon only needs to create a new writable layer for it rather than copying all the image layers, which reduces container startup time.

4. Data volumes and storage drivers

Docker uses data volumes (files or directories on the host) for data persistence. When a container starts, its data volumes are mounted into it; all reads and writes on a data volume operate directly on the host's filesystem and are not managed by the storage driver.

When you delete a container, the data in its volumes is preserved on the host; everything outside the volumes is deleted.
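A minimal sketch (the volume name mydata and the paths are ours):

docker run --name app -v mydata:/data -d ubuntu sleep 3600
docker exec app sh -c 'echo hello > /data/file'
docker rm -f app                                        # the container layer is deleted...
docker run --rm -v mydata:/data ubuntu cat /data/file   # ...but this prints "hello": the volume survived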

The storage driver manages all containers created by the Docker daemon. Each driver is based on a Linux filesystem or a volume-management tool. The docker info command shows the storage driver currently in use.

On this machine, the Docker daemon uses the overlay storage driver with extfs as the backing filesystem; that is, the overlay driver works on top of extfs.

The storage driver can be set with --storage-driver=<driver name> when launching the Docker daemon. Common drivers include aufs, devicemapper, btrfs, zfs, and overlay.
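For example, either of the following (a sketch; pick whichever driver your kernel supports):

docker daemon --storage-driver=overlay &
# or, on newer releases, in the daemon configuration file /etc/docker/daemon.json:
# { "storage-driver": "overlay" }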

When choosing a storage driver, you need to consider stability and maturity. Linux distributions pick the driver that matches the operating system when installing Docker, so the default storage driver on a given machine is the most stable choice.

II. The AUFS storage driver

AUFS is a union filesystem: it merges different directories into a single directory to present one virtual filesystem.

AUFS was the first storage driver Docker used and is very stable, but many Linux distributions do not support it.

1. Images in AUFS

AUFS uses union-mount technology to stack directories and mount them at a single mount point; each directory in the stack is called a branch.

In Docker, each AUFS directory corresponds to one layer of the image, and the union mount point provides a unified access path to the outside.
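Outside of Docker, you can build a union mount by hand on a kernel with AUFS support (a sketch; the /tmp paths are ours):

mkdir -p /tmp/lower /tmp/upper /tmp/merged
mount -t aufs -o br=/tmp/upper=rw:/tmp/lower=ro none /tmp/merged
# /tmp/merged now shows the union of both branches; writes land in /tmp/upper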

2. Container reads and writes in AUFS

AUFS works at the file level: even if only a tiny part of a file changes, AUFS copies the entire file. When the file is very large or buried deep in the directory hierarchy, this noticeably hurts container performance. Before copying, AUFS must also locate the file; if it sits in a low image layer, the search itself takes considerable time. The copy of any given file, however, happens only once.

3. Deleting files in AUFS

Image layers are read-only, so files in them cannot actually be deleted. When a file is deleted in a container, AUFS creates a whiteout file in the container layer that hides the corresponding file in the image layer, implementing a "fake delete".
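On an AUFS host you can observe the whiteout in the container layer's diff directory (a sketch; the container and file names are placeholders):

docker exec <container> rm /etc/motd
ls -a /var/lib/docker/aufs/diff/<container-layer-id>/etc
# .wh.motd   <- the whiteout file masking /etc/motd in the image layer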

4. Configuring AUFS

The AUFS storage driver can be used on Linux systems with AUFS installed. This machine runs CentOS, which does not support AUFS, so the steps below would fail at the first one; they are written out here only as a reference procedure.

grep aufs /proc/filesystems checks whether this machine supports AUFS; if it is not supported there is no output
Launch the daemon with docker daemon --storage-driver=aufs & on the command line, or set the driver in the Docker daemon configuration file
Finally, use docker info to check whether AUFS was configured successfully

5. How AUFS stores images

With AUFS as the storage driver, both image layers and container layers are stored under /var/lib/docker/aufs/diff/, while /var/lib/docker/aufs/layers/ holds the image-layer metadata that describes how the layers stack. Each image and container layer has a description file there listing the names of all the layers below it.

For example, suppose an image has three layers A, B, and C, with A on top. Then /var/lib/docker/aufs/layers/A contains B and C, /var/lib/docker/aufs/layers/B contains C, and the metadata file for C, the lowest layer, is empty.
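On an AUFS host this is easy to verify (IDs are placeholders; in practice each lower-layer ID appears on its own line):

cat /var/lib/docker/aufs/layers/<id-of-A>
# <id-of-B>
# <id-of-C>
cat /var/lib/docker/aufs/layers/<id-of-C>
# (no output: the file is empty)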

6. How containers are stored

While a container is running, its filesystem is mounted at /var/lib/docker/aufs/mnt/<container-id>, so different containers have different mount points.

Every container (including stopped ones) has a subdirectory under /var/lib/docker/containers/ that holds its metadata and configuration files, including the log file <container-id>-json.log.

When a container is stopped, its directory still exists; it is removed only when the container is deleted. (The same applies to the container layer under /var/lib/docker/aufs/diff. Since CentOS cannot use AUFS, only the /var/lib/docker/containers/ directory can be demonstrated here.)

7. AUFS performance

In PaaS scenarios, AUFS is a good choice: it shares image layers between containers, speeds up container startup, and saves storage space, but it performs poorly under heavy write loads because whole files must be copied.

III. The Devicemapper storage driver

Because AUFS is not available on Red Hat, and Docker's popularity was growing fast, Red Hat and Docker Inc. jointly developed the devicemapper storage driver for Red Hat.

Devicemapper is a logical-volume management framework provided by the Linux 2.6 kernel. Docker's devicemapper storage driver is built on this framework and uses its allocate-on-demand and snapshot features to manage images and containers.

1. Images in devicemapper

Devicemapper stores containers and images on a virtual block device and operates on blocks rather than whole files.

How devicemapper creates an image (a thin-pool setup sketch follows this list):

Create a thin pool (on a block device or on sparse files)
Create a base device (a thin-provisioned device) and create a filesystem on it
Create the image by taking a snapshot of the base device (a snapshot is taken each time an image layer is created; storage is allocated from the thin pool via copy-on-write only when content changes)
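A minimal direct-lvm thin-pool sketch using LVM (the device /dev/xvdf and the volume names are ours; a production setup needs more care):

pvcreate /dev/xvdf
vgcreate docker /dev/xvdf
lvcreate --wipesignatures y -n thinpool docker -l 95%VG
lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG
lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta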

In devicemapper, each image layer is a snapshot of the layer below it; the lowest image layer is a snapshot of the base device (a devicemapper device, not a Docker image layer); and the container is a snapshot of the image.

The thin pool's block-device configuration is described in "Docker in Practice: the Docker daemon".

2. Reads and writes in devicemapper

The container layer is just a snapshot of the image: it stores not the data itself but a block-mapping table that points to the real data blocks in the image layers.

When an application issues a read request, devicemapper uses the block map to find which image-layer block holds the data, copies that block's contents into the container's memory area, and returns the requested data to the application.

Write operations under devicemapper come in two kinds: writing new data (handled by allocate-on-demand) and updating existing data (handled by copy-on-write). Devicemapper manages data at the block level, so both techniques work on blocks: an update touches only the changed blocks rather than the whole file.

In devicemapper, each block is 64 KB. When an application writes new data, devicemapper allocates one or more blocks in the container layer via allocate-on-demand, and the application writes its data into the newly allocated blocks.

When an application updates existing data, devicemapper locates the blocks that need modification, allocates new storage in the container layer, and uses copy-on-write to copy the affected blocks there for the application to update.

Devicemapper's use of allocate-on-demand and copy-on-write is transparent to applications in the container; they do not need to understand how devicemapper manages data blocks.

3. How devicemapper stores data

Because devicemapper works at the block level, it is hard to distinguish image layers from container layers on the host.

On the host, /var/lib/docker/devicemapper/mnt holds the mount points of the image and container layers, and /var/lib/docker/devicemapper/metadata holds their configuration information.

4. Performance of Devicemapper

Devicemapper uses allocate-on-demand with a 64 KB block size, so even a write smaller than 64 KB is allocated a full block; a container that performs many small writes will therefore see degraded performance.

Devicemapper is better than AUFS at updating small pieces of large files, but when performing a large number of small writes its performance is worse than AUFS.

IV. The Overlay storage driver

OverlayFS is a union filesystem similar to AUFS; it has additional advantages but is not yet mature.

1. Images in overlay

OverlayFS uses two directories on the Linux host, one lower (lowerdir) and one upper (upperdir), and union-mounts them to present a single filesystem (merged). Files in the container layer override files in the image layer.

When you run a container, the storage driver combines all the image-layer directories with the container layer: the image's topmost layer goes in lowerdir and the container layer in upperdir.

The overlay storage driver can only use two layers, so a multi-layer image cannot be mapped directly onto multiple OverlayFS layers. Instead, each image layer gets its own directory under /var/lib/docker/overlay and uses hard links to share data with lower layers.

After starting a container, you can see that the container layer is also stored under /var/lib/docker/overlay/; inside its directory are lower-id, merged, upper, and work.

The lower-id file contains the ID of the image's top layer (which is used as lowerdir).

The container layer is stored in the upper directory, which holds the data modified at runtime; the merged directory is the container's mount point, where lowerdir and upperdir are merged; the work directory is used internally by OverlayFS.

You can inspect the mount relationship with the mount command
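Representative output (the IDs are placeholders):

$ mount | grep overlay
overlay on /var/lib/docker/overlay/<container-id>/merged type overlay (rw,relatime,lowerdir=/var/lib/docker/overlay/<image-top-layer-id>/root,upperdir=/var/lib/docker/overlay/<container-id>/upper,workdir=/var/lib/docker/overlay/<container-id>/work)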

2. Images in overlay2

The overlay storage driver uses only the lowerdir directory to hold the image, relying on hard links within that directory to represent a multi-layer image when necessary. Overlay2 natively supports multiple lowerdirs, up to 128 layers.

In overlay2, images and containers are stored under /var/lib/docker/overlay2. The lowest image layer contains a link file and a diff directory: the link file holds the layer's shortened identifier (which works around mount option strings exceeding the page-size limit), and the diff directory holds the layer's contents. Every layer above the lowest contains link, lower, diff, merged, and work; a container layer is composed similarly.

The lower file records the composition of the layers below, with entries separated by colons and higher layers listed first. For example, if an image has three layers A, B, and C with A on top, then A's lower file contains B:C.
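You can verify this on an overlay2 host (IDs are placeholders; real entries use the short names stored in each layer's link file):

cat /var/lib/docker/overlay2/<id-of-A>/lower
# l/<short-id-of-B>:l/<short-id-of-C>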

3. Read and write operations in overlay

When an application reads a file, if the file exists in the container layer it is read from there directly; otherwise it is read from the image layer.

OverlayFS works at the file level rather than the block level, so the first time a file is modified in a container, the overlay/overlay2 driver copies the whole file up from the image layer to the container layer. Because OverlayFS has only two layers, it is faster than AUFS at searching for files deep in a directory tree.

When a file is deleted in a container, the overlay storage driver creates a whiteout file (or an "opaque" directory, for directories) in the container layer to hide the target file or directory in the image layer; nothing in the image layer is actually deleted.
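OverlayFS whiteouts are character devices with device number 0/0, visible in the container layer's directory (a sketch with overlay2 paths and placeholder IDs):

docker exec <container> rm /etc/motd
ls -l /var/lib/docker/overlay2/<container-layer-id>/diff/etc
# c--------- 1 root root 0, 0 ... motd   <- whiteout device masking the image-layer file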

4. Configuring overlay/overlay2

Using the overlay storage driver requires Linux kernel 3.18 or later on the host; overlay2 requires kernel 4.0 or later.

Process description:
Stop the Docker daemon: service docker stop or systemctl stop docker.service
Check the Linux kernel version: uname -r
Load the overlay kernel module (modprobe overlay) and verify it with lsmod | grep overlay
Configure the storage driver with docker daemon --storage-driver=overlay & on the command line, or in the Docker daemon configuration file
Use docker info to check whether overlay is in effect

With the overlay/overlay2 storage driver, Docker automatically creates the overlay mount points and, within them, the lowerdir, upperdir, merged, and workdir directories.

5. Performance of Overlay

Overall, the overlay/overlay2 storage drivers are fast and support page-cache sharing; write performance suffers only when files are very large. OverlayFS's copy-up operation is faster than AUFS's, because AUFS's many layers make copy-up expensive when the directory tree is deep. As before, data that is read and written heavily is best kept in data volumes, where all reads and writes bypass the storage driver and add no extra overhead.
