Docker Storage Driver Device Mapper Introduction


Device Mapper is a kernel-based framework that underpins many advanced volume management technologies on Linux. Docker's devicemapper storage driver leverages the framework's thin provisioning and snapshot capabilities for image and container management. To avoid confusion, this article uses "Device Mapper" to refer to the kernel framework and "devicemapper" to refer to Docker's storage driver.
  Note: Commercially Supported Docker Engine (CS-Engine) recommends the devicemapper storage driver on RHEL and CentOS.

An alternative to AUFS

Docker originally ran on Ubuntu and Debian, using AUFS as its storage backend. When Docker became popular, many companies that wanted to use it were running RHEL. Unfortunately, because the upstream Linux kernel does not include AUFS, RHEL does not support AUFS.
To change this, Red Hat developers investigated bringing AUFS into the mainline kernel. In the end, they decided that developing a new storage backend was a better idea, and that the existing Device Mapper technology should serve as its basis.
Red Hat partnered with Docker to contribute this new driver. As part of this collaboration, the Docker Engine was redesigned to make storage drivers pluggable. As a result, devicemapper became the second storage driver supported by Docker.
Device Mapper has been in the mainline Linux kernel since version 2.6.9 and is a core part of RHEL family releases. This means the devicemapper storage driver is built on stable code with a long production track record and strong community support.

Image layering and sharing

The devicemapper driver stores each image and container in its own virtual device, which is a thin-provisioned copy-on-write snapshot device. Device Mapper works at the block level rather than the file level, which means the devicemapper driver's thin-provisioning and copy-on-write operations manipulate blocks directly rather than entire files.
  Note: Snapshots are also referred to as thin devices or virtual devices.
When using devicemapper, Docker creates an image as follows:
• The devicemapper storage driver creates a thin pool.
The pool is created on a block device or on a loop-mounted sparse file.
• Next, a base device is created.
The base device is a thin device with a file system. You can check which file system the backend uses by looking at the Backing Filesystem value in the docker info output (see the example after this list).
• Each new image (and image layer) is a snapshot of the device beneath it.
These are thin-provisioned copy-on-write snapshots: they are empty when initialized, and they only consume space from the pool once data is written to them.
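If you only want the backing filesystem value, one quick way (this particular command pipeline is an illustration, not from the original article) is to filter the docker info output:

$ sudo docker info | grep "Backing Filesystem"

In the sample output shown later in this article, this line reports Backing Filesystem: xfs.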
With devicemapper, the container layer is a snapshot based on an image. Like image snapshots, a container snapshot is a thin-provisioned copy-on-write snapshot. The container snapshot holds all updates made to the container; devicemapper allocates space from the pool to the container layer on demand as data is written to the container.
The figure below shows a thin pool, a base device, and two images.
If you look closely at the figure, you can see that each image layer is a snapshot of the layer below it, and that the lowest image layer is a snapshot of the base device in the pool. The base device is created by Device Mapper and is not a Docker image layer.
A container, in turn, is a snapshot of an image. The figure below shows two containers within the full storage-driver hierarchy.

Read operations with devicemapper

The figure below shows the process of reading a block (at address 0x44f) in a container.
The application requests access to block 0x44f in the container.
Because the container is a thin snapshot of an image, it does not hold the data itself; instead it holds a pointer to the block in the image snapshot, lower in the image stack, where the data resides.
The storage driver follows that pointer to block 0xf33 in the snapshot of image layer a005.
devicemapper copies the contents of block 0xf33 from the image snapshot into the container's memory.
The storage driver returns the data to the requesting application.

Write operations

With the devicemapper driver, writing new data to a container is handled by an allocate-on-demand operation, while updates to existing data use a copy-on-write operation. Because Device Mapper is a block-level technology, both operations occur at the block level.
For example, when making a small change to a large file in a container, the devicemapper driver does not copy the entire file. It copies only the blocks that contain the content being modified; each block is 64KB in size.

Write Data

To write 56KB of new data to a container:
The application issues a request to write 56KB of data to the container.
The allocate-on-demand operation assigns a new 64KB block to the container snapshot.
If more than 64KB is written, multiple 64KB blocks are allocated to the container snapshot; a 200KB write, for example, needs four blocks.
The data is written to the newly allocated blocks.

Overwrite existing data

When existing data is modified for the first time:
The application issues a request to modify data in the container.
A copy-on-write operation locates the blocks that need to be updated.
New blocks are allocated to the container snapshot and the existing data is copied into them.
The modified data is then written to these newly allocated blocks.
The application in the container is unaware of these allocate-on-demand and copy-on-write operations, but they do add some latency to the application's reads and writes.

Configuring Devicemapper in Docker

devicemapper is the default Docker storage driver on some Linux distributions, including RHEL and its derivatives. Currently, the following distributions support this driver:
RHEL/CentOS/Fedora
Ubuntu
Debian
Arch Linux
By default, the Docker host runs devicemapper in loop-lvm configuration mode. This mode uses sparse files to create the thin pool used by image and container snapshots. It works out of the box with no additional configuration, but loop-lvm mode is not recommended for production deployments.
You can use the docker info command to check whether this mode is in use.

$ sudo docker info
Containers: 0
Images: 0
Storage Driver: devicemapper
 Pool Name: docker-202:2-25220302-pool
 Pool Blocksize: 65.54 kB
 Backing Filesystem: xfs
 [...]
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.93-RHEL7 (2015-01-28)
[...]

The output above shows that the Docker host is using the devicemapper storage driver. It also shows that loop-lvm mode is in use, because a Data loop file and a Metadata loop file exist under /var/lib/docker/devicemapper/devicemapper. These are loopback-mounted sparse files.
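To look at those loopback sparse files directly, you can list the directory; this is a purely illustrative check that assumes the default /var/lib/docker data root:

$ sudo ls -lh /var/lib/docker/devicemapper/devicemapper

Because the data and metadata files are sparse, the size reported here is the apparent size; running du -sh on the same directory shows how much space they actually consume.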

Configuring direct-lvm mode for production

The direct-lvm mode should be used for production deployments; it uses block devices to create the thin pool. The following steps describe how to configure a Docker host to use the devicemapper storage driver in direct-lvm mode.
  Note: If you have already run the Docker daemon on your Docker host and have images you want to keep, push them to Docker Hub or your private Docker Registry before performing the following steps.
The following steps create a logical volume configured as a thin pool to use as the backing storage pool. They assume you have a spare block device at /dev/xvdf with enough free space to complete the task. The device identifier and volume sizes may differ in your environment, and you should substitute values appropriate for your environment as you work through the procedure. The steps should also be performed while the Docker daemon is stopped.
1) Log in to the Docker host and stop the Docker daemon;
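For example, on a systemd-based host (an assumption; use the equivalent command for your init system) the daemon can be stopped with:

$ sudo systemctl stop docker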
2) Install the lvm2 and thin-provisioning-tools packages;
The lvm2 package provides userspace tools for managing logical volumes.
thin-provisioning-tools is used to activate and manage thin pools.

# On Ubuntu
$ sudo apt -y install lvm2 thin-provisioning-tools
# On CentOS
$ sudo yum install -y lvm2 device-mapper-persistent-data

3) Create a physical volume, replacing /dev/xvdf with your block device;

$ pvcreate /dev/xvdf

4) Create a volume group named docker;

$ vgcreate docker /dev/xvdf

5) Create two logical volumes named thinpool and thinpoolmeta;
In this example the data volume is 95% of the docker volume group's size; the remaining free space allows the data or metadata to be extended automatically.

$ lvcreate --wipesignatures y -n thinpool docker -l 95%VG
$ lvcreate --wipesignatures y -n thinpoolmeta docker -l 1%VG

6) Convert the volumes into a thin pool and a storage location for the thin pool's metadata;

$ lvconvert -y --zero n -c 512K --thinpool docker/thinpool --poolmetadata docker/thinpoolmeta

7) Configure autoextension of the thin pool via an lvm profile;

$ vi /etc/lvm/profile/docker-thinpool.profile

8) Specify the thin_pool_autoextend_threshold value;
This value is the percentage of used space at which lvm attempts to autoextend the available space (100 = disabled).

thin_pool_autoextend_threshold = 80

9) Specify the thin_pool_autoextend_percent value;
This value is the amount of space, as a percentage of the current size, to add to the thin pool when it is autoextended (0 = disabled).

thin_pool_autoextend_percent = 20

10) Check your work; your docker-thinpool.profile should now look similar to the following:
An example /etc/lvm/profile/docker-thinpool.profile file:

activation {
  thin_pool_autoextend_threshold=80
  thin_pool_autoextend_percent=20
}

11) Apply the new LVM profile;

$ lvchange --metadataprofile docker-thinpool docker/thinpool

12) Verify that the logical volume is being monitored;

$ lvs -o+seg_monitor

13) If the Docker daemon has been started on this host before, move the old graph driver directory out of the way;
Moving the graph driver directory removes all existing images, containers, and volumes from Docker's view. The following commands move the contents of /var/lib/docker to another directory.

$ mkdir /var/lib/docker.bk
$ mv /var/lib/docker/* /var/lib/docker.bk

14) Configure the Docker daemon with the devicemapper-specific options;
There are two ways to configure the devicemapper storage driver for the Docker daemon. You can pass the following flags when starting the daemon:

--storage-driver=devicemapper --storage-opt=dm.thinpooldev=/dev/mapper/docker-thinpool --storage-opt=dm.use_deferred_removal=true --storage-opt=dm.use_deferred_deletion=true

It can also be set in the daemon configuration file, for example the default /etc/docker/daemon.json:

{  "storage-driver": "devicemapper",   "storage-opts": [     "dm.thinpooldev=/dev/mapper/docker-thinpool",     "dm.use_deferred_removal=true",     "dm.use_deferred_deletion=true"   ]}

   Note: Always use the dm.use_deferred_removal=true and dm.use_deferred_deletion=true options to prevent unintentionally leaking mount points.
15) (Optional) If you use systemd and changed the daemon configuration file, reload the systemd configuration;

$ systemctl daemon-reload

16) Restart the Docker daemon.

$ systemctl start docker

After starting the Docker daemon, make sure you keep monitoring the free space in the thin pool and the volume group. Although the volume group autoextends, it can still fill up. You can monitor logical volumes with the lvs or lvs -a commands to see the data and metadata sizes, and you can monitor the volume group's free space with the vgs command.
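A minimal monitoring sketch for the host configured above might look like this:

$ sudo lvs -a
$ sudo vgs docker

The lvs output includes Data% and Meta% columns for docker/thinpool, and vgs reports the free space remaining in the docker volume group.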
When the threshold is reached, the automatic extension of the thin pool is logged; you can follow those log messages with the following command:

$ journalctl -fu dm-event.service

When you confirm that your configuration file is correct, you can delete the previous backup directory.

$ rm -rf /var/lib/docker.bk

You can also use the dm.min_free_space option. This value ensures that operations fail with a warning when the free space in the pool reaches or approaches the specified minimum. See the storage driver options reference for more details.
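For example, to reserve 10% of the pool (the percentage here is an arbitrary illustration), you could add the option alongside the other storage options shown above:

--storage-opt=dm.min_free_space=10%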

Examining the devicemapper structures on the host

You can use the lsblk command to view the device files that the devicemapper storage driver creates, and the pool built on top of them.

$ sudo lsblk
NAME                       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
xvda                       202:0    0     G  0 disk
└─xvda1                    202:1    0     G  0 part /
xvdf                       202:80   0    0G  0 disk
├─vg--docker-data          253:0    0    0G  0 lvm
│ └─docker-202:1-1032-pool 253:2    0    0G  0 dm
└─vg--docker-metadata      253:1    0     G  0 lvm
  └─docker-202:1-1032-pool 253:2    0    0G  0 dm

The diagram below shows the same information as the example above.
In the diagram, the pool is named docker-202:1-1032-pool and spans the data and metadata devices. The devicemapper pool name follows this format:

Docker-MAJ:MIN-INO-pool

MAJ, MIN, and INO are the major device number, minor device number, and inode number, respectively.
devicemapper also uses two important directories. The /var/lib/docker/devicemapper/mnt directory contains the mount points for the image and container layers. The /var/lib/docker/devicemapper/metadata directory contains one file for each image layer and container snapshot; the files hold each snapshot's metadata in JSON format.
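A quick, read-only way to look at both directories (purely illustrative):

$ sudo ls /var/lib/docker/devicemapper/mnt
$ sudo ls /var/lib/docker/devicemapper/metadata

Each file under the metadata directory is JSON and can be pretty-printed with a tool such as jq to inspect a snapshot's device ID and size.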

Device Mapper and Docker performance

Understanding allocate-on-demand and copy-on-write behavior gives you a complete picture of container performance under devicemapper.

Allocate-on-demand performance impact

The devicemapper storage driver uses an allocate-on-demand operation to assign new blocks to a container: each time an application writes to a previously unwritten location in the container, one or more blocks are allocated from the pool and mapped into the container.
All blocks are 64KB. A write of less than 64KB still causes a full 64KB block to be allocated, and a write larger than 64KB requires multiple 64KB blocks; a 300KB write, for example, allocates five blocks. This can hurt container performance, especially for workloads that issue many small writes. However, once a block has been allocated to the container, subsequent reads and writes operate on that block directly.

Copy-on-write Performance Impact

Each time a container updates existing data for the first time, the devicemapper storage driver performs a copy-on-write operation, copying the affected data from the image snapshot up into the container snapshot. This has a noticeable effect on container performance.
All copy-on-write operations work at 64KB granularity. Therefore, modifying 32KB of a 1GB file only copies a single 64KB block into the container. This gives block-level copy-on-write a clear performance advantage over file-level copy-on-write, which would copy the entire large file into the container layer.
In practice, however, if a container performs a large number of small writes (<64KB), devicemapper performs worse than AUFS.

Other Device Mapper Performance considerations

A few other factors also affect the performance of the devicemapper storage driver.
• Mode. The default mode for the devicemapper storage driver is loop-lvm. This mode uses sparse files and performs poorly; it is not recommended for production. In production environments use direct-lvm instead, so the storage driver writes directly to block devices.
• High-speed storage. For better performance, place the data and metadata files on high-speed storage such as SSDs. This can be direct-attached storage or a SAN or NAS array.
• Memory usage. devicemapper is not the most memory-efficient Docker storage driver. Starting n copies of the same container loads n copies of its files into memory, which affects the Docker host's memory footprint. As a result, devicemapper may not be the best choice for PaaS or other high-density use cases.
Finally, data volumes provide better and more predictable performance, so write-heavy workloads should write to data volumes.
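As a hedged illustration (the image name and host path are placeholders, not taken from the original article), a write-heavy database container could mount its data directory from the host as a volume:

$ docker run -d --name db -v /mnt/ssd/pgdata:/var/lib/postgresql/data postgres

Writes to /var/lib/postgresql/data then bypass the devicemapper copy-on-write layer and go directly to the host filesystem.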
