Deep understanding of Docker image size

Source: Internet
Author: User
Tags dockerfile example docker hub

Everyone says containers are great, but without Docker images, how dull Docker would be.

Do you still remember the image you pulled from Docker Hub when you first encountered Docker? On top of that first image, you ran the first container of your life. The foundational role of images is plain to see; in the world of Docker, it can be said: no image, no container.

Thinking further about Docker images, the following types probably come to mind:

1. System-level images: such as the Ubuntu, CentOS, and Debian images;

2. Tool stack images: such as the Golang, Flask, and Tomcat images;

3. Service-level images: such as the MySQL, MongoDB, and RabbitMQ images;

4. Application-level images: such as the WordPress and Docker Registry images.

To run a Docker container you need a Docker image, and to have a Docker image you must first download it. Since downloading images is involved, image storage naturally follows. On the subject of image storage, let's start with the size of a Docker image.

The following analyzes the size of a Docker image from three angles: the Dockerfile and images, the union file system, and image sharing relationships.

Dockerfile and Images

A Dockerfile is made up of multiple instructions. As you look into the relationship between a Dockerfile and an image, you will soon find that each instruction in the Dockerfile corresponds to one layer in the Docker image.

Take the following Dockerfile as an example:

FROM ubuntu:14.04
ADD run.sh /
VOLUME /data
CMD ["./run.sh"]

When docker build is run on the Dockerfile above, three new, independent layers are added on top of the ubuntu:14.04 image, one for each of the three instructions. From top to bottom, the new layers correspond to CMD, VOLUME, and ADD, with the ubuntu:14.04 image underneath.

With this initial understanding of the relationship between a Dockerfile and an image, we can look further at the size of each layer.
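The instruction-to-layer mapping described here can be sketched in a few lines of Python. This is only a model of the relationship, not how Docker itself parses a Dockerfile. The function name `layers_from_dockerfile` is hypothetical, and the sketch assumes one instruction per line with no comments or line continuations.

```python
def layers_from_dockerfile(text):
    """Return (base_image, new_layers) for a simple Dockerfile.

    Assumption: one instruction per line, no comments, no continuations.
    Every instruction after FROM is counted as one new layer, following
    the article's description.
    """
    base_image = None
    new_layers = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        instruction, _, argument = line.partition(" ")
        if instruction.upper() == "FROM":
            base_image = argument.strip()
        else:
            new_layers.append((instruction.upper(), argument.strip()))
    return base_image, new_layers


dockerfile = """\
FROM ubuntu:14.04
ADD run.sh /
VOLUME /data
CMD ["./run.sh"]
"""

base, layers = layers_from_dockerfile(dockerfile)
print(base)         # ubuntu:14.04
print(len(layers))  # 3
for instruction, argument in layers:
    print(instruction, argument)
```

The list is in build order, so the last entry (CMD) is the topmost layer.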

It has to be said that in a Docker image managed as layers, quite a few layers have a size of 0. Layers whose size is not 0 are, in the end, those in which the current file system was modified while the image was being built. There are two main ways this happens:

1. The ADD or COPY instruction: ADD and COPY add content to the image during docker build. As long as the content is added successfully, the size of the layer being built is the size of the added content. For the instruction ADD run.sh / above, the new layer's size is the size of the file run.sh.

2. The RUN instruction: RUN executes a command on top of the current, initially empty layer. If the command changes files on disk, all of those changes are stored in the current layer. For example, RUN echo DaoCloud does not modify the file system, so the layer size is 0 after the command runs. RUN wget http://abc.com/def.tar downloads the archive into the current directory, so the layer size is the incremental change to the file system, that is, the size of the def.tar file.
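As a rough illustration of these two rules, the sketch below models a file system as a dictionary from path to size in MB and treats a layer's size as the total size of the files a build step added or changed. The helper name `layer_size`, the snapshot contents, and the 30MB figure for def.tar are assumptions made up for the example; the 5MB figure for run.sh is the one the article uses later. Comparing files by size alone is a simplification.

```python
def layer_size(before, after):
    """Toy layer size: total MB of files added or changed by one build step.

    before/after map file paths to sizes in MB. A file counts as changed
    when its size differs, which is a simplification for this sketch.
    """
    return sum(size for path, size in after.items() if before.get(path) != size)


base = {"/bin/sh": 1}  # hypothetical starting file system

# ADD run.sh /  -> the layer holds the added file
after_add = {**base, "/run.sh": 5}
print(layer_size(base, after_add))  # 5

# RUN echo DaoCloud  -> nothing on disk changes
after_echo = dict(after_add)
print(layer_size(after_add, after_echo))  # 0

# RUN wget http://abc.com/def.tar  -> the layer holds the downloaded file
after_wget = {**after_echo, "/def.tar": 30}  # 30MB is an assumed size
print(layer_size(after_echo, after_wget))  # 30
```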

Union File System

Dockerfile instructions map one-to-one to image layers. Does that mean that, once docker build finishes, the total size of the image is the sum of the sizes of its layers? The answer is yes. Consider the earlier example again: if the ubuntu:14.04 image is 200MB and run.sh is 5MB, then the three new layers, from top to bottom, are 0, 0, and 5MB, and the resulting image size is indeed 0+0+5+200=205MB.

Although the final image size is the sum of its layers, it is important to note that the image size is not equal to the size of the file system content inside a container run from that image (excluding mounted files and virtual file systems such as /proc and /sys). The reason has a lot to do with the union file system.

First look at this simple Dockerfile example (assume there is a 100MB archive named compressed.tar in the same directory as the Dockerfile):

FROM ubuntu:14.04
ADD compressed.tar /
RUN rm /compressed.tar
ADD compressed.tar /

1. FROM ubuntu:14.04: the ubuntu:14.04 image is 200MB;

2. ADD compressed.tar /: the compressed.tar file is 100MB, so the current layer is 100MB and the total image size is 300MB;

3. RUN rm /compressed.tar: this deletes compressed.tar, but the deletion does not remove the file from the layer below. The current layer only records a deletion marker for compressed.tar, which ensures that the file is not visible through this layer. The current layer size is therefore 0, and the total image size is still 300MB;

4. ADD compressed.tar /: the compressed.tar file is 100MB, so the current layer is 100MB and the total image size is 300MB+100MB=400MB.

The analysis gives a total image size of 400MB. But if we run the image and execute du -sh in the container's root directory, the value shown is not 400MB but about 300MB. The main reason is that the union file system ensures that, although two layers contain compressed.tar, the container sees only one copy. This also explains a common observation: when a user runs a container from a very large image, even one of several GB, and checks the size of the root directory inside the container, it turns out to be only 500MB or even less.

At this point in the analysis, there is one thing everyone should pay close attention to: image size and container size are fundamentally different.
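The gap between the two numbers can be reproduced with a small simulation. The sketch below is a toy model of a union file system, not Docker's storage driver: each layer is a dictionary from path to size in MB, a deletion marker is represented by `None`, and the whole ubuntu:14.04 base is collapsed into a single 200MB entry. The names `WHITEOUT`, `image_size`, and `container_view` are hypothetical.

```python
WHITEOUT = None  # marker meaning "this layer hides the path"

# Layers from bottom to top, sizes in MB.
layers = [
    {"/rootfs": 200},               # FROM ubuntu:14.04 (modelled as one entry)
    {"/compressed.tar": 100},       # ADD compressed.tar /
    {"/compressed.tar": WHITEOUT},  # RUN rm /compressed.tar
    {"/compressed.tar": 100},       # ADD compressed.tar /
]


def image_size(layers):
    """Image size: every layer's content is stored, even if hidden later."""
    return sum(
        size
        for layer in layers
        for size in layer.values()
        if size is not WHITEOUT
    )


def container_view(layers):
    """Merged view: upper layers override or hide entries from lower ones."""
    view = {}
    for layer in layers:  # apply bottom to top
        for path, size in layer.items():
            if size is WHITEOUT:
                view.pop(path, None)
            else:
                view[path] = size
    return view


print(image_size(layers))                    # 400
print(sum(container_view(layers).values()))  # 300
```

The image stores both copies of compressed.tar, while the merged view keeps only the topmost one.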

Image Sharing Relationships

A single Docker image may be neither especially large nor especially small, but once the number of images grows, won't they put great storage pressure on the local disk? At an average of 500MB per image, wouldn't 100 images require 50GB of storage space?

The result is often not what we might expect. Docker is very well designed for image reuse, which saves the disk space taken up by images. This reuse shows up mainly in one way: multiple different Docker images can share the same layers.

Assume the local image store contains only the ubuntu:14.04 image, and we reuse that image as the base for two Dockerfiles:

Dockerfile for image1:

FROM ubuntu:14.04
RUN apt-get update

Dockerfile for image2:

FROM ubuntu:14.04
ADD compressed.tar /

Assume the images finally produced by docker build are named image1 and image2. Because both Dockerfiles are based on ubuntu:14.04, both image1 and image2 reuse the layers of the ubuntu:14.04 image. Assuming that RUN apt-get update modifies 20MB of file system content, the sizes of the three local images are as follows:

ubuntu:14.04: 200MB

image1: 200MB (ubuntu:14.04) +20MB=220MB

image2: 200MB (ubuntu:14.04) +100MB=300MB

If you simply add up the sizes of the three images, the result is 200+220+300=720MB. Because layers are reused, however, the disk space actually occupied is 200+20+100=320MB, a saving of 400MB. This is enough to show the great benefit of image reuse.

Summary

When learning Docker, three things are inseparable: the Dockerfile, Docker images, and Docker containers. Analyzing the size of a Docker image touches on all three. Image size may seem like a bland topic, but it is something that must be understood when optimizing images and limiting container disk usage.
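The arithmetic can be checked with a short sketch in which each image is a list of layer identifiers and shared layers are counted only once. The layer names (`ubuntu-base`, `apt-get-update`, `add-compressed-tar`) are hypothetical labels for illustration, and the whole ubuntu:14.04 image is treated as a single 200MB layer.

```python
# Sizes in MB, taken from the example above.
layer_sizes_mb = {
    "ubuntu-base": 200,
    "apt-get-update": 20,
    "add-compressed-tar": 100,
}

# Each image is the list of layers it is built from.
images = {
    "ubuntu:14.04": ["ubuntu-base"],
    "image1": ["ubuntu-base", "apt-get-update"],
    "image2": ["ubuntu-base", "add-compressed-tar"],
}

# Naive total: add up every image's size, counting shared layers repeatedly.
naive_total = sum(
    layer_sizes_mb[layer] for stack in images.values() for layer in stack
)

# Actual disk usage: each distinct layer is stored once.
unique_layers = {layer for stack in images.values() for layer in stack}
actual_total = sum(layer_sizes_mb[layer] for layer in unique_layers)

print(naive_total)                 # 720
print(actual_total)                # 320
print(naive_total - actual_total)  # 400
```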

This series analyzes the Docker image through the following articles:

1. Deep understanding of Docker image size

2. Actually, docker commit is simple

3. The difference between docker save and docker export, which has to be discussed

4. Why some container files cannot be moved

5. Breaking the VOLUME of the MNT Namespace container

You are welcome to follow the [Docker Source Code Analysis] official account, where more content will be presented.

Copyright notice: This is an original article by the blogger and may not be reproduced without the blogger's permission.
