Dockerfile multi-stage builds: principles and use cases

Docker 17.05 introduced multi-stage builds for Dockerfiles. In practice, a multi-stage build means that a Dockerfile may contain more than one FROM instruction. What is the point of that?

Why older versions of Docker did not support multiple FROM instructions

Before Docker 17.05, an image was always built from exactly one FROM instruction; several FROM instructions could not contribute to the same image. The reason lies in the nature of an image.

In an earlier introduction to Docker concepts, we said that a Docker image can be thought of as a compressed file containing the program you need and a filesystem. Strictly speaking, that is not accurate: an image is not a single file but a collection of files, and the most important of these are its layers.

In a Dockerfile, most instructions generate a layer, as in the two examples below:

    # Example 1: Dockerfile for the foo image
    # The base image already contains several layers
    FROM ubuntu:16.04
    # The RUN instruction adds one layer, in which git is installed
    RUN apt-get update \
      && apt-get install -y --no-install-recommends git \
      && apt-get clean \
      && rm -rf /var/lib/apt/lists/*

    # Example 2: Dockerfile for the bar image
    FROM foo
    # The RUN instruction adds one layer, in which nginx is installed
    RUN apt-get update \
      && apt-get install -y --no-install-recommends nginx \
      && apt-get clean \
      && rm -rf /var/lib/apt/lists/*

Suppose the base image ubuntu:16.04 already has 5 layers. The foo image built from the first Dockerfile then has 6 layers, and the bar image built from the second Dockerfile has 7.

Leaving ubuntu:16.04 and any other images aside, if foo and bar are the only two images on the system, how many layers are stored in total?

The answer is 7, not 13, because foo and bar share 6 layers. Layer sharing saves a great deal of disk space and network bandwidth. For example, if you already have the foo image locally and pull bar from a registry, only the last layer, the one missing locally, is downloaded; the rest of bar does not have to be pulled again. But how is layer sharing implemented?
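Before turning to that question, the counting above can be checked with a small Go sketch. It is only a model: the `store` type and the layer IDs are made up for illustration, and real layers are identified by content digests rather than by names.

```go
package main

import "fmt"

// store models the local layer store as a set of layer IDs.
type store map[string]bool

// add records an image's layers and returns the ones that were not already
// present, i.e. the layers a pull would actually have to download.
func (s store) add(layers []string) []string {
	var fetched []string
	for _, id := range layers {
		if !s[id] {
			s[id] = true
			fetched = append(fetched, id)
		}
	}
	return fetched
}

func main() {
	// Hypothetical layer IDs: five for ubuntu:16.04, one per RUN instruction.
	base := []string{"ubuntu-1", "ubuntu-2", "ubuntu-3", "ubuntu-4", "ubuntu-5"}
	foo := append(append([]string{}, base...), "git")
	bar := append(append([]string{}, foo...), "nginx")

	local := store{}
	fmt.Println(len(local.add(foo))) // 6: nothing was stored before
	fmt.Println(local.add(bar))      // [nginx]: only the last layer is new
	fmt.Println(len(local))          // 7 layers in total, not 13
}
```

Because the six layers of foo are stored once and reused by bar, adding bar costs a single extra layer.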

Each layer of a Docker image records only file changes. When a container starts, Docker combines the layers of the image into a single filesystem, a technique known as a union mount. If you are interested in the details, AUFS is a good place to start.
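The idea of stacking change-only layers can be sketched in Go as follows. This is a simplified model under our own assumptions: the `layer` type, the `merge` function and the file paths are hypothetical, and a real union filesystem such as AUFS overlays directories at mount time instead of copying entries into a map.

```go
package main

import (
	"fmt"
	"sort"
)

// layer is a simplified image layer: it records only the changes made
// relative to the layers below it.
type layer struct {
	files   map[string]string // path -> content added or modified
	deleted []string          // paths removed in this layer
}

// merge applies the layers from bottom to top and returns the file view
// that a container started from the image would see.
func merge(layers []layer) map[string]string {
	view := make(map[string]string)
	for _, l := range layers {
		for _, p := range l.deleted {
			delete(view, p)
		}
		for p, content := range l.files {
			view[p] = content
		}
	}
	return view
}

func main() {
	layers := []layer{
		{files: map[string]string{"/etc/os-release": "ubuntu 16.04", "/tmp/cache": "x"}},
		{files: map[string]string{"/usr/bin/git": "git binary"}, deleted: []string{"/tmp/cache"}},
		{files: map[string]string{"/usr/sbin/nginx": "nginx binary"}},
	}

	view := merge(layers)
	paths := make([]string, 0, len(view))
	for p := range view {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	fmt.Println(paths) // [/etc/os-release /usr/bin/git /usr/sbin/nginx]
}
```

Each layer only makes sense on top of the ones below it, which is why the order of the layers, and a single bottom layer, matter.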

The layers of an image depend on one another: during the union mount, the system needs to know what each layer's changes are applied on top of. This requires an image to have exactly one starting layer, a single root. That is why only one FROM instruction can contribute to an image: several FROM instructions would mean several roots, which is impossible. So why does Docker 17.05 allow a Dockerfile to use multiple FROM instructions?

What multiple FROM instructions mean

Multiple FROM instructions are not meant to create an image with several roots. The final image is still based only on the last FROM, and the earlier ones are discarded. So what are the earlier FROM instructions for?

Each FROM instruction starts a build stage, and a Dockerfile with several FROM instructions is a multi-stage build. Although the final image is the result of the last stage only, files can be copied from earlier stages into later ones, and that is the main benefit of multi-stage builds.

The most common use is to separate the build environment from the runtime environment. For example, to build a Go program we need the go command and the rest of the compilation toolchain, so our Dockerfile used to look like this:

    # Base image with the Go toolchain
    FROM golang:1.10.3
    # Copy the source code into the image
    COPY server.go /build/
    # Set the working directory
    WORKDIR /build
    # At image build time, run go build to compile the server program
    RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GOARM=6 go build -ldflags '-w -s' -o server
    # Set the container entry point to the server program
    ENTRYPOINT ["/build/server"]

The base image golang:1.10.3 is very large because it contains the whole Go toolchain and its libraries. At run time we only need the compiled server program, not the compile-time tools, so shipping such a large final image is wasteful.

The solution on Pulse Cloud is to separate compiling the program from packaging the image: use the Pulse Cloud build service, add a Go build tool as a build step, and compile the program in that step.

Finally, the compiled result is copied into the image, so the base image in the Dockerfile no longer needs to include the Go toolchain:

    # No Go toolchain needed
    FROM scratch
    # Copy the build result into the image
    COPY server /server
    # Set the container entry point to the server program
    ENTRYPOINT ["/server"]

Hint: scratch is a built-in keyword, not a real image. FROM scratch starts from a completely empty filesystem that contains no files. Because a compiled Go program (statically linked, as with CGO_ENABLED=0 above) does not need a runtime environment, no runtime libraries have to be installed. FROM scratch keeps the final image as small as possible: it contains only the server program.

Since Docker 17.05 there is a new solution, and the whole job can be done in a single Dockerfile:

    # Build stage
    FROM golang:1.10.3
    COPY server.go /build/
    WORKDIR /build
    RUN CGO_ENABLED=0 GOOS=linux GOARCH=amd64 GOARM=6 go build -ldflags '-w -s' -o server

    # Run stage
    FROM scratch
    # Copy the build result from the build stage into the current image
    COPY --from=0 /build/server /
    ENTRYPOINT ["/server"]

The key to this Dockerfile is the --from=0 option of the COPY instruction. In a Dockerfile with several FROM instructions, it copies files from an earlier stage into the current one, and 0 refers to the first stage. Instead of a number, we can also give a stage a name, for example:

    # Build stage, named builder
    FROM golang:1.10.3 as builder
    # ... omitted

    # Run stage
    FROM scratch
    # Copy the build result from the build stage into the current image
    COPY --from=builder /build/server /

Even more powerful, COPY --from can copy files not only from an earlier stage but also directly from an existing image. For example:

    FROM ubuntu:16.04
    COPY --from=quay.io/coreos/etcd:v3.3.9 /usr/local/bin/etcd /usr/local/bin/

Here we copy the etcd program straight out of the etcd image into our own image. When building our image we do not need to compile etcd from source; we simply take the officially built binary.
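The three kinds of value that --from accepts (a stage index, a stage name, an image reference) can be summarized in a short Go sketch. It is a rough model written for this article, not Docker's own code: `stage` and `resolveFrom` are hypothetical names, and the lookup order (index, then stage name, then image) is our reading of the behavior described above.

```go
package main

import (
	"fmt"
	"strconv"
)

// stage is a simplified build stage: the image named in FROM and the
// optional name given with "as <name>".
type stage struct {
	base string
	name string
}

// resolveFrom models how a COPY --from value is interpreted inside the
// stage with index current. Only earlier stages can be referenced.
func resolveFrom(stages []stage, current int, ref string) (string, error) {
	// 1. A number is the index of an earlier stage, counted from 0.
	if i, err := strconv.Atoi(ref); err == nil {
		if i < 0 || i >= current || i >= len(stages) {
			return "", fmt.Errorf("no earlier stage with index %d", i)
		}
		return fmt.Sprintf("stage %d (%s)", i, stages[i].base), nil
	}
	// 2. Otherwise look for an earlier stage with that name.
	for i := 0; i < current && i < len(stages); i++ {
		if stages[i].name != "" && stages[i].name == ref {
			return fmt.Sprintf("stage %d (%s)", i, stages[i].base), nil
		}
	}
	// 3. Otherwise treat the value as a reference to an existing image.
	return "image " + ref, nil
}

func main() {
	stages := []stage{
		{base: "golang:1.10.3", name: "builder"},
		{base: "scratch"},
	}
	for _, ref := range []string{"0", "builder", "quay.io/coreos/etcd:v3.3.9"} {
		src, err := resolveFrom(stages, 1, ref)
		if err != nil {
			fmt.Println(ref, "-> error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", ref, src)
	}
}
```

In this model, --from=0 and --from=builder resolve to the same build stage, while the etcd reference falls through to an image lookup.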

Some programs have no apt source, or the version in the apt source is too old, or only the source code is provided and you have to compile it yourself. For such programs we can conveniently use an existing Docker image as our base image. However, our software may depend on several such programs, and we cannot use both the nginx image and the etcd image as our base image at the same time (multiple roots are not supported). In that case, COPY --from is very convenient and practical.

Share reference: Pulse Cloud development Platform
