Dockerfile Best Practices

Original link: https://my.oschina.net/u/2612999/blog/1036388


While a Dockerfile simplifies the image build process and can be version-controlled, improper use causes problems of its own:

The image is too large. If you build or use images often, you will run into very large images, some of them several gigabytes.
The build takes too long. Every build can take a long time, which is a serious problem wherever images are built repeatedly (for example, in unit testing).
Repeated work. Most of the content is identical from one build to the next, yet it is redone every time, wasting time and resources.

General guidelines and recommendations



Containers should be ephemeral

The container model is a process, not a machine; it needs no boot-time initialization. A container should start when needed, stop when done, and be removable and rebuildable with minimal setup and configuration.

Use a .dockerignore file



During a Docker build, excluding useless files and directories — the .git directory, for example — speeds up the build. The syntax of .dockerignore is similar to that of .gitignore.
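As an illustrative sketch (the entries are assumptions for a typical Node.js project, not from the original article), a .dockerignore might look like this:

```
# Version control metadata
.git
.gitignore
# Dependencies reinstalled inside the image
node_modules
# Logs and local artifacts
*.log
# Not needed inside the image itself
Dockerfile
.dockerignore
```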



Avoid installing unnecessary packages

To reduce image complexity, image size, and build time, avoid installing packages you do not actually need.



A container should run a single process. Containers isolate applications and their data; running different applications in different containers makes scaling the cluster and reusing containers much easier. When multiple applications need to interact, combine them with links or use docker-compose.

Minimize the number of layers



You have to balance Dockerfile readability against the number of image layers; using too many layers is not recommended.

Sort multi-line arguments alphabetically



Sorting multi-line arguments alphabetically helps you avoid duplicate packages and makes the Dockerfile easier to read. apt-get update should be combined with apt-get install in a single instruction, using a backslash (\) for line continuation. For example:


RUN apt-get update && apt-get install -y \
  bzr \
  cvs \
  git \
  mercurial \
  subversion
Building the Cache


Each instruction in a Dockerfile commits its result as a new image, and the next instruction builds on top of the image produced by the previous one. If an image with the same parent image and the same instruction already exists (ADD and COPY get extra checks), Docker reuses the existing image instead of executing the instruction — this is the build cache.



Therefore, to use the cache effectively, keep the Dockerfile stable: put the parts that rarely change at the top and the parts that change frequently at the end.
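As a sketch of this ordering (a hypothetical Python service; image and file names are illustrative), the rarely-changing steps come first so they stay cached, and the frequently-changing application code comes last:

```dockerfile
FROM python:3-slim

# Rarely changes: system packages, cached across most builds
RUN apt-get update && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*

# Changes only when the dependency list changes
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r /app/requirements.txt

# Changes on every commit: placed last so the layers above stay cached
COPY . /app/
CMD ["python", "/app/main.py"]
```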



If you do not want to use the cache, add the parameter --no-cache=true when you run docker build.



Docker decides whether a cached image can be used as follows: starting from a parent image that already exists in the cache, it compares all child images and checks whether the instruction that built each one is exactly the same as the current instruction; if not, the cache is invalidated. In most cases, comparing the instructions in the Dockerfile is enough. Some instructions, however, require further checks: for ADD and COPY, the files being added to the image are also examined, and a checksum of each file is compared. The cache check does not look at files inside the container — for example, files updated inside the container by RUN apt-get -y update are not examined when deciding on a cache match.

Dockerfile instructions

FROM



Use an image from the official repositories as your base image. The Debian image is recommended: it is kept around 100 MB in size and is still a full distribution.

RUN



Write complex or long RUN statements across multiple lines ending in \ to improve readability and maintainability.



Always run apt-get update and apt-get install together in one instruction; otherwise the cached update layer can cause apt-get install to misbehave and install stale packages.
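A sketch of the anti-pattern and the fix — the danger is that a separate RUN apt-get update layer gets cached, so later builds resolve packages against a stale index:

```dockerfile
# Anti-pattern: this layer is cached, so a later change to the
# install line below may resolve against an outdated package index
#   RUN apt-get update
#   RUN apt-get install -y curl

# Correct: a single instruction, so the index is refreshed whenever
# the package list changes
RUN apt-get update && apt-get install -y curl
```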



Avoid running apt-get upgrade or dist-upgrade: in an unprivileged container, many essential packages cannot upgrade properly. If the base image is out of date, contact its maintainer instead. The recommended pattern is apt-get update && apt-get install -y package-a package-b: update the package index first, then install the latest packages.


RUN apt-get update && apt-get install -y \
    aufs-tools \
    automake \
    build-essential \
    curl \
    dpkg-sig \
    libcap-dev \
    libsqlite3-dev \
    mercurial \
    reprepro \
    ruby1.9.1 \
    ruby1.9.1-dev \
    s3cmd=1.1.* \
 && rm -rf /var/lib/apt/lists/*


In addition, removing /var/lib/apt/lists reduces the image size.



Note: the official Ubuntu and Debian images run apt-get clean automatically, so you do not need to call it explicitly.

CMD



The exec form, CMD ["executable", "param1", "param2"], is recommended. If the image is used to run a service, use something like CMD ["apache2", "-DFOREGROUND"]; this form applies to any service-style image.
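The two CMD forms can be sketched as follows — the exec form runs the program directly (it becomes PID 1 and receives signals), while the shell form wraps it in /bin/sh -c:

```dockerfile
# Recommended exec form: apache2 runs as PID 1 and receives signals
CMD ["apache2", "-DFOREGROUND"]

# Shell form: equivalent to /bin/sh -c "apache2 -DFOREGROUND";
# the shell, not apache2, becomes PID 1
# CMD apache2 -DFOREGROUND
```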



ENTRYPOINT

ENTRYPOINT should hold the image's main command, with CMD supplying the default arguments. Using s3cmd as an example:


ENTRYPOINT ["s3cmd"]
CMD ["--help"]


Get help:


docker run s3cmd


or execute the command:


docker run s3cmd ls s3://mybucket


This is useful when the image name is the same as the program it runs.



ENTRYPOINT can also launch a custom script. Define the script:


#!/bin/bash
set -e

if [ "$1" = 'postgres' ]; then
    chown -R postgres "$PGDATA"

    if [ -z "$(ls -A "$PGDATA")" ]; then
        gosu postgres initdb
    fi

    exec gosu postgres "$@"
fi

exec "$@"


Note: the script uses exec so that the application it finally launches runs as PID 1 inside the container, which allows the container to receive Unix signals.


COPY ./docker-entrypoint.sh /
ENTRYPOINT ["/docker-entrypoint.sh"]


This script provides users with a variety of ways to interact with Postgres:



You can simply start Postgres:


docker run postgres


or run Postgres and pass in the parameters:


docker run postgres postgres --help


You can even start a completely different program from the image, such as Bash:


docker run --rm -it postgres bash
EXPOSE


Use the default port whenever possible: for example, the Apache web server uses EXPOSE 80, and MongoDB uses EXPOSE 27017.

ENV

ENV can update the PATH environment variable; for example, ENV PATH /usr/local/nginx/bin:$PATH ensures that CMD ["nginx"] runs correctly. ENV can also supply environment variables a program requires, such as Postgres's PGDATA, and can record version information so that it is easier to maintain:


ENV PG_MAJOR 9.3
ENV PG_VERSION 9.3.4
RUN curl -SL http://example.com/postgres-$PG_VERSION.tar.xz | tar -xJC /usr/src/postgress && …
ENV PATH /usr/local/postgres-$PG_MAJOR/bin:$PATH
ADD or COPY


Although ADD and COPY are similar, COPY is recommended. COPY supports only basic copying of local files into the container, which makes it more predictable; ADD has extra features, such as automatic extraction of tar files and support for URLs. The best use of ADD is automatically extracting a local tarball into the image.



If your Dockerfile uses multiple files, give each its own COPY instruction. That way, a change to one file invalidates the cache only from that instruction onward.
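A sketch of the idea, with illustrative file names: because requirements.txt has its own COPY instruction, a change to the application code alone leaves the dependency-installation layer cached:

```dockerfile
FROM python:3-slim
WORKDIR /app

# Separate COPY: a change to requirements.txt invalidates the cache
# from here on, but a change to application code does not
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Application code, copied last
COPY . .
```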



To keep the image small, avoid using ADD to fetch a file from a URL. Instead, use wget or curl in a RUN instruction and delete the file when it is no longer needed.


RUN mkdir -p /usr/src/things \
    && curl -SL http://example.com/big.tar.gz \
    | tar -xzC /usr/src/things \
    && make -C /usr/src/things all
VOLUME


VOLUME should be used for any mutable data: database files, code repositories, or files and directories created by the container.

USER



If a service can run without privileges, use the USER instruction to switch to a non-root user. First create the user and group, for example with RUN groupadd -r mysql && useradd -r -g mysql mysql, then switch with USER mysql.
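The steps above can be sketched as a Dockerfile fragment (the mysql user is the article's example; the base image is an assumption):

```dockerfile
FROM debian:stable-slim

# Create an unprivileged group and user for the service
RUN groupadd -r mysql && useradd -r -g mysql mysql

# All later RUN, CMD, and ENTRYPOINT instructions run as this user
USER mysql
```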



Avoid using sudo to elevate privileges: it causes more problems than it solves. If you really need such functionality, consider gosu instead.



Finally, do not switch back and forth between users; it adds unnecessary layers.

WORKDIR



For clarity and maintainability, use WORKDIR to define the working path. Use WORKDIR instead of instructions like RUN cd ... && do-something.
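A sketch contrasting the two styles (paths and file names are illustrative):

```dockerfile
FROM debian:stable-slim

# Preferred: WORKDIR applies to every subsequent instruction
WORKDIR /usr/src/app
RUN touch built-here.txt        # created inside /usr/src/app

# Discouraged: the cd only affects this single RUN instruction
# RUN cd /usr/src/app && touch built-here.txt
```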



ONBUILD

All other Dockerfile instructions exist to customize the current image; only ONBUILD exists to help others customize images built on top of yours.



The ONBUILD instruction registers trigger instructions that execute when the image is later used as a base image (that is, when the FROM of some Dockerfile names the current image). Instructions defined with ONBUILD run right after the FROM instruction of the downstream Dockerfile. Any of the instructions described above can be used with ONBUILD, which lets you perform operations that vary with the environment and makes the image more generic.



Note: instructions defined in ONBUILD are not executed during the build of the current image. You can inspect an image's ONBUILD instructions via the OnBuild key in the output of docker inspect <image>. ONBUILD instructions execute as part of the FROM instruction of the Dockerfile that references the image, in the order in which they were defined; if any of them fails, the FROM instruction is aborted and the whole build fails, and the build proceeds normally once all of them have completed successfully. ONBUILD instructions are not inherited further down: they are cleared once the image that references them has been built, so grandchild images do not re-run them. ONBUILD cannot be nested (ONBUILD ONBUILD ADD . /data is not allowed), and it cannot trigger FROM or MAINTAINER instructions. For example, suppose a Dockerfile creates an image image-a with the following content:


[...]
ONBUILD ADD . /app/src
ONBUILD RUN /usr/local/bin/python-build --dir /app/src
[...]


If you then create a new image based on image-a, the ONBUILD instructions run automatically when the new Dockerfile says FROM image-a; it is as if the two instructions had been appended right after FROM:


FROM image-a
# Automatically runs the following
ADD . /app/src
RUN /usr/local/bin/python-build --dir /app/src
Usage scenarios

Node.js


Suppose we want to build an image for an application written in Node.js. As we all know, Node.js uses npm for package management, and all dependencies, configuration, startup information, and so on live in package.json. After checking out the code, you run npm install to fetch the dependencies, and then start the application with npm start. The Dockerfile is therefore usually written as:


FROM node:slim
RUN mkdir /app
WORKDIR /app
COPY ./package.json /app
RUN ["npm", "install"]
COPY . /app/
CMD ["npm", "start"]


Put this Dockerfile in the root of the Node.js project, build the image, and you can run the container directly. But what if there is a second, similar Node.js project? Copy this Dockerfile into it. A third project? Copy it again. The more copies of the file there are, the harder version control becomes. And maintenance gets worse, as the following scenario shows.



Suppose a problem turns up in the first project's Dockerfile during development — a typo, say, or an extra package that needs to be installed. The developer fixes the Dockerfile and rebuilds, and the problem is solved for the first project. But the second project's Dockerfile, although originally copy-pasted from the first, is not fixed automatically just because the first one was.



So can we make a base image and have every project build on it? Then, when the base image is updated, each project inherits the update simply by rebuilding, without touching its own Dockerfile. Let's see how that works out. The Dockerfile above becomes:


FROM node:slim
RUN mkdir /app
WORKDIR /app
CMD ["npm", "start"]


Here we have pulled out the project-specific build instructions and moved them into each project. Assuming the base image is named my-node, each project's own Dockerfile becomes:


FROM my-node
COPY ./package.json /app
RUN ["npm", "install"]
COPY . /app/


After the base image changes, each project rebuilds with this Dockerfile and inherits the update.



So, problem solved? No — only half solved. What if something in this project-level Dockerfile needs adjusting, for example extra parameters for npm install? That RUN line cannot be moved into the base image, because it involves the current project's ./package.json. Do we change every project all over again? Building the base image this way only handles changes to the first four instructions of the original Dockerfile; changes to the last three cannot be handled at all.



ONBUILD solves this problem. Let's rewrite the base image's Dockerfile using ONBUILD:


FROM node:slim
RUN mkdir /app
WORKDIR /app
ONBUILD COPY ./package.json /app
ONBUILD RUN ["npm", "install"]
ONBUILD COPY . /app/
CMD ["npm", "start"]


This time we are back to the original Dockerfile, except that the project-specific instructions now carry the ONBUILD prefix, so those three lines are not executed when the base image itself is built. Each project's Dockerfile then becomes simply:


FROM my-node


Yes, just that one line. When you build the image in a project directory with this one-line Dockerfile, the three ONBUILD lines from the base image kick in: they copy the current project's code into the image and run npm install for this project, producing the application image.

Maven



Compiled projects — Java, Go, and the like — can use the ONBUILD instruction to streamline their Dockerfiles in the same way.



Write the ONBUILD Dockerfile as follows:


FROM maven:3-jdk-8

RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app

ONBUILD ADD . /usr/src/app

ONBUILD RUN mvn install


Then every project that relies on Maven for its build can reduce its Dockerfile to:


FROM maven:3.3-jdk-8-onbuild
CMD ["java", "-jar", "/usr/src/app/target/demo-1.0-SNAPSHOT-jar-with-dependencies.jar"]
Official ONBUILD Dockerfile examples: Go, Perl, Hy, Ruby.
