"Editor's note" Shopify is a provider of online shop solutions company, the number of shops currently serving more than 100,000 (Tesla is also its users). The main frame of the website is Ruby on rails,1700 kernel and 6TB RAM, which can respond to 8,000 user requests per second. In order to expand and manage the business more easily, Shopify started using Docker and CoreOS technology, and Shopify software engineer Graeme Johnson will write a series of articles to share his experience, this is the second in the series, This paper focuses on how to use container technology in the production environment of Shopify.
This article was translated collaboratively by translation group 1 (QQ group 230365411). Thanks to the volunteers who contributed: Wang Daolong, Sun Hongliang, Wu Jingrun, Wong, Zhou Jingmao, and Zhao Wenju. New members are welcome to join!
The translation follows:
Why use container technology?
Before we dive into the mechanics of building containers, let's discuss the motivation. Containers have the potential to do for the data center what consoles did for gaming. In the early days of PC gaming, you usually had to install video or audio drivers before you could start playing. Game consoles offered a different experience, one similar to Docker:
Predictable: the game cartridge is self-contained and ready to run, with nothing to download or update.
Fast: cartridges use read-only memory, so they are lightning quick.
Simple: cartridges are robust and, to a large extent, kid-proof; they are true plug and play.
Predictable, fast, and simple are all good things at scale. Docker containers provide the same kind of building block: they make our data center easier to run and push applications toward being self-contained, ready-to-run units, like cartridges for a game console.
Bootstrapping
Containerizing requires both development and operations skills. First, talk to your operations team; you need to be sure that your container fully replicates the production environment. If you run OS X (or Windows) on your desktop but deploy on Linux, use a virtual machine (such as Vagrant) as a local test environment. The first step is to update your operating system and install the supporting packages. Pick a base image that matches your production environment (we use Ubuntu 14.04), and make no mistake: you do not want to wrestle with containerization and an operating system/package upgrade at the same time.
Choosing a container type
On container type, Docker gives you plenty of room to choose, from "thin" single-process containers to "fat" containers that feel like a traditional virtual machine (for example, Phusion's baseimage).
We chose the "thin" path and strip extraneous components out of our containers. Although the choice is a judgment call, we prefer thin containers because their simplicity means less CPU and memory consumed. This approach is explained in detail on the Docker blog.
Environment Configuration
In production, we use Chef to manage the nodes of the system. We could easily run Chef inside each container, but there are some services we do not want running in every container, such as the log indexing service and the stats collection service. Running Chef unchanged would pointlessly install duplicate copies of those services in every container, and since we could not tolerate that wasted duplication, we run a single copy of these services on each Docker host and share them across containers.
The key to making containers lightweight was converting our Chef run scripts into a Dockerfile (part of which we later replaced with a custom Docker build process, covered later in the series). The exercise was a godsend: it gave operations a chance to evaluate what the production environment actually needs and to review and prune what accumulates over a system's lifetime. Be as ruthless as possible in throwing things out, while keeping the code review process as careful as possible.
In practice, the process is not as hard as it sounds. Our team ended up with a 125-line Dockerfile that defines the base image shared by all containers at Shopify. The base image contains 25 packages, spanning programming-language runtimes (Ruby, Python, and Node), development tools (Git, Vim, build-essential, and Go), and shared libraries. It also ships a set of utility scripts for common tasks, such as launching Ruby with tuned parameters or sending monitoring data to Datadog.
On top of this base image, applications can easily add their own specific requirements. Even our biggest application only adds a few operating-system dependencies, so overall the base image stays relatively simple and lean.
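An illustrative excerpt of what such a base-image Dockerfile might contain; the package list is abridged and assumed, not Shopify's actual file:

    # Illustrative excerpt of a shared base image; the package list is assumed.
    FROM ubuntu:14.04
    RUN apt-get update && apt-get install -y \
          ruby python nodejs \
          git vim build-essential golang \
        && apt-get clean
    # Shared helper scripts (e.g. a tuned Ruby launcher, Datadog emitters)
    COPY files/usr/local/bin/ /usr/local/bin/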
The Rule of 100 Containers
When choosing which services to containerize, imagine that you have 100 small containers running on the same host, then ask whether you really need 100 copies of a given service, or whether sharing a single copy on the host is better.
Here are some examples of how the Rule of 100 shaped our containers:
Log indexing: logs are critical for diagnosing errors, especially when a container has exited and its filesystem is gone. We deliberately avoided changing applications' logging behavior, such as forcing them to redirect to syslog, and let them keep writing logs to the filesystem. Running 100 log agents seemed like the wrong approach, so we built a host-side daemon to handle the important tasks below:
it runs on the host and subscribes to Docker events; when a container starts, it configures the log indexer to watch the container, and when the container is destroyed it removes the indexing instruction. To make sure all logs are indexed before a container exits, you may need to delay the container's destruction slightly.
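A minimal host-side sketch of such a daemon, assuming a modern Docker CLI (the --format flag) and two hypothetical helper commands:

    #!/bin/bash
    # Sketch only: keep the log indexer's watch list in sync with running
    # containers. register_log_index / remove_log_index are hypothetical helpers.
    docker events --filter 'type=container' --format '{{.Status}} {{.ID}}' |
    while read -r status id; do
      case "$status" in
        start)   register_log_index "$id" ;;
        destroy) remove_log_index "$id" ;;
      esac
    done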
Statistics: Shopify generates runtime statistics at several levels (system, middleware, and application), relayed either by agents or by application code.
Many of our statistics travel over StatsD. Fortunately, we can configure the host-side Datadog agent to accept traffic from containers, and with appropriate configuration pass the StatsD address into each container.
Since a container is essentially a process tree, a host-side system monitoring agent can see container boundaries, so sharing a single system monitor comes for free.
For a more container-centric view, consider the Datadog integration for Docker, which adds Docker metrics to the host-side monitoring agent.
Most application-level metrics keep working as before, either by sending events via StatsD or by talking directly to other services. It is important to give each container a descriptive name, which also makes metrics easier to read.
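For instance, the host's StatsD endpoint can be handed to a container through an environment variable (the variable name and the docker0 bridge address below are assumptions, not Shopify's actual configuration):

    # 172.17.42.1 was Docker's long-time default bridge address, where the
    # host-side Datadog agent listens for StatsD; STATSD_ADDR is a hypothetical
    # variable the application reads at boot.
    docker run -d --name unicorn-1 \
      -e STATSD_ADDR=172.17.42.1:8125 \
      docker-registry.internal/shopify-app:latest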
Kafka: we use Kafka to publish events from the Shopify stack to our partners in real time. Our Ruby on Rails code generates messages and puts them on a SysV message queue; a daemon written in Go takes messages off the queue and sends them to Kafka. This keeps Ruby process time down and lets us better ride out Kafka server outages. The downside is that SysV message queues are part of the IPC namespace, so we cannot connect container to host through them. Our solution was to add a socket endpoint on the host that places messages into the SysV queue.
Applying the Rule of 100 takes some flexibility. In some cases you only need to write a little "glue" between components; in others, configuration alone achieves the goal. In the end, you should arrive at containers holding exactly what your application needs to run, plus a host environment that provides the Docker runtime and the shared services.
Putting your application into a container
With the environment ready, we can turn our attention to containerizing the application itself.
We lean toward thin containers that each do exactly one thing. For example, a Unicorn master and its workers serving web requests (Unicorn is an HTTP server optimized for Unix and fast local-network clients), or Resque workers serving a specific queue (Resque uses Redis to queue background jobs and run them; it is one of the most common background-job tools for Rails). Thin containers allow fine-grained scaling to match demand. For example, we can increase the number of Resque workers checking for spam in response to an attack.
We find that some standard conventions are useful for the layout of code in a container:
the application is always rooted at /app inside the container;
the application usually exposes its service on a single port.
We also established conventions for a container's git repo:
/container/files holds a file tree that is copied directly into the container when it is built. For example, to request Splunk indexing of the application's logs, adding a /container/files/etc/splunk.d/inputs.conf file to your git repo is enough. (This transfers responsibility for controlling log indexing to the developer, a subtle but significant change that used to be an operations job.);
/container/compile is a shell script that compiles your application and produces a ready-to-run container. Creating this file and adapting it to your application is the most complex part;
/container/roles.json holds, in machine-readable form, the command lines used to run the workloads. Many of our applications run the same codebase in multiple roles, some handling web traffic while others process background jobs. This part was inspired by Heroku's Procfile. Here is an example of a roles.json file:
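A minimal sketch of what such a file might contain; the role names and commands are assumptions, not Shopify's actual configuration:

    {
      "web":    "bundle exec unicorn -c /app/config/unicorn.rb",
      "resque": "bundle exec rake environment resque:work"
    }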
We drive builds with a simple Makefile, and builds can also be run locally. Our Dockerfile looks something like the sketch below.
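A minimal sketch, assuming the base image and the repo conventions described above (the image name is illustrative):

    # Illustrative application Dockerfile, not Shopify's actual file.
    FROM docker-registry.internal/shopify-base:latest
    # Code lives under /app by convention
    ADD . /app
    # Do the heavy lifting at build time so startup stays fast
    RUN /app/container/compile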
The goal of the compile phase is to produce a container that is ready to run the moment it starts. One of Docker's key advantages is that containers start extremely fast; don't squander that with extra work at startup. To achieve this goal you need to understand your entire deployment process. A few examples:
We were using Capistrano (an open-source tool for running scripts on multiple servers, used mainly to deploy web applications) to deploy code to machines, with asset compilation happening as part of that deployment. By moving asset compilation into the container build, deploying new code became simpler and minutes faster.
Our Unicorn masters were fetching table and model data from the database at startup. Not only is this slow; our smaller, more numerous containers would also need many more database connections. It is possible to fetch this data at container build time instead, to speed up startup.
For our application, the compile phase includes the following logical steps:
bundle install
asset compile
database setup
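A sketch of a /container/compile script covering those steps; the Rails tasks shown are typical ones, assumed rather than taken from Shopify's actual script:

    #!/bin/bash
    # Illustrative /container/compile: do the slow work at build time so the
    # container is ready to run the moment it starts.
    set -e
    cd /app
    bundle install --deployment --without development test   # bundle install
    bundle exec rake assets:precompile                        # asset compile
    bundle exec rake db:schema:cache:dump                     # database setup (cache schema data)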
To keep this post to a reasonable length, we have simplified some details. Secrets management is one major detail we have not discussed here: do not check secrets into source control. We have code to encrypt secrets, and a blog post dedicated to that topic is coming soon.
Debugging and Details
Most things running inside a container behave the same as they do outside it. Moreover, most of your debugging tools (strace, gdb, the /proc filesystem) can be run from the Docker host. The familiar tools nsenter and nsinit can be used to enter a running container for debugging.
docker exec, a new tool introduced in Docker 1.3.0, can be used to inject a process into a running container. Unfortunately, if you need the injected process to have root privileges, you still need nsenter, and in some places it may not work as expected.
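A few illustrative commands, assuming a container named unicorn-1:

    # Shell inside a running container (docker exec shipped in Docker 1.3.0):
    docker exec -it unicorn-1 bash

    # Entering the container's namespaces as root via nsenter:
    PID=$(docker inspect --format '{{.State.Pid}}' unicorn-1)
    sudo nsenter --target "$PID" --mount --uts --ipc --net --pid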
Process Layering
Although we run lightweight containers, we still want an init process (PID 1) that integrates closely with monitoring tools, administrative background tasks, service discovery, and so on, and that gives us fine-grained health monitoring.
In addition to init, we add a ppidshim process (PID 2) to every container; the application process is PID 3. Because of the ppidshim, applications do not directly inherit from init, which prevents them from believing they are daemons and misbehaving as a result.
The final process hierarchy is shown in the sketch below.
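Reconstructed from the PIDs described above:

    init (PID 1)
     └── ppidshim (PID 2)
          └── application (PID 3)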
Signals
If you adopt containers, you will likely modify existing run scripts or write new ones that include docker run calls. By default, docker run proxies signals to your application, so you must understand how your application interprets signals.
The standard UNIX convention is to send SIGTERM to request an orderly shutdown of a process. Make sure your application adheres to this convention: Resque used SIGQUIT for orderly shutdown and SIGTERM for immediate shutdown. Fortunately, Resque can be configured to shut down gracefully on SIGTERM.
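Two illustrations: docker stop delivers SIGTERM and escalates to SIGKILL after a grace period, and Resque's graceful handling of SIGTERM can be switched on via its TERM_CHILD setting (treat the exact invocation below as an assumption based on Resque's documented environment variables):

    # docker stop sends SIGTERM, then SIGKILL after the grace period
    # (the 30-second timeout here is illustrative):
    docker stop --time 30 resque-2

    # Hypothetical Resque invocation with graceful SIGTERM handling enabled:
    TERM_CHILD=1 RESQUE_TERM_TIMEOUT=10 bundle exec rake environment resque:work QUEUE=default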
Hostnames
We choose container names that describe the workload (for example, unicorn-1, resque-2) and combine them with the host's name for easy traceability, ending up with something like: unicorn-1.server2.shopify.com. We pass this into the container with Docker's hostname flag, which makes most applications report the correct value. Some programs (Ruby among them) ask the hostname for the short name (unicorn-1) but do not get the FQDN they expect. Since Docker manages /etc/resolv.conf and our version did not allow arbitrary changes, we override gethostname() and uname() via LD_PRELOAD with an injected library. The end result is that monitoring tools publish the hostnames we want without any application code changes.
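For example, with assumed names:

    # --hostname sets the container's UTS hostname; names here are examples.
    docker run -d --name unicorn-1 \
      --hostname unicorn-1.server2.shopify.com \
      docker-registry.internal/shopify-app:latest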
Registry and Deployment
We found that building a container whose behavior replicates "bare metal" amounts to an ongoing debugging exercise. If you are sensible, you will certainly want to automate container builds.
On every push, we use a GitHub commit hook to trigger a container build, and we post a commit status indicating whether the build succeeded. We use the git commit SHA as the container's Docker tag, so you can see exactly which version of the code a container holds. To make debugging and scripting easier, we also place the SHA in a file inside the container (/app/revision).
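A sketch of the build-and-tag step (the registry host and file path are assumptions):

    # Tag the image with the git commit SHA and record it inside the image.
    SHA=$(git rev-parse HEAD)
    echo "$SHA" > revision            # copied into the image as /app/revision
    docker build -t docker-registry.internal/shopify-app:"$SHA" .
    docker push docker-registry.internal/shopify-app:"$SHA"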
Once builds are green, you will want to push the container to a central registry. To improve deployment speed and reduce external dependencies, we chose to run our own registry in the data center. We run multiple copies of the standard Python registry behind an Nginx reverse proxy that caches GET requests, roughly as sketched below.
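A minimal Nginx sketch of that setup; all hostnames, ports, and cache parameters are assumptions:

    # Cache image-layer GETs in front of multiple registry instances.
    proxy_cache_path /var/cache/nginx/registry levels=1:2
                     keys_zone=registry:16m max_size=50g;

    upstream docker_registry {
        server registry1.internal:5000;
        server registry2.internal:5000;   # extra copies give automatic failover
    }

    server {
        listen 80;
        location / {
            proxy_pass http://docker_registry;
            proxy_cache registry;
            proxy_cache_valid 200 1h;     # only successful GETs are cached
        }
    }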
When multiple Docker hosts request the same image, we find that large network interfaces (measured in Gbps) and the reverse proxy are effective at solving the "thundering herd" problems associated with code deployment. The proxy approach also lets us run multiple registry types and provides automatic failover when one fails.
If you follow this blueprint, you can automate container builds and store containers safely in a central registry that can be incorporated into your deployment process.
In the next article in this series, the author describes how to manage application secrets.
Original link: Docker at Shopify: How we built containers that power over 100,000 online shops (translated by the CSDN Docker translation group Flora: Wang Daolong, Sun Hongliang, Wu Jingrun, Wong, Zhou Jingmao, Zhao Wenju; reviewed by Zhou Xiaolu)