Build your own container in fewer than 100 lines of Go code


Note: This article explains containers in a way that is easy to follow. Recommended reading for beginners.

Original: http://www.infoq.com/cn/articles/build-a-container-golang?utm_source=infoq&utm_medium=related_content_link&utm_campaign=relatedcontent_news_clk

The launch of Docker in March 2013 dramatically changed how the software development industry packages and deploys modern applications. In its wake, a variety of competing, complementary, and supporting container technologies have sprung up, bringing the area a great deal of attention as well as some second thoughts. This series of articles aims to clear up the confusion and explain how containers are actually being used in the enterprise.

The series first looks at the core technology behind containers and at how developers use them today, then examines the core challenges of deploying containers in the enterprise, such as integrating containers with continuous integration and continuous delivery pipelines, and improving monitoring to support changing workloads and the potential use of short-lived containers. The series closes by analysing the future of container technology and exploring the role unikernels play in organizations at the forefront of technology.

This article is part of the series "Containers in Real Applications – Away from the Hype". You can subscribe to the series via RSS to be notified of updates.

Metaphors are useful, but the trouble with a metaphor is that once you hear it, it tends to stop you thinking. Some people say that software architecture "is like" building architecture. It isn't. The metaphor sounds helpful, and that is exactly what makes it harmful. Similarly, software containers are often described as letting us move software around the way shipping containers move cargo. The analogy is not exactly wrong, but it leaves out a lot of detail.

Shipping containers and software containers do have a lot in common. Shipping containers have a standard shape and size, which brings economies of scale and standardization. Software containers promise many of the same benefits, but the metaphor only works on the surface: it describes a goal, not an established fact.

To understand what a container in the software world really is, we first need to understand what goes into making one, which is what this article covers. Along the way we'll talk about containers versus containerization, Linux containers (including namespaces, cgroups, and layered file systems), then walk through some code to build a simple container from scratch, and finally discuss what containers really mean.

What exactly is a container?

Let's play a little game first. Right now, tell me what a container is in your mind. Ready? OK, let me see if I can guess your answer:

You may have said one or more of the following:

    • A way to share resources
    • Process isolation
    • Something like lightweight virtualization
    • A root file system packaged together with metadata
    • Something like a chroot jail
    • Something like a shipping container
    • Whatever it is that Docker does

It is hard to capture everything a container means in a single phrase! The word "container" has come to be used for several (sometimes overlapping) concepts: both the metaphor of containerization and the technologies used to implement it. If we consider these separately, we get a clearer picture. So let's first talk about why we use containers, then about how they work (and then we'll come back to the why).

Setting the stage

First, suppose we have a program, which we'll call run.sh. We want to copy it to a remote server and run it there. However, running arbitrary code on a remote machine is insecure, and it is hard to manage and scale. So we invented virtual private servers and user permissions. And everything was good.

But run.sh has dependencies. It needs certain libraries to exist on the host, and it never behaves quite the same remotely as it does locally. So we invented AMIs (Amazon Machine Images), VMDKs (VMware images), Vagrantfiles, and so on. And everything was good.

Well, mostly good. The bundles were huge, and because these technologies were not very standardized, they were hard to ship around efficiently. So we invented caching. And everything was good.

Caching is what makes Docker images so much more efficient than VMDKs or Vagrantfiles. It lets us ship only the differences from a common base image rather than the full image. This means we can cheaply move an entire environment from one place to another. That is why, when you execute "docker run program", it starts almost immediately, even though it may be starting from a full operating system image. We'll look at how this works in more detail later.

That is what containers are for: bundling up dependencies so that we can deploy code in a repeatable, secure way. But this is a high-level goal, not a definition. So let's talk about something concrete.

Create a container

So, what exactly is a container (seriously this time)? It would be nice if creating a container were as simple as making a create_container system call. It is not quite that simple, but honestly it is not far off.

To discuss containers at a low level, we'll look at three things: namespaces, cgroups, and layered file systems. There are other pieces too, but these three provide the bulk of the functionality.

Namespaces

Namespaces provide the isolation needed to run multiple containers on one machine while making each container feel as if it had an environment of its own. At the time of writing there are six namespaces. Each can be requested independently, and each gives a process (and its child processes) a view of a subset of the machine's resources.

These namespaces include:

  • PID: The PID namespace gives a process and its children their own view of a subset of the processes in the system. You can think of it as a mapping table. When a process in a PID namespace asks the kernel for a list of processes, the kernel consults the mapping table. If a process is in the table, the kernel returns its mapped ID instead of its real ID. If a process is not in the table, the kernel behaves as if it did not exist at all. The first process created in a PID namespace gets PID 1 (that is, its host ID is mapped to 1), so the namespace appears inside the container as an isolated process tree.
  • MNT: In a sense, the mount namespace is the most important one. It gives the processes inside it their own mount table. This means those processes can mount and unmount directories without affecting other namespaces, including the host namespace. More importantly, as we will see, combining it with the pivot_root system call lets a process have its own file system. That is how, just by swapping the file system a container sees, we can make a process believe it is running on Ubuntu, BusyBox, or Alpine.
  • NET: The network namespace gives the processes that use it their own network stack. Typically, real physical network cards are attached only to the main network namespace, the one that processes started at boot time run in. But we can create virtual Ethernet pairs: linked network cards where one end sits in one network namespace and the other end in another, forming a virtual link between the two namespaces. This is a bit like having several IP stacks talking to each other on one host. With a little routing logic, each container can talk to the outside world while keeping its own isolated network stack.
  • UTS: The UTS (UNIX Time-sharing System) namespace gives its processes their own view of the system's hostname and domain name. Once a process has entered a UTS namespace, changing the hostname or domain name does not affect other processes.
  • IPC: The IPC (interprocess communication) namespace isolates various interprocess communication mechanisms, such as message queues. See the namespace documentation for more details.
  • USER: The user namespace is the most recently added and, from a security standpoint, probably the most powerful. It maps the UIDs a process sees to a different set of UIDs (and GIDs) on the host. This is extremely useful: with a user namespace we can map the container's root user ID (that is, 0) to an arbitrary, unprivileged UID on the host. It means we can let a container believe it has root access without giving it any privileges in the root namespace (we can even give it root-like privileges over container-specific resources). The container is free to run processes as UID 0, which normally means root privileges, while the kernel internally maps that UID to an unprivileged real UID. Most container systems do not map any UID in the container to UID 0 in the calling namespace; in other words, there is no UID in the container with real root privileges.

Most container technologies place the user's process in all of the above namespaces and then initialize those namespaces to provide a standard environment. For example, they create a network card in the container's isolated network namespace and connect it to a real network on the host.
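
One way to see these namespaces on a running system is to look under /proc/self/ns, where the kernel exposes one symbolic link per namespace the current process belongs to. The following is a minimal sketch rather than part of the original program: the helper name parseNSLink is made up for this example, and it assumes a Linux host with /proc mounted.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// parseNSLink splits a link target such as "uts:[4026531838]" into the
// namespace kind ("uts") and the namespace ID ("4026531838").
// (Hypothetical helper, named for this illustration only.)
func parseNSLink(target string) (kind, id string, ok bool) {
	kind, rest, found := strings.Cut(target, ":[")
	if !found || kind == "" || !strings.HasSuffix(rest, "]") {
		return "", "", false
	}
	id = strings.TrimSuffix(rest, "]")
	if id == "" {
		return "", "", false
	}
	return kind, id, true
}

func main() {
	// Every entry in /proc/self/ns is a symlink naming one namespace
	// that this process currently lives in.
	links, err := filepath.Glob("/proc/self/ns/*")
	if err != nil {
		fmt.Println("ERROR", err)
		os.Exit(1)
	}
	for _, link := range links {
		target, err := os.Readlink(link)
		if err != nil {
			continue // skip entries we are not allowed to read
		}
		if kind, id, ok := parseNSLink(target); ok {
			fmt.Printf("%-20s %-8s %s\n", filepath.Base(link), kind, id)
		}
	}
}
```

Two processes that report the same ID for a given kind share that namespace, so running this once on the host and once inside a container is a quick way to see which namespaces differ.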

Cgroups

To be honest, cgroups deserve an article of their own (and I reserve the right to write one!). I'll cover them only briefly here, and once you understand the concepts you can find answers to most questions directly in the documentation.

Essentially, cgroups collect a set of process or task IDs together and apply limits to them. Where namespaces isolate processes, cgroups enforce fair sharing of resources between processes (or unfair sharing; it is up to you, do whatever you like).

The kernel exposes cgroups as a special file system that you can mount. You add a process or thread to a cgroup simply by writing its ID to a tasks file. After that, you read and change the various settings by reading and writing the other files in that directory.
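
As a rough illustration of that file-based interface, the sketch below caps a cgroup at a fixed number of processes and then moves the current process into it. It is not from the original article. It assumes a cgroup v1 hierarchy with the pids controller mounted at /sys/fs/cgroup/pids (on cgroup v2 the membership file is cgroup.procs instead of tasks), the group name "demo" and the helper names are hypothetical, and actually applying the writes needs root.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
)

// write is one "put this value in that file" step.
type write struct{ path, value string }

// cgroupWrites lists the file writes that limit the cgroup in dir to
// maxPids processes and then move pid into it. The limit is written
// first so it is already in force when the process joins.
func cgroupWrites(dir string, pid, maxPids int) []write {
	return []write{
		{filepath.Join(dir, "pids.max"), strconv.Itoa(maxPids)},
		{filepath.Join(dir, "tasks"), strconv.Itoa(pid)},
	}
}

// applyWrites creates the cgroup directory and performs each write.
func applyWrites(dir string, writes []write) error {
	if err := os.MkdirAll(dir, 0755); err != nil {
		return err
	}
	for _, w := range writes {
		if err := os.WriteFile(w.path, []byte(w.value), 0644); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	dir := "/sys/fs/cgroup/pids/demo" // assumed cgroup v1 layout
	writes := cgroupWrites(dir, os.Getpid(), 20)
	for _, w := range writes {
		fmt.Printf("write %q to %s\n", w.value, w.path)
	}
	// Only touch the real cgroup file system when explicitly asked to.
	if len(os.Args) > 1 && os.Args[1] == "apply" {
		if err := applyWrites(dir, writes); err != nil {
			fmt.Println("ERROR", err)
			os.Exit(1)
		}
	}
}
```

Run without arguments, the program only prints the plan; run with the argument apply as root, it performs the writes. In the container program, the natural place for such a step would be the start of child(), before the user's command is executed.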

Layered file systems

Namespaces and cgroups are the isolation and resource-sharing sides of containerization; they provide a container's main function and its security. Layered file systems are what let us move whole machine images around efficiently, which is what makes containers practical to ship and run.

Essentially, a layered file system optimizes away the cost of creating a copy of the root file system for each container. There are many ways to achieve this. Btrfs uses copy-on-write at the file system layer, while AUFS uses union mounts. Because there are so many ways to do this step, this article takes a very simple approach: we really will make a copy. It is slow, but it gets the job done.
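
The "really make a copy" approach can be sketched in a few dozen lines. This is an illustration under simplifying assumptions, not the article's code: the function names are hypothetical, only directories, regular files, and symlinks are copied, and ownership, special permission bits, and device nodes are ignored.

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// targetPath maps a path under src to the matching path under dst.
func targetPath(src, dst, path string) (string, error) {
	rel, err := filepath.Rel(src, path)
	if err != nil {
		return "", err
	}
	return filepath.Join(dst, rel), nil
}

// copyFile copies the contents of one regular file.
func copyFile(from, to string, mode fs.FileMode) error {
	in, err := os.Open(from)
	if err != nil {
		return err
	}
	defer in.Close()
	out, err := os.OpenFile(to, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, mode)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return err
	}
	return out.Close()
}

// copyTree makes a full, independent copy of the tree rooted at src.
func copyTree(src, dst string) error {
	return filepath.WalkDir(src, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		to, err := targetPath(src, dst, path)
		if err != nil {
			return err
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		switch {
		case d.IsDir():
			return os.MkdirAll(to, info.Mode().Perm())
		case d.Type()&fs.ModeSymlink != 0:
			link, err := os.Readlink(path)
			if err != nil {
				return err
			}
			return os.Symlink(link, to)
		case d.Type().IsRegular():
			return copyFile(path, to, info.Mode().Perm())
		default:
			return nil // devices, sockets, and FIFOs are skipped
		}
	})
}

func main() {
	if len(os.Args) != 3 {
		fmt.Println("usage: copytree <base-rootfs> <container-rootfs>")
		return
	}
	if err := copyTree(os.Args[1], os.Args[2]); err != nil {
		fmt.Println("ERROR", err)
		os.Exit(1)
	}
}
```

Every container gets its own complete tree this way, which is exactly the cost that copy-on-write and union-mount file systems avoid.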

Building the container

Step one: build the skeleton of the program

Let's start by building the skeleton of the program. Assuming you have a recent version of the Go SDK installed, open your editor and copy in the following code.

    package main

    import (
        "fmt"
        "os"
        "os/exec"
        // "syscall" joins this list in step two; Go rejects unused imports.
    )

    func main() {
        switch os.Args[1] {
        case "run":
            parent()
        case "child":
            child()
        default:
            panic("wat should I do")
        }
    }

    // parent re-executes this binary with "child" as the first argument.
    func parent() {
        cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr

        if err := cmd.Run(); err != nil {
            fmt.Println("ERROR", err)
            os.Exit(1)
        }
    }

    // child runs the command the user asked for.
    func child() {
        cmd := exec.Command(os.Args[2], os.Args[3:]...)
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr

        if err := cmd.Run(); err != nil {
            fmt.Println("ERROR", err)
            os.Exit(1)
        }
    }

    // must panics on any error; it is used from step three onwards.
    func must(err error) {
        if err != nil {
            panic(err)
        }
    }

So what does this program do? Execution starts in main(), which reads the first argument. If the argument is 'run', it calls parent(); if it is 'child', it calls child(). The parent function runs '/proc/self/exe', a special file that refers to the currently running executable. In other words, the program re-executes itself, passing 'child' as the first argument.

What is the point of this madness? For now, not much. It just lets us run another program that in turn runs the program the user asked for (given in 'os.Args[2:]'). With this simple scaffolding in place, though, we can create a container.

Step two: add namespaces

Adding namespaces to the program takes just one line of code. Add the following as the second line of the parent() function to make Go pass some extra flags when it starts the child process.

    // Needs "syscall" in the import list.
    cmd.SysProcAttr = &syscall.SysProcAttr{
        Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
    }

If you run the program now (creating these namespaces normally requires root privileges), it will be running inside the UTS, PID, and MNT namespaces!
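
To make the effect of those flags visible, here is a small standalone sketch, separate from the article's program. It maps short namespace names to their clone flags and has the child change its hostname, which should stay invisible to the host because of the new UTS namespace. The names namespaceFlags and cloneFlags and the hostname "container" are made up for this illustration, and the run mode assumes Linux and root privileges.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// namespaceFlags maps a short name to the clone(2) flag that requests
// a new namespace of that kind.
var namespaceFlags = map[string]uintptr{
	"mnt":  syscall.CLONE_NEWNS,
	"uts":  syscall.CLONE_NEWUTS,
	"ipc":  syscall.CLONE_NEWIPC,
	"user": syscall.CLONE_NEWUSER,
	"pid":  syscall.CLONE_NEWPID,
	"net":  syscall.CLONE_NEWNET,
}

// cloneFlags ORs together the flags for the named namespaces.
func cloneFlags(names ...string) (uintptr, error) {
	var flags uintptr
	for _, name := range names {
		flag, ok := namespaceFlags[name]
		if !ok {
			return 0, fmt.Errorf("unknown namespace %q", name)
		}
		flags |= flag
	}
	return flags, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: nsdemo run")
		return
	}
	switch os.Args[1] {
	case "run":
		flags, err := cloneFlags("uts", "pid", "mnt")
		if err != nil {
			fmt.Println("ERROR", err)
			os.Exit(1)
		}
		cmd := exec.Command("/proc/self/exe", "child")
		cmd.SysProcAttr = &syscall.SysProcAttr{Cloneflags: flags}
		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr
		if err := cmd.Run(); err != nil {
			fmt.Println("ERROR", err)
			os.Exit(1)
		}
	case "child":
		// This change is confined to the child's own UTS namespace.
		if err := syscall.Sethostname([]byte("container")); err != nil {
			fmt.Println("ERROR", err)
			os.Exit(1)
		}
		name, _ := os.Hostname()
		fmt.Println("hostname seen by child:", name)
		fmt.Println("pid seen by child:", os.Getpid())
	default:
		fmt.Println("unknown mode", os.Args[1])
	}
}
```

When run as root with the argument run, the child should see the new hostname and, being the first process in its PID namespace, its own PID as 1, while the hostname on the host stays unchanged.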

Step three: the root file system

Your process is now running in a set of isolated namespaces (feel free to try adding other namespaces to the Cloneflags in the code above), but the file system still looks the same as the host's. That is because the process is in a mount namespace, but its initial mounts are inherited from the namespace that created it.

Let's change that. The following four lines switch us into a new root file system. Put them at the start of the 'child()' function.

    must(syscall.Mount("rootfs", "rootfs", "", syscall.MS_BIND, ""))
    must(os.MkdirAll("rootfs/oldrootfs", 0700))
    must(syscall.PivotRoot("rootfs", "rootfs/oldrootfs"))
    must(os.Chdir("/"))

The last two lines are the important ones. They tell the operating system to move the directory currently at '/' to 'rootfs/oldrootfs' and to make the new rootfs directory '/'. Once the 'PivotRoot' call returns, '/' in the container refers to the rootfs directory. (The bind mount is there to satisfy a requirement of 'pivot_root': the operating system insists that the two file systems being swapped are not part of the same tree, and bind mounting rootfs onto itself achieves that. Yes, it is pretty silly.)
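
On some current distributions the four lines above may fail with "invalid argument", because the root mount is often marked as shared and pivot_root(2) refuses to work with shared mounts. The sketch below is one possible variant, not the author's code: it first marks every mount in the new mount namespace as private, and afterwards detaches the old root so the host's files are no longer reachable. The function names are hypothetical, and pivotInto must only ever be called inside a fresh mount namespace (for example from child()), never directly on the host.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

const oldRootName = "oldrootfs"

// putOld returns where the old root is parked: its path before the
// pivot (relative to rootfs) and its path after the pivot.
func putOld(rootfs string) (before, after string) {
	return filepath.Join(rootfs, oldRootName), "/" + oldRootName
}

// pivotInto makes rootfs the root of the current mount namespace.
// WARNING: call this only inside a new mount namespace.
func pivotInto(rootfs string) error {
	before, after := putOld(rootfs)
	// Make all mounts private so nothing propagates back to the host
	// and pivot_root does not reject a shared parent mount.
	if err := syscall.Mount("", "/", "", syscall.MS_PRIVATE|syscall.MS_REC, ""); err != nil {
		return fmt.Errorf("make mounts private: %w", err)
	}
	// pivot_root needs the new root to be a mount point.
	if err := syscall.Mount(rootfs, rootfs, "", syscall.MS_BIND|syscall.MS_REC, ""); err != nil {
		return fmt.Errorf("bind mount rootfs: %w", err)
	}
	if err := os.MkdirAll(before, 0700); err != nil {
		return err
	}
	if err := syscall.PivotRoot(rootfs, before); err != nil {
		return fmt.Errorf("pivot_root: %w", err)
	}
	if err := os.Chdir("/"); err != nil {
		return err
	}
	// Detach the old root so the container cannot see the host files.
	if err := syscall.Unmount(after, syscall.MNT_DETACH); err != nil {
		return fmt.Errorf("unmount old root: %w", err)
	}
	return os.Remove(after)
}

func main() {
	// Deliberately harmless: only show where the old root would go.
	before, after := putOld("rootfs")
	fmt.Println("old root is parked at", before, "and appears as", after, "after the pivot")
}
```

In the container program, the four must(...) lines at the top of child() could then be replaced by a single must(pivotInto("rootfs")).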

Step four: initialize the container environment

Your process is now running in a set of isolated namespaces with a root file system of your choosing. We have skipped setting up cgroups, although that is fairly simple. We have also skipped root file system management, the part that lets you efficiently download and cache the root file system images that we just switched into with 'pivot_root'.

We have also skipped container setup. What you have is a fresh container running in isolated namespaces. We set up the mount namespace by pivoting into rootfs, but the other namespaces still have only their default contents. In a real container we would need to configure the container's whole environment before running the user's process. For example, we would set up networking, switch to the correct UID before running the process, and apply any other restrictions we need (such as dropping capabilities and setting rlimits). That work may well push our program over 100 lines.
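
One piece of that setup, the UID mapping described in the namespace section, can be sketched with the mapping fields of Go's syscall.SysProcAttr. This is an illustrative fragment rather than part of the article's program: the helper names are hypothetical, and it assumes a Linux kernel that allows the calling user to create user namespaces.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// rootMapping maps ID 0 inside the container to a single, normally
// unprivileged, ID on the host.
func rootMapping(hostID int) []syscall.SysProcIDMap {
	return []syscall.SysProcIDMap{{ContainerID: 0, HostID: hostID, Size: 1}}
}

// userNamespaceAttr asks for a new user namespace in which the given
// host UID and GID appear as root.
func userNamespaceAttr(uid, gid int) *syscall.SysProcAttr {
	return &syscall.SysProcAttr{
		Cloneflags:  syscall.CLONE_NEWUSER,
		UidMappings: rootMapping(uid),
		GidMappings: rootMapping(gid),
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: userns <command> [args...]")
		return
	}
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.SysProcAttr = userNamespaceAttr(os.Getuid(), os.Getgid())
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	if err := cmd.Run(); err != nil {
		// Some systems disable user namespaces for ordinary users.
		fmt.Println("ERROR", err)
		os.Exit(1)
	}
}
```

Running it with a command such as id shows the process as UID 0 inside the namespace, while on the host it still runs with the caller's ordinary UID.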

Step five: putting it together

So there it is: a super-simple container in (far) fewer than 100 lines of Go. Obviously we have deliberately kept it simple. If you use this program in production you are crazy, and you are on your own when it goes wrong. But I think that seeing something simple and hacky gives a really useful picture of what is going on. So let's look at the program in full.

    package main

    import (
        "fmt"
        "os"
        "os/exec"
        "syscall"
    )

    func main() {
        switch os.Args[1] {
        case "run":
            parent()
        case "child":
            child()
        default:
            panic("wat should I do")
        }
    }

    // parent re-executes this binary as "child" inside new namespaces.
    func parent() {
        cmd := exec.Command("/proc/self/exe", append([]string{"child"}, os.Args[2:]...)...)
        cmd.SysProcAttr = &syscall.SysProcAttr{
            Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
        }
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr

        if err := cmd.Run(); err != nil {
            fmt.Println("ERROR", err)
            os.Exit(1)
        }
    }

    // child pivots into rootfs and then runs the user's command.
    func child() {
        must(syscall.Mount("rootfs", "rootfs", "", syscall.MS_BIND, ""))
        must(os.MkdirAll("rootfs/oldrootfs", 0700))
        must(syscall.PivotRoot("rootfs", "rootfs/oldrootfs"))
        must(os.Chdir("/"))

        cmd := exec.Command(os.Args[2], os.Args[3:]...)
        cmd.Stdin = os.Stdin
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr

        if err := cmd.Run(); err != nil {
            fmt.Println("ERROR", err)
            os.Exit(1)
        }
    }

    func must(err error) {
        if err != nil {
            panic(err)
        }
    }

So what do containers mean?

What follows may be controversial. To me, containers are a fantastic way to ship software around and to run code cheaply with a high degree of isolation, but that is not the end of the story. Containers are a technology, not a user experience.

As a shopper on Amazon.com, I do not phone the docks to arrange shipment of my goods. Likewise, as a user, I just want to deploy to production without having to know the details of containers. Containers are a fantastic technology to build on, but their ability to move machine images around efficiently should not distract us from the need to create a really great developer experience.

Many platform-as-a-service (PaaS) systems, such as Cloud Foundry, offer a user experience based on code rather than on containers. For most developers, the goal is simply to upload their code and have it run. Internally, Cloud Foundry and most other PaaS platforms take the uploaded code and create a containerized image that can be scaled and managed. Cloud Foundry does this with a buildpack, but you can also skip this step and push a Docker image built from a Dockerfile directly.

A PaaS still gives you all the benefits of containers, including a consistent environment and efficient resource management, but because it controls the user experience it can both give developers a simpler experience and do some extra things, such as automatically patching the root file system when there is a security vulnerability. These platforms also provide services such as databases and message queues that you can bind to your app, without having to think about every feature in terms of containers.

So, we have explored what containers really are. Now, what should we do with them?

About the author

Julian Friedman is an engineering lead at IBM, responsible for Garden, Cloud Foundry's container technology. Before working on Cloud Foundry, Julian worked on many emerging technology projects, including performance optimizations for IBM Watson as used on the Jeopardy! quiz show, and the first few iterations of IBM's cloud technology. He recently completed a doctorate on map/reduce, and so intends to avoid thinking about map/reduce for the rest of his life if possible. His Twitter account is @doctor_julz.
