Managing and Operating an Enterprise Docker Image Registry


Container applications are increasingly widely used, and a key advantage of container technology is that it integrates development and operations. By packaging an application together with its dependencies, operating system files, and so on in a container image, the application gets the same runtime environment throughout development, testing, and release, which is a great convenience. In the classic Docker container state transition diagram in Figure 1, container images have the most arrows attached to them, which shows that images are at the core of container technology.

Figure 1 Docker container state transition diagram

In a nutshell, container technology consists of a static part and a dynamic part: the static image that packages the application, and the dynamic container that runs it. Accordingly, developing and operating containers mainly involves managing images and managing the runtime. This article focuses on container image management.

Container images are managed primarily around an image registry. Throughout an application's lifecycle, whether a developer or CI system is publishing an image or a tester or operations engineer is downloading one, the work goes through a registry. A registry can be a public SaaS service such as Docker Hub. The advantage of a public service is that it can be used directly with no maintenance of your own. However, for reasons of access efficiency and image security, most companies build their own private registries, and therefore need an image management strategy that covers the whole application lifecycle.

This article introduces the principles and methods of managing container images in development and operations, using Harbor as an example to illustrate them. Harbor is an open source enterprise-class registry developed by the VMware China R&D team. It helps users quickly build enterprise-class registry services and provides access control, image replication, a Chinese-language management interface, and other features, and it has been well received by users. Interested readers can follow or try the project at https://github.com/vmware/harbor.

Ensuring consistency of image content

In every stage of an application's life (development, testing, and production), you need to make sure you are using the same application image. One approach is to use the same Dockerfile at each stage to generate the image. It is commonly assumed that the same Dockerfile always produces the same image, but in practice it does not. Consider the following Dockerfile fragment:

FROM ubuntu
RUN apt-get install -y python
ADD app.jar /myapp/app.jar

First, the FROM instruction implicitly uses the latest tag of the base image, so images built at different times may be based on different versions. Even specifying a version such as ubuntu:14.04 is no guarantee, because the same version tag may be rebuilt over time with different packages, for example after patches.

Next, commands such as apt-get (and similarly curl, wget, and so on) often pull third-party packages from the Internet, which makes version consistency even less certain. The file app.jar that the ADD instruction copies into the image is another source of variation, because its content depends on the version of the file present at build time.
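The effect of these varying inputs can be illustrated with a toy model. The sketch below is only an assumption-laden illustration: `image_digest` is a hypothetical helper that hashes three made-up build inputs, and it is not Docker's actual digest algorithm. It shows that identical inputs give an identical digest, while a changed base image gives a different one even though the Dockerfile text is the same.

```python
import hashlib


def image_digest(base_layer: bytes, packages: bytes, app_file: bytes) -> str:
    """Toy content digest: a SHA-256 over the three build inputs.

    A simplified stand-in for a real image digest, not Docker's algorithm.
    """
    h = hashlib.sha256()
    for part in (base_layer, packages, app_file):
        # Hash each input's own digest so part boundaries stay unambiguous.
        h.update(hashlib.sha256(part).digest())
    return "sha256:" + h.hexdigest()


# Hypothetical inputs for builds of the same Dockerfile at different times.
build_jan = image_digest(b"ubuntu base, January", b"python package", b"app.jar v1")
build_feb = image_digest(b"ubuntu base, February", b"python package", b"app.jar v1")
rebuild_jan = image_digest(b"ubuntu base, January", b"python package", b"app.jar v1")

print(build_jan == rebuild_jan)  # identical inputs, identical digest
print(build_jan == build_feb)    # changed base image, different digest
```

Comparing digests rather than Dockerfiles is one way to check that two environments really run the same image.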

As the example shows, although the purpose of a Docker image is to provide an immutable application environment, images generated from the same Dockerfile do not necessarily have the same content, because builds often depend on non-deterministic inputs. The best approach is therefore to use the same image (in binary form) in every environment. An image is much larger than a Dockerfile, but reusing it guarantees consistency.

Transferring images

Many users use a single registry for development, testing, and operations, which suits small teams or simple projects. In other cases, it is better to use multiple registries to separate different uses and security requirements. Figure 2 shows a reference flow for container image management.


Figure 2 Management process for application images

Development registry: primarily used by developers, with frequent image changes. When development is complete, a stable image is built by the CI system and synchronized to the test registry;

Test registry: primarily used by testers; images remain unchanged. When testing passes, the image is pushed to the staging registry;

Staging (quasi-production) registry: primarily used by test and operations staff; images remain unchanged. When the image runs stably in staging, it is released to the production registry;

Production registry: serves images to the nodes in the production environment that run them.

From development to production, the required container image is promoted step by step to the next registry until it reaches the production system, implementing the build-ship-run process for container images.
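The step-by-step promotion described above can be sketched as a small state machine. This is a minimal illustration, not Harbor functionality: the stage names mirror the four registries listed above, and `next_stage` and `promote` are hypothetical helpers introduced only for this example.

```python
from typing import Optional

# Registry stages in promotion order, following the flow described above.
STAGES = ["development", "test", "staging", "production"]


def next_stage(current: str) -> Optional[str]:
    """Return the stage an image is promoted to, or None from production."""
    index = STAGES.index(current)  # raises ValueError for an unknown stage
    return STAGES[index + 1] if index + 1 < len(STAGES) else None


def promote(current: str, checks_passed: bool) -> str:
    """Return the stage an image is in after one promotion attempt."""
    target = next_stage(current)
    if not checks_passed or target is None:
        return current  # stay put: checks failed or already in production
    return target


print(promote("test", checks_passed=True))
print(promote("staging", checks_passed=False))
```

The image itself never changes between stages in this model; only the registry that holds it does.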

Harbor provides automatic image synchronization and replication between registries, and automates image transfer through configurable replication policies. When a replication policy is started, Harbor compares the images in the target registry with those in the local source registry and pushes the images missing from the target, so that the two registry instances hold exactly the same images. Images subsequently pushed to the source instance are synchronized incrementally to the target instance. When an image is deleted on the source instance, it is also deleted on the target instance. Harbor's replication mechanism enables image synchronization between two or more registry instances.

Access control for images

In an enterprise, different development teams are often responsible for different application projects. Just as source code is managed per project, images also need to be stored and managed per project. Because project team members have different roles, such as project manager, product manager, developer, tester, and operations engineer, they use images differently, so permissions can be assigned according to role.

For example, testers typically need only read access to images (pull), developers need read and write access (push/pull), and project managers, in addition to developer privileges, can add and remove project members and set their roles.

In Harbor, each project has three roles: project admin, developer, and guest. Some projects, such as the library project that holds public images, can allow anonymous access, so users can pull from them without running docker login, which is convenient in some scenarios. System-wide, there is also a system administrator, who has permissions such as maintaining image replication policies and adding and deleting users.
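The role model above can be sketched as a simple lookup table. This is only an illustration under assumptions: the action labels ("pull", "push", "manage_members") and the `is_allowed` helper are hypothetical, the guest role is assumed to be read-only, and none of this is Harbor's actual API.

```python
# Role names follow the three project roles described above; the action
# names are hypothetical labels chosen for this sketch.
ROLE_ACTIONS = {
    "guest": {"pull"},
    "developer": {"pull", "push"},
    "project_admin": {"pull", "push", "manage_members"},
}


def is_allowed(role, action, project_is_public=False):
    """Return True if a user with `role` may perform `action` on a project.

    `role` is None for an anonymous user, who may only pull from a
    public project.
    """
    if role is None:
        return project_is_public and action == "pull"
    return action in ROLE_ACTIONS.get(role, set())


print(is_allowed("guest", "push"))
print(is_allowed("developer", "push"))
print(is_allowed(None, "pull", project_is_public=True))
```

Because the table is per project, the same person can hold a different role in each registry, for example guest in development and developer in production.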

Note that a member's role can differ between environments. For example, in the development registry, operations staff generally need no permissions (or read-only access), whereas in the production registry they need read and write access.

Publishing images at scale

In production operations, it is often necessary to publish an image to dozens, hundreds, or even more cluster nodes. A single registry can then no longer handle the download load from that many nodes, so multiple registry instances must be configured for load balancing. Maintaining images on multiple registry instances by hand is tedious. Harbor supports a master-slave image publishing model that addresses the challenge of large-scale image publishing.


Figure 3 Image distribution in master-slave mode

As shown in Figure 3, once an image is published to one registry, it is automatically fanned out to multiple registries.

For geographically distributed multi-datacenter or cross-cloud clusters, you can also use hierarchical publishing, for example synchronizing from group headquarters to provincial companies, and from provincial companies to city companies (as shown in Figure 4):

Figure 4 Image distribution in multilevel master-slave mode

During synchronization, if the source image is deleted, Harbor automatically propagates the deletion to the remote registries. Harbor also monitors the entire replication process and retries automatically when it encounters network errors.
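The compare, push, delete, and retry behavior described above can be sketched as follows. This is a simplified model under assumptions, not Harbor's implementation: registries are represented as plain sets of image names, and `plan_replication`, `with_retry`, and `flaky_push` are hypothetical helpers, with a fixed retry count chosen only for the example.

```python
def plan_replication(source: set, target: set):
    """Compare two registries' image sets and return (to_push, to_delete)."""
    to_push = sorted(source - target)    # images missing on the target
    to_delete = sorted(target - source)  # images no longer on the source
    return to_push, to_delete


def with_retry(operation, attempts=3):
    """Call `operation` until it succeeds or `attempts` is exhausted."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # give up after the last attempt


source = {"myapp:1.0", "myapp:1.1"}
target = {"myapp:0.9", "myapp:1.0"}
to_push, to_delete = plan_replication(source, target)

calls = {"count": 0}


def flaky_push():
    """Simulated push that fails twice with a network error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network error")
    return "pushed"


result = with_retry(flaky_push)
print(to_push, to_delete, result)
```

A production system would typically add a delay between retries and record each attempt so that it can be shown on a monitoring screen.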

The monitoring screen for replication is shown in Figure 5:

Figure 5 Monitoring of image replication policies

Image deletion and space reclamation

The docker command does not provide a way to delete images from a registry. As a system runs over time, it accumulates many unused images that occupy a lot of storage space. Deleting images and reclaiming the space by calling the Docker Registry API directly is cumbersome. Harbor provides a graphical interface for deleting images, which deletes them logically. The space occupied by the deleted images can then be reclaimed while the registry is in a maintenance state.

Registry high availability

Registry high availability (HA) is a concern for most production systems, and the basic requirement is that there is no single point of failure. The technology is usually chosen based on how long a service interruption can be tolerated and on the cost and loss that can be sustained. Here are three high-availability reference scenarios.

Figure 6 Implementing multiple registry instances with shared storage

A fairly standard scenario is for multiple registry instances to share the same back-end storage: an image persisted by any one instance can be read by the others. Requests arriving from the front-end load balancer (LB) are distributed across the instances, which provides load balancing and avoids a single point of failure (as shown in Figure 6).
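The request distribution described above can be sketched as a round-robin picker that skips failed instances. This is a minimal model under assumptions: `RegistryPool` and the instance names are hypothetical, health is tracked by hand rather than by real health checks, and the shared storage is assumed to make every instance interchangeable.

```python
import itertools


class RegistryPool:
    """Round-robin over registry instances, skipping unhealthy ones."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.healthy = set(self.instances)
        self._cycle = itertools.cycle(self.instances)

    def mark_down(self, instance):
        """Record that an instance has failed and should receive no requests."""
        self.healthy.discard(instance)

    def pick(self):
        """Return the next healthy instance, or raise if none is left."""
        for _ in range(len(self.instances)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy registry instance available")


pool = RegistryPool(["registry-a", "registry-b"])
first, second = pool.pick(), pool.pick()
pool.mark_down("registry-a")
after_failure = [pool.pick() for _ in range(3)]
print(first, second, after_failure)
```

Because every instance reads the same storage in this scenario, any healthy instance can serve any request, which is what removes the single point of failure at the registry layer.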

Note that the issues to consider in practice are far more complex than in the model above, for example the choice of shared storage and the sharing of user sessions across instances. Users can design solutions according to their own business requirements. Harbor plans to offer a solution based on Swift distributed storage and shared sessions (using Redis) to meet these needs.

If there is no shared storage, the second option is to replicate images between two nodes with a dual-master replication strategy. If one instance fails, the other can still provide service, which meets HA needs to some extent. In this scenario, user data is not synchronized between the two instances, so the same user accounts must be configured separately on each (as shown in Figure 7).

Figure 7 Dual-master replication for quasi-HA

The third scenario is to leverage an existing high-availability platform, such as vSphere HA with vSAN distributed storage, to make the registry highly available, as shown in Figure 8:

Figure 8 A highly available registry architecture based on vSAN and vSphere

In the event of a node failure, vSphere automatically switches to a healthy node and no image data is lost (as shown in Figure 9).


Figure 9 Automatic registry migration with vSAN and vSphere

Summary

Using the open source Harbor registry as an example, this article has summarized the common usage scenarios and key points for running a registry in the enterprise, and we hope it is useful. You are welcome to try Harbor and send feedback. Harbor's GitHub address: https://github.com/vmware/harbor.
