DockOne WeChat Share (88): PPTV Media's Docker and DevOps

Tags: ZooKeeper, GlusterFS, Redis, cluster
"The editor's word" devops is a concept that was proposed around 2009, advocating a high degree of synergy between development (development) and Operations (Operations) in two areas. This improves the reliability, stability, elasticity and safety of your production environment while completing high-frequency deployments. This share introduces the PPTV media, supported by Docker technology, and optimized for DevOps, including:
    1. About DevOps
    2. Application of Docker in PPTV
    3. The combination of DevOps and Docker
    4. Implementation scenarios


About DevOps

Let's start with a brief introduction to DevOps. DevOps usually refers to a concept that arose around 2009, advocating close collaboration between development and IT operations, with the main goal of improving product delivery capability and efficiency in the face of growing user and business demands.

DevOps is probably no stranger to many of you. By now, many companies may already be trying to implement DevOps with Docker, but there is still no perfect DevOps implementation plan or standard.

Application of Docker in PPTV

PPTV's DCOS platform is built on Docker, with Mesos + Marathon at its core, combined with Docker and Nginx. On this foundation we developed the DCOS management platform, which includes a rights management module, a unified log management module, an IP pool management module, a storage management module, and a service discovery module, and which integrates with the Jenkins continuous integration platform to automate container creation and operation.

Everyone should be familiar with Mesos, Marathon, Docker, and Jenkins, so there is not much to say about them. Below is an introduction to how PPTV uses Docker.

So far, the PPTV test environment has largely been migrated to Docker, and the production environment is being containerized, so the content below mainly concerns the test-environment architecture.

Functional framework


Figure 1

Current architecture and publishing process


Figure 2

Container Network

We use bridged networking: based on Docker's bridge mode, we replace the default Docker bridge with a Linux bridge and assign IPs from the Linux bridge's subnet to the containers, so that containers can interoperate directly with applications in the traditional environment.

Create a Docker network that uses the Linux bridge:

docker network create --gateway 10.199.45.200 --subnet 10.199.45.0/24 -o com.docker.network.bridge.name=br-oak --aux-address "DefaultGatewayIPv4=10.199.45.1" oak-net

This command creates a Docker bridge network named oak-net and binds it to the br-oak bridge on the host where Docker runs. The network uses the 10.199.45.0/24 subnet, and the --aux-address "DefaultGatewayIPv4=10.199.45.1" parameter points the container's default gateway at 10.199.45.1 when the container starts.

The key parameter is --aux-address "DefaultGatewayIPv4=10.199.45.1".
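To verify that the network is attached to the intended bridge, standard commands such as the following can be used (brctl comes from the bridge-utils package):

docker network inspect oak-net
brctl show br-oak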

With bridging, you need to consider IPAM (IP address management). We added an IP pool management module to the DCOS management platform to achieve global IP control and to provide an interface for requesting IPs.

Figure 3

The idea is to use the app information in Marathon as the data source and periodically call the Marathon API to refresh the IP list. When we want to create a container with a fixed IP through Marathon, we first call the IP pool management module's allocation interface, passing the app ID. The management platform uses the app ID to determine whether the application is new; if it is, it returns an unused IP from the pool and associates that IP with the app.
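As a rough sketch of this flow: the periodic sync can read Marathon's GET /v2/apps endpoint (a real Marathon API), while the allocation call goes to our own module, whose URL and parameters below are purely illustrative:

# refresh the app list from Marathon periodically
curl -s http://marathon.example:8080/v2/apps | jq -r '.apps[].id'
# ask the IP pool module to allocate an address for an app ID (hypothetical endpoint)
curl -s "http://dcos-mgmt.example/ippool/allocate?appId=/app1"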

When the container is run, the --net and --ip parameters are added automatically to assign the container its IP.
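The resulting launch command looks roughly like this (the IP and image name are illustrative; note that --ip only works on user-defined networks such as oak-net):

docker run -d --net=oak-net --ip=10.199.45.21 app1-image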

Figure 4

Container storage

Container Storage Driver

Our OS is CentOS, so in the past we used devicemapper in loop-lvm mode as the storage driver, which is Docker's default on CentOS, and we ran into some problems with it.

Figure 5

As shown above, we hit this problem more than once and never fully resolved it; in the end the only way out was to restart Docker. This is not something you want to see in a production environment.

After comparison (see our earlier article "PPTV Media DCOS Docker Storage Driver Research and Selection"; the details are not repeated here), we decided to choose the storage driver between Btrfs and devicemapper in direct-lvm mode. For the online services we currently plan to run on Docker, the workload mainly involves files from a few hundred KB to a few MB in size, and we only need to consider reads: writes go to mounted volumes rather than directly into the container filesystem. For reads, devicemapper performs slightly better than Btrfs. As for stability, since offline tests cannot fully reproduce online conditions, our initial plan is to run part of the containers on devicemapper and part on Btrfs when we go online, then observe and compare.
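For reference, direct-lvm means pointing devicemapper at a real LVM thin pool instead of loopback files. In /etc/docker/daemon.json that looks roughly like this (the thin pool name is illustrative):

{
    "storage-driver": "devicemapper",
    "storage-opts": [
        "dm.thinpooldev=/dev/mapper/docker-thinpool",
        "dm.use_deferred_removal=true"
    ]
}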

For external container volumes we use GlusterFS, a distributed storage system, to hold persistent container data such as application logs and static files. We chose GlusterFS because it was already in use within the company and comfortably meets the storage requirements of the test environment.

Container Log Management

Application logs are collected centrally on GlusterFS. We created a log volume on Gluster and mounted it on each Mesos slave host, for example at /home/logs.
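On each slave host the mount looks something like this (the Gluster server and volume names are illustrative):

mount -t glusterfs gluster-server.example:/log-volume /home/logs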

When a container starts, we add a volume parameter; for example, container app1 is started with -v /home/logs/app1:/home/logs.

This approach solves container data persistence in the test environment, but it is not recommended online. For the online environment, our current plan is to mount the container's log directory on the host and collect the logs centrally through Logstash or similar tools.

Service discovery

The service discovery module subscribes to Marathon's events. We defined a convention for the Marathon app's label property, which carries the app's IP, service port, and service domain name. Every time an app is created, stopped, or deleted in Marathon, the service discovery module receives the corresponding event, processes the information, and pushes the updates to Nginx and DNS.

Label Property:

Figure 6


Figure 7


Figure 8
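Purely as an illustration of such a contract (the key names below are hypothetical, not the actual convention, which is shown in the figures above), the labels section of a Marathon app definition might look like:

"labels": {
    "ip": "10.199.45.21",
    "port": "8080",
    "domain": "app1.pptv.example"
}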

The combination of DevOps and Docker

Docker's slogan is "Build, Ship, and Run Any App, Anywhere", so the ideal way of using Docker looks like this:

Figure 9

The ideal DevOps process through Docker should look like this:

Figure 10

But the actual situation is usually:

Figure 11

An image that passes testing in the test environment very likely cannot be run directly online, because the test environment and the online environment have inconsistent architectures or configurations.

For example, for App1 → Redis1 in the figure above, there are two possible problems:
    1. In the online environment, the Redis1 address in the configuration file may be an online Redis IP, while the offline configuration points to an offline Redis IP.
    2. The online Redis may be a Redis cluster with a port such as 19000, while the offline Redis may be a single node with service port 6379. There may also be other inconsistent configuration parameters.


The way to solve both problems is to keep the online and offline configurations consistent.

Our past solutions:

For problem 1, calls between a project's components usually go through domain names; the online environment resolves them through DNS, while the offline environment generally binds them in the hosts file.
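For instance, an offline hosts binding might look like this (the name and address are illustrative):

# /etc/hosts on an offline test machine
10.200.1.15  redis1.pptv.example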

For problem 2, you need to plan at the start of the project whether the same ports will be used offline and online. Changing service ports at a later stage can cause many unforeseen problems and is costly.

Inconsistent online and offline configurations probably exist in many companies, and they can easily lead to production issues. Take our own situation as an example, looking again at the release process mentioned above:

Figure 12

We compile the application package for the test environment through Jenkins, replacing the configuration file in the package with the offline configuration during compilation, and then put the package into the container by mounting it. Once testing passes, Jenkins compiles the application package for the online environment, this time replacing the configuration file with the online configuration, and finally the package is handed to operations for release. In other words, the application package that ends up online is not actually the one that was tested. This has caused production incidents; fortunately, gray (canary) releases and timely rollbacks limited the scope of the impact.

To get as close as possible to the ideal DevOps flow with Docker, we aim to achieve the following (see the Dockerfile sketch after this list):
    1. The output of a Jenkins project build is a "big image" containing both the application package (code + online configuration) and the runtime environment (such as Tomcat).
    2. The test environment simulates the online architecture, keeping the two environments as consistent as possible.
    3. A big image that passes testing can be published directly online: deploy it in the test environment, run the test tasks, and once they pass, operations publishes the same image to the online environment.
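A minimal sketch of such a big image, assuming a Tomcat-based application (the base image tag, file name, and paths are illustrative):

FROM tomcat:8-jre8
# Bake the application package, already carrying the online configuration,
# into the image so that the artifact that was tested is exactly what ships.
COPY app1.war /usr/local/tomcat/webapps/
CMD ["catalina.sh", "run"]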


To achieve this goal, we need to solve the following problems:
    1. Run components that require persistent data, such as DB, ZooKeeper, and MySQL, in the Docker cluster.
    2. Ensure that online and offline configurations are consistent, including unified access addresses and service ports.
    3. The test environment may run multiple copies of the same application, so environments must be isolated to ensure that each can use the same configuration while running different tests in parallel.


Implementation scenarios

Question 1

Before Docker, simulating the online architecture in a test environment was unrealistic; it would have required a great deal of manpower and resources. With Docker, as long as the corresponding images exist, we can easily spin up many sets of environments. For data that must persist, access pressure in the test environment is low, so we use GlusterFS for storage, and we oversubscribe CPU, memory, and other compute resources to squeeze the most out of the servers.

Questions 2 and 3

We wanted the implementation to be as simple as possible, with as little impact as possible. So, together with the service discovery module, we re-defined the app label convention and partitioned the test environment into one integrated environment and multiple functional environments, each environment using its own DNS, simulating the online architecture as closely as possible.

Figure 13

The resulting setup looks like this:

One complete integrated test environment, containing the master services, Redis, MySQL, DB, ZooKeeper, and so on, for the functional environments to call during testing. Calls between components inside this environment use the same addresses as online; the environment's DNS is responsible for resolving the test environment's internal domain names.

Multiple functional test environments (only one is shown), each containing the projects under functional test. Each project in a functional environment has its own Redis, MySQL, DB, and other services, configured identically to the integrated environment; cross-project calls go to the integrated environment by default. DNS resolution distinguishes the integrated test environment from the functional test environments.
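Conceptually, the same online-style domain name resolves to a different backend in each environment (the names and addresses are illustrative):

# integrated environment DNS
redis1.pptv.example -> 10.199.45.10
# functional environment A DNS
redis1.pptv.example -> 10.199.46.10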

With this step done, we can basically ensure that most projects run with consistent configuration in a consistent environment. And because Jenkins's output is a Docker image, the tested image can be run directly in the online environment. Of course, some configurations and projects cannot be made completely consistent; for those we can only minimize the differences, and special projects are handled by other special means.

To briefly summarize, our main idea in combining Docker with DevOps is to deploy multiple operating environments for testing in the test environment while minimizing the differences between the test environment and the online environment, and, with the help of Docker's build, ship, and run capabilities, to increase collaboration between development and IT operations and improve project delivery capability and efficiency.

Q&A

Q: For the problem of inconsistent online and offline environments, the container app could read files or variables from a configuration center at startup, so that the correct configuration is used in each environment. Why doesn't PPTV use this method?

A: Our online environment does have a configuration center, but for administrative reasons it mainly manages system-level configuration; business-related configuration is not placed in it. Also, the effect we want is to run online and offline with exactly the same configuration. Doing it through a configuration center only turns the configuration into variables; the configurations are actually still different.

Q: How is the DB change process controlled? PPTV's business involves sharding (splitting databases and tables); what means are used to limit the impact of changes to test and production databases on the business?

A: DB changes are mainly standardized through process. For example, sharding and table structure changes must be requested one day in advance and are assessed and audited by a DBA; database updates must avoid business peaks and are scheduled in the early morning, before 8 a.m. Technically, the business side is required to have a backup script for each database change and to back up the data that is about to be updated.

Q: How do you solve the problem of Nginx reloading frequently during releases? Is there a limit on container swap?

A: When there is a change in Marathon, the reload is not triggered immediately; the change is first recorded in the DB, and a batch sync running at an interval then triggers the reload. Swap is not currently limited.

Q: 1. If Redis, MySQL, DB, and ZooKeeper run in containers, what is the scheme for data initialization and persistence? 2. For the Linux bridge network, did you compare Calico, Contiv, and so on?

A: 1. We mount external GlusterFS volumes for persistent storage. 2. For the comparison with Calico, see our earlier article "PPTV Docker Cluster Network Solution Selection"; Contiv was not compared.

Q: How are persistent components such as ZooKeeper and MySQL integrated with the application? Are they in one big image or combined from separate images?

A: ZooKeeper and MySQL are separate images, running in separate containers.

Q: How does service discovery work? As a novice, I don't know how nodes get registered into HAProxy or Nginx.

A: Through the Marathon label property convention. The label property has no meaning to Marathon by default, so your program can check whether an app's labels conform to the agreed contract and trigger registration or modification accordingly.

Q: Hello, how is your DNS built, and how do you solve the single point of failure?

A: The DNS also runs in a container scheduled by Marathon, so Marathon provides failure self-healing, and a DNS restart triggers the service discovery sync.

Q: For a gray-release rollback, do you roll back the image or roll back the application package inside the container? If the database in a volume was changed before the rollback, is the database rolled back too? Have you run into problems with this?

A: Containerization of the online environment is still in progress; in the future we will roll back images. The online database does not run in containers; during a rollback, the database is rolled back the same way as before we used containers.

Q: In a production environment, one app ID may need multiple containers and therefore multiple IPs. The usage described above seems to have a problem there. How is this solved?

A: In production there is a publishing system wrapped around the Marathon layer. When the same project creates multiple containers, multiple app IDs are created in Marathon; the Marathon-level information is not visible externally.

Q: With bridge mode and 10.199.45.0/24, won't the IPs be exhausted? Have you tested the efficiency of the bridge?

A: 10.199.45.0/24 is just an example; in the real scenario there are multiple IP segments, or a larger segment is used. We have only done simple tests: bridge efficiency basically reaches more than 90% of the native network.

Q: Why did you choose CentOS instead of Ubuntu or CoreOS? How are DNS and the IP address pool coordinated?

A: Technology stack reasons; the company has always used Red Hat and CentOS. DNS and IPs are both managed by the DCOS management platform.

Q: When Mesos and Marathon are combined, closing a container leaves it in a stopped state instead of cleaning it up automatically; it can only be cleaned up by script. What is your solution?

A: Currently we also use scripts. In the future, with Mesos hooks, Mesos will delete the container when the task ends, or notify the management platform, which will clean up in regular batches.
The above content is based on the group share on the night of October 11, 2016. The speaker, Li Zhou, is the DCOS technical lead at PPTV Media, focused on Docker networking solutions, container monitoring, and DevOps. He previously handled long-term customer-site implementation work at small and medium-sized startups, accumulating extensive experience in networking, monitoring, authentication, big data, Oracle, and middleware, and enjoys studying all kinds of emerging technologies. DockOne organizes weekly technology shares; interested readers are welcome to add WeChat: Liyingjiesz to join the group, and to leave us a message with topics you would like to hear or share.