The Road to Automated Monitoring of a Large-Scale Docker Platform

Docker is still evolving and its standards are not yet settled, but its rapid growth is an indisputable fact. How rapid? Consider the latest 2016 survey report from Datadog, a monitoring company based outside China:

[Figure: http://s5.51cto.com/wyfs02/M01/89/BA/wKiom1ga153Rf4EyAAG9OWc40BE135.jpg]

The report shows that since May 2015 the adoption of container technology has grown by a significant 30%, while the share of users abandoning container technology has leveled off.

As container technology spread, it also reached our protagonist, Lao Ge, a senior operations engineer at an Internet finance company. His company recently started using Docker to deliver production applications, and the very first application went live with more than 50 containers. Lao Ge led the operations and development teams through a difficult series of pitfalls and finally got the application running in production. But the joy of success was fleeting. Lao Ge soon looked worried again: how to monitor the container platform gracefully had become the operations team's big open question.

Let's look at the challenges the team faced from Lao Ge's perspective. First of all:

Question 1: How can I monitor the availability and resource consumption of the container?

After evaluating the options, Lao Ge chose UYUN Monitor to monitor the production containers. Deploying an agent on each Docker host provides dynamic monitoring of all containers on that host, as shown in the figure:

[Figure: http://s3.51cto.com/wyfs02/M00/89/BA/wKiom1ga17HxD47xAAK3ZCYU0jU163.jpg]

The monitoring agent collects data through the Docker daemon API (in essence, Docker's container management API plus cgroup resource statistics). The monitored metrics include: number of running containers (count), number of stopped containers (count), container CPU utilization (%), container memory utilization (%), container disk read rate (B/s), container disk write rate (B/s), container file system size (B), container file change size (B), container network send rate (B/s), and container network receive rate (B/s).
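To make the utilization metrics concrete, here is a minimal sketch of how CPU and memory percentages can be derived from one container stats sample. It assumes the sample has the shape returned by the Docker Engine API's container stats endpoint (fields such as cpu_stats, precpu_stats, and memory_stats). The helper names cpu_percent and memory_percent and the numbers in SAMPLE are made up for illustration; this is not the agent's actual implementation, and the memory figure is a simplification that does not subtract page cache.

```python
"""Derive CPU and memory utilization from one Docker stats sample (sketch)."""


def cpu_percent(stats: dict) -> float:
    """CPU utilization (%) from the current and previous counters in a sample."""
    cpu = stats["cpu_stats"]
    pre = stats["precpu_stats"]
    # Both counters are cumulative, so utilization is a ratio of deltas.
    cpu_delta = cpu["cpu_usage"]["total_usage"] - pre["cpu_usage"]["total_usage"]
    system_delta = cpu["system_cpu_usage"] - pre["system_cpu_usage"]
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0
    return cpu_delta / system_delta * cpu.get("online_cpus", 1) * 100.0


def memory_percent(stats: dict) -> float:
    """Memory utilization (%) as usage divided by the container's limit."""
    mem = stats["memory_stats"]
    limit = mem.get("limit", 0)
    if not limit:
        return 0.0
    return mem.get("usage", 0) / limit * 100.0


# Hand-written sample shaped like a stats response; the values are invented.
SAMPLE = {
    "cpu_stats": {
        "cpu_usage": {"total_usage": 400_000_000},
        "system_cpu_usage": 10_000_000_000,
        "online_cpus": 2,
    },
    "precpu_stats": {
        "cpu_usage": {"total_usage": 200_000_000},
        "system_cpu_usage": 9_000_000_000,
    },
    "memory_stats": {"usage": 268_435_456, "limit": 1_073_741_824},
}

if __name__ == "__main__":
    print(f"cpu {cpu_percent(SAMPLE):.1f}%  mem {memory_percent(SAMPLE):.1f}%")
```

The rate metrics in the list (disk and network B/s) follow the same pattern: take two samples of a cumulative byte counter and divide the difference by the time between them.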

The Monitor agent also monitors the operating system itself. Both OS and Docker metrics can be collected at second-level granularity, which fully meets the team's system monitoring requirements.

Next, the team faced:

Question 2: How can I monitor the application in the container?

With the Monitor agent, specific collection plugins can be enabled for each containerized application to gather metrics particular to that application. As the figure under Question 1 shows, the agent can reach a given container's application over its network port to monitor application availability and performance metrics.
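The simplest form of such a port-based check is a TCP connection attempt that reports availability and connect latency. The sketch below illustrates the idea with the Python standard library only. The function name check_port and the shape of the returned dictionary are assumptions for illustration, not the Monitor agent's plugin interface.

```python
import socket
import time


def check_port(host: str, port: int, timeout: float = 2.0) -> dict:
    """Try a TCP connection and report availability and connect latency."""
    start = time.monotonic()
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            latency_ms = (time.monotonic() - start) * 1000.0
            return {"available": True, "latency_ms": latency_ms}
    except OSError:
        return {"available": False, "latency_ms": None}


if __name__ == "__main__":
    # Hypothetical target: a Jetty container published on local port 8080.
    print(check_port("127.0.0.1", 8080))
```

A real application plugin would go further, for example by issuing an HTTP request or a status query over the same port, but the availability signal comes from the same connect step.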

UYUN Monitor supports a wide range of traditional and Internet-era resources:

[Figure: http://s2.51cto.com/wyfs02/M00/89/B8/wKioL1ga177zchYMAAIgCYl-Oh4906.jpg]

With these two problems solved, the work quickly got on track. But an application is a living thing and keeps evolving. As the number of instances grew and the team adopted container orchestration, the application began to scale elastically, and a new problem emerged right away:

Question 3: How do I monitor new containers under rapid change?

After talking with the UYUN technical team, Lao Ge's team added a container-change trigger script to each Docker host. Because the Monitor agent's configuration is easy to automate, the script generates the monitoring configuration for each new container automatically, so applications in newly added containers are monitored without manual work.

[Figure: http://s4.51cto.com/wyfs02/M01/89/B8/wKioL1ga19WhKhV_AAJohDyCrDo904.jpg]
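The article does not show the trigger script itself, so the following is only a sketch of the idea under stated assumptions. It takes one JSON line as emitted by `docker events --format '{{json .}}'` and turns a container start event into a monitoring configuration entry. The configuration layout, the function names, and the container labels app and monitor.port are all hypothetical and are not UYUN Monitor's real format.

```python
import json
from typing import Optional


def config_for_event(event: dict) -> Optional[dict]:
    """Build a (hypothetical) monitoring config for a container start event."""
    if event.get("Type") != "container" or event.get("Action") != "start":
        return None  # Ignore image, network, stop, die, and other events.
    actor = event.get("Actor", {})
    attrs = actor.get("Attributes", {})  # Includes name, image, and labels.
    config = {
        "container_id": actor.get("ID", "")[:12],
        "name": attrs.get("name", ""),
        "image": attrs.get("image", ""),
        "tags": {},
        "checks": [{"type": "container"}],
    }
    # Hypothetical label: which application this container belongs to.
    if "app" in attrs:
        config["tags"]["App"] = attrs["app"]
    # Hypothetical label: which port an application check should probe.
    port = attrs.get("monitor.port", "")
    if port.isdigit():
        config["checks"].append({"type": "tcp", "port": int(port)})
    return config


def handle_event_line(line: str) -> Optional[dict]:
    """Parse one JSON event line and return a config entry, if any."""
    return config_for_event(json.loads(line))


if __name__ == "__main__":
    sample = (
        '{"Type":"container","Action":"start","Actor":{"ID":"4f5e6d7c8b9a0123",'
        '"Attributes":{"name":"jetty-7","image":"shop/portal:1.4",'
        '"app":"shop.portal","monitor.port":"8080"}}}'
    )
    print(json.dumps(handle_event_line(sample), indent=2))
```

In practice such a script would read the event stream line by line, write each generated entry where the agent picks up its configuration, and remove the entry again when the container stops.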

Finally, because many applications are built from distributed microservices, the same microservice runs as multiple instances across the network. Monitoring each instance as a separate unit is no longer enough, because a single instance's metric does not represent the performance of the whole application. So the team ran into a monitoring visualization problem:

Question 4: How can I visualize the overall performance metrics for distributed applications?

For example, the production application currently has 6 Jetty microservices, and the number will keep growing. How can the team confirm that load is balanced across all of the business Jetty services?

With agent-based monitoring, a "source tag" can be attached to the metric data. The team set the tag "App=shop.portal" on the 6 Jetty services and on any Jetty service that is added automatically later.

Then, in UYUN Monitor's dashboards, data can be selected by tag and combined with data aggregation formulas and a rich set of charts to visualize traffic trends, traffic totals, load rankings, resource consumption, and so on for these 6 Jetty services, as shown in the figure:

[Figure: http://s1.51cto.com/wyfs02/M02/89/BA/wKiom1ga196xw9jjAAI4tHLjIBs469.jpg]
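Tag-based aggregation of this kind can be sketched in a few lines. The example below assumes each metric sample carries its source name, a tag dictionary, and a value (requests per second here). The sample layout, the function names, and the numbers are invented for illustration and do not reflect the product's data model or formula syntax.

```python
from statistics import mean


def select_by_tag(samples: list, key: str, value: str) -> list:
    """Keep only samples whose tags contain key=value."""
    return [s for s in samples if s["tags"].get(key) == value]


def total(samples: list) -> float:
    """Sum of the metric across all selected instances."""
    return sum(s["value"] for s in samples)


def ranking(samples: list) -> list:
    """(source, value) pairs ordered from most to least loaded."""
    pairs = [(s["source"], s["value"]) for s in samples]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


def imbalance(samples: list) -> float:
    """Busiest instance relative to the average; 1.0 means perfectly even."""
    values = [s["value"] for s in samples]
    return max(values) / mean(values)


# Invented samples: requests per second reported by each instance.
SAMPLES = [
    {"source": "jetty-1", "tags": {"App": "shop.portal"}, "value": 120.0},
    {"source": "jetty-2", "tags": {"App": "shop.portal"}, "value": 80.0},
    {"source": "jetty-3", "tags": {"App": "shop.portal"}, "value": 100.0},
    {"source": "nginx-1", "tags": {"App": "shop.gateway"}, "value": 500.0},
]

if __name__ == "__main__":
    portal = select_by_tag(SAMPLES, "App", "shop.portal")
    print("total:", total(portal))
    print("ranking:", ranking(portal))
    print("imbalance:", round(imbalance(portal), 2))
```

Because selection is by tag rather than by instance name, a Jetty container that is added later and carries the same tag is included in the totals and rankings without any change to the dashboard.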

Similar questions include: "How many HTTP connection sessions are there in total across the Nginx services in a cluster?", "What is the current volume of transactions successfully processed by all nodes in the cluster?", and "What is the CPU utilization ranking of all nodes in the cluster?"

From the container monitoring work of Lao Ge's team, we found that UYUN Monitor natively supports monitoring of containers and of the applications inside them, and that it adapts flexibly to elastic container scaling so that container monitoring stays automated. Monitor also offers strong data aggregation and visualization, which frees operations staff from wading through individual metrics and lets them monitor the application as a whole and keep the global picture in view.

Author Profile:

Yu Junwei

A senior expert in IT operations and product director at UYUN Software, with 10 years of hands-on operations experience;

He has developed network management, system management, CMDB, ITSM, and other products, and has successfully delivered a number of national-level network operations management projects;

The products whose development he led are widely used in more than 20 industries, including customs, taxation, public security, social security, banking, insurance, and energy.

This article is from the "UYUN Dual-Mode Operations" blog. Please keep this source attribution: http://uyunopss.blog.51cto.com/12240346/1868929
