[Dry goods] Demystifying Monitoring Treasure: how Docker monitoring is implemented
Speaker: Neeke, senior architect at CloudWise, member of the PHP development team, and author of PECL/SeasLog. He has 8 years of R&D management experience, has worked on R&D architecture for large-scale enterprise informatization, and has been involved in Internet digital marketing for more than 9 years, with in-depth research on architecture and performance optimization. Since joining CloudWise in 2014, he has focused on the architecture and R&D of APM products. He advocates agility, efficiency, and Getting Real.
In September 2015, CloudWise, a provider of enterprise-level application performance monitoring and management services, officially launched its Docker monitoring function. It monitors the CPU, memory, network traffic, and swap status of Docker containers in real time, so that developers and O&M personnel can clearly see how many resources they consume when using Docker.
CloudWise is the first SaaS provider in China to implement Docker monitoring. What are the technical principles behind it? How does it compare with Docker monitoring products from abroad? The following is a record of Neeke's talk:
1. Docker monitoring overview
In the cloud era, a large number of physical machines still support services directly. Compared with virtualization, this approach is outdated, and the various open source container technologies have greatly advanced virtualization technology.
Compared with other container technologies, Docker is relatively new and is developing the fastest. Needless to say, it has a big backer in Google. Several start-ups with Docker as their core technology have also emerged in China, such as some of CloudWise's partners and DaoCloud, all of which are promising companies.
Despite all the hype, Docker O&M has always been a pain point.
It can be said that currently only two APM vendors in the world provide SaaS-based Docker O&M monitoring. One is the US APM vendor New Relic, which officially released Docker monitoring in late June. The other is CloudWise, a Chinese APM vendor, which launched Docker monitoring on September 7, following New Relic. In a sense, CloudWise fills the gap in SaaS-based Docker monitoring in China.
2. Working Principle of Docker monitoring
As we all know, CloudWise was the first in the APM field to propose an end-to-end integrated monitoring model. Under this model, it released SmartAgent, a software architecture that is technically advanced and easy to deploy and manage. This Docker monitoring implementation is also based on the SmartAgent architecture.
SmartAgent is fast, efficient, and intelligent, and the entire deployment can be completed within two minutes. Deployment has two parts. First, download, decompress, and start the data sending proxy, SendProxy. SendProxy provides an efficient local data receiving queue and data sending engine, and it can be deployed inside a LAN, so that monitoring data from machines that cannot access the Internet can still be transmitted efficiently to the CloudWise SaaS platform through SendProxy. Second, download, decompress, and start the DockerAgent.
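As a rough illustration of the "local receiving queue plus sending engine" idea, the following Python sketch buffers incoming records in a bounded queue and forwards them in batches from a background thread. The class name SendProxySketch, its parameters, and the send_batch callable are all hypothetical; the talk does not describe SendProxy's actual protocol or internals.

```python
import queue
import threading


class SendProxySketch:
    """Hypothetical stand-in for SendProxy: a local queue plus an async sender."""

    def __init__(self, send_batch, batch_size=10, max_queue=1000):
        self._send_batch = send_batch  # assumed callable that uploads a list of records
        self._batch_size = batch_size
        self._queue = queue.Queue(maxsize=max_queue)
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._worker.start()

    def receive(self, record):
        """Called by agents; returns False instead of blocking when the queue is full."""
        try:
            self._queue.put_nowait(record)
            return True
        except queue.Full:
            return False

    def _drain(self):
        """Take up to batch_size records that are already waiting in the queue."""
        batch = []
        while len(batch) < self._batch_size:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch

    def _run(self):
        # Keep going until a stop was requested and the queue has been flushed.
        while not self._stop.is_set() or not self._queue.empty():
            batch = self._drain()
            if batch:
                self._send_batch(batch)
            else:
                self._stop.wait(0.05)  # idle briefly when there is nothing to send

    def stop(self):
        """Ask the worker to flush what is left and wait for it to finish."""
        self._stop.set()
        self._worker.join()
```

Because receive never blocks, a slow or unreachable upstream only fills the local queue rather than stalling the agents that produce the data.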
DockerAgent is developed and compiled using Python. Currently, Ubuntu and CentOS are supported. The DockerAgent complies with the SmartAgent plug-in specifications, so you can use the agent directly whether you are using Monitoring Treasure or Transparent View.
The DockerAgent has three threads (DockerProcess, DockerConfig, and DockerPing) and one object, Task. The three threads each do their own job and are controlled by the Task object. The core attributes of a Task are its unique identifier, its status, and its frequency. DockerConfig regularly synchronizes these attributes with the CloudWise cloud platform.
When the task status is normal, the DockerProcess thread collects data at the frequency the Task specifies. DockerPing is responsible for heartbeat detection and regularly generates heartbeat data. All of this data is handed by the DockerAgent to SendProxy, which stores it in its queue and pushes it asynchronously to the CloudWise cloud platform.
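The thread layout described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the Task fields (identifier, status, frequency) come from the talk, but the Worker class, the build_agent helper, the fetch_config, collect, and send callables, and the choice to let heartbeats continue while a task is not "normal" are all hypothetical.

```python
import threading
import time


class Task:
    """Shared task state: identifier, status, and collection frequency in seconds."""

    def __init__(self, task_id, status="normal", frequency=1.0):
        self._lock = threading.Lock()
        self.task_id = task_id
        self._status = status
        self._frequency = frequency

    def update(self, status=None, frequency=None):
        with self._lock:
            if status is not None:
                self._status = status
            if frequency is not None:
                self._frequency = frequency

    def snapshot(self):
        with self._lock:
            return self._status, self._frequency


class Worker(threading.Thread):
    """Runs job(task) repeatedly, sleeping for the task frequency between runs."""

    def __init__(self, task, job, needs_normal_status=True):
        super().__init__(daemon=True)
        self.task = task
        self.job = job
        self.needs_normal_status = needs_normal_status
        self.stop_event = threading.Event()

    def run(self):
        while not self.stop_event.is_set():
            status, frequency = self.task.snapshot()
            if status == "normal" or not self.needs_normal_status:
                self.job(self.task)
            self.stop_event.wait(frequency)

    def stop(self):
        self.stop_event.set()
        self.join()


def build_agent(task, fetch_config, collect, send):
    """Wire up three workers analogous to DockerConfig, DockerProcess, DockerPing."""

    def config_job(t):
        # fetch_config is assumed to return a dict with "status" and/or "frequency".
        t.update(**fetch_config(t.task_id))

    def process_job(t):
        send({"type": "metrics", "task": t.task_id, "data": collect()})

    def ping_job(t):
        send({"type": "heartbeat", "task": t.task_id, "ts": time.time()})

    return [
        Worker(task, config_job, needs_normal_status=False),
        Worker(task, process_job),  # only collects while the status is "normal"
        Worker(task, ping_job, needs_normal_status=False),
    ]
```

Here send would hand each record to something like SendProxy, and only the configuration worker ever writes to the Task, which keeps the control path separate from the data path.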
As mentioned earlier, the DockerAgent complies with the SmartAgent plug-in specifications. Like other plug-ins, it therefore contains directories such as bin, conf, lib, and log, and has a startup script that provides commands such as start, stop, and status.
That is the introduction to the DockerAgent. Later, the architecture and plug-in specifications of SmartAgent will be released as open source, and anyone keen on open source and monitoring will be able to participate directly.
3. DockerAgent data collection principles
Next, let's talk about how the DockerAgent collects data. The DockerAgent first uses the docker info command to obtain Docker system information, which contains useful data such as Containers, Images, Name, CPUs, Data Space Used, Data Space Total, and Total Memory.
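A minimal sketch of this first step might look like the following. It assumes the plain "Key: Value" text layout that docker info prints; the function names and the WANTED tuple are illustrative, and which fields actually appear depends on the Docker version and storage driver.

```python
import subprocess

# Field names listed in the talk; not every Docker setup reports all of them.
WANTED = ("Containers", "Images", "Name", "CPUs",
          "Data Space Used", "Data Space Total", "Total Memory")


def parse_docker_info(text, wanted=WANTED):
    """Parse `Key: Value` lines from `docker info` output, keeping selected keys."""
    info = {}
    for line in text.splitlines():
        # Nested fields are indented, so strip before splitting on the first colon.
        key, sep, value = line.strip().partition(":")
        if sep and key in wanted:
            info[key] = value.strip()
    return info


def collect_docker_info():
    """Run `docker info` and return the selected fields (needs a local Docker daemon)."""
    result = subprocess.run(["docker", "info"], capture_output=True, text=True, check=True)
    return parse_docker_info(result.stdout)
```

Keeping the parser separate from the subprocess call makes it possible to check the parsing logic against captured output without a running daemon.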
This data seems simple and basic, but it saves Docker O&M personnel from repeating the same checks many times a day. Next, the DockerAgent uses docker version to check the Docker version. Currently, our DockerAgent only supports Docker versions later than 1.15.
Then, it uses the docker ps command to obtain information about running containers, including the container ID and container name. This tells it which Docker containers are running on the machine.
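The container listing step can be sketched as below. This assumes a Docker release whose docker ps supports the --format option with Go templates; the talk does not say how the DockerAgent actually parses the output, and the function names here are hypothetical.

```python
import subprocess


def parse_ps_output(text):
    """Turn tab-separated `ID<TAB>NAME` lines into a list of dicts."""
    containers = []
    for line in text.splitlines():
        if not line.strip():
            continue
        container_id, _, name = line.partition("\t")
        containers.append({"id": container_id.strip(), "name": name.strip()})
    return containers


def list_running_containers():
    """List running containers with full IDs (needs a local Docker daemon)."""
    cmd = ["docker", "ps", "--no-trunc", "--format", "{{.ID}}\t{{.Names}}"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_ps_output(result.stdout)
```

The full, untruncated container ID matters for the next step, because per-container cgroup directories are typically named after it.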
Finally, it obtains the performance indicators of these containers one by one. Some indicators are obtained through Docker's native interfaces and some through CloudWise's own algorithms. They include the system time zone and time of the container and the host; the container's CPU usage (obtained from the container's cpuacct.stat under cgroup/cpuacct); the container's IP address; the number of processes running in the container; container memory metrics such as rss, cache, memory_limit, and total_swap (obtained from the container's memory.stat under cgroup/memory); and container network indicators (obtained through ifconfig/statistics).
After the release of the DockerAgent, we received feedback from many enthusiastic users on the very first day. Much of it was very good, and we are actively taking it on board and making improvements. We believe that within a short time we will iterate toward a DockerAgent that is better, more stable, and more in line with user expectations, so that it not only fills the gap in domestic Docker monitoring but truly becomes a partner for Docker users and enterprises in solving the O&M, monitoring, and management problems that give everyone a headache.
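To make the cgroup-based part of the collection more concrete, here is a small sketch of reading cpuacct.stat and memory.stat for one container. It assumes a cgroup v1 hierarchy mounted at /sys/fs/cgroup with containers under a docker/ subdirectory; that path layout varies by distribution and cgroup driver. The function names and the default of 100 ticks per second are assumptions, and the talk does not describe CloudWise's own algorithms.

```python
import os


def parse_stat(text):
    """Parse cgroup `key value` lines (cpuacct.stat, memory.stat) into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def cpu_percent(before, after, interval_seconds, ticks_per_second=100):
    """CPU usage between two cpuacct.stat samples, as a percentage of one core.

    cpuacct.stat reports user and system time in clock ticks; on a real host
    os.sysconf("SC_CLK_TCK") gives the tick rate (100 is only a common default).
    """
    used_ticks = (after["user"] + after["system"]) - (before["user"] + before["system"])
    return 100.0 * used_ticks / (ticks_per_second * interval_seconds)


def read_container_stats(container_id, cgroup_root="/sys/fs/cgroup"):
    """Read raw stats for one container; the directory layout is an assumption."""
    stats = {}
    for controller, filename in (("cpuacct", "cpuacct.stat"), ("memory", "memory.stat")):
        path = os.path.join(cgroup_root, controller, "docker", container_id, filename)
        with open(path) as handle:
            stats[filename] = parse_stat(handle.read())
    return stats
```

CPU usage needs two samples because cpuacct.stat holds cumulative counters, whereas memory values such as rss and cache in memory.stat can be read directly from a single sample.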
Q: How does your Docker monitoring differ from Datadog, and what are its advantages?
A: Installing and deploying Datadog is too cumbersome. When we tried it, it took a whole afternoon to get data flowing. Datadog's charts can be defined fairly freely, which is a strength. The biggest advantage of our Docker monitoring is zero-infrastructure deployment. Datadog is also too expensive: it seems that one Agent costs close to 100 RMB. Currently, the CloudWise DockerAgent is free of charge.
Q: You mentioned that DockerConfig synchronizes with the cloud platform on a regular basis. Does it also synchronize the data collected by DockerProcess and DockerPing?
A: No. DockerConfig synchronizes configuration; it is not used to synchronize the collected data.
Q: What I mean is that SendProxy is what connects asynchronously to the cloud platform. So what is the role of DockerConfig?
A: DockerConfig periodically retrieves configuration information from the cloud platform. The collected data is sent to SendProxy by DockerProcess and DockerPing. What DockerConfig synchronizes is the attributes of the Task, such as the Task name, Task frequency, and Task status.
Q: So data is collected by running the ps command for the Docker containers on the machine and then using docker info to obtain their metrics?
A: docker info returns the overall Docker metrics for the current machine. Then docker ps gets the containers that are running, and their individual metrics are obtained one by one.
Q: Is the output of the ps command used directly? In other words, does ps return not only the active Docker containers but also their metrics?
A: ps cannot obtain metrics. It finds the active containers and lists them, along with their container names. Their metrics are then obtained by other methods.
That concludes Neeke's talk on how cloudmonitor implements Docker monitoring. You can register for a free trial of cloudmonitor. Please contact us if you have any questions or needs.