Flume + Kafka in practice: collecting distributed application logs from Docker containers

1 Background and problem
With the advent of cloud computing, PaaS platforms, virtualization, and containerization technologies such as Docker, more and more services are deployed in the cloud. We usually need logs for monitoring, analysis, prediction, statistics, and similar work, but a cloud service no longer runs on fixed physical resources, which makes its logs harder to reach: where we could once log in over SSH or fetch files over FTP, access is no longer so easy, yet logs are exactly what engineers desperately need. The most typical scenario is a release: on a GUI-driven PaaS platform everything is done with mouse clicks, but we still want to run tail -f, grep, and similar commands against the logs to judge whether the release succeeded. A mature PaaS platform would do this work for us, but there are always ad-hoc requirements the platform cannot satisfy, so we need the logs themselves. This article presents a method for centrally collecting the scattered logs of containerized services in a distributed environment.

2 Design constraints and requirements
Before doing any design, you need to be clear about the application scenario, the functional requirements, and the non-functional requirements.

2.1 Application Scenarios
The distributed environment contains several hundred servers producing logs. A single log entry is under 1 KB and never exceeds 50 KB, and the total log volume is under 500 GB per day.

2.2 Functional Requirements
1) Collect the logs of all services centrally.
2) Distinguish logs by their source and split them at service, module, and day granularity.

2.3 Non-functional requirements
1) Do not intrude on the service process: log collection must be deployed independently, and its system resource usage must be controllable.
2) Real time, low latency: no more than 4 s from the moment a log entry is produced until it lands in central storage.
3) Persist the logs of the last N days.
4) Deliver logs on a best-effort basis: losses are tolerated, but the loss rate must not exceed a threshold (for example, one in 10,000).
5) Delivery that is not strictly ordered can be tolerated.
6) Log collection is an offline service with modest availability requirements; three nines (99.9%) over the year is sufficient.

3 Implementation architecture
The implementation architecture is shown in the following figure:

[Figure: producer layer (Flume NG agents in Docker containers) → broker layer (Kafka cluster) → consumer layer (Flume NG agent) → centralized file storage]

3.1 Producer layer analysis
Services on the PaaS platform are assumed to be deployed inside Docker containers, so to satisfy the non-functional requirements, a separate process collects the logs; nothing intrudes on the service framework or process. Flume NG is used for log collection. This open-source component is very capable and can be seen as a model that monitors for increments, produces them, and lets them be published and consumed: the source is the increment source, the channel is the buffering channel (a memory queue is used as the buffer here), and the sink is the outlet where data is consumed. Inside the container, the source executes the tail -f command and reads the incremental log from its standard output on Linux; the sink is a Kafka implementation that pushes messages to the distributed message middleware.
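
To make this concrete, a minimal producer-side agent configuration might look like the following. This is a sketch assuming Flume NG 1.7+ with the built-in Kafka sink; the log path, topic name, and broker addresses are placeholders, not the author's originals.

# flume-producer.properties - the agent inside each Docker container
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Exec source: tail the service log and read increments from stdout
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /app/logs/service.log
a1.sources.r1.channels = c1

# Memory channel as the in-process buffer
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 1000

# Kafka sink: publish log events to the message middleware
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.kafka.bootstrap.servers = kafka1:9092,kafka2:9092
a1.sinks.k1.kafka.topic = app-logs
a1.sinks.k1.channel = c1

The agent would then be started with something like: flume-ng agent -n a1 -c conf -f flume-producer.properties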

3.2 Broker layer analysis
With many containers on the PaaS platform, many Flume NG clients push messages to the Kafka message middleware. Kafka is a high-throughput, high-performance message middleware: within a single partition it works by sequential writes, and it supports random reads at a given offset, which makes it ideal for implementing a topic-based publish/subscribe model. There are multiple Kafka nodes in the diagram because Kafka supports clustering; the Flume NG client inside a container can connect to several Kafka brokers to publish logs, which can also be understood as connecting to several partitions of a topic. This enables high throughput in two ways: Flume NG can batch messages internally before sending them, relieving QPS pressure, and writes are spread across multiple partitions. Kafka also takes a configured number of replicas, so a write to a leader must also be written to N backups; here the replication factor is set to 2 rather than the 3 that is common in distributed systems, trading a little redundancy for the high-concurrency behavior the non-functional requirements call for.
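
A topic matching this layout could be created roughly as follows. This is a sketch assuming a ZooKeeper-managed Kafka deployment of that era; the host name, topic name, and partition count are placeholders.

kafka-topics.sh --create \
  --zookeeper zk1:2181 \
  --topic app-logs \
  --partitions 4 \
  --replication-factor 2

The replication factor of 2 here is the trade-off described above: one backup per partition instead of the usual two.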

3.3 Consumer layer analysis
The consumer of the Kafka increments is also a Flume NG agent, which again shows the component's strength: it can attach to arbitrary data sources through pluggable implementations, with only a small amount of configuration needed. Here a Kafka source subscribes to the topic, the collected logs again go into a memory channel first, and a file sink then writes them out to files. To satisfy functional requirement 2), distinguishing the source and splitting by service, module, and day granularity, I implemented a sink myself, called RollingByTypeAndDayFileSink. The source code is on GitHub; you can download the jar from that page and drop it directly into the Flume lib directory.
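
The consumer-side agent can be wired up roughly as follows. This is a sketch assuming Flume 1.7+; since the configuration keys of the custom RollingByTypeAndDayFileSink are not documented here, the stock file_roll sink stands in for it, and the topic and output directory are placeholders.

# flume-consumer.properties - the central collection agent
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Kafka source: subscribe to the log topic
a1.sources.r1.type = org.apache.flume.source.kafka.KafkaSource
a1.sources.r1.kafka.bootstrap.servers = kafka1:9092,kafka2:9092
a1.sources.r1.kafka.topics = app-logs
a1.sources.r1.channels = c1

# Memory channel buffer, as on the producer side
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# Stock rolling file sink; the custom RollingByTypeAndDayFileSink jar
# would go into the Flume lib directory and be referenced here by its
# fully qualified class name instead of file_roll
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory = /data/logs/collected
a1.sinks.k1.channel = c1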

4 Practice Methods
4.1 In-container configuration
Dockerfile
The Dockerfile is the script that drives what runs inside the container; it contains many Docker directives. Below is the skeleton of a typical Dockerfile. base_image is an image that already carries the runnable program and the Flume binaries; the most important part is the ENTRYPOINT, which mainly uses Supervisord to keep the processes inside the container highly available.

FROM ${base_image}
MAINTAINER ${maintainer}
ENV REFRESH_AT ${refresh_at}
# ... (further directives did not survive in the source)
# The ENTRYPOINT launches Supervisord, as described above;
# the exact binary path is an assumption
ENTRYPOINT ["/usr/bin/supervisord"]
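
Supervisord itself reads an ini-style configuration. A minimal sketch of a file that keeps both the service process and the in-container Flume agent alive might look like this; the program names, paths, and command lines are assumptions for illustration, not the author's original file.

; /etc/supervisord.conf - hypothetical sketch
[supervisord]
nodaemon=true

; the service process (assumed start script)
[program:app]
command=/app/bin/start.sh
autorestart=true

; the log-collecting Flume agent described in section 3.1
[program:flume]
command=/opt/flume/bin/flume-ng agent -n a1 -c /opt/flume/conf -f /opt/flume/conf/flume-producer.properties
autorestart=true

With nodaemon=true, Supervisord stays in the foreground as the container's PID 1 and restarts either child process if it exits.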
