Flume-NG Built-in Counters (Monitoring): A Source-Code-Level Analysis


How is Flume's built-in monitoring put together? Many people have asked this question. At present you can view the statistics through graphical tools such as Cloudera Manager or Ganglia, fetch them as a JSON string in a browser, or report them to another monitoring system yourself. What does the monitoring information contain? The statistics of each component: the number of events successfully received, the number of events successfully sent, the number of transactions processed, and so on. Different components use different counters for their statistics.

As of version 1.5 only the three major component types are counted, namely source, sink, and channel, via SourceCounter, SinkCounter, and ChannelCounter respectively. The statistical items of these three counters are fixed; you cannot define items of your own. There are also ChannelProcessorCounter and SinkProcessorCounter, but no statistics are recorded in them yet, so for now they are only decoration. You may also notice that some built-in components use CounterGroup for their statistics. CounterGroup does let you define your own items, but unfortunately (as of version 1.5) that information cannot be used for monitoring, because CounterGroup is a standalone class that does not extend the abstract class MonitoredCounterGroup. Components that only use CounterGroup therefore show no data in monitoring, and which components rely on CounterGroup varies between versions. Below we focus on SourceCounter, SinkCounter, and ChannelCounter.

All of Flume-NG's statistics and monitoring related classes live in three packages: org.apache.flume.instrumentation, org.apache.flume.instrumentation.http, and org.apache.flume.instrumentation.util.

As mentioned above, MonitoredCounterGroup tracks the internal statistical metrics: it registers the component's MBean and tracks and updates the statistical values. Every component that is to be monitored must extend this class. The class can describe every component type in Flume, but currently only three are actually implemented. Its most important methods are as follows:

(1) The constructor MonitoredCounterGroup(Type type, String name, String... attrs). It sets the component type and name, adds every attr (these are the statistical items) to the Map<String, AtomicLong> counterMap with an initial value of 0, and also initializes the counter's start time and stop time to 0.

(2) The start() method first registers the counter, then resets every statistical item to 0 and sets the start time to the current time.

(3) The register() method: if the counter has not been registered yet, it registers the counter's MBean so that the counter can be tracked.

(4) The stop() method sets the stop time to the current time and prints the value of every statistical item. When we press Ctrl+C to end the process, the final statistics come from here.

The other methods get or update values in counterMap and are relatively simple; a simplified sketch follows.
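To make the mechanics concrete, here is a heavily simplified sketch of what such a counter group does. It is written from the description above, not copied from the Flume class, so the structure is only indicative:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Simplified sketch of the mechanics described above; not the verbatim Flume class.
public abstract class SimplifiedCounterGroup {

  private final Map<String, AtomicLong> counterMap = new HashMap<String, AtomicLong>();
  private final AtomicLong startTime = new AtomicLong(0L);
  private final AtomicLong stopTime = new AtomicLong(0L);

  protected SimplifiedCounterGroup(String... attrs) {
    for (String attr : attrs) {
      counterMap.put(attr, new AtomicLong(0L)); // every statistical item starts at 0
    }
  }

  public void start() {
    // the real class registers the MBean here (register()) before resetting
    for (AtomicLong counter : counterMap.values()) {
      counter.set(0L);
    }
    startTime.set(System.currentTimeMillis());
  }

  public void stop() {
    stopTime.set(System.currentTimeMillis());
    // the real class logs every counter value here, which is what you see on Ctrl+C
  }

  protected long increment(String counter) {
    return counterMap.get(counter).incrementAndGet();
  }

  protected long get(String counter) {
    return counterMap.get(counter).get();
  }
}
```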

Next let's take a look at the various statistical items and their meanings in the three components:

1. SourceCounter extends MonitoredCounterGroup. Its main statistical items are:

(1) "src.events.received": the number of events received by the source;

(2) "src.events.accepted": the number of events successfully processed by the source. The difference from the above is that an event may be received and still fail during processing;

(3) "src.append.received": the number of times append was called; used in AvroSource and ThriftSource;

(4) "src.append.accepted": the number of append calls processed successfully;

(5) "src.append-batch.received": the number of times appendBatch was called; used in AvroSource and ThriftSource;

(6) "src.append-batch.accepted": the number of appendBatch calls processed successfully;

(7) "src.open-connection.count": used in AvroSource to record the number of open connections.

In general, a source mostly updates the first two items; the sketch below shows where.
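As an illustration, here is a hypothetical source that drives those first two items. The SourceCounter method names are the ones recalled from the Flume 1.x source, so treat them as assumptions and verify against your version:

```java
import org.apache.flume.ChannelException;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.EventDrivenSource;
import org.apache.flume.conf.Configurable;
import org.apache.flume.instrumentation.SourceCounter;
import org.apache.flume.source.AbstractSource;

// Hypothetical custom source showing the usual src.events.* bookkeeping.
public class CountingSource extends AbstractSource implements EventDrivenSource, Configurable {

  private SourceCounter sourceCounter;

  @Override
  public void configure(Context context) {
    if (sourceCounter == null) {
      sourceCounter = new SourceCounter(getName());
    }
  }

  @Override
  public synchronized void start() {
    super.start();
    sourceCounter.start();      // registers the MBean and resets the statistics
  }

  @Override
  public synchronized void stop() {
    sourceCounter.stop();       // records the stop time and logs the final values
    super.stop();
  }

  // called by whatever mechanism feeds this source with events
  public void onEvent(Event event) {
    sourceCounter.incrementEventReceivedCount();    // src.events.received
    try {
      getChannelProcessor().processEvent(event);
      sourceCounter.incrementEventAcceptedCount();  // src.events.accepted
    } catch (ChannelException ex) {
      // received but not accepted: only the first counter was bumped
    }
  }
}
```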

2. SinkCounter extends MonitoredCounterGroup. Its main statistical items are:

(1) "sink.connection.creation.count": the number of "connections" created, for example connecting to HBase, connecting to an AvroSource, or opening a file;

(2) "sink.connection.closed.count": the counterpart of the above, counting stop operations, destroyConnection calls, and file-close operations;

(3) "sink.connection.failed.count": the number of exceptions or failures involving the "connections" above;

(4) "sink.batch.empty": the number of batches in which 0 events were processed;

(5) "sink.batch.underflow": the number of batches in which the number of processed events was between 0 and the configured batchSize;

(6) "sink.batch.complete": the number of batches in which the number of processed events equaled the configured batchSize;

(7) "sink.event.drain.attempt": the number of events the sink has attempted to process;

(8) "sink.event.drain.sucess": the number of events processed successfully. The difference from the above is that an attempt is counted before the outcome is known. The sketch below shows where a sink typically updates these items.

3. ChannelCounter extends MonitoredCounterGroup. Its main statistical items are:

(1) "channel.current.size": the number of events currently in the channel;

(2) "channel.event.put.attempt": the number of events the source's put operation has attempted to write within channel transactions;

(3) "channel.event.take.attempt": the number of events the sink's take operation has attempted to read within channel transactions;

(4) "channel.event.put.success": the number of events successfully put within channel transactions;

(5) "channel.event.take.success": the number of events successfully taken within channel transactions;

(6) "channel.capacity": the channel capacity, set in the channel's start() method. The sketch below shows where a channel updates these items.

The statistical items above are fixed; the components update the corresponding values as they run, and by watching how those values change in monitoring we can learn about the running state of the Flume process. For example, the channel's current size tells you the relative processing speed of the source and the sink, and the success and failure counts of a source or sink tell you how that component is doing.

Of course, some people will want statistical items of their own when writing custom components, items that are not among those of the three counters above. What then? Define them yourself. As mentioned above, you must extend the abstract class MonitoredCounterGroup, define your own statistical items, put them into an array, and pass that array to the MonitoredCounterGroup constructor; then add value-updating methods to the custom counter. Finally, create the custom counter inside the custom component and call its start() method; all that is left is to update the item values. A sketch follows.
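Here is a minimal sketch of such a custom counter, assuming the MonitoredCounterGroup.Type enum and the protected increment()/get() helpers described above. The item names are made up, and a matching *MBean interface is included because Flume's built-in counters follow that pattern and the JMX registration done in start() needs it; in a real project the two public types go into separate files:

```java
import org.apache.flume.instrumentation.MonitoredCounterGroup;

// JMX needs a matching *MBean interface for the registration in start() to succeed
// (Flume's own counters define SourceCounterMBean and friends the same way).
public interface MyComponentCounterMBean {
  long getRecordsParsed();
  long getRecordsFailed();
}

// Hypothetical custom counter, built as the recipe above describes.
public class MyComponentCounter extends MonitoredCounterGroup implements MyComponentCounterMBean {

  private static final String COUNTER_RECORDS_PARSED = "my.records.parsed";
  private static final String COUNTER_RECORDS_FAILED = "my.records.failed";

  private static final String[] ATTRIBUTES = {
      COUNTER_RECORDS_PARSED, COUNTER_RECORDS_FAILED
  };

  public MyComponentCounter(String name) {
    // SOURCE is used here only as an example component type
    super(MonitoredCounterGroup.Type.SOURCE, name, ATTRIBUTES);
  }

  public long incrementRecordsParsed() {
    return increment(COUNTER_RECORDS_PARSED);
  }

  public long incrementRecordsFailed() {
    return increment(COUNTER_RECORDS_FAILED);
  }

  @Override
  public long getRecordsParsed() {
    return get(COUNTER_RECORDS_PARSED);
  }

  @Override
  public long getRecordsFailed() {
    return get(COUNTER_RECORDS_FAILED);
  }
}
```

In the custom component you then create this counter, call its start() in the component's start() method, call stop() when the component stops, and bump the items wherever the statistic changes.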

 

Another important thing is how the monitoring is actually exposed. There are two built-in reporters: HTTP (a JSON string) and Ganglia. The latter requires Ganglia to be installed. The former is very simple: just add -Dflume.monitoring.type=http -Dflume.monitoring.port=XXXX to the Flume startup command, where XXXX is the port you want to use. You can then open http://<flume-node-ip>:XXXX/metrics in a browser and refresh it to see the latest component statistics. For Ganglia, refer to the Ganglia installation and configuration guide.

What if you want to implement a reporter of your own that sends the information to another system? There are at least two approaches at present:

1. Use the HTTP reporter above: keep fetching the JSON string and parse the statistical metrics yourself; the rest is just deciding where to put them.

2. Implement a reporter similar to the HTTP one yourself. You must implement the org.apache.flume.instrumentation.MonitorService interface, which has only two methods, start() and stop(). The interface extends the Configurable interface, so it also has a configure(Context) method through which it can read configuration parameters.

Take HTTP as an example (the corresponding class is org.apache.flume.instrumentation.http.HTTPMetricsServer). Its start() method launches Jetty as the web server. The data handling is done by HTTPMetricsHandler, a subclass of Jetty's AbstractHandler; its handle(String target, HttpServletRequest request, HttpServletResponse response, int dispatch) method sets up the page format, calls JMXPollUtil.getAllMBeans() to obtain the Map<String, Map<String, String>> metricsMap made up of the MBeans registered by all components, then traverses metricsMap, converts it to JSON, and writes it to the web page. The stop() method does the cleanup work, here shutting down the Jetty server. It is that simple, so we can implement a reporter of our own: in the start() method, start a thread that traverses metricsMap every second, or writes it into MySQL, HBase, or anywhere else, as the sketch below does.
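Following that pattern, a minimal custom reporter could look roughly like this. It assumes MonitorService and JMXPollUtil.getAllMBeans() behave as described above; the class name and the pollInterval parameter are made up for this sketch:

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.flume.Context;
import org.apache.flume.instrumentation.MonitorService;
import org.apache.flume.instrumentation.util.JMXPollUtil;

// Minimal sketch of a custom reporter; replace the println with writes to
// MySQL, HBase, Kafka, or wherever the statistics should go.
public class LoggingMetricsServer implements MonitorService {

  private ScheduledExecutorService scheduler;
  private long pollIntervalSeconds = 1;

  @Override
  public void configure(Context context) {
    // the flume.monitoring.* startup parameters arrive here as the context
    pollIntervalSeconds = context.getLong("pollInterval", 1L);
  }

  @Override
  public void start() {
    scheduler = Executors.newSingleThreadScheduledExecutor();
    scheduler.scheduleAtFixedRate(new Runnable() {
      @Override
      public void run() {
        // component name -> (statistical item -> value as a string)
        Map<String, Map<String, String>> metrics = JMXPollUtil.getAllMBeans();
        System.out.println(metrics);
      }
    }, 0, pollIntervalSeconds, TimeUnit.SECONDS);
  }

  @Override
  public void stop() {
    if (scheduler != null) {
      scheduler.shutdown();
    }
  }
}
```

Packaging it and pointing flume.monitoring.type at the class is described next.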

You can then use your own counters in your custom components, package the counters, the monitoring class, and the custom components (source, sink, and channel) into a jar under lib, and add something like -Dflume.monitoring.type=AAAAA -Dflume.monitoring.node=BBBB to the startup command. Note that the flume.monitoring.type parameter appears to be mandatory, and its value is your own monitoring class (AAAAA here); the remaining parameters are optional and you can name them yourself, for example the IP address and port of your database server.

That completes the walkthrough. All of the above comes from reading the source code and has not all been put into practice yet; it is for reference only.
