[Flume] [Source Analysis] Data structure analysis of the Event in Flume


Objective

First, look at the definition of an event on the Flume official website.


A line of text content is deserialized into an event. (Serialization is the process of converting an object's state into a format that can be persisted or transmitted; deserialization is the reverse, turning a stream back into an object. Together the two processes make it easy to store and transfer data.) By default the maximum size of an event is 2048 bytes; if a line exceeds that, it is cut off and the remainder is put into the next event. The default encoding is UTF-8, applied uniformly.

However, this explanation describes how events are defined in the Avro deserialization system, and Flume NG does not use it for many of its events. So you only need to remember the event data structure; the explanation above can otherwise be ignored.


I. The event definition

public interface Event {

  /**
   * Returns a map of name-value pairs describing the data stored in the body.
   */
  public Map<String, String> getHeaders();

  /**
   * Set the event headers.
   * @param headers Map of headers to replace the current headers.
   */
  public void setHeaders(Map<String, String> headers);

  /**
   * Returns the raw byte array of the data contained in this event.
   */
  public byte[] getBody();

  /**
   * Sets the raw byte array of the data contained in this event.
   * @param body The data.
   */
  public void setBody(byte[] body);
}
Very simple data structure.

The header is a Map and the body is a byte array. The body holds the actual data we transmit and use; the header carries descriptive metadata about that data, which we do not write out as payload at the sink.
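To make the two fields concrete, here is a minimal sketch of building and inspecting an event (my own illustration; it assumes Flume's standard EventBuilder factory, which returns a SimpleEvent, and the class name EventDemo is made up):

import java.nio.charset.StandardCharsets;

import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;

public class EventDemo {
  public static void main(String[] args) {
    // Body: the raw payload bytes that travel from source to sink.
    Event event = EventBuilder.withBody("hello flume".getBytes(StandardCharsets.UTF_8));

    // Headers: a map of metadata; SimpleEvent starts it as an empty map.
    event.getHeaders().put("hostname", "node-01");

    System.out.println(new String(event.getBody(), StandardCharsets.UTF_8)); // hello flume
    System.out.println(event.getHeaders());                                  // {hostname=node-01}
  }
}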

II. How the event is produced and how it is diverted

// Each line read becomes one event; events are flushed downstream in batches.
while ((line = reader.readLine()) != null) {
  synchronized (eventList) {
    sourceCounter.incrementEventReceivedCount();
    eventList.add(EventBuilder.withBody(line.getBytes(charset)));
    if (eventList.size() >= bufferCount || timeout()) {
      flushEventBatch(eventList);
    }
  }
}

// EventBuilder.withBody: wraps the byte payload (and optional headers) in a SimpleEvent.
public static Event withBody(byte[] body, Map<String, String> headers) {
  Event event = new SimpleEvent();
  if (body == null) {
    body = new byte[0];
  }
  event.setBody(body);
  if (headers != null) {
    event.setHeaders(new HashMap<String, String>(headers));
  }
  return event;
}

Here the event body is simply wrapped: line is our real data content, which is converted into UTF-8 encoded bytes and placed into the body of the event; the headers argument is null.

The SimpleEvent class is the concrete Event implementation used here.

The point of the header is that, when the event object is created, we can set custom key-value pairs on it; this prepares for the channel multiplexing used later.
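As a small hedged illustration of that idea (the header key "state" and value passed in are chosen to match the selector configuration shown further below; the class and method names are made up):

import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;

public class HeaderTagging {
  // Wrap a line of text in an event whose header carries a routing key.
  static Event tagged(String line, String state) {
    Map<String, String> headers = new HashMap<String, String>();
    headers.put("state", state);  // custom key-value pair, e.g. "state" -> "CZ"
    return EventBuilder.withBody(line.getBytes(StandardCharsets.UTF_8), headers);
  }
}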

When the source end outputs events, different events are distinguished by their headers; on the channel/sink side we can then use a header key to route different events to their corresponding downstream sinks, which is how events are diverted. But there is a premise: it is not recommended to derive the header from the body of the event, because Flume behaves like a water pipe that does not process the water while it is inside the pipe; any processing should happen after the water has flowed out.

a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = host
a1.sources.r1.interceptors.i1.hostHeader = hostname
As above, host is the interceptor type (here the built-in host interceptor) and hostHeader names the custom key, so each event carries an identifying header when it is output; the multiplexing channel selector mode can then divert the flow.
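If the built-in interceptors are not enough, a custom interceptor can stamp whatever header key you need. The following is only a rough sketch against Flume's Interceptor API; the class name StateInterceptor and the hard-coded "state"/"CZ" values are illustrative assumptions, not anything shipped with Flume:

import java.util.List;

import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.interceptor.Interceptor;

public class StateInterceptor implements Interceptor {
  @Override
  public void initialize() { }

  @Override
  public Event intercept(Event event) {
    // Set the routing key from something other than the body,
    // e.g. static configuration or the originating host.
    event.getHeaders().put("state", "CZ");
    return event;
  }

  @Override
  public List<Event> intercept(List<Event> events) {
    for (Event e : events) {
      intercept(e);
    }
    return events;
  }

  @Override
  public void close() { }

  public static class Builder implements Interceptor.Builder {
    @Override
    public Interceptor build() {
      return new StateInterceptor();
    }

    @Override
    public void configure(Context context) { }
  }
}

It would then be referenced from the agent configuration by the fully qualified name of its Builder class (for example a1.sources.r1.interceptors.i1.type = com.example.StateInterceptor$Builder, the package name being an assumption).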

a1.sources.r1.selector.type = multiplexing
a1.sources.r1.selector.header = state
a1.sources.r1.selector.mapping.CZ = c1
a1.sources.r1.selector.mapping.US = c2 c3
a1.sources.r1.selector.default = c4
This allows events to be put into different channels according to the key in the event header; then, by configuring multiple sinks to take events from the different channels, they are diverted to different output terminals.

Each sink is configured with its own channel, which is how the flows are kept separate.
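For completeness, here is a sketch of the remaining wiring under the same assumptions (agent a1, channels c1..c4 matching the selector mappings above, and made-up sink names k1..k4); each sink drains exactly one channel, so events routed to different channels end up at different output terminals:

a1.channels = c1 c2 c3 c4
a1.sinks = k1 k2 k3 k4
a1.sinks.k1.channel = c1
a1.sinks.k2.channel = c2
a1.sinks.k3.channel = c3
a1.sinks.k4.channel = c4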

