Flume 1.8 User Guide Study Notes (IV)

Source: Internet
Author: User

1. Flume Interceptors

Flume can modify or drop events in flight. This is done with interceptors, which are classes that implement the org.apache.flume.interceptor.Interceptor interface. An interceptor can modify or even drop events based on any criteria chosen by its developer. Flume also supports chaining interceptors, which is done by specifying a list of interceptor builder class names in the configuration. Interceptors are specified as a whitespace-separated list in the source configuration and are invoked in the order in which they are listed. If an interceptor needs to drop events, it simply leaves them out of the list it returns. If it drops all events, it returns an empty list. A simple example:
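A minimal sketch of such a chain, modeled on the example in the Flume User Guide. The component names (a1, r1, c1, k1, i1, i2), the header name hostname, and the HDFS path are illustrative assumptions, and the source and channel types are omitted:

```properties
# Assumed names: agent a1, source r1, channel c1, sink k1.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Interceptors run in the order listed: i1 (host) first, then i2 (timestamp).
a1.sources.r1.interceptors = i1 i2
a1.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.HostInterceptor$Builder
a1.sources.r1.interceptors.i1.preserveExisting = false
a1.sources.r1.interceptors.i1.hostHeader = hostname
a1.sources.r1.interceptors.i2.type = org.apache.flume.interceptor.TimestampInterceptor$Builder

# Hypothetical HDFS sink that uses both headers in its file prefix.
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events
a1.sinks.k1.hdfs.filePrefix = FlumeData.%{hostname}.%Y-%m-%d
```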

Note: The interceptor builder is passed to the type configuration property. Interceptors are themselves configurable and can be passed configuration values just like any other configurable component. In the example above, events are passed to HostInterceptor first, and the events returned by HostInterceptor are then passed on to TimestampInterceptor. You can specify either a fully qualified class name (FQCN) or an alias such as timestamp. If you have multiple collectors writing to the same HDFS path, HostInterceptor is also useful for telling their output apart.

1.1 Timestamp Interceptor

This interceptor inserts into the event headers the time, in milliseconds, at which it processes the event. It inserts a header with the key timestamp (or the key specified by the header property) whose value is that timestamp. The interceptor can preserve an existing timestamp if one is already present.

Example for agent a1:
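A configuration along the following lines, following the Flume User Guide, attaches the interceptor to a sequence-generator source. The names r1, c1, and i1 are placeholders:

```properties
a1.sources = r1
a1.channels = c1
a1.sources.r1.channels = c1
# seq is the built-in sequence generator source, used here only for testing
a1.sources.r1.type = seq
a1.sources.r1.interceptors = i1
# "timestamp" is the alias for TimestampInterceptor$Builder
a1.sources.r1.interceptors.i1.type = timestamp
```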

1.2 Host Interceptor

This interceptor inserts the hostname or IP address of the host on which the agent is running. It inserts a header with the key host, or with a configured key, whose value is the hostname or IP address of the host, depending on the configuration.

Example for agent a1:
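A sketch in the style of the Flume User Guide. The names r1, c1, and i1 are placeholders, and the two optional properties are shown with what the guide documents as their default values:

```properties
a1.sources = r1
a1.channels = c1
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = host
# Optional: true writes the IP address, false writes the hostname
a1.sources.r1.interceptors.i1.useIP = true
# Optional: name of the header that receives the value
a1.sources.r1.interceptors.i1.hostHeader = host
```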

1.3 Static Interceptor

The static interceptor allows the user to append a static header with a static value to all events.

Example for agent a1:
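A sketch based on the Flume User Guide. The header key datacenter and the value NEW_YORK are sample values, and r1, c1, and i1 are placeholder names:

```properties
a1.sources = r1
a1.channels = c1
a1.sources.r1.channels = c1
a1.sources.r1.type = seq
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = static
# Every event receives the header datacenter=NEW_YORK (sample values)
a1.sources.r1.interceptors.i1.key = datacenter
a1.sources.r1.interceptors.i1.value = NEW_YORK
```

Each static interceptor defines one header, so adding several static headers would mean chaining several static interceptors.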

1.4 Remove Header Interceptor

This interceptor manipulates Flume event headers by removing one or more of them. It can remove a statically defined header, headers that match a regular expression, or headers named in a list. If none of these is defined, or if no header matches the criteria, the Flume events are not modified.

Note: If only one header needs to be removed, specifying it by name performs better than the other two methods.
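The three removal modes could be configured roughly as follows. This is a sketch only: the header names (debug-id, trace, span) and the pattern are hypothetical, and the property names withName, fromList, and matching are taken from the Flume User Guide:

```properties
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = remove_header
# Mode 1: remove a single header by name (the fastest option)
a1.sources.r1.interceptors.i1.withName = debug-id
# Mode 2 (alternative): remove every header named in a list
# a1.sources.r1.interceptors.i1.fromList = trace, span
# Mode 3 (alternative): remove every header whose name matches a regex
# a1.sources.r1.interceptors.i1.matching = tmp-.*
```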

1.5 UUID Interceptor

The interceptor sets a universally unique identifier on all events that are intercepted.
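A minimal sketch, assuming a source r1 and an interceptor named i1. The builder class and property names follow the Flume User Guide, and the two optional properties are shown with their documented defaults:

```properties
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
# Header that receives the generated UUID
a1.sources.r1.interceptors.i1.headerName = id
# Keep a UUID header that is already present on the event
a1.sources.r1.interceptors.i1.preserveExisting = true
```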

1.6 Morphline Interceptor

This interceptor filters events through a morphline configuration file, which defines a chain of transformation commands that pipe records from one command to the next. For example, the morphline can ignore certain events, alter or insert event headers through regular-expression-based pattern matching, or auto-detect and set a MIME type on intercepted events via Apache Tika.

Sample flume.conf file:
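A sketch following the Flume User Guide. The source name avroSrc, the interceptor name, the morphline file path, and the morphline id are illustrative and would need to match your own setup:

```properties
a1.sources.avroSrc.interceptors = morphlineinterceptor
a1.sources.avroSrc.interceptors.morphlineinterceptor.type = org.apache.flume.sink.solr.morphline.MorphlineInterceptor$Builder
# Path to the morphline configuration file on the agent's local file system
a1.sources.avroSrc.interceptors.morphlineinterceptor.morphlineFile = /etc/flume-ng/conf/morphline.conf
# Selects one morphline when the file defines several
a1.sources.avroSrc.interceptors.morphlineinterceptor.morphlineId = morphline1
```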

1.7 Search and Replace Interceptor

This interceptor provides simple string-based search-and-replace functionality based on Java regular expressions. Backtracking and group capture are also available. The interceptor uses the same rules as the Java Matcher.replaceAll() method.

Example configuration:
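A sketch based on the Flume User Guide that strips leading alphanumeric characters from the event body. The source name avroSrc and the interceptor name search-replace are placeholders:

```properties
a1.sources.avroSrc.interceptors = search-replace
a1.sources.avroSrc.interceptors.search-replace.type = search_replace
# Remove leading alphanumeric characters in an event body.
a1.sources.avroSrc.interceptors.search-replace.searchPattern = ^[A-Za-z0-9_]+
# An empty replacement deletes the matched text.
a1.sources.avroSrc.interceptors.search-replace.replaceString =
```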

Another example:
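A second sketch, also modeled on the Flume User Guide, that uses capture groups to reorder words. The sample sentence and the names avroSrc and search-replace are illustrative:

```properties
a1.sources.avroSrc.interceptors = search-replace
a1.sources.avroSrc.interceptors.search-replace.type = search_replace
# Use grouping operators to reorder and munge words on a line.
a1.sources.avroSrc.interceptors.search-replace.searchPattern = The quick brown ([a-z]+) jumped over the lazy ([a-z]+)
# $1 and $2 refer to the first and second capture groups.
a1.sources.avroSrc.interceptors.search-replace.replaceString = The hungry $2 ate the careless $1
```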

1.8 Regex Filtering Interceptor

This interceptor filters events selectively by interpreting the event body as text and matching that text against a configured regular expression. The regular expression can be used either to include events or to exclude them.
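A minimal sketch, assuming a source r1 and an interceptor named i1. The pattern is hypothetical, and the property names regex and excludeEvents are those listed in the Flume User Guide:

```properties
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_filter
# Hypothetical pattern: bodies that contain the word DEBUG
a1.sources.r1.interceptors.i1.regex = .*DEBUG.*
# true drops matching events; false (the default) keeps only matching events
a1.sources.r1.interceptors.i1.excludeEvents = true
```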

1.9 Regex Extractor Interceptor

This interceptor extracts regex match groups using a specified regular expression and appends the match groups as headers on the event.

Serializers are used to map matches to a header name and a formatted header value. By default you only need to specify the header name, and the default org.apache.flume.interceptor.RegexExtractorInterceptorPassThroughSerializer is used. This serializer simply maps the match to the specified header name and passes the value through exactly as the regular expression extracted it.

Example 1:

If the Flume event body contains 1:2:3.4foobar5, you can use the following configuration:
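A sketch following the Flume User Guide, assuming a source r1 and an interceptor named i1. Each serializer (s1, s2, s3) maps one capture group, in order, to a header name:

```properties
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
# Backslashes are doubled because this is a Java properties file
a1.sources.r1.interceptors.i1.regex = (\\d):(\\d):(\\d)
a1.sources.r1.interceptors.i1.serializers = s1 s2 s3
a1.sources.r1.interceptors.i1.serializers.s1.name = one
a1.sources.r1.interceptors.i1.serializers.s2.name = two
a1.sources.r1.interceptors.i1.serializers.s3.name = three
```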

The extracted event will contain the same body, but the following headers will have been added: one=>1, two=>2, three=>3.

Example 2:

If the Flume event body contains 2012-10-18 18:47:57,614 some log line, you can use the following configuration:
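A sketch following the Flume User Guide, again assuming a source r1 and an interceptor named i1. Here the captured date string is converted to epoch milliseconds by the millis serializer rather than passed through unchanged:

```properties
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
# Capture the date and time up to the minute at the start of the body
a1.sources.r1.interceptors.i1.regex = ^(?:\\n)?(\\d\\d\\d\\d-\\d\\d-\\d\\d\\s\\d\\d:\\d\\d)
a1.sources.r1.interceptors.i1.serializers = s1
a1.sources.r1.interceptors.i1.serializers.s1.type = org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer
a1.sources.r1.interceptors.i1.serializers.s1.name = timestamp
# Date pattern used to parse the captured text
a1.sources.r1.interceptors.i1.serializers.s1.pattern = yyyy-MM-dd HH:mm
```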

The extracted event will contain the same body, but the following header will have been added: timestamp=>1350611220000. The exact millisecond value depends on the time zone in which the agent parses the date.

Resources:

https://flume.apache.org/FlumeUserGuide.html
