Flume Official Document Translation -- Selected Notes from the Flume 1.7.0 User Guide (unreleased version)


This continues parts (i) and (ii) of the Flume 1.7.0 User Guide (unreleased version) translation.

Flume Properties

Property Name | Default | Description
flume.called.from.service | – | If this property is specified, then the Flume agent will continue polling for the config file even if the config file is not found at the expected location. Otherwise, the Flume agent will terminate if the config file doesn't exist at the expected location. No property value is needed when setting this property (e.g., just specifying -Dflume.called.from.service is enough).

Property: flume.called.from.service

Flume periodically polls, every 30 seconds, for changes to the specified config file. A Flume agent loads a new configuration from the config file if either an existing file is polled for the first time, or if an existing file's modification date has changed since the last time it was polled. Renaming or moving a file does not change its modification time. When a Flume agent polls a non-existent file, one of two things happens:

1. When the agent polls a non-existent config file for the first time, the agent behaves according to the flume.called.from.service property. If the property is set, the agent continues polling (always at the same period, every 30 seconds). If the property is not set, the agent terminates immediately.
2. When the agent polls a non-existent config file and this is not the first time the file has been polled, the agent makes no config changes for that polling period and continues polling rather than terminating.
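The polling rules above can be condensed into a small decision function. This is a minimal sketch under stated assumptions: the class and method names below are illustrative and are not Flume's internal API.

```java
import java.io.File;

public class ConfigPollSketch {

    /** Mirrors -Dflume.called.from.service: if set, keep polling a missing file. */
    static boolean calledFromService() {
        return System.getProperty("flume.called.from.service") != null;
    }

    /**
     * One polling tick over the config file.
     * Returns true if the agent should keep running, false if it should terminate.
     */
    static boolean pollOnce(File configFile, boolean firstPoll, long lastSeenModified) {
        if (!configFile.exists()) {
            // Missing on the first poll: the property decides (case 1 above).
            // Missing on a later poll: no config change, keep polling (case 2 above).
            return !firstPoll || calledFromService();
        }
        if (firstPoll || configFile.lastModified() != lastSeenModified) {
            // A newly seen or modified file: (re)load the configuration here.
        }
        return true;
    }

    public static void main(String[] args) {
        File missing = new File("no-such-flume.conf");
        // Without -Dflume.called.from.service, a missing file on the first
        // poll means terminate; on later polls the agent keeps going.
        System.out.println(pollOnce(missing, true, 0L));
        System.out.println(pollOnce(missing, false, 0L));
    }
}
```

Note that renaming a file into place does not bump its modification time, so under this model a rename alone would not trigger a reload.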

Log4j Appender

Appends log4j events to a Flume agent's Avro source. A client using this appender must have flume-ng-sdk in the classpath (e.g., flume-ng-sdk-1.8.0-SNAPSHOT.jar). Required properties are marked in the table below.

Property Name | Default | Description
Hostname (required) | – | The hostname on which a remote Flume agent is running with an Avro source.
Port (required) | – | The port at which the remote Flume agent's Avro source is listening.
UnsafeMode | false | If true, the appender will not throw exceptions on failure to send the events.
AvroReflectionEnabled | false | Use Avro Reflection to serialize log4j events. (Do not use when users log strings.)
AvroSchemaUrl | – | A URL from which the Avro schema can be retrieved.

Sample log4j.properties file:

#...
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = example.com
log4j.appender.flume.Port = 41414
log4j.appender.flume.UnsafeMode = true

# configure a class's logger to output to the flume appender
log4j.logger.org.example.MyClass = DEBUG,flume
#...

By default each event is converted to a string by calling toString(), or by using the log4j layout, if specified.

If the event is an instance of org.apache.avro.generic.GenericRecord or org.apache.avro.specific.SpecificRecord, or if the property AvroReflectionEnabled is set to true, then the event will be serialized using Avro serialization.

Serializing every event with its Avro schema is inefficient, so it is good practice to provide a schema URL from which the schema can be retrieved by the downstream sink, typically the HDFS sink. If AvroSchemaUrl is not specified, then the schema will be included as a Flume header.

Sample log4j.properties file configured to use Avro serialization:

#...
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = example.com
log4j.appender.flume.Port = 41414
log4j.appender.flume.AvroReflectionEnabled = true
log4j.appender.flume.AvroSchemaUrl = hdfs://namenode/path/to/schema.avsc

# configure a class's logger to output to the flume appender
log4j.logger.org.example.MyClass = DEBUG,flume
#...

Load Balancing Log4j Appender

Appends log4j events to a list of Flume agents' Avro sources. A client using this appender must have flume-ng-sdk in the classpath (e.g., flume-ng-sdk-1.8.0-SNAPSHOT.jar). This appender supports a round-robin and a random scheme for performing the load balancing. It also supports a configurable backoff timeout so that down agents are removed temporarily from the set of hosts. Required properties are marked in the table below.

Property Name | Default | Description
Hosts (required) | – | A space-separated list of host:port entries at which Flume (through an Avro source) is listening for events.
Selector | ROUND_ROBIN | Selection mechanism. Must be ROUND_ROBIN, RANDOM, or the fully qualified class name of a custom class that inherits from LoadBalancingSelector.
MaxBackoff | – | A long value representing the maximum amount of time in milliseconds the load balancing client will back off from a node that has failed to consume an event. Defaults to no backoff.
UnsafeMode | false | If true, the appender will not throw exceptions on failure to send the events.
AvroReflectionEnabled | false | Use Avro Reflection to serialize log4j events.
AvroSchemaUrl | – | A URL from which the Avro schema can be retrieved.

Sample log4j.properties file configured using defaults:

#...
log4j.appender.out2 = org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender
log4j.appender.out2.Hosts = localhost:25430 localhost:25431

# configure a class's logger to output to the flume appender
log4j.logger.org.example.MyClass = DEBUG,flume
#...

Sample log4j.properties file configured using RANDOM load balancing:

#...
log4j.appender.out2 = org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender
log4j.appender.out2.Hosts = localhost:25430 localhost:25431
log4j.appender.out2.Selector = RANDOM

# configure a class's logger to output to the flume appender
log4j.logger.org.example.MyClass = DEBUG,flume
#...

Sample log4j.properties file configured using backoff:

#...
log4j.appender.out2 = org.apache.flume.clients.log4jappender.LoadBalancingLog4jAppender
log4j.appender.out2.Hosts = localhost:25430 localhost:25431 localhost:25432
log4j.appender.out2.Selector = ROUND_ROBIN
log4j.appender.out2.MaxBackoff = 30000

# configure a class's logger to output to the flume appender
log4j.logger.org.example.MyClass = DEBUG,flume
#...
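The ROUND_ROBIN selection with MaxBackoff described above can be modeled in a few lines of plain Java. This is an illustrative sketch, not Flume's LoadBalancingSelector implementation; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class RoundRobinWithBackoff {
    private final List<String> hosts;
    private final long[] backoffUntil;     // per-host timestamp; 0 = available
    private final long maxBackoffMillis;   // corresponds to MaxBackoff
    private int next = 0;

    public RoundRobinWithBackoff(List<String> hosts, long maxBackoffMillis) {
        this.hosts = new ArrayList<>(hosts);
        this.backoffUntil = new long[hosts.size()];
        this.maxBackoffMillis = maxBackoffMillis;
    }

    /** Pick the next host in rotation, skipping hosts that are still backing off. */
    public String select(long nowMillis) {
        for (int tried = 0; tried < hosts.size(); tried++) {
            int i = next;
            next = (next + 1) % hosts.size();
            if (backoffUntil[i] <= nowMillis) {
                return hosts.get(i);
            }
        }
        throw new IllegalStateException("all hosts are backing off");
    }

    /** Mark a host as failed: exclude it from selection for maxBackoffMillis. */
    public void markFailed(String host, long nowMillis) {
        int i = hosts.indexOf(host);
        if (i >= 0) {
            backoffUntil[i] = nowMillis + maxBackoffMillis;
        }
    }

    public static void main(String[] args) {
        RoundRobinWithBackoff s = new RoundRobinWithBackoff(
                java.util.Arrays.asList("localhost:25430", "localhost:25431"), 30000L);
        System.out.println(s.select(0L)); // first host in rotation
        s.markFailed("localhost:25430", 0L);
        System.out.println(s.select(1L)); // failed host is skipped for 30 s
    }
}
```

A RANDOM selector would simply replace the rotation index with a random pick over the currently available hosts; the backoff bookkeeping stays the same.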

Security

The HDFS sink, HBase sink, Thrift source, Thrift sink and Kite Dataset sink all support Kerberos authentication. Refer to the corresponding sections for configuring the Kerberos-related options.

The Flume agent will authenticate to the Kerberos KDC as a single principal, which will be used by the different components that require Kerberos authentication. The principal and keytab configured for the Thrift source, Thrift sink, HDFS sink, HBase sink and Dataset sink should be the same; otherwise the components will fail to start.
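As an illustration of the Kerberos-related options mentioned above, an HDFS sink can carry its principal and keytab in the agent configuration. The agent name (a1), sink name (k1), principal, and paths below are placeholders; hdfs.kerberosPrincipal and hdfs.kerberosKeytab are the HDFS sink's option names.

```properties
#...
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode/flume/events
a1.sinks.k1.hdfs.kerberosPrincipal = flume/_HOST@EXAMPLE.COM
a1.sinks.k1.hdfs.kerberosKeytab = /etc/security/keytabs/flume.keytab
#...
```

Per the note above, if a Thrift or HBase sink in the same agent also uses Kerberos, it should be configured with this same principal and keytab.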

