Apache Flume Introduction and Installation/Deployment

Source: Internet
Author: User

Overview

Flume is a highly available, highly reliable, distributed software for massive log collection, aggregation, and transmission, originally provided by Cloudera.

The core of Flume is to collect data from a data source and then send the collected data to a specified destination (sink). To guarantee that delivery succeeds, the data is buffered in a channel before being sent to the sink; only after the data has actually arrived at the sink does Flume delete its own buffered copy.
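The buffer-then-delete behavior described above can be sketched conceptually. This is a plain-Python illustration of the idea, not Flume's actual implementation (Flume is written in Java and its channels are transactional):

```python
# Conceptual sketch of Flume's channel buffering (illustrative only):
# an event stays in the channel until the sink confirms delivery, and
# only then is it removed from the buffer.
from collections import deque

channel = deque()

def source_put(event):
    channel.append(event)      # the source buffers the event in the channel

def sink_take(deliver):
    event = channel[0]         # peek at the oldest event, don't remove yet
    deliver(event)             # attempt delivery; may raise on failure
    channel.popleft()          # delete from the buffer only after success

source_put("log line 1")
delivered = []
sink_take(delivered.append)
print(delivered)               # ['log line 1']
print(len(channel))            # 0
```

If `deliver` raises, the event is never removed, so a retry will see it again; this mirrors why Flume can guarantee delivery despite sink failures.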

Flume supports customizable data senders for collecting various types of data, and customizable data receivers for the final storage of that data. Common collection requirements can be met through simple Flume configuration, and Flume also offers good extension capabilities for special scenarios. Flume is therefore suitable for most day-to-day data collection scenarios.

Operating mechanism

  The core role in a Flume system is the agent. An agent is itself a Java process that typically runs on a log collection node.

    • Each agent acts as a data transfer agent and contains three components:

    Source: the collection source, which interfaces with the data source to obtain data;

    Sink: the sink, which transmits the collected data to the next-level agent or to the final storage system;

    Channel: the agent's internal data transmission channel, which moves data from the source to the sink;

    • During the whole transmission process, what flows is the event, the basic unit of Flume's internal data transfer. An event encapsulates the data being transmitted; for a text file, an event is usually one line (one record), and the event is also the basic unit of a transaction. An event travels from source, to channel, to sink; it is itself a byte array and can carry headers (header information). An event represents the smallest complete unit of data, from an external data source to an external destination.
    • A complete event includes event headers and an event body; the event body carries the log record that Flume collects.
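The event structure described above (a byte-array body plus optional string headers) can be sketched as follows. This is a conceptual Python illustration only; Flume's real event is the Java interface `org.apache.flume.Event`, not this class:

```python
# Illustrative sketch of a Flume event: a byte-array body plus
# optional key/value headers (not Flume's actual Java API).

class Event:
    def __init__(self, body, headers=None):
        self.body = body              # the payload, e.g. one log line, as bytes
        self.headers = headers or {}  # optional string key/value metadata

# One line of a text log becomes one event:
line = "127.0.0.1 - GET /index.html"
evt = Event(line.encode("utf-8"), {"host": "web01"})
print(evt.headers["host"])   # web01
print(evt.body.decode())     # 127.0.0.1 - GET /index.html
```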
Flume collection system structure diagrams

Simple structure:

Single Agent collects data

Complex structure

Multiple agents chained in series (multi-level)
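A multi-level chain like this is typically wired by pointing an Avro sink on the upstream agent at an Avro source on the downstream agent. A minimal configuration sketch follows; the host name, port, and component names here are illustrative assumptions, not taken from the original:

```properties
# Upstream agent a1: forward collected events to the next-level agent via Avro
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = agent2-host    # assumed host of the downstream agent
a1.sinks.k1.port = 4141               # assumed port

# Downstream agent a2: receive events from the upstream agent via Avro
a2.sources.r1.type = avro
a2.sources.r1.bind = 0.0.0.0
a2.sources.r1.port = 4141
```

Each agent would still need its own channel and the downstream agent its own sink; only the linking components are shown here.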

Flume Installation Deployment
    • Upload the installation package to the node on which the data source resides
    • Extract
tar -zxvf apache-flume-1.6.0-bin.tar.gz
    • Configure the collection scheme according to the data collection requirements, and describe it in a configuration file (the file name can be chosen arbitrarily)

Create a new file in Flume's conf directory:

vi netcat-logger.conf
# Receive data from a network port, sink to logger
# Collection configuration file: netcat-logger.conf

# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444

# Describe the sink
a1.sinks.k1.type = logger

# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

    • Start the Flume agent on the appropriate node, specifying the collection scheme configuration file
bin/flume-ng agent --conf conf --conf-file conf/netcat-logger.conf --name a1 -Dflume.root.logger=INFO,console

# --conf specifies the directory of Flume's configuration files (abbreviated -c)

# --conf-file specifies which collection scheme file to use (abbreviated -f)

# --name gives this Flume agent a name (abbreviated -n)

    • Test
Install telnet if it is not already available:

yum install -y telnet

Incoming data:
$ telnet localhost 44444
Trying 127.0.0.1...
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
Hello world! <ENTER>
OK
