Apache Flume is a distributed, reliable, and efficient system that collects, aggregates, and moves data from disparate sources to a centralized data store. Flume is not limited to log collection. Because data sources are customizable, Flume can transport large volumes of custom event data, including but not limited to website traffic data, social media data, email messages, and other kinds of data. Flume is a top-level project of the Apache Software Foundation. Official website: http://flume.apache.org/.
First, installation
Flume provides a binary distribution, so we can download it directly instead of compiling from source. The download page is http://flume.apache.org/download.html; other versions are available from the Apache archive at http://archive.apache.org/dist/flume/. Since we are using CentOS, we download and unpack the software from the command line. After installation, you can optionally add the Flume bin directory to the PATH environment variable.
# Download
wget http://archive.apache.org/dist/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
# Decompress
tar -zxvf apache-flume-1.6.0-bin.tar.gz
# Create a symbolic link
cd ..
ln -s softs/apache-flume-1.6.0-bin flume
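The optional PATH step mentioned above could look like the following sketch. It assumes the flume symbolic link sits in the home directory; FLUME_HOME is only a convenient variable name chosen here, and the path should be adjusted to wherever the link was actually created.

```shell
# Assumption: the "flume" symlink lives in $HOME; adjust to your layout.
export FLUME_HOME="$HOME/flume"

# Append Flume's bin directory to PATH only if it is not already there.
case ":$PATH:" in
  *":$FLUME_HOME/bin:"*) ;;
  *) export PATH="$PATH:$FLUME_HOME/bin" ;;
esac
```

Placing these lines in ~/.bashrc makes the change persistent. Afterwards, running flume-ng version from any directory is a quick way to confirm that the shell finds the launcher.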
Second, a simple Flume example
In Flume 1.x and later versions, the architecture changed significantly. The main component of Flume is the agent, which consists of a source, a channel, and a sink. The source collects external data and sends it to the channel. The channel buffers the data as it flows through the agent. The sink reads data from the channel and sends it to the next agent or to the final destination. Structure: source -> channel -> sink.
Here is the simplest example. It uses only components that ship with Flume: a netcat source, a memory channel, and a logger sink. It works as follows: the netcat source listens on port 44444 and sends the received data to the channel; the sink reads the data from the channel and prints it to the console.
# Example: a single-node Flume configuration

# Name the components on this agent
a1.sources=r1
a1.sinks=s1
a1.channels=c1

# Describe/configure the source
a1.sources.r1.type=netcat
a1.sources.r1.bind=0.0.0.0
a1.sources.r1.port=44444

# Describe/configure the sink
a1.sinks.s1.type=logger

# Describe/configure the channel
a1.channels.c1.type=memory
a1.channels.c1.capacity=1000
a1.channels.c1.transactionCapacity=100

# Bind the source and sink to the channel
a1.sources.r1.channels=c1
a1.sinks.s1.channel=c1
The command to start Flume is:
bin/flume-ng agent -n a1 -f conf/flume-conf.properties
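If the logger sink's output does not show up in the terminal, a variant of the start command along these lines may help. As far as we know, the default log4j.properties shipped with Flume 1.6.0 writes to a log file, and the -Dflume.root.logger=INFO,console option redirects it to the console. The function name start_agent_console and the FLUME_HOME default are assumptions made for this sketch.

```shell
# Assumption: FLUME_HOME points at the Flume installation;
# it defaults to the current directory if unset.
FLUME_HOME="${FLUME_HOME:-.}"

# Hypothetical helper: start agent a1 with log output sent to the terminal.
start_agent_console() {
  # --conf        directory holding flume-env.sh and log4j.properties
  # --conf-file   the agent configuration shown above
  # -Dflume.root.logger=INFO,console   log to the console instead of a file
  "$FLUME_HOME/bin/flume-ng" agent \
    --name a1 \
    --conf "$FLUME_HOME/conf" \
    --conf-file "$FLUME_HOME/conf/flume-conf.properties" \
    -Dflume.root.logger=INFO,console
}
```

Calling start_agent_console from the Flume directory runs the agent in the foreground; Ctrl+C stops it.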
The startup output indicates whether the agent started successfully. Alternatively, run the jps command and check whether an Application process is listed; if it is, the agent is running.
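The jps check can be scripted roughly as follows. This sketch assumes the agent's main class is org.apache.flume.node.Application, which is the name jps -l is expected to print for a Flume agent; flume_agent_running is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if jps lists a Flume agent JVM.
flume_agent_running() {
  # jps -l prints the fully qualified main class of each running JVM.
  jps -l 2>/dev/null | grep -q 'org.apache.flume.node.Application'
}

if flume_agent_running; then
  echo "Flume agent is running"
else
  echo "Flume agent not found"
fi
```

Because jps only lists JVMs owned by the current user, the check should be run as the same user that started the agent.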
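Connect and send data with telnet. The command is: telnet <ip> <port>, for example telnet localhost 44444. Final result: each line sent over the connection is printed by the logger sink as an event.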
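For a non-interactive test, the same data can be pushed with nc instead of telnet. This is a minimal sketch: send_event is a hypothetical helper, and the defaults assume the agent runs on the local machine with the port 44444 from the configuration.

```shell
# Hypothetical helper: send one line of text to the netcat source.
# Usage: send_event MESSAGE [HOST] [PORT]
send_event() {
  host="${2:-localhost}"
  port="${3:-44444}"
  # -w 2 sets a 2-second timeout (exact semantics vary between nc variants).
  printf '%s\n' "$1" | nc -w 2 "$host" "$port"
}

# Example (requires a running agent):
# send_event "hello flume"
```

The netcat source treats each newline-terminated line as one event, so every call to send_event should produce one entry in the logger sink's output.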