Installation and Configuration of Flume
First, Download Resources
Download page: http://flume.apache.org/download.html
Binary package: http://apache.fayea.com/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
Source package: http://mirrors.hust.edu.cn/apache/flume/1.6.0/apache-flume-1.6.0-src.tar.gz
Second, Installation and Build
(1) Installing from the precompiled binary package:
Extract the archive directly in the installation directory (renaming the extracted directory is optional):
cd /usr/local/
tar -zxvf apache-flume-1.6.0-bin.tar.gz
mv apache-flume-1.6.0-bin flume
(2) Building and installing from source:
This method is more involved: download the source package and the packages it depends on, then build with one of the following Maven commands:
- Compile only: mvn clean compile
- Compile and run the unit tests: mvn clean test
- Run individual unit tests: mvn clean test -Dtest=<Test1>,<Test2>,... -DfailIfNoTests=false
- Create the tarball package: mvn clean install
- Create the tarball package, skipping the unit tests: mvn clean install -DskipTests
After the build completes, the resulting package can be run directly.
Third, Running and Configuration
(1) Flume configuration
# example.conf: A single-node Flume configuration
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /flume/test.log
# Describe the sink
a1.sinks.k1.type = hdfs
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
a1.sinks.k1.hdfs.path = hdfs://192.168.15.135:9000/flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.round = true
a1.sinks.k1.hdfs.roundValue = 10
a1.sinks.k1.hdfs.roundUnit = minute
a1.sinks.k1.hdfs.useLocalTimeStamp = true
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
The configuration file has four parts: the source, the sink, the channel, and the bindings that connect them. The Flume modules relate to one another as follows:
The source collects data, for example from a web server. The sink writes the collected, formatted logs to disk, to another file system, or to another logging system. The channel connects the source to the sink. Because the channel sits between them, sources and sinks can stand in a many-to-many relationship: a source may write to several channels, while each sink reads from exactly one.
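Since the bindings refer to channels only by name, a misspelled channel name is an easy mistake to make. The following is a minimal shell sketch, not part of Flume, that reports any source or sink bound to a channel the agent never declares. The function name check_flume_bindings is hypothetical, and the sketch assumes simple "key = value" lines like those in example.conf.

```shell
# check_flume_bindings AGENT FILE   (hypothetical helper, not a Flume tool)
# Prints one line per binding that names an undeclared channel and
# returns 1 if any such binding exists, 0 otherwise.
check_flume_bindings() {
    agent="$1"
    file="$2"
    awk -v agent="$agent" '
        # Skip comments and blank lines.
        /^[ \t]*(#|$)/ { next }
        {
            # Split "key = value" on the first "=" and trim whitespace.
            eq = index($0, "=")
            if (eq == 0) next
            key = substr($0, 1, eq - 1)
            val = substr($0, eq + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (key == agent ".channels") {
                # Remember every declared channel name.
                n = split(val, names, /[ \t]+/)
                for (i = 1; i <= n; i++) declared[names[i]] = 1
            } else if (key ~ ("^" agent "\\.sources\\.[^.]+\\.channels$") ||
                       key ~ ("^" agent "\\.sinks\\.[^.]+\\.channel$")) {
                # Remember every binding for the check in END.
                refs[key] = val
            }
        }
        END {
            bad = 0
            for (key in refs) {
                n = split(refs[key], names, /[ \t]+/)
                for (i = 1; i <= n; i++)
                    if (!(names[i] in declared)) {
                        print key " -> undeclared channel " names[i]
                        bad = 1
                    }
            }
            exit bad
        }
    ' "$file"
}
```

It could be called as check_flume_bindings a1 conf/example.conf before starting the agent. The check is deferred to the END block so that the order of lines in the file does not matter.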
Configuration line | Meaning
# example.conf: A single-node Flume configuration |
# Name the components on this agent | a1 is the name of the agent.
a1.sources = r1 | Defines a source named r1.
a1.sinks = k1 | Defines a sink named k1.
a1.channels = c1 | Defines a channel named c1.
# Describe/configure the source |
a1.sources.r1.type = exec | Source r1 of agent a1 is of type exec (it runs a command).
a1.sources.r1.command = tail -f /flume/test.log | The command r1 runs: tail the file /flume/test.log.
# Describe the sink |
a1.sinks.k1.type = hdfs | Sink k1 of agent a1 is of type hdfs.
# Use a channel which buffers events in memory |
a1.channels.c1.type = memory | Channel c1 of agent a1 buffers events in memory.
a1.channels.c1.capacity = 1000 | c1 holds at most 1000 events.
a1.channels.c1.transactionCapacity = 100 | c1 handles at most 100 events per transaction.
a1.sinks.k1.hdfs.path = hdfs://192.168.15.135:9000/flume/events/%y-%m-%d/%H%M/%S | The HDFS path under which sink k1 stores its files: hdfs://...
a1.sinks.k1.hdfs.filePrefix = events- | Files written by the sink are prefixed with events-.
a1.sinks.k1.hdfs.round = true | HDFS sink option: round the event timestamp down.
a1.sinks.k1.hdfs.roundValue = 10 | HDFS sink option: round down to a multiple of this value.
a1.sinks.k1.hdfs.roundUnit = minute | HDFS sink option: the unit of the rounding value (second, minute, or hour).
a1.sinks.k1.hdfs.useLocalTimeStamp = true | Use the local time, rather than the timestamp in the event header, when expanding the escape sequences in the path.
# Bind the source and sink to the channel |
a1.sources.r1.channels = c1 | Binds source r1 to channel c1.
a1.sinks.k1.channel = c1 | Binds sink k1 to channel c1.
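The three hdfs.round* options work together, which is easier to see with numbers. The following shell sketch rounds an epoch timestamp down to a multiple of roundValue units. The function name round_down_ts is hypothetical, and plain epoch arithmetic is only an approximation of Flume's behaviour: Flume rounds the calendar fields of the timestamp, so results can differ for step sizes that do not divide the next larger unit evenly, or in time zones with unusual offsets.

```shell
# round_down_ts EPOCH_SECONDS ROUND_VALUE ROUND_UNIT   (hypothetical helper)
# Prints EPOCH_SECONDS rounded down to a multiple of ROUND_VALUE units.
round_down_ts() {
    ts="$1"
    value="$2"
    # Convert the unit name to seconds.
    case "$3" in
        second) unit=1 ;;
        minute) unit=60 ;;
        hour)   unit=3600 ;;
        *) echo "unknown unit: $3" >&2; return 1 ;;
    esac
    step=$(( value * unit ))
    # Guard against a zero or negative step before taking the remainder.
    if [ "$step" -le 0 ]; then
        echo "round value must be positive" >&2
        return 1
    fi
    # Drop the remainder so the result lands on a step boundary.
    echo $(( ts - ts % step ))
}
```

With hdfs.round = true, hdfs.roundValue = 10 and hdfs.roundUnit = minute, an event stamped 11:54:34 is written under the directory for 11:50:00, so a new directory is created every ten minutes rather than every second.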
(2) Running Flume:
$ bin/flume-ng agent -n $agent_name -c conf -f conf/flume-conf.properties
-n: the name of the agent;
-c conf: the configuration directory (mainly for other configuration files, such as the logging configuration);
-f: the Flume configuration file for this run, including its path (relative paths are resolved from the directory the command is run in, here the Flume root flume/).
For example:
$ bin/flume-ng agent -n a1 -c conf -f conf/example.conf
After the agent starts successfully, its log output can be found in logs/flume.log.
Alternatively, start the agent with the log output specified explicitly:
$ bin/flume-ng agent --conf conf --conf-file example.conf --name a1 -Dflume.root.logger=INFO,console
--conf: same as -c;
--conf-file: same as -f;
--name: same as -n;
-Dflume.root.logger: sets the log level and the log destination. The command above logs at INFO level to the terminal. Without this option, as in the previous command, the default is INFO level written to the log file.
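The start-up command can be wrapped in a small shell function so that a missing configuration file is caught before Flume is launched. This is only a sketch: the function name run_agent and the FLUME_NG variable (an override for the launcher path, defaulting to bin/flume-ng) are assumptions made for illustration. The options passed to flume-ng are the ones described above.

```shell
# run_agent NAME CONF_DIR CONF_FILE   (hypothetical helper)
# Starts a Flume agent in the foreground, logging at INFO level to the console.
run_agent() {
    name="$1"
    conf_dir="$2"
    conf_file="$3"
    # Fail early if the agent configuration file does not exist.
    if [ ! -f "$conf_file" ]; then
        echo "config file not found: $conf_file" >&2
        return 1
    fi
    # FLUME_NG is an assumed override; by default the launcher is bin/flume-ng,
    # so the function should be called from the Flume root directory.
    "${FLUME_NG:-bin/flume-ng}" agent \
        --conf "$conf_dir" \
        --conf-file "$conf_file" \
        --name "$name" \
        -Dflume.root.logger=INFO,console
}
```

From the Flume root directory it could be called as run_agent a1 conf conf/example.conf.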
Fourth, Notes
(1) The available sources are:
- Avro Source
- Thrift Source
- Exec Source
- JMS Source
- Spooling Directory Source
- Twitter 1% firehose Source (experimental)
- Kafka Source
- NetCat Source
- Sequence Generator Source
- Syslog Sources
- HTTP Source
- Stress Source
- Legacy Sources
- Custom Source
- Scribe Source
(2) The available sinks are:
- HDFS Sink
- Hive Sink
- Logger Sink
- Avro Sink
- Thrift Sink
- IRC Sink
- File Roll Sink
- Null Sink
- HBaseSinks
- MorphlineSolrSink
- ElasticSearchSink
- Kite Dataset Sink
- Kafka Sink
- Custom Sink
(3) The available channels are:
- Memory Channel
- JDBC Channel
- Kafka Channel
- File Channel
- Spillable Memory Channel
- Pseudo Transaction Channel
- Custom Channel
Detailed configuration reference: http://flume.apache.org/FlumeUserGuide.html#flume-sources