Reprinted from: http://blog.csdn.net/liuxiao723846/article/details/78133375
First, scenario one description:
The online API service writes its logs to local disk via log4j. Flume is installed on each interface server and collects the logs with an exec source, then forwards them through an Avro sink to the Flume agent on the aggregation server. That agent receives the logs with an Avro source and writes them to local disk with a file_roll sink.
Assume two API interface servers, 10.153.140.250 and 10.153.140.251, and one log aggregation server, 10.153.137.211.
1. Flume configuration on API interface server:
1) Download, unzip, and install Flume on the API interface server:
- cd /usr/local/
- wget http://mirror.bit.edu.cn/apache/flume/1.7.0/apache-flume-1.7.0-bin.tar.gz
- tar -xvzf apache-flume-1.7.0-bin.tar.gz
- vim /etc/profile
- export PS1="[\u@`/sbin/ifconfig eth0 | grep 'inet ' | awk -F'[ :]+' '{print $4}'` \W]"'$ '
- export FLUME_HOME=/usr/local/apache-flume-1.7.0-bin
- export PATH=$PATH:$FLUME_HOME/bin
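After reloading the profile, the install can be sanity-checked with Flume's built-in version subcommand; a quick sketch:

```shell
# Pick up FLUME_HOME and PATH changes, then confirm flume-ng is on the PATH
source /etc/profile
flume-ng version
```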
2) Modify the flume-env.sh configuration file:
cd /usr/local/apache-flume-1.7.0-bin/conf
vim flume-env.sh
Specify JAVA_HOME inside it, and also add a log4j.properties file to the conf directory;
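A minimal flume-env.sh might look like the following sketch (the JAVA_HOME path and heap sizes are assumptions for illustration; adjust them to your environment):

```shell
# flume-env.sh: environment for the Flume agent JVM
export JAVA_HOME=/usr/local/jdk1.8.0_121        # assumed JDK install path
export JAVA_OPTS="-Xms512m -Xmx1024m"           # assumed heap sizing
```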
3) Flume configuration file:
agent1.sources = ngrinder
agent1.channels = mc1
agent1.sinks = avro-sink

agent1.sources.ngrinder.type = exec
agent1.sources.ngrinder.command = tail -f /data/logs/ttbrain/ttbrain-recommend-api.log
agent1.sources.ngrinder.channels = mc1

agent1.channels.mc1.type = memory
agent1.channels.mc1.capacity = 100000
agent1.channels.mc1.keep-alive = 30

agent1.sinks.avro-sink.type = avro
agent1.sinks.avro-sink.channel = mc1
agent1.sinks.avro-sink.hostname = 10.153.137.211
agent1.sinks.avro-sink.port = 4545
Note: the sink here uses Avro, so the Flume agent on each interface server sends its log data to the aggregation server via Avro RPC;
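Once the collector is listening, the Avro source can be smoke-tested from any interface server with Flume's built-in avro-client; a sketch (the /tmp/test.log sample file is an assumption):

```shell
# Send each line of a sample file to the collector's Avro source as one event
flume-ng avro-client -H 10.153.137.211 -p 4545 -F /tmp/test.log
```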
4) Start:
nohup flume-ng agent -c /usr/local/apache-flume-1.7.0-bin/conf -f /usr/local/apache-flume-1.7.0-bin/conf/test-tomcat-log.conf -n agent1 >/dev/null 2>&1 &
2. Flume configuration on the aggregation server:
1) Install, unpack, and configure Flume (same as above):
2) Flume configuration file:
collector1.sources = AvroIn
collector1.channels = mc1
collector1.sinks = LocalOut

collector1.sources.AvroIn.type = avro
collector1.sources.AvroIn.bind = 10.153.137.211
collector1.sources.AvroIn.port = 4545
collector1.sources.AvroIn.channels = mc1

collector1.channels.mc1.type = memory
collector1.channels.mc1.capacity = 100000
collector1.channels.mc1.transactionCapacity = 100000

collector1.sinks.LocalOut.type = file_roll
collector1.sinks.LocalOut.sink.directory = /data/tomcat_log_bak
collector1.sinks.LocalOut.sink.rollInterval = 0
collector1.sinks.LocalOut.channel = mc1
Description:
a. The source here is Avro, pairing with the Avro sink on the API interface servers;
b. The sink here is file_roll, which saves the log data to local disk. With sink.rollInterval = 0, time-based rolling is disabled, so file_roll keeps appending to a single output file;
Note: bind must be set to a real machine IP or hostname; localhost and the like will not work.
3) Start:
nohup flume-ng agent -c /usr/local/apache-flume-1.7.0-bin/conf -f /usr/local/apache-flume-1.7.0-bin/conf/tomcat_collection.conf -n collector1 -Dflume.root.logger=INFO,console >/dev/null 2>&1 &
At this point, the /data/tomcat_log_bak directory will contain the logs collected from both interface servers.
Second, scenario two description:
The online API service writes its logs to local disk via log4j. Flume is installed on each interface server and collects the logs with an exec source, then forwards them through an Avro sink to the Flume agent on the aggregation server. That agent receives the logs with an Avro source and then backs them up to HDFS via an HDFS sink.
Assume two API interface servers, 10.153.140.250 and 10.153.140.251, and one log aggregation server, 10.153.137.211.
1. Flume configuration on API interface server:
Same as above.
2. Flume configuration on the aggregation server:
1) Install and unpack Flume:
2) Flume configuration file:
agent1.channels = ch1
agent1.sources = s1
agent1.sinks = log-sink1

agent1.sources.s1.type = avro
agent1.sources.s1.bind = 10.153.137.211
agent1.sources.s1.port = 41414
agent1.sources.s1.threads = 5
agent1.sources.s1.channels = ch1

agent1.channels.ch1.type = memory
agent1.channels.ch1.capacity = 100000
agent1.channels.ch1.transactionCapacity = 100000
agent1.channels.ch1.keep-alive = 30

agent1.sinks.log-sink1.type = hdfs
agent1.sinks.log-sink1.hdfs.path = hdfs://hadoop-jy-namenode/data/qytt/flume
agent1.sinks.log-sink1.hdfs.writeFormat = Text
agent1.sinks.log-sink1.hdfs.fileType = DataStream
agent1.sinks.log-sink1.hdfs.rollInterval = 0
agent1.sinks.log-sink1.hdfs.rollSize = 60554432
agent1.sinks.log-sink1.hdfs.rollCount = 0
agent1.sinks.log-sink1.hdfs.batchSize = 1000
agent1.sinks.log-sink1.hdfs.txnEventMax = 1000
agent1.sinks.log-sink1.hdfs.callTimeout = 60000
agent1.sinks.log-sink1.hdfs.appendTimeout = 60000
agent1.sinks.log-sink1.channel = ch1
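Note how the three roll settings interact: with rollInterval = 0 and rollCount = 0, time- and count-based rolling are both disabled, so the HDFS sink rolls files purely on size. A quick sanity check of the configured threshold, sketched in Python with the numbers from the config above:

```python
# hdfs sink roll triggers: a value of 0 disables that trigger
roll_interval = 0       # seconds; 0 = never roll based on time
roll_count = 0          # events;  0 = never roll based on event count
roll_size = 60554432    # bytes;   the only active trigger here

# Convert the active size threshold to MiB for readability
print(f"files roll every {roll_size / 1024 / 1024:.1f} MiB")
# prints: files roll every 57.7 MiB
```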
Description:
a. The source here is Avro, pairing with the Avro sink on the API interface servers;
b. The sink here is HDFS, which writes the data into HDFS; the Namenode address of the Hadoop cluster must be specified (hdfs://hadoop-jy-namenode/).
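Once events start flowing, the result can be checked from any machine with a Hadoop client configured for this cluster; a sketch:

```shell
# List the files Flume has written under the configured hdfs.path
hdfs dfs -ls /data/qytt/flume
```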
3) Start:
nohup flume-ng agent -c /usr/local/apache-flume-1.7.0-bin/conf -f /usr/local/apache-flume-1.7.0-bin/conf/hdfs.conf -n agent1 >/dev/null 2>&1 &
At this point, the logs collected from the two interface servers will appear under the /data/qytt/flume directory in HDFS.
To summarize: with two API interface servers (10.153.140.250 and 10.153.140.251) and one aggregation server (10.153.137.211), we deploy Flume on each interface server, use the exec source to collect the logs on each machine, and forward them to the aggregation server.