First, the architecture diagram is as follows:
Second, the various components are installed as follows:
1) ZooKeeper + Kafka
http://www.cnblogs.com/super-d2/p/4534323.html
2) HBase
http://www.cnblogs.com/super-d2/p/4755932.html
3) Flume installation:
Install the JDK
Flume requires a Java runtime environment of version 1.6 or later. Download the JDK package from the Oracle website and unpack it to install:
Set the Java environment variables:
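For example, assuming the JDK was unpacked to /usr/local/jdk1.8.0 (the path and version are assumptions; adjust them to your install), the variables can be set like this:

```shell
# Assumed install path; change to match where the JDK was unpacked.
export JAVA_HOME=/usr/local/jdk1.8.0
export PATH=$JAVA_HOME/bin:$PATH
echo $JAVA_HOME
```

Append these lines to /etc/profile or ~/.bashrc to make them persist across sessions.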
Install Flume
Download the Flume binary package from the official website and unpack it to install:
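A sketch of the install, assuming Flume 1.6.0 and /usr/local as the install prefix (both are assumptions; pick the current release from flume.apache.org):

```shell
# Download the binary tarball from the Apache archive (URL assumed) and unpack it:
# wget http://archive.apache.org/dist/flume/1.6.0/apache-flume-1.6.0-bin.tar.gz
# tar -xzf apache-flume-1.6.0-bin.tar.gz -C /usr/local

# Point FLUME_HOME at the unpacked directory and put flume-ng on the PATH.
export FLUME_HOME=/usr/local/apache-flume-1.6.0-bin
export PATH=$FLUME_HOME/bin:$PATH
echo $FLUME_HOME
```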
Configuration
The source uses the netcat type and the sink uses the file_roll type: data is read from the listening port and saved to a local file. Copy the configuration template:
Edit the configuration as follows:
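A minimal agent configuration for this setup might look like the following sketch (the agent name a1 and port 44444 are assumptions):

```
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# netcat source: reads lines of text from a TCP port
a1.sources.r1.type = netcat
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 44444
a1.sources.r1.channels = c1

# in-memory channel
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000

# file_roll sink: writes events to files under a local directory
a1.sinks.k1.type = file_roll
a1.sinks.k1.sink.directory = /tmp/log/flume
a1.sinks.k1.channel = c1
```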
Functional verification
1. Create the output directory
2. Start the service
The run logs are written to the logs directory. Alternatively, add the option -Dflume.root.logger=INFO,console at startup to run the agent in the foreground and print the logs to the console, which makes it easy to inspect what the agent is doing and to diagnose the cause of a service failure.
3. Send data
Input:
4. View the data files
Check the files in the /tmp/log/flume directory:
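The four verification steps above can be sketched as shell commands (the agent name, config file name, and port 44444 are assumptions matching the sample config):

```shell
# 1. Create the output directory for the file_roll sink.
mkdir -p /tmp/log/flume

# 2. Start the agent in the foreground with console logging
#    (requires a running Flume install, so commented out here):
# flume-ng agent -n a1 -c conf -f conf/flume-conf.properties \
#     -Dflume.root.logger=INFO,console

# 3. Send a line of data to the netcat source:
# echo "hello flume" | nc 127.0.0.1 44444

# 4. List the rolled files written by the sink:
ls /tmp/log/flume
```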
Integration with Kafka
Flume integrates flexibly with Kafka: Flume focuses on data collection, while Kafka focuses on data distribution. Kafka can be configured as a Flume source, or as a Flume sink. An example of configuring Kafka as the sink is given below.
The data collected by Flume is distributed through Kafka to other big data platforms for further processing.
This corresponds to the scenario in our architecture. The Flume configuration is as follows:
# Flume test file
# Listens via Avro RPC on port 41414 and dumps the data received to the log
agent.channels = ch-1
agent.sources = src-1
agent.sinks = sink-1
agent.channels.ch-1.type = memory
agent.channels.ch-1.capacity = 10000000
agent.channels.ch-1.transactionCapacity = 1000
agent.sources.src-1.type = avro
agent.sources.src-1.channels = ch-1
agent.sources.src-1.bind = 0.0.0.0
agent.sources.src-1.port = 41414

# Sink option 1: logger sink, dumps events to the log (use one sink block at a time)
agent.sinks.sink-1.type = logger
agent.sinks.sink-1.channel = ch-1

# Sink option 2: Kafka sink
agent.sinks.sink-1.type = org.apache.flume.sink.kafka.KafkaSink
agent.sinks.sink-1.topic = avro_topic
agent.sinks.sink-1.brokerList = ip:9092
agent.sinks.sink-1.requiredAcks = 1
agent.sinks.sink-1.batchSize = 20
agent.sinks.sink-1.channel = ch-1

# Sink option 3: HBase sink
agent.sinks.sink-1.type = hbase
agent.sinks.sink-1.table = logs
agent.sinks.sink-1.batchSize = 100
agent.sinks.sink-1.columnFamily = flume
agent.sinks.sink-1.znodeParent = /hbase
agent.sinks.sink-1.zookeeperQuorum = ip:2181
agent.sinks.sink-1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
agent.sinks.sink-1.channel = ch-1
Note: for Flume to write to HBase, the relevant HBase client JARs must be copied into Flume's lib directory.
Demo
https://github.com/super-d2/flume-log4j-example
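The demo wires Log4j to the Avro source configured above; a minimal log4j.properties for such a client could look like this (the hostname is an assumption, and the port matches the agent config):

```
log4j.rootLogger = INFO, flume
log4j.appender.flume = org.apache.flume.clients.log4jappender.Log4jAppender
log4j.appender.flume.Hostname = localhost
log4j.appender.flume.Port = 41414
# Don't fail the application if the Flume agent is unreachable:
log4j.appender.flume.UnsafeMode = true
```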
Reference:
https://mos.meituan.com/library/41/how-to-install-flume-on-centos7/
Flume + Kafka + HBase + ELK