As the title suggests, this is only a small part of the real-time architecture.
Download the latest Flume release: apache-flume-1.6.0-bin.tar.gz
Unpack it and edit conf/flume-conf.properties (the file name can be anything you like).
What I have set up so far reads data from a directory and writes it to Kafka. There are plenty of articles online covering the principles, so here is just the configuration:
a1.sources = r1
a1.sinks = k1
a1.channels = c1

a1.sources.r1.type = spooldir
a1.sources.r1.channels = c1
a1.sources.r1.spoolDir = /data/pv/20150812/
a1.sources.r1.fileHeader = true

a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 10000
a1.channels.c1.byteCapacityBufferPercentage =
a1.channels.c1.byteCapacity = 800000

a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = testflume
a1.sinks.k1.brokerList = xxxx:9092,xxxx:9092,xxxx:9092
a1.sinks.k1.requiredAcks = 1
a1.sinks.k1.batchSize =
a1.sinks.k1.channel = c1
Start Flume:
./bin/flume-ng agent -n a1 -c conf -f conf/flume-conf.properties
Then check the data in Kafka:
./bin/kafka-console-consumer.sh --zookeeper xxxx:2181/kafka --topic testflume
You can see the data continuously arriving in Kafka.