1. Create the Flume configuration file flume-spark-tail-conf.properties
# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'a2'
a2.sources = r2
a2.channels = c2
a2.sinks = k2

### define sources
a2.sources.r2.type = exec
a2.sources.r2.command = tail -F /opt/datas/spark_word_count.log
a2.sources.r2.shell = /bin/bash -c

### define channels
a2.channels.c2.type = memory
a2.channels.c2.capacity = 1000
a2.channels.c2.transactionCapacity = 100

### define sinks
a2.sinks.k2.type = avro
a2.sinks.k2.hostname = bigdata.eclipse.com
a2.sinks.k2.port = 9999

### bind the sources and sinks to the channels
a2.sources.r2.channels = c2
a2.sinks.k2.channel = c2
2. Upload the jar packages required by Flume to the extraljars directory in Spark
The required jar packages are as follows:
1) spark-streaming-flume_2.10-1.3.0.jar
2) flume-avro-source-1.5.0-cdh5.3.6.jar
3) flume-ng-sdk-1.5.0-cdh5.3.6.jar
3. Start spark-shell with the following command
bin/spark-shell \
--jars /opt/app/cdh5.3.6/spark-1.3.0-bin-2.5.0-cdh5.3.6/extraljars/spark-streaming-flume_2.10-1.3.0.jar,/opt/app/cdh5.3.6/spark-1.3.0-bin-2.5.0-cdh5.3.6/extraljars/flume-avro-source-1.5.0-cdh5.3.6.jar,/opt/app/cdh5.3.6/spark-1.3.0-bin-2.5.0-cdh5.3.6/extraljars/flume-ng-sdk-1.5.0-cdh5.3.6.jar \
--master local[2]
4. Execute the following test code at the spark-shell command line
import org.apache.spark._
import org.apache.spark.streaming._
import org.apache.spark.streaming.StreamingContext._
import org.apache.spark.streaming.flume._

val ssc = new StreamingContext(sc, Seconds(5))
val stream = FlumeUtils.createStream(ssc, "bigdata.eclipse.com", 9999)

// val eventsCount = stream.count.map(cnt => "Received " + cnt + " flume events.")
// eventsCount.print()

val wordCountStream = stream.map(x => new String(x.event.getBody.array()))
  .flatMap(_.split(" "))
  .map((_, 1))
  .reduceByKey(_ + _)
wordCountStream.print()

ssc.start()
ssc.awaitTermination()
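The map/flatMap/reduceByKey chain above can be checked without a running cluster. The sketch below (a hypothetical helper, not part of the tutorial's code) applies the same split-and-count logic to plain Scala collections, with the input lines standing in for Flume event bodies; `groupBy` plus a sum plays the role of `reduceByKey` on a local collection:

```scala
// Sketch: the same word-count chain as the streaming code,
// applied to a plain Scala collection so it runs without Spark.
object WordCountSketch {
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))                         // split each "event body" into words
      .map((_, 1))                                   // pair each word with a count of 1
      .groupBy(_._1)                                 // group pairs by word (reduceByKey analogue)
      .map { case (w, pairs) => (w, pairs.map(_._2).sum) } // sum the 1s per word

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("flume and spark", "spark streaming"))
    counts.toSeq.sortBy(_._1).foreach { case (w, c) => println(s"$w: $c") }
  }
}
```

In the streaming job each 5-second batch produces such a map over only that batch's events, since this code uses no stateful operations across batches.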
5. Start the Flume agent
bin/flume-ng agent -c conf -n a2 -f conf/flume-spark-tail-conf.properties -Dflume.root.logger=INFO,console
6. Append content to spark_word_count.log to test
Flume and sparkstreaming Integration
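For example, the line above can be appended to the tailed file from the shell (the path is the one configured for the exec source in step 1; Flume's `tail -F` picks up the new line and forwards it to the Avro sink, and the word counts should appear in spark-shell within one 5-second batch):

```shell
# Append a test line to the file tailed by Flume's exec source.
echo "Flume and sparkstreaming Integration" >> /opt/datas/spark_word_count.log
```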