Recently, after following Liaoliang's 2016 Big Data Spark "Mushroom Cloud" course, I needed to integrate Flume, Kafka, and Spark Streaming.
It felt hard to take on all at once, so I started simple. The idea: Flume produces data and forwards it to Spark Streaming. The Flume source is netcat (address: localhost, port 22222) and the sink is Avro (address: localhost, port 11111). The Spark Streaming side simply prints how many events it received.
One, configuration file
The Flume configuration file is example5.properties; its full contents are shown below.
Note: make sure to add the line a1.sinks.k1.avro.useLocalTimeStamp = true; otherwise Flume typically reports an error like:
"Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: java.lang.NullPointerException: Expected timestamp"
Thanks to this post for the solution: http://blog.selfup.cn/1601.html
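As an aside (not something the original post used): another common fix for this class of missing-timestamp error is to have Flume stamp every event at the source with a timestamp interceptor. A minimal sketch, assuming the same agent and source names as the configuration below:

a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = timestamp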
The full example5.properties:

a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = netcat
a1.sources.r1.bind = 192.168.0.10
a1.sources.r1.port = 22222
a1.sources.r1.channels = c1

a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

a1.sinks.k1.type = avro
a1.sinks.k1.channel = c1
a1.sinks.k1.hostname = 192.168.0.10
a1.sinks.k1.port = 11111
a1.sinks.k1.avro.useLocalTimeStamp = true

Two, write the processing code

// Create a StreamingContext with 10-second batches
val ssc = new StreamingContext(sparkConf, Seconds(10))
val hostname = args(0)
val port = args(1).toInt
val storageLevel = StorageLevel.MEMORY_ONLY
val flumeStream = FlumeUtils.createStream(ssc, hostname, port, storageLevel)
flumeStream.count().map(cnt => "Received " + cnt + " flume events.").print()
// start the streaming computation
ssc.start()
// wait for the computation to finish, then exit
ssc.awaitTermination()
ssc.stop()

One pit here: the job kept reporting that FlumeUtils could not be found. It actually lives in spark-examples-1.6.1-hadoop2.6.0.jar, and I had already added that jar in the source code via setJars:

val sparkConf = new SparkConf().setAppName("AdClickedStreamingStats").setMaster("local[5]")
  .setJars(List(
    "/lib/spark-1.6.1/spark-streaming-kafka_2.10-1.6.1.jar",
    "/lib/kafka-0.10.0/kafka-clients-0.10.0.1.jar",
    "/lib/kafka-0.10.0/kafka_2.10-0.10.0.1.jar",
    "/lib/spark-1.6.1/spark-streaming_2.10-1.6.1.jar",
    "/lib/kafka-0.10.0/metrics-core-2.2.0.jar",
    "/lib/kafka-0.10.0/zkclient-0.8.jar",
    "/lib/spark-1.6.1/mysql-connector-java-5.1.13-bin.jar",
    "/lib/spark-1.6.1/spark-examples-1.6.1-hadoop2.6.0.jar",
    "/opt/spark-1.5.0-bin-hadoop2.6/sparkapps.jar"))

That did not help, so in the end I simply forced it through with --jars on spark-submit:

bin/spark-submit --class com.dt.spark.flume.SparkStreamingFlume \
  --jars /lib/spark-1.6.1/spark-examples-1.6.1-hadoop2.6.0.jar \
  --master local[5] sparkapps.jar 192.168.0.10 11111

The rest is straightforward.
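Putting the pieces together, here is a minimal, self-contained sketch of what the whole driver program might look like. It assumes Spark 1.6.x with the spark-streaming-flume classes on the classpath, and the package/object name simply mirrors the class used in the spark-submit command above.

package com.dt.spark.flume

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Minimal sketch: receive Avro events from Flume and print how many arrive per batch.
object SparkStreamingFlume {
  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: SparkStreamingFlume <hostname> <port>")
      System.exit(1)
    }
    val hostname = args(0)
    val port = args(1).toInt

    val sparkConf = new SparkConf().setAppName("SparkStreamingFlume").setMaster("local[5]")
    // 10-second batch interval, as in the snippet above
    val ssc = new StreamingContext(sparkConf, Seconds(10))

    // Spark acts as the Avro server here; Flume's avro sink connects to this host:port
    val flumeStream = FlumeUtils.createStream(ssc, hostname, port, StorageLevel.MEMORY_ONLY)
    flumeStream.count().map(cnt => "Received " + cnt + " flume events.").print()

    ssc.start()
    ssc.awaitTermination()
  }
}

(In a regular build, the cleaner route would be to declare spark-streaming-flume_2.10 as a dependency instead of dragging in the examples jar; the --jars trick above was just the quickest way to get unblocked.)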
Three, run test

1. Submit the Spark job first
Submit the job in Spark first, which creates the Avro listener on port 11111:
bin/spark-submit --class com.dt.spark.flume.SparkStreamingFlume \
  --jars /lib/spark-1.6.1/spark-examples-1.6.1-hadoop2.6.0.jar \
  --master local[5] sparkapps.jar 192.168.0.10 11111
Otherwise, Flume will not be able to connect to port 11111.
2. Start Flume
$ bin/flume-ng agent --conf conf --conf-file example5.properties --name a1 -Dflume.root.logger=INFO,console
Because the sink is Avro, Flume will send events to port 11111; it also starts listening on port 22222 for the netcat source.
There was a pit in the middle: an error like
Unable to create RPC client using hostname: 192.168.0.10, port: 11111
It turned out the command I had originally run was bin/flume-ng agent --conf conf --conf-file conf/example5.properties --name A1 -Dflume.root.logger=INFO,console,
where one of the names was wrong.
3. Trigger Data:
telnet localhost 22222
Type in a string, and the effect then shows up on the Flume console.
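Not part of the original post, but if you also want to see the telnet text on the Spark driver console (assuming the Spark 1.6 API, where each SparkFlumeEvent wraps an AvroFlumeEvent with an array-backed body), something like this can be added next to the count:

// Print each Flume event body as a UTF-8 string
flumeStream
  .map(sfe => new String(sfe.event.getBody.array(), "UTF-8"))
  .print()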