Using Flume data sources in Spark

Source: Internet
Author: User

There are two ways to do this. In the first, Spark Streaming opens a listening port and Flume pushes data to it. In the second, Spark Streaming polls Flume at a regular interval and pulls the data itself.

At first I thought only the first method existed. The problem with it is that the node the listener starts on is not fixed, so every time I restarted the streaming job I had to change the Flume configuration, which was very painful. Later I found the second method. The code for both methods is written out here; in practice they differ very little. (The code is taken from the official GitHub repository.)

The first approach, listening on a port:

package org.apache.spark.examples.streaming

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming._
import org.apache.spark.streaming.flume._
import org.apache.spark.util.IntParam

/**
 * Produces a count of events received from Flume.
 *
 * This should be used in conjunction with an AvroSink in Flume. It will start
 * an Avro server on at the request host:port address and listen for requests.
 * Your Flume AvroSink should be pointed to this address.
 *
 * Usage: FlumeEventCount
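
The listing above breaks off before the body of the program. As a rough illustration of the push approach, here is a minimal sketch. It assumes the spark-streaming-flume module (shipped with Spark 1.x and 2.x) is on the classpath. The object name FlumePushSketch, the helper formatCount, the 2-second batch interval, and the Flume agent, sink, and channel names in the comments are all made up for illustration and are not from the original post.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Milliseconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Sketch only. On the Flume side an Avro sink must point at the same host:port,
// for example (agent, sink, and channel names are hypothetical):
//   agent.sinks.avroSink.type = avro
//   agent.sinks.avroSink.hostname = <host the receiver runs on>
//   agent.sinks.avroSink.port = <port>
//   agent.sinks.avroSink.channel = memoryChannel
object FlumePushSketch {
  // Hypothetical helper: builds the line printed for each batch.
  def formatCount(cnt: Long): String = s"Received $cnt flume events."

  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: FlumePushSketch <host> <port>")
      System.exit(1)
    }
    val host = args(0)
    val port = args(1).toInt // plain toInt instead of the Spark-internal IntParam

    val sparkConf = new SparkConf().setAppName("FlumePushSketch")
    // Assumed batch interval of 2 seconds.
    val ssc = new StreamingContext(sparkConf, Milliseconds(2000))

    // Starts an Avro server at host:port; the Flume Avro sink pushes events to it.
    val stream = FlumeUtils.createStream(ssc, host, port, StorageLevel.MEMORY_ONLY_SER_2)

    // Count the events in each batch and print one line per batch.
    stream.count().map(cnt => formatCount(cnt)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because Flume connects to the address given here, that address has to match wherever the receiver actually ends up running, which is the restart pain described earlier.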

The second approach, actively polling Flume for data:

package org.apache.spark.examples.streaming

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming._
import org.apache.spark.streaming.flume._
import org.apache.spark.util.IntParam
import java.net.InetSocketAddress

/**
 * Produces a count of events received from Flume.
 *
 * This should be used in conjunction with the Spark Sink running in a Flume agent. See
 * the Spark Streaming programming guide for more details.
 *
 * Usage: FlumePollingEventCount
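
This listing is also cut off before the program body. A minimal sketch of the pull approach might look like the following. It assumes the spark-streaming-flume module (Spark 1.x and 2.x) is on the Spark classpath and that the Spark sink jar is on the Flume agent's classpath. The object name FlumePullSketch, the helper formatCount, the 2-second batch interval, and the agent, sink, and channel names in the comments are hypothetical.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Milliseconds, StreamingContext}
import org.apache.spark.streaming.flume.FlumeUtils

// Sketch only. On the Flume side the agent runs the Spark sink, which buffers
// events until Spark fetches them, for example (names are hypothetical):
//   agent.sinks.sparkSink.type = org.apache.spark.streaming.flume.sink.SparkSink
//   agent.sinks.sparkSink.hostname = <host of the Flume agent>
//   agent.sinks.sparkSink.port = <port>
//   agent.sinks.sparkSink.channel = memoryChannel
object FlumePullSketch {
  // Hypothetical helper: builds the line printed for each batch.
  def formatCount(cnt: Long): String = s"Received $cnt flume events."

  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: FlumePullSketch <host> <port>")
      System.exit(1)
    }
    val host = args(0)       // host where the Flume agent's Spark sink listens
    val port = args(1).toInt // plain toInt instead of the Spark-internal IntParam

    val sparkConf = new SparkConf().setAppName("FlumePullSketch")
    // Assumed batch interval of 2 seconds.
    val ssc = new StreamingContext(sparkConf, Milliseconds(2000))

    // Spark connects out to the sink at host:port and pulls events from it.
    val stream = FlumeUtils.createPollingStream(ssc, host, port)

    // Count the events in each batch and print one line per batch.
    stream.count().map(cnt => formatCount(cnt)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Here the host and port belong to the Flume agent rather than to the Spark job, so restarting the streaming job on a different node should not require touching the Flume configuration.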

  
