Running the first Spark Streaming program (and solving problems along the way)
The goal is to practice the full workflow of creating and developing a Scala project. Create a Scala project named streaming, built with sbt. Since I use Spark 2.0.0, which officially calls for sbt 0.13.6 or later, pay attention to version compatibility.
Add the spark-streaming dependency
Add the dependency in build.sbt:
name := "streaming"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "2.0.0" % "provided"
sbt-assembly plugin
Add the following line to project/plugins.sbt, then recompile the project:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")
Coding
Add a QNetworkWordCount.scala file under the src directory with the following code:
package com.iwaimai.huatuo

import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

/** Created by Doctorq on 16/9/2. */
object QNetworkWordCount {
  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: QNetworkWordCount <hostname> <port>")
      System.exit(1)
    }
    Logger.getRootLogger.setLevel(Level.WARN) // cut down on log noise
    val ssc = new StreamingContext(new SparkConf().setAppName("QNetworkWordCount"), Seconds(1))
    // Read text lines from a TCP socket and count the words in each batch
    val lines = ssc.socketTextStream(args(0), args(1).toInt, StorageLevel.MEMORY_AND_DISK_SER)
    lines.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).print()
    ssc.start()
    ssc.awaitTermination()
  }
}
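The flatMap/map/reduceByKey pipeline is the heart of this example. A minimal sketch in plain Scala (no Spark required; the object and method names here are my own) shows what that transformation computes for each batch of lines:

```scala
object WordCountSketch {
  // Same per-batch transformation as the DStream pipeline above,
  // applied to a plain List instead of a stream of RDDs.
  def countWords(lines: List[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))                                      // flatMap(_.split(" "))
      .map(word => (word, 1))                                     // map((_, 1))
      .groupBy(_._1)                                              // group pairs by word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }  // reduceByKey(_ + _)

  def main(args: Array[String]): Unit = {
    // counts: hello -> 2, spark -> 1, streaming -> 1
    println(countWords(List("hello spark", "hello streaming")))
  }
}
```

In the streaming version this runs once per batch interval, so the printed counts cover only the lines received in that second, not a running total.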
Run
There are two ways to run it: directly from the IDE, or from the command line with spark-submit. Either way, you first need to run nc -lk 9999 to open the socket the stream reads from.
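As a sketch, the two-terminal workflow looks like this (assuming netcat is installed):

```shell
# Terminal 1: listen on TCP port 9999; lines typed here become the stream input
nc -lk 9999

# Terminal 2: start the application (IDE run or spark-submit); it connects
# to localhost:9999 and prints the word counts of each one-second batch
```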
Run from IDEA
First configure the run parameters.
In the run configuration, the red box in the screenshot marks the program arguments (localhost 9999); then run it and watch the output.
Run from the command line
First, package the project with sbt.
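Concretely, packaging means building the fat jar with the sbt-assembly plugin added earlier; a sketch of the command, run from the streaming project root (the output path follows from the name, version, and scalaVersion in build.sbt):

```shell
# Build the assembly jar; with the settings above it lands at
# target/scala-2.11/streaming-assembly-1.0.jar
sbt assembly
```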
Then execute this command from the Spark directory:
> bin/spark-submit --class "com.iwaimai.huatuo.QNetworkWordCount" --master local[2] /users/doctorq/documents/developer/idea_workspace/streaming/target/scala-2.11/streaming-assembly-1.0.jar localhost 9999
Run on a cluster
Change the --master value from local mode to the master URL of the cluster (on my machine it is spark://doctorqdemacbook-pro.local:7077):
/users/doctorq/documents/developer/spark-2.0.0-bin-hadoop2.7/bin/spark-submit --class "com.iwaimai.huatuo.QNetworkWordCount" --master spark://doctorqdemacbook-pro.local:7077 /users/doctorq/documents/developer/idea_workspace/streaming/target/scala-2.11/streaming-assembly-1.0.jar localhost 9999
Summary
This example walks through the whole process of developing and packaging a Scala project in IDEA. To run Spark Streaming from the command line, the jar must be packaged first and launched with spark-submit. The sample above runs in local mode, so no application shows up in the Web UI; to run on a cluster, change the --master value to the cluster's master URL (mine is spark://doctorqdemacbook-pro.local:7077). Keep in mind that the code itself does not set the master, so the --master flag controls where it runs.