Starting a Spark business application from a Java program


Reference: http://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/launcher/package-summary.html

Following this example, I wrote a launcher that can start a Spark business program from a plain Java command line.

Today I came across another article; the following is the original text from its author:

Sometimes we need to start our Spark application from another Scala/Java application, and for this we can use SparkLauncher. Here is an example in which we write a Spark application and then run it from another Scala application.

Let's look at the Spark application code:


import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object SparkApp extends App {
  val conf = new SparkConf().setMaster("local[*]").setAppName("spark-app")
  val sc = new SparkContext(conf)
  val rdd = sc.parallelize(Array(2, 3, 2, 1))
  rdd.saveAsTextFile("result")
  sc.stop()
}

That is the whole of our simple Spark application. Make a jar of this application using sbt assembly; now we write a Scala application through which we start this Spark application, as follows:

import org.apache.spark.launcher.SparkLauncher

object Launcher extends App {

  val spark = new SparkLauncher()
    .setSparkHome("/home/knoldus/spark-1.4.0-bin-hadoop2.6")
    .setAppResource("/home/knoldus/spark_launcher-assembly-1.0.jar")
    .setMainClass("SparkApp")
    .setMaster("local[*]")
    .launch()
  spark.waitFor()

}

In the above code we create a SparkLauncher object and set the following values:

.setSparkHome("/home/knoldus/spark-1.4.0-bin-hadoop2.6") sets the Spark home directory, which is used internally to call spark-submit.

.setAppResource("/home/knoldus/spark_launcher-assembly-1.0.jar") specifies the jar of our Spark application.

.setMainClass("SparkApp") sets the entry point of the Spark program, i.e. the driver class.

.setMaster("local[*]") sets the master address, i.e. where the application starts; for now we run it on the local machine.

.launch() simply starts our Spark application.

These are the minimal requirements; you can also set many other options, such as passing application arguments, adding extra jars, and setting Spark configuration properties.
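To illustrate those extra options, here is a minimal Java sketch (SparkLauncher is a Java API, so the same calls work from Scala). The argument values, the extra jar path, and the memory settings below are illustrative assumptions, not taken from the article:

import org.apache.spark.launcher.SparkLauncher;

public class LauncherWithOptions {
  public static void main(String[] args) throws Exception {
    Process spark = new SparkLauncher()
        .setSparkHome("/home/knoldus/spark-1.4.0-bin-hadoop2.6")
        .setAppResource("/home/knoldus/spark_launcher-assembly-1.0.jar")
        .setMainClass("SparkApp")
        .setMaster("local[*]")
        // pass command-line arguments through to SparkApp's main method
        .addAppArgs("arg1", "arg2")
        // ship an additional jar with the job (hypothetical path)
        .addJar("/home/knoldus/extra-dependency.jar")
        // set arbitrary Spark configuration properties
        .setConf(SparkLauncher.DRIVER_MEMORY, "2g")
        .setConf("spark.executor.memory", "1g")
        .launch();
    spark.waitFor();
  }
}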

For the source code you can check out the following git repo:

Spark_laucher is our Spark application.

Launcher_app is our Scala application which starts the Spark application.

Change the paths to match your environment and make a jar of Spark_laucher, then run Launcher_app; you will see the RDD output in the result directory, as the outcome of the Spark application, because we simply saved it as a text file.

https://github.com/phalodi/Spark-launcher

In short: you write a Myspark.jar file, which contains an ordinary Spark program.

Then you write a launcher; my understanding is that this is a program, similar to spark-class, that starts or invokes the Myspark.jar file you wrote above.

The key places are setAppResource, setMainClass, and setMaster: they set your Myspark.jar, the name of the class to run inside that jar, and the run mode. After testing, setMaster seems to support only "yarn-client". (That is just the result of my experiment; my guess is that because there is something interactive in my code, only yarn-client is supported here. In theory yarn-cluster mode is supported as well; maybe it is a problem in my program.)

How to use it from Java:

With java -jar spark-launcher.jar you can run your Myspark.jar file without going through a script (such as spark-submit). (Run this way, you can see the output printed to the screen; in fact, the official site has such an example, including one that prints the output to the screen. It seems to use an output stream, but I have forgotten the details...)
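As a rough sketch of this approach (the Spark home, jar path, and main class com.example.MySpark below are hypothetical placeholders; yarn-client is the master that worked in my test), such a launcher that echoes the job's output to the screen might look like:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.spark.launcher.SparkLauncher;

public class MySparkLauncher {
  public static void main(String[] args) throws Exception {
    Process spark = new SparkLauncher()
        .setSparkHome("/opt/spark-1.4.0-bin-hadoop2.6")  // hypothetical Spark home
        .setAppResource("/opt/jobs/Myspark.jar")         // hypothetical jar path
        .setMainClass("com.example.MySpark")             // hypothetical main class
        .setMaster("yarn-client")  // yarn-cluster should also work in theory
        .launch();

    // The launched spark-submit writes to the child process's stdout/stderr;
    // echo stdout here so "java -jar ..." shows the job output on screen.
    // (In a real program, drain getErrorStream() on a separate thread too,
    // otherwise a full stderr buffer can block the child process.)
    BufferedReader out = new BufferedReader(
        new InputStreamReader(spark.getInputStream()));
    String line;
    while ((line = out.readLine()) != null) {
      System.out.println(line);
    }

    int exitCode = spark.waitFor();
    System.out.println("Spark job finished with exit code " + exitCode);
  }
}

Packaged as spark-launcher.jar, this can then be started with java -jar spark-launcher.jar, with no spark-submit script involved.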

Why use this approach: because the Myspark business program needs to run in conjunction with a web container, and if it can be started from a plain Java environment, then Spark can run inside a web container. (Not tested yet...)
