Launching a Spark Application Programmatically from Java


Official documentation:

http://spark.apache.org/docs/1.4.0/api/java/org/apache/spark/launcher/package-summary.html

Following that example, I wrote a launcher that can run a Spark business program from the java command line.

Today I searched again and found an article; the following is the original text from the web:

Sometimes we need to start our Spark application from another Scala/Java application, and for that we can use SparkLauncher. Here is an example in which we build a Spark application and run it from another Scala application.

Let's look at our Spark application code.


import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

object SparkApp extends App {
  // Run locally, using all available cores.
  val conf = new SparkConf().setMaster("local[*]").setAppName("spark-app")
  val sc = new SparkContext(conf)
  // Build a small RDD and save it as text files under ./result.
  val rdd = sc.parallelize(Array(2, 3, 2, 1))
  rdd.saveAsTextFile("result")
  sc.stop()
}

This is our simple Spark application. Build a fat jar of it using sbt assembly.
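As a sketch, the build definition might look like the following; the sbt-assembly plugin version and Scala version are my assumptions, chosen to match Spark 1.4.0, and are not taken from the original article:

// project/plugins.sbt -- pulls in the sbt-assembly plugin (version is an assumption)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

// build.sbt -- name and version chosen so that `sbt assembly` emits
// target/scala-2.10/spark_launcher-assembly-1.0.jar, the jar referenced below
name := "spark_launcher"
version := "1.0"
scalaVersion := "2.10.4"
// "provided" keeps Spark classes out of the fat jar; spark-submit supplies them at run time
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.0" % "provided"

Now we write a Scala application through which we start this Spark application: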

import org.apache.spark.launcher.SparkLauncher

object Launcher extends App {
  // Launch the Spark application as a child process via spark-submit.
  val spark = new SparkLauncher()
    .setSparkHome("/home/knoldus/spark-1.4.0-bin-hadoop2.6")
    .setAppResource("/home/knoldus/spark_launcher-assembly-1.0.jar")
    .setMainClass("SparkApp")
    .setMaster("local[*]")
    .launch()
  // Block until the child process exits.
  spark.waitFor()
}

In the above code we create a SparkLauncher and set values on it:

setSparkHome("/home/knoldus/spark-1.4.0-bin-hadoop2.6") sets the Spark home directory, which is used internally to call spark-submit.

setAppResource("/home/knoldus/spark_launcher-assembly-1.0.jar") specifies the jar of our Spark application.

setMainClass("SparkApp") sets the entry point of the Spark program, i.e. the driver program.

setMaster("local[*]") sets the address of the master; here we run on the local machine.

launch() simply starts our Spark application.

This is the minimal setup; you can also set many other options, such as passing application arguments, adding jars, and setting Spark configuration properties, as sketched below.
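As a sketch of those extra options, a more fully configured launcher might look like this; the argument values, extra jar path, and memory setting are placeholders of mine, not values from the article:

import org.apache.spark.launcher.SparkLauncher

object ConfiguredLauncher extends App {
  val process = new SparkLauncher()
    .setSparkHome("/home/knoldus/spark-1.4.0-bin-hadoop2.6")
    .setAppResource("/home/knoldus/spark_launcher-assembly-1.0.jar")
    .setMainClass("SparkApp")
    .setMaster("local[*]")
    // Pass command-line arguments to the application's main method.
    .addAppArgs("arg1", "arg2")
    // Ship an extra jar to the application's classpath.
    .addJar("/home/knoldus/extra-lib.jar")
    // Set arbitrary Spark configuration properties.
    .setConf("spark.executor.memory", "2g")
    // Echo the full spark-submit command line for debugging.
    .setVerbose(true)
    .launch()
  process.waitFor()
}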

For the source code you can check out the following git repo:

Spark_laucher is our Spark application.

launcher_app is our Scala application which starts the Spark application.

Change the paths according to your setup, build a jar of Spark_laucher, then run launcher_app and check the resulting RDD in the result directory, since we simply save it as a text file.

https://github.com/phalodi/Spark-launcher

The gist is: you write a file such as myspark.jar, developed following the normal Spark workflow.

Then you write a launcher; as I understand it, this is a program similar to spark-class that starts or invokes the myspark.jar file you wrote above.

Key points: setAppResource, setMainClass, and setMaster set, respectively, your myspark.jar, the class to run inside that jar, and the run mode. In my tests, setMaster only seemed to work with "yarn-client". (That is just my experimental result; I suspect it is because my code does interactive work, which is why only yarn-client succeeded here. In theory yarn-cluster mode should also be supported; it may be a problem with my program.) A sketch of a yarn-client launch follows.
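As a sketch of what a yarn-client launch might look like, assuming a Hadoop configuration directory at /etc/hadoop/conf (the paths and the environment setup are my assumptions):

import java.util.{HashMap => JHashMap}
import org.apache.spark.launcher.SparkLauncher

object YarnLauncher extends App {
  // spark-submit needs to find the YARN configuration in the child's environment.
  val env = new JHashMap[String, String]()
  env.put("HADOOP_CONF_DIR", "/etc/hadoop/conf")

  val process = new SparkLauncher(env)
    .setSparkHome("/home/knoldus/spark-1.4.0-bin-hadoop2.6")
    .setAppResource("/home/knoldus/myspark.jar")
    .setMainClass("SparkApp")
    // yarn-client: the driver runs in this JVM's child process,
    // while executors run in YARN containers.
    .setMaster("yarn-client")
    .launch()
  process.waitFor()
}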

How to run it with java:

Running java -jar spark-launcher means you no longer need a script (the spark-submit approach) to run your myspark.jar file. (Run this way, you don't see the results printed to the screen; the official site actually has examples of this, including ones that echo the output to the screen, which I recall used an output stream, though I forget the details.) A sketch of that idea follows.
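Since launch() returns a plain java.lang.Process, one way to make the child's output visible is to pump its stdout and stderr to the parent's console in background threads. This is a sketch of that idea under my assumptions, not the official example the author half-remembers:

import java.io.{InputStream, PrintStream}
import org.apache.spark.launcher.SparkLauncher
import scala.io.Source

object VisibleLauncher extends App {
  val process = new SparkLauncher()
    .setSparkHome("/home/knoldus/spark-1.4.0-bin-hadoop2.6")
    .setAppResource("/home/knoldus/myspark.jar")
    .setMainClass("SparkApp")
    .setMaster("local[*]")
    .launch()

  // Copy one of the child's output streams to the given console stream;
  // the thread ends on its own when the stream closes at process exit.
  def pump(in: InputStream, out: PrintStream): Unit =
    new Thread(new Runnable {
      def run(): Unit = Source.fromInputStream(in).getLines().foreach(line => out.println(line))
    }).start()

  pump(process.getInputStream, System.out)
  pump(process.getErrorStream, System.err)
  process.waitFor()
}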

Why use this approach: because the myspark business program needs to run alongside a web container; if it can run in a plain Java environment, then Spark can run inside the web container. (Not yet tested...)
