Setting Up a Spark Development Environment


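Spark, MapReduce, and MPI may eventually converge on a single platform, each with a different focus. The result would combine cloud computing with high-performance computing, with the two complementing each other. Having gone through the basics of Spark, I am now looking at the development environment, mainly so that I can read the source code. The recent "Apache Spark source code walkthrough" series is quite good, and I have read part of it. Setting up the environment is not very complicated; for details, see https://github.com/apache/spark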

1. Download the code

git clone https://github.com/apache/spark.git

2. Build Spark directly

I am building against Hadoop 2.2.0, so I run the following from the root of the source tree:

SPARK_HADOOP_VERSION=2.2.0 SPARK_YARN=true sbt/sbt assembly
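
If you build against more than one Hadoop version, it can help to assemble the build line in a small script instead of retyping it. The following is only a sketch: the helper names build_command and as_shell_line are hypothetical, and the only environment variables used are the two shown in the command above. It prints the command and does not run sbt.

```python
import shlex


def build_command(hadoop_version="2.2.0", with_yarn=True):
    """Return (env_overrides, argv) for the sbt assembly build shown above.

    Hypothetical helper: it only assembles values, it does not run anything.
    """
    env = {"SPARK_HADOOP_VERSION": hadoop_version}
    if with_yarn:
        # Only set the variable when YARN support is wanted.
        env["SPARK_YARN"] = "true"
    argv = ["sbt/sbt", "assembly"]
    return env, argv


def as_shell_line(env, argv):
    """Render the pair as one shell line, suitable for printing or logging."""
    assignments = [f"{key}={shlex.quote(value)}" for key, value in env.items()]
    return " ".join(assignments + [shlex.quote(arg) for arg in argv])


if __name__ == "__main__":
    env, argv = build_command()
    print(as_shell_line(env, argv))
```

To actually start the build from Python, you could pass argv to subprocess.run with the extra variables merged into os.environ, run from the source root.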

3. For usage details, see https://github.com/apache/spark (the relevant parts of its README are quoted below)

Interactive Scala Shell

The easiest way to start using Spark is through the Scala shell:

./bin/spark-shell

Try the following command, which should return 1000:

scala> sc.parallelize(1 to 1000).count()

Interactive Python Shell

Alternatively, if you prefer Python, you can use the Python shell:

./bin/pyspark

And run the following command, which should also return 1000:

>>> sc.parallelize(range(1000)).count()
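
Both shell commands return 1000 because each collection has exactly 1000 elements: Scala's 1 to 1000 is inclusive, while Python's range(1000) covers 0 to 999. The plain-Python sketch below mimics the idea behind parallelize(...).count() without needing Spark: split the data into partitions, count each partition, then add up the partial counts. The function names and the even-split strategy are assumptions for illustration, not Spark's actual slicing logic.

```python
def split_into_partitions(items, num_partitions):
    """Split items into num_partitions contiguous chunks of near-equal size."""
    items = list(items)
    size, extra = divmod(len(items), num_partitions)
    partitions, start = [], 0
    for index in range(num_partitions):
        # The first `extra` partitions take one additional element.
        end = start + size + (1 if index < extra else 0)
        partitions.append(items[start:end])
        start = end
    return partitions


def distributed_count(items, num_partitions=4):
    """Count elements per partition, then sum the partial counts."""
    partial_counts = [len(part) for part in split_into_partitions(items, num_partitions)]
    return sum(partial_counts)


if __name__ == "__main__":
    print(distributed_count(range(1000)))     # Python shell example: 0..999
    print(distributed_count(range(1, 1001)))  # Scala shell example: 1..1000
```

In a real cluster the per-partition counts are computed on different workers and only the small partial results travel back to the driver.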

Example Programs

Spark also comes with several sample programs in the examples directory. To run one of them, use ./bin/run-example <class> [params]. For example:

./bin/run-example SparkPi

will run the Pi example locally.
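
For readers who have not seen it, SparkPi estimates pi with a Monte Carlo method: it samples random points in a square and counts how many fall inside the inscribed circle. The sketch below shows the same idea in plain Python on a single machine. The function name, the seed parameter, and the sample count are illustrative assumptions, and this is not the code that ships with Spark.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi by sampling points uniformly in the square [-1, 1] x [-1, 1]."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    inside = 0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:
            inside += 1
    # The circle covers pi/4 of the square, so scale the hit ratio by 4.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print(estimate_pi(100_000))
```

The Spark version distributes the sampling loop across partitions and sums the hit counts, which is why it makes a convenient smoke test for a new build.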

You can set the MASTER environment variable when running examples to submit examples to a cluster. This can be a mesos:// or spark:// URL, "yarn-cluster" or "yarn-client" to run on YARN, and "local" to run locally with one thread, or "local[N]" to run locally with N threads. You can also use an abbreviated class name if the class is in the examples package. For instance:

MASTER=spark://host:7077 ./bin/run-example SparkPi
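
The accepted MASTER forms listed above can be summarized as a small classifier. This is an illustrative sketch only: describe_master is a hypothetical helper, it handles just the forms named in the paragraph above, and it is not how Spark itself parses the value.

```python
import re

_LOCAL_N = re.compile(r"^local\[(\d+)\]$")


def describe_master(master):
    """Classify a MASTER value into a (mode, detail) pair."""
    if master == "local":
        return ("local", 1)  # one local thread
    match = _LOCAL_N.match(master)
    if match:
        return ("local", int(match.group(1)))  # N local threads
    if master in ("yarn-client", "yarn-cluster"):
        return ("yarn", master.split("-", 1)[1])  # deploy mode on YARN
    for scheme in ("spark", "mesos"):
        prefix = scheme + "://"
        if master.startswith(prefix):
            return (scheme, master[len(prefix):])  # host:port of the cluster
    raise ValueError(f"unrecognized MASTER value: {master!r}")


if __name__ == "__main__":
    for value in ("local", "local[4]", "spark://host:7077", "yarn-client"):
        print(value, "->", describe_master(value))
```

A check like this can catch a mistyped MASTER value before an example is submitted.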

Many of the example programs print usage help if no params are given.

Running Tests

Testing first requires building Spark. Once Spark is built, tests can be run using:

./sbt/sbt test


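Using an IDE: install IntelliJ IDEA and the Scala plugin

Download the IDEA tar.gz package from the official IDEA website and extract it. Then launch IDEA and install the Scala plugin.

In the root of the source tree, run the following command: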

./sbt/sbt gen-idea

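This generates the IDEA project files. In IDEA, click File -> Open Project, browse to the Spark source folder (incubator-spark in older checkouts; spark if you cloned with the command in step 1), and open the project. You can then browse and modify the Spark code.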

 

For details, see: https://github.com/apache/spark

http://cn.soulmachine.me/blog/20140130/
