Configuring a Spark Development Environment on Windows

Source: Internet
Author: User

http://www.cnblogs.com/davidwang456/p/5032766.html


-- This essay was contributed by my colleague GE.


Special note: developing Spark on Windows does not require a local Hadoop installation, but it does require files such as winutils.exe and hadoop.dll. It also assumes you already have Eclipse, Maven, the JDK, and similar software installed.

The recommended JDK version for Spark is 1.8 or above, so if you are developing for Spark, set the JDK build (compiler compliance) level to 1.8.

My choice of Spark is spark-1.4.0-bin-hadoop2.6.tgz, so that version is used as the example.

Step 1: Download spark-1.4.0-bin-hadoop2.6.tgz and unzip it to a local directory.

Address: http://spark.apache.org/downloads.html

Step 2: Download the Hadoop toolkit for Windows (available in 32-bit and 64-bit builds), and create a local Hadoop directory that contains a bin subdirectory, for example: D:\spark\hadoop-2.6.0\bin

Then put winutils.exe and the related files into that bin directory.

Address: https://github.com/sdravida/hadoop2.6_Win_x64/tree/master/bin

Step 3: Configure the environment variables for Hadoop and Spark:

HADOOP_HOME, for example: D:\spark\hadoop-2.6.0

SPARK_HOME

SPARK_CLASSPATH

Add the Spark and Hadoop bin directories to PATH.
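From a cmd shell, the variables from this step can be set persistently along these lines. This is only a sketch: the Hadoop path is the example from above, while the Spark path and the SPARK_CLASSPATH value are assumptions (the original post does not give concrete values for them); adjust all of them to your own install locations.

```
:: Example paths -- only D:\spark\hadoop-2.6.0 comes from the post itself
setx HADOOP_HOME "D:\spark\hadoop-2.6.0"
setx SPARK_HOME "D:\spark\spark-1.4.0-bin-hadoop2.6"
setx SPARK_CLASSPATH "%SPARK_HOME%\lib"
:: Add both bin directories to PATH
setx PATH "%PATH%;%HADOOP_HOME%\bin;%SPARK_HOME%\bin"
```

Note that setx only affects newly opened cmd windows, so open a fresh prompt before testing.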

At this point, running spark-shell at a cmd prompt should confirm that the Windows configuration is OK.

Building a Hello World program in your own Spark Maven project

For programmers who already have an Eclipse environment installed, there is no need to install a separate Spark development environment. Since Spark is written in Scala, however, you will need the Scala Eclipse plugin if you want to read the source code.

First, install the Scala Eclipse plugin.

Address: http://download.scala-ide.org/sdk/lithium/e44/scala211/stable/site

Step 1: Create your own Spark Maven project.

Tick "Create a simple project".

Step 2: Select jar as the Maven packaging type. This must be selected here, because a Spark program is typically packaged as a jar.

Fill in the other required fields.

Step 3: Add the Spark jar to the new Maven project you just created.

Locate the lib directory under the Spark installation used by the cluster, and add the jar found there to the Maven project's build path.

Step 4: Add your own Spark and Hadoop Maven dependencies to the POM.

For example:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.5.2</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>1.5.2</version>
</dependency>
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>1.2.0</version>
</dependency>
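Since the post recommends a 1.8 build level, the compiler level can be pinned in the same POM so Eclipse and command-line builds agree. A minimal sketch; the plugin version shown is an assumption, not something given in the original post:

```
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.3</version>
      <configuration>
        <source>1.8</source>
        <target>1.8</target>
      </configuration>
    </plugin>
  </plugins>
</build>
```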

Step 5: The entry point of a Spark program is the main function, so you can write your own Hello World and run and debug it.

import java.io.Serializable;

public class SparkMain implements Serializable {

    public static void main(String[] args) throws Exception {
        // write your own Spark program here
        System.out.println("Hello spark!");
    }
}

Now everything is ready for you to run your main class. Enjoy!
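If a program run from Eclipse complains that it cannot locate winutils.exe, the hadoop.home.dir system property can also be set from code, before any Spark or Hadoop API is touched; Hadoop's Shell class checks that property and then the HADOOP_HOME environment variable, and resolves winutils.exe under the bin subdirectory. A minimal pure-JDK sketch, where the D:\spark path is the assumed example layout from the steps above and WinutilsSetup is a hypothetical class name:

```java
public class WinutilsSetup {
    // Assumed install path from the earlier steps; change to your own layout.
    static final String HADOOP_HOME = "D:\\spark\\hadoop-2.6.0";

    // Hadoop resolves winutils.exe as <hadoop.home.dir>\bin\winutils.exe.
    static String winutilsPath(String hadoopHome) {
        return hadoopHome + "\\bin\\winutils.exe";
    }

    public static void main(String[] args) {
        // In-code equivalent of setting the HADOOP_HOME environment variable.
        System.setProperty("hadoop.home.dir", HADOOP_HOME);
        System.out.println(winutilsPath(HADOOP_HOME));
    }
}
```

Doing this in code is convenient for a Hello World, but the environment-variable approach from Step 3 is what the rest of this post relies on.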

Category: Big data and cloud computing

