http://www.cnblogs.com/davidwang456/p/5032766.html
Configuring a Spark Development Environment on Windows
-- This post was contributed by my colleague GE.
Special note: Developing Spark on Windows does not require installing Hadoop locally, but it does require files such as winutils.exe and hadoop.dll. This guide assumes that Eclipse, Maven, the JDK and similar tools are already installed.
Spark recommends JDK 1.8 or above; when developing Spark applications, set the JDK compiler compliance level to 1.8.
I chose spark-1.4.0-bin-hadoop2.6.tgz, so that version is used as the example.
Step 1: Download spark-1.4.0-bin-hadoop2.6.tgz and extract it to a local directory.
Address: http://spark.apache.org/downloads.html
Step 2: Download the Hadoop tools for Windows (available as 32-bit and 64-bit builds) and create a local Hadoop directory that contains a bin directory, for example D:\spark\hadoop-2.6.0\bin.
Then put winutils.exe and the related files into that bin directory.
Address: https://github.com/sdravida/hadoop2.6_Win_x64/tree/master/bin
Step 3: Configure the environment variables for Hadoop and Spark:
HADOOP_HOME, for example D:\spark\hadoop-2.6.0
SPARK_HOME (the Spark directory extracted in step 1)
SPARK_CLASSPATH
Add the Spark and Hadoop bin directories to PATH.
At this point, run spark-shell from a cmd prompt; if it starts, the Spark configuration on Windows is complete.
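If a Spark program launched from the IDE later complains that it cannot find winutils.exe, the Hadoop directory can also be set from code before Spark starts. The sketch below is only an illustration: it assumes the example path D:\spark\hadoop-2.6.0 used above, the class name is hypothetical, and hadoop.home.dir is the system property Hadoop consults when the HADOOP_HOME environment variable is not visible to the process.

public class WinutilsCheck {
    public static void main(String[] args) {
        // If HADOOP_HOME is not visible to this JVM, fall back to the example path from this article.
        if (System.getenv("HADOOP_HOME") == null) {
            System.setProperty("hadoop.home.dir", "D:\\spark\\hadoop-2.6.0");
        }
        // Spark on Windows needs winutils.exe inside the Hadoop bin directory.
        java.io.File winutils = new java.io.File("D:\\spark\\hadoop-2.6.0\\bin\\winutils.exe");
        System.out.println("winutils.exe found: " + winutils.exists());
    }
}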
Building your own Spark Maven Hello World project
If you already have Eclipse installed, there is no need to set up a separate Spark development environment. Spark itself is written in Scala, so you only need the Scala Eclipse plugin if you want to read the Spark source code.
Step 1: Install the Scala Eclipse plugin.
Address: http://download.scala-ide.org/sdk/lithium/e44/scala211/stable/site
Step 2: Create your own Spark Maven project.
Tick "Create a simple project".
When choosing the Maven packaging type, select jar; this must be jar because a Spark program is typically packaged as a jar.
Fill in the other required fields.
Step 3: Add the Spark jar packages to the Maven project you just created.
Locate the Spark installation directory of the cluster installation, look in its lib directory,
and add the jars found there to the Maven project's build path.
Step 4: Add your own Spark and Hadoop Maven dependencies in the pom.xml.
For example:
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.5.2</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>1.5.2</version>
</dependency>
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>1.2.0</version>
</dependency>
Step 5: The entry point of a Spark program is the main method, so you can write your own Hello World, run it, and debug it.
import java.io.Serializable;

public class SparkMain implements Serializable {
    public static void main(String[] args) throws Exception {
        // Write your own Spark program here
        System.out.println("Hello spark!");
    }
}
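If you want the Hello World to actually exercise Spark, a slightly fuller sketch follows. It is only an illustration: it assumes the spark-core dependency declared above, uses a local master so it runs inside Eclipse without a cluster, and the class name and sample data are made up.

import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkHello {
    public static void main(String[] args) throws Exception {
        // local[*] runs Spark inside the JVM launched by Eclipse, using all available cores.
        SparkConf conf = new SparkConf().setAppName("HelloSpark").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // Build a tiny RDD and run an action to confirm the Windows setup works end to end.
        JavaRDD<String> lines = sc.parallelize(Arrays.asList("Hello spark!", "Hello Windows!"));
        System.out.println("Line count: " + lines.count());
        sc.stop();
    }
}

Run it as an ordinary Java application in Eclipse; with a local master, no spark-submit is needed.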
Now everything is ready for you to run your main class. Enjoy!