Build a Scala + Spark Development Environment with Eclipse and IDEA


Install JDK 1.7.0_60 and Scala 2.10.4 on the development machine and configure the relevant environment variables. Plenty of guides cover this online, so the installation steps are omitted here. Eclipse Luna 4.4.1 and IntelliJ IDEA 14.0.2 are used below.
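For reference, a minimal sketch of the environment variables involved, assuming a Unix-like shell and purely illustrative install paths (on Windows, set the same variables through System Properties):

# Hypothetical install locations; adjust to the actual paths
export JAVA_HOME=/usr/local/jdk1.7.0_60
export SCALA_HOME=/usr/local/scala-2.10.4
export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$PATH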

1. Eclipse Development Environment Setup

1.1. Install the Scala Plugin

Install the Scala IDE plugin for Eclipse, available at http://scala-ide.org/download/prev-stable.html

After unzipping the download, copy its plugins and features directories into the Eclipse installation directory and restart Eclipse.

If Window -> Open Perspective -> Other... now lists Scala, the installation was successful.

1.2. Create a Maven Project

Open File -> New -> Other... and select Maven Project:

Click Next to enter the project storage path:

Click Next and select the org.scala-tools.archetypes archetype:

Click Next and fill in the artifact information:

Click Finish. The project directory structure created by default is as follows:
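(The original screenshot is not reproduced here. As a rough sketch, the scala-archetype-simple archetype typically generates a layout along these lines, with the package path following the groupId entered; the test stubs vary by archetype version:)

sparktest/
    pom.xml
    src/main/scala/com/ccb/App.scala
    src/test/scala/com/ccb/ (generated test stubs)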

Modify the pom.xml file:
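(The screenshot of the modified pom.xml is missing from the source. As a minimal sketch, the usual change is to pin the archetype's scala.version property to the version installed above, e.g.:)

<properties>
    <scala.version>2.10.4</scala.version>
</properties>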

At this point, a default Scala project is complete.

2. IDEA Development Environment Setup

2.1. Install the Scala Plugin

The IDEA version used on the development machine is IntelliJ IDEA 14.0.2. To enable Scala development in IDEA, install the Scala plugin.

After the plugin is installed, IntelliJ IDEA will ask for a restart.

2.2. Create a Maven Project

Click Create New Project and select the JDK installation directory as the Project SDK (it is recommended that the JDK version in the development environment match the JDK version on the Spark cluster). Click Maven on the left, tick Create from archetype, and select org.scala-tools.archetypes:scala-archetype-simple:

After clicking Next, fill in the GroupId, ArtifactId, and Version as needed (make sure Maven is already installed). When you click Finish, Maven automatically generates pom.xml and downloads the dependencies. The Scala version in pom.xml needs to be changed in the same way as in section 1.2 for the Eclipse Maven project.

At this point, a default Scala project has been created in IDEA.

3. WordCount Example Program

3.1. Modify the pom File

Add the Spark and Hadoop dependencies to the pom file:

<!-- Spark -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.1.0</version>
</dependency>
<!-- Spark Streaming -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.10</artifactId>
    <version>1.1.0</version>
</dependency>
<!-- HDFS -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.6.0</version>
</dependency>
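As an aside not in the original walkthrough: because the Spark classes are already present on the cluster's classpath at runtime, the Spark dependencies are often given <scope>provided</scope> so the assembled jar stays small; bundling them as above also works but yields a much larger jar.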

Use the maven-assembly-plugin in the <build> section so that dependencies are packaged into the jar as well:

<plugin>
    <artifactId>maven-assembly-plugin</artifactId>
    <version>2.5.5</version>
    <configuration>
        <appendAssemblyId>false</appendAssemblyId>
        <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
        <archive>
            <manifest>
                <mainClass>com.ccb.WordCount</mainClass>
            </manifest>
        </archive>
    </configuration>
    <executions>
        <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>
3.2. WordCount Example

WordCount counts the occurrences of every word in the input files. Reference code:

package com.ccb

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._

/** Counts the total number of occurrences of all words in the input directory. */
object WordCount {
  def main(args: Array[String]) {
    val dirIn = "hdfs://192.168.62.129:9000/user/vm/count_in"
    val dirOut = "hdfs://192.168.62.129:9000/user/vm/count_out"

    val conf = new SparkConf()
    val sc = new SparkContext(conf)

    val line = sc.textFile(dirIn)
    val cnt = line.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _) // split each line on spaces and count the words
    val sortedCnt = cnt.map(x => (x._2, x._1)).sortByKey(ascending = false).map(x => (x._2, x._1)) // sort by occurrence count, descending

    sortedCnt.collect().foreach(println) // print to the console
    sortedCnt.saveAsTextFile(dirOut)     // write to a text file
    sc.stop()
  }
}
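As a hedged aside, not part of the original walkthrough: for a quick test without a cluster, the master can be set programmatically instead of via spark-submit (setAppName and setMaster are standard SparkConf methods; local[2] is just an illustrative thread count):

val conf = new SparkConf().setAppName("WordCount").setMaster("local[2]") // run locally with 2 worker threads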
3.3. Submit Spark Execution

Run mvn package to produce sparktest-1.0-SNAPSHOT.jar, then submit it to the Spark cluster to run.

Example command:

./spark-submit --name WordCountDemo --class com.ccb.WordCount sparktest-1.0-SNAPSHOT.jar
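Depending on how the cluster is set up, a --master flag may also be required; --master is a standard spark-submit option, and the standalone master URL below is hypothetical:

./spark-submit --master spark://192.168.62.129:7077 --name WordCountDemo --class com.ccb.WordCount sparktest-1.0-SNAPSHOT.jar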

This produces the word-count statistics.

