Build a Spark integrated development environment with Eclipse

Source: Internet
Author: User
Tags scala ide

http://dongxicheng.org/framework-on-yarn/spark-eclipse-ide/

The previous article, "Apache Spark Learning: Deploying Spark to Hadoop 2.2.0", describes how to use Maven to build a Spark jar package that can run directly on Hadoop 2.2.0. Building on that, this article describes how to set up a Spark integrated development environment with Eclipse. Note that Eclipse is not the recommended tool for developing Spark programs or reading the Spark source code; IntelliJ IDEA is recommended instead. For details, see the article "Apache Spark Quest: Build the development environment with IntelliJ IDEA".

(1) Preparatory work

Before starting, prepare the following software and hardware:

Software Preparation:

Eclipse Juno (version 4.2), which can be downloaded directly here: Eclipse 4.2

Scala 2.9.3; the Windows installer can be downloaded directly here: Scala 2.9.3

The Eclipse Scala IDE plugin, which can be downloaded directly here: Scala IDE (for Scala 2.9.x and Eclipse Juno)

Hardware Preparation:

A machine running Linux or Windows

(2) Building the Spark integrated development environment

I am working under Windows; the process is as follows:

Step 1: Install Scala 2.9.3 by running the installer.

Step 2: Copy all files from the features and plugins directories of the Eclipse Scala IDE plugin into the corresponding directories of the unpacked Eclipse installation.

Step 3: Restart Eclipse and click the box-shaped button in the upper-right corner of Eclipse. In the list that expands, click "Other ..." and check whether "Scala" is listed. If it is, select it and click Open; otherwise proceed to Step 4.

Step 4: In Eclipse, select "Help" –> "Install New Software ...", enter http://download.scala-ide.org/sdk/e38/scala29/stable/site in the dialog that opens, and press Enter. From the items that appear, select the first two and install them. (Because Step 2 already copied the jar packages into Eclipse, the installation is quick; it only has to register them.) Once the installation is complete, repeat Step 3.
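Once the Scala perspective is available, one quick way to confirm that the plugin and the Scala library are wired up is to run a tiny Scala application from Eclipse. The following is only a minimal sketch; the object name CheckScalaSetup is hypothetical and not part of the original article. It prints the version string reported by the Scala library on the build path, which for the setup described here should be a 2.9.x version.

```scala
// Hypothetical sanity check, not part of the original article.
object CheckScalaSetup {

  // Version string reported by the Scala library on the classpath,
  // in the form "version <number>".
  def scalaVersion: String = scala.util.Properties.versionString

  def main(args: Array[String]): Unit = {
    println("Scala library: " + scalaVersion)
    println("Java runtime:  " + System.getProperty("java.version"))
  }
}
```

If Eclipse offers "Run As" –> "Scala Application" for this file and the program prints a version, the Scala IDE plugin is working.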

(3) Developing Spark programs using the Scala language

In Eclipse, select "File" –> "New" –> "Other ..." –> "Scala Wizard" –> "Scala Project", create a Scala project, and name it "SparkScala".

Right-click the "SparkScala" project and select "Properties". In the dialog that pops up, select "Java Build Path" –> "Libraries" –> "Add External JARs ...", and import spark-assembly-0.8.1-incubating-hadoop2.2.0.jar from the assembly/target/scala-2.9.3/ directory given in the article "Apache Spark Learning: Deploying Spark to Hadoop 2.2.0". You can also generate this jar package by compiling Spark yourself; it is placed in the assembly/target/scala-2.9.3/ directory under the Spark directory.
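To confirm that the assembly jar really ended up on the project's build path, a small check like the one below can be run inside the project. This is a sketch under assumptions: the object name ClasspathCheck and the helper isOnClasspath are hypothetical, and the check only looks for the class org.apache.spark.SparkContext, which the Spark assembly jar is expected to contain.

```scala
// Hypothetical helper, not part of the original article.
object ClasspathCheck {

  // Returns true if the named class can be located by this class loader.
  // The class is looked up without being initialized.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException => false
    }

  def main(args: Array[String]): Unit = {
    val sparkClass = "org.apache.spark.SparkContext"
    if (isOnClasspath(sparkClass))
      println("Found " + sparkClass + " on the build path")
    else
      println("Missing " + sparkClass + "; re-check the Java Build Path settings")
  }
}
```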

In the same way as you created the Scala project, add a Scala class named WordCount to the project.

WordCount is the classic word-frequency program: it counts the total number of occurrences of every word in the input directory. The Scala code is as follows:

    import org.apache.spark._
    import SparkContext._

    object WordCount {
      def main(args: Array[String]) {
        if (args.length != 3) {
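The listing above breaks off after the argument check. As a hedged sketch of how a word count of this kind could be completed against the Spark 0.8.1 API, the following version may help. It is not the author's original code: the object name WordCountSketch, the tokenize helper, and the meaning of the three arguments (master URL, input path, output path) are assumptions. It relies on the spark-assembly jar added to the build path earlier.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._ // implicit conversions that provide reduceByKey

// Hypothetical name; the article's own object is called WordCount.
object WordCountSketch {

  // Assumed helper: split a line on whitespace and drop empty tokens.
  def tokenize(line: String): Seq[String] =
    line.split("\\s+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    if (args.length != 3) {
      // Assumed argument order: <master> <input> <output>
      println("usage: WordCountSketch <master> <input> <output>")
    } else {
      val sc = new SparkContext(args(0), "WordCount")
      try {
        val counts = sc.textFile(args(1))      // read every line of the input
          .flatMap(line => tokenize(line))     // one record per word
          .map(word => (word, 1))              // pair each word with a count of 1
          .reduceByKey(_ + _)                  // sum the counts per word
        counts.saveAsTextFile(args(2))         // write (word, count) pairs
      } finally {
        sc.stop()
      }
    }
  }
}
```

For a first local run from Eclipse, the program arguments could be something like local, an input directory, and an output directory that does not exist yet; the paths are up to you. Running against a cluster would additionally require making the application jar available to the workers, which this sketch leaves out.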