Apache Spark Exploration: Building a Development Environment with IntelliJ IDEA

(1) Preparatory work
1) Install JDK 6, JDK 7, or JDK 8. For installation on a Mac, see http://docs.oracle.com/javase/8/docs/technotes/guides/install/mac_jdk.html
2) Install Scala 2.10.x (note the version: the Spark 0.9.x build used in this article is built against Scala 2.10). See http://www.cnblogs.com/xd502djj/p/6546514.html
3) Download the latest version of IntelliJ IDEA (this article uses IntelliJ IDEA Community Edition 13.1.1 as an example; the interface layout may differ between versions): http://www.jetbrains.com/idea/download/
4) After extracting the downloaded IntelliJ IDEA, install the Scala plugin as follows:
Select "Configure" –> "Plugins" –> "Browse repositories", enter Scala, then install
(2) Building a Spark source-reading environment (network access required)
The first method is to select "Import Project" –> select the Spark source directory –> "SBT". IntelliJ then automatically recognizes the SBT build file and downloads the dependent external jar packages. The whole process takes a very long time, typically tens of minutes to several hours depending on the machine's network environment (doing this under Windows is not recommended, as you may encounter various problems). Note that git is used during the download, so git should be installed beforehand.
The second method is to first generate the IntelliJ project files on a Linux system and then open the project directly via "Open Project" in IntelliJ IDEA. To generate the project files on Linux (git must be installed; Scala does not need to be installed, since SBT downloads it automatically), run sbt/sbt gen-idea in the Spark source root directory (see the sketch after the note below).
Note: if you want to read the source code under Windows, it is recommended to generate the project files under Linux first and then import them into IntelliJ IDEA on Windows.
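For example, a minimal sketch of the second method, assuming the source is cloned from the Apache Spark repository on GitHub (pick the branch or tag of the release you want to read):

    git clone https://github.com/apache/spark.git
    cd spark
    sbt/sbt gen-idea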
(3) Building a Spark development environment
Create a Scala project in IntelliJ IDEA, select "File" –> "Project Structure" –> "Libraries", and click "+" to import the jar package matching your Spark and Hadoop versions, for example Spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar (only this jar package needs to be imported). If the IDE does not recognize the Scala library, import the Scala library in the same way. After that, you can develop a Scala program, as in the sketch below:
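For example, a minimal word-count sketch (not from the original article: the object name, input path, and argument handling are illustrative). It uses the Spark 0.9-era API, in which the master URL is passed to the SparkContext constructor:

    import org.apache.spark.SparkContext
    import org.apache.spark.SparkContext._  // implicit conversions that enable reduceByKey

    object SparkWordCount {
      def main(args: Array[String]) {
        // args(0) is the master URL, e.g. "local" when run from IntelliJ.
        val sc = new SparkContext(args(0), "SparkWordCount")
        // Placeholder input path; args(1), if given, overrides it.
        val input = if (args.length > 1) args(1) else "README.md"
        val counts = sc.textFile(input)
          .flatMap(_.split("\\s+"))
          .map(word => (word, 1))
          .reduceByKey(_ + _)
        counts.take(10).foreach(println)  // print a few results to the console
        sc.stop()
      }
    }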
Once you have written the Scala program, you can run it directly in IntelliJ in local mode, as follows:
Click "Run" –> "Run Configurations", in the box that appears in the corresponding column "local", indicating that the parameter is passed to the main function, as shown, then click "Run" –> "Run" running the program.
If you want to package the program into a jar and run it from the command line on a Spark cluster, follow these steps:
Select "File" –> "Project Structure" –> "Artifact", select "+" –> "Jar" –> "from Modules with dependencies", select the main function, and select the output jar location in the pop-up box and select "OK".
Finally, select "Build" –> "Build Artifact" to compile and build the jar package.
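For example, a hedged sketch of launching the packaged class (not from the original article; the jar names, class name, and master URL are placeholders). Before bin/spark-submit was introduced in Spark 1.0, an application was typically started as a plain Java program with the Spark assembly on the classpath; on a real cluster the application jar must also be shipped to the executors, e.g. via sc.addJar or the jars parameter of the SparkContext constructor:

    java -cp wordcount.jar:Spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar \
      SparkWordCount spark://master:7077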
This is an original article; when reposting, please credit Dong's blog.
Article link: http://dongxicheng.org/framework-on-yarn/apache-spark-intellij-idea/
About the author: http://dongxicheng.org/about/
Collection of articles on this blog: http://dongxicheng.org/recommend/