Spark Development Environment Configuration (Windows / IntelliJ IDEA)

IntelliJ IDEA is an excellent IDE, popular in the Java/Scala/Groovy world. I used to develop with Eclipse, which served me well enough, but curiosity got the better of me and I gave IDEA a try; for writing Spark programs it turned out to be a delight. So here I summarize the configuration process on a Windows system (configuration on a Mac is actually simpler). Working it out myself took some effort: the tutorials online are a mix of new and old, right and wrong, and what annoys me most is downloading and configuring one piece of software after another, where a single missed setting can waste a long time. I have therefore written up the process I worked out, as a convenient reference for other Spark enthusiasts.

Configuration Prerequisites
  1. JDK installation. Download and install the JDK from Oracle's official website, then open a command-line window and confirm that java -version returns a version number. If it does not, check the system environment variables and confirm that Java has been added to PATH.
  2. Scala download and installation. Get the installer from the official website http://www.scala-lang.org/download and install it. As in step 1, confirm that typing scala on the command line opens the interactive REPL; if not, check the environment variable configuration. (A quick check you can run inside the REPL is shown after this list.)
  3. Spark download. The official website http://spark.apache.org/downloads.html offers pre-built Spark packages for several Hadoop versions; in principle you should pick the one matching the Hadoop version you use, but since this article only covers configuration, any of them will do. I downloaded Spark 1.5, pre-built for Hadoop 2.6, and simply extracted the archive.
  4. IntelliJ IDEA download. The free Community edition can be downloaded from https://www.jetbrains.com/idea/.
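
As an optional sanity check for steps 1 and 2 (my own addition, not part of the original walkthrough), you can print both version numbers from inside the Scala REPL:

    scala> println(util.Properties.versionString)       // prints the Scala version
    scala> println(System.getProperty("java.version"))  // prints the JDK version

If both lines print sensible version strings, the JDK and Scala are installed and reachable from the command line.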
Configuration Steps

1. Install the Scala plugin for IDEA

The steps are as shown. I already have the plugin installed, so the panel on the right shows it as present. For a first installation, open Plugins, search for the keyword "Scala", and click Install (a network connection is required).

2. Create the project and import the appropriate dependency package

Create a new project, select Scala, and click Next. The Project SDK is the Java JDK; if it is not detected automatically, click New and navigate manually to the JDK installation directory. Likewise, if the Scala SDK is not loaded by default, click Create and, in the pop-up window, accept the default (system) selection and click OK.

At this point, my project looks like this:

Next, import the Spark package downloaded earlier. Following the screenshots, click the + sign and select Java, then navigate to the directory where you extracted Spark, select the spark-assembly-1.5.0-hadoop2.6.0.jar file in the lib directory, and confirm.

At this point, the project's external library dependencies contain one more entry, the Spark assembly jar, as shown:

Program Development

With the basic configuration done, we can start coding.

A few details deserve attention. First, if right-clicking the src directory does not offer a creation option such as Scala Class, the folder's attribute is not set correctly: make sure src is marked with the Sources folder attribute.

Second, when creating class files people generally like to follow the conventional directory structure, creating sub-folders such as main/scala or main/java, which is convenient for code management. However, when I did that, compilation later failed with errors. After half a day of googling I found that many people had hit the same problem; the fix was to delete those sub-folders and create the class file directly under src.

Finally we reach the coding step. For convenience, this article uses the example program SparkLR from the Spark distribution (the example programs live under \examples\src\main\scala\org\apache\spark\examples in your Spark extraction directory). Right-click the src folder, select Scala Class, choose Object as the kind, and paste in the example code.
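
If you would rather start with something smaller than SparkLR, the following is a minimal sketch of a self-contained Spark 1.x object you could paste in instead; the object name QuickCheck and the local[*] master setting are illustrative choices of mine, not taken from the Spark examples:

    import org.apache.spark.{SparkConf, SparkContext}

    // Minimal smoke test for the Spark setup: sums 1..100 in parallel.
    object QuickCheck {
      def main(args: Array[String]): Unit = {
        // "local[*]" runs Spark inside the IDE on all available cores,
        // so no cluster is needed for this test.
        val conf = new SparkConf().setAppName("QuickCheck").setMaster("local[*]")
        val sc = new SparkContext(conf)
        val sum = sc.parallelize(1 to 100).reduce(_ + _)
        println(s"Sum of 1..100 = $sum") // expect 5050
        sc.stop()
      }
    }

Running it inside IDEA should print 5050. If you later submit the packaged jar to a cluster, remove the setMaster call and pass the master URL on the command line instead.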

By now, the project looks like this:

Next, configure the output artifact, i.e. where the code you wrote should be compiled and packaged as a jar, and under what name. The steps are as follows:

After clicking OK, the output artifact is configured.
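
For reference, the artifact wizard records the chosen main class in a META-INF/MANIFEST.MF inside the jar. Assuming you selected SparkLR as the main class and the pasted file kept its original package declaration, the manifest would look roughly like this:

    Manifest-Version: 1.0
    Main-Class: org.apache.spark.examples.SparkLR

If you pasted the example without its package line, the Main-Class entry would be just SparkLR.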

Compiling & Packaging

From the menu, choose Build > Build Artifacts... > Build to start compiling; the progress status is shown at the bottom of the window. If no errors are reported, the build succeeded.

Go to the output directory and you will find that the jar file was generated successfully. Special tip: if the jar is to be submitted to a Spark cluster, open it (you can list its contents with jar tf yourfile.jar, or use any archive tool) and check whether it contains a scala folder; if it does, delete that folder, otherwise it may conflict with the Scala version installed on the cluster.

With that, the entire configuration, development, and compilation process is complete. Submitting the job to a cluster is beyond the scope of this article; I may write that up separately when I have time.

Conclusion

Configuring a development environment is one of the most tedious and least rewarding parts of a programmer's work, especially clicking through the many menus of the various IDEs, often without knowing how the options relate to one another. So if this article happens to save you some of that trouble, my effort in writing it up was not in vain. OK, happy coding.
