The installation and configuration of Spark itself is covered in the Spark quick-start guide (installation and basic use), which also introduces submitting applications with spark-submit. Developing Spark applications in Vim, however, is awkward; an IDE is much handier. This article describes how to set up a Spark development environment with IntelliJ IDEA.
1. Installing IntelliJ IDEA
Because Spark is installed in an Ubuntu environment, IDEA is installed on Ubuntu here as well. First download it from the official website. After downloading, extract it to the directory where it is to be installed:
sudo tar -zxvf ideaIU-2016.1.tar.gz -C /usr/local/
I unpacked it into the /usr/local directory, then renamed the folder:
sudo mv ideaIU-2016.1 idea
Then change the owner and group of the files:
sudo chown -R hadoop:hadoop idea
Here hadoop is my username and group name. With that, IDEA is installed.
To start IDEA, go into the idea directory and run the idea.sh inside bin:
bin/idea.sh
This starts IDEA. It is inconvenient, though; instead you can create a new file idea.desktop on the desktop with the following content:
[Desktop Entry]
Name=IdeaIU
Comment=rayn-idea-iu
Exec=/usr/local/idea/bin/idea.sh
Icon=/usr/local/idea/bin/idea.png
Terminal=false
Type=Application
Categories=Development;
This creates a desktop shortcut.
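The step above can also be scripted. A minimal sketch that writes the same desktop entry and marks it executable (it assumes the /usr/local/idea paths used earlier, and writes to the current directory rather than the desktop):

```shell
# Write the desktop entry shown above (paths assume IDEA is in /usr/local/idea)
cat > idea.desktop <<'EOF'
[Desktop Entry]
Name=IdeaIU
Comment=rayn-idea-iu
Exec=/usr/local/idea/bin/idea.sh
Icon=/usr/local/idea/bin/idea.png
Terminal=false
Type=Application
Categories=Development;
EOF
# Desktop files must be executable to be trusted by most desktops
chmod +x idea.desktop
# Confirm the launcher type field is present
grep -c '^Type=Application$' idea.desktop   # prints 1
```

Move the resulting file to your desktop (or to ~/.local/share/applications/) to get the shortcut.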
2. Installing and configuring Maven
Maven is a project-management and build-automation tool. Every programmer has had the experience of adding jar packages to a project just to use one feature; the more frameworks you use, the more jars you must add. Maven can add the jar packages we need automatically. First download Maven from the Maven website:
After downloading, the following file is in the downloads directory:
liu@binja:~/downloads$ ls
apache-maven-3.3.9-bin.tar.gz
Extract it to the directory where it is to be installed:
liu@binja:~/downloads$ sudo tar -zxvf apache-maven-3.3.9-bin.tar.gz -C /usr/local/
Similarly, rename the folder and change the owner:
liu@binja:/usr/local$ sudo mv apache-maven-3.3.9 maven
liu@binja:/usr/local$ sudo chown -R liu:liu maven
liu@binja:/usr/local$ ll maven
drwxr-xr-x 6 liu  liu   4096 Mar 28 20:24 ./
drwxr-xr-x   root root  4096 Mar 28 20:26 ../
drwxr-xr-x 2 liu  liu   4096 Mar 28 20:24 bin/
drwxr-xr-x 2 liu  liu   4096 Mar 28 20:24 boot/
drwxr-xr-x 3 liu  liu   4096 Nov 11 00:38 conf/
drwxr-xr-x 3 liu  liu   4096 Mar 28 20:24 lib/
-rw-r--r-- 1 liu  liu  19335 Nov 11 00:44 LICENSE
-rw-r--r-- 1 liu  liu    182 Nov 11 00:44 NOTICE
-rw-r--r-- 1 liu  liu   2541 Nov 11 00:38 README.txt
Then add Maven to the environment variables by editing ~/.bashrc (no sudo needed for your own file):
vim ~/.bashrc
Add the following at the end:
export PATH=$PATH:/usr/local/maven/bin
To make the change take effect:
liu@binja:/usr/local$ source ~/.bashrc
With that, Maven is installed.
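The PATH change above can be checked directly in the shell. A small sketch, assuming the /usr/local/maven directory used here:

```shell
# Append Maven's bin directory to PATH (assumes Maven was unpacked to /usr/local/maven)
export PATH="$PATH:/usr/local/maven/bin"
# Verify the entry is now present in PATH
case ":$PATH:" in
  *":/usr/local/maven/bin:"*) echo "maven on PATH" ;;
  *) echo "maven missing from PATH" ;;
esac
```

If Maven is actually installed there, `mvn -version` should then print the Maven and JDK versions.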
3. Configuring IDEA with the newly installed Maven
IDEA ships with a bundled Maven; here we configure it to use the Maven we just installed.
Select File -> Settings -> Build, Execution, Deployment -> Build Tools -> Maven in turn, as shown in the following figure:
In "Maven home directory" on the right, set the Maven installation directory; mine is /usr/local/maven. In "User settings file", set the Maven configuration file; I use the default file here. In "Local repository", set the local package repository: tick "Override" on the right to choose your own repository directory, where Maven will later store the packages it downloads automatically.
Click OK and Maven is configured. You can then create a Maven project.
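The same local-repository choice can also be recorded in Maven's own settings file, outside IDEA. A hypothetical ~/.m2/settings.xml fragment (the repository path is an example, not from the original):

```xml
<settings>
  <!-- Example only: point Maven's local package cache at a custom directory -->
  <localRepository>/home/liu/.m2/repository</localRepository>
</settings>
```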
4. Creating a Maven project
Select File -> New -> Project in turn, and the following interface appears:
On the left you can select the project type; choose Maven here. On the right, choose whether to use a template: tick "Create from archetype" at the top and select a project template below; here choose the Scala template.
After clicking Next, fill in the GroupId and ArtifactId; the names can be anything:
Then keep clicking Next, fill in the project name, and finish.
The new project is now created; its file structure is as follows:
pom.xml is where our project's dependencies are configured. src is the directory holding the project code; it contains two parallel directories, main and test. We write code under main and test code under test. Since we do no testing here, the test directory can be deleted. The right side shows the contents of the pom.xml file:
Tick "Enable Auto-Import" in the upper-right corner so IDEA automatically downloads the dependencies the project requires. Also note the Scala version in the middle and choose your own version.
You can add the project's dependencies under the dependencies tag, as in the following illustration:
Each dependency sits inside a dependency tag, which includes a groupId, artifactId, and version. If you don't know these values for a package, you can look it up in a Maven repository search; the query results carry this information. For example, searching for the Spark dependency gives results like the following:
Select the dependency you want to add, then pick the appropriate version number after entering; the page shows the snippet Maven needs, as well as snippets for other package-management tools, such as SBT.
These can be copied into the pom.xml file.
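For example, the Spark core dependency looks like the following in pom.xml. The version 1.6.1 and the Scala 2.10 suffix are examples matching that era of Spark, and the provided scope is my choice, not from the original; match them to your own Spark and Scala versions:

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.6.1</version>
    <!-- provided: the Spark jars are supplied by the cluster at spark-submit time,
         so they need not be bundled into our own jar -->
    <scope>provided</scope>
  </dependency>
</dependencies>
```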
Maven automatically downloads the dependencies added in pom.xml, saving us the trouble of adding them ourselves.
Then you can write code. Under src/main/scala/com/liu create a new Scala class, select kind Object, fill in the class name, and you can start coding. As an example, here is a word count:
package com.liu

/**
  * Created by hadoop on 16-3-28.
  */
import org.apache.spark.{SparkConf, SparkContext}

object Test {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
    val sc = new SparkContext(conf)
    val text = sc.textFile("file:///usr/local/spark/README.md")
    // Split each line on spaces, count each word, and collect the results
    val result = text.flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect()
    result.foreach(println)
  }
}
This article does not explain the code in detail. Once the code is written, you need to generate a jar package and submit it to Spark.
The following steps generate the jar package. Select File -> Project Structure -> Artifacts in turn, as shown in the following figure:
Click the green plus sign in the middle and select JAR -> From modules with dependencies, as shown below:
Select the project's main class in Main Class, then click OK. The result is as follows:
The Output Layout in the middle lists all the dependencies. We are going to submit to Spark, so the Spark and Hadoop dependencies are not needed here; delete them to save space, but do not delete the final compile output, or the result will no longer be a jar package. Click OK to finish the configuration.
Select Build -> Build Artifacts -> Build to generate the jar package; the result is shown in the following figure:
There is an out folder in the image above, with a jar package beneath it, which shows the build succeeded.
5. Submitting the Spark application
After the jar package is generated, you can use spark-submit to submit the application with the following command:
spark-submit --class "com.liu.Test" ~/sparkdemo.jar
This submits the application. The result is as follows:
It shows the run succeeded, listing the count statistics for the words.
At this point, the Spark development environment in IDEA has been set up successfully.