Developing Spark programs with the Spark API in IntelliJ IDEA

Source: Internet
Author: User

I spent two days of the Qingming holiday experimenting and ended up with two ways to develop Spark programs in an IDE. This post records both.

The first method is the simpler one. Both methods compile with SBT.

Note: You do not need to install Scala locally; a separate local installation can cause version compatibility problems when compiling the program.


Method 1: The non-SBT approach


Create a Scala project in IntelliJ IDEA.

Choose the Non-SBT option and click "Next".

Name the project and leave the other settings at their defaults.

Click "Finish" to create the project.

Next, modify the project's settings.

Start with the Modules settings.

Create two folders under src and mark both as source folders.

Then modify the Libraries settings.

Add the JAR files needed for Spark development to the project's libraries.

After the JARs are imported, create a package under the project's scala folder.

Create an object.

Write the Spark driver code.

This program processes Sogou log data.
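The author's driver code is not reproduced in the text, so the following is only a minimal sketch of what a Sogou log driver of this kind might look like. The object name SogouLogSketch, the userId helper, the record layout (six tab-separated fields with the user ID in the second field), and the two path arguments are all assumptions made for illustration. It uses the RDD API and assumes Spark 1.3 or later.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical driver: the name, record layout, and arguments are assumptions,
// not the program from the original post.
object SogouLogSketch {

  // Pure helper: returns the user ID of a well-formed record, or None.
  // Assumes six tab-separated fields with the user ID in the second field.
  def userId(line: String): Option[String] = {
    val fields = line.split("\t")
    if (fields.length == 6) Some(fields(1)) else None
  }

  def main(args: Array[String]): Unit = {
    if (args.length < 2) {
      System.err.println("Usage: SogouLogSketch <input path> <output path>")
      sys.exit(1)
    }

    // The master URL is left to spark-submit, so it is not set here.
    val conf = new SparkConf().setAppName("SogouLogSketch")
    val sc = new SparkContext(conf)

    val queriesPerUser = sc.textFile(args(0))
      .flatMap(line => userId(line).toList) // drop malformed lines
      .map(id => (id, 1))
      .reduceByKey(_ + _)                   // number of records per user
      .sortBy(_._2, ascending = false)      // most active users first

    queriesPerUser.saveAsTextFile(args(1))
    sc.stop()
  }
}
```

Keeping the parsing in a small pure function makes it possible to check the field handling without starting a Spark context.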

Next, package the program using Artifacts in Project Structure.

Choose "From modules with dependencies".

Select the main class.

Click "OK".

Change the name to Firstsparkappjar.

Because Scala and Spark are installed on every machine, you can remove the Scala and Spark JAR files from the artifact.

Next, open the Build menu.

Select "Build Artifacts".

Choose Build the first time and Rebuild for later builds of the same project, then wait for compilation to finish.

Now run the program with spark-submit.

The job then runs to completion.



Method 2: The SBT approach


Downloading the development tools


Spark development requires the following development and build tools:

1. A Scala IDE. This article uses IntelliJ IDEA as the example:

https://www.jetbrains.com/idea/download/

2. SBT (Simple Build Tool):

http://www.scala-sbt.org/download.html

After installing SBT, run the sbt command in a Windows command prompt so that it downloads the JAR files it requires.

By default, these files are downloaded into directories under the user directory on drive C (.idea-build, .ivy2, .sbt).

(Note: Make sure the Internet connection is fast enough when running the sbt command; downloading through a proxy is preferable.)


Configuring the development tools


1. IntelliJ IDEA configuration:

(1) Install the Scala plugin: choose Plugins under Configure, select "Install JetBrains plugin", and search for Scala to download it.

(2) Create an SBT-based Scala project.

(3) Set the project name and the Scala and SBT versions.

Note:

    1. It is best to clear the two download options; otherwise the SBT JARs already in the user directory are overwritten, which causes compilation errors.

    2. The SBT and Scala version numbers can be read from the user directory on drive C:

      C:\Users\User\.sbt\boot\scala-2.10.4\org.scala-sbt\sbt\0.13.8

      Here User stands for the Windows user name, the Scala version is 2.10.4, and the SBT version is 0.13.8. You can set both versions correctly in this dialog, or synchronize them afterwards by editing the configuration files: build.sbt for the Scala version and build.properties for the SBT version.

    3. Do not use Chinese characters in the project path; otherwise the program cannot be executed even if it is successfully compiled into a JAR.
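As a rough illustration of note 2, a build.sbt that matches the versions found under .sbt\boot might look like the sketch below. The project name test and the version 0.1-SNAPSHOT are assumptions inferred from the JAR file name used later in this article; adjust them to your own project.

```scala
// build.sbt (sketch): name and version are assumptions inferred from the
// JAR file name in this article, not settings shown by the author.
name := "test"

version := "0.1-SNAPSHOT"

// Must match the Scala version found under .sbt\boot
scalaVersion := "2.10.4"

// The SBT version is not set here. It lives in project/build.properties:
//   sbt.version=0.13.8
```

With these settings, SBT names the packaged JAR after the project name, the Scala binary version, and the project version.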


(4) An SBT-based Scala project is strict about project layout, so the required directory structure must be created. SBT's conventional layout keeps Scala sources under src\main\scala and unmanaged JARs under lib, with build.sbt in the project root and build.properties under project.

(5) Add the Spark library (the Spark JAR) to the project.

Note: The JAR also needs to be copied into the project's lib directory (copy it at the operating system level).
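The article copies the Spark JAR into lib by hand. As a possible alternative that the article does not use, SBT can also resolve Spark as a managed dependency. The fragment below is a sketch that assumes Spark 1.3.0, which is published for Scala 2.10; substitute the Spark version that runs on your cluster.

```scala
// build.sbt fragment (sketch): an alternative to copying the JAR into lib.
// Assumption: Spark 1.3.0; use the version installed on your cluster.
// "provided" marks Spark as supplied by the cluster at run time.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.0" % "provided"
```

The %% operator appends the Scala binary version (here 2.10) to the artifact name, so the dependency stays in step with scalaVersion.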


Preparation is now complete. Next comes developing the Spark program.


Writing code

The example here is a WordCount-like Spark program.
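The author's source code is not reproduced in the text, so the following is a minimal sketch of what a WordCount-like driver might look like, not the original program. The object name WordCountSketch, the tokenize helper, and the input-path argument are assumptions introduced for illustration. It uses the RDD API and assumes Spark 1.3 or later.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical WordCount-style driver; not the program from the original post.
object WordCountSketch {

  // Pure helper: splits a line on whitespace and drops empty tokens.
  def tokenize(line: String): Seq[String] =
    line.split("\\s+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    if (args.length < 1) {
      System.err.println("Usage: WordCountSketch <input path>")
      sys.exit(1)
    }

    // The master URL is left to spark-submit, so it is not set here.
    val conf = new SparkConf().setAppName("WordCountSketch")
    val sc = new SparkContext(conf)

    val counts = sc.textFile(args(0))
      .flatMap(line => tokenize(line)) // one element per word
      .map(word => (word, 1))
      .reduceByKey(_ + _)              // occurrences of each word

    // Print a small sample on the driver instead of collecting everything.
    counts.take(20).foreach { case (word, n) => println(s"$word\t$n") }
    sc.stop()
  }
}
```

To run a class like this one, its name would be passed to the --class option of spark-submit.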

Do not compile the project with IntelliJ IDEA (Chinese characters cause the subsequent compilation to fail).


Compiling and executing

(1) Compile and package with SBT:

In a Windows command prompt, change to the project directory and run sbt to compile and package the program (sbt's standard task for building the JAR is package).

By default, the JAR is written under the project directory, in test\target\scala-2.10.

(2) Upload the JAR to the server and run it:

Use the command:

spark-submit --class test --master yarn test_2.10-0.1-SNAPSHOT.jar 100

For more parameters, see the official Spark documentation.
