"Spark MLlib Quick Start" Basics 01: Building a Spark Development Environment on Windows (Scala Edition)

Source: Internet
Author: User
Tags: pyspark, scala, ide, spark, mllib

Directory:

  • Installing the JDK
  • Installing Scala IDE for Eclipse
  • Configuring Spark
  • Configuring Hadoop
  • Creating a Maven Project
  • Scala Code

Installing the JDK

Install JDK 1.8 or later.
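To confirm the installed JDK is new enough, a small sketch like the following can parse the version string the JVM reports (the parsing logic here is an illustration, not part of the original guide):

```scala
object CheckJava {
  // Returns true for Java 8 or newer, given version strings
  // like "1.8.0_292" (old scheme) or "11.0.2" (new scheme).
  def atLeast8(version: String): Boolean =
    version.split("\\.").toList match {
      case "1" :: minor :: _ => minor.takeWhile(_.isDigit).toInt >= 8
      case major :: _        => major.takeWhile(_.isDigit).toInt >= 8
      case Nil               => false
    }

  def main(args: Array[String]): Unit =
    println(atLeast8(System.getProperty("java.version")))
}
```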


Installing Scala IDE for Eclipse

There is no need to install Scala separately; it comes bundled with the IDE.

Official Download: http://scala-ide.org/download/sdk.html


Configure Spark

Download Spark from the official site; choose a pre-built package whose Hadoop version matches the Hadoop files used in the Hadoop section below (2.7.x here).

Official Download: http://spark.apache.org/downloads.html

  

Configuring environment variables:

Variable name: SPARK_HOME
Variable value: D:\spark (the path must not contain spaces)

Add %SPARK_HOME%\bin to Path.
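Once the variable is set (and the machine restarted), it should be visible to any JVM process. A minimal sketch for checking this, including the no-spaces rule above, could look like this (the helper and its messages are illustrative, not from the original guide):

```scala
object CheckSparkHome {
  // Classifies a possibly missing SPARK_HOME value; paths containing
  // spaces are flagged, since the guide warns against them.
  def status(value: Option[String]): String = value match {
    case Some(p) if p.contains(" ") => s"Warning: SPARK_HOME contains spaces: $p"
    case Some(p)                    => s"SPARK_HOME = $p"
    case None                       => "SPARK_HOME is not set (a restart may be required)"
  }

  def main(args: Array[String]): Unit =
    println(status(sys.env.get("SPARK_HOME")))
}
```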

To also use Spark from Python, install the PySpark package from the command line:

pip install pyspark


Configure Hadoop

There is no need to install a full Hadoop distribution, but a few files such as hadoop.dll and winutils.exe are required. Download the Hadoop 2.7.1 files corresponding to the version of Spark you downloaded.

Link: https://pan.baidu.com/s/1jHRu9oE Password: wdf9
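As an alternative to setting the environment variable globally, Spark can be pointed at the directory containing bin\winutils.exe from code via the hadoop.home.dir system property, set before any Spark code runs. A sketch (the D:\hadoop path is an assumed example, not from the original guide):

```scala
object HadoopHomeSetup {
  // Sets hadoop.home.dir so Hadoop code can locate bin\winutils.exe
  // even when no HADOOP_HOME environment variable is defined.
  def configure(path: String): String = {
    System.setProperty("hadoop.home.dir", path)
    System.getProperty("hadoop.home.dir")
  }

  def main(args: Array[String]): Unit =
    println(configure("D:\\hadoop")) // assumed install path
}
```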

  

Configuring environment variables:

Variable name: HADOOP_HOME
Variable value: the directory that contains bin\winutils.exe (e.g. D:\hadoop)

Add %HADOOP_HOME%\bin to Path.

Restart the computer! The environment variables only take effect after a restart.


Create a Maven Project

Creating a Maven project is a quick way to pull in the jar packages your project needs; the important configuration is in the pom.xml file. A ready-made Maven project is available here:

Link: https://pan.baidu.com/s/1hsLAcWc Password: NFTA

To import the Maven project, copy the provided project into your workspace and import it in the IDE.

After importing, some jar packages will be downloaded automatically; wait a few minutes until the download completes.

If the build reports errors, change the Scala version of the project's dependencies in pom.xml to match the IDE's Scala version:
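For reference, the Spark dependency in pom.xml looks roughly like this. This is a sketch: the _2.11 artifact suffix must match your IDE's Scala version, and the version number shown is an assumed example, not necessarily the one in the provided project:

```xml
<!-- Sketch: Spark core built for Scala 2.11; 2.1.0 is an assumed example version. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.1.0</version>
</dependency>
```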

Then run the WordCount.scala program.


Scala Code

package com.itmorn.ml

import org.apache.spark.{SparkContext, SparkConf}

object WordCount {
  def main(args: Array[String]) {
    val conf = new SparkConf().setMaster("local").setAppName("WordCount") // create the Spark configuration
    val sc = new SparkContext(conf)                                       // create the Spark context
    val data = sc.textFile("data/wc.txt")                                 // read the input file
    data.flatMap(_.split(" "))                                            // split lines into words
        .map((_, 1))                                                      // pair each word with a count of 1
        .reduceByKey(_ + _)                                               // sum the counts per word
        .collect()
        .foreach(println)                                                 // print the word counts
  }
}
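The pipeline above can be sanity-checked without a Spark installation using plain Scala collections. In this sketch, groupBy plus a sum stands in for Spark's reduceByKey; the object name and sample input are illustrative:

```scala
object WordCountLocal {
  // Pure-Scala equivalent of the Spark word-count pipeline,
  // useful for checking the logic before running on Spark.
  def count(lines: Seq[String]): Map[String, Int] =
    lines.flatMap(_.split(" "))
         .map((_, 1))
         .groupBy(_._1)
         .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit =
    count(Seq("hello spark hello")).foreach(println)
}
```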


