Building a Spark development environment on Ubuntu


  • Ubuntu Basic Environment Configuration

      • Install the JDK: download jdk-8u45-linux-x64.tar.gz and extract it to /opt/jdk1.8.0_45

    Download address: http://www.oracle.com/technetwork/java/javase/downloads/index.html

      • Install Scala: download scala-2.11.6.tgz and extract it to /opt/scala-2.11.6

    Download address: http://www.scala-lang.org/

      • Install Spark: download spark-1.3.1-bin-hadoop2.6.tgz and extract it to /opt/spark-hadoop (see the extraction sketch after this list)

    Download address: http://spark.apache.org/downloads.html
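
The original page gives no commands for this step, so here is a minimal sketch of the extraction in Python, assuming the three archives sit in the current directory and that writing to /opt requires root:

    import tarfile

    # Extract each downloaded archive into /opt (run with rights to write /opt).
    for archive in ("jdk-8u45-linux-x64.tar.gz",
                    "scala-2.11.6.tgz",
                    "spark-1.3.1-bin-hadoop2.6.tgz"):
        with tarfile.open(archive) as tar:
            tar.extractall("/opt")

This yields /opt/jdk1.8.0_45 and /opt/scala-2.11.6 directly; the Spark archive extracts under its full name and is then renamed to /opt/spark-hadoop to match the paths used below.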


      • Configure environment variables: edit /etc/profile by executing the following command

python@ubuntu:~$ sudo gedit /etc/profile

Add the following at the end of the file:

#Setting JDK environment variables
export JAVA_HOME=/opt/jdk1.8.0_45
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:${JRE_HOME}/bin:$PATH

#Setting Scala environment variables
export SCALA_HOME=/opt/scala-2.11.6
export PATH=${SCALA_HOME}/bin:$PATH

#Setting Spark environment variables
export SPARK_HOME=/opt/spark-hadoop/

#PYTHONPATH: add pyspark to the Python module path
export PYTHONPATH=/opt/spark-hadoop/python

      Restart the computer to make the /etc/profile changes take effect permanently. To apply them temporarily instead, open a command window and run source /etc/profile; the changes then take effect in the current window only.
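
As a quick check (an addition, not part of the original article), the variables can be verified from Python after sourcing the profile; the names below are exactly those exported above:

    import os

    # Print each variable set in /etc/profile; "<not set>" means the profile
    # has not been sourced in this session.
    for name in ("JAVA_HOME", "SCALA_HOME", "SPARK_HOME", "PYTHONPATH"):
        print("%s = %s" % (name, os.environ.get(name, "<not set>")))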

  • Test the installation results

      • Open a command window and switch to the Spark root directory (/opt/spark-hadoop)

      • Run ./bin/spark-shell to open a Scala connection window to Spark

If no error message appears during startup and the scala> prompt is shown, the shell has started successfully.

      • Run ./bin/pyspark to open a Python connection window to Spark (a quick sanity check follows this list)

If no error appears during startup and the Python prompt is shown, the startup succeeded.

      • Access the web UI from a browser; while a shell is running, the Spark driver serves a status page (on port 4040 by default)

If the status page appears, Spark is working.
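
As the sanity check mentioned above (an addition, not from the original page): inside the pyspark shell, the SparkContext that the shell predefines as sc can run a tiny job:

    >>> sc.parallelize([1, 2, 3, 4]).sum()   # sc is created by the pyspark shell
    10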

  • Developing Spark applications in Python

      • PYTHONPATH was set earlier, so the pyspark package is already on Python's module path

      • Open the Spark installation directory and confirm that the bundled py4j library is present under the python directory (python/build in this release); pyspark depends on it

      • Open a command-line window and start Python (version 2.7.6 here); note that this version of Spark does not support Python 3

      • Enter import pyspark; if the import succeeds, the development prerequisites are complete

      • Create a new project in PyCharm and run a small test program; a sketch follows below
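
The original article shows its test code only as a screenshot ("the code in the red box"), so the following is an assumed equivalent, not the author's exact program: a minimal word count over the README that ships with Spark, using the install paths configured above and Python 2.7 syntax.

    from pyspark import SparkConf, SparkContext

    # Run locally inside the IDE; the application name is arbitrary.
    conf = SparkConf().setMaster("local").setAppName("PySparkTest")
    sc = SparkContext(conf=conf)

    # Word count over Spark's bundled README (path as installed above).
    lines = sc.textFile("/opt/spark-hadoop/README.md")
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))

    # Print a small sample of the results.
    for word, n in counts.take(10):
        print("%s: %d" % (word, n))

    sc.stop()

If the program prints word counts without errors, the PyCharm project is correctly wired to the Spark installation.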

