Ubuntu 14.04 LTS install Spark 1.6.0 (pseudo-distributed)
Software to be downloaded:
1. hadoop-2.6.4.tar.gz, download URL: http://hadoop.apache.org/releases.html
2. scala-2.11.7.tgz, download URL: http://www.scala-lang.org/
3. spark-1.6.0-bin-hadoop2.6.tgz, download URL: http://spark.apache.org/
4. jdk-8u73-linux-x64.tar.gz, download URL: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Enable the Root User
To simplify permissions handling in Linux, I log in to and use the Ubuntu system as the root user. By default, Ubuntu does not enable the root user, so we need to enable it first. I followed this guide to enable the root user: http://jingyan.baidu.com/article/27fa73268144f346f8271f83.html.
1. Open a terminal (Ctrl + Alt + T).
2. Enter sudo gedit /usr/share/lightdm/lightdm.conf.d/50-ubuntu.conf and press Enter. You may be prompted for your password, after which an editor window will open. In the editor, add the line greeter-show-manual-login=true and save.
3. Close the editor, return to the terminal window, enter sudo passwd root, and press Enter. You will be asked to enter a new password twice; a message will confirm that the password was updated successfully.
4. After shutting down and restarting the machine, you can log in to the GUI with the root user name and password.
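After the edit, the 50-ubuntu.conf file would look roughly like this (a sketch; the [SeatDefaults] section name and the user-session value are assumptions based on Ubuntu 14.04's default LightDM configuration, only the last line is the one we added):

```
[SeatDefaults]
user-session=ubuntu
greeter-show-manual-login=true
```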
Install JAVA JDK
1. After logging in as root, cd to the directory where the JDK archive was downloaded, extract it with tar -xf jdk-8u73-linux-x64.tar.gz, and then use the mv command to move the extracted jdk directory to /usr/java.
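The extract-and-move step above might look like the following (a sketch; the download location ~/Downloads and the extracted directory name jdk1.8.0_73 are assumptions, and /usr/java is created if it does not exist yet):

```shell
$ cd ~/Downloads                        # wherever the archive was saved
$ tar -xf jdk-8u73-linux-x64.tar.gz     # extracts to jdk1.8.0_73/
$ mkdir -p /usr/java
$ mv jdk1.8.0_73 /usr/java/
```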
2. Install the vim text editor with the apt-get install vim command, cd to the /etc directory, and use vim profile to edit the file and add the Java environment variables. Open the profile file and append the following lines at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_73
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Enter source /etc/profile in the terminal to make the environment variables take effect.
3. Test whether the Java configuration succeeded: enter java -version in the terminal and check that the JDK version information appears.
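A successful check would print output along these lines (the build numbers here are illustrative and may differ on your machine):

```shell
$ java -version
java version "1.8.0_73"
Java(TM) SE Runtime Environment (build 1.8.0_73-b02)
Java HotSpot(TM) 64-Bit Server VM (build 25.73-b02, mixed mode)
```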
Install Hadoop
The Hadoop installation mainly follows the official pseudo-distributed installation tutorial; reference URL: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
1. Install ssh and rsync by using the following two commands:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
2. cd to the download directory of hadoop-2.6.4.tar.gz, extract it using the tar -xf command, and move the extracted folder to the /opt directory using the mv command. The operations for spark and scala are similar, so they are not repeated here.
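Concretely, the step above might look like this (a sketch; the download location ~/Downloads is an assumption):

```shell
$ cd ~/Downloads
$ tar -xf hadoop-2.6.4.tar.gz
$ mv hadoop-2.6.4 /opt/
```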
3. Edit the /etc/profile file and add the Hadoop environment variables. Remember to run source /etc/profile afterwards.
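The Hadoop additions to /etc/profile might look like this (a sketch; the variable names follow common Hadoop conventions and assume the /opt/hadoop-2.6.4 path used above):

```
export HADOOP_HOME=/opt/hadoop-2.6.4
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
```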
4. After adding the Hadoop environment variables, cd to the directory /opt/hadoop-2.6.4/etc/hadoop/, edit the hadoop-env.sh file, and define the following variable:
export JAVA_HOME=/usr/java/jdk1.8.0_73
5. Pseudo-distributed mode also requires modifying the etc/hadoop/core-site.xml file:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
Modify the etc/hadoop/hdfs-site.xml file:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
6. To avoid ssh access restrictions, first run ssh localhost to check whether you can connect without a password. If you cannot, generate a key as follows:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
7. After the above steps are completed, Hadoop's pseudo-distributed setup is done, and you can test whether the installation was successful.
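A quick way to test is the startup sequence from the official single-cluster tutorial, run from the /opt/hadoop-2.6.4 directory (a sketch; process IDs in the jps output will of course differ):

```shell
$ bin/hdfs namenode -format     # format the HDFS filesystem (first run only)
$ sbin/start-dfs.sh             # start the NameNode and DataNode daemons
$ jps                           # should list NameNode, DataNode, and SecondaryNameNode
```

If the three daemons appear in the jps output, you can also open http://localhost:50070/ in a browser to check the NameNode web interface.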