Ubuntu 14.04 LTS install Spark 1.6.0 (pseudo-distributed)
Software to be downloaded:
1. hadoop-2.6.4.tar.gz, download URL: http://hadoop.apache.org/releases.html
2. scala-2.11.7.tgz, download URL: http://www.scala-lang.org/
3. spark-1.6.0-bin-hadoop2.6.tgz, download URL: http://spark.apache.org/
4. jdk-8u73-linux-x64.tar.gz, download URL: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Enable the Root User
To simplify permissions handling in Linux, I log in to and use the Ubuntu system as the root user. By default, Ubuntu does not enable the root user, so we need to enable it first. I followed this guide to enable the root user: http://jingyan.baidu.com/article/27fa73268144f346f8271f83.html.
1. Open a terminal (Ctrl + Alt + T).
2. Enter sudo gedit /usr/share/lightdm/lightdm.conf.d/50-ubuntu.conf and press Enter. You may be prompted for your password, after which an editor window will open. In the editor, add the line greeter-show-manual-login=true and save.
3. Close the editor, return to the terminal window, enter sudo passwd root, and press Enter. You will be asked to enter a new password twice; a message will confirm that the password was updated successfully.
4. After shutting down and restarting the machine, you can log in to the GUI with the root user name and password.
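After the edit, the 50-ubuntu.conf file would look roughly like this (a sketch; the [SeatDefaults] section name and the user-session value are assumptions based on Ubuntu 14.04's default LightDM configuration, only the last line is the one we added):

```
[SeatDefaults]
user-session=ubuntu
greeter-show-manual-login=true
```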
Install JAVA JDK
1. After logging in as root, cd to the directory where the JDK archive was downloaded, extract it with tar -xf jdk-8u73-linux-x64.tar.gz, and then use the mv command to move the extracted jdk directory to /usr/java.
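The extract-and-move step above might look like the following (a sketch; the download location ~/Downloads and the extracted directory name jdk1.8.0_73 are assumptions, and /usr/java is created if it does not exist yet):

```shell
$ cd ~/Downloads                        # wherever the archive was saved
$ tar -xf jdk-8u73-linux-x64.tar.gz     # extracts to jdk1.8.0_73/
$ mkdir -p /usr/java
$ mv jdk1.8.0_73 /usr/java/
```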
2. Install the vim text editor with the apt-get install vim command, cd to the /etc directory, and use vim profile to edit the file and add the Java environment variables. Open the profile file and append the following lines at the end:
export JAVA_HOME=/usr/java/jdk1.8.0_73
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
Enter source /etc/profile in the terminal to make the environment variables take effect.
3. Test whether the Java configuration succeeded: enter java -version in the terminal and check that the JDK version information appears.
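A successful check would print output along these lines (the build numbers here are illustrative and may differ on your machine):

```shell
$ java -version
java version "1.8.0_73"
Java(TM) SE Runtime Environment (build 1.8.0_73-b02)
Java HotSpot(TM) 64-Bit Server VM (build 25.73-b02, mixed mode)
```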
Install Hadoop
The Hadoop installation mainly follows the official pseudo-distributed installation tutorial; reference URL: http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/SingleCluster.html
1. Install ssh and rsync by using the following two commands:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
2. cd to the download directory of hadoop-2.6.4.tar.gz, extract it using the tar -xf command, and move the extracted folder to the /opt directory using the mv command. The operations for spark and scala are similar, so they are not repeated here.
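Concretely, the step above might look like this (a sketch; the download location ~/Downloads is an assumption):

```shell
$ cd ~/Downloads
$ tar -xf hadoop-2.6.4.tar.gz
$ mv hadoop-2.6.4 /opt/
```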
3. Edit the /etc/profile file and add the Hadoop environment variables. Remember to run source /etc/profile afterwards.
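The Hadoop additions to /etc/profile might look like this (a sketch; the variable names follow common Hadoop conventions and assume the /opt/hadoop-2.6.4 path used above):

```
export HADOOP_HOME=/opt/hadoop-2.6.4
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
```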
4. After adding the Hadoop environment variables, cd to the directory /opt/hadoop-2.6.4/etc/hadoop/, edit the hadoop-env.sh file, and define the following variable:
export JAVA_HOME=/usr/java/jdk1.8.0_73
5. Pseudo-distributed mode also requires modifying the etc/hadoop/core-site.xml file:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
Modify the etc/hadoop/hdfs-site.xml file:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
6. To avoid ssh access restrictions, first run ssh localhost to check whether you can connect without a password. If you cannot, generate a key as follows:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
$ chmod 0600 ~/.ssh/authorized_keys
7. After the above steps are completed, Hadoop's pseudo-distributed setup is done, and you can test whether the installation was successful.
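A quick way to test is the startup sequence from the official single-cluster tutorial, run from the /opt/hadoop-2.6.4 directory (a sketch; process IDs in the jps output will of course differ):

```shell
$ bin/hdfs namenode -format     # format the HDFS filesystem (first run only)
$ sbin/start-dfs.sh             # start the NameNode and DataNode daemons
$ jps                           # should list NameNode, DataNode, and SecondaryNameNode
```

If the three daemons appear in the jps output, you can also open http://localhost:50070/ in a browser to check the NameNode web interface.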