To install Java, first download the JDK from Oracle's website.
Create a jvm directory under /usr/lib:
sudo mkdir /usr/lib/jvm
Then extract the archive into that directory:
sudo tar zxvf jdk-8u40-linux-i586.tar.gz -C /usr/lib/jvm
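The -C flag tells tar to change into the given directory before unpacking. A small self-contained sketch of the effect, using throwaway paths under /tmp rather than the real JDK archive:

```shell
# Demo of tar's -C flag with scratch files (not the actual JDK archive)
mkdir -p /tmp/tar-demo/src /tmp/tar-demo/dest
echo hello > /tmp/tar-demo/src/file.txt
# Pack file.txt, reading it relative to the src directory
tar -czf /tmp/tar-demo/a.tgz -C /tmp/tar-demo/src file.txt
# Unpack into dest instead of the current working directory
tar -xzf /tmp/tar-demo/a.tgz -C /tmp/tar-demo/dest
cat /tmp/tar-demo/dest/file.txt   # prints "hello"
```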
Change into the extraction directory:
cd /usr/lib/jvm
Then rename the extracted directory for convenience (lowercase, so it matches the JAVA_HOME path used below):
sudo mv jdk1.8.0_40 java
Open the shell configuration file:
sudo gedit ~/.bashrc
Append the following settings:
export JAVA_HOME=/usr/lib/jvm/java
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH
Apply the changes:
source ~/.bashrc
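The PATH line prepends the JDK's bin directory, so its binaries shadow any older java already on the system. The lookup order can be seen with a throwaway directory (a sketch only, not part of the installation):

```shell
# Illustrate PATH precedence with a scratch directory (not the real JDK path)
DEMO_DIR=/tmp/path-demo
mkdir -p "$DEMO_DIR"
PATH="$DEMO_DIR:$PATH"
# Anything placed in DEMO_DIR now wins the command lookup
printf '#!/bin/sh\necho demo-java\n' > "$DEMO_DIR/demojava"
chmod +x "$DEMO_DIR/demojava"
demojava   # prints "demo-java"
```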
The Java installation is complete; running java -version should now report JDK 1.8.0_40.
To install Spark, download a pre-built package from the Apache Spark download page (here, spark-1.2.0-bin-hadoop2.4.tgz).
Extract
tar -xzf spark-1.2.0-bin-hadoop2.4.tgz
At this point, Spark's Python shell is in fact already usable: just cd into the bin directory and run pyspark. Next, though, we will move the files to a better location and set up a link.
Link
First, move the extracted files to a more permanent location:
sudo mv spark-1.2.0-bin-hadoop2.4 /srv/
Then create a symbolic link:
sudo ln -s /srv/spark-1.2.0-bin-hadoop2.4/ /srv/spark
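The version-independent /srv/spark link means SPARK_HOME never has to change when Spark is upgraded: only the link is repointed. A sketch with scratch directories under /tmp (the 9.9.9 version is made up for the demo):

```shell
# Sketch: upgrading by repointing a version-independent symlink (scratch paths)
mkdir -p /tmp/srv-demo/spark-1.2.0-bin-hadoop2.4
ln -sfn /tmp/srv-demo/spark-1.2.0-bin-hadoop2.4 /tmp/srv-demo/spark
readlink /tmp/srv-demo/spark
# After a hypothetical upgrade, only the link changes; SPARK_HOME stays the same
mkdir -p /tmp/srv-demo/spark-9.9.9-bin-demo
ln -sfn /tmp/srv-demo/spark-9.9.9-bin-demo /tmp/srv-demo/spark
readlink /tmp/srv-demo/spark
```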
Then edit ~/.bash_profile and add:
export SPARK_HOME=/srv/spark
export PATH=$SPARK_HOME/bin:$PATH
The Spark installation is complete.
You can now run pyspark directly from the command line.
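A quick smoke test inside the pyspark shell is to sum a parallelized range: sc.parallelize(range(100)).sum() should return 4950 (sc is the SparkContext the shell creates for you). The expected value can be checked with plain Python; the Spark call itself is an assumption about your interactive session and is not run here:

```python
# Reference value for the pyspark smoke test:
# inside pyspark:  sc.parallelize(range(100)).sum()  -> should equal this
expected = sum(range(100))
print(expected)  # 4950
```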
From Wiz notes.
Easy installation of Spark under Ubuntu 14.04