Building a Hadoop 2.6.0 + Spark 1.4.0 standalone environment in a virtual machine on Windows 7
To install and configure Hadoop, see my previous article: Build a pseudo-distributed environment of Hadoop2.6.0 under Windows 7 virtual machine.
This article describes how to build a Spark 1.4.0 standalone environment on top of Hadoop 2.6.0.
1. Software preparation
scala-2.11.7.tgz
spark-1.4.0-bin-hadoop2.6.tgz
Both can be downloaded from their official websites.
2. Scala installation and configuration
Extract scala-2.11.7.tgz. I extracted the package to the /home/vm/tools/scala directory and then added the environment variables to ~/.bash_profile.
    # Scala
    export SCALA_HOME=/home/vm/tools/scala
    export PATH=$SCALA_HOME/bin:$PATH
Run source ~/.bash_profile for the changes to take effect.
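If the profile is edited more than once, the same export lines can end up in it several times. A minimal sketch of an idempotent alternative, assuming a POSIX shell; the helper name append_once and the PROFILE variable are made up for this illustration and are not part of Scala or Spark:

```shell
# append_once FILE LINE: add LINE to FILE only if that exact line is not already there.
# (Hypothetical helper written for this example.)
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Assumption: the profile lives at ~/.bash_profile, as in this article.
PROFILE="${PROFILE:-$HOME/.bash_profile}"

# Single quotes keep $SCALA_HOME and $PATH literal, so they are expanded
# when the profile is sourced rather than when this script runs.
append_once "$PROFILE" 'export SCALA_HOME=/home/vm/tools/scala'
append_once "$PROFILE" 'export PATH=$SCALA_HOME/bin:$PATH'
```

Running it a second time leaves the file unchanged. The profile still has to be sourced afterwards.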
Verify that Scala is installed successfully by running scala -version.
To use Scala interactively, run scala, which opens the interactive shell.
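Both checks can also be run from a script. A rough sketch, assuming a POSIX shell; the fallback path is the one used in this article, and scala -e is used to evaluate a single expression instead of opening the interactive shell:

```shell
# Fall back to the install path from this article if SCALA_HOME is not set.
SCALA_HOME="${SCALA_HOME:-/home/vm/tools/scala}"

if command -v scala >/dev/null 2>&1; then
    scala_status="found"
    scala -version              # reports the installed Scala version
    scala -e 'println(1 + 1)'   # evaluates one expression without opening the REPL
else
    scala_status="missing"
    echo "scala is not on PATH; expected it under $SCALA_HOME/bin"
fi
echo "scala launcher: $scala_status"
```

If the launcher is reported as missing, the usual cause is that ~/.bash_profile has not been sourced in the current shell.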
3. Spark installation and configuration
Extract spark-1.4.0-bin-hadoop2.6.tgz to the /home/vm/tools/spark directory, then add the environment variables to ~/.bash_profile.
    # Spark
    export SPARK_HOME=/home/vm/tools/spark
    export PATH=$SPARK_HOME/bin:$PATH
Modify $SPARK_HOME/conf/spark-env.sh
    export SPARK_HOME=/home/vm/tools/spark
    export SCALA_HOME=/home/vm/tools/scala
    export JAVA_HOME=/home/vm/tools/jdk
    export SPARK_MASTER_IP=192.168.62.129
    export SPARK_WORKER_MEMORY=512m
Modify $SPARK_HOME/conf/spark-defaults.conf
    spark.master spark://192.168.62.129:7077
    spark.serializer org.apache.spark.serializer.KryoSerializer
Here 192.168.62.129 is the IP address of the machine I tested on; replace it with the address of your own machine.
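Because the same address appears in both spark-env.sh and spark-defaults.conf, it can be worth confirming what is actually written in the file before starting anything. A small sketch, assuming a POSIX shell with awk; conf_value is a hypothetical helper written for this example, and the fallback path is the one used in this article:

```shell
# conf_value FILE KEY: print the value stored for KEY in a whitespace-separated
# properties file such as spark-defaults.conf. (Hypothetical helper.)
conf_value() {
    awk -v key="$2" '$1 == key { print $2 }' "$1"
}

CONF_DIR="${SPARK_HOME:-/home/vm/tools/spark}/conf"

if [ -f "$CONF_DIR/spark-defaults.conf" ]; then
    echo "spark.master     = $(conf_value "$CONF_DIR/spark-defaults.conf" spark.master)"
    echo "spark.serializer = $(conf_value "$CONF_DIR/spark-defaults.conf" spark.serializer)"
else
    echo "no spark-defaults.conf under $CONF_DIR"
fi
```

The master URL printed here should use the same address as SPARK_MASTER_IP in spark-env.sh.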
Start Spark
    cd /home/vm/tools/spark/sbin
    sh start-all.sh
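Whether the daemons actually came up can be checked with jps, which ships with the JDK and lists running JVMs by main class. A minimal sketch, assuming the JDK's bin directory is on PATH; a standalone master and worker normally show up as Master and Worker:

```shell
if command -v jps >/dev/null 2>&1; then
    # Keep only the lines whose class name is exactly Master or Worker.
    daemons="$(jps | grep -E ' (Master|Worker)$')"
    report="${daemons:-no Spark Master or Worker process found}"
else
    report="jps not found; check that the JDK bin directory is on PATH"
fi
echo "$report"
```

With default settings the master also serves a status page on port 8080, which here would be http://192.168.62.129:8080, assuming the port has not been changed.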
Test whether Spark is successfully installed.
    cd $SPARK_HOME/bin/
    ./run-example SparkPi
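The run prints a long log, and the actual result is a single line beginning with "Pi is roughly". A sketch for showing only that line, assuming the default log4j profile, which writes log messages to stderr while the result goes to stdout; the fallback path is the one used in this article:

```shell
SPARK_HOME="${SPARK_HOME:-/home/vm/tools/spark}"

if [ -x "$SPARK_HOME/bin/run-example" ]; then
    # Discard the log on stderr and keep only the result line from stdout.
    pi_line="$("$SPARK_HOME/bin/run-example" SparkPi 2>/dev/null | grep 'Pi is roughly')"
    echo "${pi_line:-SparkPi produced no result line}"
else
    pi_line=""
    echo "run-example not found under $SPARK_HOME/bin"
fi
```

If no result line appears, rerun without the 2>/dev/null redirection to see the full log.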
SparkPi execution log:
    vm@Ubuntu:~/tools/spark/bin$ ./run-example SparkPi
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    15/07/29 00:02:32 INFO SparkContext: Running Spark version 1.4.0
    15/07/29 00:02:33 WARN NativeCodeLoader: Unable to load native-hadoop library