Prerequisite: Hadoop must already be installed; my version is hadoop-2.3.0-cdh5.1.0.
1. Download the maven package
2. Set the M2_HOME environment variable and add the Maven bin directory to PATH
3. export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512M"
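Steps 2-3 can be sketched as the following shell snippet; the Maven install path `/home/hadoop/apache-maven` is my assumption, since the post doesn't give one.

```shell
# Step 2: point M2_HOME at the Maven install and put its bin dir on PATH.
# The install path is an assumption -- adjust it to your layout.
export M2_HOME=/home/hadoop/apache-maven
export PATH="$M2_HOME/bin:$PATH"
# Step 3: give the Maven JVM enough heap, PermGen and code cache for the Spark build.
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512M"
```

Adding these lines to `~/.bashrc` makes them survive across sessions.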
4. Download the spark-1.0.2.tgz source package from the official website and decompress it
5. Go into the extracted Spark source directory
6. Run ./make-distribution.sh --hadoop 2.3.0-cdh5.1.0 --with-yarn --tgz
7. Wait; the build takes a long time
8. When the build completes, spark-1.0.2-bin-2.3.0-cdh5.1.0.tgz is generated in the current directory
9. Copy the package to the installation directory and decompress it
10. Edit the configuration files under conf:
cp spark-env.sh.template spark-env.sh
vim spark-env.sh
Set the corresponding parameters:
export JAVA_HOME=/home/hadoop/jdk
export HADOOP_HOME=/home/hadoop/hadoop-2.3.0-cdh5.1.0
export HADOOP_CONF_DIR=/home/hadoop/hadoop-2.3.0-cdh5.1.0/etc/hadoop
export SPARK_YARN_APP_NAME=spark-on-yarn
export SPARK_EXECUTOR_INSTANCES=1
export SPARK_EXECUTOR_CORES=2
export SPARK_EXECUTOR_MEMORY=3500m
export SPARK_DRIVER_MEMORY=3500m
export SPARK_MASTER_IP=master
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=3500m
export SPARK_WORKER_INSTANCES=1
11. Configure conf/slaves:
slave01
slave02
slave03
slave04
slave05
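Step 11 can be done in one command (the hostnames are the author's; the Spark install path is my assumption):

```shell
# Write the worker list into conf/slaves, one hostname per line.
cat > /home/hadoop/spark/conf/slaves <<'EOF'
slave01
slave02
slave03
slave04
slave05
EOF
```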
12. Distribute
Copy the Spark installation directory to each slave node
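One way to do the copy in step 12 (the post doesn't say how it was made) is an scp loop over the hosts listed in conf/slaves; the install path is assumed.

```shell
# Push the Spark install directory to every worker listed in conf/slaves,
# skipping comment lines. Assumes passwordless SSH is already set up.
for host in $(grep -v '^#' /home/hadoop/spark/conf/slaves); do
  scp -r /home/hadoop/spark "$host":/home/hadoop/
done
```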
13. Start
sbin/start-all.sh
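After step 13, a quick sanity check is to look for the Spark daemons in the JVM process list; `jps` ships with the JDK, and the install path here is an assumption.

```shell
# Start the cluster, then confirm a Master JVM is running on this node
# (a Worker should likewise show up in jps output on each slave).
cd /home/hadoop/spark
sbin/start-all.sh
jps | grep -E 'Master|Worker'
```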
14. Run an example
$SPARK_HOME/bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client --num-executors 3 --driver-memory 4g --executor-memory 2g --executor-cores 1 /home/hadoop/spark/lib/spark-examples-1.0.2-hadoop2.3.0-cdh5.1.0.jar 100
15. The example fails to run
On the YARN monitoring page, clicking through to the logs shows a pile of these errors:
INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s).
INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s).
INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s).
INFO [main] org.apache.hadoop.ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8030. Already tried 0 time(s).
16. Solving the problem
Copy the Spark core jar from the lib directory of the Spark installation to a local machine and unpack it. Inside there is a yarn-default.xml file; opening it reveals:
<!-- Resource Manager Configs -->
<property>
  <description>The hostname of the RM.</description>
  <name>yarn.resourcemanager.hostname</name>
  <value>0.0.0.0</value>
</property>
So the client looks for the ResourceManager on the local host (0.0.0.0). If the node submitting the job is not the node running YARN's ResourceManager, the RM can never be found.
17. Modify the configuration as follows:
<!-- Resource Manager Configs -->
<property>
  <description>The hostname of the RM.</description>
  <name>yarn.resourcemanager.hostname</name>
  <value>master</value>
</property>
18. Repackage the jar and redistribute Spark to each node
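The repack in steps 17-18 can be done in place with the JDK's jar tool. The exact jar file name below is an assumption; use the spark-core jar actually present in lib/.

```shell
cd /home/hadoop/spark/lib
# Pull yarn-default.xml out of the core jar (jar name is assumed).
jar xf spark-core_2.10-1.0.2.jar yarn-default.xml
# Point yarn.resourcemanager.hostname at the real master host instead of 0.0.0.0.
sed -i 's#<value>0.0.0.0</value>#<value>master</value>#' yarn-default.xml
# Put the edited file back into the jar, then redistribute it to every node.
jar uf spark-core_2.10-1.0.2.jar yarn-default.xml
```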
Spark on CDH5: compilation and installation [spark-1.0.2 + hadoop-2.3.0-cdh5.1.0]