Start and view the cluster status
Step 1: Start the Hadoop cluster. This was explained in detail in the second lecture, so I will not repeat the details here:
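For reference, a minimal sketch of the start command, assuming Hadoop is installed under /usr/local/hadoop (adjust the path to your own setup):

cd /usr/local/hadoop/bin
./start-all.sh    # starts the HDFS and MapReduce/YARN daemons across the cluster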
Running the jps command on the master machine displays the master-side process information; running jps on slave1 and slave2 displays the slave-side processes, as sketched below:
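As a rough guide (the exact daemon names depend on your Hadoop version):

# On the master node:
jps
# Expect entries such as NameNode, SecondaryNameNode and JobTracker/ResourceManager

# On slave1 and slave2:
jps
# Expect entries such as DataNode and TaskTracker/NodeManager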
Step 2: Start the Spark cluster
With the Hadoop cluster started successfully, we can now start the Spark cluster using the "start-all.sh" script in Spark's sbin directory.
The reader must note that it has to be written as "./start-all.sh" to indicate the "start-all.sh" in the current directory, because there is also a "start-all.sh" script in the bin directory where Hadoop is installed!
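A minimal sketch of these commands, assuming Spark is installed under /usr/local/spark (adjust the path to your own setup):

cd /usr/local/spark/sbin
./start-all.sh    # the leading "./" ensures we run Spark's script, not Hadoop's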
Running jps now shows, as expected, two new processes on the master node: "Master" and "Worker"!
On slave1 and slave2, jps shows one new process, "Worker", as sketched below:
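The same check as before:

jps    # on master: Master and Worker join the Hadoop daemons; on slave1/slave2: a new Worker appears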
At this point we can open the Spark cluster's web UI at "http://master:8080", as shown below:
The page shows that we have three Worker nodes, together with detailed information about each of them.
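If you prefer a quick check from the terminal, and assuming curl is available, something like the following should confirm the master UI is up and mentions the workers:

curl -s http://master:8080 | grep -i worker    # prints the lines of the UI page that mention workers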
Next, go to Spark's bin directory and start the "spark-shell" console:
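Assuming the same install path as above:

cd /usr/local/spark/bin
./spark-shell
# Depending on your configuration you may need to point the shell at the
# cluster explicitly, e.g.: ./spark-shell --master spark://master:7077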
Now we have entered the world of the Spark shell. Based on the startup output, we can view the Spark UI in a browser at "http://master:4040", as shown below:
Of course, you can also view other information there, such as the Environment tab:
We can also take a look at the Executors tab:
We can see that the driver for our shell is master:50777.
At this point, our Spark cluster has been set up successfully. Congratulations!