First, modify the core-site.xml file on the master node.
Change the "localhost" hostname in the file to "master".
In the same way, open core-site.xml on the slave1 and slave2 nodes and change the "localhost" hostname to "master".
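For reference, the relevant part of core-site.xml after the change should look roughly like the sketch below; the port 9000 is the value commonly used in this kind of Hadoop 1.x setup and may differ in your files:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- previously hdfs://localhost:9000 -->
    <value>hdfs://master:9000</value>
  </property>
</configuration>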
Second, modify the mapred-site.xml file on master, slave1, and slave2.
Open the mapred-site.xml file on the master node, change the "localhost" hostname to "master", then save and exit.
Similarly, open mapred-site.xml on the slave1 and slave2 nodes, change the "localhost" hostname to "master", then save and exit.
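The corresponding property in mapred-site.xml should then look roughly like this; the port 9001 is a typical value and may differ in your files:

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <!-- previously localhost:9001 -->
    <value>master:9001</value>
  </property>
</configuration>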
Finally, modify the hdfs-site.xml file on master, slave1, and slave2:
Change the value of "dfs.replication" on all three machines from 1 to 3, so that the data will be kept in three replicas:
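The relevant property in hdfs-site.xml then reads as follows:

<configuration>
  <property>
    <name>dfs.replication</name>
    <!-- raised from 1 to 3 so that every block is stored on all three nodes -->
    <value>3</value>
  </property>
</configuration>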
Save and exit.
Step 4: Modify the masters and slaves files in the Hadoop configuration directory (first on the master, then copy them to the slaves).
First, modify the masters file on the master node: open the file and change "localhost" to "master".
Save and exit.
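Assuming the Hadoop installation lives under ~/hadoop (adjust the path to your own environment), the edit looks like this:

vim ~/hadoop/conf/masters
# content of masters after the change (a single line):
master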
Then modify the slaves file on the master node.
Open the file:
Replace "localhost" with the hostnames of all the nodes that will process data, i.e. master, slave1, and slave2.
Save and exit.
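Again assuming the installation directory is ~/hadoop, the slaves file ends up listing all three nodes, one hostname per line:

vim ~/hadoop/conf/slaves
# content of slaves after the change:
master
slave1
slave2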
From the configuration above you can see that the master node serves both as the master and as a data processing node; this is because we want three replicas of the data but only have a limited number of machines.
Copy the masters and slaves files configured on the master into the conf folder under the Hadoop installation directory on slave1 and slave2 respectively:
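A minimal sketch of the copy, assuming the same ~/hadoop installation path on every node (adjust the path and, if needed, the user name to your environment):

scp ~/hadoop/conf/masters ~/hadoop/conf/slaves slave1:~/hadoop/conf/
scp ~/hadoop/conf/masters ~/hadoop/conf/slaves slave2:~/hadoop/conf/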
Go to the slave1 or slave2 node to check the content of the masters and slaves files:
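For example, on slave1 (again assuming the ~/hadoop path):

cat ~/hadoop/conf/masters   # should print: master
cat ~/hadoop/conf/slaves    # should print master, slave1 and slave2, one per line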
You will find that the files have been copied over correctly.
Now the Hadoop cluster environment has been configured!