This article builds on the previous article, which installed the standalone version of Hadoop on Ubuntu.
1. Configure core-site.xml
/usr/local/hadoop/etc/hadoop/core-site.xml contains the configuration Hadoop reads when it starts.
Open this file in the editor
sudo gedit /usr/local/hadoop/etc/hadoop/core-site.xml
Add the following between the <configuration> and </configuration> tags in the file:
<property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
</property>
Save and close the editor window.
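Once the remaining setup below is complete, this value can be sanity-checked from the command line. A minimal check, assuming Hadoop is installed under /usr/local/hadoop:

```shell
cd /usr/local/hadoop
# Print the effective value of fs.default.name as Hadoop resolves it;
# it should echo back hdfs://localhost:9000 (Hadoop 2.x may also print
# a deprecation notice, since the newer key name is fs.defaultFS).
bin/hdfs getconf -confKey fs.default.name
```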
2. Configure yarn-site.xml
/usr/local/hadoop/etc/hadoop/yarn-site.xml contains the configuration read at startup by YARN, which runs the MapReduce jobs.
Open this file in the editor
sudo gedit /usr/local/hadoop/etc/hadoop/yarn-site.xml
Add the following between the <configuration> and </configuration> tags in the file:
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
<property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
Save and close the editor window.
3. Create mapred-site.xml
By default there is a mapred-site.xml.template file under the /usr/local/hadoop/etc/hadoop/ folder. Copy it and name the copy mapred-site.xml; this file specifies the framework MapReduce runs on.
First enter the /usr/local/hadoop/etc/hadoop/ directory
cd /usr/local/hadoop/etc/hadoop/
Copy and rename
cp mapred-site.xml.template mapred-site.xml
Open the new file in the editor
sudo gedit mapred-site.xml
Add the following between the <configuration> and </configuration> tags in the file:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
Save and close the editor window.
4. Configure hdfs-site.xml
/usr/local/hadoop/etc/hadoop/hdfs-site.xml configures HDFS on each host in the cluster, specifying which directories on the host are used for NameNode and DataNode storage.
Create the folders first
cd /usr/local/hadoop/
mkdir hdfs
mkdir hdfs/data
mkdir hdfs/name
You can also create the folders under another path with different names, but the paths must match what is configured in hdfs-site.xml.
Open hdfs-site.xml in the editor
sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Add the following between the <configuration> and </configuration> tags in the file:
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/hdfs/name</value>
</property>
<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/hdfs/data</value>
</property>
Save and close the editor window.
5. Format HDFS
Run from the /usr/local/hadoop/ directory:
bin/hdfs namenode -format
This only needs to be executed once. If it is executed again after Hadoop has been in use, all data on HDFS will be erased.
6. Start Hadoop
First enter the /usr/local/hadoop/ directory
cd /usr/local/hadoop/
After the configuration steps above, you can start this single-node cluster.
Execute the startup command:
sbin/start-dfs.sh
When you run this command, answer any yes/no prompt by typing yes and pressing Enter.
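The yes/no prompt comes from SSH host-key verification: the start scripts log in to localhost over SSH. If you are instead asked for a password every time, a common fix is to set up passwordless SSH to localhost. A sketch, assuming OpenSSH and no existing key you want to keep:

```shell
# Generate a passphrase-less RSA key pair (skip if ~/.ssh/id_rsa already exists).
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# Authorize the key for logins to this machine.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
```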
Next, execute:
sbin/start-yarn.sh
After these two commands finish, Hadoop is up and running.
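To confirm the daemons actually came up, the JDK's jps tool lists the running Java processes:

```shell
jps
# After both start scripts succeed, the list should include:
#   NameNode, DataNode, SecondaryNameNode   (from start-dfs.sh)
#   ResourceManager, NodeManager            (from start-yarn.sh)
```

If one of these is missing, check the corresponding log file under /usr/local/hadoop/logs/.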
Open http://localhost:50070/ in a browser to see the HDFS administration page.
Open http://localhost:8088 in a browser to see the YARN resource manager page, which tracks running applications.
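When you are done working with the cluster, it can be shut down with the matching stop scripts, run from /usr/local/hadoop:

```shell
# Stop the YARN daemons, then the HDFS daemons.
sbin/stop-yarn.sh
sbin/stop-dfs.sh
```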
7. WordCount Test
First enter the /usr/local/hadoop/ directory
cd /usr/local/hadoop/
Create an input directory on HDFS
bin/hadoop fs -mkdir -p input
Copy README.txt from the Hadoop directory into the new input directory on HDFS
bin/hadoop fs -copyFromLocal README.txt input
Run WordCount
bin/hadoop jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.7.0-sources.jar org.apache.hadoop.examples.WordCount input output
View the results after the job completes
bin/hadoop fs -cat output/*
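If the cat prints nothing, it helps to first list the job's output directory on HDFS. A sketch, assuming the output directory name used above:

```shell
# A _SUCCESS marker plus one or more part-r-* files indicate the job completed.
bin/hadoop fs -ls output
# Print a single result file with the word counts.
bin/hadoop fs -cat output/part-r-00000
```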
Ubuntu Hadoop 2.7.0 pseudo-distributed installation