Installing Hadoop 2.4.0 on Ubuntu 14.04 (single-node mode): configuration
First, configure core-site.xml
The file /usr/local/hadoop/etc/hadoop/core-site.xml holds the configuration that Hadoop reads when it starts.
Open this file in an editor:
sudo gedit /usr/local/hadoop/etc/hadoop/core-site.xml
Add the following between the <configuration></configuration> tags of the file:
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
Save and close the editor.
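For reference, the complete core-site.xml after this change (assuming an otherwise default Hadoop 2.4.0 file) should look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <!-- URI of the default file system; the NameNode listens on port 9000 -->
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```

Note that fs.default.name is deprecated in Hadoop 2.x in favor of fs.defaultFS, but the old name still works in 2.4.0.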
Second, configure yarn-site.xml
The file /usr/local/hadoop/etc/hadoop/yarn-site.xml holds the configuration for YARN, which runs MapReduce jobs.
Open this file in an editor:
sudo gedit /usr/local/hadoop/etc/hadoop/yarn-site.xml
Add the following between the <configuration></configuration> tags of the file:
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
Save and close the editor.
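For reference, the complete yarn-site.xml after this change (assuming an otherwise default file) should look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <!-- Auxiliary service the NodeManager runs for MapReduce shuffle -->
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <!-- Class implementing the shuffle service -->
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
</configuration>
```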
Third, create and configure mapred-site.xml
By default there is a mapred-site.xml.template file under the /usr/local/hadoop/etc/hadoop/ folder. Copy it and name the copy mapred-site.xml; this file specifies the framework MapReduce runs on.
Copy and rename it:
cp mapred-site.xml.template mapred-site.xml
Open the new file in an editor:
sudo gedit mapred-site.xml
Add the following between the <configuration></configuration> tags of the file:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Save and close the editor.
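For reference, the complete mapred-site.xml after this change (assuming an otherwise empty template) should look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <!-- Run MapReduce jobs on YARN rather than the classic framework -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```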
Fourth, configure hdfs-site.xml
/usr/local/hadoop/etc/hadoop/hdfs-site.xml configures each host in the cluster, specifying which directories on the host the NameNode and the DataNode use.
Create the folders that will hold the NameNode and DataNode data. You can also create them under another path with different names, as long as they match the paths configured in hdfs-site.xml.
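The two directories used later in this tutorial can be created like this (a sketch; /usr/local/hadoop is this tutorial's install path, and the HADOOP_DIR variable is just a convenience introduced here):

```shell
# Create the directories that hdfs-site.xml will point the NameNode and
# DataNode at. HADOOP_DIR defaults to this tutorial's install path.
HADOOP_DIR="${HADOOP_DIR:-/usr/local/hadoop}"
mkdir -p "$HADOOP_DIR/hdfs/name" "$HADOOP_DIR/hdfs/data"
```

If Hadoop runs as a non-root user, also make sure that user owns the hdfs directory (for example with chown).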
Open hdfs-site.xml in an editor:
sudo gedit /usr/local/hadoop/etc/hadoop/hdfs-site.xml
Add the following between the <configuration></configuration> tags of the file:
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/hdfs/data</value>
</property>
Save and close the editor.
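For reference, the complete hdfs-site.xml after this change (assuming an otherwise default file) should look like:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <!-- Only one node, so keep a single copy of each block -->
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <!-- Where the NameNode stores its metadata -->
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/hdfs/name</value>
    </property>
    <property>
        <!-- Where the DataNode stores block data -->
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/hdfs/data</value>
    </property>
</configuration>
```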
Fifth, format HDFS
bin/hdfs namenode -format
This only needs to be run once. If it is run again after Hadoop has been in use, all data on HDFS will be erased.
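Because re-formatting wipes HDFS, a small check before formatting can help; this is just a sketch, with NAME_DIR taken from the dfs.namenode.name.dir value configured above:

```shell
# Check whether the NameNode directory already holds data before formatting;
# the path matches dfs.namenode.name.dir in hdfs-site.xml.
NAME_DIR="/usr/local/hadoop/hdfs/name"
if [ -n "$(ls -A "$NAME_DIR" 2>/dev/null)" ]; then
    SAFE_TO_FORMAT=no
    echo "$NAME_DIR already contains data: do not run the format command again"
else
    SAFE_TO_FORMAT=yes
    echo "$NAME_DIR is empty: formatting is safe"
fi
```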
Sixth, start Hadoop
After completing the configuration above, you can start this single-node cluster.
Execute the startup command:
sbin/start-dfs.sh
If you see a yes/no prompt while this command runs, type yes and press Enter.
Next, execute:
sbin/start-yarn.sh
After these two commands finish, Hadoop is up and running.
Run the jps command; you should see the Hadoop-related Java processes, such as NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager.
Open http://localhost:50070/ in a browser to see the HDFS administration page.
Open http://localhost:8088/ in a browser to see the YARN resource manager page, which tracks running applications.
Seventh, WordCount validation
Create an input directory on HDFS:
bin/hadoop fs -mkdir -p input
Copy README.txt from the Hadoop directory into the new input directory on HDFS:
bin/hadoop fs -copyFromLocal README.txt input
Run WordCount:
bin/hadoop jar share/hadoop/mapreduce/sources/hadoop-mapreduce-examples-2.4.0-sources.jar org.apache.hadoop.examples.WordCount input output
You can watch the job run in the terminal.
When it finishes, view the word-count results:
bin/hadoop fs -cat output/*