This article is a report on installing Hadoop YARN in a standalone pseudo-distributed environment, based on the installation tutorial on the official Hadoop website. It is for your reference only.
1. The installation environment is as follows:
Operating system: Ubuntu 14.04
Hadoop version: hadoop-2.5.0
Java version: OpenJDK 1.7.0_55
2. Download Hadoop-2.5.0, http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.5.0/hadoop-2.5.0.tar.gz
$HADOOP_HOME is /home/baisong/hadoop-2.5.0 (the user name is baisong). Add the environment variable to the ~/.bashrc file as follows:
export HADOOP_HOME=/home/baisong/hadoop-2.5.0
Then run the following command to make it take effect:
$ source ~/.bashrc
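As a quick check (not part of the original steps), the variable can be verified with echo:
$ echo $HADOOP_HOME
/home/baisong/hadoop-2.5.0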
3. Install the JDK and set the JAVA_HOME environment variable. Add the following content at the end of the /etc/profile file:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386    // depends on your own Java installation directory
export PATH=$JAVA_HOME/bin:$PATH
Enter the following command to make the configuration take effect:
$ source /etc/profile
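Similarly, a quick way to confirm the Java setup (assuming the OpenJDK package above) is:
$ echo $JAVA_HOME
/usr/lib/jvm/java-7-openjdk-i386
$ java -version    // should report the 1.7.0 OpenJDK runtime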
4. Configure SSH. First generate a key pair: run the following command and press Enter at each prompt (no input is required).
$ ssh-keygen -t rsa
Append the public key to the authorized_keys file. The command is as follows:
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
Finally, enter the following commands and type yes when prompted.
$ ssh localhost
$ ssh Hama
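If ssh localhost still asks for a password, a common cause is overly permissive file modes on the .ssh directory; tightening them (a general SSH fix, not part of the original steps) usually resolves it:
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys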
5. Modify the Hadoop configuration files in the ${HADOOP_HOME}/etc/hadoop/ directory.
1) Set the environment variable: add the Java installation directory to hadoop-env.sh, as follows:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386
2) Modify core-site.xml and add the following content.
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/baisong/hadooptmp</value>
</property>
Note: the hadoop.tmp.dir property is optional (with the setting above, the hadooptmp folder must be created manually).
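For reference, the property entries in this and the following files all go inside the <configuration> element, so the complete core-site.xml body looks roughly like this (paths as used above):
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/baisong/hadooptmp</value>
    </property>
</configuration>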
3) Modify hdfs-site.xml and add the following content.
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
4) Rename mapred-site.xml.template to mapred-site.xml and add the following content.
$ mv mapred-site.xml.template mapred-site.xml    // rename
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
5) Modify yarn-site.xml and add the following content.
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
6. Format HDFS with the following command:
$ bin/hdfs namenode -format    // the old bin/hadoop namenode -format command is deprecated
After formatting succeeds, a dfs folder is created under /home/baisong/hadooptmp.
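As an optional sanity check (directory layout as implied by the default name-dir under hadoop.tmp.dir), the freshly formatted NameNode metadata can be listed:
$ ls /home/baisong/hadooptmp/dfs/name
current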
7. Start HDFS with the following command:
$ sbin/start-dfs.sh
The following error is reported:
14/10/29 16:49:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [OpenJDK Server VM warning: You have loaded library /home/baisong/hadoop-2.5.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
localhost]
sed: -e expression #1, char 6: unknown option to `s'
VM: ssh: Could not resolve hostname vm: Name or service not known
library: ssh: Could not resolve hostname library: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
which: ssh: Could not resolve hostname which: Name or service not known
might: ssh: Could not resolve hostname might: Name or service not known
warning:: ssh: Could not resolve hostname warning:: Name or service not known
loaded: ssh: Could not resolve hostname loaded: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
Server: ssh: Could not resolve hostname server: Name or service not known
Cause: the HADOOP_COMMON_LIB_NATIVE_DIR and HADOOP_OPTS environment variables are not set. Add the following content to the ~/.bashrc file and source it:
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
$ source ~/.bashrc
Restart HDFS; this time the daemons should start without the errors above, indicating that the startup is successful.
You can view the running status of the NameNode through the Web interface at http://localhost:50070.
The command to stop HDFS is:
$ sbin/stop-dfs.sh
8. Run the following command to start YARN:
$ sbin/start-yarn.sh
You can view the running status of the ResourceManager through the Web interface at http://localhost:8088.
The command to stop YARN is:
$ sbin/stop-yarn.sh
After HDFS and YARN are started, you can run the jps command to check whether the daemons started successfully.
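In this pseudo-distributed setup, jps should typically list one JVM per daemon, along these lines (process IDs vary):
$ jps
xxxx NameNode
xxxx DataNode
xxxx SecondaryNameNode
xxxx ResourceManager
xxxx NodeManager
xxxx Jps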
9. Run the test programs.
1) Run the following command to estimate pi (the two arguments are the number of map tasks and the number of samples per map):
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar pi 20 10
2) To test grep, first upload the input files to HDFS. The command is as follows:
$ bin/hdfs dfs -put etc/hadoop input
Run the grep program with the following command:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar grep input output 'dfs[a-z.]+'
The results are written to the output directory in HDFS.
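They can be printed to the console with the standard HDFS cat command (output path as used in the command above):
$ bin/hdfs dfs -cat output/*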
10. Add environment variables so that start-dfs.sh, start-yarn.sh and other commands can be used directly (optional).
Add the environment variable to the ~/.bashrc file as follows:
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Then run the following command to make it take effect:
$ source ~/.bashrc
The variables added to the ~/.bashrc file are listed below for reference.
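Collected from steps 2, 7 and 10, the additions to ~/.bashrc amount to:
export HADOOP_HOME=/home/baisong/hadoop-2.5.0
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH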