Hadoop YARN installation: standalone pseudo-distributed environment

This article describes how to install Hadoop YARN in a standalone pseudo-distributed environment, following the official Hadoop installation tutorial. It is provided for reference only.

1. The installation environment is as follows:

Operating system: Ubuntu 14.04

Hadoop version: hadoop-2.5.0

Java version: OpenJDK 1.7.0_55

2. Download Hadoop 2.5.0 from http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.5.0/hadoop-2.5.0.tar.gz
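For example, assuming the archive is downloaded to and unpacked in the home directory (a sketch; adjust paths to your own setup):

$ cd ~
$ wget http://mirrors.cnnic.cn/apache/hadoop/common/hadoop-2.5.0/hadoop-2.5.0.tar.gz
$ tar -xzf hadoop-2.5.0.tar.gz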

After extraction, $HADOOP_HOME is /home/baisong/hadoop-2.5.0 (the user name here is baisong). Add the environment variable to the ~/.bashrc file as follows:

export HADOOP_HOME=/home/baisong/hadoop-2.5.0

Then run the following command to make it take effect:

$ source ~/.bashrc

3. Install the JDK and set the JAVA_HOME environment variable. Add the following content at the end of the /etc/profile file:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386    # adjust to your own Java installation directory
export PATH=$JAVA_HOME/bin:$PATH

Enter the following command to make the configuration take effect:

$ source /etc/profile
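To quickly verify the JDK setup (an optional check, not part of the original tutorial):

$ java -version

This should report the OpenJDK 1.7.0_55 installation configured above.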

4. Configure SSH. First generate a key pair: run the following command and press Enter at each prompt (no input is required).

$ ssh-keygen -t rsa

Append the public key to the authorized_keys file. The command is as follows:

$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Finally, run the following commands and type "yes" when prompted:

$ ssh localhost

$ ssh hama    # hama is this machine's hostname; replace it with your own
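If passwordless login still asks for a password, a common fix (not part of the original tutorial) is to tighten the permissions on the SSH files:

$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys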

5. Modify the Hadoop configuration files in the ${HADOOP_HOME}/etc/hadoop/ directory.

1) Set the environment variable: add the Java installation directory to hadoop-env.sh, as follows:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386

2) Modify core-site.xml and add the following content:

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>

<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/baisong/hadooptmp</value>
</property>

Note: The hadoop.tmp.dir property is optional (with the setting above, you must create the hadooptmp folder manually).
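Note that these <property> blocks (here and in the following steps) must be placed inside the <configuration> root element of each file. Assembled from the values above, a minimal core-site.xml would look like this:

<?xml version="1.0"?>
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/baisong/hadooptmp</value>
    </property>
</configuration>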

3) Modify hdfs-site.xml and add the following content:

<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>

4) Rename mapred-site.xml.template to mapred-site.xml and add the following content:

$ mv mapred-site.xml.template mapred-site.xml    # rename

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

5) Modify yarn-site.xml and add the following content:

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>
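To check that the configuration files are being picked up (an optional step, not in the original tutorial), the getconf tool can print a configured value:

$ bin/hdfs getconf -confKey fs.defaultFS

This should print hdfs://localhost:9000, the value set in core-site.xml.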

6. Format HDFS with the following command:

$ bin/hdfs namenode -format    # the older bin/hadoop namenode -format command is deprecated

After formatting succeeds, a dfs folder is created under /home/baisong/hadooptmp.
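As a quick check, listing that directory should now show the new folder:

$ ls /home/baisong/hadooptmp    # should list the dfs folder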

7. Start HDFS with the following command:

$ sbin/start-dfs.sh

The following warnings and errors are reported:

14/10/29 16:49:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [OpenJDK Server VM warning: You have loaded library /home/baisong/hadoop-2.5.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
localhost]
sed: -e expression #1, char 6: unknown option to `s'
VM: ssh: Could not resolve hostname vm: Name or service not known
library: ssh: Could not resolve hostname library: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
which: ssh: Could not resolve hostname which: Name or service not known
might: ssh: Could not resolve hostname might: Name or service not known
warning:: ssh: Could not resolve hostname warning:: Name or service not known
loaded: ssh: Could not resolve hostname loaded: Name or service not known
have: ssh: Could not resolve hostname have: Name or service not known
Server: ssh: Could not resolve hostname server: Name or service not known

Cause: the HADOOP_COMMON_LIB_NATIVE_DIR and HADOOP_OPTS environment variables are not set. Add the following to the ~/.bashrc file and source it:

export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"

$ source ~/.bashrc

Restart HDFS. If the daemons start without the errors above, the startup was successful.

You can view the running status of the NameNode through the web interface at http://localhost:50070.

The command to stop HDFS is:

$ sbin/stop-dfs.sh

8. Start YARN with the following command:

$ sbin/start-yarn.sh

You can view the running status of the ResourceManager through the web interface at http://localhost:8088.

The command to stop YARN is:

$ sbin/stop-yarn.sh

After HDFS and YARN are started, you can run the jps command to check whether all the daemons started successfully.
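In a pseudo-distributed setup, running

$ jps

should list NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, and Jps itself, each preceded by its process ID. If one of these daemons is missing, its log under $HADOOP_HOME/logs usually explains why.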

9. Run the example programs.

1) Run the following command to estimate pi (the two arguments are the number of map tasks and the number of samples per map):

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar pi 20 10

2) To test grep, first upload the input files to HDFS. The command is as follows:

$ bin/hdfs dfs -put etc/hadoop input

Run the grep program with the following command:

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar grep input output 'dfs[a-z.]+'

The job writes its results to the output directory in HDFS.
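They can be viewed with the standard hdfs dfs subcommands (this step is not in the original text), for example:

$ bin/hdfs dfs -cat output/*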

10. Add environment variables so that start-dfs.sh, start-yarn.sh, and other commands can be run directly (optional).

Add the following environment variable to the ~/.bashrc file:

export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

Then run the following command to make it take effect:

$ source ~/.bashrc

The variables added to the ~/.bashrc file are summarized below for reference.
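Collected from the steps above, the additions to ~/.bashrc are:

export HADOOP_HOME=/home/baisong/hadoop-2.5.0
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH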

