Hadoop 3 Pseudo-Distributed Mode Installation


I recently noticed that Hadoop 3 was released at the end of last year, so today I set up an environment to try it out.

Installation and Configuration

First, download the installation package from http://hadoop.apache.org/releases.html

Here I downloaded the hadoop-3.0.0.tar.gz package and unpacked it:

$ tar zxvf hadoop-3.0.0.tar.gz
$ cd hadoop-3.0.0/

Edit the etc/hadoop/hadoop-env.sh file and set the JAVA_HOME environment variable:

export JAVA_HOME=/opt/jdk8

Modify the configuration file core-site.xml (hdfs://localhost:9000 is the standard single-node value from the Hadoop setup documentation; the original value was truncated here):

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

Modify the configuration file hdfs-site.xml. Because this is pseudo-distributed mode, set the replication factor to 1:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Run HDFS

Format HDFS

The first time you start HDFS, you need to format it.

$ bin/hdfs namenode -format
Start HDFS
$ sbin/start-dfs.sh

After HDFS starts, you can check its status from a browser at http://localhost:9870/

Run a MapReduce Job

First, create the current user's home directory in HDFS:

$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/<username>
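The home directory matters because relative HDFS paths, such as the input directory used below, resolve against /user/<username>. A trivial sketch of that resolution (the username alice is hypothetical):

```shell
# Hypothetical illustration: a relative HDFS path like "input"
# resolves against the user's HDFS home directory /user/<username>.
user=alice   # hypothetical username
rel=input    # relative path passed to hdfs dfs
echo "/user/${user}/${rel}"
```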

Prepare the data, run the test, and view the results:

$ bin/hdfs dfs -mkdir input
$ bin/hdfs dfs -put etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar grep input output 'dfs[a-z.]+'
$ bin/hdfs dfs -cat output/*
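To see what the example job's regex extracts without a running cluster, the same pattern can be tried locally with plain grep over a sample file (the /tmp path and file contents here are made up for illustration):

```shell
# Reproduce the example's pattern matching locally with plain grep.
# The MapReduce grep example counts matches of 'dfs[a-z.]+' across the
# input files; this sketch does the same on one hand-written sample file.
mkdir -p /tmp/grep-demo
cat > /tmp/grep-demo/sample.xml <<'EOF'
<name>dfs.replication</name>
<name>fs.defaultFS</name>
EOF
# -o prints each match on its own line; uniq -c counts occurrences,
# which is roughly what the example job's output contains.
grep -ohE 'dfs[a-z.]+' /tmp/grep-demo/*.xml | sort | uniq -c
```

Only dfs.replication matches here; fs.defaultFS contains no literal "dfs" substring, which is why the real job's output lists property names such as dfs.replication with their counts.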

Delete the test results above:

$ bin/hdfs dfs -rm output/*
$ bin/hdfs dfs -rmdir output

Stop HDFS

$ sbin/stop-dfs.sh
Run YARN

Modify the etc/hadoop/mapred-site.xml file:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>yarn.app.mapreduce.am.env</name>
        <value>HADOOP_MAPRED_HOME=/apps/hadoop-3.0.0</value>
    </property>
    <property>
        <name>mapreduce.map.env</name>
        <value>HADOOP_MAPRED_HOME=/apps/hadoop-3.0.0</value>
    </property>
    <property>
        <name>mapreduce.reduce.env</name>
        <value>HADOOP_MAPRED_HOME=/apps/hadoop-3.0.0</value>
    </property>
</configuration>
Modify the etc/hadoop/yarn-site.xml file:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
Start YARN
$ sbin/start-yarn.sh

After startup, you can view job status at http://192.168.0.192:8088/cluster. Run a MapReduce job with the following commands:

$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.0.0.jar grep input output 'dfs[a-z.]+'
$ bin/hdfs dfs -cat output/*

Stop YARN

$ sbin/stop-yarn.sh
Problems

While testing YARN, jobs at first kept failing with errors similar to the following:

[2018-01-30 22:40:02.211] Container [pid=22658,containerid=container_1517369701504_0003_01_000028] is running beyond virtual memory limits. Current usage:87.9 MB of 1 GB physical memory used; 2.6 GB of 2.1 GB virtual memory used. Killing container.

It turned out the machine did not have enough memory, so the default YARN memory settings were unreasonable for my machine. I modified the etc/hadoop/yarn-site.xml file, added the following two configuration items, and restarted YARN:

<property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
    <description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
    <description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
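The arithmetic behind the error message: the default yarn.nodemanager.vmem-pmem-ratio is 2.1, so a container allocated 1 GB of physical memory is capped at 2.1 GB of virtual memory; the job's 2.6 GB of virtual usage exceeded that cap, so YARN killed the container. Raising the ratio to 4 lifts the cap to 4 GB (or, as above, the vmem check can be disabled entirely). A quick check of the numbers:

```shell
# Verify the container-kill arithmetic from the log message:
# physical allocation 1 GB, default vmem-pmem ratio 2.1, observed virtual usage 2.6 GB.
awk 'BEGIN {
  pmem = 1.0; ratio = 2.1; used = 2.6
  limit = pmem * ratio
  printf "vmem limit = %.1f GB; exceeded = %s\n", limit, (used > limit) ? "yes" : "no"
  # With the ratio raised to 4, the same usage fits under the cap:
  printf "new limit  = %.1f GB; exceeded = %s\n", pmem * 4, (used > pmem * 4) ? "yes" : "no"
}'
```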
Reference: https://stackoverflow.com/questions/21005643/container-is-running-beyond-memory-limits
