Apache Hadoop Introductory Tutorial, Chapter 4

Source: Internet
Author: User

Running YARN on a single node

You can run a MapReduce job on YARN in pseudo-distributed mode by setting a few parameters and running the ResourceManager and NodeManager daemons.

The steps are as follows.

(1) Configure the parameters:

etc/hadoop/mapred-site.xml:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
etc/hadoop/yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
(2) Start the ResourceManager and NodeManager daemons:

$ sbin/start-yarn.sh
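
One way to confirm that both daemons actually started is to look for them in the output of jps, the JVM process lister that ships with the JDK. The following is a minimal sketch and not part of the original tutorial: the helper names has_daemon and check_yarn_daemons are hypothetical, and the sketch assumes that jps prints one "<pid> <name>" pair per line.

```shell
# has_daemon NAME: succeeds if a line of the form "<pid> NAME" arrives on stdin.
# Hypothetical helper; it reads stdin so it works on live or saved jps output.
has_daemon() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit found ? 0 : 1 }'
}

# check_yarn_daemons: reports whether each YARN daemon appears in jps output.
# Assumes the JDK's jps tool is on the PATH.
check_yarn_daemons() {
  local listing daemon
  listing=$(jps)
  for daemon in ResourceManager NodeManager; do
    if printf '%s\n' "$listing" | has_daemon "$daemon"; then
      echo "$daemon is running"
    else
      echo "$daemon is NOT running"
    fi
  done
}
```

Calling check_yarn_daemons shortly after sbin/start-yarn.sh should report both daemons as running; if one is missing, its log file under the Hadoop logs directory is usually the first place to look.
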
(3) Browse the web interface for the ResourceManager; by default it is available at:

ResourceManager - http://localhost:8088/
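
The web interface can also be probed from the command line. The sketch below assumes that curl is installed and queries the ResourceManager REST endpoint /ws/v1/cluster/info on the default address; the function names rm_info_url and rm_is_up are hypothetical.

```shell
# rm_info_url [HOST] [PORT]: builds the URL of the ResourceManager cluster-info
# REST endpoint. The defaults match the default web interface address.
rm_info_url() {
  printf 'http://%s:%s/ws/v1/cluster/info\n' "${1:-localhost}" "${2:-8088}"
}

# rm_is_up [HOST] [PORT]: succeeds if the endpoint answers with HTTP 200.
# Assumes curl is available; gives up after 5 seconds.
rm_is_up() {
  local status
  status=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$(rm_info_url "$@")") || true
  [ "$status" = "200" ]
}
```

For example, "if rm_is_up; then echo 'ResourceManager is reachable'; fi" prints the message only once the daemon is serving requests.
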
(4) Run a MapReduce job.
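
This step does not name a particular job. As one possibility, the sketch below runs the grep example from the MapReduce examples jar bundled with Hadoop. It makes several assumptions: the commands are run from the Hadoop installation directory, HDFS is already running as set up in the single-node section, and the examples jar sits under share/hadoop/mapreduce/. The function name run_grep_example is hypothetical, and the jar is located with a glob because its file name includes the release version.

```shell
# run_grep_example: copies the Hadoop config files into HDFS and runs the
# bundled "grep" example job over them. Hypothetical wrapper for illustration.
run_grep_example() {
  local jar="" candidate
  # Locate the examples jar; its exact name depends on the Hadoop release.
  for candidate in share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar; do
    if [ -e "$candidate" ]; then
      jar=$candidate
      break
    fi
  done
  if [ -z "$jar" ]; then
    echo "examples jar not found; run this from the Hadoop installation directory" >&2
    return 1
  fi

  # Copy the configuration files into HDFS as job input.
  bin/hdfs dfs -mkdir -p input || return 1
  bin/hdfs dfs -put -f etc/hadoop/*.xml input || return 1

  # With mapreduce.framework.name set to yarn, the job is submitted to YARN.
  # The job fails if the "output" directory already exists in HDFS.
  bin/hadoop jar "$jar" grep input output 'dfs[a-z.]+' || return 1

  # Print the matches found by the job.
  bin/hdfs dfs -cat 'output/*'
}
```

While the job runs, it should appear in the list of applications in the ResourceManager web interface.
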

(5) When you are done, stop the daemons:

$ sbin/stop-yarn.sh

    1. Running in fully distributed mode

To run in fully distributed mode, refer to the "Installation configuration on Apache Hadoop cluster" section below.

Installation configuration on Apache Hadoop cluster

This section describes how to install, configure, and manage a Hadoop cluster that can scale from a small cluster of several nodes to a very large cluster of thousands of nodes.

    1. Prerequisites

Make sure that all the necessary software is installed on each node in the cluster. Installing a Hadoop cluster typically involves unpacking the software on all the machines in the cluster; for details, refer to the previous section, "Installation configuration on Apache Hadoop single node."

Typically, one machine in the cluster is designated as the NameNode and another machine as the ResourceManager. These are the masters. Other services, such as the Web App Proxy Server and the MapReduce Job History Server, usually run either on dedicated hardware or on shared infrastructure, depending on the load.

The remaining machines in the cluster act as both DataNode and NodeManager. These are the slaves.
