Hadoop Official Documentation Study Notes (1): ResourceManager HA


ResourceManager (RM) HA is a feature introduced in Hadoop 2.4. It provides redundancy through an active/standby pair of ResourceManagers, so that the RM is no longer a single point of failure.


1. Overall architecture

[Figure: Overview of ResourceManager High Availability, http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/images/rm-ha-overview.png]


2. Failover

Failover can be performed either manually or automatically.


Manual: When automatic failover is not enabled, use the yarn rmadmin command to first transition the active RM to standby and then transition a standby RM to active.


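Automatic: The RMs have a ZooKeeper-based election mechanism that decides which RM is active. When the active RM goes down or becomes unresponsive, another RM is elected active and takes over. Unlike HDFS, there is no need to deploy a separate ZKFC daemon, because the RM embeds this functionality.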
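To make the election idea concrete, here is a toy, in-memory sketch. It is not how the RM is implemented and it does not talk to a real ZooKeeper: the FakeZooKeeper class, its lock_owner field, and the elect function are hypothetical names invented for this note. It only illustrates the principle that whichever RM holds a single ephemeral lock is active, and that the lock is released when its owner's session ends.

```python
class FakeZooKeeper:
    """In-memory stand-in for one ZooKeeper lock node (hypothetical, illustration only)."""

    def __init__(self):
        self.lock_owner = None  # RM that currently holds the ephemeral lock

    def try_create_lock(self, rm_id):
        # Only one RM can create the lock node; everyone else fails.
        if self.lock_owner is None:
            self.lock_owner = rm_id
            return True
        return False

    def session_lost(self, rm_id):
        # An ephemeral node disappears when its owner's session ends.
        if self.lock_owner == rm_id:
            self.lock_owner = None


def elect(zk, rm_ids):
    """Let each RM contend for the lock; return a dict of rm_id -> state."""
    states = {}
    for rm_id in rm_ids:
        if zk.lock_owner == rm_id or zk.try_create_lock(rm_id):
            states[rm_id] = "active"
        else:
            states[rm_id] = "standby"
    return states


zk = FakeZooKeeper()
before = elect(zk, ["rm1", "rm2"])  # rm1 contends first and wins
zk.session_lost("rm1")              # rm1 crashes; its lock node vanishes
after = elect(zk, ["rm2"])          # rm2 is the only contender left
print(before)
print(after)
```

In this toy run rm1 wins the first election, and rm2 becomes active only after rm1's session is lost.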


When RM HA is enabled, clients, ApplicationMasters (AMs), and NodeManagers (NMs) are configured with the full list of RMs and try them in round-robin order until they reach the active RM. If the active RM goes down, they resume polling until they find the new active RM. This default behavior is implemented in the org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider class. You can override it by implementing org.apache.hadoop.yarn.client.RMFailoverProxyProvider and setting the yarn.client.failover-proxy-provider configuration item to the name of your class.
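The polling behavior described above can be sketched in a few lines of Python. This is only an illustration of the round-robin idea, not the logic of ConfiguredRMFailoverProxyProvider: the function find_active_rm, the cluster dictionary, and fake_get_state are all made up for this note, and a real client talks to the RM over RPC instead of reading a dictionary.

```python
import itertools

# Simulated cluster state (hypothetical): rm1 is unreachable, rm2 is active.
cluster = {"rm1": "down", "rm2": "active"}


def fake_get_state(rm_id):
    """Stand-in for a real RPC call; raises if the RM is unreachable."""
    state = cluster[rm_id]
    if state == "down":
        raise ConnectionError(rm_id)
    return state


def find_active_rm(rm_ids, get_state, max_attempts=10):
    """Cycle through rm_ids until one reports 'active', or give up."""
    for attempt, rm_id in enumerate(itertools.cycle(rm_ids)):
        if attempt >= max_attempts:
            break
        try:
            if get_state(rm_id) == "active":
                return rm_id
        except ConnectionError:
            pass  # RM unreachable; try the next one in the list
    return None


print(find_active_rm(["rm1", "rm2"], fake_get_state))
```

The max_attempts argument plays a role loosely similar to yarn.client.failover-max-attempts: without some bound, a client would poll forever when no RM is active.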


3. Configuration

Configuration property: Description
yarn.resourcemanager.zk-address: Address of the ZK quorum. Used both for the state store and for embedded leader election.
yarn.resourcemanager.ha.enabled: Enable RM HA.
yarn.resourcemanager.ha.rm-ids: List of logical IDs for the RMs, e.g., "rm1,rm2".
yarn.resourcemanager.hostname.rm-id: For each rm-id, specify the hostname the RM corresponds to. Alternatively, one could set each of the RM's service addresses.
yarn.resourcemanager.address.rm-id: For each rm-id, specify host:port for clients to submit jobs. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.scheduler.address.rm-id: For each rm-id, specify scheduler host:port for ApplicationMasters to obtain resources. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.resource-tracker.address.rm-id: For each rm-id, specify host:port for NodeManagers to connect. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.admin.address.rm-id: For each rm-id, specify host:port for administrative commands. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.webapp.address.rm-id: For each rm-id, specify host:port of the RM web application. You do not need this if you set yarn.http.policy to HTTPS_ONLY. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.webapp.https.address.rm-id: For each rm-id, specify host:port of the RM HTTPS web application. You do not need this if you set yarn.http.policy to HTTP_ONLY. If set, overrides the hostname set in yarn.resourcemanager.hostname.rm-id.
yarn.resourcemanager.ha.id: Identifies the RM in the ensemble. This is optional; however, if set, admins have to ensure that all the RMs have their own IDs in the config.
yarn.resourcemanager.ha.automatic-failover.enabled: Enable automatic failover. By default, it is enabled only when HA is enabled.
yarn.resourcemanager.ha.automatic-failover.embedded: Use the embedded leader elector to pick the active RM when automatic failover is enabled. By default, it is enabled only when HA is enabled.
yarn.resourcemanager.cluster-id: Identifies the cluster. Used by the elector to ensure an RM doesn't take over as active for another cluster.
yarn.client.failover-proxy-provider: The class to be used by clients, AMs, and NMs to fail over to the active RM.
yarn.client.failover-max-attempts: The maximum number of times FailoverProxyProvider should attempt failover.
yarn.client.failover-sleep-base-ms: The sleep base (in milliseconds) used for calculating the exponential delay between failovers.
yarn.client.failover-sleep-max-ms: The maximum sleep time (in milliseconds) between failovers.
yarn.client.failover-retries: The number of retries per attempt to connect to a ResourceManager.
yarn.client.failover-retries-on-socket-timeouts: The number of retries per attempt to connect to a ResourceManager on socket timeouts.
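The two sleep settings (yarn.client.failover-sleep-base-ms and yarn.client.failover-sleep-max-ms) describe an exponential delay with an upper bound. Here is a simplified sketch of that idea. It is not Hadoop's exact formula (the real client may add randomization and differ in detail), the function name failover_delays_ms is hypothetical, and the values 500 and 15000 are made-up inputs, not Hadoop defaults.

```python
def failover_delays_ms(base_ms, max_ms, attempts):
    """Capped exponential backoff: base, 2*base, 4*base, ... never above max_ms."""
    return [min(base_ms * (2 ** n), max_ms) for n in range(attempts)]


# Illustrative values only; not taken from any Hadoop default.
delays = failover_delays_ms(base_ms=500, max_ms=15000, attempts=7)
print(delays)  # [500, 1000, 2000, 4000, 8000, 15000, 15000]
```

The delay doubles on each failover until it reaches the maximum, after which it stays flat.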


4. Example (minimal configuration)

<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>cluster1</value>
</property>
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master1</value>
</property>
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>master2</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>master1:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>master2:8088</value>
</property>
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>zk1:2181,zk2:2181,zk3:2181</value>
</property>
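Before restarting the RMs, it can help to sanity-check a configuration like the one above. The following sketch parses the properties and reports which of the settings used in this minimal example are missing. The helper names load_properties and missing_ha_settings are hypothetical, the SAMPLE string is an abbreviated copy of the example wrapped in a configuration root element, and the check is deliberately simplistic: it requires a hostname per rm-id even though setting the individual service addresses would also be valid.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the minimal example, wrapped in a root element.
SAMPLE = """<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>cluster1</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>master1</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>master2</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>zk1:2181,zk2:2181,zk3:2181</value></property>
</configuration>"""


def load_properties(xml_text):
    """Return a dict of property name -> value from a Hadoop-style XML string."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def missing_ha_settings(props):
    """List the settings from the minimal example that are absent or not enabled."""
    missing = []
    if props.get("yarn.resourcemanager.ha.enabled") != "true":
        missing.append("yarn.resourcemanager.ha.enabled")
    for key in ("yarn.resourcemanager.cluster-id", "yarn.resourcemanager.zk-address"):
        if not props.get(key):
            missing.append(key)
    raw_ids = props.get("yarn.resourcemanager.ha.rm-ids") or ""
    rm_ids = [s.strip() for s in raw_ids.split(",") if s.strip()]
    if not rm_ids:
        missing.append("yarn.resourcemanager.ha.rm-ids")
    for rm_id in rm_ids:
        key = "yarn.resourcemanager.hostname." + rm_id
        if not props.get(key):
            missing.append(key)  # simplification: service addresses are not checked
    return missing


print(missing_ha_settings(load_properties(SAMPLE)))  # [] when nothing is missing
```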


5. Management commands


To view the state of each RM:
$ yarn rmadmin -getServiceState rm1
active

$ yarn rmadmin -getServiceState rm2
standby


To fail over manually:

$ yarn rmadmin -transitionToStandby rm1
$ yarn rmadmin -transitionToActive rm2

Note:
When automatic failover is enabled, the system refuses to manage the HA state manually, in order to prevent split-brain or other inconsistent states. If you are sure you know what you are doing, you can add the --forcemanual option to the transition command.
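If you script the manual switch, the order of the two transitions matters: the current active RM goes to standby first, then the other RM is promoted. The sketch below only builds the command lines and does not run them. The function name manual_failover_commands and its parameters are hypothetical, and placing --forcemanual directly after the subcommand is an assumption about argument order that should be checked against the usage text of your Hadoop version.

```python
def manual_failover_commands(current_active, new_active, force_manual=False):
    """Build the two rmadmin invocations for a manual failover, in order."""
    # Assumption: the flag goes right after the subcommand.
    extra = ["--forcemanual"] if force_manual else []
    return [
        ["yarn", "rmadmin", "-transitionToStandby"] + extra + [current_active],
        ["yarn", "rmadmin", "-transitionToActive"] + extra + [new_active],
    ]


for cmd in manual_failover_commands("rm1", "rm2"):
    print(" ".join(cmd))
```

Each returned list could be passed to subprocess.run on a machine where the yarn command is installed; that step is left out here on purpose so the sketch has no side effects.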


This article is from the Big Data Learning Quest blog; please keep this source attribution: http://bigdata1024.blog.51cto.com/6098731/1869156
