The ResourceManager (RM) is responsible for tracking the resources in a cluster and scheduling applications (e.g., MapReduce jobs).
Prior to Hadoop 2.4, the ResourceManager was the single point of failure in a YARN cluster. The High Availability (HA) feature adds redundancy in the form of an active/standby ResourceManager pair to remove this otherwise single point of failure.
At any point in time, only one ResourceManager is active, and the remaining one or more are in standby state. Failover can be triggered either manually via the CLI or automatically through the integrated failover controller; automatic failover requires ZooKeeper. The configuration of YARN HA in automatic failover mode is described in detail below.
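For reference, the manual toggle uses the yarn rmadmin CLI. A minimal sketch, assuming the RM IDs rm1 and rm2 that are configured later in this article:

yarn rmadmin -getServiceState rm1       # query the current state of an RM
yarn rmadmin -transitionToActive rm2    # manually promote rm2 (refused while automatic failover is enabled)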
1. First, modify the yarn-site.xml file. The properties between the "add Start 20161012" and "add End 20161012" comments below are the new additions (shown in blue in the original post).
[hadoop@hadoop01 hadoop]$ vi yarn-site.xml
<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
  <!-- Site specific YARN configuration properties -->
  <!-- add Start 20160627 -->
  <property>
    <description>The address of the applications manager interface in the RM.</description>
    <name>yarn.resourcemanager.address</name>
    <value>hadoop01:8032</value>
  </property>
  <property>
    <description>The address of the scheduler interface.</description>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>hadoop01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>hadoop01:8031</value>
  </property>
  <property>
    <description>The address of the RM admin interface.</description>
    <name>yarn.resourcemanager.admin.address</name>
    <value>hadoop01:8033</value>
  </property>
  <property>
    <description>The http address of the RM web application.</description>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>hadoop01:8088</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <!-- add End 20160627 -->
  <!-- add Start 20161012 -->
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>rmCluster</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop01</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop02</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm1</name>
    <value>hadoop01:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address.rm2</name>
    <value>hadoop02:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
  <!-- add End 20161012 -->
</configuration>
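Note that the same yarn-site.xml must be present on every node in the cluster. A minimal sketch for distributing it, assuming the configuration directory is /home/hadoop/hadoop-2.7.2/etc/hadoop (inferred from the log paths shown below, not stated in the original):

[hadoop@hadoop01 hadoop]$ scp yarn-site.xml hadoop02:/home/hadoop/hadoop-2.7.2/etc/hadoop/
[hadoop@hadoop01 hadoop]$ scp yarn-site.xml hadoop03:/home/hadoop/hadoop-2.7.2/etc/hadoop/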
2. On the hadoop01 server, start the Hadoop cluster (the "..." in the output below abbreviates the log path). The output shows that start-all.sh started only one ResourceManager.
[hadoop@hadoop01 hadoop]$ start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
16/07/04 12:17:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop01 hadoop02]
hadoop02: starting namenode, logging to /.../hadoop-hadoop-namenode-hadoop02.out
hadoop01: starting namenode, logging to /.../hadoop-hadoop-namenode-hadoop01.out
hadoop02: starting datanode, logging to /.../hadoop-hadoop-datanode-hadoop02.out
hadoop01: starting datanode, logging to /.../hadoop-hadoop-datanode-hadoop01.out
hadoop03: starting datanode, logging to /.../hadoop-hadoop-datanode-hadoop03.out
Starting journal nodes [hadoop01 hadoop02 hadoop03]
hadoop02: starting journalnode, logging to /.../hadoop-hadoop-journalnode-hadoop02.out
hadoop01: starting journalnode, logging to /.../hadoop-hadoop-journalnode-hadoop01.out
hadoop03: starting journalnode, logging to /.../hadoop-hadoop-journalnode-hadoop03.out
Starting ZK Failover Controllers on NN hosts [hadoop01 hadoop02]
hadoop02: starting zkfc, logging to /.../hadoop-hadoop-zkfc-hadoop02.out
hadoop01: starting zkfc, logging to /.../hadoop-hadoop-zkfc-hadoop01.out
starting yarn daemons
starting resourcemanager, logging to /.../yarn-hadoop-resourcemanager-hadoop01.out
hadoop01: starting nodemanager, logging to /.../yarn-hadoop-nodemanager-hadoop01.out
hadoop02: starting nodemanager, logging to /.../yarn-hadoop-nodemanager-hadoop02.out
hadoop03: starting nodemanager, logging to /.../yarn-hadoop-nodemanager-hadoop03.out
3. Check the processes started by the Hadoop cluster. The hadoop01 machine shows the following processes.
[hadoop@hadoop01 hadoop]$ jps
5239 NodeManager
4839 JournalNode
5288 Jps
4632 DataNode
5032 DFSZKFailoverController
4521 NameNode
5116 ResourceManager
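The processes on the other machines can be checked the same way. A sketch using ssh, assuming passwordless ssh between the nodes (already required by start-all.sh) and that jps is on the remote PATH:

[hadoop@hadoop01 hadoop]$ ssh hadoop02 jps
[hadoop@hadoop01 hadoop]$ ssh hadoop03 jps

Based on the start-all.sh output above, hadoop02 should show NameNode, DataNode, JournalNode, DFSZKFailoverController and NodeManager at this point; its ResourceManager has not been started yet.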
4. Start the ResourceManager on the hadoop02 machine.
[hadoop@hadoop02 ~]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hadoop/hadoop-2.7.2//logs/yarn-hadoop-resourcemanager-hadoop02.out
5. Check the status of each of the two ResourceManagers.
[hadoop@hadoop02 ~]$ yarn rmadmin -getServiceState rm1
active
[hadoop@hadoop02 ~]$ yarn rmadmin -getServiceState rm2
standby
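The state can also be read over HTTP from the RM web port. A sketch using the YARN cluster-info REST endpoint, whose JSON response includes an haState field (the exact field layout may vary across Hadoop versions, and a standby RM may redirect the request to the active one, as described in step 10):

[hadoop@hadoop02 ~]$ curl http://hadoop01:8088/ws/v1/cluster/info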
6. View the ResourceManager status through the web interface.
The page indicates that the ResourceManager on hadoop01 is in the active state:
[Screenshot snap1.jpg: hadoop01 ResourceManager web UI]
The ResourceManager on hadoop02 is in the standby state:
[Screenshot snap2.jpg: hadoop02 ResourceManager web UI]
7. Manually simulate a failover by running the transition command from the hadoop02 server.
[hadoop@hadoop02 ~]$ yarn rmadmin -transitionToStandby rm1
Automatic failover is enabled for ...
Refusing to manually manage HA state, since it may cause
a split-brain scenario or other incorrect state.
If you are very sure you know what you are doing, please
specify the --forcemanual flag.
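The transition is refused because automatic failover is enabled. To force it anyway, the same command can be re-run with the --forcemanual flag, which bypasses the split-brain safeguard and should be used for testing only. A sketch of the forced form; presumably this (or stopping rm1) is what produced the state change observed in the next step:

[hadoop@hadoop02 ~]$ yarn rmadmin -transitionToStandby --forcemanual rm1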
8. Re-check the status of the ResourceManager from the hadoop02 server.
[hadoop@hadoop02 ~]$ yarn rmadmin -getServiceState rm2
active
9. Check the ResourceManager status via the web pages again.
The ResourceManager web UI on the hadoop01 server is no longer accessible:
[Screenshot snap4.jpg: hadoop01 ResourceManager web UI unreachable]
The ResourceManager on the hadoop02 server now shows the active state:
[Screenshot snap3.jpg: hadoop02 ResourceManager web UI]
10. In addition, when we access the web page of a ResourceManager in the standby state, the system automatically redirects the request to the ResourceManager in the active state. As the official documentation puts it:
Assuming a standby RM is up and running, the standby automatically redirects all web requests to the active, except for the "About" page.
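The redirect can also be observed from the command line. A sketch, run while hadoop02 was still the standby (i.e., before the failover test above); curl -I prints the HTTP status and the Location header pointing at the active RM (the exact status code and body vary by Hadoop version):

[hadoop@hadoop02 ~]$ curl -I http://hadoop02:8088/cluster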
This article is from the "Shen Jinqun" blog; reprinting is declined.