HA-Federation-HDFS + Yarn cluster deployment mode

Source: Internet
Author: User


After an afternoon of attempts, I finally got the cluster up. A setup this elaborate isn't strictly necessary for learning, but working through it lays the foundation for building a real environment later.

The following is a cluster deployment of HA-Federation-HDFS + YARN.

First, let's talk about my configuration.

The four nodes run, respectively:

1. bkjia117: active namenode

2. bkjia118: standby namenode, journalnode, datanode

3. bkjia119: active namenode, journalnode, datanode

4. bkjia120: standby namenode, journalnode, datanode

This is only because my computer cannot hold more virtual machines; in a real deployment, each of these roles would sit on a different server. To put it simply: 117 and 119 are the active namenodes, 118 and 120 are the standby namenodes, and the datanodes and journalnodes are placed on 118, 119, and 120.
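For reference, a layout like this is expressed in hdfs-site.xml. The sketch below covers only the first nameservice: hadoop-cluster1, nn1/nn2, and the 8020/50070 ports are taken from the logs later in this post, but the property grouping is just the standard HA/federation template, not my actual file, and the second nameservice name is an assumption.

```xml
<!-- Sketch of the HA/federation part of hdfs-site.xml (not the actual file used here). -->
<property>
  <name>dfs.nameservices</name>
  <!-- hadoop-cluster1 appears in the logs below; hadoop-cluster2 is assumed -->
  <value>hadoop-cluster1,hadoop-cluster2</value>
</property>
<property>
  <name>dfs.ha.namenodes.hadoop-cluster1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoop-cluster1.nn1</name>
  <value>bkjia117:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoop-cluster1.nn2</name>
  <value>bkjia119:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.hadoop-cluster1.nn1</name>
  <value>bkjia117:50070</value>
</property>
```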

Many details are omitted here. The problems encountered and their records are as follows:

1. Start the journalnodes. I'm honestly still not too clear on what the journalnode does; something to research later. Start a journalnode on each node:

[bkjia@bkjia118 hadoop-2.6.0]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/bkjia/hadoop-2.6.0/logs/hadoop-bkjia-journalnode-bkjia118.bkjia.out
[bkjia@bkjia118 hadoop-2.6.0]$ jps
11447 JournalNode
11485 Jps
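For what it's worth, the JournalNode's job is to store the shared NameNode edit log: the active NameNode writes each edit to a quorum of JournalNodes, and the standby tails the log from them. They are wired in via hdfs-site.xml; a sketch, where port 8485 matches the logs below but the local edits directory path is an assumption:

```xml
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://bkjia118:8485;bkjia119:8485;bkjia120:8485/hadoop-cluster1</value>
</property>
<property>
  <!-- where each JournalNode keeps its copy of the edits; path is assumed -->
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/bkjia/hadoop/hdfs/journal</value>
</property>
```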

2. An error was reported when formatting the namenode. (The cause turned out to be the firewall: password-free SSH login does not mean the firewall is off.)

15/08/20 02:12:45 INFO ipc.Client: Retrying connect to server: bkjia119/192.168.75.119:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:46 INFO ipc.Client: Retrying connect to server: bkjia118/192.168.75.118:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:46 INFO ipc.Client: Retrying connect to server: bkjia120/192.168.75.120:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:46 INFO ipc.Client: Retrying connect to server: bkjia119/192.168.75.119:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:46 WARN namenode.NameNode: Encountered exception during format:
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 exceptions thrown:
192.168.75.120:8485: No Route to Host from ... to bkjia120:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
192.168.75.119:8485: No Route to Host from ... to bkjia119:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:884)
at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1379)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
15/08/20 02:12:47 INFO ipc.Client: Retrying connect to server: bkjia118/192.168.75.118:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:47 FATAL namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 exceptions thrown:
192.168.75.120:8485: No Route to Host from ... to bkjia120:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
192.168.75.119:8485: No Route to Host from ... to bkjia119:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:884)
at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1379)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
15/08/20 02:12:47 INFO util.ExitUtil: Exiting with status 1
15/08/20 02:12:47 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at 43.49.49.59.broad.ty.sx.dynamic.163data.com.cn/59.49.49.43
************************************************************/
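A quick way to tell whether this is a connectivity problem (firewall, DNS) rather than a Hadoop one is to probe each JournalNode port from the node doing the format. Below is a minimal bash sketch using the `/dev/tcp` pseudo-device; the host names are this cluster's, and on a CentOS-era box the eventual fix was along the lines of `service iptables stop` on every node.

```shell
#!/usr/bin/env bash
# Probe a TCP port using bash's built-in /dev/tcp pseudo-device.
# Prints "<host>:<port> reachable" or "<host>:<port> unreachable".
check_port() {
  local host=$1 port=$2
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${host}:${port} reachable"
  else
    echo "${host}:${port} unreachable"
  fi
}

# The three JournalNodes in this cluster listen on 8485:
for h in bkjia118 bkjia119 bkjia120; do
  check_port "$h" 8485
done
```

If a host prints `unreachable` even though its JournalNode process shows up in `jps`, a firewall between the nodes is the usual suspect.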

With the firewall off, the format succeeded:

[bkjia@bkjia117 hadoop-2.6.0]$ bin/hdfs namenode -format -clusterId hadoop-cluster

15/08/20 02:22:05 INFO namenode.FSNamesystem: Append Enabled: true
15/08/20 02:22:06 INFO util.GSet: Computing capacity for map INodeMap
15/08/20 02:22:06 INFO util.GSet: VM type = 64-bit
15/08/20 02:22:06 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
15/08/20 02:22:06 INFO util.GSet: capacity = 2^20 = 1048576 entries
15/08/20 02:22:06 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/08/20 02:22:06 INFO util.GSet: Computing capacity for map cachedBlocks
15/08/20 02:22:06 INFO util.GSet: VM type = 64-bit
15/08/20 02:22:06 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
15/08/20 02:22:06 INFO util.GSet: capacity = 2^18 = 262144 entries
15/08/20 02:22:06 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
15/08/20 02:22:06 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
15/08/20 02:22:06 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
15/08/20 02:22:06 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
15/08/20 02:22:06 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
15/08/20 02:22:06 INFO util.GSet: Computing capacity for map NameNodeRetryCache
15/08/20 02:22:06 INFO util.GSet: VM type = 64-bit
15/08/20 02:22:06 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
15/08/20 02:22:06 INFO util.GSet: capacity = 2^15 = 32768 entries
15/08/20 02:22:06 INFO namenode.NNConf: ACLs enabled? false
15/08/20 02:22:06 INFO namenode.NNConf: XAttrs enabled? true
15/08/20 02:22:06 INFO namenode.NNConf: Maximum size of an xattr: 16384
15/08/20 02:22:08 INFO namenode.FSImage: Allocated new BlockPoolId: BP-971817124-192.168.75.117-1440062528650
15/08/20 02:22:08 INFO common.Storage: Storage directory /home/bkjia/hadoop/hdfs/name has been successfully formatted.
15/08/20 02:22:10 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/08/20 02:22:10 INFO util.ExitUtil: Exiting with status 0
15/08/20 02:22:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bkjia117/192.168.75.117
************************************************************/
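One detail worth underlining: the `-clusterId hadoop-cluster` argument is what ties the federated nameservices into a single cluster, so every namenode in the federation must be formatted with the same cluster ID. Clients then address a logical nameservice rather than a single host; a core-site.xml sketch, assuming clients default to hadoop-cluster1:

```xml
<property>
  <!-- the logical nameservice, not a single NameNode host -->
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop-cluster1</value>
</property>
```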

3. Start the namenode:

[bkjia@bkjia117 hadoop-2.6.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/bkjia/hadoop-2.6.0/logs/hadoop-bkjia-namenode-bkjia117.out
[bkjia@bkjia117 hadoop-2.6.0]$ jps
18550 NameNode
18604 Jps

4. Format the standby namenode:

[bkjia@bkjia119 hadoop-2.6.0]$ bin/hdfs namenode -bootstrapStandby
15/08/20 02:36:26 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = bkjia119/192.168.75.119
STARTUP_MSG:   args = [-bootstrapStandby]
STARTUP_MSG:   version = 2.6.0
.....
.....
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.8.0_51
************************************************************/
15/08/20 02:36:26 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/08/20 02:36:26 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
=============================================
About to bootstrap Standby ID nn2 from:
           Nameservice ID: hadoop-cluster1
        Other Namenode ID: nn1
  Other NN's HTTP address: http://bkjia117:50070
  Other NN's IPC  address: bkjia117/192.168.75.117:8020
             Namespace ID: 1244139539
            Block pool ID: BP-971817124-192.168.75.117-1440062528650
               Cluster ID: hadoop-cluster
           Layout version: -60
=============================================
15/08/20 02:36:28 INFO common.Storage: Storage directory /home/bkjia/hadoop/hdfs/name has been successfully formatted.
15/08/20 02:36:29 INFO namenode.TransferFsImage: Opening connection to http://bkjia117:50070/imagetransfer?getimage=1&txid=0&storageInfo=-60:1244139539:0:hadoop-cluster
15/08/20 02:36:30 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
15/08/20 02:36:30 INFO namenode.TransferFsImage: Transfer took 0.01s at 0.00 KB/s
15/08/20 02:36:30 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_000000000000000 size 352 bytes.
15/08/20 02:36:30 INFO util.ExitUtil: Exiting with status 0
15/08/20 02:36:30 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bkjia119/192.168.75.119
************************************************************/

5. Start the standby namenode:

[bkjia@bkjia119 hadoop-2.6.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/bkjia/hadoop-2.6.0/logs/hadoop-bkjia-namenode-bkjia119.out
[bkjia@bkjia119 hadoop-2.6.0]$ jps
14401 JournalNode
15407 NameNode
15455 Jps

On the web UI, the namenodes are shown in standby status at this point.

Use this command to switch nn1 to the active state:

bin/hdfs haadmin -ns hadoop-cluster1 -transitionToActive nn1

The principle is the same for the other namenodes.

Start all the datanodes; this works only once passwordless SSH login is configured:

[bkjia@bkjia117 hadoop-2.6.0]$ sbin/hadoop-daemons.sh start datanode

If everything started normally, datanodes are running on 118, 119, and 120.
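The plural `hadoop-daemons.sh` works by SSHing to every host listed in etc/hadoop/slaves and running `hadoop-daemon.sh` there, which is why passwordless login is a prerequisite. For this cluster the slaves file would presumably contain:

```
bkjia118
bkjia119
bkjia120
```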

Start YARN:

[bkjia@bkjia117 hadoop-2.6.0]$ sbin/start-yarn.sh

starting yarn daemons
starting resourcemanager, logging to /home/bkjia/hadoop-2.6.0/logs/yarn-bkjia-resourcemanager-bkjia117.out
bkjia118: nodemanager running as process 14812. Stop it first.
bkjia120: nodemanager running as process 14025. Stop it first.
bkjia119: nodemanager running as process 17590. Stop it first.
[bkjia@bkjia117 hadoop-2.6.0]$ jps
NameNode
ResourceManager
Jps
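The `Stop it first` lines simply mean NodeManagers were still running from an earlier attempt; `start-yarn.sh` starts a ResourceManager locally and a NodeManager on every slave. The ResourceManager placement comes from yarn-site.xml; a minimal sketch (the shuffle service is the usual companion setting for MapReduce, assumed here rather than taken from my files):

```xml
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>bkjia117</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
```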

On the web UI, you can also see that there are three datanodes.

To end with a summary: if you are learning big data on your own, a simple deployment is enough to run your programs against HDFS. A cluster like this one can wait until it is actually needed, and then be researched in detail. On to the next stage.

