HA-Federation-HDFS + Yarn cluster deployment mode

Source: Internet
Author: User


After an afternoon of attempts, I finally got the cluster up. A setup this elaborate isn't strictly necessary for learning, but working through it lays the foundation for building a real environment later.

The following is a cluster deployment of HA-Federation-HDFS + YARN.

First, let's talk about my configuration.

The four nodes run, respectively:

1. bkjia117: active namenode

2. bkjia118: standby namenode, journalnode, datanode

3. bkjia119: active namenode, journalnode, datanode

4. bkjia120: standby namenode, journalnode, datanode

This is only because my computer cannot hold more virtual machines; in a real deployment, each of these roles would sit on a different server. To put it simply: 117 and 119 are the active namenodes, 118 and 120 are the standby namenodes, and the datanodes and journalnodes are placed on 118, 119, and 120.
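For reference, a layout like this is expressed in hdfs-site.xml. The sketch below covers only the first nameservice: hadoop-cluster1, nn1/nn2, and the 8020/50070 ports are taken from the logs later in this post, but the property grouping is just the standard HA/federation template, not my actual file, and the second nameservice name is an assumption.

```xml
<!-- Sketch of the HA/federation part of hdfs-site.xml (not the actual file used here). -->
<property>
  <name>dfs.nameservices</name>
  <!-- hadoop-cluster1 appears in the logs below; hadoop-cluster2 is assumed -->
  <value>hadoop-cluster1,hadoop-cluster2</value>
</property>
<property>
  <name>dfs.ha.namenodes.hadoop-cluster1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoop-cluster1.nn1</name>
  <value>bkjia117:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hadoop-cluster1.nn2</name>
  <value>bkjia119:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.hadoop-cluster1.nn1</name>
  <value>bkjia117:50070</value>
</property>
```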

Many details are omitted here. The problems encountered and their records are as follows:

1. Start the journalnodes. I'm honestly still not too clear on what the journalnode does; something to research later. Start a journalnode on each node:

[bkjia@bkjia118 hadoop-2.6.0]$ sbin/hadoop-daemon.sh start journalnode
starting journalnode, logging to /home/bkjia/hadoop-2.6.0/logs/hadoop-bkjia-journalnode-bkjia118.bkjia.out
[bkjia@bkjia118 hadoop-2.6.0]$ jps
11447 JournalNode
11485 Jps
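For what it's worth, the JournalNode's job is to store the shared NameNode edit log: the active NameNode writes each edit to a quorum of JournalNodes, and the standby tails the log from them. They are wired in via hdfs-site.xml; a sketch, where port 8485 matches the logs below but the local edits directory path is an assumption:

```xml
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://bkjia118:8485;bkjia119:8485;bkjia120:8485/hadoop-cluster1</value>
</property>
<property>
  <!-- where each JournalNode keeps its copy of the edits; path is assumed -->
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/bkjia/hadoop/hdfs/journal</value>
</property>
```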

2. An error was reported when formatting the namenode. (The cause turned out to be the firewall: password-free SSH login does not mean the firewall is off.)

15/08/20 02:12:45 INFO ipc.Client: Retrying connect to server: bkjia119/192.168.75.119:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:46 INFO ipc.Client: Retrying connect to server: bkjia118/192.168.75.118:8485. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:46 INFO ipc.Client: Retrying connect to server: bkjia120/192.168.75.120:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:46 INFO ipc.Client: Retrying connect to server: bkjia119/192.168.75.119:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:46 WARN namenode.NameNode: Encountered exception during format:
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 exceptions thrown:
192.168.75.120:8485: No Route to Host from ... to bkjia120:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
192.168.75.119:8485: No Route to Host from ... to bkjia119:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:884)
at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1379)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
15/08/20 02:12:47 INFO ipc.Client: Retrying connect to server: bkjia118/192.168.75.118:8485. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/20 02:12:47 FATAL namenode.NameNode: Failed to start namenode.
org.apache.hadoop.hdfs.qjournal.client.QuorumException: Unable to check if JNs are ready for formatting. 2 exceptions thrown:
192.168.75.120:8485: No Route to Host from ... to bkjia120:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
192.168.75.119:8485: No Route to Host from ... to bkjia119:8485 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.hasSomeData(QuorumJournalManager.java:232)
at org.apache.hadoop.hdfs.server.common.Storage.confirmFormat(Storage.java:884)
at org.apache.hadoop.hdfs.server.namenode.FSImage.confirmFormat(FSImage.java:171)
at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1379)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
15/08/20 02:12:47 INFO util.ExitUtil: Exiting with status 1
15/08/20 02:12:47 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at 43.49.49.59.broad.ty.sx.dynamic.163data.com.cn/59.49.49.43
************************************************************/
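A quick way to tell whether this is a connectivity problem (firewall, DNS) rather than a Hadoop one is to probe each JournalNode port from the node doing the format. Below is a minimal bash sketch using the `/dev/tcp` pseudo-device; the host names are this cluster's, and on a CentOS-era box the eventual fix was along the lines of `service iptables stop` on every node.

```shell
#!/usr/bin/env bash
# Probe a TCP port using bash's built-in /dev/tcp pseudo-device.
# Prints "<host>:<port> reachable" or "<host>:<port> unreachable".
check_port() {
  local host=$1 port=$2
  if timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo "${host}:${port} reachable"
  else
    echo "${host}:${port} unreachable"
  fi
}

# The three JournalNodes in this cluster listen on 8485:
for h in bkjia118 bkjia119 bkjia120; do
  check_port "$h" 8485
done
```

If a host prints `unreachable` even though its JournalNode process shows up in `jps`, a firewall between the nodes is the usual suspect.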

With the firewall off, the format succeeded:

[bkjia@bkjia117 hadoop-2.6.0]$ bin/hdfs namenode -format -clusterId hadoop-cluster

15/08/20 02:22:05 INFO namenode.FSNamesystem: Append Enabled: true
15/08/20 02:22:06 INFO util.GSet: Computing capacity for map INodeMap
15/08/20 02:22:06 INFO util.GSet: VM type = 64-bit
15/08/20 02:22:06 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
15/08/20 02:22:06 INFO util.GSet: capacity = 2^20 = 1048576 entries
15/08/20 02:22:06 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/08/20 02:22:06 INFO util.GSet: Computing capacity for map cachedBlocks
15/08/20 02:22:06 INFO util.GSet: VM type = 64-bit
15/08/20 02:22:06 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
15/08/20 02:22:06 INFO util.GSet: capacity = 2^18 = 262144 entries
15/08/20 02:22:06 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
15/08/20 02:22:06 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
15/08/20 02:22:06 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension = 30000
15/08/20 02:22:06 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
15/08/20 02:22:06 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
15/08/20 02:22:06 INFO util.GSet: Computing capacity for map NameNodeRetryCache
15/08/20 02:22:06 INFO util.GSet: VM type = 64-bit
15/08/20 02:22:06 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
15/08/20 02:22:06 INFO util.GSet: capacity = 2^15 = 32768 entries
15/08/20 02:22:06 INFO namenode.NNConf: ACLs enabled? false
15/08/20 02:22:06 INFO namenode.NNConf: XAttrs enabled? true
15/08/20 02:22:06 INFO namenode.NNConf: Maximum size of an xattr: 16384
15/08/20 02:22:08 INFO namenode.FSImage: Allocated new BlockPoolId: BP-971817124-192.168.75.117-1440062528650
15/08/20 02:22:08 INFO common.Storage: Storage directory /home/bkjia/hadoop/hdfs/name has been successfully formatted.
15/08/20 02:22:10 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/08/20 02:22:10 INFO util.ExitUtil: Exiting with status 0
15/08/20 02:22:10 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bkjia117/192.168.75.117
************************************************************/
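One detail worth underlining: the `-clusterId hadoop-cluster` argument is what ties the federated nameservices into a single cluster, so every namenode in the federation must be formatted with the same cluster ID. Clients then address a logical nameservice rather than a single host; a core-site.xml sketch, assuming clients default to hadoop-cluster1:

```xml
<property>
  <!-- the logical nameservice, not a single NameNode host -->
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop-cluster1</value>
</property>
```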

3. Start the namenode:

[bkjia@bkjia117 hadoop-2.6.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/bkjia/hadoop-2.6.0/logs/hadoop-bkjia-namenode-bkjia117.out
[bkjia@bkjia117 hadoop-2.6.0]$ jps
18550 NameNode
18604 Jps

4. Format the standby namenode:

[bkjia@bkjia119 hadoop-2.6.0]$ bin/hdfs namenode -bootstrapStandby
15/08/20 02:36:26 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = bkjia119/192.168.75.119
STARTUP_MSG:   args = [-bootstrapStandby]
STARTUP_MSG:   version = 2.6.0
.....
.....
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.8.0_51
************************************************************/
15/08/20 02:36:26 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/08/20 02:36:26 INFO namenode.NameNode: createNameNode [-bootstrapStandby]
=============================================
About to bootstrap Standby ID nn2 from:
           Nameservice ID: hadoop-cluster1
        Other Namenode ID: nn1
  Other NN's HTTP address: http://bkjia117:50070
  Other NN's IPC  address: bkjia117/192.168.75.117:8020
             Namespace ID: 1244139539
            Block pool ID: BP-971817124-192.168.75.117-1440062528650
               Cluster ID: hadoop-cluster
           Layout version: -60
=============================================
15/08/20 02:36:28 INFO common.Storage: Storage directory /home/bkjia/hadoop/hdfs/name has been successfully formatted.
15/08/20 02:36:29 INFO namenode.TransferFsImage: Opening connection to http://bkjia117:50070/imagetransfer?getimage=1&txid=0&storageInfo=-60:1244139539:0:hadoop-cluster
15/08/20 02:36:30 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
15/08/20 02:36:30 INFO namenode.TransferFsImage: Transfer took 0.01s at 0.00 KB/s
15/08/20 02:36:30 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_000000000000000 size 352 bytes.
15/08/20 02:36:30 INFO util.ExitUtil: Exiting with status 0
15/08/20 02:36:30 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at bkjia119/192.168.75.119
************************************************************/

5. Start the standby namenode:

[bkjia@bkjia119 hadoop-2.6.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /home/bkjia/hadoop-2.6.0/logs/hadoop-bkjia-namenode-bkjia119.out
[bkjia@bkjia119 hadoop-2.6.0]$ jps
14401 JournalNode
15407 NameNode
15455 Jps

On the web UI, the namenodes are shown in standby status at this point.

Use this command to switch nn1 to the active state:

bin/hdfs haadmin -ns hadoop-cluster1 -transitionToActive nn1

The principle is the same for the other namenodes.

Start all the datanodes; this works only once passwordless SSH login is configured:

[bkjia@bkjia117 hadoop-2.6.0]$ sbin/hadoop-daemons.sh start datanode

If everything started normally, datanodes are running on 118, 119, and 120.
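The plural `hadoop-daemons.sh` works by SSHing to every host listed in etc/hadoop/slaves and running `hadoop-daemon.sh` there, which is why passwordless login is a prerequisite. For this cluster the slaves file would presumably contain:

```
bkjia118
bkjia119
bkjia120
```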

Start YARN:

[bkjia@bkjia117 hadoop-2.6.0]$ sbin/start-yarn.sh

starting yarn daemons
starting resourcemanager, logging to /home/bkjia/hadoop-2.6.0/logs/yarn-bkjia-resourcemanager-bkjia117.out
bkjia118: nodemanager running as process 14812. Stop it first.
bkjia120: nodemanager running as process 14025. Stop it first.
bkjia119: nodemanager running as process 17590. Stop it first.
[bkjia@bkjia117 hadoop-2.6.0]$ jps
NameNode
ResourceManager
Jps
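The `Stop it first` lines simply mean NodeManagers were still running from an earlier attempt; `start-yarn.sh` starts a ResourceManager locally and a NodeManager on every slave. The ResourceManager placement comes from yarn-site.xml; a minimal sketch (the shuffle service is the usual companion setting for MapReduce, assumed here rather than taken from my files):

```xml
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>bkjia117</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
```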

On the web UI, you can also see that there are three datanodes.

To end with a summary: if you are learning big data on your own, a simple deployment is enough to run your programs against HDFS. A cluster like this one can wait until it is actually needed, and then be researched in detail. On to the next stage.

