1. Background of HDFS Federation
In Hadoop 1.0, the single-NameNode design of HDFS causes many problems: a single point of failure (SPOF), memory limits, restricted cluster scalability, and a lack of isolation (different businesses sharing the same NameNode affect each other). To solve these problems, Hadoop 2.0 introduces an HA solution based on shared storage, as well as HDFS Federation; this article focuses on HDFS Federation.
HDFS Federation means that one HDFS cluster can contain multiple NameNodes at the same time. Each NameNode manages part of the data independently, while all of them share the storage resources of all DataNodes. This design solves the following problems of a single NameNode:
(1) Scalability. Multiple NameNodes, each managing part of the namespace, allow a cluster to scale out to more nodes; the number of files that can be stored is no longer limited by a single NameNode's memory, as it was in 1.0.
(2) Higher performance. Multiple NameNodes manage different data and serve clients at the same time, giving users higher aggregate read/write throughput.
(3) Better isolation. Data for different businesses can be assigned to different NameNodes as needed, so different businesses have little impact on each other.
Note that HDFS Federation does not solve the single point of failure: each NameNode is still a SPOF, and you need to deploy a standby NameNode for each one to shield the business from the impact of a NameNode going down.
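For example, once the two nameservices configured below (cloud-1 and cloud-2) are running, a client addresses each namespace through its own nameservice URI, while both namespaces share the same DataNodes. A minimal sketch, assuming the configuration from this article:
hadoop fs -ls hdfs://cloud-1/    # namespace managed by the cloud-1 NameNodes
hadoop fs -ls hdfs://cloud-2/    # namespace managed by the cloud-2 NameNodes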
2. Installation environment
Since HDFS Federation still leaves each NameNode as a single point of failure, we build HA and HDFS Federation together. The node roles are assigned as follows:
| Host Name | IP Address | NameNode (Active) | SecondaryNameNode | NameNode (Standby) | JournalNode | DataNode | Zookeeper | Nameservice |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| centos94 | 192.168.1.94 | Y | Y | | | Y | | cloud-1 |
| centos105 | 192.168.1.105 | | Y | Y | | Y | | cloud-1 |
| centos95 | 192.168.1.95 | Y | Y | | Y | Y | Y | cloud-2 |
| centos112 | 192.168.1.112 | | Y | Y | Y | Y | Y | cloud-2 |
| centos111 | 192.168.1.111 | | | | Y | Y | Y | |
Software versions:
Hadoop: hadoop-2.2.0.tar.gz (recompiled from source to run on 64-bit systems)
Zookeeper: zookeeper-3.4.6.tar.gz
For details on preparing the installation environment, see the Hadoop, HBase, and Hive integrated installation documents.
The main parameters are as follows:
HA + Federation: hdfs-site.xml, common part for all nodes
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/admin/hadoop-2.2.0/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/admin/hadoop-2.2.0/dfs/data</value>
</property>
<property>
  <name>dfs.nameservices</name>
  <value>cloud-1,cloud-2</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>3</value>
</property>
<property>
  <name>dfs.ha.namenodes.cloud-1</name>
  <value>centos94,centos105</value>
</property>
<property>
  <name>dfs.ha.namenodes.cloud-2</name>
  <value>centos95,centos112</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cloud-1.centos94</name>
  <value>centos94:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.cloud-1.centos94</name>
  <value>centos94:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cloud-1.centos105</name>
  <value>centos105:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.cloud-1.centos105</name>
  <value>centos105:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cloud-2.centos95</name>
  <value>centos95:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.cloud-2.centos95</name>
  <value>centos95:50070</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.cloud-2.centos112</name>
  <value>centos112:9000</value>
</property>
<property>
  <name>dfs.namenode.http-address.cloud-2.centos112</name>
  <value>centos112:50070</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/admin/hadoop-2.2.0/tmp/journal</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/admin/.ssh/id_rsa</value>
</property>
Configurations that differ between cloud-1 and cloud-2:
cloud-1
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://centos95:8485;centos111:8485;centos112:8485/cloud-1</value>
  <description>The edits directory shared by the two NameNodes of cloud-1, maintained by the JournalNode cluster.</description>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled.cloud-1</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.cloud-1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
cloud-2
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://centos95:8485;centos111:8485;centos112:8485/cloud-2</value>
  <description>The edits directory shared by the two NameNodes of cloud-2, maintained by the JournalNode cluster.</description>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled.cloud-2</name>
  <value>true</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.cloud-2</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
Configure core-site.xml (all nodes)
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://cloud-1</value>
    <description>The default HDFS path. Use cloud-1 on the centos94 and centos105 nodes, and cloud-2 on the centos95 and centos112 nodes.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/admin/hadoop-2.2.0/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>centos95:2181,centos111:2181,centos112:2181</value>
    <description>The Zookeeper cluster.</description>
  </property>
</configuration>
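As the description above notes, on centos95 and centos112 the default filesystem points at cloud-2 instead; a sketch of the only property that differs:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://cloud-2</value>
</property>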
Configure slaves:
vi slaves
centos94
centos95
centos111
centos112
centos105
Configure mapred-site.xml (all nodes)
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
Configure yarn-site.xml (all nodes)
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
Startup:
1. Start Zookeeper
Run the following command on centos95, centos111, and centos112:
bin/zkServer.sh start
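Optionally, you can confirm that the quorum is healthy (one leader, two followers) by running on each of the three nodes:
bin/zkServer.sh status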
2. Start the JournalNodes
Run the following command on centos95, centos111, and centos112:
sbin/hadoop-daemon.sh start journalnode
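You can verify that the daemon came up on each of the three nodes with jps:
jps    # the output should include a JournalNode process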
3. Initialize the HA state in Zookeeper (only needed the first time)
Run the following command on one NameNode of each nameservice (for example, centos94 for cloud-1 and centos95 for cloud-2):
bin/hdfs zkfc -formatZK
When building the Federation environment, you must keep the ${cluster_id} value consistent so that all NameNodes in the same cluster can share all of its resources. The way to do this is to format the first NameNode, note the ${cluster_id} value it produces, and then format each remaining NameNode with the following command:
hdfs namenode -format -clusterId ${cluster_id}
4. In cloud-1, run the following commands on the centos94 node:
bin/hdfs namenode -format -clusterId hadoop    (the cluster ID is either a name you specify yourself or one generated by the cluster)
sbin/hadoop-daemon.sh start namenode
The cluster ID "hadoop" specified here is shared by the entire cluster; it ensures that the two nameservices can share all of the DataNodes. Otherwise the two nameservices would end up with inconsistent clusterIDs, and each DataNode would randomly register with a different NameNode.
On the centos105 node (the standby NameNode), synchronize the metadata from the primary NameNode:
bin/hdfs namenode -bootstrapStandby
Start the standby NameNode:
sbin/hadoop-daemon.sh start namenode
Start zkfc on centos94 and centos105:
sbin/hadoop-daemon.sh start zkfc
After these commands complete, one of centos94 and centos105 will become active.
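You can check which NameNode is currently active with hdfs haadmin; here -ns selects the nameservice, and the NameNode IDs are the ones configured in dfs.ha.namenodes.cloud-1:
bin/hdfs haadmin -ns cloud-1 -getServiceState centos94
bin/hdfs haadmin -ns cloud-1 -getServiceState centos105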
5. In cloud-2, run the format on the centos95 node:
bin/hdfs namenode -format -clusterId hadoop
sbin/hadoop-daemon.sh start namenode
On the centos112 node (the standby NameNode), synchronize the metadata from the primary NameNode:
bin/hdfs namenode -bootstrapStandby
Start the standby NameNode:
sbin/hadoop-daemon.sh start namenode
Start zkfc on centos95 and centos112:
sbin/hadoop-daemon.sh start zkfc
6. Start all DataNodes
Run the following command on the active NameNode node:
sbin/hadoop-daemons.sh start datanode
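Once the DataNodes are up, a quick sanity check is:
bin/hdfs dfsadmin -report
All five DataNodes should be listed as live; because the DataNodes are shared, this should hold no matter which nameservice the command runs against.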
7. Effect after startup
We can see that the clusterIDs of the four NameNodes are consistent.
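The clusterID can also be checked from the shell: it is recorded in the VERSION file under the metadata directory configured in dfs.namenode.name.dir. A sketch, assuming the paths from this article:
cat /home/admin/hadoop-2.2.0/dfs/name/current/VERSION    # on a NameNode; look for the clusterID line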
8. Start YARN
Run the following command on centos94:
sbin/start-yarn.sh
9. Shut down the cluster
Run the following commands on the master node where the ResourceManager and NameNode are located:
Stop YARN:
sbin/stop-yarn.sh
Stop HDFS:
sbin/stop-dfs.sh
Stop Zookeeper:
bin/zkServer.sh stop
10. Summary
Problem 1: after formatting, the clusterIDs of the two nameservices are inconsistent.
Solution: delete the tmp and dfs directories on all nodes, then reformat and start again.
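A sketch of that cleanup, assuming the directory layout used in this article (run on every node before reformatting):
rm -rf /home/admin/hadoop-2.2.0/tmp
rm -rf /home/admin/hadoop-2.2.0/dfs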
11. Installing an HBase cluster on a Hadoop cluster with Federation and HA configured
When you configure HBase clusters against a federation, each federated namespace gets its own HBase cluster: several namespaces means several HBase clusters, running isolated and in parallel. You need to place the Hadoop hdfs-site.xml file of each federated namespace under HBase's conf directory, and then set hbase.rootdir to that namespace's prefix, as sketched below.
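For example, a sketch of the hbase-site.xml entry for the HBase cluster bound to cloud-1 (the /hbase directory name here is just a conventional choice, not mandated by this setup):
<property>
  <name>hbase.rootdir</name>
  <value>hdfs://cloud-1/hbase</value>
</property>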
12. Converting the NameNode from non-HA to HA
After converting the NameNode from non-HA to HA, clear /hbase/splitWAL. Perform the following steps on a Zookeeper node:
1. Run /usr/lib/zookeeper/bin/zkCli.sh
2. ls /hbase/splitWAL; if the znode exists, go to step 3
3. rmr /hbase/splitWAL
4. Restart HBase
Also put Hadoop's hdfs-site.xml and core-site.xml under hbase/conf, then restart HBase.
13. Handling HMaster startup failure in traditional (master-slave) Hadoop mode (when no data needs to be preserved)
1. Run ~/zookeeper/bin/zkCli.sh
2. ls /hbase/splitWAL; if the znode exists, go to step 3
3. rmr /hbase/splitWAL
4. Restart HBase