Deploying Hadoop clusters on Linux: HA-QJM chapter

Tags: failover, zookeeper

Environment

The basic chapter requires 4 machines (one NameNode, three DataNodes).

The HA chapter requires 8 machines: two NameNodes (one active NN, one standby NN), three DataNodes, and three ZooKeeper nodes (the three dedicated ZooKeeper machines can be omitted by running the ZooKeeper daemons on other machines). It also requires 3 JournalNodes, but because they are relatively lightweight, they are deployed here on the DataNodes.
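The machine layout above can be summarized in an /etc/hosts-style mapping. The hostnames below are illustrative assumptions; the IPs are the ones used in the configuration files later in this chapter:

```shell
# Hypothetical /etc/hosts fragment for the 8-machine layout.
# Hostnames nn1/nn2/dn*/zk* are assumptions; the IPs come from the
# core-site.xml and hdfs-site.xml examples in this chapter.
cat > /tmp/hadoop-ha-hosts <<'EOF'
10.9.214.151 nn1
10.9.214.15  nn2
10.9.214.158 dn1
10.9.214.160 dn2
10.9.214.149 dn3
10.9.214.167 zk1
10.9.214.18  zk2
10.9.214.211 zk3
EOF
wc -l < /tmp/hadoop-ha-hosts   # → 8
```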


Perform the following configuration on the three ZooKeeper machines:

1 Create a hadoop user

2 Set up passwordless SSH login

3 Modify the host names

4 Install the JDK

5 Download the ZooKeeper installation package

Download URL: http://mirror.nus.edu.sg/apache/zookeeper

Download zookeeper-3.4.6 to the /opt/ directory and unzip it

6 Modify /etc/profile:

export ZOO_HOME=/opt/zookeeper-3.4.6
export ZOO_LOG_DIR=/opt/zookeeper-3.4.6/logs

To make it take effect:
source /etc/profile

7 Create the ZooKeeper data storage directory:

mkdir /opt/zookeeper-3.4.6/data

8 Create the configuration file under $ZOO_HOME/conf:

vi zoo.cfg and add the following content:

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# The directory where the snapshot is stored.
# Do not use /tmp for storage; /tmp here is just
# an example.
dataDir=/opt/zookeeper-3.4.6/data
# The port at which the clients will connect
clientPort=2181
server.1=10.9.214.167:31316:31317
server.2=10.9.214.18:31316:31317
server.3=10.9.214.211:31316:31317

9 Create the file myid in the /opt/zookeeper-3.4.6/data/ directory and write the server ID into it: zookeeper1 writes 1, zookeeper2 writes 2, zookeeper3 writes 3. For example:

echo 1 > /opt/zookeeper-3.4.6/data/myid
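The myid value on each node must match that node's server.N line in zoo.cfg. A minimal local simulation of what gets written on the three nodes (the /tmp paths are for illustration only; on a real node you run the single echo against the real dataDir):

```shell
# Simulate the three nodes' myid files under /tmp (illustration only;
# on the real cluster each node writes exactly one myid into its own
# dataDir, matching its server.N entry in zoo.cfg).
for id in 1 2 3; do
  dir=/tmp/zk-myid-demo/server$id/data
  mkdir -p "$dir"
  echo "$id" > "$dir/myid"
done
cat /tmp/zk-myid-demo/server2/data/myid   # → 2
```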


10 Start the ZooKeeper service:

cd $ZOO_HOME

./bin/zkServer.sh start


11 Verification

To test whether the ZooKeeper cluster came up successfully, execute the following command in the $ZOO_HOME directory; if it connects without errors, the cluster was created successfully:

./bin/zkCli.sh -server localhost:2181
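zkCli.sh must connect to the port given by clientPort in zoo.cfg, not to the quorum or election ports from the server.N lines. A small sanity check that extracts the port from a zoo.cfg (a demo copy is written to /tmp here; point conf at $ZOO_HOME/conf/zoo.cfg on a real node):

```shell
# Extract clientPort from a zoo.cfg copy so the zkCli.sh invocation
# uses the right port (demo file under /tmp for illustration).
conf=/tmp/zoo.cfg.demo
printf 'tickTime=2000\nclientPort=2181\nserver.1=10.9.214.167:31316:31317\n' > "$conf"
port=$(awk -F= '$1 == "clientPort" {print $2}' "$conf")
echo "$port"   # → 2181
```

With the port in a variable, the verification command becomes ./bin/zkCli.sh -server "localhost:$port".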


For Hadoop, only core-site.xml and hdfs-site.xml need to be modified.

Configure core-site.xml

<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop-2.6.0/tmp</value>
</property>

<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>10.9.214.151</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>

<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster_haohzhang</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>10.9.214.167:2181,10.9.214.18:2181,10.9.214.211:2181</value>
</property>

Configure hdfs-site.xml
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/opt/hadoop-2.6.0/hdfs/name</value>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>file:/opt/hadoop-2.6.0/hdfs/data</value>
</property>

<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>cluster_haohzhang</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster_haohzhang</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster_haohzhang.nn1</name>
<value>10.9.214.151:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster_haohzhang.nn2</name>
<value>10.9.214.15:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster_haohzhang.nn1</name>
<value>10.9.214.151:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster_haohzhang.nn2</name>
<value>10.9.214.15:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://10.9.214.158:8485;10.9.214.160:8485;10.9.214.149:8485/cluster_haohzhang</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster_haohzhang</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/hadoop-2.6.0/journalnode</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
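Hand-editing these XML files makes it easy to leave a &lt;property&gt; tag unclosed or duplicated, which HDFS will reject at startup. A quick balance check, sketched here against a small demo file (point f at the real hdfs-site.xml in practice):

```shell
# Count opening vs closing <property> tags; a mismatch means a stray
# or unclosed tag somewhere in the file. Demo file under /tmp.
f=/tmp/hdfs-site.demo.xml
printf '<property>\n<name>dfs.replication</name>\n<value>3</value>\n</property>\n' > "$f"
opens=$(grep -c '<property>' "$f")
closes=$(grep -c '</property>' "$f")
test "$opens" -eq "$closes" && echo balanced   # → balanced
```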


Operation Details

1 First remove all old metadata on the NameNodes, DataNodes, and JournalNodes

2 Start the three JournalNode processes:

hadoop-daemon.sh start journalnode
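The JournalNode daemon has to be started on each of the three JournalNode hosts (the IPs come from dfs.namenode.shared.edits.dir above). A sketch that only prints the per-host ssh commands instead of executing them; the hadoop user and $HADOOP_HOME layout are assumptions:

```shell
# Print (don't run) the start command for each JournalNode host.
# User 'hadoop' and the $HADOOP_HOME/sbin path are assumptions.
for host in 10.9.214.158 10.9.214.160 10.9.214.149; do
  echo "ssh hadoop@$host '\$HADOOP_HOME/sbin/hadoop-daemon.sh start journalnode'"
done
```

Dropping the echo (and trusting the passwordless SSH set up earlier) would actually start the daemons.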

3 Format the NameNode

Perform on one NameNode:

hdfs namenode -format

This step connects to the JournalNodes and formats them as well.

4 Start HDFS on the NameNode you just formatted:

cd $HADOOP_HOME/sbin; ./start-dfs.sh

5 Perform on the other NameNode:

hdfs namenode -bootstrapStandby

6 Verify manual failover

Execute on either NameNode:

hdfs haadmin -help <command>

shows the command usage; here we use

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

to get the status of the two NameNodes: one standby, one active.

To manually switch the status:

hdfs haadmin -failover nn1 nn2

On success, nn2 becomes the active NameNode.

7 Automatic failover with ZooKeeper

7.1 Initialize ZKFC on one of the NameNodes

hdfs zkfc -formatZK

This step connects to ZooKeeper on port 2181 and creates a znode for the cluster there.

7.2 Start HDFS on that NameNode

cd $HADOOP_HOME/sbin; ./start-dfs.sh

7.3 Verify that the processes started successfully

On a NameNode:

$ jps
12277 NameNode
12871 Jps
12391 DFSZKFailoverController

On a DataNode:

$ jps
7698 DataNode
7787 JournalNode
7933 Jps

7.4 Verify automatic failover

Kill all Hadoop processes on the active NameNode:

kill -9 <pid of NN>

Then check whether the other NameNode has changed from standby to active.

Note: the ZKFC checks the NameNode's health state periodically; the interval is set by ha.health-monitor.check-interval.ms in the configuration.
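Before the kill -9 above, the NameNode pid can be picked out of the jps output. A sketch using the sample output from 7.3 (on a real node, pipe a live jps into the same awk):

```shell
# Extract the NameNode pid from (sample) jps output; on a real node
# replace the printf with:  jps | awk '$2 == "NameNode" {print $1}'
jps_out='12277 NameNode
12391 DFSZKFailoverController
12871 Jps'
pid=$(printf '%s\n' "$jps_out" | awk '$2 == "NameNode" {print $1}')
echo "$pid"   # → 12277
```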

This article is from the "Linux and Network" blog; please contact the author before reproducing it.
