Hadoop (CDH4 release) cluster deployment (deployment scripts, namenode high availability, hadoop management)


Preface

After a period of deploying and managing hadoop, I am writing this series of blog posts to record the process.

To avoid repeating the deployment work, I have written the deployment steps into scripts. You only need to run the scripts following this article and the entire environment is essentially deployed. I have put the deployment scripts in a git repository on Open Source China (http://git.oschina.net/snake1361222/hadoop_scripts).

All of the deployment in this article is based on cloudera's CDH4. CDH4 is a set of yum packages for the hadoop ecosystem, packaged by cloudera and published in cloudera's own yum repository, which greatly simplifies deploying a hadoop environment.

The deployment process in this article covers the HA implementation of the namenode, the hadoop management solution, the synchronization of hadoop configuration files, and rapid deployment via scripts.

Environment preparation

A total of five machines are used as the hardware environment, all running CentOS 6.4.

  • Namenode & resourcemanager master server: 192.168.1.1

  • Namenode & resourcemanager Backup Server: 192.168.1.2

  • Datanode & nodemanager server: 192.168.1.100 192.168.1.101 192.168.1.102

  • Zookeeper server cluster (for namenode high-availability automatic failover): 192.168.1.100 192.168.1.101

  • Jobhistory server (used to record mapreduce logs): 192.168.1.1

  • NFS for namenode HA: 192.168.1.100

Environment deployment

I. Add the CDH4 YUM repository

1. The best approach is to put the CDH4 packages in a self-built yum repository (for how to build one, see "self-built YUM repository"); an illustrative sketch follows the one-click commands below.

2. If you do not want to build your own yum repository, perform the following on every hadoop machine to add the CDH4 yum repository:
wget http://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
sudo yum --nogpgcheck localinstall cloudera-cdh-4-0.x86_64.rpm
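For reference, if you go the self-built repository route from step 1, the setup is roughly as sketched below. This is only a sketch: the repository host (192.168.1.1), the /var/www/html/cdh4 path, and serving it over httpd are my assumptions, not part of the original scripts.

# Sketch of a self-built yum repository for the CDH4 packages (assumed host and path)
yum -y install createrepo httpd
mkdir -p /var/www/html/cdh4
# copy the downloaded CDH4 rpm files into /var/www/html/cdh4, then generate the repo metadata
createrepo /var/www/html/cdh4
/etc/init.d/httpd start

# On every hadoop machine, point yum at the repository
cat > /etc/yum.repos.d/cdh4-local.repo <<'EOF'
[cdh4-local]
name=CDH4 local repository
baseurl=http://192.168.1.1/cdh4/
enabled=1
gpgcheck=0
EOF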
II. Create an NFS server for namenode HA

1. Log on to 192.168.1.100 and execute the following script, CreateNFS.sh:
#!/bin/bash
yum -y install rpcbind nfs-utils
mkdir -p /data/nn_ha/
echo "/data/nn_ha  *(rw,root_squash,all_squash,sync)" >> /etc/exports
/etc/init.d/rpcbind start
/etc/init.d/nfs start
chkconfig --level 234 rpcbind on
chkconfig --level 234 nfs on
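The namenodes will share this export. A quick way to verify and mount it on each namenode is sketched below; the mount point /data/nn_ha is an assumption and must match the shared edits directory used in hdfs-site.xml.

# On each namenode (assumed mount point; adjust to your shared edits directory)
showmount -e 192.168.1.100
mkdir -p /data/nn_ha
mount -t nfs 192.168.1.100:/data/nn_ha /data/nn_ha
echo "192.168.1.100:/data/nn_ha /data/nn_ha nfs defaults 0 0" >> /etc/fstab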
III. Hadoop namenode & resourcemanager master server environment deployment

1. Log on to 192.168.1.1, create a script directory, and clone the scripts from the git repository:


yum -y install git
mkdir -p /opt/
cd /opt/
git clone http://git.oschina.net/snake1361222/hadoop_scripts.git
/etc/init.d/iptables stop
2. Modify the hostname
sh /opt/hadoop_scripts/deploy/AddHostname.sh
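AddHostname.sh lives in the cloned repository and is not reproduced here. Roughly, setting a hostname on CentOS 6 looks like the sketch below; the hostname value is only an example, and the real script presumably derives it from the hosts table.

# Rough equivalent of a hostname change on CentOS 6 (not the actual AddHostname.sh)
hostname nn.dg.hadoop.cn
sed -i 's/^HOSTNAME=.*/HOSTNAME=nn.dg.hadoop.cn/' /etc/sysconfig/network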
3. Modify the configuration file of the deployment script:
vim /opt/hadoop_scripts/deploy/config
# Add the address of the master server, i.e. the namenode master server
master="192.168.1.1"
# Add the NFS server address
nfsserver="192.168.1.100"
4. Edit the hosts file (this file will be synchronized to all machines in the hadoop cluster)
vim /opt/hadoop_scripts/share_data/resolv_host
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.1 nn.dg.hadoop.cn
192.168.1.2 nn2.dg.hadoop.cn
192.168.1.100 dn100.dg.hadoop.cn
192.168.1.101 dn101.dg.hadoop.cn
192.168.1.102 dn102.dg.hadoop.cn
5. Run the deployment script CreateNamenode.sh:
sh /opt/hadoop_scripts/deploy/CreateNamenode.sh
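I have not listed CreateNamenode.sh itself. Broadly speaking, a namenode deployment script on CDH4 installs the relevant packages and drops the sample configuration into place; the package list below is my assumption of what such a script covers, not the actual script contents.

# Assumed outline of what a namenode deployment script installs on CDH4 (not the actual CreateNamenode.sh)
yum -y install hadoop-hdfs-namenode hadoop-hdfs-zkfc hadoop-yarn-resourcemanager hadoop-mapreduce-historyserver
# ...followed by copying the sample configuration from the repository into /etc/hadoop/conf/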
6. Build a saltstack master

PS: saltstack is an open-source server management tool similar to puppet, but more lightweight. It is used here to manage the hadoop cluster and dispatch commands to the datanodes. For details about saltstack, see "SaltStack deployment and use".

A. Install
yum -y install salt salt-master
B. Modify the configuration file /etc/salt/master. The items to be modified are listed below:
# Modify the listening IP address
interface: 0.0.0.0
# Multi-thread pool (worker threads)
worker_threads: 5
# Enable the job cache (according to the official docs, the cache can handle about 5000 minions)
job_cache: True
# Enable automatic authentication of minions
auto_accept: True

C. Enable the service:

/etc/init.d/salt-master start
chkconfig salt-master on
7. My sample configuration is copied into place during deployment, so you only need to modify a few configuration files.

A. /etc/hadoop/conf/hdfs-site.xml (in practice, just adjust the host names to match your environment)
<property>
  <name>dfs.namenode.rpc-address.mycluster.ns1</name>
  <value>nn.dg.hadoop.cn:8020</value>
  <description>Defines the RPC address of ns1</description>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.ns2</name>
  <value>nn2.dg.hadoop.cn:8020</value>
  <description>Defines the RPC address of ns2</description>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>dn100.dg.hadoop.cn:2181,dn101.dg.hadoop.cn:2181,dn102.dg.hadoop.cn:2181</value>
  <description>Specifies the zookeeper cluster machine list for HA</description>
</property>
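For reference, a complete namenode HA setup needs a few more properties of roughly this shape. The values below are illustrative only (the nameservice and namenode IDs follow the snippet above, and the shared edits path assumes the NFS mount described earlier); the sample configuration shipped with the deployment scripts should already contain the equivalents.

<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>ns1,ns2</value>
</property>
<property>
  <!-- shared edits directory on the NFS export; this path is an assumption -->
  <name>dfs.namenode.shared.edits.dir</name>
  <value>file:///data/nn_ha</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>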
B. mapred-site.xml
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>nn.dg.hadoop.cn:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>nn.dg.hadoop.cn:19888</value>
</property>
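The jobhistory server in the environment plan runs on 192.168.1.1. Once these entries are in place, it can be started (after the cluster is up) with the CDH init script, assuming the hadoop-mapreduce-historyserver package was installed by the deployment step:

/etc/init.d/hadoop-mapreduce-historyserver start
chkconfig hadoop-mapreduce-historyserver on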


C. yarn-site.xml
<property>
  <name>yarn.resourcemanager.resource-tracker.address</name>
  <value>nn.dg.hadoop.cn:8031</value>
</property>
<property>
  <name>yarn.resourcemanager.address</name>
  <value>nn.dg.hadoop.cn:8032</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.address</name>
  <value>nn.dg.hadoop.cn:8030</value>
</property>
<property>
  <name>yarn.resourcemanager.admin.address</name>
  <value>nn.dg.hadoop.cn:8033</value>
</property>


IV. Hadoop namenode & resourcemanager backup server environment deployment

1. Log on to 192.168.1.2, create a script directory, and synchronize the scripts from the master server:
/etc/init.d/iptables stop
mkdir -p /opt/hadoop_scripts
rsync -avz 192.168.1.1::hadoop_s /opt/hadoop_scripts
2. Run the deployment script CreateNamenode.sh:
sh /opt/hadoop_scripts/deploy/CreateNamenode.sh
3. Synchronize hadoop configuration files
rsync -avz 192.168.1.1::hadoop_conf /etc/hadoop/conf
4. Deploy the saltstack client (minion):
sh /opt/hadoop_scripts/deploy/salt_minion.sh
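salt_minion.sh is part of the repository and not reproduced here. Its rough equivalent on CentOS 6, assuming the minion only needs to be installed and pointed at the master at 192.168.1.1, would be:

# Assumed equivalent of the salt minion deployment (not the actual salt_minion.sh)
yum -y install salt salt-minion
sed -i 's/^#*master:.*/master: 192.168.1.1/' /etc/salt/minion
/etc/init.d/salt-minion start
chkconfig salt-minion on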
V. Zookeeper server cluster deployment

Zookeeper is an open-source distributed coordination service. Here it is used for automatic failover of the namenode.

1. Install
yum install zookeeper zookeeper-server
2. Modify the configuration file /etc/zookeeper/conf/zoo.cfg:
maxClientCnxns=50
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# The directory where the snapshot is stored
dataDir=/var/lib/zookeeper
# The port at which the clients will connect
clientPort=2181
# All machines in the zookeeper cluster are listed here; this part of the configuration is identical on every machine in the cluster
server.1=dn100.dg.hadoop.cn:2888:3888
server.2=dn101.dg.hadoop.cn:2888:3888
3. Specify the id of the current machine and enable the service:
# For example, the current machine is 192.168.1.100 (dn100.dg.hadoop.cn); it is server.1, so its id is 1:
echo "1" > /var/lib/zookeeper/myid
chown -R zookeeper.zookeeper /var/lib/zookeeper/
service zookeeper-server init
/etc/init.d/zookeeper-server start
chkconfig zookeeper-server on
# Repeat accordingly on 192.168.1.101 (server.2)
VI. Deploy the datanode & nodemanager servers

1. Log on to each datanode server, create a script directory, and synchronize the scripts from the master server:
/etc/init.d/iptables stop
mkdir -p /opt/hadoop_scripts
rsync -avz 192.168.1.1::hadoop_s /opt/hadoop_scripts
2. Modify the hostname and run the deployment script CreateDatanode.sh:
sh /opt/hadoop_scripts/deploy/AddHostname.sh
sh /opt/hadoop_scripts/deploy/CreateDatanode.sh
Cluster Initialization

At this point the hadoop cluster environment has been deployed; next, initialize it.

I. Namenode HA initialization

1. Format the zookeeper failover controller (zkfc) on the namenode master server (192.168.1.1):
sudo -u hdfs hdfs zkfc -formatZK
2. Start the zookeeper cluster service (192.168.1.100 192.168.1.101)
/etc/init.d/zookeeper-server start
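A quick way to confirm each zookeeper node is healthy is the "ruok" four-letter command, which returns "imok" when the server is up (this assumes nc is installed on the machine you run it from):

echo ruok | nc dn100.dg.hadoop.cn 2181
echo ruok | nc dn101.dg.hadoop.cn 2181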
3. Start the zkfc service on both the namenode master and backup servers (192.168.1.1, 192.168.1.2):
/etc/init.d/hadoop-hdfs-zkfc start
4. Format hdfs on the namenode master server (192.168.1.1)
# Make sure the format is executed as the hdfs user
sudo -u hdfs hadoop namenode -format
5. When setting up namenode HA for the first time, the data under the namenode data directory (name.dir) must be copied to the namenode backup server, which can take a long time.

A. Run the following on the master server (192.168.1.1):
tar -zcvPf /tmp/namedir.tar.gz /data/hadoop/dfs/name/
nc -l 9999 < /tmp/namedir.tar.gz
B. Run the following on the backup server (192.168.1.2):
wget 192.168.1.1:9999 -O /tmp/namedir.tar.gz
tar -zxvPf /tmp/namedir.tar.gz
6. Start the namenode and resourcemanager services on both the master and backup servers:
/etc/init.d/hadoop-hdfs-namenode start
/etc/init.d/hadoop-yarn-resourcemanager start
7. View the hdfs web interface:

http://192.168.1.1:9080
http://192.168.1.2:9080
# If the web interface shows both namenodes in standby state, automatic failover is not configured successfully
# Check the zkfc log (/var/log/hadoop-hdfs/hadoop-hdfs-zkfc-nn.dg.s.kingsoft.net.log)
# Check the zookeeper cluster log (/var/log/zookeeper.log)
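You can also check the HA state from the command line with hdfs haadmin; ns1 and ns2 are the namenode IDs defined in hdfs-site.xml above:

sudo -u hdfs hdfs haadmin -getServiceState ns1
sudo -u hdfs hdfs haadmin -getServiceState ns2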
8. Now you can try stopping the namenode master service to verify that the master/backup switchover works.

II. Enable the hdfs cluster

By now, all of the hadoop deployment steps have been completed. Start the cluster and verify that it works.

1. Start all datanode servers
# Remember the saltstack management tool we built earlier? Log on to the saltstack master (192.168.1.1) and execute:
salt -v "dn*" cmd.run "/etc/init.d/hadoop-hdfs-datanode start"
2. Check the hdfs web interface to verify that all datanodes appear as live nodes.

3. If everything looks good, try out hdfs:
# Create a tmp directory
sudo -u hdfs hdfs dfs -mkdir /tmp
# Create a 10 GB file, calculate its MD5 value, and put it into hdfs
dd if=/dev/zero of=/data/test_10G_file bs=1G count=10
md5sum /data/test_10G_file
sudo -u hdfs hdfs dfs -put /data/test_10G_file /tmp
sudo -u hdfs hdfs dfs -ls /tmp
# Now try stopping one datanode, then pull the test file back out and calculate its MD5 again to see whether it matches
sudo -u hdfs hdfs dfs -get /tmp/test_10G_file /tmp/
md5sum /tmp/test_10G_file
III. Enable the yarn cluster

In addition to hdfs for distributed storage of big data, hadoop has another important component: distributed computing (mapreduce). Now let's start the mapreduce v2 (yarn) cluster.

1. Log on to the resourcemanager master server (192.168.1.1) and start the service:
/etc/init.d/hadoop-yarn-resourcemanager start
2. Start all nodemanager services
# Log on to the saltstack master and run:
salt -v "dn*" cmd.run "/etc/init.d/hadoop-yarn-nodemanager start"
3. Check the yarn task tracking page (http://192.168.1.1:9081/) to see whether all nodes have joined.

4. Use hadoop's built-in benchmark mapreduce jobs to test whether the yarn environment works properly:
# TestDFSIO: test HDFS read/write performance by writing 10 files of 1 GB each
su - hdfs
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.2.1-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 1000
# Sort: test MapReduce
# Write random data to the random-data directory
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar randomwriter random-data
# Run the sort program
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar sort random-data sorted-data
# Verify that sorted-data is correctly sorted
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-2.0.0-cdh4.2.1-tests.jar testmapredsort -sortInput random-data -sortOutput sorted-data
Hadoop cluster management

I. Add datanode & nodemanager nodes

1. Modify the hosts table. For example, to add a node 192.168.1.103:
vim /opt/hadoop_scripts/share_data/resolv_host
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.1.1 nn.dg.hadoop.cn
192.168.1.2 nn2.dg.hadoop.cn
192.168.1.100 dn100.dg.hadoop.cn
192.168.1.101 dn101.dg.hadoop.cn
192.168.1.102 dn102.dg.hadoop.cn
192.168.1.103 dn103.dg.hadoop.cn
2. Modify the hostname, synchronize the script directory, and execute the deployment
mkdir -p /opt/hadoop_scripts
rsync -avz 192.168.1.1::hadoop_s /opt/hadoop_scripts
sh /opt/hadoop_scripts/deploy/CreateDatanode.sh
sh /opt/hadoop_scripts/deploy/AddHostname.sh


3. Enable the services:
/etc/init.d/hadoop-hdfs-datanode start
/etc/init.d/hadoop-yarn-nodemanager start
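After the new datanode joins, you may want to spread existing blocks onto it; the hdfs balancer does this (the 10% threshold below is just an example value):

sudo -u hdfs hdfs balancer -threshold 10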
II. Modify the hadoop configuration files

Generally, a single set of hadoop configuration files is maintained for the whole cluster and needs to be distributed to every member. The approach used here is salt + rsync.

# Modify the hadoop configuration files under /etc/hadoop/conf/ on the namenode master server, then run the following command to synchronize them to all members of the cluster
sync_h_conf
# The script directory also needs to be maintained; for example, after modifying the hosts file /opt/hadoop_scripts/share_data/resolv_host, run the following command to synchronize it to all members of the cluster
sync_h_script
# These two commands are actually aliases for my own salt commands; see /opt/hadoop_scripts/profile.d/hadoop.sh
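The actual alias definitions live in /opt/hadoop_scripts/profile.d/hadoop.sh and are not reproduced here. Under the salt + rsync approach they would plausibly expand to something like the following (my assumption, not the real definitions):

# Assumed shape of the aliases (the real ones are in /opt/hadoop_scripts/profile.d/hadoop.sh)
alias sync_h_conf='salt -v "*" cmd.run "rsync -avz 192.168.1.1::hadoop_conf /etc/hadoop/conf"'
alias sync_h_script='salt -v "*" cmd.run "rsync -avz 192.168.1.1::hadoop_s /opt/hadoop_scripts"'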
III. Monitoring

A common solution is ganglia plus nagios: ganglia collects a large number of metrics and presents them graphically, and nagios raises an alarm when a metric exceeds its threshold.

In fact, hadoop also exposes an interface for writing your own monitoring programs, and it is quite simple: request http://192.168.1.1:9080/jmx and the response is JSON with very detailed content. However, returning that large JSON document on every query is wasteful, so the interface also supports filtered queries. For example, if you only want the operating system information, call http://192.168.1.1:9080/jmx?qry=java.lang:type=OperatingSystem; the value after qry is the value of the "name" key of the JSON bean you want.
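As a small illustration, a monitoring script could pull just that bean with curl and pretty-print it; this assumes curl and python are available on the monitoring host, and the URL simply matches the example above:

curl -s "http://192.168.1.1:9080/jmx?qry=java.lang:type=OperatingSystem" | python -m json.tool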


Summary

I encountered many difficulties while deploying the hadoop cluster, and I plan to write about those problems in the next article. If you run into any problems with the deployment described here, feel free to contact me and exchange ideas (QQ: 83766787). You are also welcome to help improve the deployment scripts; the git address is http://git.oschina.net/snake1361222/hadoop_scripts

This article is from the "lxcong O&M Technology" blog; please be sure to keep this source: http://lxcong.blog.51cto.com/7485244/1241004
