Building a Hadoop Distributed Cluster

Last Update:2018-07-26 Source: Internet

Author: User

Tags mkdir tmp folder zookeeper nameserver


Hadoop 2.0 has reached a stable release and adds many features, such as HDFS HA and YARN. The newer hadoop-2.4.1 also adds YARN HA.
Note: The hadoop-2.4.1 installation package provided by Apache was compiled on a 32-bit operating system. Because Hadoop relies on some native C++ libraries, installing hadoop-2.4.1 on a 64-bit operating system requires recompiling it on a 64-bit system. (A 32-bit system is recommended for a first installation. I will also upload my 64-bit build to the group share; if you are interested, you can compile it yourself.)
Preparation (not covered in detail here; it was covered in class):
1. Change the Linux host name
2. Change the IP address
3. Set up the mapping between host names and IP addresses
   Note: if your company rents servers or uses cloud hosts (such as Huawei or Alibaba Cloud hosts), /etc/hosts must map the host names to the internal network IP addresses.
4. Turn off the firewall
5. Set up passwordless SSH login
6. Install the JDK, configure environment variables, and so on
Cluster planning:

Host name   IP              Installed software       Running processes
weekend01   192.168.1.201   JDK, Hadoop              NameNode, DFSZKFailoverController (zkfc)
weekend02   192.168.1.202   JDK, Hadoop              NameNode, DFSZKFailoverController (zkfc)
weekend03   192.168.1.203   JDK, Hadoop              ResourceManager
weekend04   192.168.1.204   JDK, Hadoop              ResourceManager
weekend05   192.168.1.205   JDK, Hadoop, ZooKeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
weekend06   192.168.1.206   JDK, Hadoop, ZooKeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
weekend07   192.168.1.207   JDK, Hadoop, ZooKeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
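The host name to IP mapping from the preparation steps can be generated from the plan above rather than typed by hand. The following is a minimal sketch, assuming the seven hosts keep the regular weekend0N / 192.168.1.20N pattern shown in the table; print_hosts_entries is a hypothetical helper name, and the output still has to be reviewed and appended to /etc/hosts on each node by hand.

```shell
#!/bin/sh
# Print /etc/hosts-style lines for the seven planned nodes.
# Assumption: host weekend0N has the address 192.168.1.20N, as in the plan.
print_hosts_entries() {
    i=1
    while [ "$i" -le 7 ]; do
        printf '192.168.1.20%d weekend0%d\n' "$i" "$i"
        i=$((i + 1))
    done
}

# Show the entries; nothing is written to /etc/hosts here.
print_hosts_entries
```

On cloud hosts, the addresses in the pattern would need to be replaced with the ones your provider assigns.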
Description:
1. In Hadoop 2.0, an HDFS cluster usually has two NameNodes, one in the active state and the other in the standby state. The active NameNode serves clients, while the standby NameNode serves no requests and only synchronizes the state of the active NameNode so that it can take over quickly on failure. Hadoop 2.0 officially provides two HDFS HA solutions, one based on NFS and the other on QJM (Quorum Journal Manager). Here we use the simpler QJM. With QJM, the active and standby NameNodes synchronize metadata through a group of JournalNodes, and a write is considered successful once it has been written to a majority of the JournalNodes. An odd number of JournalNodes is typically configured. A ZooKeeper cluster is also configured for ZKFC (DFSZKFailoverController) failover: when the active NameNode goes down, the standby NameNode is automatically switched to the active state.
2. hadoop-2.2.0 still has a problem: there is only one ResourceManager, which is a single point of failure. hadoop-2.4.1 solves this with two ResourceManagers, one active and one standby, whose state is coordinated by ZooKeeper.

Installation steps:
1. Install and configure the ZooKeeper cluster (on weekend05)
1.1 Extract
    tar -zxvf zookeeper-3.4.5.tar.gz -C /weekend/
1.2 Modify the configuration
    cd /weekend/zookeeper-3.4.5/conf/
    cp zoo_sample.cfg zoo.cfg
    vim zoo.cfg
  Change:
    dataDir=/weekend/zookeeper-3.4.5/tmp
  Append at the end:
    server.1=weekend05:2888:3888
    server.2=weekend06:2888:3888
    server.3=weekend07:2888:3888
  Save and exit, then create the tmp folder:
    mkdir /weekend/zookeeper-3.4.5/tmp
  Create an empty file:
    touch /weekend/zookeeper-3.4.5/tmp/myid
  Finally, write the ID to the file:
    echo 1 > /weekend/zookeeper-3.4.5/tmp/myid
1.3 Copy the configured ZooKeeper to the other nodes (first create a weekend directory under the root directory of weekend06 and weekend07: mkdir /weekend)
    scp -r /weekend/zookeeper-3.4.5/ weekend06:/weekend/
    scp -r /weekend/zookeeper-3.4.5/ weekend07:/weekend/
  Note: change the content of /weekend/zookeeper-3.4.5/tmp/myid on weekend06 and weekend07 accordingly:
    weekend06: echo 2 > /weekend/zookeeper-3.4.5/tmp/myid
    weekend07: echo 3 > /weekend/zookeeper-3.4.5/tmp/myid
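The myid value on each node has to match the N of that node's server.N line in zoo.cfg, and a mismatch is an easy mistake when the file is copied with scp. As a rough sketch, the ID can be looked up from zoo.cfg instead of being typed per host. myid_for_host is a hypothetical helper, not part of ZooKeeper, and the demo runs against a throwaway copy of the server list rather than the real configuration.

```shell
#!/bin/sh
# Look up the numeric ID that zoo.cfg assigns to a host, i.e. the N in
# "server.N=host:2888:3888". That N is the value the host's myid must hold.
# myid_for_host is a hypothetical helper, not a ZooKeeper command.
myid_for_host() {
    # $1 = path to zoo.cfg, $2 = host name
    sed -n "s/^server\.\([0-9][0-9]*\)=$2:.*/\1/p" "$1"
}

# Demo against a temporary file that mirrors the entries from step 1.2.
demo_cfg=$(mktemp)
trap 'rm -f "$demo_cfg"' EXIT
cat > "$demo_cfg" <<'EOF'
dataDir=/weekend/zookeeper-3.4.5/tmp
server.1=weekend05:2888:3888
server.2=weekend06:2888:3888
server.3=weekend07:2888:3888
EOF

for host in weekend05 weekend06 weekend07; do
    echo "$host -> myid $(myid_for_host "$demo_cfg" "$host")"
done
```

On a real node, and assuming the output of hostname matches the name used in zoo.cfg, the result could be redirected into /weekend/zookeeper-3.4.5/tmp/myid in place of the hand-written echo.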
2. Install and configure the Hadoop cluster (on weekend01)
2.1 Extract
    tar -zxvf hadoop-2.4.1.tar.gz -C /weekend/
2.2 Configure HDFS (all Hadoop 2.0 configuration files are in the $HADOOP_HOME/etc/hadoop directory)
    # Add Hadoop to the environment variables
    vim /etc/profile
    export JAVA_HOME=/usr/java/jdk1.7.0_55
    export HADOOP_HOME=/weekend/hadoop-2.4.1
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

    # The Hadoop 2.0 configuration files are all under $HADOOP_HOME/etc/hadoop
    cd /home/hadoop/app/hadoop-2.4.1/etc/hadoop
2.2.1 Modify hadoop-env.sh
    export JAVA_HOME=/home/hadoop/app/jdk1.7.0_55
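Because the paths above are typed in more than one file, a quick check that each variable points at an existing directory can catch typos before any daemon is started. This is only a sketch: check_home_dir is a hypothetical helper, and it tests nothing beyond the existence of the directory.

```shell
#!/bin/sh
# Report whether a variable's value names an existing directory.
# check_home_dir is a hypothetical helper for illustration only.
check_home_dir() {
    # $1 = variable name (used in the message), $2 = its value
    if [ -n "$2" ] && [ -d "$2" ]; then
        echo "$1 ok: $2"
    else
        echo "$1 missing or not a directory: ${2:-<unset>}"
        return 1
    fi
}

# Check the two variables exported in /etc/profile; keep going on failure.
check_home_dir JAVA_HOME "${JAVA_HOME:-}" || true
check_home_dir HADOOP_HOME "${HADOOP_HOME:-}" || true
```

Run it in a new login shell (or after source /etc/profile) so that the exported values are visible.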
2.2.2 Modify core-site.xml
<configuration>
    <!-- Specify the HDFS nameservice as ns1 -->
    <property>
        <!-- The default file system used in the cluster -->
        <name>fs.defaultFS</name>
        <!-- Now points to the nameservice rather than a single NameNode host -->
        <value>hdfs://ns1/</value>
    </property>
    <!-- Specify the Hadoop temp directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <!-- The working directory of each Hadoop node. On a NameNode, a name
             directory is created under it; on a DataNode, a data directory
             is created under it. -->
        <value>/home/hadoop/app/hadoop-2.4.1/tmp</value>
    </property>
    <!-- Specify the ZooKeeper address -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <!-- More entries can be appended when there are more machines -->
        <value>weekend05:2181,weekend06:2181,weekend07:2181</value>
    </property>
</configuration>
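After editing core-site.xml it helps to read a value back and confirm it is what you intended. The sketch below pulls the value of one property out of a Hadoop-style XML file. It assumes every <name> and <value> element sits on its own line and is not a real XML parser; get_hadoop_property is a hypothetical helper, and the demo file is a temporary copy of two of the properties above.

```shell
#!/bin/sh
# Print the <value> that follows a given <name> in a Hadoop-style XML file.
# Assumption: each <name> and <value> element is on its own line.
# get_hadoop_property is a hypothetical helper, not a Hadoop command.
get_hadoop_property() {
    # $1 = config file, $2 = property name
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# Demo against a temporary file with two of the core-site.xml properties.
demo_xml=$(mktemp)
trap 'rm -f "$demo_xml"' EXIT
cat > "$demo_xml" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1/</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>weekend05:2181,weekend06:2181,weekend07:2181</value>
    </property>
</configuration>
EOF

get_hadoop_property "$demo_xml" fs.defaultFS
get_hadoop_property "$demo_xml" ha.zookeeper.quorum
```

On a node where Hadoop is already installed and on the PATH, hdfs getconf -confKey fs.defaultFS reports the value that Hadoop itself resolves, which is the more reliable check.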
2.2.3 Modify hdfs-site.xml
<configuration>
    <!-- Specify the HDFS nameservice as ns1; it must match core-site.xml -->
    <property>
        <!-- Defines the name of a nameservice -->
        <name>dfs.nameservices</name>
        <!-- Multiple nameservices can be listed separated by ",", for example ns1,ns2 -->
        <value>ns1</value>
    </property>
    <!-- ns1 has two NameNodes, nn1 and nn2 -->
    <property>
        <!-- Configures the NameNodes that belong to the nameservice -->
        <name>dfs.ha.namenodes.ns1</name>
        <!-- The two IDs are logical names; the system does not yet know which
             hosts they refer to. The concrete hosts are defined below. -->
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <!-- Defines the RPC address for nn1 -->
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <!-- Specifies a host -->
        <value>weekend01:9000</value>
    </property>
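The per-NameNode property names follow a fixed pattern: the nameservice ID and then the logical NameNode ID are appended to the base key. As a small sketch of that naming rule, the loop below prints the RPC address key for each NameNode ID; rpc_address_keys is a hypothetical helper, and it only builds the names, not the host values.

```shell
#!/bin/sh
# Print the per-NameNode RPC address property names for one nameservice.
# rpc_address_keys is a hypothetical helper for illustration only.
rpc_address_keys() {
    # $1 = nameservice ID, $2 = comma-separated logical NameNode IDs
    echo "$2" | tr ',' '\n' | while read -r nn; do
        echo "dfs.namenode.rpc-address.$1.$nn"
    done
}

# The IDs used in this article: nameservice ns1 with NameNodes nn1 and nn2.
rpc_address_keys ns1 nn1,nn2
```

Each printed key needs its own property entry in hdfs-site.xml with the host and port of the corresponding NameNode.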

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
