Fully Distributed Hadoop and HBase Setup



I. Architecture Changes from Hadoop 1.0 to 2.0

[Figure: Hadoop 1.0 vs. Hadoop 2.0 architecture diagram]

1. Hadoop 2.0 consists of three components: HDFS, MapReduce, and YARN.

2. HDFS: NameNode (NN) Federation and High Availability (HA)

3. MapReduce: MR jobs now run on YARN

4. YARN: the resource management system


II. HDFS 2.0

1. HDFS 2.0 addresses the single point of failure and the memory limitation of HDFS 1.0.

2. Solving the single point of failure

HDFS HA uses an active/standby NameNode pair.

If the active NameNode fails, service switches to the standby NameNode.

3. Solving the memory limitation

HDFS Federation

Scales horizontally across multiple NameNodes.

Each NameNode manages a portion of the namespace (a set of directories).

All NameNodes share the storage of all DataNodes.

4. Only the architecture changes; usage stays the same.

Transparent to HDFS users

The HDFS 1.0 commands and APIs still work:

$ hadoop fs -ls /user/hadoop/
$ hadoop fs -mkdir /user/hadoop/data


III. Hadoop 2.0 HA

1. Active and standby NameNodes

2. Solving the single point of failure

The active NameNode serves all client requests; the standby NameNode keeps the active NameNode's metadata synchronized so that it is ready to take over.

All DataNodes report block information to both NameNodes.

3. Two failover options

Manual failover: a command switches the active and standby roles; useful for planned HDFS upgrades (see the sketch after this list).

Automatic failover based on ZooKeeper.

4. ZooKeeper-based automatic failover

A ZooKeeper Failover Controller (ZKFC) runs beside each NameNode, monitors its health, and registers the NameNode with ZooKeeper.

When the active NameNode fails, its lock in ZooKeeper is released; the remaining ZKFCs compete for the lock, and the NameNode whose ZKFC acquires it becomes active.
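For reference, manual failover is driven by the hdfs haadmin tool. A minimal sketch, assuming an HA pair whose NameNodes are registered under the illustrative service IDs nn1 and nn2 (the walkthrough below uses a SecondaryNameNode layout rather than HA, so these commands do not apply to it as-is):

# Check which NameNode is currently active
$ hdfs haadmin -getServiceState nn1
$ hdfs haadmin -getServiceState nn2
# Manually hand the active role from nn1 to nn2
$ hdfs haadmin -failover nn1 nn2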


IV. Environment Setup

192.168.1.2 master

192.168.1.3 slave1

192.168.1.4 slave2

Hadoop version: hadoop-2.2.0.tar.gz

HBase version: hbase-0.98.11-hadoop2-bin.tar.gz

ZooKeeper version: zookeeper-3.4.5.tar.gz

JDK version: jdk-7u25-linux-x64.gz


1. Hosts file (/etc/hosts) configuration

[root@master ~]# cat /etc/hosts
192.168.1.2 master
192.168.1.3 slave1
192.168.1.4 slave2
[root@slave1 ~]# cat /etc/hosts
192.168.1.2 master
192.168.1.3 slave1
192.168.1.4 slave2
[root@slave2 ~]# cat /etc/hosts
192.168.1.2 master
192.168.1.3 slave1
192.168.1.4 slave2
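A quick sanity check, assuming all three machines are already up, is to resolve and reach every peer from every node; the loop below is illustrative:

[root@master ~]# for h in master slave1 slave2; do ping -c 1 $h > /dev/null && echo "$h ok"; done
# Run the same loop on slave1 and slave2; any failure points to a wrong /etc/hosts entry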


2. Configure mutual trust between nodes

[root@master ~]# useradd hadoop
[root@slave1 ~]# useradd hadoop
[root@slave2 ~]# useradd hadoop
[root@master ~]# passwd hadoop
[root@slave1 ~]# passwd hadoop
[root@slave2 ~]# passwd hadoop
[root@master ~]# su - hadoop
# Generate a key pair first if none exists yet (step assumed by ssh-copy-id below; accept the defaults)
[hadoop@master ~]$ ssh-keygen -t rsa
[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub slave1
[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub slave2
[hadoop@master ~]$ ssh-copy-id -i ~/.ssh/id_rsa.pub master
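To confirm the trust works, every login from the master should complete without a password prompt; a minimal check:

[hadoop@master ~]$ for h in master slave1 slave2; do ssh $h hostname; done
# Each hostname should print with no password prompt; master-to-slave access is all the Hadoop start scripts need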



3. JDK Environment Configuration

[root@master ~]# tar zxvf jdk-7u25-linux-x64.gz
[root@master ~]# mkdir /usr/java
[root@master ~]# mv jdk1.7.0_25 /usr/java
[root@master ~]# cd /usr/java/
[root@master java]# ln -s jdk1.7.0_25 jdk
# Modify /etc/profile and append:
export JAVA_HOME=/usr/java/jdk
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
export PATH=/usr/java/jdk/bin:$PATH
[root@master ~]# source /etc/profile
[root@master ~]# java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)
# Repeat the same steps on slave1 and slave2


4. Hadoop Installation

[root@master ~]# tar zxvf hadoop-2.2.0.tar.gz
[root@master ~]# mv hadoop-2.2.0 /home/hadoop/
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ln -s hadoop-2.2.0 hadoop
[root@master hadoop]# chown -R hadoop.hadoop /home/hadoop/
[root@master ~]# cd /home/hadoop/hadoop/etc/hadoop
# Modify hadoop-env.sh:
export JAVA_HOME=/usr/java/jdk
export HADOOP_HEAPSIZE=200
# Modify mapred-env.sh:
export JAVA_HOME=/usr/java/jdk
export HADOOP_JOB_HISTORYSERVER_HEAPSIZE=1000
# Modify yarn-env.sh:
export JAVA_HOME=/usr/java/jdk
JAVA_HEAP_MAX=-Xmx300m
YARN_HEAPSIZE=100
# Modify core-site.xml:
 
  
   
    
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>
# Modify hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>master:9001</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
# Modify mapred-site.xml:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>master:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>master:19888</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.map.cpu.vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>512</value>
  </property>
</configuration>
# Modify yarn-site.xml:
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>master:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>master:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>master:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>master:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>master:8088</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>100</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>200</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
  </property>
</configuration>
# Modify the slaves file:
slave1
slave2
# Modify /home/hadoop/.bashrc:
export HADOOP_DEV_HOME=/home/hadoop/hadoop
export PATH=$PATH:$HADOOP_DEV_HOME/bin
export PATH=$PATH:$HADOOP_DEV_HOME/sbin
export HADOOP_MAPRED_HOME=${HADOOP_DEV_HOME}
export HADOOP_COMMON_HOME=${HADOOP_DEV_HOME}
export HADOOP_HDFS_HOME=${HADOOP_DEV_HOME}
export YARN_HOME=${HADOOP_DEV_HOME}
export HADOOP_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export HDFS_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_DEV_HOME}/etc/hadoop
# Copy all of the files modified above to the slave1 and slave2 nodes (see the sketch below)
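Before the first start of HDFS, the configuration has to reach the slave nodes and the NameNode must be formatted. The original steps only say to transfer the files, so the exact commands below are an illustrative sketch assuming the layout above:

# Copy the configured tree to the slaves and recreate the symlink
[hadoop@master ~]$ scp -r /home/hadoop/hadoop-2.2.0 slave1:/home/hadoop/
[hadoop@master ~]$ scp -r /home/hadoop/hadoop-2.2.0 slave2:/home/hadoop/
[hadoop@master ~]$ ssh slave1 'ln -s /home/hadoop/hadoop-2.2.0 /home/hadoop/hadoop'
[hadoop@master ~]$ ssh slave2 'ln -s /home/hadoop/hadoop-2.2.0 /home/hadoop/hadoop'
# Format the NameNode once, before the first start-dfs.sh
[hadoop@master ~]$ /home/hadoop/hadoop/bin/hdfs namenode -format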



5. Start HDFS on the master node

[hadoop@master ~]$ cd /home/hadoop/hadoop/sbin/
[hadoop@master sbin]$ ./start-dfs.sh
15/03/21 00:49:35 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [master]
master: starting namenode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-namenode-master.out
slave2: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-datanode-slave2.out
slave1: starting datanode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-datanode-slave1.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /home/hadoop/hadoop-2.2.0/logs/hadoop-hadoop-secondarynamenode-master.out
# View the processes
[hadoop@master ~]$ jps
39093 Jps
38917 SecondaryNameNode
38767 NameNode
[root@slave1 ~]# jps
2463 Jps
2379 DataNode
[root@slave2 ~]# jps
2463 Jps
2379 DataNode
# Start the job history server
[hadoop@master sbin]$ mr-jobhistory-daemon.sh start historyserver
starting historyserver, logging to /home/hadoop/hadoop-2.2.0/logs/mapred-hadoop-historyserver-master.out
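Besides jps, a quick way to confirm that both DataNodes registered with the NameNode:

[hadoop@master ~]$ hdfs dfsadmin -report
# Expect "Datanodes available: 2" with one entry each for slave1 and slave2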



6. Start YARN

[hadoop@master ~]$ cd /home/hadoop/hadoop/sbin/
[hadoop@master sbin]$ ./start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-hadoop-resourcemanager-master.out
slave2: starting nodemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-hadoop-nodemanager-slave2.out
slave1: starting nodemanager, logging to /home/hadoop/hadoop-2.2.0/logs/yarn-hadoop-nodemanager-slave1.out
# View the processes
[hadoop@master sbin]$ jps
39390 Jps
38917 SecondaryNameNode
39147 ResourceManager
38767 NameNode
[hadoop@slave1 ~]$ jps
2646 Jps
2535 NodeManager
2379 DataNode
[hadoop@slave2 ~]$ jps
8261 Jps
8150 NodeManager
8004 DataNode
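With HDFS and YARN both up, the bundled pi example exercises the whole stack; a minimal sketch (the jar path matches the 2.2.0 layout used here):

[hadoop@master ~]$ cd /home/hadoop/hadoop
[hadoop@master hadoop]$ yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 10
# The job should show up on the YARN UI (step 11) and print an estimated value of Pi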


7. View the HDFS file system

[hadoop@master sbin]$ hadoop fs -ls /
15/03/21 15:56:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2015-03-20 17:46 /hbase
drwxrwx---   - hadoop supergroup          0 2015-03-20 16:56 /tmp
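A quick round trip confirms reads and writes end to end; the file name below is illustrative:

[hadoop@master ~]$ echo "hello hdfs" > /tmp/hello.txt
[hadoop@master ~]$ hadoop fs -mkdir -p /user/hadoop
[hadoop@master ~]$ hadoop fs -put /tmp/hello.txt /user/hadoop/
[hadoop@master ~]$ hadoop fs -cat /user/hadoop/hello.txt
hello hdfs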



8. Install ZooKeeper

[root@master ~]# tar zxvf zookeeper-3.4.5.tar.gz -C /home/hadoop/
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ln -s zookeeper-3.4.5 zookeeper
[root@master hadoop]# chown -R hadoop.hadoop /home/hadoop/zookeeper
[root@master hadoop]# cd zookeeper/conf/
[root@master conf]# cp zoo_sample.cfg zoo.cfg
# Modify zoo.cfg:
dataDir=/home/hadoop/zookeeper/data
dataLogDir=/home/hadoop/zookeeper/logs
server.1=192.168.1.2:7000:7001
server.2=192.168.1.3:7000:7001
server.3=192.168.1.4:7000:7001
# Perform the same steps on slave1 and slave2, then write each node's ID
[hadoop@master conf]$ cd /home/hadoop/zookeeper/data/
[hadoop@master data]$ echo 1 > myid
[hadoop@slave1 data]$ echo 2 > myid
[hadoop@slave2 data]$ echo 3 > myid
# Start ZooKeeper on every node
[hadoop@master ~]$ cd zookeeper/bin/
[hadoop@master bin]$ ./zkServer.sh start
[hadoop@slave1 ~]$ cd zookeeper/bin/
[hadoop@slave1 bin]$ ./zkServer.sh start
[hadoop@slave2 ~]$ cd zookeeper/bin/
[hadoop@slave2 bin]$ ./zkServer.sh start
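Once all three servers are started, each node reports its role in the ensemble:

[hadoop@master bin]$ ./zkServer.sh status
# Expect "Mode: leader" on exactly one of the three nodes and "Mode: follower" on the other two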



9. HBase Installation

[root@master ~]# tar zxvf hbase-0.98.11-hadoop2-bin.tar.gz -C /home/hadoop/
[root@master ~]# cd /home/hadoop/
[root@master hadoop]# ln -s hbase-0.98.11-hadoop2 hbase
[root@master hadoop]# chown -R hadoop.hadoop /home/hadoop/hbase
[root@master hadoop]# cd /home/hadoop/hbase/conf/
# Modify hbase-env.sh:
export JAVA_HOME=/usr/java/jdk
export HBASE_HEAPSIZE=50
# Modify hbase-site.xml:
 
  
   
    
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave1,slave2</value>
  </property>
</configuration>
# Modify the regionservers file:
slave1
slave2
# Copy the modified files to slave1 and slave2
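Note that start-hbase.sh in the next step launches HBase's own managed ZooKeeper (the HQuorumPeer processes in the jps output below). To reuse the standalone zookeeper-3.4.5 ensemble installed in step 8 instead, one option is to disable the managed quorum in hbase-env.sh; this is a deviation from the steps shown here, not part of the original walkthrough:

# In /home/hadoop/hbase/conf/hbase-env.sh (optional)
export HBASE_MANAGES_ZK=false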



10. Start HBase on the master node

[hadoop@master ~]$ cd hbase/bin/
[hadoop@master bin]$ ./start-hbase.sh
master: starting zookeeper, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-zookeeper-master.out
slave1: starting zookeeper, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-zookeeper-slave1.out
slave2: starting zookeeper, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-zookeeper-slave2.out
starting master, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-master-master.out
slave1: starting regionserver, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-regionserver-slave1.out
slave2: starting regionserver, logging to /home/hadoop/hbase/bin/../logs/hbase-hadoop-regionserver-slave2.out
# View the processes
[hadoop@master bin]$ jps
39532 QuorumPeerMain
38917 SecondaryNameNode
39147 ResourceManager
39918 HMaster
38767 NameNode
40027 Jps
[hadoop@slave1 data]$ jps
3021 HRegionServer
3104 Jps
2535 NodeManager
2379 DataNode
2942 HQuorumPeer
[hadoop@slave2 ~]$ jps
8430 HRegionServer
8351 HQuorumPeer
8150 NodeManager
8558 Jps
8004 DataNode
# Verify with the HBase shell
[hadoop@master bin]$ ./hbase shell
2015-03-21 16:11:44,534 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.98.11-hadoop2, ..., Tue Mar 3 00:23:49 PST 2015

hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/hadoop/hbase-0.98.11-hadoop2/lib/slf4j-log4j12-1.6.4.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/hadoop/hadoop-2.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2015-03-21 16:11:56,499 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
0 row(s) in 1.9010 seconds

=> []
  
 




11. View the cluster status

HDFS UI: http://192.168.1.2:50070/dfshealth.jsp

YARN UI: http://192.168.1.2:8088/cluster

JobHistory UI: http://192.168.1.2:19888/jobhistory

HBase UI: http://192.168.1.2:60010/master-status
