HBase is an open-source, scalable, distributed NoSQL database. It is column-oriented and suited to storing very large volumes of loosely structured data, and to real-time business environments that perform random reads and writes on big data. For more information about HBase, see the HBase project website.
The environment in this article is the same as in the previous section, the fully distributed Hadoop cluster configuration. The OS is Ubuntu Server 10.04 and the HBase version is 0.20.6.
HRegionServer & HQuorumPeer: dm1, IP: 192.168.0.17;
HRegionServer & HQuorumPeer: dm2, IP: 192.168.0.18;
HRegionServer & HQuorumPeer: dm3, IP: 192.168.0.9;
HMaster & NameNode: dm4, IP: 192.168.0.10; (SecondaryNameNode)
Putting the SecondaryNameNode on the same machine as the NameNode is not good practice. However, since this is only a small experimental cluster (the hardware does not allow otherwise) and scheduled XenServer snapshots provide a safety net, the SecondaryNameNode is not deployed on a separate machine.
The main work is again configuration. HBase is again placed under /home, and the files to edit under /home/hbase/conf are hbase-site.xml, hbase-default.xml, and hbase-env.sh. The procedure is as follows:
1. Edit hbase-site.xml on all machines. The command is as follows:
vi /home/hbase/conf/hbase-site.xml
Edit the file as shown in the following code. Note:
1. First, pay attention to hdfs://dm4:9000/hbase. It must be exactly consistent with the configuration in your Hadoop cluster's core-site.xml file. If your Hadoop HDFS uses a different port, change it here. In addition, HBase does not recognize a host IP address here; only the hostname can be used. That is, if the IP address of dm4 (192.168.0.10) is used, a Java error is thrown. It was a long time ago, and I have not gone back through the large volume of logs to find the specific error.
2. The number of hosts listed in hbase.zookeeper.quorum must be odd.
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://dm4:9000/hbase</value>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>192.168.0.10:60000</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>192.168.0.9,192.168.0.17,192.168.0.18</value>
  </property>
</configuration>
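The two notes above can be checked mechanically before the cluster is started. The following is only a minimal sketch and not part of HBase: the helper names xml_prop, quorum_size, and check_hbase_site are made up for illustration, the parser assumes each <name> and <value> element fits on a single line, and it assumes the HDFS address in core-site.xml is stored under fs.default.name.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of HBase or Hadoop.

# xml_prop FILE KEY
# Print the <value> that follows <name>KEY</name> in a Hadoop-style XML file.
# Assumes every <name> and <value> element fits on a single line.
xml_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1 }
        found && /<value>/ {
            sub(/.*<value>[ \t]*/, "")
            sub(/[ \t]*<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}

# quorum_size LIST
# Count the hosts in a comma-separated quorum string.
quorum_size() {
    printf '%s\n' "$1" | tr -d ' \t' | awk -F',' '{ print NF }'
}

# check_hbase_site HBASE_SITE CORE_SITE
# Returns 0 when hbase.rootdir starts with the HDFS address from core-site.xml
# and the ZooKeeper quorum lists an odd number of hosts; returns 1 otherwise.
check_hbase_site() {
    rootdir=$(xml_prop "$1" hbase.rootdir)
    fs=$(xml_prop "$2" fs.default.name)   # assumed property name
    quorum=$(xml_prop "$1" hbase.zookeeper.quorum)
    status=0
    if [ -z "$fs" ]; then
        echo "fs.default.name not found in $2"
        status=1
    else
        case "$rootdir" in
            "${fs%/}"/*) ;;
            *) echo "hbase.rootdir ($rootdir) does not start with $fs"
               status=1 ;;
        esac
    fi
    if [ $(( $(quorum_size "$quorum") % 2 )) -ne 1 ]; then
        echo "hbase.zookeeper.quorum should list an odd number of hosts"
        status=1
    fi
    return $status
}
```

After sourcing these functions, a call such as check_hbase_site /home/hbase/conf/hbase-site.xml /home/hadoop/conf/core-site.xml prints nothing when both notes hold; the core-site.xml path here is an assumption based on HADOOP_HOME being /home/hadoop.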
2. Edit hbase-default.xml on all machines. The command is as follows:
vi /home/hbase/conf/hbase-default.xml
You only need to modify hbase.rootdir and hbase.cluster.distributed. The code below shows how to modify them.
If hbase.rootdir is left at its default value, HBase data is erased after a restart. If data persistence is required, change the rootdir item to a directory on your HDFS.
For the meaning of the other items in hbase-default.xml and how to modify them, refer to the official website.
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://dm4:9000/hbase_rootdir</value>
    <description>The directory shared by region servers.
    Should be fully-qualified to include the filesystem to use.
    E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR
    </description>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>60000</value>
    <description>The port master should bind to.</description>
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
    <description>The mode the cluster will be in. Possible values are
      false: standalone and pseudo-distributed setups with managed Zookeeper
      true: fully-distributed with unmanaged Zookeeper Quorum (see hbase-env.sh)
    </description>
  </property>
3. Edit hbase-env.sh on all machines. The command is as follows:
vi /home/hbase/conf/hbase-env.sh
The modifications are as follows:
export HBASE_OPTS="$HBASE_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode"
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.22
export HBASE_MANAGES_ZK=true
export HBASE_HOME=/home/hbase
export HADOOP_HOME=/home/hadoop
4. Configure the HMaster and HRegionServers of HBase on all machines. Modify the regionservers file in the /home/hbase/conf folder and add the IP addresses of the DataNodes. The content is as follows:
192.168.0.9
192.168.0.17
192.168.0.18
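Because steps 1 to 4 repeat the same edits on every machine, one option is to edit the files once on dm4 and copy the conf directory to the other nodes. The sketch below assumes passwordless SSH between the nodes and that the host names resolve; conf_sync_commands is a hypothetical helper that only prints the scp commands so they can be reviewed before anything is copied.

```shell
#!/bin/sh
# Sketch only: conf_sync_commands is a hypothetical helper, not an HBase tool.
# Usage: conf_sync_commands HBASE_HOME HOST...
# Prints one scp command per host; nothing is copied until the output is run.
conf_sync_commands() {
    hbase_home=$1
    shift
    for host in "$@"; do
        # Copies the local conf directory into HBASE_HOME on the remote host.
        echo "scp -r $hbase_home/conf $host:$hbase_home/"
    done
}
```

For this cluster, conf_sync_commands /home/hbase dm1 dm2 dm3 prints three commands; piping that output to sh would run them.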
The HBase cluster configuration is now complete. Next, start and test it.
5. Start and test the HBase database.
Start the HBase database on the HMaster, which is also the NameNode (dm4). The Hadoop cluster must already be running. Start command:
/home/hbase/bin/start-hbase.sh
[Screenshot: HBase startup output]
Run the jps command to check the HBase processes currently running in the cluster.
[Screenshot: jps output]
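The jps check can also be scripted. The following is a minimal sketch: missing_daemons is a hypothetical helper that reads jps output on standard input and reports which of the expected process names are absent. The expected names are taken from the role list at the top of this article.

```shell
#!/bin/sh
# Sketch only: missing_daemons is a hypothetical helper.
# Usage: jps | missing_daemons NAME...
# Reads jps output ("PID ClassName" per line) on stdin and prints each
# expected daemon name that is absent. Returns 0 only when none is missing.
missing_daemons() {
    running=$(awk '{ print $2 }')
    missing=0
    for name in "$@"; do
        # -F: fixed string, -x: match the whole line, -q: no output.
        if ! printf '%s\n' "$running" | grep -qxF "$name"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}
```

On dm4 this would be run as jps | missing_daemons HMaster NameNode, and on dm1 to dm3 as jps | missing_daemons HRegionServer HQuorumPeer.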
Enter the following command to open the HBase command-line management interface:
/home/hbase/bin/hbase shell
Enter list in the hbase shell to list the tables in your current database. If HBase is not configured successfully, a Java error is thrown.
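Beyond list, a short write-and-read round trip gives a stronger check that the cluster works. The sketch below only generates the HBase shell commands; smoke_test_commands is a hypothetical helper, and the table name, the column family cf, and the cell values are made up for this example.

```shell
#!/bin/sh
# Sketch only: smoke_test_commands is a hypothetical helper; the column
# family and cell values are invented for this example.
# Usage: smoke_test_commands TABLE
# Prints HBase shell commands that create a table, write and read one cell,
# drop the table again, and list what remains.
smoke_test_commands() {
    table=$1
    printf '%s\n' \
        "create '$table', 'cf'" \
        "put '$table', 'row1', 'cf:a', 'value1'" \
        "scan '$table'" \
        "disable '$table'" \
        "drop '$table'" \
        "list" \
        "exit"
}
```

The printed commands can be typed into the hbase shell one by one. Piping them in, as in smoke_test_commands smoke_test | /home/hbase/bin/hbase shell, should also work, although that was not tried on this cluster.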
You can also manage and view HBase databases through the web interface.
HMaster: http://192.168.0.10:60010/master.jsp
[Screenshot: my HBase database in the web interface]
For HBase commands, see Hadoop Wiki.
HBase database development and application, including database reads and writes and conditional queries, will be covered in my future articles.
VI. References
1. HBase: Bigtable-like structured storage for Hadoop HDFS
http://wiki.apache.org/hadoop/Hbase
2. HBase Testing Tutorial
http://wiki.apache.org/hadoop/Hbase/HowToTest