This article builds on the Hadoop 2.6 cluster deployed in the previous post: http://lizhenliang.blog.51cto.com/7876557/1661354
Next we deploy HBase 1.0, a distributed NoSQL database. HBase involves two key roles: HMaster and HRegionServer.
(Note: the following concepts are taken from Baidu Encyclopedia.)
HMaster is primarily responsible for managing tables and regions:
1. Handles users' create, delete, modify, and query operations on tables
2. Manages load balancing across HRegionServers and adjusts region distribution
3. After a region splits, assigns the new regions
4. After an HRegionServer goes down, migrates the regions that were on the failed HRegionServer
HRegionServer is primarily responsible for responding to user I/O requests and for reading and writing data in the HDFS file system.
How HBase works:
An HRegionServer internally manages a set of HRegion objects, each of which corresponds to one region of a table. An HRegion consists of multiple HStores, and each HStore holds the data of one column family in the table. Each column family is therefore a unit of centralized storage, so it is most efficient to put columns with similar I/O characteristics in the same column family.
With the HStore basics in place, it is also necessary to understand what the HLog does. HStore works fine as long as the system runs normally, but in a distributed environment errors and downtime cannot be avoided, and if an HRegionServer exits unexpectedly, the in-memory data in its MemStore is lost. This is why the HLog is needed. Each HRegionServer has one HLog object. HLog is a class that implements a write-ahead log: every time a user operation writes to the MemStore, a copy of the data is also written to the HLog file. The HLog file is rolled periodically, and old files (whose data has already been persisted to StoreFiles) are deleted. When an HRegionServer terminates unexpectedly, the HMaster learns of it through ZooKeeper. The HMaster first processes the leftover HLog files, splitting the log data by region and placing each part in the directory of the corresponding region, and then reassigns the failed regions. The HRegionServers that pick up these regions find, while loading them, that there is historical HLog data to process, so they replay the HLog data into the MemStore and then flush it to StoreFiles, which completes the recovery.
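The log-and-replay idea can be tried out with a toy model. The sketch below is not HBase code: it assumes a single region, uses one text file as a stand-in for the HLog and another as a stand-in for the in-memory MemStore, and all function names are made up for illustration.

```shell
#!/bin/sh
# Toy model of the HLog (write-ahead log) idea; this is NOT HBase code.
# Assumptions: one region, a text file as the "HLog" (survives a crash)
# and a second text file standing in for the MemStore (lost on a crash).

WORKDIR=$(mktemp -d)
HLOG="$WORKDIR/hlog"
MEMSTORE="$WORKDIR/memstore"
: > "$HLOG"
: > "$MEMSTORE"

# Every write is recorded in both the log and the MemStore.
wal_put() {
    printf '%s\t%s\n' "$1" "$2" >> "$HLOG"
    printf '%s\t%s\n' "$1" "$2" >> "$MEMSTORE"
}

# Simulate an unexpected exit: the in-memory data disappears.
crash() {
    : > "$MEMSTORE"
}

# Recovery: replay every logged edit back into the MemStore.
replay_hlog() {
    cat "$HLOG" >> "$MEMSTORE"
}

# Print the most recent value stored for a key in the MemStore.
mem_get() {
    awk -F '\t' -v k="$1" '$1 == k { v = $2 } END { print v }' "$MEMSTORE"
}

wal_put "zhangsan/address" "beijing"
wal_put "lisi/address" "shanghai"
crash
echo "after crash:  [$(mem_get "zhangsan/address")]"
replay_hlog
echo "after replay: [$(mem_get "zhangsan/address")]"
```

The real recovery path also splits the log by region and flushes the replayed data to StoreFiles; the sketch leaves both steps out.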
HBase high availability:
The HMaster also runs in an active/standby arrangement, with the coordination data stored in ZooKeeper. You can start two or more HMaster processes: the first one to start becomes the active HBase master and the rest become standbys. If the active node fails, ZooKeeper elects one of the standby nodes to become active and take over the failed node's work, so that there is always one master running.
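The "first to start wins, a standby takes over on failure" behaviour can be mimicked with an analogy. In the sketch below an atomic mkdir stands in for the node the active master holds in ZooKeeper. It involves neither HBase nor ZooKeeper, and the lock path and function names are assumptions made for illustration only.

```shell
#!/bin/sh
# Analogy only: an atomic mkdir stands in for the node that the active
# HMaster holds in ZooKeeper. Not HBase or ZooKeeper code.

LOCKROOT=$(mktemp -d)
MASTER_LOCK="$LOCKROOT/master"   # hypothetical stand-in for the master node

# Try to become active; prints the resulting role for the given host name.
try_become_active() {
    if mkdir "$MASTER_LOCK" 2>/dev/null; then
        echo "$1" > "$MASTER_LOCK/owner"
        echo "active"
    else
        echo "standby"
    fi
}

# Simulate the active master failing: the node it held vanishes.
active_fails() {
    rm -rf "$MASTER_LOCK"
}

# Print the host that currently holds the active role.
current_active() {
    cat "$MASTER_LOCK/owner"
}

echo "HMaster0: $(try_become_active HMaster0)"   # first to start
echo "HMaster1: $(try_become_active HMaster1)"   # started second
active_fails
echo "HMaster1: $(try_become_active HMaster1)"   # standby retries and takes over
echo "active master is now: $(current_active)"
```

In a real cluster the standby does not poll like this; ZooKeeper notifies it when the active master's node disappears.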
II. HBase installation and configuration (perform on every node)
1. Install and configure
# tar zxvf hbase-1.0.1.1-bin.tar.gz
# mv hbase-1.0.1.1 /opt
# vi hbase-env.sh
export JAVA_HOME=/usr/local/jdk1.7
export HBASE_MANAGES_ZK=false    # do not let HBase manage its built-in ZooKeeper
# vi hbase-site.xml
<configuration>
    <!-- HBase data directory location -->
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://hcluster/hbase</value>
    </property>
    <!-- Enable distributed cluster mode -->
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <!-- Default HMaster HTTP access port -->
    <property>
        <name>hbase.master.info.port</name>
        <value>16010</value>
    </property>
    <!-- Default HRegionServer HTTP access port -->
    <property>
        <name>hbase.regionserver.info.port</name>
        <value>16030</value>
    </property>
    <!-- Do not use the built-in ZooKeeper; point to the standalone ZK cluster -->
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>HSlave0,HSlave1,HSlave2</value>
    </property>
</configuration>
# vi regionservers
HSlave0
HSlave1
HSlave2
2. Configure environment variables
# vi /etc/profile
HBASE_HOME=/opt/hbase-1.0.1.1
PATH=$PATH:$HBASE_HOME/bin
export HBASE_HOME PATH
# source /etc/profile
3. Start HBase
Start HMaster on HMaster0 and HMaster1, respectively:
# start-hbase.sh
Start HRegionServer on HSlave0, HSlave1, and HSlave2, respectively:
# hbase-daemon.sh start regionserver
4. Check whether startup succeeded
On the active and standby master nodes, an HMaster process in the jps output indicates success:
# jps
2615 DFSZKFailoverController
30027 ResourceManager
29656 NameNode
2841 HMaster
8448 Jps
On the RegionServer nodes, an HRegionServer process in the jps output indicates success:
# jps
11391 NodeManager
11213 DataNode
11298 JournalNode
10934 QuorumPeerMain
12571 HRegionServer
7005 Jps
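The same check can be scripted. The sketch below defines a small helper, has_daemon (a made-up name), that reads jps-style output on standard input and succeeds if the named process is present. The sample input is modelled on the RegionServer listing above; on a real node you would pipe jps into the function instead.

```shell
#!/bin/sh
# has_daemon NAME: succeeds if NAME appears as a process name in jps-style
# input ("<pid> <name>" per line) read from standard input.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Sample input modelled on the RegionServer listing; illustrative only.
SAMPLE='11391 NodeManager
11213 DataNode
11298 JournalNode
10934 QuorumPeerMain
12571 HRegionServer
7005 Jps'

for d in HRegionServer HMaster; do
    if printf '%s\n' "$SAMPLE" | has_daemon "$d"; then
        echo "$d: running"
    else
        echo "$d: not found"
    fi
done
```

On a live node the equivalent call would be something like: jps | has_daemon HRegionServer && echo ok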
You can also check the status through the web UI (HMaster on port 16010 and HRegionServer on port 16030, as configured above). Screenshot: http://s3.51cto.com/wyfs02/M01/6E/DE/wKioL1WKdfzwbp0yAAPpgFVMH6k525.jpg
5. Common HBase shell commands
The following tb1 table structure is used to demonstrate creating, reading, updating, and deleting data in HBase:
Name     | Info: Sex | Info: Age | Address
zhangsan | man       | 22        | beijing
lisi     | woman     | 23        | shanghai
# hbase shell    # enter the interactive shell
5.1 Create table tb1 with three column families: name, info, and address. The info column family holds the sex and age columns
hbase(main):024:0> create 'tb1', 'name', 'info', 'address'
5.2 View the table structure
hbase(main):025:0> describe 'tb1'
5.3 List all tables
hbase(main):025:0> list
5.4 Insert several records
Note that this session stores the age under info:sex and the sex under info:age, the reverse of what the column names suggest; the outputs that follow show the values exactly as they were entered.
hbase(main):028:0> put 'tb1', 'zhangsan', 'info:sex', '22'
hbase(main):039:0> put 'tb1', 'zhangsan', 'info:age', 'man'
hbase(main):031:0> put 'tb1', 'zhangsan', 'address', 'beijing'
hbase(main):046:0> put 'tb1', 'lisi', 'info:age', 'woman'
hbase(main):047:0> put 'tb1', 'lisi', 'info:sex', '23'
hbase(main):048:0> put 'tb1', 'lisi', 'address', 'shanghai'
5.5 View all records (full table scan)
hbase(main):040:0> scan 'tb1'
ROW         COLUMN+CELL
 zhangsan   column=address:, timestamp=1435129009088, value=beijing
 zhangsan   column=info:age, timestamp=1435129054098, value=man
 zhangsan   column=info:sex, timestamp=1435128714392, value=22
Description:
ROW: the row key, which is the primary key used to retrieve a record.
COLUMN family: a column family is part of the table schema and must be defined when the table is created. As the output shows, column names are prefixed with their column family, and one column family can contain multiple columns.
CELL: the storage unit that holds the actual data, that is, the value shown above. A cell has no data type; everything is stored as raw bytes.
Timestamp: assigned automatically by HBase at write time, using the current system time with millisecond precision. If a cell holds multiple versions of the same data, each version is indexed by its timestamp.
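To make the "versions indexed by timestamp" point concrete, here is a toy model in plain shell. It is not how HBase stores data on disk: each cell version is just one line in a text file, and the row, column, timestamps, and values (row1, cf:q, 100, 200, v1, v2) are made up for the example.

```shell
#!/bin/sh
# Toy model of versioned cells; this is not how HBase stores data on disk.
# Each line is: row <TAB> family:qualifier <TAB> timestamp <TAB> value
CELLS=$(mktemp)

# put_cell ROW COLUMN TIMESTAMP VALUE: record one version of a cell.
put_cell() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4" >> "$CELLS"
}

# get_latest ROW COLUMN: print the value with the highest timestamp.
get_latest() {
    awk -F '\t' -v r="$1" -v c="$2" \
        '$1 == r && $2 == c && ($3 + 0) >= best { best = $3 + 0; v = $4 }
         END { print v }' "$CELLS"
}

# get_version ROW COLUMN TIMESTAMP: print the value stored at that timestamp.
get_version() {
    awk -F '\t' -v r="$1" -v c="$2" -v t="$3" \
        '$1 == r && $2 == c && $3 == t { print $4 }' "$CELLS"
}

put_cell row1 cf:q 100 v1     # older version
put_cell row1 cf:q 200 v2     # newer version of the same cell
echo "latest:    $(get_latest row1 cf:q)"
echo "at ts=100: $(get_version row1 cf:q 100)"
```

A plain read returns the newest version, while an older version stays reachable through its timestamp, which is the behaviour the description above refers to.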
5.6 Count the records in a table
hbase(main):050:0> count 'tb1'
2 row(s) in 0.0190 seconds
=> 2
5.7 View one record in a table
hbase(main):054:0> get 'tb1', 'zhangsan'
COLUMN      CELL
 address:   timestamp=1435129096397, value=beijing
 info:age   timestamp=1435129054098, value=man
 info:sex   timestamp=1435128714392, value=22
5.8 View all data in one column family of a row
hbase(main):055:0> get 'tb1', 'zhangsan', 'info'
COLUMN      CELL
 info:age   timestamp=1435129054098, value=man
 info:sex   timestamp=1435128714392, value=22
5.9 Update a record (overwrite)
hbase(main):063:0> put 'tb1', 'zhangsan', 'info:sex', '0'
0 row(s) in 0.0080 seconds
5.10 Add a comment column to lisi (incr creates info:comment as a counter, starting at 1)
hbase(main):070:0> incr 'tb1', 'lisi', 'info:comment'
5.11 Delete one column of a row
hbase(main):065:0> delete 'tb1', 'zhangsan', 'info:sex'
5.12 Delete all data in a row
hbase(main):067:0> deleteall 'tb1', 'zhangsan'
5.13 Delete a table
hbase(main):072:0> disable 'tb1'    # disable the table first
hbase(main):073:0> drop 'tb1'       # then drop it
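The commands above can also be run without typing them interactively, since hbase shell accepts a script file as its argument. The sketch below writes a shortened walkthrough to a file and runs it only if the hbase launcher is on the PATH. The file path and the table name tb1_demo are assumptions chosen for this sketch, so that an existing tb1 table is not dropped by accident.

```shell
#!/bin/sh
# Write a shortened version of the walkthrough into a script for `hbase shell`.
# SCRIPT and the table name tb1_demo are hypothetical choices for this sketch.
SCRIPT="${TMPDIR:-/tmp}/tb1_demo.txt"

cat > "$SCRIPT" <<'EOF'
create 'tb1_demo', 'name', 'info', 'address'
put 'tb1_demo', 'zhangsan', 'address', 'beijing'
put 'tb1_demo', 'lisi', 'address', 'shanghai'
scan 'tb1_demo'
count 'tb1_demo'
disable 'tb1_demo'
drop 'tb1_demo'
exit
EOF

# Run the script only where the hbase launcher is available.
if command -v hbase >/dev/null 2>&1; then
    hbase shell "$SCRIPT"
else
    echo "hbase not found; script written to $SCRIPT"
fi
```

The trailing exit line matters: without it the shell stays at the interactive prompt after the script finishes.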
This article is from the "Penguin" blog. Please keep this source link when reposting: http://lizhenliang.blog.51cto.com/7876557/1665130
HBase1.0 Distributed NoSQL database deployment and use