The usual path for learning an open source project is: (1) installation and configuration, (2) understanding how it works, (3) command-line operations, (4) code operations, (5) reading the source code, and (6) secondary development driven by a paper or a requirement. Learning HBase is no exception. Here I skip the installation and configuration of the HBase cluster (4 nodes) and mainly summarize how HBase works, Shell command operations, and Java code operations.
I. HBase storage structure
1. Client
Analysis:
For management operations, the client communicates with HMaster over RPC; for data read and write operations, the client communicates with HRegionServer over RPC.
2. ZooKeeper
Analysis:
ZooKeeper is a coordination service and can be seen as an open source implementation of Google's Chubby. Its main purpose here is to avoid a single point of failure in HMaster.
3. HMaster
Analysis:
HMaster is primarily responsible for managing tables and regions. Specifically, it:
- Handles users' create, delete, modify, and query operations on tables;
- Manages load balancing across HRegionServers and adjusts the distribution of regions;
- Assigns the new regions after a region split;
- Migrates the regions of a failed HRegionServer after it goes down.
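To make the region assignment and failover duties above more concrete, here is a minimal sketch in plain Java. It does not use the HBase API: the class ToyMaster, its methods, and the "pick the least loaded server" policy are all hypothetical and chosen only for illustration. The real HMaster balancer is considerably more sophisticated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the master's bookkeeping (hypothetical, not the HBase API).
class ToyMaster {
    // Region server name -> regions currently hosted on it.
    private final Map<String, List<String>> assignments = new TreeMap<>();

    void addServer(String server) {
        assignments.putIfAbsent(server, new ArrayList<>());
    }

    // Assign a region to the least loaded live server (assumed policy).
    String assign(String region) {
        String target = null;
        for (Map.Entry<String, List<String>> e : assignments.entrySet()) {
            if (target == null || e.getValue().size() < assignments.get(target).size()) {
                target = e.getKey();
            }
        }
        if (target == null) {
            throw new IllegalStateException("no live region servers");
        }
        assignments.get(target).add(region);
        return target;
    }

    // When a server goes down, move its regions to the remaining servers.
    void serverFailed(String server) {
        List<String> orphaned = assignments.remove(server);
        if (orphaned == null) {
            return; // unknown server, nothing to migrate
        }
        for (String region : orphaned) {
            assign(region);
        }
    }

    // Number of regions on a server; 0 if the server is not live.
    int load(String server) {
        List<String> regions = assignments.get(server);
        return regions == null ? 0 : regions.size();
    }

    public static void main(String[] args) {
        ToyMaster master = new ToyMaster();
        master.addServer("rs1");
        master.addServer("rs2");
        master.addServer("rs3");
        for (String region : new String[] {"A", "B", "C", "D", "E", "F"}) {
            master.assign(region);
        }
        master.serverFailed("rs3");
        System.out.println("rs1=" + master.load("rs1") + " rs2=" + master.load("rs2"));
    }
}
```

In this sketch, a region split would simply be two further calls to assign for the two new regions.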
4. HRegionServer
HRegionServer is primarily responsible for responding to user I/O requests and for reading and writing data to HDFS; it is the core module in HBase. Structurally, an HRegionServer contains multiple HRegions, each HRegion contains multiple HStores, and each HStore corresponds to the storage of one column family in the table. An HStore has two parts: MemStore and StoreFile. MemStore is a sorted memory buffer. Data written by the user is first put into MemStore; when MemStore is full, it is flushed and becomes a StoreFile (implemented underneath as an HFile). When the number of StoreFiles grows past a certain threshold, a compaction is triggered that merges multiple StoreFiles into one; during the merge, versions are consolidated and deleted data is removed.
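The write path described above (MemStore, then flush to StoreFile, then compaction) can be sketched with a small in-memory model. This is plain Java, not HBase code, and it rests on several simplifying assumptions: the class ToyHStore and its thresholds are hypothetical, flushing is triggered by entry count rather than byte size, only the newest value per row key is kept, and every compaction merges all files and drops delete markers (closer to what HBase calls a major compaction).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of one HStore: a sorted MemStore plus a list of immutable StoreFiles.
class ToyHStore {
    // Marker standing in for a delete tombstone (an assumption of this sketch).
    static final String TOMBSTONE = "__deleted__";

    private final int flushThreshold;   // flush when the MemStore holds this many entries
    private final int compactThreshold; // compact when this many StoreFiles exist
    private TreeMap<String, String> memStore = new TreeMap<>();
    private final List<TreeMap<String, String>> storeFiles = new ArrayList<>(); // oldest first

    ToyHStore(int flushThreshold, int compactThreshold) {
        this.flushThreshold = flushThreshold;
        this.compactThreshold = compactThreshold;
    }

    // Writes always land in the MemStore first.
    void put(String rowKey, String value) {
        memStore.put(rowKey, value);
        if (memStore.size() >= flushThreshold) {
            flush();
        }
    }

    // A delete is just a write of a tombstone marker.
    void delete(String rowKey) {
        put(rowKey, TOMBSTONE);
    }

    // Reads check the MemStore, then StoreFiles from newest to oldest.
    String get(String rowKey) {
        String value = memStore.get(rowKey);
        for (int i = storeFiles.size() - 1; value == null && i >= 0; i--) {
            value = storeFiles.get(i).get(rowKey);
        }
        return (value == null || TOMBSTONE.equals(value)) ? null : value;
    }

    // The full MemStore becomes a new StoreFile and a fresh MemStore takes over.
    void flush() {
        if (memStore.isEmpty()) {
            return;
        }
        storeFiles.add(memStore);
        memStore = new TreeMap<>();
        if (storeFiles.size() >= compactThreshold) {
            compact();
        }
    }

    // Merge all StoreFiles into one; newer values win, tombstones are dropped.
    void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (TreeMap<String, String> file : storeFiles) {
            merged.putAll(file); // iterating oldest first lets newer files overwrite
        }
        merged.values().removeIf(TOMBSTONE::equals);
        storeFiles.clear();
        storeFiles.add(merged);
    }

    int storeFileCount() {
        return storeFiles.size();
    }

    public static void main(String[] args) {
        ToyHStore store = new ToyHStore(2, 3);
        store.put("r1", "a");
        store.put("r2", "b");  // MemStore full: first flush
        store.put("r1", "a2");
        store.delete("r2");    // second flush
        store.put("r3", "c");
        store.put("r4", "d");  // third flush, which triggers a compaction
        System.out.println(store.storeFileCount() + " " + store.get("r1") + " " + store.get("r2"));
    }
}
```

Keeping each StoreFile immutable and resolving conflicts only at read time or during compaction is what lets writes stay sequential; the cost is that reads may have to consult several files until a compaction merges them.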
How HBase works and related operations