HBASE Basic Structure


One. Overview
1. HBase <=> NoSQL
HBase is a kind of NoSQL database; what sets it apart is its support for massive amounts of data.
Basic features of HBase:
1) Strongly consistent reads and writes: HBase is not an "eventually consistent" data store. This makes it well suited to tasks such as high-speed counter aggregation.

2) Automatic sharding: HBase is a distributed database that splits its data automatically across the cluster.

3) Automatic RegionServer failover.

4) Hadoop/HDFS integration: HBase is built on the Hadoop HDFS file system.

5) MapReduce support.

6) Java Client API

7) Thrift/REST API: support for non-Java clients.

8) Block-level cache and Bloom filters.

9) Integrated operations management, with web-based and command-line tools.

2. Application scenarios:

1) HBase is not suitable for every scenario. First, you need enough data: with only a few million rows, an ordinary RDBMS can cope. HBase is designed as a store for data sets in the billions of rows.

2) HBase does not provide RDBMS features such as table joins, secondary indexes, or sorting.

3) Make sure you have enough hardware to support the system. HBase is not guaranteed to run well on a cluster of fewer than 5 nodes.


3. The difference between HBase and Hadoop/HDFS:

1) HDFS is a distributed file system, well suited to storing large files. It provides only general file-system functionality; it has no support for record-level lookups inside a file.

2) HBase is built on top of HDFS and provides fast record-level lookups: it keeps its data in indexed StoreFiles.

Two. Dictionary tables

HBase's data dictionary consists of the -ROOT- and .META. tables. These are filtered out of the HBase shell's list command output, but they are otherwise the same as ordinary HBase tables.

1. -ROOT-
1) The -ROOT- table records the location of the .META. table. Its structure:

A) Key:
.META. region key (.META.,,1)

B) Values:
info:regioninfo (serialized HRegionInfo instance of .META.)
info:server (server:port of the RegionServer holding .META.)
info:serverstartcode (start time of the RegionServer process holding .META.)

2. .META.
1) The .META. table holds the information for every region in the system. Structure:

A) Key:
Region key of the format ([table],[region start key],[region ID])

B) Values:
info:regioninfo (serialized HRegionInfo instance for the region)
info:server (server:port of the RegionServer containing this region)
info:serverstartcode (start time of the RegionServer process containing this region)

2) While a region is splitting, two additional columns appear, info:splitA and info:splitB; their values are also HRegionInfo instances. When the split completes, these columns are deleted.

3) Note on HRegionInfo: an empty key marks the boundary of a table.
If a region has an empty start key, it is the first region of the table. If a region has both an empty start key and an empty end key, the table currently has only that one region.
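The .META. region key format and the empty start/end key rules above can be sketched as plain string helpers. This is an illustration only; the class and method names are invented, not real HBase API.

```java
// Sketch of the .META. region key format ([table],[region start key],
// [region ID]) and the empty start/end key rules. Illustrative names,
// not real HBase classes.
public class MetaRegionKey {
    static String regionKey(String table, String startKey, long regionId) {
        return table + "," + startKey + "," + regionId;
    }

    // An empty start key marks the first region of a table.
    static boolean isFirstRegion(String startKey) {
        return startKey.isEmpty();
    }

    // Empty start AND end key: the table currently has only one region.
    static boolean isOnlyRegion(String startKey, String endKey) {
        return startKey.isEmpty() && endKey.isEmpty();
    }

    public static void main(String[] args) {
        // The .META. table's own entry in -ROOT- has this shape:
        System.out.println(regionKey(".META.", "", 1)); // prints ".META.,,1"
    }
}
```

Note how the -ROOT- key shown earlier, (.META.,,1), is just this format with an empty start key.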

3. Startup sequence:

The location of .META. is written into -ROOT-, and .META. is then updated with the new server and startcode values.

Three. Client

HBase's client, HTable, is responsible for finding the RegionServers that serve requests. It locates the right RegionServer by querying the .META. and -ROOT- dictionary tables.
Once the corresponding region is found, the client connects directly to that region's RegionServer and at the same time caches the lookup result; from then on it reads and writes without going through .META., unless the RegionServer dies, in which case it looks up the new RegionServer in .META. again.

1. Connection:

HTable is not thread-safe and is not suitable for use from multiple threads. It is generally recommended to reuse a single HBaseConfiguration instance when connecting to the database.

1) Connection pooling: there is no definitive solution. An HTablePool exists, but it is difficult to manage.
Generally, an array is created up front to hold established connections:
HConnectionManager.createConnection(Configuration)

2) WriteBuffer and the batch() method
If the client's autoFlush is set to false, written data accumulates in the writeBuffer and is flushed to the RegionServer once the buffer is full. The writeBuffer defaults to 2MB.
Note: the HTable.delete(Delete) method does not go through the writeBuffer.
For fine-grained control of the writeBuffer, see HTable's batch() method.

3) Support for non-Java clients.

4) Row-level locks: the current client exposes row-level locks; future versions will remove them.

Four. Client request filtering

Get and Scan instances can be configured with filters, but with so many filter types it is easy to get lost; understand what each filter does before reaching for it.

This part assumes some familiarity with the HBase source code and solid Java knowledge; skim it on a first reading.
1. A FilterList example:
FilterList list = new FilterList(FilterList.Operator.MUST_PASS_ONE);
SingleColumnValueFilter filter1 = new SingleColumnValueFilter(
    cf,
    column,
    CompareOp.EQUAL,
    Bytes.toBytes("my value")
);
list.add(filter1);
SingleColumnValueFilter filter2 = new SingleColumnValueFilter(
    cf,
    column,
    CompareOp.EQUAL,
    Bytes.toBytes("my other value")
);
list.add(filter2);
scan.setFilter(list);

2. Filtering on column values

SingleColumnValueFilter filters on a column's value. Comparison operators include CompareOp.EQUAL (equality), CompareOp.NOT_EQUAL (inequality), and CompareOp.GREATER (range comparisons).

Sample code:
SingleColumnValueFilter filter = new SingleColumnValueFilter(
    cf,
    column,
    CompareOp.EQUAL,
    Bytes.toBytes("my value")
);
scan.setFilter(filter);


3. Regular expression and substring comparators:

SubstringComparator comp = new SubstringComparator("y val");   // matches "my value"
SingleColumnValueFilter filter = new SingleColumnValueFilter(
    cf,
    column,
    CompareOp.EQUAL,
    comp
);
scan.setFilter(filter);

For more on filters, refer to https://hbase.apache.org/book.html#architecture, section 9.4.

Five. Master

HMaster is the management server. It monitors all RegionServer instances and provides the interface for modifying .META.. In a production environment, the HMaster is generally deployed on the NameNode.


1. Behavior at startup:

In a multi-master deployment, if the active master fails or loses its connection to ZooKeeper, a standby master takes over.

2. Runtime failure:
If the HMaster goes offline while the cluster is running, the cluster can continue for a short time, since clients talk directly to the RegionServers. But because the HMaster controls core functionality, a failed master should be restarted as soon as possible.

3. Interface
HMasterInterface provides metadata-oriented methods:

Table (createTable, modifyTable, removeTable, enable, ...)
ColumnFamily (addColumn, modifyColumn, removeColumn)
Region (move, assign, unassign)

4. Processes:
HMaster runs several background threads:
1) LoadBalancer: the load balancer process.

2) CatalogJanitor: checks and cleans the .META. table.


Six. RegionServer
The RegionServer manages regions. In a production deployment, RegionServers run on DataNodes.

1. Interface: provides data access and region management:
Data (get, put, delete, next, ...)

Region (splitRegion, compactRegion, ...)

2. Processes
RegionServer runs several background threads:

1) CompactSplitThread
Handles splits and minor compactions.

2) MajorCompactionChecker
Handles major compactions.
3) MemStoreFlusher
Periodically flushes in-memory data to disk.
4) LogRoller
Periodically checks the RegionServer's HLog.

3. Coprocessors
HBase's features live in different modules, and because Java has no multiple inheritance, code that needs several of those features at once becomes very complex to write. Drawing on the technology described in Google's paper, the Coprocessors module was developed to address this. For more information, refer to: https://blogs.apache.org/hbase/entry/coprocessor_introduction

4. Block cache
1) Design: the block cache is an LRU cache with three levels of caching priority:
A) Single access: the priority a block gets when it is first loaded from HDFS into the cache.
B) Multiple access: the priority a block is promoted to when it is accessed again after A).
C) In-memory: the priority for blocks of column families flagged as in-memory; these are evicted last.
About the cache policy: https://hbase.apache.org/xref/org/apache/hadoop/hbase/io/hfile/LruBlockCache.html

2) Usage
The block cache is enabled by default for all user tables: all read operations go through it. For some workloads it can make sense to disable caching for specific tables to improve performance. The right setting depends on the working set size (WSS).
The usable block cache size is estimated by the formula:

number of region servers * heap size * hfile.block.cache.size * 0.85

The default hfile.block.cache.size is 25% of the heap. Examples:
One region server with the default heap size of 1GB has about 217MB of block cache.
20 region servers with heap sizes of 8GB have about 34GB of block cache.
100 region servers with heap sizes of 24GB and hfile.block.cache.size raised to 0.5 have about 1TB of block cache.
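The formula above can be sketched as a small calculation. The helper name `usableCacheGB` is illustrative, not an HBase API; the 0.85 factor is the usable share of the cache stated in the formula.

```java
// Sketch of the block cache sizing formula:
// servers * heap * hfile.block.cache.size * 0.85 (usable share).
public class BlockCacheEstimate {
    static double usableCacheGB(int regionServers, double heapGB, double cacheFraction) {
        return regionServers * heapGB * cacheFraction * 0.85;
    }

    public static void main(String[] args) {
        // 1 server, 1GB heap, default fraction 0.25 -> ~217 MB
        System.out.printf("1 x 1GB, 0.25   -> %.1f MB%n", usableCacheGB(1, 1.0, 0.25) * 1024);
        // 20 servers, 8GB heaps, default fraction -> 34 GB
        System.out.printf("20 x 8GB, 0.25  -> %.1f GB%n", usableCacheGB(20, 8.0, 0.25));
        // 100 servers, 24GB heaps, fraction raised to 0.5 -> ~1 TB
        System.out.printf("100 x 24GB, 0.5 -> %.0f GB%n", usableCacheGB(100, 24.0, 0.5));
    }
}
```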

The .META. and -ROOT- dictionary tables default to the in-memory policy.
HFile index keys and Bloom filters also use the block cache.

Usage principles: know the size of the data set you intend to access.
For random access over a large data set, the block cache is not recommended: the large volume of data keeps the cache full, and the LRU policy then causes extra JVM garbage collection, costing performance.
For a table mapped in a MapReduce job, the block cache is also unnecessary, since each record is read only once.



5. WAL (the HBase log)

1) Each RegionServer records data changes in the WAL log first, then writes them to the MemStore, before they ever reach the physical store files. HLog is the WAL implementation; there is one instance per RegionServer.

2) WAL flush is not documented at this time.
3) WAL splitting
No detailed documentation available.
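The write order in 1) can be illustrated with a toy model. All names here are invented for illustration; this is not the real HLog/MemStore implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy sketch of the write path: every mutation is appended to the WAL
// first, then applied to the in-memory MemStore; the MemStore is
// flushed to an immutable "store file" once full.
public class WritePathSketch {
    final List<String> wal = new ArrayList<>();                    // append-only log
    final NavigableMap<String, String> memstore = new TreeMap<>(); // sorted in-memory image
    final List<NavigableMap<String, String>> storeFiles = new ArrayList<>();
    final int flushThreshold;

    WritePathSketch(int flushThreshold) { this.flushThreshold = flushThreshold; }

    void put(String rowKey, String value) {
        wal.add(rowKey + "=" + value);   // 1. durability first: WAL
        memstore.put(rowKey, value);     // 2. then the MemStore
        if (memstore.size() >= flushThreshold) flush();
    }

    void flush() {                       // 3. MemStore -> store file
        storeFiles.add(new TreeMap<>(memstore));
        memstore.clear();
    }
}
```

The point of the ordering is recoverability: if the server dies before a flush, the WAL still contains every mutation the MemStore held.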

Seven. Regions

Regions are the basic element of availability and distribution for tables. Each column family has its own Store (corresponding to the data files in the usual sense). The hierarchy:

Table       (HBase table)
    Region      (regions for the table)
        Store       (store per ColumnFamily for each region of the table)
            MemStore    (one MemStore per Store)
            StoreFile   (StoreFiles for each Store)
                Block       (blocks within a StoreFile)


The storage structure of HBase on HDFS:
/hbase
    /<Table>            (tables in the cluster)
        /<Region>           (regions for the table)
            /<ColumnFamily>     (column families for the region)
                /<StoreFile>        (store files for the column family)


The WAL log storage structure on HDFS:
/hbase
    /.logs
        /<RegionServer>     (RegionServers)
            /


1. Region size
Determining a suitable region size is difficult; several considerations apply:

1) HBase scales across servers via regions, but if you have only 2 regions of 16GB each in a 20-node cluster, just a couple of servers hold data while the rest sit idle, wasting resources.
2) On the other hand, a high region count slows overall operation: for the same amount of data, 700 regions perform better than 3,000.
3) In terms of RegionServer memory overhead, there is little difference between holding 1 region and holding several.


2. Region-RegionServer assignment (how regions are distributed across RegionServers)

1) Startup process:
A) The master invokes the AssignmentManager.
B) The AssignmentManager looks at the existing region assignments recorded in .META..
C) If an assignment is still valid, it is kept.
D) If an assignment is invalid, the LoadBalancerFactory is invoked to reassign the region; by default, regions are assigned randomly across all RegionServers.
E) .META. is updated with the region's new assignment.

2) Failover
When a RegionServer fails:
A) Its regions immediately become unavailable.
B) The HMaster detects the RegionServer failure.
C) The region assignments are considered invalid, and reassignment is initiated.

3) Load balancing
Regions can be moved between RegionServers periodically by the LoadBalancer.

3. Region-RegionServer locality
Over time, a region's data becomes local to its RegionServer, because HDFS replication works as follows:
A) the first replica is written to the local node,
B) the second replica is written to another machine in the same rack,
C) the third replica is written to a machine in another rack.
Thus HBase achieves locality over time. After a RegionServer failover, a region may be reassigned to a node holding none of its data, but as new data is written and files are rewritten, locality is restored through the process above.

4. Region splits
It is often best not to let HBase perform region splits on its own. The controlling parameter is hbase.hregion.max.filesize: to effectively disable automatic splits, set it to a large value, around 100GB. Be aware that if you then forget to split manually, a major compaction of a 100GB store file can take on the order of an hour.

Since 0.94, a manual split policy can override the global one; the split policy can be set globally or per table.
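As a sketch, the hbase-site.xml entry implied above might look like this (the exact byte value is illustrative: 100GB expressed in bytes):

```xml
<!-- hbase-site.xml: raise the split threshold so high that automatic
     splits effectively never fire; regions are then split manually. -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- 100 GB = 100 * 2^30 bytes -->
  <value>107374182400</value>
</property>
```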

5. Online region merge

Merges are executed jointly by the Master and the RegionServers.
Example (HBase shell):
hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME'
hbase> merge_region 'ENCODED_REGIONNAME', 'ENCODED_REGIONNAME', true


6. Store
A Store consists of one MemStore and zero or more StoreFiles (HFiles). Each Store corresponds to one column family.

1) The MemStore holds the in-memory image of modifications: changes are applied in the MemStore, and a flush is then requested to write the data to a StoreFile.
2) StoreFiles are the data files in which HBase stores data, in the usual sense.
A) HFile format:
Reference: https://hbase.apache.org/book.html#hfilev2

B) HFile tool, for viewing HFile content:
$ ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.io.hfile.HFile -v -f hdfs://10.81.47.41:8020/hbase/test/1418428042/dsmp/4759508618286845475

C) For StoreFile placement in HDFS, see the HBase storage structure on HDFS shown earlier.

3) Data blocks
StoreFiles are made up of blocks; the block size can be configured per column family.
Data compression is done at the block level.

4) KeyValue pairs
The KeyValue storage format:
A) keylength
B) valuelength
C) key
D) value
The structure of the key:

A) rowlength
B) row (i.e., the row key)
C) columnfamilylength
D) columnfamily
E) columnqualifier
F) timestamp
G) keytype (i.e., Put, Delete, DeleteColumn, DeleteFamily)
Sample data:
Put #1: rowkey=row1, cf:attr1=value1
Put #2: rowkey=row1, cf:attr2=value2

Writing these two rows to HBase produces the following key breakdowns:

For Put #1:
rowlength -----------> 4
row -----------------> row1
columnfamilylength --> 2
columnfamily --------> cf
columnqualifier -----> attr1
timestamp -----------> server time of the Put
keytype -------------> Put

For Put #2:

rowlength -----------> 4
row -----------------> row1
columnfamilylength --> 2
columnfamily --------> cf
columnqualifier -----> attr2
timestamp -----------> server time of the Put
keytype -------------> Put
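The key layout above can be sketched by packing the fields into a byte array. This is a simplified illustration, not HBase code; the field widths used (2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte type) follow the KeyValue key layout, but the class is invented.

```java
import java.nio.ByteBuffer;

// Sketch of the KeyValue *key* layout described above:
// rowlength | row | columnfamilylength | columnfamily |
// columnqualifier | timestamp | keytype
public class KeyValueSketch {
    static byte[] encodeKey(String row, String family, String qualifier,
                            long timestamp, byte type) {
        byte[] r = row.getBytes();
        byte[] f = family.getBytes();
        byte[] q = qualifier.getBytes();
        ByteBuffer b = ByteBuffer.allocate(2 + r.length + 1 + f.length
                                           + q.length + 8 + 1);
        b.putShort((short) r.length); // rowlength (2 bytes)
        b.put(r);                     // row (the row key)
        b.put((byte) f.length);       // columnfamilylength (1 byte)
        b.put(f);                     // columnfamily
        b.put(q);                     // columnqualifier
        b.putLong(timestamp);         // timestamp (8 bytes)
        b.put(type);                  // keytype (1 byte, e.g. 4 = Put)
        return b.array();
    }

    public static void main(String[] args) {
        // Put #1 above: row1 / cf / attr1
        byte[] key = encodeKey("row1", "cf", "attr1",
                               System.currentTimeMillis(), (byte) 4);
        // 2 + 4 + 1 + 2 + 5 + 8 + 1 = 23 bytes
        System.out.println(key.length); // prints 23
    }
}
```

Put #1 and Put #2 differ only in the columnqualifier bytes (attr1 vs attr2) and the timestamp; everything else in the key is identical.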




5) Compaction
This part is not covered in depth here; refer to the documentation:
https://hbase.apache.org/book.html#architecture, section 9.7.6.5.

Eight. Bulk import (bulk load)
Bulk loading is a tool and process built on the HBase file format:
write the data you want to import into files in HBase's own format, then use the tool to register those files in .META..

This section has not been tried hands-on here; refer to https://hbase.apache.org/book.html#architecture, section 9.8.
