[Reprinted] HBase System Architecture

Source: Internet
Author: User

HBase is the Apache Hadoop database. It provides random, real-time read/write access to very large data sets and is designed to store and process them. HBase is an open-source, distributed, multi-version, column-oriented storage system for loosely structured data.

HBase features:

1. High reliability

2. High performance

3. Column orientation

4. Scalability

5. A large-scale structured storage cluster can be built on inexpensive PC servers.

HBase is an open-source implementation of Google Bigtable. The two stacks correspond as follows:

                                    Google       HBase
File storage system                 GFS          HDFS
Massive data processing             MapReduce    Hadoop MapReduce
Collaborative service management    Chubby       ZooKeeper


HBase relationship diagram:

HBase sits at the structured storage layer of the Hadoop ecosystem. The other Hadoop components support HBase as follows:

Component     Function
HDFS          Highly reliable underlying storage
MapReduce     High-performance computing capability
ZooKeeper     Stable coordination service and failover mechanism
Pig & Hive    High-level language support for data statistics
Sqoop         RDBMS data import, which eases migration from traditional databases to HBase

HBase access interfaces

Interface          Features and typical use
Native Java API    The most common and efficient access method; suited to Hadoop MapReduce jobs that process HBase table data in parallel.
HBase Shell        The simplest interface; suited to HBase management.
Thrift Gateway     Uses Thrift serialization and supports multiple languages; suited to online access to HBase table data from heterogeneous systems.
REST Gateway       Exposes a REST-style HTTP API, which removes language restrictions.
Pig                Processes data with the Pig Latin programming language; suited to data statistics.
Hive               Simple, SQL-like access.

HBase Data Model

Component description:

Row key: the primary key of the table. Records in a table are sorted by row key.
Timestamp: the timestamp of each data operation, which serves as the version number of the data.
Column family: a table consists of one or more column families in the horizontal direction. A column family can contain any number of columns and supports dynamic expansion: you do not need to declare the number or types of columns in advance. All columns are stored in binary form, so the user performs any type conversion.
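The data model above can be pictured as a nested, sorted map. The following is only a toy in-memory sketch, not HBase code: the `Table` class, its method names, and the "family:qualifier" string convention are assumptions made for illustration.

```python
from collections import defaultdict


class Table:
    """Toy model: row key -> 'family:qualifier' -> {timestamp: value bytes}."""

    def __init__(self, families):
        # Column families are declared up front; columns inside them are not.
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, value, ts):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Values are opaque bytes; the caller handles type conversion.
        self.rows[row][column][ts] = value

    def get(self, row, column, ts=None):
        """Return the newest version at or before ts (newest overall if ts is None)."""
        versions = self.rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if ts is None or t <= ts]
        return versions[max(candidates)] if candidates else None

    def scan(self):
        """Yield row keys in sorted order, as records are sorted by row key."""
        for row in sorted(self.rows):
            yield row


table = Table(["info"])
table.put(b"row2", "info:name", b"bob", ts=1)
table.put(b"row1", "info:name", b"alice", ts=1)
table.put(b"row1", "info:name", b"alicia", ts=2)  # a newer version of the same cell
```

Reading `info:name` for `row1` without a timestamp returns the newest version, while passing `ts=1` returns the older one, which is how the timestamp acts as a version number.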

Table & Region

1. As the number of records grows, a table is automatically split into multiple parts, and each part becomes a region.
2. A region is identified by the half-open key range [startkey, endkey).
3. The master assigns different regions to region servers, which manage them.
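Because regions cover contiguous half-open key ranges, finding the region for a row key is a search over sorted start keys. The sketch below illustrates that idea with a binary search; `RegionMap`, the split keys, and the server names are hypothetical and not part of HBase.

```python
import bisect


class RegionMap:
    """Toy region directory: region i covers [starts[i], starts[i + 1])."""

    def __init__(self, split_keys, servers):
        # The first region starts at the empty key; the last one has no end key.
        self.starts = [b""] + sorted(split_keys)
        if len(servers) != len(self.starts):
            raise ValueError("need exactly one server per region")
        self.servers = servers  # assignment assumed to come from the master

    def locate(self, row_key):
        """Return (start_key, end_key, server) for the region holding row_key."""
        i = bisect.bisect_right(self.starts, row_key) - 1
        start = self.starts[i]
        end = self.starts[i + 1] if i + 1 < len(self.starts) else None
        return start, end, self.servers[i]


regions = RegionMap([b"g", b"p"], ["rs1", "rs2", "rs3"])
```

A key equal to a start key belongs to the region that starts there, since the start of the range is inclusive and the end is exclusive.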

Two special tables: -ROOT- and .META.

.META. records the region information of user tables. .META. itself can have multiple regions.
-ROOT- records the region information of the .META. table. -ROOT- has only one region.
The location of the -ROOT- table is recorded in ZooKeeper.
The process by which a client accesses data:
Client -> ZooKeeper -> -ROOT- -> .META. -> user data table
This takes multiple network round trips, but the client caches the locations it has looked up.
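The lookup chain and the effect of the client cache can be sketched as follows. This is a simplified simulation under stated assumptions: `FakeCluster`, `Client`, the server names, and the one-region-per-table shortcut are all hypothetical, and each dictionary read stands in for one network round trip.

```python
class FakeCluster:
    """Stand-in for ZooKeeper, -ROOT- and .META.; counts simulated round trips."""

    def __init__(self):
        self.round_trips = 0
        self.zookeeper = {"-ROOT-": "rs1"}    # where the -ROOT- region lives
        self.root_table = {".META.": "rs2"}   # where the .META. region lives
        self.meta_table = {"users": "rs3"}    # where the user table's region lives

    def read(self, directory, key):
        self.round_trips += 1
        return directory[key]


class Client:
    def __init__(self, cluster):
        self.cluster = cluster
        self.location_cache = {}

    def locate(self, table):
        # A cache hit avoids all three lookups.
        if table in self.location_cache:
            return self.location_cache[table]
        c = self.cluster
        c.read(c.zookeeper, "-ROOT-")           # step 1: ZooKeeper -> -ROOT-
        c.read(c.root_table, ".META.")          # step 2: -ROOT- -> .META.
        server = c.read(c.meta_table, table)    # step 3: .META. -> user region
        self.location_cache[table] = server
        return server


cluster = FakeCluster()
client = Client(cluster)
first = client.locate("users")
trips_after_first = cluster.round_trips
second = client.locate("users")   # served from the cache
```

In this sketch the first lookup costs three round trips and the second costs none.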

HBase System Architecture

Component description
The client uses the HBase RPC mechanism to communicate with HMaster and HRegionServer.
The client communicates with HMaster for management operations.
The client communicates with HRegionServer for data read/write operations.

The ZooKeeper quorum stores the address of the -ROOT- table and the address of HMaster.
Each HRegionServer registers itself in ZooKeeper as an ephemeral node, so HMaster can detect the health of every HRegionServer at any time.
ZooKeeper also prevents HMaster from being a single point of failure.

HMaster is not a single point of failure. Multiple HMasters can be started in HBase, and the ZooKeeper master election mechanism ensures that one master is always running.
HMaster is mainly responsible for table and region management:
1. Manages users' create, delete, update, and query operations on tables.
2. Manages load balancing across HRegionServers and adjusts the region distribution.
3. After a region split, assigns the new regions.
4. After an HRegionServer goes down, migrates the regions that were on the failed HRegionServer.
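The last responsibility, moving regions off a failed server, can be illustrated with a small sketch. The "fewest regions first" policy used here is an assumption chosen for simplicity, not HBase's actual balancer, and `reassign_regions` is a hypothetical helper.

```python
def reassign_regions(assignment, failed_server):
    """Return a new assignment with the failed server's regions moved to survivors.

    Each orphaned region goes to the surviving server that currently holds the
    fewest regions (ties broken by server name). This policy is an assumption.
    """
    survivors = {s: list(r) for s, r in assignment.items() if s != failed_server}
    if not survivors:
        raise RuntimeError("no live region servers left")
    for region in assignment.get(failed_server, []):
        target = min(survivors, key=lambda s: (len(survivors[s]), s))
        survivors[target].append(region)
    return survivors


before = {"rs1": ["r1", "r2"], "rs2": ["r3"], "rs3": ["r4", "r5", "r6"]}
after = reassign_regions(before, "rs3")
```

The input mapping is left untouched, so the caller can compare the distribution before and after the failure.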

HRegionServer is the core module of HBase. It responds to user I/O requests and reads and writes data in the HDFS file system.

HRegionServer manages a series of HRegion objects;
Each HRegion corresponds to one region of a table, and an HRegion consists of multiple HStores;
Each HStore corresponds to the storage of one column family in the table;
A column family is therefore a unit of centralized storage, so it is more efficient to put columns with the same I/O characteristics in the same column family.

HStore is the core of HBase storage. It consists of a MemStore and StoreFiles.
MemStore is a sorted memory buffer. The data write process is:

1. The client writes data, which is first saved to the MemStore.
2. When the MemStore is full, it is flushed into a StoreFile.
3. When the number of StoreFiles grows to a certain threshold, a compact (merge) operation starts: multiple StoreFiles are merged into one StoreFile, and versions are merged and deleted data is removed at the same time.
4. As StoreFiles are compacted, progressively larger StoreFiles are formed.
5. When the size of a single StoreFile exceeds a certain threshold, a split is triggered: the current region is split into two regions and the original region goes offline. HMaster assigns the two new regions to HRegionServers, so that the load of the original region is distributed across two regions.
This process shows that HBase only ever appends data; updates and deletes are actually carried out later, in the compact phase. A user write therefore only needs to reach memory before it returns, which ensures high I/O performance.
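The flush and compact steps of the write path can be sketched in a few lines. This is a toy model, not HBase code: the `Store` class, the tiny thresholds, the tombstone marker, and the choice to keep only the newest version of each row during compaction are all assumptions made for illustration.

```python
TOMBSTONE = object()  # hypothetical marker recording a delete


class Store:
    def __init__(self, flush_size=2, compact_threshold=2):
        self.flush_size = flush_size
        self.compact_threshold = compact_threshold
        self.memstore = {}      # (row, timestamp) -> value
        self.storefiles = []    # each file: immutable sorted list of ((row, ts), value)

    def put(self, row, value, ts):
        # Writes only append; nothing on "disk" is modified in place.
        self.memstore[(row, ts)] = value
        if len(self.memstore) >= self.flush_size:
            self.flush()

    def delete(self, row, ts):
        # A delete is just another write, carrying a tombstone.
        self.put(row, TOMBSTONE, ts)

    def flush(self):
        if not self.memstore:
            return
        self.storefiles.append(sorted(self.memstore.items(), key=lambda kv: kv[0]))
        self.memstore = {}
        if len(self.storefiles) >= self.compact_threshold:
            self.compact()

    def compact(self):
        # Merge all files, keep the newest version per row, drop deleted rows.
        latest = {}
        for storefile in self.storefiles:
            for (row, ts), value in storefile:
                if row not in latest or ts > latest[row][0]:
                    latest[row] = (ts, value)
        merged = [((row, ts), value)
                  for row, (ts, value) in sorted(latest.items(), key=lambda kv: kv[0])
                  if value is not TOMBSTONE]
        self.storefiles = [merged]

    def get(self, row):
        best = None
        for source in [list(self.memstore.items())] + self.storefiles:
            for (r, ts), value in source:
                if r == row and (best is None or ts > best[0]):
                    best = (ts, value)
        if best is None or best[1] is TOMBSTONE:
            return None
        return best[1]


store = Store(flush_size=2, compact_threshold=2)
store.put("a", b"1", ts=1)
store.put("b", b"2", ts=2)     # MemStore full: flushed into the first StoreFile
store.put("a", b"3", ts=3)     # newer version of row "a", still in the MemStore
store.delete("b", ts=4)        # second flush, which triggers a compaction
```

After the compaction in this example a single StoreFile remains, holding only the newest version of row "a"; the deleted row "b" is physically gone only at that point.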

Reasons for introducing HLog:
In a distributed environment, system errors and downtime cannot be avoided. If an HRegionServer exits unexpectedly, the in-memory data in its MemStore is lost. HLog is introduced to prevent this.
Working mechanism:
Each HRegionServer has one HLog object. HLog is a class that implements a write-ahead log (WAL). Each time a user operation is written to the MemStore, a copy of the data is also written to the HLog file. The HLog file is rolled periodically, and old files (whose data has already been persisted to StoreFiles) are deleted. When an HRegionServer terminates unexpectedly, HMaster detects this through ZooKeeper. HMaster first processes the leftover HLog files, splitting the log data of different regions into the corresponding region directories, and then reassigns the affected regions. While loading a region, the HRegionServer that receives it finds that there is a historical HLog to process, so it replays the HLog data into the MemStore and then flushes it to StoreFiles, which completes data recovery.
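The recovery mechanism (one log per server, split by region, then replayed) can be sketched as below. `RegionServer`, `split_log`, and `replay` are hypothetical names, and the log is an in-memory list rather than a file on HDFS.

```python
from collections import defaultdict


class RegionServer:
    def __init__(self):
        self.hlog = []                        # one shared log for all regions on this server
        self.memstores = defaultdict(dict)    # region -> {row: value}
        self.next_seq = 0

    def put(self, region, row, value):
        # Log first, then apply to memory, so a crash cannot lose an acknowledged write.
        self.hlog.append((self.next_seq, region, row, value))
        self.next_seq += 1
        self.memstores[region][row] = value


def split_log(hlog):
    """Group log entries by region, as done with the leftover log of a failed server."""
    per_region = defaultdict(list)
    for seq, region, row, value in hlog:
        per_region[region].append((seq, row, value))
    return per_region


def replay(edits):
    """Rebuild one region's MemStore by applying its edits in sequence order."""
    memstore = {}
    for _seq, row, value in sorted(edits, key=lambda e: e[0]):
        memstore[row] = value
    return memstore


server = RegionServer()
server.put("region1", "a", b"1")
server.put("region2", "x", b"9")
server.put("region1", "a", b"2")

# Simulate a crash: only the log survives, the MemStores are gone.
surviving_log = list(server.hlog)
recovered = {region: replay(edits) for region, edits in split_log(surviving_log).items()}
```

Replaying in sequence order matters: the later write to row "a" must win, so the recovered MemStore matches what was in memory before the crash.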

HBase Storage Format
All HBase data files are stored in the Hadoop HDFS file system. There are two main file formats:
1. HFile: the storage format for KeyValue data in HBase. HFile is a Hadoop binary file format. A StoreFile is in fact a lightweight wrapper around an HFile; that is, the underlying layer of a StoreFile is an HFile.
2. HLog file: the storage format of the WAL (write-ahead log) in HBase. Physically, it is a Hadoop SequenceFile.


Image explanation:
HFile files are of variable length. Only two blocks have a fixed length: Trailer and File Info.
The Trailer contains pointers to the starting points of the other data blocks.
File Info records meta information about the file, such as avg_key_len, avg_value_len, last_key, comparator, and max_seq_id_key.
The Data Index and Meta Index blocks record the starting point of each data block and meta block.
The data block is the basic unit of HBase I/O. To improve efficiency, HRegionServer has an LRU-based block cache.
The size of each data block can be specified by a parameter when a table is created. Large blocks favor sequential scans, and small blocks favor random queries.
Apart from the Magic at the beginning, each data block is a concatenation of KeyValue pairs. The Magic content is a random number used to guard against data corruption.
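The LRU-based block cache mentioned above can be illustrated with a minimal sketch built on `collections.OrderedDict`. The `BlockCache` class, its capacity measured in blocks, and the `load_block` callback are assumptions for illustration; HBase's real cache is more elaborate.

```python
from collections import OrderedDict


class BlockCache:
    """Toy LRU cache keyed by block id; evicts the least recently used block."""

    def __init__(self, capacity, load_block):
        self.capacity = capacity
        self.load_block = load_block    # hypothetical callback that reads a block from disk
        self.blocks = OrderedDict()     # oldest entry first, most recently used last
        self.hits = 0
        self.misses = 0

    def get(self, block_id):
        if block_id in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_id)   # mark as most recently used
            return self.blocks[block_id]
        self.misses += 1
        data = self.load_block(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # drop the least recently used block
        return data


cache = BlockCache(2, lambda block_id: f"data-{block_id}".encode())
cache.get(1)
cache.get(2)
cache.get(1)   # hit: block 1 becomes the most recently used
cache.get(3)   # miss: cache is full, so block 2 is evicted
```

Block 2 is evicted rather than block 1 because block 1 was touched more recently, which is the behavior that makes repeated random reads of hot blocks cheap.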

Each KeyValue pair in an HFile is a simple byte array, but this byte array contains many fields and has a fixed structure.

Key length and value length: two fixed-length values that give the length of the key and the value, respectively.
Key part: row length is a fixed-length value that gives the length of the row key, and it is followed by the row key itself.
Column family length is a fixed-length value that gives the length of the family.
It is followed by the column family, then the qualifier, and then two fixed-length values: the timestamp and the key type (put/delete).
The value part has no such complex structure; it is pure binary data.
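The field order described above can be made concrete with a small encode/decode sketch using Python's `struct` module. The text does not give field widths, so the ones used here (4-byte key and value lengths, 2-byte row length, 1-byte family length, 8-byte timestamp, 1-byte key type) and the numeric type codes are assumptions for illustration only.

```python
import struct

PUT, DELETE = 4, 8  # assumed key type codes, for illustration only


def encode_keyvalue(row, family, qualifier, timestamp, key_type, value):
    """Serialize one KeyValue in the field order described in the text."""
    key = (struct.pack(">H", len(row)) + row             # row length, row
           + struct.pack(">B", len(family)) + family     # family length, family
           + qualifier                                   # qualifier (no length field)
           + struct.pack(">qB", timestamp, key_type))    # timestamp, key type
    return struct.pack(">II", len(key), len(value)) + key + value


def decode_keyvalue(buf):
    """Parse the bytes produced by encode_keyvalue back into its fields."""
    key_len, value_len = struct.unpack_from(">II", buf, 0)
    pos = 8
    key_end = pos + key_len
    (row_len,) = struct.unpack_from(">H", buf, pos)
    pos += 2
    row = buf[pos:pos + row_len]
    pos += row_len
    (family_len,) = struct.unpack_from(">B", buf, pos)
    pos += 1
    family = buf[pos:pos + family_len]
    pos += family_len
    # The qualifier is whatever remains before the 9 trailing bytes of the key.
    qualifier = buf[pos:key_end - 9]
    timestamp, key_type = struct.unpack_from(">qB", buf, key_end - 9)
    value = buf[key_end:key_end + value_len]
    return row, family, qualifier, timestamp, key_type, value


encoded = encode_keyvalue(b"row1", b"cf", b"name", 1234567890, PUT, b"alice")
decoded = decode_keyvalue(encoded)
```

Note that the qualifier needs no length field of its own in this layout: its length follows from the overall key length once the other fixed-size fields are accounted for.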

HLog File

The HLog file is an ordinary Hadoop SequenceFile. The key of the SequenceFile is an HLogKey object, which records the ownership of the written data: in addition to the table and region names, it includes a sequence number and a timestamp. The timestamp is the write time. The sequence number starts at 0, or at the last sequence number stored in the file system.
The value of the HLog SequenceFile is the HBase KeyValue object, which corresponds to the KeyValue in the HFile.

