How HBase stores data

Last Update:2018-07-26 Source: Internet

Author: User


1. HBase is a scalable, highly reliable, column-oriented open-source database. Unlike a traditional relational database, it uses the Bigtable data model and is well suited to storing unstructured data. HBase originated as a sub-project of the Apache Hadoop project. Its data model is an enhanced sparse, sorted map (key/value) whose keys consist of a row key, a column key, and a timestamp. HBase provides random, real-time read and write access to large-scale data. Data stored in HBase can be processed with MapReduce, which combines data storage and parallel computation well.

Data model: Schema --> Table --> Column Family --> RowKey --> Timestamp --> Value
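The key/value view described above can be sketched in plain Python. This is an illustrative model only, not the HBase client API; the names `table`, `put`, and `cells_for_row` are hypothetical, and the sample rows are invented for the example.

```python
# Illustrative model only: plain Python, not the HBase client API.
# A table is viewed as a map from (row key, column key, timestamp) to a value.
table = {}

def put(row, column, timestamp, value):
    """Store one cell under its three-part key."""
    table[(row, column, timestamp)] = value

def cells_for_row(row):
    """Return all (column, timestamp, value) cells that belong to one row."""
    return sorted(
        (column, timestamp, value)
        for (r, column, timestamp), value in table.items()
        if r == row
    )

put(b"row1", b"info:name", 1, b"alice")
put(b"row1", b"info:email", 1, b"alice@example.com")
put(b"row2", b"info:name", 1, b"bob")

row1_cells = cells_for_row(b"row1")
```

Each value is addressed by the full combination of row key, column key, and timestamp, which is why the same row can hold any number of columns.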

2. Characteristics of HBase tables

Large: A table can have billions of rows and millions of columns;

Schemaless: Each row has a sortable primary key and any number of columns. Columns can be added dynamically as needed, and different rows in the same table can have different columns;

Column-oriented: Data is stored by column (family), and columns can be retrieved independently;

Sparse: Empty columns occupy no storage space, so a table can be designed to be very sparse;

Single data type: All data in HBase is stored as uninterpreted bytes (strings); there are no data types.
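A minimal sketch of the schemaless and sparse properties listed above, assuming a plain dictionary per row. This is not how HBase lays data out on disk; the row keys and column names are made up for illustration.

```python
# Illustrative model only: each row stores just the columns it actually has.
rows = {
    b"user1": {b"info:name": b"alice", b"info:email": b"alice@example.com"},
    b"user2": {b"info:name": b"bob", b"stats:logins": b"17"},
}

# A column can be added to a single row at any time; other rows are untouched.
rows[b"user1"][b"stats:last_seen"] = b"2018-07-26"

# Compare the cells actually stored with a dense rows-by-columns grid.
all_columns = sorted({column for columns in rows.values() for column in columns})
stored_cells = sum(len(columns) for columns in rows.values())
dense_cells = len(rows) * len(all_columns)
```

Here the two rows share only one column, and the missing combinations are simply absent rather than stored as empty values.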

HBase Basic Concepts

RowKey: A byte array that serves as the "primary key" of each record in the table;

Column Family: A named (string) group that contains one or more related columns;

Column: Belongs to a column family and is written as familyName:columnName; columns can be added dynamically to each record;

Version Number: A long; it defaults to the system timestamp and can be set by the user;

Value (Cell): A byte array.
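The way these concepts fit together can be sketched as follows. This is a hypothetical in-memory model, not the HBase API: `put` and `get_latest` are invented helper names, and using milliseconds since the epoch for the default version is an assumption of this sketch.

```python
import time

# Illustrative model only: (row key, b"family:qualifier") -> {version: value}
store = {}

def put(row, family, qualifier, value, version=None):
    """Write one cell; the version defaults to the current system time."""
    if version is None:
        version = int(time.time() * 1000)  # assumed unit: milliseconds
    column = family + b":" + qualifier
    store.setdefault((row, column), {})[version] = value

def get_latest(row, family, qualifier):
    """Return the value with the highest version number, or None if absent."""
    versions = store.get((row, family + b":" + qualifier))
    if not versions:
        return None
    return versions[max(versions)]

put(b"row1", b"info", b"name", b"first", version=1)   # user-supplied version
put(b"row1", b"info", b"name", b"second", version=5)  # user-supplied version
latest_explicit = get_latest(b"row1", b"info", b"name")

put(b"row1", b"info", b"name", b"third")  # version taken from the system clock
latest_default = get_latest(b"row1", b"info", b"name")
```

Because the default version is a timestamp, a write without an explicit version normally becomes the newest version of the cell.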

Physical storage:

1. All rows in a table are sorted in lexicographic (dictionary) order of row key;

2. A table is split along the row direction into multiple regions.
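Because rows are sorted by row key, a region can be thought of as a contiguous range of row keys. The sketch below assumes three hypothetical regions with made-up split points and uses a binary search to find the region for a row key; it is a simplified model, not HBase's actual region lookup code.

```python
import bisect

# Hypothetical split points: region i covers keys from region_start_keys[i]
# (inclusive) up to region_start_keys[i + 1] (exclusive); the last region is
# open-ended. The empty key marks the start of the table.
region_start_keys = [b"", b"g", b"p"]

def find_region(row_key):
    """Return the index of the region whose key range contains row_key."""
    return bisect.bisect_right(region_start_keys, row_key) - 1

region_for_apple = find_region(b"apple")
region_for_g = find_region(b"g")
region_for_zebra = find_region(b"zebra")
```

Keeping the rows sorted is what makes this lookup a binary search over region boundaries rather than a scan of the table.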

HBase vs. HDFS

Both offer good fault tolerance and scalability, and both can scale out to hundreds of nodes;

HDFS is suited to batch processing scenarios;

HDFS is not suitable for incremental data processing;

HDFS does not support updating data.

HBase's three-dimensional ordered storage means that data is ordered along three dimensions: rowkey (the row's primary key), column key, and timestamp.

Rowkey: The rowkey is the primary key of a row, and HBase indexes a row only by this single rowkey. Rowkey design is critical at the application layer because it determines query efficiency. Rowkeys are stored as bytes and sorted in lexicographic (dictionary) order; for letters, this is alphabetical order. For example, given two rowkeys, rowkey1: aaa222 and rowkey2: bbb111, rowkey1 sorts before rowkey2.

Column key: The column key is the second dimension. After the data is sorted by rowkey, cells with the same rowkey are sorted by column key, also in lexicographic order.

Timestamp: The timestamp is the third dimension. It is sorted in descending order, so the most recent data comes first.
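The three-dimensional ordering can be sketched as a single sort key: rowkey ascending, then column key ascending, then timestamp descending. The cells below are invented sample data (reusing the rowkeys aaa222 and bbb111 from the text), and the sketch models only the ordering, not HBase's storage files.

```python
# Each cell is (rowkey, column key, timestamp, value); sample data only.
cells = [
    (b"bbb111", b"info:name", 1, b"v1"),
    (b"aaa222", b"info:name", 1, b"old"),
    (b"aaa222", b"info:name", 3, b"new"),
    (b"aaa222", b"info:age", 2, b"30"),
]

def sort_key(cell):
    """Order by rowkey, then column key, then newest timestamp first."""
    row, column, timestamp, _value = cell
    return (row, column, -timestamp)

ordered = sorted(cells, key=sort_key)
ordered_keys = [cell[:3] for cell in ordered]

# Byte-wise comparison is not numeric: "row10" sorts before "row9".
row10_before_row9 = b"row10" < b"row9"
```

Since the comparison is byte-wise rather than numeric, numeric parts of a rowkey are commonly zero-padded (for example row09 and row10) when numeric order matters.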

This article is an English version of an article originally published in Chinese on aliyun.com and is provided for information purposes only.
