[HBase] data model (logical structure)

Source: Internet
Author: User

HBase stores data in tables. A table consists of rows and columns, and the columns are grouped into column families. The logical view is as follows:

The following describes several key concepts:

1) Row key (RowKey)

-- The row key is a byte array, so any string can be used as the row key;

-- The rows in a table are sorted by row key and stored in the byte order of the row key;

-- All access to a table goes through the row key (lookup by a single row key, scan over a row key range, or full table scan).
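The byte ordering and the three access paths above can be sketched with a toy in-memory table. This is only an illustration, not the HBase client API: `rows`, `get`, and `scan` are hypothetical names, and the half-open range `[start, stop)` is an assumption of this sketch.

```python
from bisect import bisect_left

# Toy in-memory "table": row key (bytes) -> row value. Not the HBase API.
rows = {
    b"row2": b"b",
    b"row10": b"c",
    b"row1": b"a",
}

# Rows are kept in byte order of the row key, so b"row10" sorts before b"row2".
sorted_keys = sorted(rows)


def get(row_key):
    """Single-row access by exact row key."""
    return rows.get(row_key)


def scan(start=None, stop=None):
    """Range access over [start, stop); with no bounds this is a full table scan."""
    lo = 0 if start is None else bisect_left(sorted_keys, start)
    hi = len(sorted_keys) if stop is None else bisect_left(sorted_keys, stop)
    return [(k, rows[k]) for k in sorted_keys[lo:hi]]
```

Because the comparison is byte by byte, numeric suffixes do not sort numerically. Zero-padding them (for example `row01`, `row02`, `row10`) is one common way to make byte order match numeric order.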

2) Column family (ColumnFamily)

-- Column families (CF) must be declared when the table is defined;

-- Each CF can have one or more column members (ColumnQualifier). Column members do not need to be declared when the table is defined; new members can be added to a column family dynamically as needed;

-- Data is stored separately by CF. What is called column-oriented storage in HBase is storage separated by CF (each CF corresponds to a Store). This design is well suited to data analysis.
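A minimal sketch of these three rules, assuming a toy class rather than the real HBase admin or client API. `ToyTable`, the table name `users`, and the families `info` and `stats` are all hypothetical.

```python
class ToyTable:
    """Toy model of a table whose column families are fixed at definition time."""

    def __init__(self, name, families):
        self.name = name
        # One separate store per column family: family -> {(row, qualifier): value}
        self.stores = {family: {} for family in families}

    def put(self, row, family, qualifier, value):
        if family not in self.stores:
            raise KeyError(f"unknown column family: {family!r}")
        # Qualifiers are not declared in advance; any new one is accepted.
        self.stores[family][(row, qualifier)] = value


table = ToyTable("users", families=["info", "stats"])
table.put(b"u1", "info", b"name", b"Ada")
table.put(b"u1", "info", b"email", b"ada@example.com")  # new qualifier, no schema change
table.put(b"u1", "stats", b"logins", b"3")
```

Keeping one store per family means a read that only needs `stats` never touches the data held under `info`, which is the property the text credits for analytical workloads.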

3) Timestamp (TimeStamp)

-- Each Cell may have multiple versions, which are distinguished by timestamps.

4) Cell

-- A cell is uniquely identified by the row key, the column (family:qualifier), and the timestamp;

-- The data in a cell has no type; everything is stored as raw bytes.
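The cell coordinates and versioning described above can be pictured as a nested map. The sketch below is a toy model, not HBase code: `cells`, `put`, and `get` are hypothetical names, the timestamps are arbitrary integers, and returning the newest version when no timestamp is given is a choice made here to mirror the usual read behaviour.

```python
# Toy cell store: (row key, "family:qualifier") -> {timestamp: value}. Not the HBase API.
cells = {}


def put(row, column, timestamp, value):
    """Store one version of a cell; values are uninterpreted bytes."""
    cells.setdefault((row, column), {})[timestamp] = value


def get(row, column, timestamp=None):
    """Return the version at `timestamp`, or the newest version if none is given."""
    versions = cells.get((row, column), {})
    if not versions:
        return None
    if timestamp is None:
        timestamp = max(versions)
    return versions.get(timestamp)


put(b"u1", b"info:age", 1000, b"41")
put(b"u1", b"info:age", 2000, b"42")

# The store never interprets the bytes; decoding is the caller's job.
latest_age = int(get(b"u1", b"info:age").decode())
```

Since values are untyped, the writer and the reader have to agree on an encoding. Here the number is stored as ASCII digits purely for readability.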

5) Region

-- HBase automatically partitions a table horizontally (by row) into multiple regions, and each region stores a contiguous range of the table's rows;

-- Each table starts with a single region. As data is inserted, the region grows; when it reaches a threshold, it splits into two new regions;

-- As the number of rows in the table grows, there are more and more regions, so a complete table ends up stored across multiple regions;

-- HRegion is the smallest unit of distributed storage and load balancing in HBase. "Smallest unit" means that different HRegions can be placed on different HRegionServers, but a single HRegion is never split across multiple servers.
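The grow-then-split behaviour can be sketched as follows. This is a simplified illustration under stated assumptions: the text only says "a threshold", so a row count (`MAX_ROWS_PER_REGION`) stands in for it here, splitting at the middle key is a choice of this sketch, and `ToyRegion` and `insert` are hypothetical names.

```python
from bisect import insort

MAX_ROWS_PER_REGION = 4  # hypothetical threshold; a stand-in for a size-based limit


class ToyRegion:
    """A contiguous, sorted run of row keys."""

    def __init__(self, keys=None):
        self.keys = list(keys or [])

    def split(self):
        """Split at the middle key into two new contiguous regions."""
        mid = len(self.keys) // 2
        return ToyRegion(self.keys[:mid]), ToyRegion(self.keys[mid:])


def insert(regions, row_key):
    """Insert into the region covering row_key, splitting it if it grows too large."""
    # The covering region is the last one whose first key is <= row_key.
    index = 0
    for i, region in enumerate(regions):
        if region.keys and region.keys[0] <= row_key:
            index = i
    region = regions[index]
    insort(region.keys, row_key)
    if len(region.keys) > MAX_ROWS_PER_REGION:
        regions[index:index + 1] = region.split()


regions = [ToyRegion()]  # a table starts with a single region
for n in range(10):
    insert(regions, b"row%02d" % n)
```

Each `ToyRegion` could then be assigned to a different server as a whole, which is what "smallest unit" means above: regions move between servers, but one region is never divided among them.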
