Several questions about HBase


This article is aimed at readers who are new to HBase. I will try to answer the following questions based on my personal understanding:

  • What is HBase?
  • When to use HBase?
  • How does it differ from Hive and Pig?
  • HBase structure
  • Why is HBase fast?
  • What are common HBase operations?
  • HBase configuration and monitoring

What is HBase?

HBase is the Hadoop database: a highly reliable, high-performance, column-oriented, scalable distributed storage system. HBase can be used to build large-scale structured storage clusters on inexpensive commodity (PC) servers. Its underlying file system is HDFS, and it uses ZooKeeper to manage communication between the cluster's HMaster and each region server, to monitor the status of each region server, and to store the entry addresses of the regions.

When to use HBase?

First, let's consider the features of a traditional relational database. These generally include:

  1. Support for transactions with ACID (atomicity, consistency, isolation, durability) properties;
  2. Row-based storage;
  3. Easy-to-use SQL statements;
  4. Support for indexes and views.

Next, consider a scenario: suppose we want to build a social networking website. We might choose an easy-to-operate LAMP (Linux, Apache, MySQL, PHP) stack to build a prototype quickly. As the number of users grows, more and more people visit each day, and the pressure on the database server increases. We can add more application servers, but because those application servers all share the central database, the database's CPU and I/O load keeps rising, and this solution will not last long.

At that point, we might add slave database servers to parallelize reads and split reads from writes, since user traffic produces far more reads than writes. However, if the number of users grows rapidly and they generate more and more content, the gap between reads and writes narrows, and this solution, too, stops scaling.

The next common step is to add a cache, such as memcached, so that reads are served from memory instead of hitting the database. But a cache offers no guarantee of data consistency: when a user writes to the database, the database does not actively update the cached copy. Moreover, a cache only relieves read pressure, not write pressure, so the need for more servers and faster disks drives hardware costs up rapidly.

In addition, as the user base grows, the website's features inevitably grow too, and business features rely on SQL queries; with too much table data, join operations become slow. We are therefore forced to denormalize the database design, which in turn means stored procedures can no longer be used. And once the data is very large, indexes lose much of their effectiveness, because the indexes themselves become very large.

What then? Some people shard the database to split the data, but large-scale sharding entails a large number of copy operations and therefore a great deal of I/O overhead. So this method is not necessarily good either.

In 2003, Google published a paper called The Google File System (GFS for short). In this file system, data is stored redundantly across nodes, so that even if a server fails, data availability is not affected. However, GFS is only suitable for storing a small number of very large files, not a large number of small files, because file metadata is kept in the master node's memory: the more files there are, the greater the pressure on the master. After further research, Google published another heavyweight paper in 2006, Bigtable: A Distributed Storage System for Structured Data. HBase is the open-source implementation of Bigtable, built on HDFS (the open-source implementation of GFS), Hadoop MapReduce (the open-source implementation of MapReduce), and ZooKeeper (the open-source implementation of Chubby).

So when would you use HBase? HBase can replace a relational database in the following situations:

  1. The system needs to handle many different data formats and data sources, the schema cannot be strictly defined in advance, and large volumes of data must be processed;
  2. Relationships between the data are not emphasized; the data to be stored is semi-structured or unstructured;
  3. The data is sparse;
  4. Better scalability is needed.

For example, Google uses Bigtable to store the index data for web pages, and that index data meets the requirements above.

How does HBase differ from Hive and Pig?

  1. HBase is low-latency, unstructured, and programming-oriented, while Hive is high-latency, structured, and analysis-oriented;
  2. Hive itself does not store or compute data; it depends entirely on HDFS and MapReduce. Tables in Hive are purely logical;
  3. HBase organizes the memory of all the machines in the cluster to provide, in effect, one large in-memory hash table; it maintains its own data structures on disk and in memory, and a table in HBase is a physical table;
  4. Use Hive + Hadoop for full-table scans, and HBase + Hadoop for indexed access;
  5. Hive is mainly suited to static structures and work that requires frequent analysis;
  6. Compared with coding directly against the Hadoop Java APIs, Pig is relatively lightweight; its main advantage is that it greatly reduces the amount of code;
  7. Both Hive and Pig can be used in combination with HBase; they provide high-level language support for HBase, making it easy to do statistical processing over HBase data.

HBase Structure

1) Tables, rows, columns, and cells

First, a quick summary: the most basic unit is the column. One or more columns form a row, which is addressed by a unique row key. A table has many rows; each column may have multiple versions, and each version's value is stored in its own cell.

Rows in a table are sorted lexicographically by row key. A row key is unique and appears only once in a table; writing to an existing row key updates that row. The row key can be any byte array. A row consists of several columns, and columns are grouped into column families. All columns of a column family are stored in the same underlying storage file, called an HFile.

Column families must be defined when the table is created, and their number should be kept small. Column family names must consist of printable characters. Columns themselves do not need to be defined at table creation. A column is referenced as family:qualifier, where the qualifier can be any byte array. Qualifier names within a column family should be unique; writing to an existing qualifier updates that column. There is no limit on the number of columns: a column family can hold millions of them. Column values have no type or length restriction. HBase does check the length of the row key, which is stored in a 2-byte field and is therefore limited to 32,767 bytes.

(Figure: a sample HBase table, with row keys down the side, column-family:qualifier columns across the top, and timestamped versions inside each cell.)

Timestamp indicates the version timestamp. It is assigned by the system by default, but you can also set it explicitly; different timestamps distinguish different versions. The versions of a cell are sorted by timestamp in descending order, so the latest value is read first. You can specify the maximum number of versions each cell keeps; in HBase 0.96 the default is 1.
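
To make this concrete, here is a minimal sketch, assuming the 0.96-era HBase Java client API, of creating a table whose column family keeps up to three versions per cell; the table name ("webtable") and family name ("contents") are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    public class CreateTableExample {
        public static void main(String[] args) throws Exception {
            // Reads hbase-site.xml / hbase-default.xml from the classpath.
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);

            // Column families must be declared up front; individual columns need not be.
            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("webtable"));
            HColumnDescriptor family = new HColumnDescriptor("contents");
            family.setMaxVersions(3); // keep up to 3 timestamped versions per cell (0.96 default is 1)
            desc.addFamily(family);

            admin.createTable(desc);
            admin.close();
        }
    }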

HBase's access model is (table, row key, column family, column qualifier, timestamp) -> value; that is, this five-part coordinate uniquely identifies a value in a table.

Access to a row's data is atomic, and any number of its columns can be read together. Cross-row and cross-table transactions are currently not supported.

Data in the same column family is compressed together, and access control as well as disk and memory accounting are all performed at the column-family level.
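
The (table, row key, column family, qualifier, timestamp) -> value model described above looks like this in client code. This is a minimal sketch against the hypothetical "webtable" from the previous example; the row key and values are likewise made up:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PutGetExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "webtable");

            byte[] row = Bytes.toBytes("com.example.www"); // row key: any byte array
            byte[] family = Bytes.toBytes("contents");
            byte[] qualifier = Bytes.toBytes("html");

            // Two writes to the same cell with explicit timestamps -> two versions.
            Put p1 = new Put(row);
            p1.add(family, qualifier, 1L, Bytes.toBytes("<html>v1</html>"));
            table.put(p1);
            Put p2 = new Put(row);
            p2.add(family, qualifier, 2L, Bytes.toBytes("<html>v2</html>"));
            table.put(p2);

            // Read back up to 3 versions; they come back newest first.
            Get g = new Get(row);
            g.addColumn(family, qualifier);
            g.setMaxVersions(3);
            Result r = table.get(g);
            System.out.println(r.size() + " versions; latest = "
                    + Bytes.toString(r.getValue(family, qualifier)));

            table.close();
        }
    }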

2) automatic partitioning

The basic unit of scaling and load balancing in HBase is called a region, which is essentially a continuous range of storage space sorted by row key. If a region grows too large, the system dynamically splits it; conversely, regions can be merged to reduce the number of storage files.

A table starts with a single region. As data is inserted, the system checks that the region does not exceed the configured maximum size. If it does, the region is split at the midpoint of its row-key range into two regions of roughly equal size.

Note that each region is served by exactly one region server, while each region server can serve many regions at once. A table is thus a logical view made up of the set of regions hosted by many region servers.


The number of regions that each server can load and the optimal size of each region depend on the effective processing capability of a single server.
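
If the row-key distribution is known in advance, a table can also be created pre-split, so that load is spread across region servers from the start instead of waiting for automatic splits. A minimal sketch (the table name, family name, and split points are hypothetical):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    public class PreSplitExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);

            HTableDescriptor desc = new HTableDescriptor(TableName.valueOf("events"));
            desc.addFamily(new HColumnDescriptor("d"));

            // Three split keys -> four initial regions:
            // (-inf, "g"), ["g", "n"), ["n", "t"), ["t", +inf)
            byte[][] splitKeys = {
                Bytes.toBytes("g"), Bytes.toBytes("n"), Bytes.toBytes("t")
            };
            admin.createTable(desc, splitKeys);
            admin.close();
        }
    }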

3) HBase Storage Format

HFile: the storage format for KeyValue data in HBase. An HFile is a Hadoop binary file.

HLog: the storage format for WAL (write-ahead log) files in HBase. Physically, it is a Hadoop SequenceFile.

The HFile format is as follows:


An HFile is variable-length; the only fixed-size parts are the file info and trailer blocks. The trailer stores pointers to the other blocks; it is written at the end of the file after the data has been persisted, at which point the file becomes an immutable data store. The key-value pairs stored in the data blocks can be thought of as a MapFile. When a block is closed, its first key is written to the index, and the index is written to the HFile when the file itself is closed.

The KeyValue format is as follows:

There are four key types: Put, Delete, DeleteColumn, and DeleteFamily. Within the key, the row-length field is 2 bytes (the row itself is variable-length), the column-family-length field is 1 byte (the family is variable-length), the column qualifier is variable-length, the timestamp is 8 bytes, and the key type is 1 byte. The qualifier's length is not recorded because it can be computed from the other fields.
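
To see why the qualifier length can be derived rather than stored, here is a small arithmetic sketch based on the fixed field widths above (a toy calculation, not HBase's actual KeyValue class):

    public class KeyValueLengths {
        // Fixed widths inside the key portion of a KeyValue:
        static final int ROW_LENGTH_SIZE = 2;    // 2-byte row length
        static final int FAMILY_LENGTH_SIZE = 1; // 1-byte family length
        static final int TIMESTAMP_SIZE = 8;     // 8-byte timestamp
        static final int TYPE_SIZE = 1;          // 1-byte key type

        // Everything left over in the key must be the qualifier.
        static int qualifierLength(int keyLength, int rowLength, int familyLength) {
            return keyLength - ROW_LENGTH_SIZE - rowLength
                    - FAMILY_LENGTH_SIZE - familyLength
                    - TIMESTAMP_SIZE - TYPE_SIZE;
        }

        public static void main(String[] args) {
            // Row "r1" (2 bytes), family "cf" (2 bytes), qualifier "q" (1 byte):
            // keyLength = 2 + 2 + 1 + 2 + 1 + 8 + 1 = 17
            System.out.println(qualifierLength(17, 2, 2)); // prints 1
        }
    }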

4) WAL (write-ahead log)

A region server keeps data in memory until enough has accumulated, then writes it to disk in one pass; this avoids producing many small files. However, if power is lost or another fault occurs, data still in memory is lost before it reaches disk. The WAL solves this problem: every update (edit) is written to the log first, and success is reported to the client only after the log write succeeds. The server then batch-processes the in-memory data as needed.

If the server crashes, the region server replays the log to restore the server to the state it was in before the crash. The write process is as follows:


All modifications are first saved to the WAL and then passed to the MemStore. The whole process works as follows (see the sketch after this list):

  • The client starts an operation that modifies data, such as a put. Each modification is encapsulated in a KeyValue object instance and sent via RPC to the region server hosting the matching region;
  • When the KeyValue instances arrive, they are routed to the HRegion instance that manages the corresponding row; the data is written to the WAL and then placed into the MemStore that actually owns the record;
  • When the MemStore reaches a certain size, or after a certain amount of time, the data is asynchronously written to the file system as an HFile;
  • If a problem occurs during the write, the WAL ensures that the data is not lost: because the WAL (HLog) is stored on HDFS, another region server can read the log file, replay the modifications, and restore the data.
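
The WAL behavior can also be influenced per write. As a minimal sketch, assuming the 0.96-era client API and the hypothetical "webtable" again, a Put can require a synced WAL write or skip the WAL entirely, trading safety for speed:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Durability;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WalDurabilityExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "webtable");

            Put safe = new Put(Bytes.toBytes("row1"));
            safe.add(Bytes.toBytes("contents"), Bytes.toBytes("html"), Bytes.toBytes("v"));
            safe.setDurability(Durability.SYNC_WAL); // sync the WAL before acknowledging
            table.put(safe);

            Put fast = new Put(Bytes.toBytes("row2"));
            fast.add(Bytes.toBytes("contents"), Bytes.toBytes("html"), Bytes.toBytes("v"));
            fast.setDurability(Durability.SKIP_WAL); // no WAL: faster, but lost if the server crashes
            table.put(fast);

            table.close();
        }
    }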

5) HBase System Architecture

The HBase architecture comprises the HBase client, ZooKeeper, the HMaster, HRegionServers, and HStore storage. Each component is described in detail below.


A) HBase Client

The HBase client communicates with the HMaster and HRegionServers through HBase's RPC mechanism. For management operations (such as creating and deleting tables), the client performs RPC with the HMaster; for data reads and writes, it performs RPC with the HRegionServer.

B) ZooKeeper

ZooKeeper is a distributed, open-source coordination service for distributed applications; it lets distributed applications implement synchronization, configuration maintenance, and naming services. It is the open-source implementation of Chubby.

Besides storing the address of the -ROOT- table and the HMaster's address, the ZooKeeper quorum also tracks the region servers: each HRegionServer registers itself with ZooKeeper so that the HMaster can monitor every region server's health at any time.

C) HMaster

  • Handles requests to create, delete, modify, and query tables;
  • Manages load balancing across HRegionServers and adjusts the distribution of regions;
  • Assigns the new regions after a region split;
  • Migrates the regions of a failed HRegionServer to other servers.

D) HRegionServer

  • Mainly responsible for serving user I/O requests and reading and writing data to the HDFS file system; it is the core module of HBase;
  • When a user updates data, the update is routed to the corresponding HRegion server to be committed. The change is written both to the MemStore write cache and to the server's HLog file; only after the operation has been written to the HLog does the commit() call return to the client;
  • When reading data, the HRegion server first checks the BlockCache read cache; if the data is not there, it falls back to the HStores on disk. There is one HStore per column family, and each HStore contains many HStoreFile files (see the sketch after this list).
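
From the client side, the read caches can be worked with explicitly. A minimal sketch (hypothetical table and family): a scan for a one-off batch job can decline to populate the BlockCache so it does not evict hot data:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.util.Bytes;

    public class ScanCacheExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "webtable");

            // Scan the row-key range ["com.a", "com.z").
            Scan scan = new Scan(Bytes.toBytes("com.a"), Bytes.toBytes("com.z"));
            scan.addFamily(Bytes.toBytes("contents"));
            scan.setCaching(100);       // rows fetched per RPC to the region server
            scan.setCacheBlocks(false); // don't fill the BlockCache with one-off scan data

            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            } finally {
                scanner.close();
                table.close();
            }
        }
    }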

E) Special Tables

-ROOT- and .META. are two special tables. .META. records the region information of user tables and can itself span multiple regions. -ROOT- records the region information of the .META. table; -ROOT- has only one region, and ZooKeeper records the location of the -ROOT- table.


Why is HBase fast?

HBase can provide real-time query service mainly because of its architecture and underlying data structures: an LSM-tree (log-structured merge-tree), region-partitioned tables, and caching. The client can directly locate the HRegion server holding the data it wants, search for the matching data within a single region on that server, and benefit from the cache along the way.

As mentioned above, HBase first keeps written data in memory, where it is kept sorted. When the memory buffer is full, it is flushed to an HFile, whose contents are likewise sorted. Once the data has been written to the HFile, the in-memory copy can be discarded.

HFiles are stored page by page and optimized for sequential disk reads. Flushing merges multiple in-memory blocks onto disk: each merge writes a new result block, and over time multiple blocks are merged into ever larger ones.


After many flushes, many small files accumulate, so a background thread compacts the small files into large ones; a disk lookup is then confined to a handful of storage files. HBase writes are fast because data is not written to its final file immediately: it is first written to memory and only later flushed asynchronously to an HFile, so from the client's point of view the write completes very quickly. In addition, random writes are turned into sequential writes during this process, so write throughput is also stable.

Reads are fast because HBase uses an LSM-tree structure rather than a B or B+ tree: sequential disk reads are very fast, whereas seeking between tracks is much slower. HBase's storage structure keeps the number of disk seeks within a predictable range, and reading any number of records contiguous with the queried row key incurs no extra seek overhead. For example, with five storage files, at most five seeks are needed, whereas a relational database cannot bound the number of seeks even with an index. In addition, an HBase read first checks the BlockCache, which uses an LRU (least recently used) eviction policy; if the data is not in the cache, it checks the in-memory MemStore; only when neither holds the data is the HFile consulted, and as noted above, reading an HFile is fast because it avoids seek overhead.
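
To illustrate the LSM idea in isolation, here is a toy sketch (deliberately simplified; not HBase's actual implementation, and it omits compaction): writes land in a sorted in-memory map; when it fills up, the map is flushed as an immutable sorted "file"; reads check the memtable first, then the files from newest to oldest, with at most one lookup per file:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.TreeMap;

    // Toy LSM store: a sorted memtable plus immutable sorted "files" (newest first).
    public class ToyLsm {
        private static final int FLUSH_THRESHOLD = 4;
        private TreeMap<String, String> memtable = new TreeMap<>();
        private final List<TreeMap<String, String>> files = new ArrayList<>();

        public void put(String key, String value) {
            memtable.put(key, value);    // random writes become sorted in memory
            if (memtable.size() >= FLUSH_THRESHOLD) {
                files.add(0, memtable);  // "flush": one sequential write of sorted data
                memtable = new TreeMap<>();
            }
        }

        public String get(String key) {
            String v = memtable.get(key);             // 1) check the memtable
            if (v != null) return v;
            for (TreeMap<String, String> f : files) { // 2) check files, newest first
                v = f.get(key);
                if (v != null) return v;              // at most one lookup per file
            }
            return null;
        }

        public static void main(String[] args) {
            ToyLsm lsm = new ToyLsm();
            for (int i = 0; i < 10; i++) lsm.put("row" + i, "v" + i);
            System.out.println(lsm.get("row3")); // prints v3
        }
    }

With five flushed files, a get touches at most five files, mirroring the bounded-seek argument above; HBase's background compaction keeps that file count small.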

Common HBase operations include the following (see the sketch after this list):

  • list;
  • create;
  • put;
  • scan;
  • get;
  • delete;
  • disable;
  • drop;
For more information, see HBase Study Notes (1).

HBase installation, configuration, and monitoring:

  1. For HBase installation, see: CentOS distributed environment installation of HBase-0.96.0;
  2. For HBase configuration, see HBase Study Notes (2), HBase Study Notes (3), and subsequent posts;
  3. For HBase monitoring, see: CentOS cluster installation of ganglia-3.6.0 to monitor hadoop-2.2.0 and hbase-0.96.0.

If reprinting, please indicate the source: http://blog.csdn.net/iAm333
