HBase is a distributed, column-oriented database built on top of the Hadoop file system. It is an open-source project and scales horizontally.
HBase's data model is similar to Google's Bigtable and is designed to provide fast random access to very large amounts of structured data. It relies on the fault tolerance provided by the Hadoop Distributed File System (HDFS).
It is part of the Hadoop ecosystem and provides random, real-time read/write access to data stored in HDFS.
Data can be stored in HDFS either directly or through HBase. Data consumers use HBase to read and randomly access the data in HDFS. HBase sits on top of the Hadoop file system and provides read and write access.
HBase and HDFS
HDFS | HBase
HDFS is a distributed file system suited to storing large files. | HBase is a database built on top of HDFS.
HDFS does not support fast lookup of individual records. | HBase provides fast lookups in large tables.
It is geared toward high-latency batch processing. | It provides low-latency access to single rows among billions of records (random access).
Its data can only be accessed sequentially. | HBase keeps its data in indexed files on HDFS, so a row can be located quickly by its key (random access).
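The difference between sequential and indexed access in the comparison above can be illustrated with a small sketch. This is not how HBase or HDFS is implemented. It only assumes a list of records kept sorted by key, and the names `records`, `sequential_lookup`, and `indexed_lookup` are hypothetical.

```python
import bisect

# Hypothetical records, kept sorted by key as they would be in an indexed file.
records = [("row-%04d" % i, "value-%d" % i) for i in range(1000)]
keys = [k for k, _ in records]


def sequential_lookup(key):
    """Scan the records in order until the key is found.

    Returns the value and the number of records that had to be read.
    """
    steps = 0
    for k, v in records:
        steps += 1
        if k == key:
            return v, steps
    return None, steps


def indexed_lookup(key):
    """Binary-search the sorted key index, then read a single record."""
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i][1]
    return None


value, steps = sequential_lookup("row-0750")
print(value, "found after reading", steps, "records")
print(indexed_lookup("row-0750"), "found through the index")
```

The sequential version reads every record up to the match, while the indexed version needs only a handful of comparisons, which is the kind of gap the table describes.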
Storage mechanism in HBase
HBase is a column-oriented database whose tables are sorted by row key. The table schema defines only column families, which hold key-value pairs. A table can have multiple column families, and each column family can have any number of columns. Successive column values within a column family are stored contiguously on disk. Each cell value in the table has a timestamp. In short, in HBase:
A table is a collection of rows.
A row is a collection of column families.
A column family is a collection of columns.
A column is a collection of key-value pairs.
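The nesting described above can be sketched as a map of maps. The following is a toy in-memory model for illustration only, not the HBase client API. The class name `ToyTable`, its methods, and the sample column families and values are all assumptions made for this example.

```python
class ToyTable:
    """Toy model: row key -> column family -> column -> {timestamp: value}."""

    def __init__(self, column_families):
        # Only the column families are fixed up front; columns are free-form.
        self.column_families = set(column_families)
        self.rows = {}

    def put(self, row_key, family, column, value, timestamp):
        """Store one versioned cell value."""
        if family not in self.column_families:
            raise KeyError("unknown column family: " + family)
        row = self.rows.setdefault(row_key, {})
        cell = row.setdefault(family, {}).setdefault(column, {})
        cell[timestamp] = value

    def get(self, row_key, family, column):
        """Return the value with the newest timestamp, or None."""
        versions = self.rows.get(row_key, {}).get(family, {}).get(column)
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self):
        """Return the row keys in sorted order, since rows are sorted by key."""
        return sorted(self.rows)


table = ToyTable(["personal", "professional"])
table.put("row2", "personal", "name", "Ravi", timestamp=1)
table.put("row1", "personal", "name", "Raju", timestamp=1)
table.put("row1", "personal", "name", "Rajesh", timestamp=2)
table.put("row1", "professional", "designation", "manager", timestamp=1)

print(table.get("row1", "personal", "name"))  # newest version of the cell
print(table.scan())                           # row keys in sorted order
```

Columns are created on first write, while writing to an undeclared column family fails, which mirrors the point that the schema fixes only the column families.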
An example schema of an HBase table is shown in the table below.
Row id | Column Family | Column Family | Column Family | Column Family
 | col1, col2, col3 | col1, col2, col3 | col1, col2, col3 | col1, col2, col3
1 | | | |
2 | | | |
3 | | | |
Column-oriented and row-oriented
A column-oriented database stores its data tables as sections of columns of data rather than as rows of data. In short, it has column families.
Row-oriented database | Column-oriented database
It is suitable for online transaction processing (OLTP). | It is suitable for online analytical processing (OLAP).
Such databases are designed for a small number of rows and columns. | Column-oriented databases are designed for very large tables.
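The two layouts compared above can be sketched with plain lists. This is a simplified illustration of the general idea, not HBase's on-disk format, and the three-row sample table is hypothetical.

```python
# Hypothetical three-row table with the columns id, name, salary.
rows = [
    (1, "Raju", 50000),
    (2, "Ravi", 40000),
    (3, "Rajesh", 25000),
]

# Row-oriented layout: all the values of one row sit together.
row_layout = [value for row in rows for value in row]

# Column-oriented layout: all the values of one column sit together.
column_layout = [value for column in zip(*rows) for value in column]

# An analytical query such as "total salary" reads one contiguous run of
# the column layout instead of every third value of the row layout.
salary_run = column_layout[2 * len(rows):3 * len(rows)]
total_salary = sum(salary_run)

print(row_layout)
print(column_layout)
print(total_salary)
```

Reading or updating one whole record is cheaper in the row layout, which is why the row-oriented form suits OLTP and the column-oriented form suits OLAP.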
[Illustration: column families in a column-oriented database]
HBase and RDBMS
HBase | RDBMS
HBase has no concept of a fixed column schema; it defines only column families. | An RDBMS is governed by its schema, which describes the whole structure of its tables and their constraints.
It is built for wide tables. HBase scales horizontally. | It is thin and built for small tables. It is hard to scale.
HBase does not provide multi-row transactions. | An RDBMS is transactional.
It holds denormalized data. | It holds normalized data.
It works well for semi-structured as well as structured data. | It works well for structured data.
Characteristics of HBase
HBase is linearly scalable.
It has automatic failover support.
It provides consistent reads and writes.
It integrates with Hadoop, both as a source and as a destination.
It has a client-friendly Java API.
It provides data replication across clusters.
Where to use HBase
Apache HBase is used for random, real-time read/write access to big data. It hosts very large tables on top of clusters of commodity hardware. Apache HBase is a non-relational database modeled after Google's Bigtable. Just as Bigtable runs on top of the Google File System, Apache HBase runs on top of Hadoop and HDFS.
Applications of HBase
HBase is used when an application is write-heavy. It is also used when fast random access to the stored data is needed. Companies such as Facebook, Twitter, Yahoo, and Adobe use HBase.