HBase Non-structured database vs. structured database

Last Update:2017-07-02 Source: Internet

Author: User


Objective: To understand the characteristics and implementation of HBase and how it supports queries over massive data sets.

Characteristics and limitations of traditional relational databases

Traditional relational databases place a strong emphasis on transactions, data integrity, and security, and this comes at the cost of availability and scalability. Under highly concurrent access they perform poorly, and Internet-scale traffic can easily cause downtime.

HBase

HBase is a column-oriented distributed storage system that scales more easily than a traditional row-oriented relational database. It supports high-performance concurrent reads and writes, and it splits data transparently, so the storage layer scales horizontally.

Data in HBase is organized mainly by row key (the primary key) and column family. Each column family can hold multiple columns, and the set of columns is extensible: a new column can be added at any time.
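
The layout described above can be pictured as a nested map from row key to column family to column to value. The following is a minimal in-memory sketch of that idea, not the HBase client API; the class `ToyTable`, the family names `info` and `contact`, and the sample rows are all hypothetical.

```python
from collections import defaultdict


class ToyTable:
    """In-memory stand-in for a table: row key -> column family -> column -> value."""

    def __init__(self, families):
        # Column families are declared once, when the table is created.
        self.families = set(families)
        self.rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, column, value):
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Columns inside a family need no prior declaration.
        self.rows[row_key][family][column] = value

    def get(self, row_key):
        # Return a plain-dict copy of the cells stored for this row.
        return {fam: dict(cols) for fam, cols in self.rows.get(row_key, {}).items()}


table = ToyTable(families=["info", "contact"])
table.put("author-001", "info", "name", "Alice")
table.put("author-002", "info", "name", "Bob")
# A new column appears on one row only; no schema change is involved.
table.put("author-002", "contact", "email", "bob@example.com")

row1 = table.get("author-001")
row2 = table.get("author-002")
```

In this sketch `row1` holds nothing at all for the `contact` family, which is the sense in which an empty column occupies no storage.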

Advantages and disadvantages of HBase

Advantages of HBase:

1 Columns can be added dynamically, and empty columns store no data, which saves storage space.

2 HBase splits data automatically, so the data store scales horizontally.

3 HBase supports highly concurrent read and write operations.
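
Automatic splitting can be sketched as follows: rows are kept sorted by row key inside regions, and a region that grows past a threshold is cut in two at its middle key. This is a simplified model under stated assumptions, not HBase's actual split policy; `ToyRegionedTable` and the threshold of four rows are hypothetical.

```python
import bisect


class ToyRegionedTable:
    """Rows sorted by key and grouped into regions that split when they grow too large."""

    def __init__(self, max_rows_per_region=4):
        self.max_rows = max_rows_per_region
        self.regions = [[]]  # each region is a sorted list of row keys
        self.values = {}

    def _region_index(self, row_key):
        # Every region after the first starts at its smallest key;
        # a row belongs to the last region whose start key is <= row_key.
        starts = [region[0] for region in self.regions[1:]]
        return bisect.bisect_right(starts, row_key)

    def put(self, row_key, value):
        i = self._region_index(row_key)
        region = self.regions[i]
        if row_key not in self.values:
            bisect.insort(region, row_key)
        self.values[row_key] = value
        if len(region) > self.max_rows:
            # Split at the middle key; both halves stay sorted.
            mid = len(region) // 2
            self.regions[i:i + 1] = [region[:mid], region[mid:]]


table = ToyRegionedTable(max_rows_per_region=4)
for n in range(10):
    table.put(f"row-{n:02d}", n)

region_sizes = [len(region) for region in table.regions]
```

In a real cluster the resulting regions would be spread across region servers, which is what makes adding machines useful; the sketch only shows the splitting step.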

Disadvantages of HBase:

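1 Rich conditional queries are not supported natively. There are no built-in secondary indexes, so efficient access is by row key or row-key range; a condition on column values requires scanning rows.

2 In early versions the master server had no failover, so the whole storage system became unavailable when the master went down. Later releases allow standby masters coordinated through ZooKeeper.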
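
The first limitation can be illustrated with a toy store whose rows are sorted by row key. A lookup by key or by key prefix touches only a contiguous range, while a condition on a value has to examine every row. This is an illustrative sketch only; the composite key format `author#date`, the column name `info:city`, and the helper functions are hypothetical and are not HBase API calls.

```python
import bisect

rows = {
    "author1#2017-06-01": {"info:city": "Hangzhou"},
    "author1#2017-06-15": {"info:city": "Beijing"},
    "author2#2017-06-03": {"info:city": "Hangzhou"},
    "author10#2017-06-09": {"info:city": "Shanghai"},
}
sorted_keys = sorted(rows)


def get(row_key):
    # Point lookup by row key.
    return rows.get(row_key)


def scan_prefix(prefix):
    # Keys are sorted, so all keys sharing a prefix form one contiguous range.
    start = bisect.bisect_left(sorted_keys, prefix)
    matched = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        matched.append(key)
    return matched


def scan_where(column, value):
    # No index on values: every row has to be examined.
    return [key for key in sorted_keys if rows[key].get(column) == value]


by_author = scan_prefix("author1#")
by_city = scan_where("info:city", "Hangzhou")
```

This is why row-key design matters so much in practice: a query pattern that can be expressed as a key prefix stays cheap, while anything else degrades to a scan.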

Supplement

1. Data types: HBase stores every value as an uninterpreted byte string; interpreting types is left to the application. A relational database offers a rich set of data types and storage methods.
2. Data operations: HBase offers only simple operations such as insert, query, delete, and truncate. Tables are independent of one another, with no complex relationships between them, whereas traditional databases usually provide many functions and join operations.
3. Storage model: HBase is column-oriented. Each column family is stored in its own set of files, separate from the files of other column families. A traditional relational database stores data row by row according to the table structure.
4. Data maintenance: an update in HBase is not really an update; it inserts a new version of the data, whereas a traditional database modifies the existing value in place.
5. Scalability: distributed databases such as HBase are designed for scaling, so machines can be added or removed easily and fault tolerance is relatively high. Traditional databases typically need an additional middle tier to achieve something similar.
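
Point 1 means the application decides how each value is turned into bytes and back. A minimal sketch of such encoding helpers is shown below, using only the Python standard library; the function names and the choice of an 8-byte big-endian layout for integers are assumptions made for illustration.

```python
import struct


def encode_int(n):
    # 8-byte big-endian unsigned integer.
    return struct.pack(">Q", n)


def decode_int(raw):
    return struct.unpack(">Q", raw)[0]


def encode_str(text):
    return text.encode("utf-8")


def decode_str(raw):
    return raw.decode("utf-8")


# The store sees only bytes; the reader must know which decoder to apply.
cell_age = encode_int(42)
cell_name = encode_str("Alice")

age = decode_int(cell_age)
name = decode_str(cell_name)

# Fixed-width big-endian bytes compare in numeric order;
# decimal text does not ("9" sorts after "10").
binary_order_ok = encode_int(9) < encode_int(10)
text_order_ok = b"9" < b"10"
```

The last two lines matter when numbers appear in sorted keys: a fixed-width binary encoding keeps numeric order under byte comparison, while plain decimal text does not.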

For the organizational structure of an HBase table (HTable), see the blog post at http://blog.csdn.net/lifuxiangcaohui/article/details/39894265

Application Scenarios for HBase

The following situations call for HBase:

Semi-structured or unstructured data

Data whose fields are not fixed in advance, or whose structure is too irregular to fit a single schema, is well suited to HBase. For example, when the business later needs to store an author's email, phone, and address, an RDBMS typically requires a schema change, often with downtime for maintenance, while HBase supports adding columns dynamically.

Very sparse records

The columns of a row in an RDBMS are fixed, and null columns waste storage space. As mentioned above, HBase does not store null columns, which saves space and improves read performance.

Multi-version data

A value located by row key and column key can have multiple timestamped versions, so HBase is convenient for data whose change history must be kept. For example, an author's address is subject to change; the business usually needs only the most recent value, but sometimes it has to query historical values.
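
Versioning can be sketched as a cell that keeps several timestamped values instead of overwriting in place. The class below is a toy model, not the HBase API; `VersionedCell`, the `max_versions` limit of three, and the integer timestamps are assumptions for illustration.

```python
class VersionedCell:
    """One cell holding several timestamped values, newest first."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        self.versions = []  # list of (timestamp, value)

    def put(self, value, timestamp):
        # An "update" is just another insert with a newer timestamp.
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        # Keep only the newest max_versions entries.
        del self.versions[self.max_versions:]

    def get(self, timestamp=None):
        # Return the newest value, or the newest one at or before `timestamp`.
        for ts, value in self.versions:
            if timestamp is None or ts <= timestamp:
                return value
        return None


address = VersionedCell(max_versions=3)
address.put("Street A", timestamp=100)
address.put("Street B", timestamp=200)
address.put("Street C", timestamp=300)
address.put("Street D", timestamp=400)

latest = address.get()
as_of_250 = address.get(timestamp=250)
as_of_150 = address.get(timestamp=150)
```

Here the default read returns the most recent address, a read with a timestamp returns the value that was current at that time, and the oldest version has been dropped because only three are retained.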

Very large data volume

When the data volume grows, an RDBMS struggles to keep up. The first remedy is read-write separation: one master handles writes and multiple slaves handle reads, which multiplies server cost. As the load grows further and the master can no longer cope, the database has to be split so that unrelated data is deployed separately; some join queries then stop working and have to be handled in a middle tier. As the data grows further still, individual tables become so large that queries slow down, and the tables have to be partitioned, for example by ID modulo into several tables, to reduce the number of records per table. Anyone who has been through this knows how painful the process is. With HBase, it is enough to add machines: HBase scales horizontally on its own, and its integration with Hadoop provides data reliability (HDFS) and high-performance analysis of massive data (MapReduce).
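
Part of the pain of manual partitioning by ID modulo is that changing the number of tables relocates most records. The sketch below counts how many of 10,000 sequential IDs land in a different table when growing from four tables to five; the function name `shard_for` and the table counts are hypothetical, and this models the RDBMS workaround, not anything HBase does.

```python
def shard_for(record_id, shard_count):
    # Manual partitioning rule: table index = ID modulo number of tables.
    return record_id % shard_count


ids = range(1, 10001)
old_count, new_count = 4, 5

# A record must be migrated if its table index changes under the new count.
moved = sum(1 for i in ids if shard_for(i, old_count) != shard_for(i, new_count))
fraction_moved = moved / len(ids)
```

With these numbers four out of five records would have to be migrated, which is the kind of operational cost that range-based automatic splitting is meant to avoid.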

Further HBase application scenarios are described at http://blog.csdn.net/yen_csdn/article/details/55657363


This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
