Original notice: when reprinting, please credit the author and the original link http://www.cnblogs.com/zhangningbo/p/4068957.html
English Original: http://hbase.apache.org/
Apache HBase™ is the Hadoop database: a distributed, scalable big data store.
When to use Apache HBase?
Use Apache HBase when you need random, real-time read/write access to your big data. The project's goal is to host very large tables on clusters of commodity hardware: billions of rows by millions of columns. Apache HBase is an open-source, distributed, versioned, non-relational database modeled after Google's Bigtable paper ("Bigtable: A Distributed Storage System for Structured Data"). Just as Bigtable builds on the distributed data storage provided by the Google File System (GFS), Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS.
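As a minimal, hedged sketch of the random, real-time read/write access described above, the following uses the HBase Java client API (assuming an HBase 1.0+ client on the classpath; the table name test_table, column family cf, and qualifier q are hypothetical names chosen for illustration):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseRandomReadWrite {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath to locate the cluster.
        Configuration conf = HBaseConfiguration.create();
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("test_table"))) {
            // Random write: store one cell under row key "row-1" (hypothetical names).
            Put put = new Put(Bytes.toBytes("row-1"));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value-1"));
            table.put(put);

            // Random read: fetch the same row back by its key.
            Result result = table.get(new Get(Bytes.toBytes("row-1")));
            byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"));
            System.out.println("Read back: " + Bytes.toString(value));
        }
    }
}
```

Every Put and Get addresses a single row key, which is what keeps reads and writes low-latency even when the table spans billions of rows.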
Features
- Linear and modular scalability
- Strictly consistent reads and writes
- Automatic and configurable sharding of tables
- Automatic failover support between RegionServers
- Convenient base classes for backing Hadoop MapReduce jobs with Apache HBase tables
- Easy-to-use Java API for client access
- Block cache and Bloom filters for real-time queries
- Query predicate push-down via server-side filters (see the sketch after this list)
- Thrift gateway and a RESTful Web service supporting XML, Protobuf, and binary data encoding options
- Extensible JRuby-based (JIRB) shell
- Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX
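To illustrate the Java client API and the predicate push-down mentioned in the list above, here is a hedged sketch using a server-side SingleColumnValueFilter (again assuming an HBase 1.0+ client; the table test_table, column family cf, qualifier status, and value active are made-up names):

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class FilterPushDownExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create());
             Table table = conn.getTable(TableName.valueOf("test_table"))) {
            // The filter is evaluated on the RegionServers, so only matching
            // rows travel back over the network to the client.
            SingleColumnValueFilter filter = new SingleColumnValueFilter(
                    Bytes.toBytes("cf"), Bytes.toBytes("status"),
                    CompareFilter.CompareOp.EQUAL, Bytes.toBytes("active"));
            Scan scan = new Scan();
            scan.setFilter(filter);
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result r : scanner) {
                    System.out.println(Bytes.toString(r.getRow()));
                }
            }
        }
    }
}
```

Note that by default SingleColumnValueFilter also emits rows that lack the cf:status column; calling filter.setFilterIfMissing(true) excludes them.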
Where can I get more information?
See the Architecture Overview, the Apache HBase Reference Guide FAQ, and the other documentation.
Further reading
1) HBase official website
2) HBase Reference Guide (official documentation, English)
3) HBase Reference Guide (Chinese translation of the official documentation, by the author)
4) HBase Reference Guide (Chinese translation of the official documentation, by Zhou Haihan and others)