Riak vs. HBase: Feature Comparison
Data Model

Riak: uses buckets as namespaces for storing key-value data; the model consists of buckets, keys, and values.

HBase: stores data according to a pre-defined column-family structure. Each row is addressed by a key and holds a number of column values, and every cell carries its own version information. Data is physically grouped by column family rather than by row, unlike row-oriented relational databases.
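The two data models can be contrasted with a minimal in-memory sketch. This is an illustration of the shapes of the data, not of either database's client API:

```python
# Riak-style: a flat namespace of buckets mapping keys to opaque values.
riak_store = {
    "users": {                          # bucket
        "alice": b'{"name": "Alice"}',  # key -> opaque value
    },
}

# HBase-style: row key -> column family -> qualifier -> {timestamp: value}.
# Versions live inside each cell; the newest timestamp wins on read.
hbase_table = {
    "alice": {                          # row key
        "info": {                       # column family (pre-defined in schema)
            "name": {2: b"Alice", 1: b"Alicia"},  # qualifier -> versions
        },
    },
}

def hbase_get(table, row, family, qualifier):
    """Return the most recent version of a cell."""
    versions = table[row][family][qualifier]
    return versions[max(versions)]
```

Note how versioning is built into every HBase cell, while a Riak value is a single opaque blob whose versioning is handled outside the data model (by vector clocks, covered below).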
Storage Engine

Riak: mounts the storage layer onto the system as a pluggable engine, so you can select different backends as needed (such as Bitcask or LevelDB), or even implement your own storage engine through Riak's backend API.

HBase: sits on top of HDFS, where its data files live. As in Bigtable, storage is split between an in-memory MemStore and on-disk StoreFiles; the data file format, HFile, is modeled on Bigtable's SSTable. Data files are accessed through the JVM's file-system I/O.
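Bitcask, one of Riak's pluggable backends, is a log-structured hash table: writes append to a data file while an in-memory directory maps each key to the offset of its latest value. A minimal in-memory sketch of that idea (not the real on-disk format):

```python
import io

class TinyBitcask:
    """Append-only log plus in-memory key directory, Bitcask-style."""

    def __init__(self):
        self.log = io.BytesIO()   # stands in for the on-disk data file
        self.keydir = {}          # key -> (offset, length) of latest value

    def put(self, key: bytes, value: bytes) -> None:
        offset = self.log.seek(0, io.SEEK_END)  # writes always append
        self.log.write(value)
        self.keydir[key] = (offset, len(value))

    def get(self, key: bytes) -> bytes:
        offset, length = self.keydir[key]       # one seek, one read
        self.log.seek(offset)
        return self.log.read(length)

db = TinyBitcask()
db.put(b"k", b"v1")
db.put(b"k", b"v2")   # the stale value stays in the log until compaction
```

The append-only layout is why writes are fast and why a separate compaction (merge) process is needed to reclaim space from overwritten values.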
Data Access Interface

Riak: besides direct use from Erlang, provides two data access interfaces, REST (HTTP) and Protocol Buffers. Riak clients are built on these APIs, with good support for mainstream languages and many community-developed libraries and projects.

HBase: runs inside the JVM and is accessed primarily through its Java API; it also provides external access through REST and Thrift gateways.
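Riak's REST interface addresses each value by bucket and key. A small helper that builds such URLs — the `/buckets/<bucket>/keys/<key>` path and default port 8098 follow Riak's HTTP API, but treat them as assumptions to check against your Riak version:

```python
from urllib.parse import quote

def riak_object_url(bucket: str, key: str,
                    host: str = "localhost", port: int = 8098) -> str:
    """URL of a Riak object under the HTTP (REST) interface."""
    return "http://{}:{}/buckets/{}/keys/{}".format(
        host, port, quote(bucket, safe=""), quote(key, safe=""))

# A real fetch would then be, e.g.:
#   import urllib.request
#   body = urllib.request.urlopen(riak_object_url("users", "alice")).read()
```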
Data Operations

Riak supports:
- direct operations on the primary key (get, put, delete, update);
- MapReduce queries;
- secondary indexes;
- full-text queries via the Riak Search plug-in.

HBase supports scanning ranges of ordered keys and point lookups of values by key, or running MapReduce queries through Hadoop; secondary indexes can be layered on top with additional tooling.
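Both stores expose MapReduce-style querying. The control flow can be sketched in plain Python — map over each stored key/value pair, group the emitted pairs by key, then reduce each group:

```python
from collections import defaultdict

def map_reduce(records, mapper, reducer):
    """Minimal MapReduce: map each (key, value), group by emitted key, reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_val in mapper(key, value):
            groups[out_key].append(out_val)
    return {k: reducer(k, vs) for k, vs in groups.items()}

# Example: count words across stored documents.
docs = [("d1", "hbase stores rows"), ("d2", "riak stores keys")]
counts = map_reduce(
    docs,
    mapper=lambda key, text: [(w, 1) for w in text.split()],
    reducer=lambda word, ones: sum(ones),
)
```

In the real systems the map phase runs where the data lives (on Riak vnodes, or as Hadoop tasks against HBase regions), which is what makes the pattern scale.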
Data Consistency

Riak: maintains data versions by means of vector clocks and surfaces inconsistencies for the client to resolve. Alternatively, a timestamp-based "last write wins" policy can be used instead of vector clocks.

HBase: provides strongly consistent reads and writes; data is partitioned into regions. Column families can keep an unlimited number of versions of each cell, and each version can have its own TTL (time to live).
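The vector-clock comparison Riak relies on can be sketched directly: each version carries a counter per writing node, one clock descends from another iff every counter is at least as large, and when neither descends from the other the writes were concurrent and conflict:

```python
def descends(a: dict, b: dict) -> bool:
    """True if clock `a` has seen every event clock `b` has."""
    return all(a.get(node, 0) >= n for node, n in b.items())

def compare(a: dict, b: dict) -> str:
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a-newer"
    if descends(b, a):
        return "b-newer"
    return "conflict"   # concurrent writes: Riak keeps both as siblings

v1 = {"nodeA": 2, "nodeB": 1}
v2 = {"nodeA": 2, "nodeB": 2}   # descends from v1: safe to overwrite
v3 = {"nodeA": 3, "nodeB": 1}   # concurrent with v2: a real conflict
```

"Last write wins" replaces this comparison with a single timestamp per value, which is simpler but silently discards one side of a concurrent update.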
Concurrency |
All nodes in the Riak cluster can perform read and write operations at the same time. Riak is only responsible for Data Writing operations (saving with Version Control Based on Vector clock ), when reading data, define the processing logic of data conflicts. |
Hbase ensures the atomicity of write operations through row-level locks, but does not support transactions of multi-row write operations. Data scan operations do not guarantee consistency.
Consistency guarantees |
Replication

Riak: its replication design draws on the Dynamo paper and Dr. Eric Brewer's CAP theorem. Riak partitions data with consistent hashing and stores replicas of the same data on multiple nodes; virtual nodes keep the distribution balanced and loosely couple data to physical nodes. Riak's APIs offer a free choice between consistency and availability, so you can pick different policies per application scenario: the replication factor is configured per bucket when data is first stored, and the number of replicas to consult can be set on each read, write, or update.
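That per-request trade-off boils down to quorum arithmetic: with N replicas, a read quorum R and a write quorum W, the condition R + W > N guarantees every read overlaps at least one replica holding the latest acknowledged write. A one-line sketch:

```python
def quorum_overlaps(n: int, r: int, w: int) -> bool:
    """With n replicas, does every read of r copies intersect every
    write acknowledged by w copies?  Overlap requires r + w > n."""
    return r + w > n
```

Lower R and W favor availability and latency; raising them toward N buys stronger read-your-writes behavior at the cost of tolerating fewer node failures per request.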
HBase: inter-cluster replication is eventually consistent, implemented by the master cluster pushing changes to slave clusters; master-master replication has also been added recently.
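Riak's placement scheme can be sketched as a ring of virtual nodes: each physical node claims several points on a hash ring, and a key is stored on the first N distinct nodes clockwise from its hash. Ring size and vnode count below are arbitrary illustrations, not Riak's actual parameters:

```python
import bisect
import hashlib

def h(s: str) -> int:
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes=8):
    """Assign several hash points (virtual nodes) to each physical node."""
    return sorted((h(f"{node}#{i}"), node)
                  for node in nodes for i in range(vnodes))

def preference_list(ring, key, n=3):
    """First n distinct physical nodes clockwise from the key's position."""
    points = [p for p, _ in ring]
    idx = bisect.bisect(points, h(key)) % len(ring)
    owners = []
    while len(owners) < n:
        node = ring[idx][1]
        if node not in owners:
            owners.append(node)
        idx = (idx + 1) % len(ring)
    return owners

ring = build_ring(["node1", "node2", "node3", "node4"])
replicas = preference_list(ring, "users/alice")   # 3 distinct replica nodes
```

Spreading each node across many ring points is what keeps load balanced and decouples data placement from the physical node list.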
Scalability

Riak: supports dynamically adding and removing nodes. All nodes are equal; there is no master/slave distinction. When a node joins, the cluster discovers it through gossip, assigns it a share of the data ranges, and migrates data accordingly; removing a node works in reverse. Riak provides a set of command-line tools for adding and removing nodes.

HBase: shards data in units of regions; regions split, merge, and are automatically reassigned across multiple nodes.
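A property worth spelling out: when a node joins a consistent-hash ring, only the key ranges adjacent to its new points migrate, not the whole keyspace. A self-contained sketch that measures this (hash ring and key counts are illustrative):

```python
import bisect
import hashlib

def h(s: str) -> int:
    return int(hashlib.sha1(s.encode()).hexdigest(), 16)

def build_ring(nodes, vnodes=16):
    return sorted((h(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))

def owner(ring, key):
    """Physical node owning `key`: first ring point clockwise from its hash."""
    points = [p for p, _ in ring]
    return ring[bisect.bisect(points, h(key)) % len(ring)][1]

keys = [f"key-{i}" for i in range(1000)]
before = build_ring(["n1", "n2", "n3"])
after = build_ring(["n1", "n2", "n3", "n4"])   # one node joins

moved = sum(owner(before, k) != owner(after, k) for k in keys)
# Only ranges now claimed by n4 move -- roughly 1/4 of keys, never all.
```

Every key that changed owner must have landed on the new node, which is exactly why node addition stays cheap as the cluster grows.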
Data Synchronization Between Multiple Data Centers

Riak: only Riak Enterprise supports deployment across multiple data centers; ordinary users are limited to a single data center.

HBase: its region-based sharding, together with cluster replication, lends itself naturally to multi-datacenter deployments.
Graphical Monitoring and Management Tools

Riak: starting from version 1.1.x, Riak ships Riak Control, an open-source graphical management tool.

HBase: offers a command-line control shell plus several graphical tools developed by the open-source community, such as an Eclipse dev plug-in, HBase Manager, and GUI Admin.