Reprinted from http://blog.csdn.net/ryantotti/article/details/13295325
A secondary index on HBase is typically implemented in one of the following ways:
1. Table Index
Use a separate HBase table to store the index data. The indexed column value of the business table becomes the rowkey of the index table, and the rowkey of the business table is stored in the index table as a qualifier or as a value.
Problem: Data updates become noticeably slower because every write must also update the index table, consistency between the two tables is not guaranteed, and each client query requires 2 RPCs (one to the index table, one to the data table).
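The table index can be sketched with a small in-memory model. This is only an illustration, not HBase client code: the two dictionaries stand in for the data table and the index table, and the names `put_user`, `find_by_city`, and the `city` column are hypothetical.

```python
from collections import defaultdict

# In-memory stand-ins for two HBase tables: rowkey -> {qualifier: value}.
data_table = defaultdict(dict)   # business table, keyed by user rowkey
index_table = defaultdict(dict)  # index table, keyed by the indexed value


def put_user(rowkey, city):
    """Write the data row, then maintain the index row.

    The two writes are separate operations. In a real cluster, a failure
    between them would leave the index out of step with the data.
    """
    old_city = data_table[rowkey].get("city")
    data_table[rowkey]["city"] = city
    if old_city is not None and old_city != city:
        # Remove the stale index entry for the previous value.
        index_table[old_city].pop(rowkey, None)
    # Index rowkey = indexed value; qualifier = business rowkey.
    index_table[city][rowkey] = b""


def find_by_city(city):
    """Look up the index first, then fetch each data row (two lookups)."""
    rowkeys = sorted(index_table.get(city, {}))
    return [(rk, data_table[rk]) for rk in rowkeys]
```

Each call to `find_by_city` touches both dictionaries, which mirrors the two round trips a client would make against the index table and the data table.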
2. Column Index
Use the same table as the business table, and store the index in a separate column family. The user data column value becomes the qualifier in the index column family, and the user data qualifier becomes the column value in the index column family. This suits data models with millions of qualifiers in a single row. For example, in a network disk application the network disk ID is the rowkey, and the directory metadata of the whole disk is stored in one HBase row. (The Facebook message model is also this scenario.)
Transactionality is guaranteed, because the data and its index live in the same row and HBase applies mutations to a single row atomically.
Problem: Applies only to specific scenarios.
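The column index layout can be sketched the same way. Again this is an in-memory illustration rather than HBase client code: the family names `d` (data) and `i` (index), the functions `put_file` and `find_file_id`, and the choice of file path as the indexed value are all assumptions made for the example. It also assumes the indexed value is unique within a row; a non-unique value would need a composite qualifier.

```python
def put_file(table, disk_id, file_id, path):
    """Store a file entry and its index entry in the same row.

    Row layout: table[disk_id] = {"d": {file_id: path}, "i": {path: file_id}}
    The data value (path) is the qualifier in the index family, and the
    data qualifier (file_id) is the value in the index family.
    """
    row = table.setdefault(disk_id, {"d": {}, "i": {}})
    old_path = row["d"].get(file_id)
    if old_path is not None:
        # Drop the index entry that pointed at the previous path.
        row["i"].pop(old_path, None)
    row["d"][file_id] = path
    row["i"][path] = file_id


def find_file_id(table, disk_id, path):
    """Resolve a path to its file ID with a single row lookup."""
    return table.get(disk_id, {}).get("i", {}).get(path)
```

Because both families belong to one row, a lookup by path needs only one row read, in contrast to the two lookups of the table index.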
Two ways to build an index on HBase