This is Huawei's secondary index scheme for HBase, which has been open-sourced. Below is an article I found online that explains how it works, reposted here to share with you.
After reading the code carefully, I found that the source is best treated as a reference only. Integrating it into an existing cluster is difficult, because it makes a lot of changes to the HBase source code itself.
Source code: https://github.com/Huawei-Hadoop/hindex
The following is an analysis of how the scheme works.
1. Overall architecture
The architecture has three parts: index details are specified through a client extension, the balancer collects information, and coprocessors manage the secondary index data.
2. Table creation
When a table is created, an index table is created along with it on the same region server, and the two correspond one to one.
3. Insert operation
After a row is inserted into the main table, a coprocessor writes the indexed column to the index table. The row key of the entry in the index table is: region start key + index name + index column value + main table row key. With this layout both tables follow the same distribution rule, so an index entry ends up on the same region server as the main table row it points to, which reduces RPCs at query time.
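The row key layout described above can be sketched in a few lines of Python. This is only an illustration: the `SEP` separator, the helper names, and the sample keys are assumptions made for the example, and the article does not say how hindex actually encodes the parts of the key.

```python
from bisect import bisect_right

# Hypothetical separator, chosen only for this sketch; the article does not
# describe how hindex actually encodes or pads the parts of the key.
SEP = b"\x00"

def build_index_row_key(region_start_key, index_name, index_value, main_row_key):
    """Concatenate the four parts in the order given in the article."""
    return SEP.join([region_start_key, index_name, index_value, main_row_key])

def region_of(row_key, region_start_keys):
    """Return the position of the region whose key range contains row_key.

    region_start_keys must be sorted; the first region starts at b"".
    """
    return bisect_right(region_start_keys, row_key) - 1

# Both tables are assumed to be split at the same start keys.
start_keys = [b"", b"row100", b"row200"]

main_key = b"row123"
owner_start = start_keys[region_of(main_key, start_keys)]
index_key = build_index_row_key(owner_start, b"idx_city", b"Paris", main_key)

# The index entry sorts into the region with the same start key as the main row.
same_region = region_of(index_key, start_keys) == region_of(main_key, start_keys)
print(index_key, same_region)
```

Because the index row key begins with the start key of the region that holds the main table row, it sorts into the index region with that same start key, which is what allows the two regions to be kept together.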
4. Scan operation
When a query arrives, a coprocessor hook first looks up the matching range of rows in the index table, and then reads the final data from the corresponding rows in the main table.
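The two-step lookup can be illustrated with an in-memory sketch for a single region. A sorted Python list stands in for the index region and a dict stands in for the main table region; the separator, function names, and sample data are assumptions for the example, not hindex APIs.

```python
from bisect import bisect_left

# Hypothetical separator; assumed not to occur inside any part of the key.
SEP = b"\x00"

def index_prefix(region_start_key, index_name, index_value):
    """Prefix shared by all index rows for one value of one index in one region."""
    return SEP.join([region_start_key, index_name, index_value]) + SEP

def scan_by_index(index_rows, main_table, region_start_key, index_name, index_value):
    """Step 1: range-scan the sorted index rows. Step 2: read the main table rows."""
    prefix = index_prefix(region_start_key, index_name, index_value)
    results = []
    i = bisect_left(index_rows, prefix)  # first index row >= prefix
    while i < len(index_rows) and index_rows[i].startswith(prefix):
        main_row_key = index_rows[i][len(prefix):]  # last part of the index key
        results.append((main_row_key, main_table[main_row_key]))
        i += 1
    return results

# Sample data for one region whose start key is b"row100".
main_table = {
    b"row101": {"city": b"Paris"},
    b"row150": {"city": b"Rome"},
    b"row199": {"city": b"Paris"},
}
index_rows = sorted(
    index_prefix(b"row100", b"idx_city", row["city"]) + key
    for key, row in main_table.items()
)

hits = scan_by_index(index_rows, main_table, b"row100", b"idx_city", b"Paris")
print([key for key, _ in hits])  # [b'row101', b'row199']
```

Since the index rows and the main table rows live on the same region server, the second step in this sketch corresponds to a local read rather than an extra RPC.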
5. Split operation handling
To keep the main table and the index table on the same region server, both automatic and manual splits of the index table are disabled. An index table split can only be triggered by a main table split. When the main table is split, the index table is split along with its corresponding data, and for the index rows that go to the second daughter region, the front part of the row key is changed to the corresponding main table row key, that is, the start key of the main table's second daughter.
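The key rewrite during a split can be sketched as follows. This is a simplified in-memory model under stated assumptions: the separator, the function name, and the sample rows are hypothetical, and the real implementation works on store files inside HBase rather than on Python lists.

```python
# Hypothetical separator; assumed not to occur inside the main table row key.
SEP = b"\x00"

def split_index_region(index_rows, old_start_key, split_key):
    """Divide index rows between two daughters after a main table split.

    Rows whose main table row key is below split_key stay in the first
    daughter unchanged. The others move to the second daughter, and their
    leading region start key is replaced by split_key.
    """
    old_prefix = old_start_key + SEP
    first, second = [], []
    for key in index_rows:
        rest = key[len(old_prefix):]           # index name + value + main row key
        main_row_key = rest.rsplit(SEP, 1)[1]  # last part of the index key
        if main_row_key < split_key:
            first.append(key)
        else:
            second.append(split_key + SEP + rest)
    return sorted(first), sorted(second)

# Index rows of a region starting at b"row100", split at b"row150".
rows = [
    b"row100\x00idx_city\x00Paris\x00row101",
    b"row100\x00idx_city\x00Paris\x00row199",
    b"row100\x00idx_city\x00Rome\x00row150",
]
first, second = split_index_region(rows, b"row100", b"row150")
print(first)
print(second)
```

After the rewrite, every index row in the second daughter again starts with the start key of the main table region it belongs to, so the key layout used at insert time still holds.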
6. Performance
Query performance improves greatly, while insert performance drops by about 10%.
In summary, this article gives a rough analysis of how Huawei's HBase secondary index uses coprocessors when creating the index, inserting data, and querying data, so that the overall picture is clear. It can be used as a reference.
Reprinted from: http://www.dengchuanhua.com/167.html
HBase Learning (IX): Huawei Secondary Index (Principles)