Secondary indexes and index joins are basic features that most business systems require from a storage engine. RDBMSs have supported them for a long time, while the NoSQL camp is still exploring solutions that fit its own characteristics.
This article uses HBase as the subject and discusses how to build secondary indexes and implement index joins on top of it. At the end of the article we also survey the currently known solutions, including the 0.19.3 secondary index, ITHBase, Facebook's approach, and the official coprocessor work.
Theoretical Objectives
Implementing secondary indexes and index joins on HBase has to balance three goals:
1. High-performance range search.
2. Low data redundancy (the amount of data stored).
3. Data consistency.
Performance and data redundancy constrain each other.
High-performance range retrieval has to rely on redundant index data, and that redundancy makes it hard to keep data consistent on updates, especially in distributed scenarios.
If efficient range retrieval is not required, redundant data can be avoided, which indirectly sidesteps the consistency problem; after all, share-nothing is widely recognized as the simplest and most effective approach.
Based on theory and practice, the following walks through the candidate solutions and how to choose among them.
These conclusions were reached after reading the available material and ongoing discussion with colleagues; if anything is wrong, please correct me:
1. One index table per index
Each index gets its own table, and range retrieval is done over that table's row keys. HBase keeps row keys sorted and block-indexed, so scanning by row key is efficient.
A single index table stores the indexed value in the row key and the primary ID (or other data) in the column value. That is the basic structure of an HBase index table.
How to join?
For joining multiple indexes (multiple tables), there are two reference schemes:
1. Scan each single-index table independently according to the query conditions, then merge the result sets.
This scheme is simple, but if the result sets of the individual index scans are large, the merge becomes the bottleneck.
For example, suppose there is a user table with 100 million rows and two indexes, birthplace and age, and we want the first 10 users born in Hangzhou with a given age, ordered by ascending user ID.
One approach: the system first scans the birthplace index for Hangzhou and obtains a set of user IDs; assume this set has 100,000 entries.
It then scans the age index and gets, say, 50,000 IDs, and finally merges the two ID sets, deduplicates, and sorts them to produce the result.
This is clearly expensive. How can it be improved?
If the results of the birthplace scan and the age scan were each already ordered by user ID, could the merge cost be reduced? Unfortunately HBase sorts data by row key, and column values cannot be sorted.
Workaround: redundantly put the user ID into the row key. That works, and the scheme looks like this:
Merging the two scans and taking the intersection yields the required list, already in order, because the user ID appended to each index row key sorts lexicographically.
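To make the workaround concrete, here is a minimal sketch against the classic HTable client API, assuming two hypothetical index tables `idx_birthplace` and `idx_age` whose row keys are `<indexedValue>_<userId>` with empty values; the "merge" step is just an intersection of the two already-sorted ID sets.

```java
import java.io.IOException;
import java.util.TreeSet;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.PrefixFilter;
import org.apache.hadoop.hbase.util.Bytes;

public class SingleIndexJoin {
  // Scan one index table whose row keys look like "<value>_<userId>" and return
  // the user IDs. They come back already sorted, because HBase keeps row keys
  // in lexicographic order.
  static TreeSet<String> scanIndex(Configuration conf, String table, String value)
      throws IOException {
    TreeSet<String> ids = new TreeSet<String>();
    HTable htable = new HTable(conf, table);
    try {
      byte[] prefix = Bytes.toBytes(value + "_");
      Scan scan = new Scan(prefix);
      scan.setFilter(new PrefixFilter(prefix));
      ResultScanner scanner = htable.getScanner(scan);
      for (Result r : scanner) {
        String row = Bytes.toString(r.getRow());
        ids.add(row.substring(row.indexOf('_') + 1));   // strip the indexed value
      }
      scanner.close();
    } finally {
      htable.close();
    }
    return ids;
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    TreeSet<String> byCity = scanIndex(conf, "idx_birthplace", "hangzhou");
    TreeSet<String> byAge  = scanIndex(conf, "idx_age", "20");
    byCity.retainAll(byAge);                 // the merge: intersect the two ID sets
    int n = 0;
    for (String id : byCity) {               // already in ascending user-ID order
      System.out.println(id);
      if (++n == 10) break;                  // first 10 users
    }
  }
}
```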
2. Build composite indexes according to the query patterns.
In scheme 1, imagine the query touches as many as 10 single indexes: that means 10 scans followed by 10-way merging, and the performance is easy to imagine.
To solve this, borrow the composite index idea from RDBMSs.
For example, if queries always filter on both birthplace and age, building a composite index on (birthplace, age) is far more efficient than merging two single-index scans.
Of course this index also needs the user ID appended redundantly so that the results come back naturally ordered. The structure looks like this:
The advantage of this scheme is query speed: according to the query conditions, one scan over one table produces the result list. The disadvantage is that each distinct query pattern needs its own composite index, which multiplies the storage pressure.
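A hedged sketch of what the composite row key could look like (the table name `idx_birthplace_age` and the field layout are made up for illustration): birthplace and age are concatenated in front of the user ID, so a single range scan answers the combined query without any merge.

```java
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class CompositeIndexKey {
  // Hypothetical layout: idx_birthplace_age row key = "<birthplace>_<age>_<userId>".
  static Put indexRow(String birthplace, String age, String userId) {
    byte[] row = Bytes.toBytes(birthplace + "_" + age + "_" + userId);
    Put put = new Put(row);
    // The cell value can stay empty (or hold extra display fields); the row key IS the index.
    put.add(Bytes.toBytes("i"), Bytes.toBytes("v"), Bytes.toBytes(""));
    return put;
  }

  // Query "born in hangzhou, age 20": one start/stop-row scan, results already
  // ordered by user ID because of the appended ID.
  static Scan query(String birthplace, String age) {
    byte[] start = Bytes.toBytes(birthplace + "_" + age + "_");
    byte[] stop  = Bytes.toBytes(birthplace + "_" + age + "`");  // '`' (0x60) sorts right after '_' (0x5F)
    return new Scan(start, stop);
  }
}
```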
When designing a schema, weigh the characteristics of the workload and combine scheme 1 and scheme 2. A simple comparison:
|  | Single index | Composite index |
| --- | --- | --- |
| Search performance | Excellent | Excellent |
| Storage | No data redundancy, saves storage | Redundant data, wastes storage |
| Transactions | Hard to guarantee across multiple index updates | Hard to guarantee across multiple index updates |
| Join | Poor performance | Excellent performance |
| Count, sum, avg, etc. | Full table scan over the qualifying result set | Full table scan over the qualifying result set |
As the table shows, both scheme 1 and scheme 2 make transactional guarantees on update difficult. If the business can accept eventual consistency the situation is a bit better; otherwise only heavyweight distributed transaction machinery such as JTA or Chubby-style coordination remains.
For aggregates such as count, sum, avg, max, and min, HBase can only scan the whole table, which is painful. Some hack is usually needed (for example adding a CF whose values are empty, so the scan ships back as little data as possible); otherwise all of the data comes back to the client during the scan.
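A related trick that avoids shipping values back at all, using standard HBase filters rather than the extra null-value CF (a sketch, not the author's exact hack): FirstKeyOnlyFilter returns only the first cell of each row and KeyOnlyFilter strips the values, so counting still scans every row but transfers almost no data.

```java
import java.io.IOException;
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;

public class RowCount {
  public static long count(Configuration conf, String tableName) throws IOException {
    HTable table = new HTable(conf, tableName);
    Scan scan = new Scan();
    scan.setCaching(1000);                                   // fewer RPC round trips
    scan.setFilter(new FilterList(Arrays.<Filter>asList(
        new FirstKeyOnlyFilter(),                            // only the first cell of each row
        new KeyOnlyFilter())));                              // and strip its value
    long rows = 0;
    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result r : scanner) {
        rows++;                                              // still a full table scan, just with tiny payloads
      }
    } finally {
      scanner.close();
      table.close();
    }
    return rows;
  }
}
```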
Of course this scenario can be optimized further, for example by maintaining a separate counter/status table, but the added complexity brings more risk.
Another pragmatic answer is to expose only "previous page" and "next page" in the product, which is perhaps the simplest and most effective solution.
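A minimal sketch of that "next page only" idea: the client remembers the last row key it received and starts the next scan just past it (table and page size are whatever the business uses).

```java
import java.io.IOException;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class NextPage {
  // Returns up to pageSize rows starting strictly after lastRowOfPreviousPage
  // (pass null for the first page). The caller keeps the last row key it got
  // back and hands it in again for the next page.
  static Result[] nextPage(HTable table, byte[] lastRowOfPreviousPage, int pageSize)
      throws IOException {
    Scan scan = new Scan();
    if (lastRowOfPreviousPage != null) {
      // Appending a 0x00 byte gives the smallest row key greater than the last one seen.
      scan.setStartRow(Bytes.add(lastRowOfPreviousPage, new byte[] { 0x00 }));
    }
    scan.setCaching(pageSize);
    ResultScanner scanner = table.getScanner(scan);
    try {
      return scanner.next(pageSize);
    } finally {
      scanner.close();
    }
  }
}
```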
2. Multiple column families in a single table: column-based indexes
Hbase provides the column family feature.
A column index uses a column family as the index: the index values are spread over qualifiers, and multiple matching values are stacked as versions of a cell. (HBase keeps CF, qualifier, and version ordered: CF and qualifier ascend lexicographically, while versions descend, newest first.)
A typical example: a seller has many products, and the product titles must support LIKE '%title%' queries. The traditional answers are RDBMS fuzzy matching or a search engine's tokenization plus inverted lists.
HBase's own fuzzy matching clearly does not meet the requirement, so we can only tokenize the titles and store an inverted list ourselves. The CF-based inverted index structure looks like this:
At query time, locate the row by user ID (row key), locate the qualifier by the tokenized term, and read the top N records from the ordered version list. One problem: obtaining the total number of matches requires scanning the full version list, so the business layer needs some adjustment here.
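A sketch of the inverted structure under the assumptions above (`idx` is a hypothetical index family that must be created with a very large VERSIONS setting, which is exactly the int-limit caveat listed in the comparison table below): the row key is the seller's user ID, the qualifier is a tokenized term, and each matching product occupies one version of the cell.

```java
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class ColumnInvertedIndex {
  private static final byte[] IDX = Bytes.toBytes("idx");    // hypothetical index CF

  // Index one title token for a seller: row = userId, qualifier = token,
  // one version per product. Using the product's create time as the version
  // makes newer products come back first (versions are ordered descending).
  static void indexToken(HTable table, String userId, String token,
                         String productId, long createTime) throws IOException {
    Put put = new Put(Bytes.toBytes(userId));
    put.add(IDX, Bytes.toBytes(token), createTime, Bytes.toBytes(productId));
    table.put(put);
  }

  // Top-N products of one seller whose titles contain the token.
  static void topN(HTable table, String userId, String token, int n) throws IOException {
    Get get = new Get(Bytes.toBytes(userId));
    get.addColumn(IDX, Bytes.toBytes(token));
    get.setMaxVersions(n);                                   // newest n versions only
    Result result = table.get(get);
    List<KeyValue> kvs = result.getColumn(IDX, Bytes.toBytes(token));
    for (KeyValue kv : kvs) {
      System.out.println(Bytes.toString(kv.getValue()));     // product IDs, newest first
    }
  }
}
```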
How to join?
The join works the same way as in scheme 1: after scanning the individual column indexes, the result sets of the multiple indexes are merged (intersected).
The two schemes look like a mere swap of "table" for "column", but the biggest benefit of the column scheme is that it solves the transaction problem: all the indexes hang off a single row key, and HBase guarantees that updates to a single row are atomic. That is the natural advantage of this scheme. When only single-column conditions are involved, a column-based index is more broadly applicable than a separate index table.
At the column granularity, composite indexes can also be achieved in a compromise way; anyone familiar with this storage model will have guessed that it is done through the qualifier.
The following table compares the advantages and disadvantages of table indexes and column indexes:
|  | Column index | Table index |
| --- | --- | --- |
| Search performance | Retrieval needs several lookups: first the row key, then the qualifier, then the versions | Only one row-key scan is needed |
| Storage | Saves storage when no composite index is needed | Saves storage when no composite index is needed |
| Transactions | Easy to guarantee | Hard to guarantee |
| Join | Poor; improves only if a composite qualifier is built | Poor; improves only if a composite index table is built |
| Additional issues | 1. The number of versions per qualifier in a row is bounded by the maximum int value (not as large as it sounds: for massive data sets, hundreds of millions of records are common). 2. The total version count requires extra processing to obtain. 3. If a single row grows beyond the region split size, the region can no longer be split and compaction pressure grows, increasing operational risk. |  |
| Count, sum, avg, etc. | Full table scan over the qualifying result set | Full table scan over the qualifying result set |
Although column indexes have several drawbacks, the cost advantage of the storage savings is sometimes worth it, and they solve the transaction problem; users must weigh the trade-off themselves.
It is worth mentioning that Facebook's Messages application servers are built on a similar scheme.
3. ITHBase
The multi-table approach of scheme 1 solves the performance problem but brings storage redundancy and data consistency problems; for most business scenarios it is enough to address one of the two.
This solution focuses on data consistency. ITHBase stands for Indexed Transactional HBase; as the name says, transactions are its key feature.
How ITHBase transactions work
It creates a transaction table __GLOBAL_TRX_LOG__ and records the transaction status there each time a transaction starts. Since this is an ordinary HBase HTable, the transaction table also writes a WAL for recovery, but the log format has been adapted by ITHBase and is called THLog.
When a client updates multiple tables, it starts a transaction and then attaches the transaction ID to every put it sends to the HRegionServer.
ITHBase extends the HRegionServer and HRegion classes and overrides most of the operation interfaces, such as put, update, and delete, so that they carry the transaction ID and state.
When the server receives an operation with its transaction ID, it first records the operation and marks the transaction as pending write (the actual write is applied later). Once the client has finished operating on all the tables, it issues a commit and the tables are committed via two-phase commit.
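For orientation, this is roughly what client code against ITHBase looks like, reconstructed from memory of the hbase-trx examples; treat the class and method names (TransactionManager, TransactionState, TransactionalTable, tryCommit, abort) as assumptions rather than a verified API.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
// Package/class names are assumptions based on the old contrib/transactional layout.
import org.apache.hadoop.hbase.client.transactional.TransactionManager;
import org.apache.hadoop.hbase.client.transactional.TransactionState;
import org.apache.hadoop.hbase.client.transactional.TransactionalTable;
import org.apache.hadoop.hbase.util.Bytes;

public class IthbaseSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    TransactionManager txManager = new TransactionManager(conf);
    TransactionalTable users = new TransactionalTable(conf, "user");
    TransactionalTable index = new TransactionalTable(conf, "idx_birthplace");

    TransactionState tx = txManager.beginTransaction();      // logged in the transaction table
    try {
      Put data = new Put(Bytes.toBytes("user123"));
      data.add(Bytes.toBytes("info"), Bytes.toBytes("city"), Bytes.toBytes("hangzhou"));
      users.put(tx, data);                                    // carries the transaction ID

      Put idx = new Put(Bytes.toBytes("hangzhou_user123"));
      idx.add(Bytes.toBytes("i"), Bytes.toBytes("id"), Bytes.toBytes("user123"));
      index.put(tx, idx);

      txManager.tryCommit(tx);                                // two-phase commit across both tables
    } catch (Exception e) {
      txManager.abort(tx);
      throw e;
    }
  }
}
```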
4. Map-Reduce
There is not much to say about this approach: it saves storage and needs no index tables; you simply rely on the cluster's computing power to produce the results. But it is generally unsuitable for online queries.
5. Coprocessor
Coprocessors are a new feature being developed for the official 0.92.0 release and support region-level indexes. For details see:
https://issues.apache.org/jira/browse/HBASE-2038
The coprocessor mechanism can be understood as adding callback functions on the server side. These callbacks are as follows:
The Coprocessor interface defines these hooks:
- preOpen, postOpen: called before and after the region is reported as online to the master.
- preFlush, postFlush: called before and after the memstore is flushed into a new store file.
- preCompact, postCompact: called before and after compaction.
- preSplit, postSplit: called before and after the region is split.
- preClose, postClose: called before and after the region is reported as closed to the master.
The RegionObserver interface defines these hooks:
- preGet, postGet: called before and after a client makes a Get request.
- preExists, postExists: called before and after the client tests for existence using a Get.
- prePut, postPut: called before and after the client stores a value.
- preDelete, postDelete: called before and after the client deletes a value.
- preScannerOpen, postScannerOpen: called before and after the client opens a new scanner.
- preScannerNext, postScannerNext: called before and after the client asks for the next row on a scanner.
- preScannerClose, postScannerClose: called before and after the client closes a scanner.
- preCheckAndPut, postCheckAndPut: called before and after the client calls checkAndPut().
- preCheckAndDelete, postCheckAndDelete: called before and after the client calls checkAndDelete().
These hooks can be used to implement region-level secondary indexes and aggregates such as count, sum, avg, max, and min without shipping all the data back to the client. For details, see https://issues.apache.org/jira/browse/HBASE-1512.
How the secondary index is expected to work
Because the final coprocessor-based design has not been published yet, judging from the hooks provided, a secondary index would intercept put, get, scan, delete, and similar operations on a region while maintaining an index CF inside the same region to hold the corresponding index data.
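A rough sketch of that guess, written against the 0.92-era RegionObserver API (exact hook signatures vary across HBase versions; the family and column names here are made up): intercept prePut, derive an index row key from the indexed value, and write it into an index CF of the same region.

```java
import java.io.IOException;
import java.util.List;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
import org.apache.hadoop.hbase.coprocessor.ObserverContext;
import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
import org.apache.hadoop.hbase.regionserver.wal.WALEdit;
import org.apache.hadoop.hbase.util.Bytes;

public class RegionIndexObserver extends BaseRegionObserver {
  private static final byte[] DATA_CF  = Bytes.toBytes("info");  // hypothetical data family
  private static final byte[] CITY     = Bytes.toBytes("city");  // hypothetical indexed column
  private static final byte[] INDEX_CF = Bytes.toBytes("idx");   // index family in the same region

  @Override
  public void prePut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                     Put put, WALEdit edit, boolean writeToWAL) throws IOException {
    List<KeyValue> cells = put.get(DATA_CF, CITY);
    if (cells.isEmpty()) {
      return;  // not a write to the indexed column (this also skips our own index puts)
    }
    byte[] city = cells.get(0).getValue();
    // Index row key = <city>_<original row>, so a scan over the "idx" rows of this
    // region can answer "users in city X" without touching the data rows.
    // NOTE: a real implementation must keep this index row inside the region's key
    // range; ignoring that is part of why region-level indexes are limited (see below).
    byte[] indexRow = Bytes.add(city, Bytes.toBytes("_"), put.getRow());
    Put indexPut = new Put(indexRow);
    indexPut.add(INDEX_CF, CITY, put.getRow());
    ctx.getEnvironment().getRegion().put(indexPut);
  }
}
```

Such an observer would be attached to the table through the coprocessor configuration once 0.92 ships.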
Region-scoped indexes have many limitations, for example the lack of global ordering.
However, I think the biggest benefit of coprocessors is the general server-side extensibility they bring, which is a great leap for HBase.
How to join?
Nothing has been released yet, but it is hard to break out of the same fundamentals: the options are still merge and composite indexes, and transactions remain one of the challenges to solve.
The following lists the publicly known secondary index solutions:
0.19.3 secondary index
Those who have followed HBase for a while may know that a secondary index feature was added in the 0.19.3 release. See the issue for details.
Its example is also very simple: http://blog.rajeevsharma.in/2009/06/secondary-indexes-in-hbase.html
The 0.19.3 secondary index provides index scans by storing column values as row keys.
However, early HBase demand came mainly from Hadoop users. The complexity of transactions and the hard-to-solve compatibility problems with the hadoop-core of the time led the project to remove this code from hbase-core in version 0.20.0 and move it into contrib as a third-party extension. See the issue for details.
Transactional Tableindexed (ITHBase)
This is the third-party extension that was split off from the core after 0.19.3. It has already been introduced above. It currently supports the latest HBase 0.90.
Whether it offers industrial-strength stability is the main concern for would-be adopters.
https://github.com/hbase-trx/hbase-transactional-tableindexed
Facebook's solution
Facebook uses the single-table, multi-column index scheme mentioned above. It solves the data consistency problem nicely, which is largely a consequence of their usage scenario.
If you are interested, read this blog post; it is not covered in detail here:
http://blog.huihoo.com/?p=688
HBase's official solution, under development for 0.92.0: coprocessor
Not yet released, but the official HBase blog has an introduction: http://hbaseblog.com/2010/11/30/hbase-coprocessors
Lily HBase Indexing Library
This is a framework for building, querying, and managing indexes. Structurally, one IndexMeta table manages multiple IndexData index tables.
It has a complete mechanism for mapping int, string, UTF-8, decimal, and other types onto sortable row keys; the mechanism is described in detail in this blog post:
http://brunodumon.wordpress.com/2010/02/17/building-indexes-using-hbase-mapping-strings-numbers-and-dates-onto-bytes/
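The core idea of that post, sketched here without using the Lily library's actual API: map numbers onto bytes whose unsigned lexicographic order matches numeric order, e.g. by flipping the sign bit of an int.

```java
// A minimal sketch (not Lily's API): encode a signed int as 4 bytes whose
// lexicographic byte order equals the numeric order, by flipping the sign bit
// so that negative values sort before positive ones.
public final class SortableBytes {
  public static byte[] encodeInt(int v) {
    int flipped = v ^ Integer.MIN_VALUE;      // shift the range so negatives come first
    return new byte[] {
        (byte) (flipped >>> 24),
        (byte) (flipped >>> 16),
        (byte) (flipped >>> 8),
        (byte) flipped
    };
  }

  public static int decodeInt(byte[] b) {
    int raw = ((b[0] & 0xFF) << 24) | ((b[1] & 0xFF) << 16)
            | ((b[2] & 0xFF) << 8)  | (b[3] & 0xFF);
    return raw ^ Integer.MIN_VALUE;           // undo the sign-bit flip
  }
}
```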
In addition, the framework provides Conjunction and Disjunction helper classes for the join scenario (the underlying principle is still merge).
The HBase Indexing Library also provides convenient interfaces for common indexing scenarios.
IHBase
IHBase is very similar to ITHBase: it also extends HBase at the source level and redefines parts of the server and client processing logic, so it is quite invasive. Unfortunately the project has not been updated since fixing an HBase 0.20.5 compatibility bug; whether it supports 0.90 or later I have not tried. A comparison between IHBase and ITHBase:
| Feature | ITHBase | IHBase | Comment |
| --- | --- | --- | --- |
| Global ordering | Yes | No | IHBase has an index per region. The flip side of not having global ordering is compatibility with the good old HRegion: results come back in row order (not value order as in ITHBase) |
| Full table scan? | No | No | ITHBase does a partial scan on the index table. IHBase supports specifying start/end rows to limit the number of scanned regions |
| Multiple index usage | No | Yes | IHBase can take advantage of multiple indexes in the same scan. The IHBase IdxScan object accepts an expression which allows intersection/union of several indexed column criteria |
| Extra disk storage | Yes | No | IHBase indexes are created when the region starts/flushes and do not require any extra storage |
| Extra RAM | Yes | Yes | IHBase indexes are in memory and hence increase the memory overhead. ITHBase indexes increase the number of regions each region server has to support, thus costing memory too |
| Parallel scanning support | No | Yes | In ITHBase the index table needs to be consulted and then gets are issued for each matching row. The behavior of IHBase (as perceived by the client) is no different from a regular scan, so it supports parallel scanning seamlessly. Parallel gets could be implemented to speed up ITHBase scans |
Principle: when the memstore fills up and is flushed to disk, IHBase intercepts the request and builds an index for the data in that memstore. The index is stored as another CF in the table, but it only works at the region level (similar to coprocessors).
IHBase speeds up scans by combining hints from the indexed columns. http://github.com/ykulbak/ihbase
Reposted from: http://kenwublog.com/hbase-secondary-index-and-join