Transferred from: http://www.oschina.net/question/12_32573
Secondary indexes and index joins are basic features that online business systems require of a storage engine. RDBMSs support them well, while the NoSQL camp is still searching for the solutions that best fit its own characteristics.
This article uses HBase as its subject and explores how to build secondary indexes and implement index joins on top of it. At the end, it also lists the currently known solutions: the 0.19.3 secondary index, ITHBase, Facebook's approach, and the official Coprocessor.
Theoretical goals
Implementing a secondary index and index join in HBase involves three goals:
1, high-performance range retrieval.
2, low redundancy of data (the amount of data stored).
3, data consistency.
Performance, data redundancy, and consistency constrain one another.
High-performance range retrieval has to rely on redundant index data, and that redundancy makes consistency hard to achieve when data is updated, especially in distributed scenarios.
If the performance requirements for range retrieval are modest, redundant data can be avoided and the consistency problem is sidestepped along with it; after all, share-nothing is recognized as the simplest and most effective solution.
With theory and practice in mind, the following sections show by example where each scheme places its emphasis.
These schemes are conclusions drawn from the author's reading and ongoing discussion with colleagues; if there are errors, please correct them:
1, one table per index
Each index gets its own table, and range retrieval relies on that table's row key. HBase stores rows sorted by row key, so scans are efficient.
Each such table stores the index value as the row key and the ID or other data as the column value; this is the structure of an HBase index table.
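As a rough illustration of the structure just described, the sketch below models an index table as a sorted list of row keys with a range scan over it. It is plain Python rather than HBase client code; the `IndexTable` class, the `age:NN` key format, and the user IDs are all made up for the example.

```python
import bisect


class IndexTable:
    """Toy stand-in for an HBase index table: rows kept sorted by row key."""

    def __init__(self):
        self._keys = []    # sorted row keys (the indexed values)
        self._values = {}  # row key -> column value (for example, the user ID)

    def put(self, row_key, value):
        if row_key not in self._values:
            bisect.insort(self._keys, row_key)
        self._values[row_key] = value

    def scan(self, start_row, stop_row):
        """Return (row_key, value) pairs with start_row <= row_key < stop_row."""
        lo = bisect.bisect_left(self._keys, start_row)
        hi = bisect.bisect_left(self._keys, stop_row)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


age_index = IndexTable()
age_index.put("age:20", "user_0007")
age_index.put("age:35", "user_0002")
age_index.put("age:19", "user_0042")

# Range retrieval: everyone aged 18 (inclusive) to 21 (exclusive).
young = age_index.scan("age:18", "age:21")
```

Because the row key here is the bare index value, two users of the same age would overwrite each other; a real index table has to make the row key unique in some way.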
How to join?
In a multiple-index (multi-table) join scenario, there are two main reference schemes:
1, scan each single-index table separately, one per index, and finally merge the scan results.
This scheme is simple, but if the index scans return a large amount of data, the merge becomes a bottleneck.
For example, suppose there is a user information table with 100 million rows and two indexes, birthplace and age, and I want the first 10 users, ordered by user ID, who were born in Hangzhou and are 20 years old.
One approach is for the system to first scan the birthplace index for Hangzhou and obtain a result set of user IDs, say 100,000 of them.
It then scans the age index, which returns 50,000, and finally merges these user IDs, deduplicates them, and sorts them to get the result.
This is obviously a problem. How can it be improved?
If the birthplace and age results were both ordered, the amount of data to merge could be reduced. But HBase sorts only by row key; values cannot be sorted.
The workaround is to store the user ID redundantly in the row key. The scheme works as follows:
The intersection produced by the merge is the desired list; ordering is guaranteed lexicographically because _id is appended to the index value in the row key.
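A minimal sketch of this merge, assuming row keys of the hypothetical form `indexvalue_userid`: each index scan then yields user IDs in ascending order, so the two streams can be intersected with a two-pointer merge that stops as soon as enough results are found. The helper names and sample data are invented for illustration, and the linear prefix filter stands in for a real start/stop-row scan.

```python
def scan_ids(index_rows, value):
    """Prefix-scan a sorted list of 'value_userid' row keys; yield user IDs in order."""
    prefix = value + "_"
    for row_key in index_rows:
        if row_key.startswith(prefix):
            yield row_key[len(prefix):]


def intersect_sorted(a, b, limit):
    """Two-pointer merge of two ascending ID streams; stop after `limit` matches."""
    result = []
    a, b = iter(a), iter(b)
    x, y = next(a, None), next(b, None)
    while x is not None and y is not None and len(result) < limit:
        if x == y:
            result.append(x)
            x, y = next(a, None), next(b, None)
        elif x < y:
            x = next(a, None)
        else:
            y = next(b, None)
    return result


birthplace_index = sorted(["hangzhou_u001", "hangzhou_u003", "hangzhou_u004", "shanghai_u002"])
age_index = sorted(["20_u001", "20_u002", "20_u004", "21_u003"])

top = intersect_sorted(
    scan_ids(birthplace_index, "hangzhou"),
    scan_ids(age_index, "20"),
    limit=10,
)
```

Since both streams are consumed lazily, a query for the first 10 users does not need to read the full 100,000 and 50,000 entries unless matches are sparse.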
2, build a composite index per query type.
In scheme 1, imagine what happens with as many as 10 single indexes: 10 indexes mean 10 merges, and the performance is easy to imagine.
Solving this requires borrowing the composite index implementation of RDBMSs.
For example, if birthplace and age need to be queried together, building a composite index on birthplace and age makes the query far more efficient than merging.
Of course, this index also needs the user ID stored redundantly in the row key so that results come back naturally ordered.
The advantage of this scheme is that queries are very fast: for a given query condition, the result list is retrieved from a single table. The disadvantage is that several query conditions require several corresponding composite indexes, which increases storage pressure.
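The composite index can be sketched the same way. The row key layout below (birthplace, zero-padded age, user ID, joined by underscores) is an assumed layout chosen for the example, not one given in the article; the point is that a single prefix scan on one table answers the combined query.

```python
import bisect


def composite_key(birthplace, age, user_id):
    """Row key = birthplace + zero-padded age + user ID (hypothetical layout)."""
    return f"{birthplace}_{age:03d}_{user_id}"


rows = sorted([
    composite_key("hangzhou", 20, "u004"),
    composite_key("hangzhou", 20, "u001"),
    composite_key("hangzhou", 21, "u003"),
    composite_key("shanghai", 20, "u002"),
])


def prefix_scan(sorted_rows, prefix, limit):
    """Return up to `limit` row keys that start with `prefix`, in key order."""
    start = bisect.bisect_left(sorted_rows, prefix)
    out = []
    for key in sorted_rows[start:]:
        if not key.startswith(prefix) or len(out) == limit:
            break
        out.append(key)
    return out


# Born in Hangzhou AND aged 20, first 10 by user ID, from one table.
hits = prefix_scan(rows, "hangzhou_020_", limit=10)
user_ids = [key.rsplit("_", 1)[1] for key in hits]
```

The age is zero-padded so that lexicographic order of the row keys matches numeric order of the ages.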
When designing the schema, designers need to take full account of the characteristics of the scenario and combine scheme 1 and scheme 2 as appropriate. The following is a simple comparison:
| | Single index | Composite index |
|---|---|---|
| Retrieval performance | Excellent | Excellent |
| Storage | No redundant data; saves storage. | Redundant data; wastes storage. |
| Transactions | Multiple indexes make transactions difficult to guarantee. | Multiple indexes make transactions difficult to guarantee. |
| Join | Poor performance | Excellent performance |
| count, sum, avg, etc. | Full scan of the qualifying result set | Full scan of the qualifying result set |
As the table above shows, transactional guarantees on update are difficult under either scheme. If the business system can accept eventual consistency, transactions are somewhat easier to handle. Otherwise, one can only rely on complex distributed transaction technologies such as JTA and Chubby.
Aggregate functions such as count, sum, avg, max, and min can only be computed in HBase by brute-force scanning, and, sadly, some hack may be needed (such as adding a CF whose value is null); otherwise the scan may have to pass all the data back to the client.
Of course, some optimizations are possible here, such as adding a state table, but the added complexity brings higher risk.
The ultimate solution is to offer only previous-page and next-page navigation in the business, which is perhaps the simplest and most effective approach.
2, single table with multiple column families, column-based index
HBase provides the column family (CF) attribute.
A column index uses a family as the index: the index values are spread across qualifiers, and the multiple column values are ordered by version (HBase guarantees that CF, qualifier, and version are ordered, with CF and qualifier ascending and version descending).
A typical example is a user who sells many goods whose titles need to support a LIKE '%title%' query. The traditional RDBMS approach is a fuzzy query; the search engine approach is word segmentation plus an inverted index.
In HBase, a fuzzy query obviously does not meet our requirements, so the data can only be stored as word segmentation plus an inverted index, using a CF-based inverted index structure.
To fetch data, locate a row by user ID (the row key), then locate the qualifier by the segmented word, and then take the top N records from the ordered version list. One problem you may notice is that the total size of the version list can only be known by scanning the whole list, so the business system has to make some improvements of its own here.
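The lookup path just described (row key, then qualifier, then versions) can be mimicked with nested dictionaries. This is only a sketch of the data layout, not HBase code: the whitespace tokenizer, the `InvertedRow` class, and the sample titles are assumptions made for the example.

```python
from collections import defaultdict


class InvertedRow:
    """One row per user: qualifier = token, versions = (version, item ID) cells."""

    def __init__(self):
        self.cells = defaultdict(list)  # token -> [(version, item_id), ...]

    def put(self, token, version, item_id):
        self.cells[token].append((version, item_id))
        # Keep the newest version first, mirroring HBase's descending version order.
        self.cells[token].sort(reverse=True)

    def top_n(self, token, n):
        return [item_id for _, item_id in self.cells[token][:n]]


def tokenize(title):
    """Naive whitespace word segmentation; a real system would use a proper tokenizer."""
    return set(title.lower().split())


table = defaultdict(InvertedRow)  # row key (user ID) -> row


def index_item(user_id, item_id, title, version):
    for token in tokenize(title):
        table[user_id].put(token, version, item_id)


index_item("seller_1", "item_a", "Red cotton shirt", version=100)
index_item("seller_1", "item_b", "Blue cotton jeans", version=200)
index_item("seller_1", "item_c", "Cotton socks", version=300)

# Row key -> qualifier -> newest two versions.
latest_cotton = table["seller_1"].top_n("cotton", 2)
```

Note that `top_n` never learns how many items match in total, which is the counting problem mentioned above.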
How to join?
The implementation is the same as the join in scheme 1: the scan results of multiple CF column indexes need to be merged, taking the conjunction of the query results of the individual indexes.
Comparing the two schemes, it may seem that a table has simply been swapped for a column, but this scheme has one major benefit: it solves the transactional problem. All the indexes are bound to a single row key, and HBase guarantees that an update to a single row is atomic; this is the natural advantage of this scheme. When only single indexes are needed, a column-based index is a better choice than a single-table index.
Composite indexes can also be implemented, as a compromise, at this scheme's storage granularity. Readers who understand this storage pattern may have guessed how: based on the qualifier.
The following table compares the advantages and disadvantages of table and column indexes:
| | Column index | Table index |
|---|---|---|
| Retrieval performance | Retrieving data takes multiple scans: first the row key, then the qualifier, then the version. | Only one scan of the row key is needed. |
| Storage | Saves storage when there is no composite index | Saves storage when there is no composite index |
| Transactions | Easy to guarantee | Harder to guarantee |
| Join | Performance is poor; it improves only when a combined-condition qualifier is established | Performance is poor; it improves only when a composite table index is established |
| Additional issues | 1, the number of versions of each qualifier in the same row is limited and cannot exceed the maximum value of int. (Do not assume this value is large; with massive data storage, hundreds of millions is very common.) 2, getting the total version count requires extra processing. 3, when a single row's data exceeds the split size, it can cause compaction problems or a memory crunch during compaction, increasing risk. | |
| count, sum, avg, etc. | Full scan of the qualifying result set | Full scan of the qualifying result set |
Although the column index has so many disadvantages, the cost advantage of its storage savings is sometimes worth it, not to mention that it solves the transactional problem; users need to weigh these for themselves.
It is worth mentioning that Facebook's messaging application server is based on a similar scheme.
3, ITHBase
The multi-table approach in scheme 1 solves the performance problem but brings storage redundancy and data consistency issues. Resolving either one of these two would satisfy the majority of business scenarios.
This scheme focuses on data consistency. The full name of ITHBase is Indexed Transactional HBase, and as the name shows, transactions are one of its important features.
A brief introduction to how ITHBase transactions work
A transaction table, __global_trx_log__, is created, and the state is recorded in it each time a transaction is opened. Because it is built on HBase's HTable, the transaction table also writes a WAL for recovery, but the log format has been modified by ITHBase and is called THLog.
When the client updates multiple tables, it starts a transaction and then passes the transaction ID to the HRegionServer on each put.
By subclassing the HRegionServer and HRegion classes, ITHBase overrides most of the operation interface methods, such as put, update, and delete, in order to obtain the transaction ID and state.
When the server receives an operation and transaction ID, it first acknowledges receipt and marks the current transaction as pending write (a put must be issued again). When all table operations are complete, the client issues the commit for all of them together, which makes this a two-phase commit.
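The flow above follows the general shape of two-phase commit, which the sketch below illustrates in a generic form. It is not ITHBase's code or API: the `Table` class, the `healthy` flag used to force an abort, and the dictionary standing in for the transaction table are all assumptions made for the example.

```python
class Table:
    """Toy participant: buffers writes per transaction until commit."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.pending = {}    # transaction ID -> list of (row_key, value)
        self.healthy = True  # hypothetical switch to simulate a failed prepare

    def put(self, tx_id, row_key, value):
        self.pending.setdefault(tx_id, []).append((row_key, value))

    def prepare(self, tx_id):
        # Phase 1: vote yes only if the buffered writes can be applied.
        return self.healthy and tx_id in self.pending

    def commit(self, tx_id):
        # Phase 2: make the buffered writes visible.
        for row_key, value in self.pending.pop(tx_id, []):
            self.rows[row_key] = value

    def abort(self, tx_id):
        self.pending.pop(tx_id, None)


def run_transaction(tx_id, writes, log):
    """writes: list of (table, row_key, value); log stands in for the transaction table."""
    log[tx_id] = "PENDING"
    tables = []
    for table, row_key, value in writes:
        table.put(tx_id, row_key, value)
        if table not in tables:
            tables.append(table)
    if all(t.prepare(tx_id) for t in tables):
        for t in tables:
            t.commit(tx_id)
        log[tx_id] = "COMMITTED"
    else:
        for t in tables:
            t.abort(tx_id)
        log[tx_id] = "ABORTED"
    return log[tx_id]


users = Table("users")
age_index = Table("age_index")
tx_log = {}

# Data row and index row are written under one transaction ID.
status = run_transaction(
    1, [(users, "u001", {"age": 20}), (age_index, "020_u001", "u001")], tx_log
)
```

If any participant votes no in the first phase, none of the buffered writes become visible, so the data table and the index table cannot drift apart.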
4, MapReduce
There is not much to say about this scheme: it saves storage and needs no index table, since results are derived purely through the cluster's computing power. But it is generally not suitable for online business.
5, Coprocessor
Coprocessor is a new feature under development for the official 0.92.0 release, with support for region-level indexing. See:
https://issues.apache.org/jira/browse/HBASE-2038
The coprocessor mechanism can be understood as a set of callback functions added on the server side. These callback functions are as follows:
The Coprocessor interface defines these hooks:
- preOpen, postOpen: called before and after the region is reported as online to the master.
- preFlush, postFlush: called before and after the MemStore is flushed into a new store file.
- preCompact, postCompact: called before and after compaction.
- preSplit, postSplit: called after the region is split.
- preClose and postClose: called before and after the region is reported as closed to the master.
The RegionObserver interface defines these hooks:
- preGet, postGet: called before and after a client makes a Get request.
- preExists, postExists: called before and after the client tests for existence using a Get.
- prePut and postPut: called before and after the client stores a value.
- preDelete and postDelete: called before and after the client deletes a value.
- preScannerOpen, postScannerOpen: called before and after the client opens a new scanner.
- preScannerNext, postScannerNext: called before and after the client asks for the next row on a scanner.
- preScannerClose, postScannerClose: called before and after the client closes a scanner.
- preCheckAndPut, postCheckAndPut: called before and after the client calls checkAndPut().
- preCheckAndDelete, postCheckAndDelete: called before and after the client calls checkAndDelete().
With these hooks, you can implement a region-level secondary index and perform aggregations such as count, sum, avg, max, and min without returning all the data, as described in https://issues.apache.org/jira/browse/HBASE-1512.
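To make the callback idea concrete, here is a toy model in Python. It is not the HBase coprocessor API: `Region`, `AgeIndexObserver`, and the `pre_put`/`post_put` method names are stand-ins for the put hooks listed above, and the `age` column is invented for the example. The observer keeps a region-local index up to date and answers a count from it, so only a number would travel back to the client.

```python
class Region:
    """Toy region that invokes observer hooks around put."""

    def __init__(self):
        self.rows = {}
        self.observers = []

    def put(self, row_key, columns):
        for obs in self.observers:
            obs.pre_put(self, row_key, columns)
        self.rows[row_key] = columns
        for obs in self.observers:
            obs.post_put(self, row_key, columns)


class AgeIndexObserver:
    """Keeps a region-local index (age -> row keys) and serves counts from it."""

    def __init__(self):
        self.index = {}

    def pre_put(self, region, row_key, columns):
        # Drop the stale index entry if the row is being overwritten.
        old = region.rows.get(row_key)
        if old is not None:
            self.index[old["age"]].discard(row_key)

    def post_put(self, region, row_key, columns):
        self.index.setdefault(columns["age"], set()).add(row_key)

    def count(self, age):
        # Server-side aggregate: only the number goes back to the caller.
        return len(self.index.get(age, ()))


region = Region()
observer = AgeIndexObserver()
region.observers.append(observer)

region.put("u001", {"age": 20})
region.put("u002", {"age": 20})
region.put("u001", {"age": 21})  # the update moves u001 from age 20 to age 21
```

Because the index lives inside one region, a count over the whole table would still have to combine the per-region answers.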
Speculation on how the secondary index will work
Since the final Coprocessor design has not yet been published, the secondary index implementation is presumably to intercept the put, get, scan, and delete operations on a region through these hooks, while maintaining an index CF in the same region to build the corresponding index table.
A region-based index table has a number of limitations; for example, global ordering is difficult to achieve.
But I think Coprocessor's biggest advantage is that it provides full server-side extensibility, which is a big leap forward for HBase.
How to join?
It is not yet released, but a fundamental breakthrough seems unlikely. The solutions are still nothing more than merge and composite indexes, and transactions remain one of the challenges to solve.
Publicly available secondary index schemes in the industry:
0.19.3 secondary index
Those who have followed HBase for a while may know that the secondary index feature was added as early as the HBase 0.19.3 release; see the issue here.
The example of its use is also simple: http://blog.rajeevsharma.in/2009/06/secondary-indexes-in-hbase.html
The 0.19.3 secondary index provides index scans by storing column values in the row key.
However, HBase's early requirements came mainly from Hadoop. The complexity of transactions and the ITHBase compatibility problems found in hadoop-core led the project to strip this code out of hbase-core in 0.20.0 and make it a contrib third-party extension instead; see the issue here.
Transactional tableindexed (ITHBase)
This is the 0.19.3 scheme that was officially stripped out of the core into a third-party extension; it has been described above. It currently supports the latest HBase 0.90.
Whether its stability is of industrial strength is the main obstacle for users considering it.
https://github.com/hbase-trx/hbase-transactional-tableindexed
Facebook's solution
Facebook uses the single-table, multi-column index solution mentioned above. It solves the data consistency problem perfectly, which is largely down to their usage scenario.
Interested readers can read this blog post; this article does not go into detail:
blog.huihoo.com/?p=688
HBase official scheme: Coprocessor, under development for 0.92.0
Not yet released, but the official HBase blog has an introduction: http://hbaseblog.com/2010/11/30/hbase-coprocessors
Lily HBase Indexing Library
This is a framework for building, querying, and managing indexes. Its structure manages multiple Indexdata index tables through a single Indexmeta table.
Its distinguishing feature is a very complete set of row key sorting mechanisms for int, string, UTF-8, decimal, and other types. The mechanism is described in detail in this blog post:
http://brunodumon.wordpress.com/2010/02/17/building-indexes-using-hbase-mapping-strings-numbers-and-dates-onto-bytes/
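To show why such a mapping is needed at all, the sketch below uses one common technique for signed integers: write the value big-endian with the sign bit flipped, so that comparing the bytes lexicographically gives the same order as comparing the numbers. This is a general illustration and is not taken from the Lily library; `encode_int` is a hypothetical helper.

```python
import struct


def encode_int(value):
    """Encode a signed 32-bit int so that byte order matches numeric order."""
    # Adding 2**31 is equivalent to flipping the sign bit of the two's-complement form.
    return struct.pack(">I", (value + 2**31) & 0xFFFFFFFF)


values = [5, -1, 300, -70000, 0]

# Sorting by the encoded bytes, as a row key comparison would.
by_bytes = sorted(values, key=encode_int)
```

Without the sign-bit flip, negative numbers would sort after positive ones, because their two's-complement encoding starts with a 1 bit.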
In addition, the framework provides wrapped Conjunction and Disjunction utility classes for the join scenario (the principle is merge).
The HBase Indexing Library also provides a convenient interface for index building.
IHBase
IHBase is very similar to ITHBase. IHBase also extends HBase at the source-code level, redefining and reimplementing some server-side and client-side processing logic, so it is highly intrusive. Unfortunately, the tool has not been updated since a fix for an HBase 0.20.5 compatibility bug. Whether it supports 0.90 or later versions, the author has not tried. A comparison of IHBase and ITHBase (from its author):
| Feature | ITHBase | IHBase | Comment |
|---|---|---|---|
| Global ordering | Yes | No | IHBase has an index for each region. The flip side of not having global ordering is compatibility with the good old HRegion: results come back in row order (and not value order as in ITHBase) |
| Full table scan? | No | No | ITHBase does a partial scan on the index table. ITHBase supports specifying start/end rows to limit the number of scanned regions |
| Multiple index usage | No | Yes | IHBase can take advantage of multiple indexes in the same scan. The IHBase IdxScan object accepts an Expression which allows intersection/union of several indexed column criteria |
| Extra disk storage | Yes | No | IHBase indexes are created when the region starts/flushes and do not require any extra storage |
| Extra RAM | Yes | Yes | IHBase indexes are in memory and hence increase the memory overhead. ITHBase indexes increase the number of regions each region server has to support, thus costing memory too |
| Parallel scanning support | No | Yes | In ITHBase the index table needs to be consulted and then GETs are issued for each matching row. The behavior of IHBase (as perceived by the client) is no different from a regular scan and hence supports parallel scanning seamlessly. Parallel GET can be implemented to speed up ITHBase scans |
Principle introduction
When the MemStore is full, IHBase intercepts the request and builds an index of the MemStore data. The index is stored in the table as another CF. Only region-level indexing (similar to Coprocessor) is supported.
During a scan, IHBase speeds up the scan by combining the markers in the index column. http://github.com/ykulbak/ihbase
HBase secondary index and join