The Design of HBase MVCC and a Scan Problem Caused by MVCC

Last updated: 2018-12-05  Source: Internet

Uploaded by: User


While using HBase 0.94 recently, I occasionally saw the warning "HRegionInfo was null or empty in Meta":

    java.io.IOException: HRegionInfo was null or empty in Meta for writetest, row=lot_let,9399239430349923234234,99999999999999
        at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:170)

The client-side implementation of MetaScanner.metaScan contains the following:

    metaTable = new HTable(configuration, HConstants.META_TABLE_NAME);
    Result startRowResult = metaTable.getRowOrBefore(searchRow, HConstants.CATALOG_FAMILY);
    if (startRowResult == null) {
      throw new TableNotFoundException("Cannot find row in .META. for table: "
          + Bytes.toString(tableName) + ", row=" + Bytes.toStringBinary(searchRow));
    }
    byte[] value = startRowResult.getValue(HConstants.CATALOG_FAMILY,
        HConstants.REGIONINFO_QUALIFIER);
    if (value == null || value.length == 0) {
      throw new IOException("HRegionInfo was null or empty in Meta for "
          + Bytes.toString(tableName) + ", row=" + Bytes.toStringBinary(searchRow));
    }

The exception comes from the second check: getRowOrBefore returned a result, but it carried no region info, so from the client's point of view the range containing the rowkey does not exist in the META table. Following the RPC call leads to the server-side implementation.
In HRegion:

    public Result getClosestRowBefore(final byte [] row, final byte [] family)
        throws IOException {
      if (coprocessorHost != null) {
        Result result = new Result();
        if (coprocessorHost.preGetClosestRowBefore(row, family, result)) {
          return result;
        }
      }
      // look across all the HStores for this region and determine what the
      // closest key is across all column families, since the data may be sparse
      checkRow(row, "getClosestRowBefore");
      startRegionOperation();
      this.readRequestsCount.increment();
      try {
        Store store = getStore(family);
        // get the closest key. (HStore.getRowKeyAtOrBefore can return null)
        KeyValue key = store.getRowKeyAtOrBefore(row);
        Result result = null;
        if (key != null) {
          Get get = new Get(key.getRow());
          get.addFamily(family);
          result = get(get, null);
        }
        if (coprocessorHost != null) {
          coprocessorHost.postGetClosestRowBefore(row, family, result);
        }
        return result;
      } finally {
        closeRegionOperation();
      }
    }

The call KeyValue key = store.getRowKeyAtOrBefore(row) did find a rowkey in the META table, but the Get issued afterwards inside the if (key != null) block, result = get(get, null), came back with an empty result. That empty result is what triggers the warning.

Why can this happen? First, a word on how MVCC works in HBase. MVCC is the mechanism that keeps data consistent. A write in HBase goes through several stages: write the HLog, write the memstore, update MVCC. A memstore write only counts as truly successful once MVCC has been updated. Transaction isolation is controlled by MVCC as well: for example, a reader must not see data that another thread has not yet committed.

1. Both put and delete call applyFamilyMapToMemstore in HRegion:

    private long applyFamilyMapToMemstore(Map<byte[], List<KeyValue>> familyMap,
        MultiVersionConsistencyControl.WriteEntry localizedWriteEntry) {
      long size = 0;
      boolean freemvcc = false;

      try {
        if (localizedWriteEntry == null) {
          // Begin a memstore write: increments memstoreWrite in mvcc and
          // adds the entry to the queue of pending writes
          localizedWriteEntry = mvcc.beginMemstoreInsert();
          freemvcc = true;
        }

        for (Map.Entry<byte[], List<KeyValue>> e : familyMap.entrySet()) {
          byte[] family = e.getKey();
          List<KeyValue> edits = e.getValue();

          Store store = getStore(family);
          for (KeyValue kv: edits) {
            kv.setMemstoreTS(localizedWriteEntry.getWriteNumber());
            size += store.add(kv);
          }
        }
      } finally {
        if (freemvcc) {
          mvcc.completeMemstoreInsert(localizedWriteEntry);
        }
      }

      return size;
    }
mvcc.completeMemstoreInsert updates the MVCC memstoreRead, that is, the position up to which data may be read, and calls readWaiters.notifyAll() to release any thread that flushcache has blocked in waitForRead. waitForRead looks like this:

    public void waitForRead(WriteEntry e) {
      boolean interrupted = false;
      synchronized (readWaiters) {
        // memstoreRead below the write number means some writes are still uncommitted
        while (memstoreRead < e.getWriteNumber()) {
          try {
            readWaiters.wait(0);
          } catch (InterruptedException ie) {
            // We were interrupted... finish the loop -- i.e. cleanup --and then
            // on our way out, reset the interrupt flag.
            interrupted = true;
          }
        }
      }
      if (interrupted) Thread.currentThread().interrupt();
    }

2. During flushcache, after the KeyValues in the memstore have been captured, mvcc.waitForRead(w) is called. The captured memstore contains every KeyValue, including those that are not yet truly committed, so the flush has to wait for the other transactions to commit before it continues. This keeps the transactions consistent.

    w = mvcc.beginMemstoreInsert();
    mvcc.advanceMemstore(w);
    mvcc.waitForRead(w);
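The bookkeeping in steps 1 and 2 can be sketched with a toy model. This is not the HBase class: ToyMvcc and its members are hypothetical names chosen to mirror the ones quoted above, and the sketch assumes the read point only moves forward over a contiguous run of completed writes, so a later write that finishes first stays invisible until the earlier ones finish too.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of a region's MVCC counters. Hypothetical; not the HBase implementation. */
final class ToyMvcc {
    /** One pending memstore write, identified by its write number. */
    static final class WriteEntry {
        final long writeNumber;
        boolean completed;
        WriteEntry(long writeNumber) { this.writeNumber = writeNumber; }
    }

    private long memstoreWrite = 0; // last write number handed out
    private long memstoreRead = 0;  // highest write number visible to readers
    private final Deque<WriteEntry> writeQueue = new ArrayDeque<>();

    /** Start a write: hand out the next write number and queue the entry as pending. */
    synchronized WriteEntry beginMemstoreInsert() {
        WriteEntry e = new WriteEntry(++memstoreWrite);
        writeQueue.addLast(e);
        return e;
    }

    /** Finish a write and advance the read point over the completed head of the queue. */
    synchronized void completeMemstoreInsert(WriteEntry e) {
        e.completed = true;
        while (!writeQueue.isEmpty() && writeQueue.peekFirst().completed) {
            memstoreRead = writeQueue.removeFirst().writeNumber;
        }
        notifyAll(); // wake any thread blocked in waitForRead
    }

    /** Block until every write up to and including e is visible (what a flush waits for). */
    synchronized void waitForRead(WriteEntry e) throws InterruptedException {
        while (memstoreRead < e.writeNumber) {
            wait();
        }
    }

    /** The read point a new scanner would be given. */
    synchronized long readPoint() {
        return memstoreRead;
    }
}
```

If two writes begin and the second completes first, readPoint() stays where it was until the first completes as well. That window, in which a KeyValue is already in the memstore but above the read point, is the one that matters for the rest of the analysis.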
3. Scanning data. The RegionScannerImpl.next method begins as follows:

    public synchronized boolean next(List<KeyValue> outResults, int limit)
        throws IOException {
      if (this.filterClosed) {
        throw new UnknownScannerException("Scanner was closed (timed out?) "
            + "after we renewed it. Could be caused by a very slow scanner "
            + "or a lengthy garbage collection");
      }
      startRegionOperation();
      readRequestsCount.increment();
      try {
        // This could be a new thread from the last time we called next().
        // this.readPt was initialized when the scanner was constructed (to the
        // memstoreRead of this HRegion's mvcc, i.e. the current readable point);
        // bind it to the current thread
        MultiVersionConsistencyControl.setThreadReadPoint(this.readPt);
        // ... (the excerpt ends here)

The MemStore then filters out transactions that have not been committed yet (a newer KeyValue carries a newer write point):

    protected KeyValue getNext(Iterator<KeyValue> it) {
      long readPoint = MultiVersionConsistencyControl.getThreadReadPoint();
      while (it.hasNext()) {
        KeyValue v = it.next();
        // skip any KeyValue newer than the current thread's readPoint
        if (v.getMemstoreTS() <= readPoint) {
          return v;
        }
      }
      return null;
    }

With the whole MVCC flow in mind, look again at getClosestRowBefore in HRegion. The call KeyValue key = store.getRowKeyAtOrBefore(row) is not subject to MVCC control and can see everything in the memstore, whereas the get method is subject to MVCC control. One possible explanation, then, is that at the time of the get call, the key returned by store.getRowKeyAtOrBefore(row) had not been committed yet, so all of its KeyValues were filtered out and the lookup returned an empty result.
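The suspected mismatch can be reproduced with a toy memstore. Everything here is hypothetical and only stands in for the real classes: ToyMemstore maps a row key to the write number of its cell, rowAtOrBeforeUnfiltered plays the role of the lookup without a read-point check, isVisible plays the role of the MVCC-controlled get, and rowAtOrBeforeFiltered is an illustrative read-point-aware variant, not something taken from HBase.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy memstore: row key -> write number (memstoreTS) of the cell stored for that row. */
final class ToyMemstore {
    private final TreeMap<String, Long> rows = new TreeMap<>();

    void add(String row, long memstoreTs) {
        rows.put(row, memstoreTs);
    }

    /** Lookup with no read-point check: sees every cell, committed or not. */
    String rowAtOrBeforeUnfiltered(String row) {
        return rows.floorKey(row);
    }

    /** Read-point check as in getNext: a cell is visible only if memstoreTS <= readPoint. */
    boolean isVisible(String row, long readPoint) {
        Long ts = rows.get(row);
        return ts != null && ts <= readPoint;
    }

    /** Hypothetical read-point-aware lookup: walk backwards to the nearest visible row. */
    String rowAtOrBeforeFiltered(String row, long readPoint) {
        for (Map.Entry<String, Long> e : rows.headMap(row, true).descendingMap().entrySet()) {
            if (e.getValue() <= readPoint) {
                return e.getKey();
            }
        }
        return null;
    }
}
```

Suppose row "t,a" was written with write number 1 and committed, row "t,m" was written with write number 2 and is still pending, and the read point is 1. The unfiltered lookup for "t,z" returns "t,m", yet isVisible("t,m", 1) is false, which corresponds to a key being found and the following get returning nothing. The filtered variant would fall back to "t,a" instead.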
