HBase 1.0.0 Source Code Analysis: The Request Processing Flow, Using the Put Operation as an Example (Part 1)
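The code below is a simple example of an HBase Put operation. The line Connection connection = ConnectionFactory.createConnection(conf) was already covered in the previous post, "HBase 1.0.0 Source Code Analysis: The Client Startup and Connection Flow", which describes how the connection is established and which services it starts.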
TableName tn = TableName.valueOf("test010");
try (Connection connection = ConnectionFactory.createConnection(conf)) {
    try (Table table = connection.getTable(tn)) {
        Put put = new Put("ROW1".getBytes());
        put.addColumn("CF1".getBytes(), "column1".getBytes(), "value1".getBytes());
        put.addColumn("CF2".getBytes(), "column1".getBytes(), "value1".getBytes());
        table.put(put);
        System.out.println("done!");
    }
}
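This post focuses on how a put is passed, step by step, to the server and then invoked there. First, it is worth reviewing the class structure around Connection, as shown in the class diagram: HConnectionImplementation is the class that actually handles the connection to the server. To operate on a table's data, as the put in the example does, we first need an instance of Table, which can be obtained from the connection: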
public HTableInterface getTable(TableName tableName, ExecutorService pool) throws IOException {
    if (managed) {
        throw new NeedUnmanagedConnectionException();
    }
    return new HTable(tableName, this, tableConfig, rpcCallerFactory, rpcControllerFactory, pool);
}
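Table is really just an operations interface; the actual implementation class is HTable. HTable handles data-level operations, such as inserts and deletes, against a single HBase table. The class is currently internal to HBase, and the public-facing interface is Table. Once we have an HTable instance, the next step is to execute the operation: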
/**
 * {@inheritDoc}
 * @throws IOException
 */
@Override
public void put(final Put put) throws IOException {
    getBufferedMutator().mutate(put);
    if (autoFlush) {
        flushCommits();
    }
}
The code above is the put method of HTable. It sets off a chain of calls, which we will go through one by one, starting with getBufferedMutator().
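This function returns an instance of the implementation class BufferedMutatorImpl. Like HTable, this class communicates with a single HBase table, but it handles put operations in batches and is able to execute them asynchronously.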
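To make the relationship between put, the buffered mutator, and autoFlush concrete, here is a minimal sketch in plain Java. It does not use the HBase API: ToyTable, ToyBufferedMutator, and AutoFlushDemo are hypothetical names invented for illustration, and operations are modeled as strings. It only mirrors the shape of the HTable.put method shown earlier (buffer first, then flush if autoFlush is set).

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point: compares how many batches are sent with and without autoFlush.
class AutoFlushDemo {
    public static void main(String[] args) {
        ToyTable eager = new ToyTable(true);
        ToyTable buffered = new ToyTable(false);
        for (int i = 0; i < 3; i++) {
            eager.put("put-" + i);
            buffered.put("put-" + i);
        }
        buffered.flushCommits(); // explicit flush sends everything at once
        System.out.println("autoFlush=true  batches: " + eager.batchesSent());    // 3
        System.out.println("autoFlush=false batches: " + buffered.batchesSent()); // 1
    }
}

// Hypothetical stand-in for a buffered mutator: collects operations, sends them as one batch.
class ToyBufferedMutator {
    private final List<String> buffer = new ArrayList<>();
    final List<List<String>> sentBatches = new ArrayList<>(); // what the "server" received

    void mutate(String op) {
        buffer.add(op); // buffered only; nothing is sent yet
    }

    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sentBatches.add(new ArrayList<>(buffer)); // one batch per flush
        buffer.clear();
    }
}

// Hypothetical stand-in for a table whose put() has the same two steps as HTable.put.
class ToyTable {
    private final ToyBufferedMutator mutator = new ToyBufferedMutator();
    private final boolean autoFlush;

    ToyTable(boolean autoFlush) {
        this.autoFlush = autoFlush;
    }

    void put(String op) {
        mutator.mutate(op);
        if (autoFlush) {
            mutator.flush();
        }
    }

    void flushCommits() {
        mutator.flush();
    }

    int batchesSent() {
        return mutator.sentBatches.size();
    }
}
```

With autoFlush enabled, every put becomes its own round trip in this toy model; with it disabled, the three puts travel together in a single batch.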
Internally, mutate calls the doMutate method:
private void doMutate(Mutation m) throws InterruptedIOException,
        RetriesExhaustedWithDetailsException {
    if (closed) {
        throw new IllegalStateException("Cannot put when the BufferedMutator is closed.");
    }
    if (!(m instanceof Put) && !(m instanceof Delete)) {
        throw new IllegalArgumentException("Pass a Delete or a Put");
    }

    // This behavior is highly non-intuitive... it does not protect us against
    // 94-incompatible behavior, which is a timing issue because hasError, the below code
    // and setter of hasError are not synchronized. Perhaps it should be removed.
    if (ap.hasError()) {
        writeAsyncBuffer.add(m);
        backgroundFlushCommits(true);
    }

    if (m instanceof Put) {
        validatePut((Put) m);
    }

    currentWriteBufferSize += m.heapSize();
    writeAsyncBuffer.add(m);

    while (currentWriteBufferSize > writeBufferSize) {
        backgroundFlushCommits(false);
    }
}
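The line that matters here is writeAsyncBuffer.add(m), which simply adds the operation to an asynchronous buffer. The buffer is then flushed once a condition is met (in the code above, when currentWriteBufferSize exceeds writeBufferSize), and the operation is only really executed when that flush happens. The main purpose is to improve the system's throughput. Next, let's look inside the flush operation.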
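Before that, the size-triggered buffering rule can be sketched in isolation. This is a simplified illustration in plain Java, not HBase code: SizeBoundedBuffer and WriteBufferDemo are hypothetical names, operations are strings, and each operation's heap size is passed in by the caller. Unlike the real client, this flush always drains the whole buffer, so the while loop runs at most once per add.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point: three 40-byte operations against a 100-byte limit.
class WriteBufferDemo {
    public static void main(String[] args) {
        SizeBoundedBuffer buf = new SizeBoundedBuffer(100); // hypothetical 100-byte limit
        buf.add("put-1", 40);
        buf.add("put-2", 40);
        System.out.println("flushes after 80 bytes:  " + buf.flushCount()); // 0
        buf.add("put-3", 40); // 120 > 100, so this add triggers a flush
        System.out.println("flushes after 120 bytes: " + buf.flushCount()); // 1
        System.out.println("pending operations:      " + buf.pendingCount()); // 0
    }
}

// Hypothetical buffer that flushes itself once the accumulated size exceeds a limit.
class SizeBoundedBuffer {
    private final long maxBytes;
    private long currentBytes = 0;
    private final List<String> pending = new ArrayList<>();
    private int flushes = 0;

    SizeBoundedBuffer(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    void add(String op, long heapSize) {
        currentBytes += heapSize;
        pending.add(op);
        // Same shape as the loop in doMutate: keep flushing while over the limit.
        while (currentBytes > maxBytes) {
            flush();
        }
    }

    void flush() {
        // A real client would send the pending operations to the server here.
        pending.clear();
        currentBytes = 0;
        flushes++;
    }

    int flushCount() {
        return flushes;
    }

    int pendingCount() {
        return pending.size();
    }
}
```

The trade-off is the usual one for write buffering: fewer, larger requests in exchange for operations sitting in client memory until the threshold is crossed or a flush is requested.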
private void backgroundFlushCommits(boolean synchronous) throws InterruptedIOException,
        RetriesExhaustedWithDetailsException {
    try {
        if (!synchronous) {
            ap.submit(tableName, writeAsyncBuffer, true, null, false);
            if (ap.hasError()) {
                LOG.debug(tableName + ": One or more of the operations have failed -"
                    + " waiting for all operation in progress to finish (successfully or not)");
            }
        }
        // ...
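The flush can be submitted either asynchronously or synchronously. Judging from doMutate, the default is asynchronous: once the buffer exceeds its limit, it calls backgroundFlushCommits(false). Here ap is an instance of the AsyncProcess class, which uses multiple threads to issue requests asynchronously and uses Futures to collect the server-side results from those threads. This part of the process is also fairly involved, so I will continue with it in the next post.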
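As a rough picture of the threads-plus-Future pattern described above, here is a minimal sketch built only on java.util.concurrent. It is not how AsyncProcess is implemented: ToyAsyncSubmitter and AsyncSubmitDemo are hypothetical names, and the "RPC" is a stand-in that simply acknowledges every operation in a batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Demo entry point: two batches are sent on worker threads and their results collected.
class AsyncSubmitDemo {
    public static void main(String[] args) {
        List<List<String>> batches = Arrays.asList(
            Arrays.asList("put-1", "put-2"),
            Arrays.asList("put-3"));
        System.out.println("acknowledged: " + ToyAsyncSubmitter.submitAll(batches, 2)); // 3
    }
}

// Hypothetical submitter: one task per batch, results gathered through Futures.
class ToyAsyncSubmitter {
    static int submitAll(List<List<String>> batches, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (final List<String> batch : batches) {
                // Stand-in for one RPC: the "server" acknowledges every operation in the batch.
                Callable<Integer> rpc = () -> batch.size();
                futures.add(pool.submit(rpc)); // returns immediately; work runs on the pool
            }
            int acknowledged = 0;
            for (Future<Integer> future : futures) {
                acknowledged += future.get(); // blocks until that batch's result is ready
            }
            return acknowledged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            throw new IllegalStateException("interrupted while waiting for results", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Submitting all batches before calling get on any Future is what lets the batches run concurrently; calling get right after each submit would serialize them.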