Optimizing complex HBase operations: HTable and HTablePool

Source: Internet
Author: User

HTable provides the main operations on a table, such as put, delete, get, and scan.

HTablePool creates a pool that holds implementations of the HTableInterface interface (generally HTable), which avoids the cost of creating a new HTable each time one is needed.

The default way to create an HTablePool:

new HTablePool(conf, poolSize);

An HTable obtained this way gives you no chance to turn autoflush off, which costs about three quarters of the write speed in scenarios that need high throughput and can tolerate some data loss.

Looking at how HTablePool creates tables shows that each HTable is built by the pool's internal HTableFactory, with no settings applied.

So I created a class, HTableBufferFactory, that implements HTableInterfaceFactory.

It adds these settings when it creates a table:

HTable table = new HTable(config, tableName);
table.setAutoFlush(false);

This disables automatic flushing.
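The pattern described above, a pool that delegates table creation to a pluggable factory, can be sketched without HBase on the classpath. Every type below (SketchTable, SketchTableFactory, PlainFactory, BufferingFactory, SketchTablePool) is a hypothetical stand-in invented for illustration, not an HBase class; the sketch only shows why swapping the factory changes the settings of every table the pool hands out. It is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a table handle; not an HBase type.
class SketchTable {
    final String name;
    boolean autoFlush = true; // the default that the custom factory changes

    SketchTable(String name) {
        this.name = name;
    }
}

// Hypothetical stand-in for the factory that the pool delegates to.
interface SketchTableFactory {
    SketchTable create(String tableName);
}

// Creates tables without applying any settings.
class PlainFactory implements SketchTableFactory {
    public SketchTable create(String tableName) {
        return new SketchTable(tableName);
    }
}

// Configures each table before the pool hands it out.
class BufferingFactory implements SketchTableFactory {
    public SketchTable create(String tableName) {
        SketchTable table = new SketchTable(tableName);
        table.autoFlush = false;
        return table;
    }
}

// Minimal pool: reuses returned tables, otherwise asks the factory for a new one.
class SketchTablePool {
    private final int maxSize;
    private final SketchTableFactory factory;
    private final Map<String, Deque<SketchTable>> idle = new HashMap<>();

    SketchTablePool(int maxSize, SketchTableFactory factory) {
        this.maxSize = maxSize;
        this.factory = factory;
    }

    SketchTable getTable(String tableName) {
        Deque<SketchTable> queue = idle.get(tableName);
        SketchTable pooled = (queue == null) ? null : queue.poll();
        return (pooled != null) ? pooled : factory.create(tableName);
    }

    // Keeps at most maxSize idle tables per table name; extras are dropped.
    void putTable(SketchTable table) {
        Deque<SketchTable> queue = idle.computeIfAbsent(table.name, k -> new ArrayDeque<>());
        if (queue.size() < maxSize) {
            queue.push(table);
        }
    }
}
```

Because the caller only ever talks to the pool, the factory is the one place where a per-table setting such as autoflush can be applied consistently.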

When building a Put, calling put.setWriteToWAL(false) can also improve performance somewhat (so far I have not found where this log is written ...)

Real workloads, however, are not as simple as a stream of puts. Our background processor writes data using put, delete, and incr operations, all of which have to be processed at high speed.

Disabling autoflush may lead to data inconsistency (still to be tested), and it does not avoid the RPC calls made by delete and incr operations. In this case, use HTable.batch to process the operations in batches.

Create List<Row> batch = new ArrayList<Row>(); add operations to it, and submit the batch once it reaches a size limit.

Note that HBase 0.92 does not support incr operations in a batch.
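The "collect operations, submit when the limit is reached" step can be sketched as a small size-triggered batcher. This is a minimal illustration rather than the author's code: OperationBatcher, its limit parameter, and the sink callback are hypothetical names, and the sink stands in for whatever actually submits the list (in the article's setting, a call that passes the list to HTable.batch).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Collects operations and hands them to a sink in groups of at most `limit`.
class OperationBatcher<T> {
    private final int limit;
    private final Consumer<List<T>> sink; // hypothetical: would wrap the real batch call
    private List<T> pending = new ArrayList<>();

    OperationBatcher(int limit, Consumer<List<T>> sink) {
        if (limit < 1) {
            throw new IllegalArgumentException("limit must be at least 1");
        }
        this.limit = limit;
        this.sink = sink;
    }

    // Buffers one operation and submits the batch once the limit is reached.
    void add(T operation) {
        pending.add(operation);
        if (pending.size() >= limit) {
            flush();
        }
    }

    // Submits whatever is buffered; call this before shutdown so nothing is left behind.
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<T> toSend = pending;
        pending = new ArrayList<>();
        sink.accept(toSend);
    }

    int pendingCount() {
        return pending.size();
    }
}
```

Operations still sitting in the batcher are lost if the process dies before flush() runs, so the limit is a trade-off between fewer round trips and more data at risk. The sketch is not thread-safe; with several worker threads, each thread would need its own batcher or external locking.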

I asked about this on the HBase user mailing list, user@hbase.apache.org, and Lars George, author of "HBase: The Definitive Guide", replied within two hours. I love mailing lists!

He said it was not yet supported when he wrote the book. It is supported in 0.94, but it does not appear in the change log because it is a small feature. The JIRA issue is https://issues.apache.org/jira/browse/HBASE-2947

The logs below compare the time taken to process 2000 records before and after batching incr operations and doubling the number of threads.

Before batching and doubling the threads:

07 05 16:15:05 [[ClickDBWorker] 34] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to hbase cost time: 1.045479
07 05 16:15:05 [[ClickDBWorker] 34] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to mysql cost time: 0.014752
07 05 16:15:08 [[ClickDBWorker] 34] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to hbase cost time: 2.638535
07 05 16:15:08 [[ClickDBWorker] 34] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to mysql cost time: 0.217352
07 05 16:15:09 [[ClickDBWorker] 41] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to hbase cost time: 64.197514
07 05 16:15:10 [[ClickDBWorker] 35] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to mysql cost time: 2.379507
07 05 16:15:14 [[ClickDBWorker] 32] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to mysql cost time: 1.904393
07 05 16:15:15 [[ClickDBWorker] 30] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to hbase cost time: 71.706639
07 05 16:15:21 [[ClickDBWorker] 40] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to mysql cost time: 1.533836

After batching and doubling the threads:

07 06 16:33:39 [[ClickDBWorker] 18] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to mysql cost time: 1.127692
07 06 16:33:40 [[ClickDBWorker] 18] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to hbase cost time: 0.800586
07 06 16:33:43 [[ClickDBWorker] 34] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to mysql cost time: 0.532394
07 06 16:33:43 [[ClickDBWorker] 34] INFO c. t. t. m. MsgReactor-[MsgReactor] exploadeAll to hbase cost time: 1.79E-4
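To compare the two logs with numbers rather than by eye, the "cost time" values can be pulled out and averaged per target. The sketch below is an illustration only: CostTimeSummary and its methods are hypothetical names, the regular expression assumes the exact log wording shown above, and the logs do not state the unit of the values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CostTimeSummary {
    // Matches e.g. "exploadeAll to hbase cost time: 1.045479" or "... cost time: 1.79E-4".
    private static final Pattern COST =
            Pattern.compile("exploadeAll to (\\w+) cost time: ([0-9.]+(?:E-?[0-9]+)?)");

    // Groups the cost-time values by target (hbase, mysql), in order of first appearance.
    static Map<String, List<Double>> parse(List<String> logLines) {
        Map<String, List<Double>> byTarget = new LinkedHashMap<>();
        for (String line : logLines) {
            Matcher m = COST.matcher(line);
            if (m.find()) {
                byTarget.computeIfAbsent(m.group(1), k -> new ArrayList<>())
                        .add(Double.parseDouble(m.group(2)));
            }
        }
        return byTarget;
    }

    // Arithmetic mean; NaN for an empty list.
    static double mean(List<Double> values) {
        if (values.isEmpty()) {
            return Double.NaN;
        }
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }
}
```

Applied to the excerpts above, the four hbase entries in the first log average about 34.9 and the two in the second log about 0.40. With so few samples this only indicates the direction of the change, not a reliable speedup factor.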

 

Previously, after writing for a few dozen hours, the allocated memory would fill up. This was probably because the threads processed data too slowly, so the objects they created could not be released.
