Tips for using HBase Scan in MR

Source: Internet
Author: User

In Hadoop MapReduce (MR) jobs, HBase can be used as the input data source. Scan, the object used to iterate over an HTable, has a few settings that are worth tuning for this case.

The methods involved are as follows:

public void setBatch(int batch)
public void setCaching(int caching)
public void setCacheBlocks(boolean cacheBlocks)

public void setBatch(int batch):

Sets the maximum number of columns returned for a row in a single result. The default is unlimited, that is, all columns of the row are returned together.
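To make the effect of the batch limit concrete, here is a minimal sketch in plain Java. It does not use the HBase API: the class BatchSketch and the method splitRow are hypothetical names, and the code only models the idea that a row wider than the batch limit comes back in several pieces.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified, HBase-free model of how a batch limit splits one row.
// BatchSketch and splitRow are illustrative names, not HBase classes.
final class BatchSketch {
    // Splits the columns of one row into chunks of at most `batch` columns.
    // A non-positive batch models the "unlimited" default: one chunk per row.
    static List<List<String>> splitRow(List<String> columns, int batch) {
        List<List<String>> results = new ArrayList<>();
        if (columns.isEmpty()) {
            return results;
        }
        int size = batch <= 0 ? columns.size() : batch;
        for (int start = 0; start < columns.size(); start += size) {
            int end = Math.min(start + size, columns.size());
            results.add(new ArrayList<>(columns.subList(start, end)));
        }
        return results;
    }
}
```

Under this model, a row with 20 columns and a batch of 6 comes back as four pieces holding 6, 6, 6, and 2 columns, while a non-positive batch returns the whole row in one piece.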

public void setCaching(int caching):

Sets the number of rows fetched from the server in each request. The default value is taken from the configuration file (the hbase.client.scanner.caching property).
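The trade-off behind the caching value can be estimated with simple arithmetic. The sketch below is a rough model in plain Java, not HBase code: CachingSketch and fetchCount are hypothetical names, and the model counts only the requests that carry rows, ignoring the extra round trips for opening and closing the scanner and any size-based limits a real client may apply.

```java
// Rough model of how the caching value affects the number of fetch requests.
// CachingSketch and fetchCount are illustrative names, not HBase classes.
final class CachingSketch {
    // Number of row-carrying requests needed to read `rows` rows
    // when each request returns at most `caching` rows.
    static long fetchCount(long rows, int caching) {
        if (rows < 0) {
            throw new IllegalArgumentException("rows must not be negative");
        }
        if (caching <= 0) {
            throw new IllegalArgumentException("caching must be positive");
        }
        return (rows + caching - 1) / caching; // ceiling division
    }
}
```

For 100,000 rows this model gives 100,000 requests with a caching value of 1 and 500 requests with a caching value of 200. The price is that the client holds up to 200 rows in memory at a time.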

public void setCacheBlocks(boolean cacheBlocks):

Controls whether the blocks read by this scan are cached. Caching is enabled by default. Data can be served from three places: memory, cache, and disk, and a read generally checks them in that order (memory -> cache -> disk). The data read by an MR job is usually not hot data, so there is no need to cache it.
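Why caching cold scan data can hurt is easier to see with a toy example. The sketch below assumes a plain least-recently-used cache with room for four blocks. That is a deliberate simplification and not how the HBase block cache is actually implemented, and BlockCacheSketch, LruCache, and hotBlocksRetained are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy LRU model of cache pollution by a one-pass scan.
// This is an assumption-laden sketch, not the HBase block cache.
final class BlockCacheSketch {
    // Tiny LRU cache: evicts the least recently used block when full.
    static final class LruCache extends LinkedHashMap<String, Boolean> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // accessOrder = true gives LRU ordering
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
            return size() > capacity;
        }
    }

    // Returns how many of three hot blocks are still cached after a
    // one-pass scan over `scanBlocks` cold blocks.
    static int hotBlocksRetained(boolean cacheScanBlocks, int scanBlocks) {
        LruCache cache = new LruCache(4);
        String[] hot = {"hot-1", "hot-2", "hot-3"};
        for (String block : hot) {
            cache.put(block, Boolean.TRUE);
        }
        for (int i = 0; i < scanBlocks; i++) {
            if (cacheScanBlocks) {
                cache.put("scan-" + i, Boolean.TRUE); // cold block enters the cache
            }
        }
        int retained = 0;
        for (String block : hot) {
            if (cache.containsKey(block)) {
                retained++;
            }
        }
        return retained;
    }
}
```

In this model, a scan over 100 cold blocks with caching enabled pushes all three hot blocks out of the cache, while the same scan with caching disabled leaves them in place.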

Therefore, for an MR job it is best to configure the Scan as follows:

scan.setCacheBlocks(false);
scan.setCaching(200); // uses more memory, but needs fewer RPCs
scan.setBatch(6); // the number of columns you need
