An HBase slow query and its troubleshooting process


Our HBase cluster recently ran into slow query requests. The following describes the problem and how it was tracked down.

1. Finding the problem

The project has an HBase table that receives a concentrated bulk import every day in the early morning. The import is very large, on the order of tens of millions of records, and the table then serves user queries during the day. One day we suddenly found that, although a scan of each region of the table (256 regions in total) returned only a few rows, query requests against some regions were very slow, with response times of 10 seconds or even longer.

2. Troubleshooting

First, the HBase Region Server monitoring page showed that each region of the table had only 1 to 3 StoreFiles, which ruled out too many StoreFiles as the cause of the slow responses.

Further investigation showed that the table has a TTL of 5 days, so a large amount of expired data exists. The table receives a batch import every morning (since 3.22, over roughly the last week, more than 700 million records had been imported), while the cluster's major compact cycle is 7 days. The data from 3.22 had therefore already expired, but no major compact had yet run to remove it. As a result, a large amount of expired but not yet purged data remained in the table. Even though a scan of each region returned only a few rows, it still had to filter out a large amount of expired data (monitoring showed that block cache access at that time was higher than usual). The queries were slow even though very little of the data read was actually useful.
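The effect of a 5-day TTL combined with a 7-day major compact cycle can be checked with a little arithmetic. The sketch below is a simplified model, not HBase code. It assumes exactly one batch is imported per day, that imports have been running long enough to reach a steady state, and that expired data is physically removed only when a major compact runs, as described above. The class and method names (`TtlBacklog`, `expiredButStored`, `live`) and the day numbers are hypothetical.

```java
// Simplified model of TTL versus major compaction; not HBase code.
// Assumptions: one batch is imported per day (day 0, 1, 2, ...), and an
// expired batch stays on disk until the next major compaction runs.
final class TtlBacklog {

    /** Counts batches that are past the TTL today but were not yet expired at the last major compaction. */
    static int expiredButStored(int today, int lastMajorCompactionDay, int ttlDays) {
        int count = 0;
        for (int importDay = 0; importDay <= today; importDay++) {
            boolean expiredNow = today - importDay >= ttlDays;
            boolean purged = lastMajorCompactionDay - importDay >= ttlDays;
            if (expiredNow && !purged) {
                count++;
            }
        }
        return count;
    }

    /** Counts batches that are still within the TTL today. */
    static int live(int today, int ttlDays) {
        int count = 0;
        for (int importDay = 0; importDay <= today; importDay++) {
            if (today - importDay < ttlDays) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int ttlDays = 5;                 // table TTL from the text
        int compactionPeriodDays = 7;    // major compact cycle from the text
        int lastMajorCompactionDay = 14; // hypothetical day of the last major compaction

        for (int offset = 0; offset < compactionPeriodDays; offset++) {
            int today = lastMajorCompactionDay + offset;
            System.out.printf("%d day(s) after major compaction: %d live batches, %d expired batches still stored%n",
                    offset, live(today, ttlDays), expiredButStored(today, lastMajorCompactionDay, ttlDays));
        }
    }
}
```

Under these assumptions the number of expired batches still on disk grows by one per day after each major compact, so near the end of a 7-day cycle a scan may have to skip more expired batches than there are live ones.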


3. Solving the problem

There are two ways to solve this problem:

1) Every morning after the data import, force a major compact operation (see the majorCompact method of HBaseAdmin, which executes asynchronously) so that the expired data in each region of the table is cleared promptly.

2) Because the cluster's major compact cycle is 7 days while the table's TTL is 5 days, the major compact cycle can be shortened. The configuration parameter is hbase.hregion.majorcompaction, and its unit is milliseconds. In addition, hbase.offpeak.start.hour can set the hour from which major compact may start; for example, setting it to 1 ensures that it is triggered after 1 o'clock. This makes the cluster itself trigger major compact as early as possible.
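The first option amounts to chaining a compaction request onto the end of the nightly import job. The following sketch shows only that control flow and makes no claim about any particular HBase version. `NightlyImport`, `MajorCompactionClient`, and the table name are hypothetical; in a real job the interface would be backed by the HBase admin client's major compact call, which only submits the request and returns before the compaction has finished, and which would also need its I/O exceptions handled.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of "import first, then force a major compaction".
final class NightlyImport {

    /** Hypothetical wrapper around the HBase admin client's major compact request. */
    interface MajorCompactionClient {
        void requestMajorCompaction(String tableName);
    }

    /**
     * Runs the import and then requests a major compaction.
     * If the import throws, the exception propagates and no compaction is requested.
     */
    static void importThenCompact(Runnable importJob,
                                  MajorCompactionClient client,
                                  String tableName) {
        importJob.run();
        client.requestMajorCompaction(tableName);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        importThenCompact(
                () -> log.add("import finished"),
                table -> log.add("major compaction requested for " + table),
                "example_table"); // hypothetical table name
        log.forEach(System.out::println);
    }
}
```

Because the request is asynchronous, the job cannot assume that expired data is gone as soon as the call returns; early queries may still hit regions that have not been compacted yet.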
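For the second option, hbase.hregion.majorcompaction is given in milliseconds, so it is safer to compute the value than to type it by hand. The sketch below converts day counts to milliseconds and prints a property element in the usual hbase-site.xml shape. The one-day period is an illustrative assumption, not a value from the original text, and `CompactionPeriod` and its methods are hypothetical helper names.

```java
import java.util.concurrent.TimeUnit;

// Helper for computing hbase.hregion.majorcompaction values; plain Java, no HBase dependency.
final class CompactionPeriod {

    /** Converts a number of days to milliseconds. */
    static long daysToMillis(long days) {
        return TimeUnit.DAYS.toMillis(days);
    }

    /** Returns true if the compaction period is shorter than the table TTL. */
    static boolean shorterThanTtl(long periodMillis, long ttlMillis) {
        return periodMillis < ttlMillis;
    }

    /** Formats a name/value pair as an hbase-site.xml style property element. */
    static String asProperty(String name, long value) {
        return "<property>\n"
                + "  <name>" + name + "</name>\n"
                + "  <value>" + value + "</value>\n"
                + "</property>";
    }

    public static void main(String[] args) {
        long currentPeriod = daysToMillis(7);  // 604800000 ms, the 7-day cycle from the text
        long ttl = daysToMillis(5);            // 432000000 ms, the 5-day TTL from the text
        long proposedPeriod = daysToMillis(1); // 86400000 ms, an assumed example value

        System.out.println("current period shorter than TTL: " + shorterThanTtl(currentPeriod, ttl));
        System.out.println("proposed period shorter than TTL: " + shorterThanTtl(proposedPeriod, ttl));
        System.out.println(asProperty("hbase.hregion.majorcompaction", proposedPeriod));
    }
}
```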

Author: great Circle those things

URL: http://www.cnblogs.com/panfeng412/archive/2013/06/08/hbase-slow-query-troubleshooting.html
