HBase Practice Series 2: Region Monitoring

Source: Internet
Author: User

1. Background

Driven by large-table big data applications, our HBase cluster keeps growing. However, machine faults, network problems, and bugs in HBase itself expose the system to unpredictable failures.

An HBase cluster contains many Regions, so we need to keep track of the Region status of every table.

Analysis:

1) Real-time tracking of Region status. Every application request is ultimately served by some HBase Region, so we need to check whether the Regions of each Table are available.

2) Region reads and writes depend on the state of the underlying HDFS. Because of this dependency, monitoring read/write status through Regions also reflects the status of HDFS.

2. Practical Tools

org.apache.hadoop.hbase.tool.Canary monitors Region availability and read/write status. ==> This covers both points in the analysis above.

Usage:

Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts]
 where [opts] are:
   -help          Show this help and exit.
   -daemon        Continuous check at defined intervals.
   -interval <N>  Interval between checks (sec)

Run ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary.

All the tables in the cluster are tested under different configurations. The test results are as follows:

By default, Canary takes the startKey of each Region, performs one Get per ColumnFamily against it, and prints the latency. If a Region has a problem, a failed status is printed instead.
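The probing behavior described above can be sketched roughly as follows. This is not Canary's actual source: `FamilyReader` and `ProbeResult` are hypothetical names introduced here, and `FamilyReader` stands in for a real HBase Get so that the example runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RegionProbeSketch {
    /** Hypothetical stand-in for one HBase Get against a single column family. */
    interface FamilyReader {
        void get(String regionName, byte[] startKey, String family) throws Exception;
    }

    /** Outcome of one probe: measured latency, plus a flag set when the Get threw. */
    static final class ProbeResult {
        final String regionName;
        final String family;
        final long latencyMs;
        final boolean failed;

        ProbeResult(String regionName, String family, long latencyMs, boolean failed) {
            this.regionName = regionName;
            this.family = family;
            this.latencyMs = latencyMs;
            this.failed = failed;
        }
    }

    /** Issues one Get per column family at the Region's start key and times each one. */
    static List<ProbeResult> probeRegion(String regionName, byte[] startKey,
                                         List<String> families, FamilyReader reader) {
        List<ProbeResult> results = new ArrayList<>();
        for (String family : families) {
            long start = System.nanoTime();
            boolean failed = false;
            try {
                reader.get(regionName, startKey, family);
            } catch (Exception e) {
                failed = true; // any read error marks this probe as failed
            }
            long latencyMs = (System.nanoTime() - start) / 1_000_000L;
            results.add(new ProbeResult(regionName, family, latencyMs, failed));
        }
        return results;
    }

    public static void main(String[] args) {
        // Fake reader: the family named "bad" simulates a read failure.
        FamilyReader fake = (region, key, family) -> {
            if (family.equals("bad")) {
                throw new java.io.IOException("simulated read failure");
            }
        };
        for (ProbeResult r : probeRegion("region-1", new byte[0], Arrays.asList("cf", "bad"), fake)) {
            System.out.println(r.regionName + " " + r.family
                    + " failed=" + r.failed + " latencyMs=" + r.latencyMs);
        }
    }
}
```

In a real monitor the reader would presumably wrap a table client, and the list of families would come from the table descriptor.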

However, this tool still has shortcomings:

1) It cannot raise real-time alerts when a Region service fails.

2) It provides no latency monitoring or latency alarms.

We added the corresponding alarm functions to the code. On every check, the monitor identifies any Table whose latency exceeds the limit or whose Regions have a problem, and sends an alarm by email or message.
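The article does not show the alarm code, so the following is only a minimal sketch of what such a check might look like. The `Check` and `Notifier` types, the alert text, and the threshold value are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CanaryAlarmSketch {
    /** Hypothetical notification channel, for example email or a text message. */
    interface Notifier {
        void send(String message);
    }

    /** Result of one check; the field names are assumptions, not Canary's own types. */
    static final class Check {
        final String table;
        final String region;
        final long latencyMs;
        final boolean failed;

        Check(String table, String region, long latencyMs, boolean failed) {
            this.table = table;
            this.region = region;
            this.latencyMs = latencyMs;
            this.failed = failed;
        }
    }

    /** Builds one alert line for every failed check and every check over the latency limit. */
    static List<String> evaluate(List<Check> checks, long maxLatencyMs) {
        List<String> alerts = new ArrayList<>();
        for (Check c : checks) {
            if (c.failed) {
                alerts.add("FAILED table=" + c.table + " region=" + c.region);
            } else if (c.latencyMs > maxLatencyMs) {
                alerts.add("SLOW table=" + c.table + " region=" + c.region
                        + " latencyMs=" + c.latencyMs);
            }
        }
        return alerts;
    }

    /** Sends every alert through the notifier and returns how many were sent. */
    static int alarm(List<Check> checks, long maxLatencyMs, Notifier notifier) {
        List<String> alerts = evaluate(checks, maxLatencyMs);
        for (String alert : alerts) {
            notifier.send(alert);
        }
        return alerts.size();
    }

    public static void main(String[] args) {
        List<Check> checks = Arrays.asList(
                new Check("orders", "region-a", 12, false),
                new Check("orders", "region-b", 950, false),
                new Check("users", "region-c", 0, true));
        // The 500 ms limit is an arbitrary illustrative value.
        alarm(checks, 500, message -> System.out.println("ALERT " + message));
    }
}
```

Keeping `evaluate` separate from the notifier makes the threshold logic testable without sending any real email or message.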

PS: To make the monitor respond more intelligently, when an HFile cannot be seeked or a Region is offline, the program calls the HBaseAdmin.assign(regionName) interface to redeploy the Region once. This avoids the following exceptions:

1) The storefiles of a Region are inconsistent. For example, the files in the storefile list do not match those in HDFS. This can happen after a failed Compaction or during a split operation. Re-assigning the Region reloads this data, which avoids the problem.

2) The Region is in the Offline state. For example, if a RegionServer (RS) goes offline while the HMaster is down, the AssignmentManager (AM) cannot work, which can leave Regions offline.
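The "re-assign once" response described above might be structured roughly like this. The `RegionOps` interface and `Outcome` enum are hypothetical; in a real monitor, `assign` would presumably delegate to the HBaseAdmin.assign(regionName) call mentioned in the text. The sketch also assumes the Region is readable again by the time it is probed a second time, whereas a real implementation may need to wait for the assignment to complete.

```java
public class ReassignOnceSketch {
    /** Hypothetical wrapper around the probe and the admin call. */
    interface RegionOps {
        /** Returns true if a read against the Region succeeded. */
        boolean probe(String regionName);

        /** Would delegate to HBaseAdmin.assign(regionName) in a real monitor. */
        void assign(String regionName) throws Exception;
    }

    enum Outcome { HEALTHY, RECOVERED, STILL_FAILING }

    /** Probes a Region; on failure, re-assigns it exactly once and probes again. */
    static Outcome checkWithSingleReassign(String regionName, RegionOps ops) {
        if (ops.probe(regionName)) {
            return Outcome.HEALTHY;
        }
        try {
            ops.assign(regionName); // single attempt, never retried in a loop
        } catch (Exception e) {
            return Outcome.STILL_FAILING;
        }
        return ops.probe(regionName) ? Outcome.RECOVERED : Outcome.STILL_FAILING;
    }

    public static void main(String[] args) {
        // Fake Region that becomes readable only after it has been re-assigned.
        final boolean[] assigned = {false};
        RegionOps fake = new RegionOps() {
            public boolean probe(String regionName) {
                return assigned[0];
            }

            public void assign(String regionName) {
                assigned[0] = true;
            }
        };
        System.out.println(checkWithSingleReassign("region-1", fake));
    }
}
```

Limiting the response to a single re-assign keeps the monitor from repeatedly moving a Region whose underlying problem needs manual attention; a `STILL_FAILING` result is the natural point to raise an alarm.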

Note:

This series of articles was originally published on Binospace, the personal technology blog of Binos_ICT.

From Binospace, post: HBase practice series 2-Region monitoring

