1. Background
Driven by big data table applications, our HBase cluster keeps growing. However, unpredictable bugs in machines, networks, and HBase itself expose the system to unpredictable faults.
Because an HBase cluster contains many Regions, we need to keep track of the Region status of every table.
Analysis:
1) Real-time tracking of Region status. Every application access ultimately maps to a Region in HBase, so we need to check whether the Regions of each Table are available.
2) Region reads and writes depend on the state of the underlying HDFS. Because of this dependency, monitoring read/write status through Regions also reflects the status of HDFS.
2. Practical Tools
org.apache.hadoop.hbase.tool.Canary monitors Region availability and read/write status. ==> This corresponds to the two problems in the analysis above.
Usage:
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts]
where [opts] are:
  -help       Show this help and exit.
  -daemon     Continuous check at defined intervals.
  -interval   Interval between checks (sec)
Run ${HBASE_HOME}/bin/hbase org.apache.hadoop.hbase.tool.Canary.
We tested all the tables in the cluster under different configurations. The results show the following behavior:
By default, the tool takes the startKey of each Region, performs one Get per ColumnFamily, and prints the latency. If a Region error occurs, a failed status is displayed.
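The default check described above can be sketched roughly as follows. This is a minimal illustration, not the Canary source: `RegionReader`, `ProbeResult`, and `RegionProbe` are hypothetical names, and the reader interface stands in for a real HBase client read so the sketch runs without an HBase dependency.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one Get at a Region's startKey, limited to one ColumnFamily. */
interface RegionReader {
    void get(String regionName, byte[] startKey, String columnFamily) throws Exception;
}

/** Outcome of one probe: the measured latency, or a failure marker. */
final class ProbeResult {
    final String regionName;
    final String columnFamily;
    final long latencyMs;   // -1 when the read failed
    final boolean failed;

    ProbeResult(String regionName, String columnFamily, long latencyMs, boolean failed) {
        this.regionName = regionName;
        this.columnFamily = columnFamily;
        this.latencyMs = latencyMs;
        this.failed = failed;
    }
}

final class RegionProbe {
    /** Issues one read per ColumnFamily and records latency or failure for each. */
    static List<ProbeResult> probe(String regionName, byte[] startKey,
                                   List<String> families, RegionReader reader) {
        List<ProbeResult> results = new ArrayList<>();
        for (String cf : families) {
            long start = System.nanoTime();
            try {
                reader.get(regionName, startKey, cf);
                long ms = (System.nanoTime() - start) / 1_000_000;
                results.add(new ProbeResult(regionName, cf, ms, false));
            } catch (Exception e) {
                // Any read error is reported as a failed status for this ColumnFamily.
                results.add(new ProbeResult(regionName, cf, -1, true));
            }
        }
        return results;
    }
}
```

In a real deployment, the `RegionReader` implementation would presumably wrap an HBase client call such as `HTable.get(Get)` with `Get.addFamily(...)`; that wiring is an assumption and is left out here.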
However, this tool still has shortcomings:
1) It cannot raise real-time alerts when a Region service fails.
2) It provides no latency monitoring or latency alarms.
We added the corresponding alarm functions to the code. On each check run, the tool finds any Table whose latency exceeds the limit or whose Region has a problem, and sends an alarm by email or message.
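The alarm rule described above could look something like the sketch below. It is only an illustration of the idea, not the code we deployed: `Notifier` and `LatencyAlarm` are hypothetical names, the latency limit is a parameter chosen by the operator, and the actual email or message delivery is hidden behind the `Notifier` interface.

```java
/** Hypothetical alert channel, for example email or a text message gateway. */
interface Notifier {
    void send(String message);
}

final class LatencyAlarm {
    private final long limitMs;       // latency limit, an assumed configuration value
    private final Notifier notifier;

    LatencyAlarm(long limitMs, Notifier notifier) {
        this.limitMs = limitMs;
        this.notifier = notifier;
    }

    /**
     * Checks one probe outcome and sends an alarm when the read failed
     * or its latency exceeded the limit. Returns true if an alarm was sent.
     */
    boolean check(String table, String region, long latencyMs, boolean failed) {
        if (failed) {
            notifier.send("Region " + region + " of table " + table + " failed the read check");
            return true;
        }
        if (latencyMs > limitMs) {
            notifier.send("Table " + table + " region " + region + " latency "
                    + latencyMs + " ms exceeds limit " + limitMs + " ms");
            return true;
        }
        return false;
    }
}
```

A failed read is checked before latency because a failed probe has no meaningful latency value.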
P.S. To make the monitoring respond more intelligently, when a seek on an HFile fails or a Region is offline, the program calls HBaseAdmin.assign(regionName) to re-assign the Region once. This avoids the following exceptions:
1) The storefiles on a Region are inconsistent. For example, the files in the storefile list do not match those in HDFS. This can happen when a Compaction fails or during a split operation. Re-assigning the Region reloads this data and avoids the problem.
2) A Region is in the Offline state. For example, when a RegionServer (RS) goes offline while the HMaster is down, the AssignmentManager (AM) cannot work, which can leave the Region offline.
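One way to keep the re-assign to a single attempt is sketched below. The names `RegionAssigner` and `ReassignOnce` are hypothetical, and the choice to allow a new attempt after the Region has passed a check again is our assumption for this sketch. In practice the `RegionAssigner` would delegate to `HBaseAdmin.assign(regionName)`.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical wrapper around the re-assign call, e.g. HBaseAdmin.assign(regionName). */
interface RegionAssigner {
    void assign(String regionName) throws Exception;
}

final class ReassignOnce {
    // Regions for which a re-assign was already issued in the current failure episode.
    private final Set<String> attempted = new HashSet<>();
    private final RegionAssigner assigner;

    ReassignOnce(RegionAssigner assigner) {
        this.assigner = assigner;
    }

    /**
     * Called when a Region fails its check. Issues at most one re-assign per
     * failure episode. Returns true if a re-assign was issued on this call.
     */
    boolean onFailure(String regionName) throws Exception {
        if (!attempted.add(regionName)) {
            return false; // already tried; leave it to the alarm path
        }
        assigner.assign(regionName);
        return true;
    }

    /** Called when a Region passes its check, so a later failure may re-assign again. */
    void onSuccess(String regionName) {
        attempted.remove(regionName);
    }
}
```

Limiting the attempt to once per episode keeps a persistently broken Region from being re-assigned on every check interval.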
Note:
This series of articles is original work from Binos_ICT's personal technology blog, Binospace.
From Binospace, post: HBase practice series 2 - Region monitoring