Use JMX to obtain Hadoop/HBase metric data
Overview
When it comes to monitoring Hadoop and HBase clusters, most of us know and use third-party monitoring tools such as Cacti, Ganglia, and Zabbix, with Zenoss for more advanced setups. These tools are genuinely useful, but over time the monitoring granularity always feels relatively coarse and not detailed enough. They are, after all, external to the cluster; even with Hadoop's built-in Ganglia interface, something is still missing.
In fact, Hadoop itself exposes a monitoring interface (the distributions from various vendors also add their own custom ones), but relatively few people seem to know about it. The interface is very simple, yet detailed and convenient: JMX.
Anyone who uses Hadoop's HTTP monitoring ports knows them: NameNode 50070, JobTracker 50030, DataNode 50075, TaskTracker 50060. When you access these ports, a monitoring page such as dfshealth.jsp or jobtracker.jsp is displayed automatically. JMX access is just as simple: you only need to replace the page name with jmx.
For example, replace http://your_namenode:50070/dfshealth.jsp with http://your_namenode:50070/jmx
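The URL rewrite above can be sketched as a tiny helper. This is only an illustration of the rule, not part of the original collector; the host name `your_namenode` is a placeholder:

```java
// Sketch: turn a Hadoop monitoring-page URL into the corresponding /jmx URL.
// "your_namenode" is a placeholder host name, not a real address.
public class JmxUrl {

    // Replace the trailing page name (e.g. dfshealth.jsp) with "jmx".
    static String toJmxUrl(String pageUrl) {
        int slash = pageUrl.lastIndexOf('/');
        return pageUrl.substring(0, slash + 1) + "jmx";
    }

    public static void main(String[] args) {
        System.out.println(toJmxUrl("http://your_namenode:50070/dfshealth.jsp"));
        // http://your_namenode:50070/jmx
    }
}
```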
The other daemons' information can be obtained the same way, and HBase system information is also available through this mechanism.
The returned values are all in JSON format, which is easy to process yourself. The information is very detailed: memory status, memory pool status, Java heap details, operating system information, version, and JVM version are all covered.
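For illustration, here is a minimal sketch that pulls one numeric field out of a /jmx-style JSON payload using only the standard library. The sample payload and field name are hand-written, not real server output, and a production collector should use a proper JSON parser (such as the org.json dependency declared below) rather than string scanning:

```java
// Sketch: extract one numeric value from a /jmx-style JSON response.
// The payload in main() is a hand-written sample, not real server output;
// production code should use a real JSON library (e.g. org.json).
public class JmxFieldSketch {

    // Find `"field" : <digits>` in the JSON text and parse the digits.
    static long extractLong(String json, String field) {
        int i = json.indexOf("\"" + field + "\"");
        if (i < 0) throw new IllegalArgumentException("field not found: " + field);
        int colon = json.indexOf(':', i);
        int j = colon + 1;
        while (j < json.length() && !Character.isDigit(json.charAt(j))) j++;
        int k = j;
        while (k < json.length() && Character.isDigit(json.charAt(k))) k++;
        return Long.parseLong(json.substring(j, k));
    }

    public static void main(String[] args) {
        String sample = "{ \"beans\" : [ { \"name\" : \"java.lang:type=Memory\","
                + " \"HeapMemoryUsage\" : { \"used\" : 12345678 } } ] }";
        System.out.println(extractLong(sample, "used")); // 12345678
    }
}
```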
Implementation
To fetch data from an address such as http://your_namenode:50070/jmx, you can use HttpClient to issue the request and then parse the returned JSON.
Because the full JSON payload is large and usually not all of it is needed, you can append a ?qry= parameter to retrieve only part of the data.
For example: http://your_namenode:60010/jmx?qry=Hadoop:service=HBase,name=Master,sub=Server
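Building the filtered URL can be sketched as follows. This is an illustration, not the original collector's code; URL-encoding the object name is a defensive choice (the servlet generally also accepts the raw name in the query string):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch: build "<base>?qry=<objectName>" with the JMX object name URL-encoded.
public class JmxQuery {

    static String buildQueryUrl(String base, String objectName) {
        try {
            return base + "?qry=" + URLEncoder.encode(objectName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildQueryUrl("http://your_namenode:60010/jmx",
                "Hadoop:service=HBase,name=Master,sub=Server"));
    }
}
```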
Maven Configuration:
<dependency>
  <groupId>commons-httpclient</groupId>
  <artifactId>commons-httpclient</artifactId>
  <version>3.1</version>
</dependency>
<dependency>
  <groupId>org.json</groupId>
  <artifactId>json</artifactId>
  <version>20090211</version>
</dependency>
Java class
This program obtains HBase Master monitoring data as an example; collecting HDFS monitoring data works the same way.
/**
 * Collects HBase Master monitoring information via JMX.
 *
 * @author aihua.sun
 * @date 2015/4/6
 * @since V1.0
 */
import com.eric.agent.flume.model.HMasterRoleInfo;
import com.eric.agent.utils.AgentConstants;
import com.eric.agent.utils.MetricDataUtils;
import org.apache.commons.httpclient.HttpClient;
import org.apache.commons.httpclient.methods.GetMethod;
import org.json.JSONException;
import org.json.JSONObject;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HBaseMasterDataProvider {
    protected final Logger LOGGER = LoggerFactory.getLogger(getClass());
    private static final String SERVER = "Hadoop:service=HBase,name=Master,sub=Server";
    private static final String ASSIGNMENT = "Hadoop:service=HBase,name=Master,sub=AssignmentManger";

    public String extractMonitorData() {
        // TODO: obtain the host name and port from configuration or an API call.
        HMasterRoleInfo monitorDataPoint = new HMasterRoleInfo();
        String url = "http://hostname:60010/jmx";
        JSONObject serverJson = qryJSonObjectFromJMX(url, SERVER);
        JSONObject assignJson = qryJSonObjectFromJMX(url, ASSIGNMENT);
        try {
            monitorDataPoint.setNumRegionServers(serverJson.getLong("numRegionServers"));
            monitorDataPoint.setNumDeadRegionServers(serverJson.getLong("numDeadRegionServers"));
            monitorDataPoint.setClusterRequests(serverJson.getLong("clusterRequests"));
            monitorDataPoint.setRitCount(assignJson.getLong("ritCount"));
            monitorDataPoint.setRitCountOverThreshold(assignJson.getLong("ritCountOverThreshold"));
            monitorDataPoint.setRitOldestAge(assignJson.getLong("ritOldestAge"));
        } catch (JSONException e) {
            e.printStackTrace();
        }
        return monitorDataPoint.toString();
    }

    public static void main(String[] args) {
        System.out.println(new HBaseMasterDataProvider().extractMonitorData());
    }

    /**
     * Obtain monitoring data through JMX and return the first matching bean.
     *
     * @param url        base JMX URL, e.g. http://hostname:60010/jmx
     * @param objectName JMX object name used as the qry filter
     * @return the first matching bean as a JSONObject, or null on failure
     */
    public static JSONObject qryJSonObjectFromJMX(String url, String objectName) {
        JSONObject jsonObject = null;
        try {
            StringBuilder sb = new StringBuilder(url);
            sb.append("?qry=");
            sb.append(objectName);
            HttpClient httpClient = new HttpClient();
            GetMethod getMethod = new GetMethod(sb.toString());
            int statusCode = httpClient.executeMethod(getMethod);
            String jsonStr = new String(getMethod.getResponseBody());
            // removeDuplicateContext is a project helper (not shown here) that cleans the raw response.
            jsonObject = new JSONObject(removeDuplicateContext(jsonStr)).getJSONArray("beans").getJSONObject(0);
        } catch (JSONException e) {
            e.printStackTrace();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return jsonObject;
    }
}
References
JMXJsonServlet Introduction: http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/http/jmx/JMXJsonServlet.html
Hadoop Metrics: http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-common/Metrics.html#rpc