Use Ganglia to monitor Hadoop and HBase clusters

Last Update:2015-03-09 Source: Internet

Author: User



Introductory content from: http://www.uml.org.cn/sjjm/201305171.asp

1. Introduction to Ganglia

Ganglia is an open-source monitoring project started at UC Berkeley and designed to scale to thousands of nodes. Each computer runs a gmond daemon that collects metric data (such as processor speed and memory usage) from the operating system and sends it to a designated host. A host that receives all the metric data can display it and pass a summarized form of it up the hierarchy. It is this hierarchical structure that lets Ganglia scale well. gmond places very little load on the system, so it can run on every computer in the cluster without affecting user performance.
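To make the idea of a "summarized form" concrete, here is a minimal Python sketch of one plausible reduction: per-host values are collapsed into a sum and a host count per metric before being passed up. It is an illustrative model only, not Ganglia's actual code; summarize is a hypothetical helper, and the host names and values are made up.

```python
from collections import defaultdict


def summarize(host_metrics):
    """Reduce {host: {metric: value}} to {metric: (sum, host_count)}."""
    totals = defaultdict(lambda: [0.0, 0])
    for metrics in host_metrics.values():
        for name, value in metrics.items():
            totals[name][0] += value   # running sum across hosts
            totals[name][1] += 1       # number of hosts reporting this metric
    return {name: (total, count) for name, (total, count) in totals.items()}


# Hypothetical readings from three nodes (values are invented).
cluster = {
    "master": {"load_one": 0.50, "mem_free": 1024.0},
    "slave1": {"load_one": 0.25, "mem_free": 2048.0},
    "slave3": {"load_one": 0.25},
}
summary = summarize(cluster)
```

The upper level then needs only one pair of numbers per metric, however many hosts sit below it, which is why the hierarchy keeps the data volume manageable.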

1.1 Ganglia component

The Ganglia monitoring suite consists of three main parts: gmond, gmetad, and a web interface, usually called ganglia-web.

gmond: a daemon that runs on every node to be monitored. It collects monitoring statistics and sends and receives them on the same multicast or unicast channel. As a sender (mute = no), it collects basic metrics such as system load (load_one) and CPU usage, and also sends custom metrics added through C/Python modules. As a receiver (deaf = no), it aggregates the metrics sent by other hosts and keeps them in an in-memory buffer.

gmetad: also a daemon. It periodically polls the gmond instances, pulls their data, and stores the metrics in the RRD storage engine. It can query multiple clusters and aggregate their metrics, and it supplies the data from which the web front end is generated.

ganglia-web: the web front end. It should be installed on the machine running gmetad so that it can read the RRD files.

Clusters are logical groups of hosts and metric data, for example database servers, web servers, production, testing, and QA. Clusters are completely separate from one another, so you need to run a separate gmond instance for each cluster.

Generally, each cluster needs one receiving gmond, and each site needs one gmetad.
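The data that gmetad pulls from a gmond is an XML document. The sketch below reads such a document and groups metric values by host. It rests on assumptions that are not stated in this article: that gmond serves its XML dump to any TCP client on port 8649 (its usual default), and that the document uses HOST and METRIC elements with NAME and VAL attributes. The sample XML is hand-written and heavily abbreviated, and fetch_gmond_xml and metrics_by_host are hypothetical helper names, so check them against the output of a real gmond.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Read everything gmond writes to a TCP client until it closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        while True:
            data = conn.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def metrics_by_host(xml_data):
    """Return {host name: {metric name: value string}} from a gmond XML dump."""
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        result[host.get("NAME")] = {
            metric.get("NAME"): metric.get("VAL") for metric in host.iter("METRIC")
        }
    return result


# Hand-written, abbreviated sample (assumed layout, not captured output).
SAMPLE = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
<CLUSTER NAME="my cluster">
<HOST NAME="slave1" IP="10.171.94.155">
<METRIC NAME="load_one" VAL="0.25" TYPE="float"/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""
```

Against a live node, metrics_by_host(fetch_gmond_xml("10.171.29.191")) would return the same structure, provided the assumptions above hold for your installation.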

Figure 1: Ganglia workflow

Figure 1 shows the Ganglia workflow:

On the left is the gmond process running on each node. Its configuration is determined solely by the /etc/gmond.conf file on that node, so gmond must be installed and configured on every monitored node. (The packages used in the installation below place the configuration files under /etc/ganglia/.)

The upper-right corner is a central machine with more responsibility (usually, but not necessarily, one of the cluster nodes). The gmetad process runs on this machine, collects information from each node, and stores it in RRDtool. Its configuration is determined solely by /etc/gmetad.conf.

The lower-right corner shows the web page. When you browse the site, PHP scripts fetch the data from the RRDtool database and generate the charts dynamically.

1.2 Ganglia running mode (unicast and Multicast)

Ganglia can collect data in unicast or multicast mode. The default is multicast.

Unicast: each node sends the monitoring data it collects to one or more specific machines. Unicast traffic can cross network segments.

Multicast: each node sends the monitoring data it collects to all machines in the same network segment, and receives the data sent by all other machines in that segment. Because the data is sent as multicast packets, the machines must be in the same network segment. Different channels can, however, be defined within one segment.
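At the socket level, the two modes differ mainly in the kind of destination address. As a rough sketch, Python's standard library can tell which mode a configured address implies and prepare a matching UDP socket. channel_mode and make_sender are hypothetical helper names, and the sketch does not produce gmond's own packet format.

```python
import ipaddress
import socket


def channel_mode(address):
    """Return 'multicast' for a multicast group address, otherwise 'unicast'."""
    return "multicast" if ipaddress.ip_address(address).is_multicast else "unicast"


def make_sender(address, ttl=1):
    """Create a UDP socket suited to the address; the caller uses sendto()."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if channel_mode(address) == "multicast":
        # Limit how many router hops multicast datagrams may cross.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

With the addresses used in this article, channel_mode("239.2.11.71") reports multicast and channel_mode("10.171.29.191") reports unicast.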

2. Install Ganglia

1. Topology description
Three hosts:

10.171.29.191 master
10.171.94.155 slave1
10.251.0.197 slave3

The master runs gmetad and the web front end, and all three machines run gmond.
Perform the following steps as the root user:

2. Install gmetad and ganglia-web on the master

yum install ganglia-web.x86_64
yum install ganglia-gmetad.x86_64

3. Install gmond on all three machines

yum install ganglia-gmond.x86_64

4. Configure /etc/ganglia/gmond.conf on the three machines and modify the following content:

udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname.  Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  mcast_join = 10.171.29.191
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
}

Change the default multicast address in udp_send_channel to the master's address, and comment out the two multicast address lines (mcast_join and bind) in udp_recv_channel.
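After editing gmond.conf on several machines, it can help to confirm that each file ended up with the intended values. The following is a minimal sketch, not a full parser for gmond's configuration syntax: it assumes one key = value pair per line, treats # as the start of a comment, and would be confused by a closing brace inside a comment. channel_settings is a hypothetical helper name.

```python
import re


def channel_settings(conf_text, section):
    """Return the active key/value pairs inside the first `section { ... }` block."""
    pattern = r"\b" + re.escape(section) + r"\s*\{(.*?)\}"
    match = re.search(pattern, conf_text, re.S)
    if match is None:
        return {}
    settings = {}
    for line in match.group(1).splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and commented-out keys
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


SAMPLE_CONF = """
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
  mcast_join = 10.171.29.191
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
}
"""

send = channel_settings(SAMPLE_CONF, "udp_send_channel")
recv = channel_settings(SAMPLE_CONF, "udp_recv_channel")
```

For a correctly edited file, send contains the master's address and recv contains only the port, because both multicast lines are commented out.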

5. Modify /etc/ganglia/gmetad.conf on the master.
Modify data_source:

data_source "my cluster" 10.171.29.191
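A data_source line has the form: a quoted cluster name, an optional polling interval in seconds, then one or more host[:port] entries. The sketch below splits such a line into its parts. parse_data_source is a hypothetical helper, and the fallback port 8649 is an assumption based on the port used throughout this article.

```python
import shlex

DEFAULT_PORT = 8649  # assumed default; the gmond port used in this article


def parse_data_source(line):
    """Split a gmetad data_source line into (name, interval, [(host, port), ...])."""
    tokens = shlex.split(line)  # honours the double quotes around the cluster name
    if len(tokens) < 3 or tokens[0] != "data_source":
        raise ValueError("not a data_source line: %r" % line)
    name, rest = tokens[1], tokens[2:]
    interval = None
    if rest[0].isdigit():        # an all-digit token is the optional polling interval
        interval = int(rest[0])
        rest = rest[1:]
    if not rest:
        raise ValueError("data_source line has no hosts: %r" % line)
    hosts = []
    for item in rest:
        host, _, port = item.partition(":")
        hosts.append((host, int(port) if port else DEFAULT_PORT))
    return name, interval, hosts


parsed = parse_data_source('data_source "my cluster" 10.171.29.191')
```

Listing more than one host on the line gives gmetad alternative nodes to poll for the same cluster.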

6. Link the web files into the web root:

ln -s /usr/share/ganglia /var/www/ganglia

If the link causes problems, copy the contents of /usr/share/ganglia directly to /var/www/ganglia.

7. Modify /etc/httpd/conf.d/ganglia.conf:

#
# Ganglia monitoring system php web frontend
#

Alias /ganglia /usr/share/ganglia

<Location /ganglia>
  Order deny,allow
  Allow from all
  Allow from 127.0.0.1
  Allow from ::1
  # Allow from .example.com
</Location>

Change "Deny from all" to "Allow from all".

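8. Start the services

On the master, start all three services; on the other machines, start only gmond:

service gmetad start
service gmond start
/usr/sbin/apachectl start

9. Open the web page

http://ip/ganglia (replace ip with the address of the master)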

Notes:
1. The information collected by gmetad is stored in /var/lib/ganglia/rrds/

2. Run the following command to check whether data is being transmitted:

tcpdump port 8649
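The two notes can be combined into one check: if data is arriving, the RRD files under /var/lib/ganglia/rrds/ should have recent modification times. A minimal sketch, assuming the files carry the .rrd extension and sit somewhere below that directory; recent_rrds is a hypothetical helper, and the 60-second window is an arbitrary choice.

```python
import time
from pathlib import Path


def recent_rrds(root="/var/lib/ganglia/rrds", max_age=60, now=None):
    """Return paths (relative to root) of .rrd files modified within max_age seconds."""
    now = time.time() if now is None else now
    fresh = []
    for rrd in Path(root).rglob("*.rrd"):
        if now - rrd.stat().st_mtime <= max_age:
            fresh.append(str(rrd.relative_to(root)))
    return sorted(fresh)
```

Run as root on the master, an empty result from recent_rrds() after a few minutes suggests that gmetad is not receiving or not storing metrics.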

3. Configure Hadoop and HBase

1. Configure Hadoop

hadoop-metrics2.properties

# syntax: [prefix].[source|sink|jmx].[instance].[options]
# See package.html for org.apache.hadoop.metrics2 for details

*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink

#namenode.sink.file.filename=namenode-metrics.out
#datanode.sink.file.filename=datanode-metrics.out
#jobtracker.sink.file.filename=jobtracker-metrics.out
#tasktracker.sink.file.filename=tasktracker-metrics.out
#maptask.sink.file.filename=maptask-metrics.out
#reducetask.sink.file.filename=reducetask-metrics.out

# Below are for sending metrics to Ganglia
#
# for Ganglia 3.0 support
# *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink30
#
# for Ganglia 3.1 support
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31

*.sink.ganglia.period=10

# default for supportsparse is false
*.sink.ganglia.supportsparse=true

*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40

namenode.sink.ganglia.servers=10.171.29.191:8649
datanode.sink.ganglia.servers=10.171.29.191:8649
jobtracker.sink.ganglia.servers=10.171.29.191:8649
tasktracker.sink.ganglia.servers=10.171.29.191:8649
maptask.sink.ganglia.servers=10.171.29.191:8649
reducetask.sink.ganglia.servers=10.171.29.191:8649
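Each Hadoop daemon needs its own [prefix].sink.ganglia.servers entry, so a missing or mistyped prefix silently leaves that daemon unmonitored. The sketch below lists which prefixes have a Ganglia server configured. It is a simplified reader, assuming one key=value pair per line and # comments rather than the full Java properties syntax; ganglia_servers is a hypothetical helper name.

```python
def ganglia_servers(properties_text):
    """Map each daemon prefix to the list of Ganglia servers configured for it."""
    suffix = ".sink.ganglia.servers"
    servers = {}
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines without an assignment
        key, value = line.split("=", 1)
        key = key.strip()
        if key.endswith(suffix):
            prefix = key[: -len(suffix)]
            servers[prefix] = [s.strip() for s in value.split(",") if s.strip()]
    return servers


SAMPLE = """
# for Ganglia 3.1 support
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
namenode.sink.ganglia.servers=10.171.29.191:8649
datanode.sink.ganglia.servers=10.171.29.191:8649
"""

configured = ganglia_servers(SAMPLE)
```

Comparing the keys of the result with the daemons you actually run (namenode, datanode, and so on) shows at a glance which ones will not report.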

2. Configure HBase

hadoop-metrics.properties

# See http://wiki.apache.org/hadoop/GangliaMetrics
# Make sure you know whether you are using ganglia 3.0 or 3.1.
# If 3.1, you will have to patch your hadoop instance with HADOOP-4675
# And, yes, this file is named hadoop-metrics.properties rather than
# hbase-metrics.properties because we're leveraging the hadoop metrics
# package and hadoop-metrics.properties is an hardcoded-name, at least
# for the moment.
#
# See also http://hadoop.apache.org/hbase/docs/current/metrics.html
# GMETADHOST_IP is the hostname (or) IP address of the server on which the ganglia
# meta daemon (gmetad) service is running

# Configuration of the "hbase" context for NullContextWithUpdateThread
# NullContextWithUpdateThread is a null context which has a thread calling
# periodically when monitoring is started. This keeps the data sampled
# correctly.
hbase.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
hbase.period=10

# Configuration of the "hbase" context for file
# hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# hbase.fileName=/tmp/metrics_hbase.log

# HBase-specific configuration to reset long-running stats (e.g. compactions)
# If this variable is left out, then the default is no expiration.
hbase.extendedperiod = 3600

# Configuration of the "hbase" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
hbase.period=10
hbase.servers=10.171.29.191:8649

# Configuration of the "jvm" context for null
jvm.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
jvm.period=10

# Configuration of the "jvm" context for file
# jvm.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# jvm.fileName=/tmp/metrics_jvm.log

# Configuration of the "jvm" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=10.171.29.191:8649

# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
rpc.period=10

# Configuration of the "rpc" context for file
# rpc.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# rpc.fileName=/tmp/metrics_rpc.log

# Configuration of the "rpc" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=10.171.29.191:8649

# Configuration of the "rest" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rest.period=10
rest.servers=10.171.29.191:8649
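Note that this file assigns hbase.class, jvm.class, and rpc.class twice: first to NullContextWithUpdateThread and later to GangliaContext31. Assuming the file is loaded with java.util.Properties semantics, where a later assignment to the same key replaces the earlier one, the Ganglia context is the one that takes effect. The sketch below mimics that rule with a simplified reader (one key=value pair per line, # comments only); effective_properties is a hypothetical helper name.

```python
def effective_properties(text):
    """Return the final value of each key; a later assignment replaces an earlier one."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and lines without an assignment
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


SAMPLE = """
hbase.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
hbase.period=10
hbase.extendedperiod = 3600
# hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
hbase.period=10
hbase.servers=10.171.29.191:8649
"""

effective = effective_properties(SAMPLE)
```

If metrics do not show up in Ganglia, checking the effective class for each context is a quick way to rule out an ordering mistake in the file.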

Restart Hadoop and HBase so that the new metrics settings take effect.

