Use Ganglia to monitor Hadoop and HBase clusters
Introductory content from: http://www.uml.org.cn/sjjm/201305171.asp
1. Introduction to Ganglia
Ganglia is an open-source monitoring project initiated by UC Berkeley and designed to scale to thousands of nodes. Each computer runs a gmond daemon that collects metric data (such as processor speed and memory usage) from the operating system and sends it to a specified host. A host that receives all the metric data can display it and pass a condensed form of it up the hierarchy. This hierarchical structure is what allows Ganglia to scale well. gmond adds very little system load, so it can run on every computer in the cluster without affecting user performance.
1.1 Ganglia components
The Ganglia monitoring suite consists of three main parts: gmond, gmetad, and the web interface, which is generally called ganglia-web.
Gmond: a daemon that runs on every node to be monitored. It collects monitoring statistics and sends and receives them on the same multicast or unicast channel. If it is a sender (mute = no), it collects basic metrics such as system load (load_one) and CPU usage, and it also sends custom metrics added through C/Python modules. If it is a receiver (deaf = no), it aggregates all the metrics sent by other hosts and keeps them in an in-memory buffer.
Gmetad: also a daemon. It periodically polls the gmond instances, pulls data from them, and stores their metrics in the RRD storage engine. It can query multiple clusters and aggregate their metrics. The web front-end is generated from the data it stores.
Ganglia-web: the web front-end. It should be installed on the machine running gmetad so that it can read the RRD files. A cluster is a logical group of hosts and metric data, for example database servers, web servers, production, testing, or QA. Clusters are completely separate from one another, and you need to run a separate gmond instance for each cluster.
Generally, each cluster needs one receiving gmond, and each site needs one gmetad.
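To make the receiver role concrete, here is a minimal Python sketch of reading what a receiving gmond has aggregated. It assumes the default tcp_accept_channel on port 8649 is left enabled, in which case gmond answers a TCP connection with an XML dump of its in-memory state. The function names fetch_gmond_xml and parse_metrics are made up for this illustration and are not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_gmond_xml(host, port=8649, timeout=5.0):
    """Connect to a gmond TCP port and return the raw XML dump as bytes.

    Assumption: gmond has a tcp_accept_channel on this port.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # the peer closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def parse_metrics(xml_data):
    """Return {host_name: {metric_name: value_string}} from a gmond dump."""
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        metrics = {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        result[host.get("NAME")] = metrics
    return result
```

For example, parse_metrics(fetch_gmond_xml("10.171.29.191")) would list the metrics known to the gmond on the master used later in this article, provided that host is reachable.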
Figure 1. Ganglia workflow
Figure 1 shows the Ganglia workflow:
On the left is the gmond process running on each node. Its configuration is determined solely by the /etc/gmond.conf file on that node (the exact path depends on how Ganglia was installed; the yum packages used later in this article place it under /etc/ganglia/). You therefore have to install gmond and configure this file on every monitored node.
In the upper-right corner is a central machine with more responsibilities (usually, but not necessarily, one of the machines in the cluster). The gmetad process runs on this machine, collects information from each node, and stores it with RRDtool. Its configuration is determined solely by /etc/gmetad.conf.
The bottom-right corner shows the web front-end. When you browse the site, PHP scripts read the information from the RRDtool database and generate the charts dynamically.
1.2 Ganglia running modes (unicast and multicast)
Ganglia can collect data in unicast or multicast mode. The default is multicast.
Unicast: each node sends the monitoring data it collects to one specific machine or to several specific machines. The data can cross network segments.
Multicast: each node sends the monitoring data it collects to all machines in the same network segment and also receives the monitoring data sent by all machines in that segment. Because the data is sent as multicast packets, the machines must be in the same network segment. Different transmission channels can, however, be defined within one segment.
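The address configured for a channel decides which of the two modes applies: IPv4 multicast addresses lie in 224.0.0.0/4, and anything else is an ordinary unicast destination. The following Python sketch illustrates the distinction with the standard library only. It does not speak gmond's wire format, it handles IPv4 sockets only, and the helper names channel_mode and open_send_socket are hypothetical.

```python
import ipaddress
import socket


def channel_mode(address):
    """Classify an IP address string as 'multicast' or 'unicast'."""
    if ipaddress.ip_address(address).is_multicast:
        return "multicast"
    return "unicast"


def open_send_socket(address, ttl=1):
    """Open an IPv4 UDP socket suitable for sending to the given address.

    For a multicast destination the TTL limits how many router hops the
    packets may cross; ttl=1 keeps them on the local network segment.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if channel_mode(address) == "multicast":
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

With these helpers, channel_mode("239.2.11.71") reports the default Ganglia group as multicast, while a host address such as 10.171.29.191 is reported as unicast.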
2. Install Ganglia
1. Topology description
Three hosts:
10.171.29.191 master
10.171.94.155 slave1
10.251.0.197 slave3
The master node runs gmetad and ganglia-web; all three machines run gmond.
Perform the following steps as the root user:
2. Install gmetad and ganglia-web on the master
yum install ganglia-web.x86_64
yum install ganglia-gmetad.x86_64
3. Install gmond on all three machines
yum install ganglia-gmond.x86_64
4. Configure /etc/ganglia/gmond.conf on the three machines and modify the following content:
udp_send_channel {
  #bind_hostname = yes # Highly recommended, soon to be default.
                       # This option tells gmond to use a source address
                       # that resolves to the machine's hostname. Without
                       # this, the metrics may appear to come from any
                       # interface and the DNS names associated with
                       # those IPs will be used to create the RRDs.
  mcast_join = 10.171.29.191
  port = 8649
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  #mcast_join = 239.2.11.71
  port = 8649
  #bind = 239.2.11.71
}
That is, replace the default multicast address in udp_send_channel with the master's address, and comment out the two lines in udp_recv_channel that contain the multicast IP address (mcast_join and bind).
5. Modify /etc/ganglia/gmetad.conf on the master.
Modify data_source:
data_source "my cluster" 10.171.29.191
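A data_source line consists of the keyword, a cluster name in plain ASCII double quotes, an optional polling interval in seconds, and one or more host or host:port entries. Typographic quotes pasted from a web page are easy to overlook, so one option is to generate the line. The Python helper below is a sketch for illustration only; the name data_source_line is hypothetical.

```python
def data_source_line(name, hosts, interval=None):
    """Build a gmetad data_source line using plain ASCII double quotes.

    name     -- cluster name shown in the web front-end
    hosts    -- list of "host" or "host:port" strings to poll
    interval -- optional polling interval in seconds
    """
    if '"' in name:
        raise ValueError("cluster name must not contain a double quote")
    if not hosts:
        raise ValueError("at least one host is required")
    parts = ["data_source", '"%s"' % name]
    if interval is not None:
        parts.append(str(int(interval)))
    parts.extend(hosts)
    return " ".join(parts)
```

Calling data_source_line("my cluster", ["10.171.29.191"]) produces the line used in this step.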
6. Link the web front-end into the web root:
ln -s /usr/share/ganglia /var/www/ganglia
If the link causes problems, copy the contents of /usr/share/ganglia directly to /var/www/ganglia instead.
7. Modify /etc/httpd/conf.d/ganglia.conf:
#
# Ganglia monitoring system php web frontend
#

Alias /ganglia /usr/share/ganglia

<Location /ganglia>
  Order deny,allow
  Allow from all
  Allow from 127.0.0.1
  Allow from ::1
  # Allow from .example.com
</Location>
Change Deny from all to Allow from all.
8. Start
service gmetad start
service gmond start
/usr/sbin/apachectl start
9. Open the web page
http://ip/ganglia
Notes:
1. The information collected by gmetad is stored in /var/lib/ganglia/rrds/
2. Run the following command to check whether data is being transmitted:
tcpdump port 8649
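Another way to confirm that data is arriving is to look at the RRD files gmetad has written. The Python sketch below assumes the usual layout of one directory per cluster and one per host below the rrds directory, with one .rrd file per metric, and it assumes that summary data lives in directories named __SummaryInfo__, which it skips. The function name list_rrd_metrics is hypothetical.

```python
import os


def list_rrd_metrics(root="/var/lib/ganglia/rrds"):
    """Return {cluster: {host: sorted metric names}} for *.rrd files.

    Assumed layout: <root>/<cluster>/<host>/<metric>.rrd
    """
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        parts = os.path.relpath(dirpath, root).split(os.sep)
        if len(parts) != 2:  # only <cluster>/<host> directories are of interest
            continue
        cluster, host = parts
        if host == "__SummaryInfo__":  # assumed name of per-cluster summaries
            continue
        names = sorted(f[:-4] for f in filenames if f.endswith(".rrd"))
        if names:
            found.setdefault(cluster, {})[host] = names
    return found
```

An empty result some minutes after starting the daemons suggests that gmetad is not receiving anything from the configured data source.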
3. Configure Hadoop and HBase
1. Configure Hadoop
Edit hadoop-metrics2.properties:
# syntax: [prefix].[source|sink|jmx].[instance].[options]
# See package.html for org.apache.hadoop.metrics2 for details

*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink

#namenode.sink.file.filename=namenode-metrics.out
#datanode.sink.file.filename=datanode-metrics.out
#jobtracker.sink.file.filename=jobtracker-metrics.out
#tasktracker.sink.file.filename=tasktracker-metrics.out
#maptask.sink.file.filename=maptask-metrics.out
#reducetask.sink.file.filename=reducetask-metrics.out

# Below are for sending metrics to Ganglia
#
# for Ganglia 3.0 support
# *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink30
#
# for Ganglia 3.1 support
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31

*.sink.ganglia.period=10

# default for supportsparse is false
*.sink.ganglia.supportsparse=true

*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40

namenode.sink.ganglia.servers=10.171.29.191:8649
datanode.sink.ganglia.servers=10.171.29.191:8649
jobtracker.sink.ganglia.servers=10.171.29.191:8649
tasktracker.sink.ganglia.servers=10.171.29.191:8649
maptask.sink.ganglia.servers=10.171.29.191:8649
reducetask.sink.ganglia.servers=10.171.29.191:8649
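Because every daemon prefix needs its own servers entry, a mistyped prefix or a stale address silently leaves that daemon unmonitored. The Python sketch below checks the entries in a hadoop-metrics2.properties text. The set of prefixes is taken from the file shown above and is an assumption to extend for other daemons. The parser handles only simple key=value lines, and the function names are hypothetical.

```python
# Prefixes used in the file above; an assumption, extend as needed.
KNOWN_PREFIXES = {
    "namenode", "datanode", "jobtracker",
    "tasktracker", "maptask", "reducetask",
}


def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_ganglia_servers(text, expected):
    """Return a list of problems found in *.sink.ganglia.servers entries."""
    problems = []
    suffix = ".sink.ganglia.servers"
    for key, value in parse_properties(text).items():
        if not key.endswith(suffix):
            continue
        prefix = key[: -len(suffix)]
        if prefix not in KNOWN_PREFIXES:
            problems.append("unknown prefix: " + prefix)
        if value != expected:
            problems.append("%s points to %s" % (key, value))
    return problems
```

An empty list from check_ganglia_servers(open(path).read(), "10.171.29.191:8649") means that every servers entry uses a known prefix and points at the master.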
2. Configure HBase
Edit hadoop-metrics.properties:
# See http://wiki.apache.org/hadoop/GangliaMetrics
# Make sure you know whether you are using ganglia 3.0 or 3.1.
# If 3.1, you will have to patch your hadoop instance with HADOOP-4675
# And, yes, this file is named hadoop-metrics.properties rather than
# hbase-metrics.properties because we're leveraging the hadoop metrics
# package and hadoop-metrics.properties is an hardcoded-name, at least
# for the moment.
#
# See also http://hadoop.apache.org/hbase/docs/current/metrics.html
# GMETADHOST_IP is the hostname (or) IP address of the server on which the ganglia
# meta daemon (gmetad) service is running

# Configuration of the "hbase" context for NullContextWithUpdateThread
# NullContextWithUpdateThread is a null context which has a thread calling
# periodically when monitoring is started. This keeps the data sampled
# correctly.
hbase.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
hbase.period=10

# Configuration of the "hbase" context for file
# hbase.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# hbase.fileName=/tmp/metrics_hbase.log

# HBase-specific configuration to reset long-running stats (e.g. compactions)
# If this variable is left out, then the default is no expiration.
hbase.extendedperiod = 3600

# Configuration of the "hbase" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
hbase.period=10
hbase.servers=10.171.29.191:8649

# Configuration of the "jvm" context for null
jvm.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
jvm.period=10

# Configuration of the "jvm" context for file
# jvm.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# jvm.fileName=/tmp/metrics_jvm.log

# Configuration of the "jvm" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=10.171.29.191:8649

# Configuration of the "rpc" context for null
rpc.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
rpc.period=10

# Configuration of the "rpc" context for file
# rpc.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# rpc.fileName=/tmp/metrics_rpc.log

# Configuration of the "rpc" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=10.171.29.191:8649

# Configuration of the "rest" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rest.period=10
rest.servers=10.171.29.191:8649
Restart hadoop and hbase.