1 Ganglia Introduction
Ganglia is an open-source monitoring project started at UC Berkeley and designed to scale to thousands of nodes. Each computer runs a gmond daemon that collects metric data (such as processor speed and memory usage) from the operating system and sends it to a designated host. A host that receives all of the metric data can display it and pass a condensed form of it up the hierarchy. It is this hierarchical structure that lets Ganglia scale well. gmond places very little load on the system, so it can run as a small piece of code on every computer in the cluster without affecting user performance.
1.1 Ganglia components
The Ganglia monitoring suite consists of three main parts: gmond, gmetad, and the web interface, usually called ganglia-web.
Gmond: A daemon that runs on every node to be monitored. It collects monitoring statistics and sends and receives them on the same multicast or unicast channel. If it is a sender (mute = no), it collects basic metrics such as system load (load_one) and CPU usage, and it can also send custom metrics added through C or Python modules. If it is a receiver (deaf = no), it aggregates all the metrics sent from other hosts and keeps them in an in-memory buffer.
Gmetad: Also a daemon. It periodically polls the gmond daemons, pulls their data, and stores the metrics in the RRD storage engine. It can query multiple clusters and aggregate their metrics. Its data is also used to generate the web front end of the user interface.
Ganglia-web: As the name suggests, this is the web interface. It should be installed on the machine running gmetad so that it can read the RRD files. A cluster is a logical group of hosts and metric data, for example database servers, web servers, production, testing, or QA. Clusters are completely separate from one another, and you need to run a separate gmond instance for each cluster.
Generally, each cluster needs one receiving gmond, and each web site needs one gmetad.
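The custom Python metrics mentioned for gmond can be sketched as follows. This is a minimal example, assuming gmond's Python module support is enabled; it follows the usual module convention of exposing metric_init and metric_cleanup. The metric name example_random, its group, and its value range are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical metric name, used only for illustration.
METRIC_NAME = "example_random"


def read_value(name):
    """Callback that gmond invokes with the metric name; returns the current value."""
    return random.randint(0, 100)


def metric_init(params):
    """Called once when the module is loaded; returns the list of metric descriptors."""
    descriptor = {
        "name": METRIC_NAME,
        "call_back": read_value,
        "time_max": 90,
        "value_type": "uint",
        "units": "count",
        "slope": "both",
        "format": "%u",
        "description": "Example metric from a Python module",
        "groups": "example",
    }
    return [descriptor]


def metric_cleanup():
    """Called once on shutdown; this sketch has nothing to release."""
    pass


if __name__ == "__main__":
    # Stand-alone check outside gmond: print each metric once.
    for d in metric_init({}):
        print(d["name"], d["call_back"](d["name"]))
```

Running the file directly prints the metric once, which is a quick way to check the callback before handing the module to gmond.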
The Ganglia workflow is shown in Figure 1:
Figure 1 Ganglia workflow
On the left are the gmond processes running on each node. The configuration of this process is determined solely by the /etc/gmond.conf file on that node, so gmond must be installed and configured on every monitored node.
The upper-right corner is the central machine, which carries more responsibility (usually one of the cluster's nodes, though it need not be). The gmetad process runs on this machine, collects information from each node, and stores it with rrdtool. The configuration of this process is determined solely by /etc/gmetad.conf.
The bottom-right corner shows the web side. When the site is browsed, PHP scripts read the information from the rrdtool database and generate the charts dynamically.
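To make the data flow from gmond to gmetad more concrete, here is a small sketch of reading the kind of XML report that gmond hands to a poller. It assumes the usual GANGLIA_XML / CLUSTER / HOST / METRIC layout. The sample document is a hand-written, heavily shortened illustration, and the host name node1 and the metric values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; a real gmond report carries many more attributes.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
<CLUSTER NAME="hadoop">
<HOST NAME="node1" IP="10.82.58.209">
<METRIC NAME="load_one" VAL="0.25" TYPE="float" UNITS=""/>
<METRIC NAME="cpu_user" VAL="3.1" TYPE="float" UNITS="%"/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""


def parse_gmond_xml(xml_text):
    """Return {host name: {metric name: value string}} from a gmond XML report."""
    root = ET.fromstring(xml_text)
    hosts = {}
    for host in root.iter("HOST"):
        metrics = {m.get("NAME"): m.get("VAL") for m in host.iter("METRIC")}
        hosts[host.get("NAME")] = metrics
    return hosts


if __name__ == "__main__":
    for name, metrics in parse_gmond_xml(SAMPLE_XML).items():
        print(name, metrics)
```

Values are kept as strings because the TYPE attribute, not the parser, decides how each one should be interpreted.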
1.2 Ganglia running modes (unicast and multicast)
Ganglia's data collection can work in unicast or multicast mode. The default is multicast.
Unicast: Each node sends the monitoring data it collects to one specific machine, which may be on a different network segment.
Multicast: Each node sends the monitoring data it collects to all machines on the same network segment and also collects the data sent by all machines on that segment. Because the packets are sent broadcast-style, the machines must be on the same network segment. Different transmission channels can nevertheless be defined within one segment.
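The difference between the two modes shows up in how the send channel is addressed. The sketch below uses Python's ipaddress module to tell a multicast group address from an ordinary host address and renders a matching udp_send_channel section. The helper names channel_mode and send_channel are hypothetical, and the rendered block is a simplified illustration rather than a complete gmond.conf.

```python
import ipaddress


def channel_mode(host):
    """Return 'multicast' if host is an IP multicast address, otherwise 'unicast'."""
    return "multicast" if ipaddress.ip_address(host).is_multicast else "unicast"


def send_channel(host, port=8649, ttl=1):
    """Render a udp_send_channel section: mcast_join for multicast, host for unicast."""
    key = "mcast_join" if channel_mode(host) == "multicast" else "host"
    return "udp_send_channel {\n  %s = %s\n  port = %d\n  ttl = %d\n}" % (
        key, host, port, ttl)


if __name__ == "__main__":
    print(send_channel("239.2.11.71"))   # multicast group address
    print(send_channel("10.82.58.211"))  # unicast, one specific receiver
```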
2 Environment
Platform: Ubuntu 12.04
Hadoop: hadoop-1.0.4
HBase: hbase-0.94.5
Topology:
Figure 2 Hadoop and HBase topology
Software installation: apt-get
3 Installation and deployment (unicast)
3.1 Deployment method
Monitored nodes (gmond): 10.82.58.209, 10.82.58.211, and 10.82.58.213 (host 212 is no longer in the cluster).
Master node (gmetad, ganglia-web): 10.82.58.211
3.2 Installation
One point needs explaining first. I installed Ganglia directly from the Ubuntu packages. Both hadoop-1.0.4 and hbase-0.94.5 support Ganglia 3.0 and Ganglia 3.1, but the class that must be loaded in the configuration differs between the two, so we need to know which Ganglia version will be installed.
# sudo apt-cache show ganglia-webfrontend ganglia-monitor
Figure 3 Installation version information
The output shows that the packaged version is Ganglia 3.1.7, which is supported. Install ganglia-webfrontend and ganglia-monitor on host 211; the other monitored nodes only need ganglia-monitor.
# sudo apt-get install ganglia-webfrontend ganglia-monitor
Link the Ganglia web files into Apache's default document directory:
# sudo ln -s /usr/share/ganglia-webfrontend /var/www/ganglia
The ganglia-webfrontend package corresponds to the gmetad and ganglia-web components described above. It also installs apache2 and rrdtool automatically, which is very convenient.
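The version check in this section decides which Hadoop sink class is configured later. As a rough sketch, that choice can be written as a small function. The function name ganglia_sink_class is hypothetical, and it deliberately handles only the 3.0 and 3.1 series that the text says are supported; anything else is rejected rather than guessed.

```python
def ganglia_sink_class(version):
    """Map a Ganglia version string such as '3.1.7' to the Hadoop metrics2 sink class."""
    major, minor = (int(part) for part in version.split(".")[:2])
    if (major, minor) == (3, 0):
        return "org.apache.hadoop.metrics2.sink.ganglia.GangliaSink30"
    if (major, minor) == (3, 1):
        return "org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31"
    # Only the two series named in the text are handled in this sketch.
    raise ValueError("unsupported Ganglia version: %s" % version)


if __name__ == "__main__":
    print(ganglia_sink_class("3.1.7"))
```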
3.3 Ganglia configuration
/etc/gmond.conf must be configured on every node (the Ubuntu packages place it at /etc/ganglia/gmond.conf). The configuration is the same on all nodes:
globals {
  daemonize = yes                # run in the background
  setuid = yes
  user = ganglia                 # the user running gmond
  debug_level = 0                # debug level
  max_udp_msg_len = 1472
  mute = no                      # if yes (mute), this node no longer sends the data it collects
  deaf = no                      # if yes (deaf), this node no longer receives packets sent by other nodes
  host_dmax = 0 /* secs */
  cleanup_threshold = 300 /* secs */
  gexec = no                     # whether to use gexec
  send_metadata_interval = 10    # interval at which the node sends metadata /* secs */
}

/* If a cluster attribute is specified, then all gmond hosts are wrapped inside
 * of a <CLUSTER> tag. If you do not specify a cluster tag, then all <HOSTS> will
 * NOT be wrapped inside of a <CLUSTER> tag. */
cluster {
  name = "hadoop"                # cluster the current node belongs to
  owner = "unspecified"          # who is the owner of the node
  latlong = "unspecified"        # coordinates on the earth
  url = "unspecified"
}

/* The host section describes attributes of the host, like the location */
host {
  location = "unspecified"
}

/* Feel free to specify as many udp_send_channels as you like. Gmond
   used to only support having a single channel */
udp_send_channel {               # channel for sending UDP packets
  # In multicast mode this channel works on 239.2.11.71.
  # In unicast mode, host points to the master node; several
  # udp_send_channel sections can be configured in unicast mode.
  host = 10.82.58.211
  port = 8649                    # listening port
  ttl = 1
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {               # configuration for receiving UDP packets
  # mcast_join = 239.2.11.71
  port = 8649
  # bind = 239.2.11.71
}
Note that send_metadata_interval is set to 10 seconds. Ganglia metrics are sent separately from their metadata, which includes information such as the metric's group and type. If you restart the receiving gmond host, the metadata is lost, and gmond then discards incoming metric data because it does not know how to process it. The result is blank charts. In multicast mode, a gmond can talk to any other host and request the metadata again when it is missing. This is not possible in unicast mode, so you need to tell gmond to resend the metadata periodically.
You also need to configure /etc/gmetad.conf on the master node (10.82.58.211); the Ubuntu packages place it at /etc/ganglia/gmetad.conf. The name "hadoop" here must match the cluster name in gmond.conf above.
data_source "hadoop" 10.82.58.211:8649
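Because the data_source name has to match the cluster name in gmond.conf, a quick consistency check can catch typos before the daemons are restarted. The following is a rough sketch that works on the text of the two files. The helper names cluster_name and data_sources are hypothetical, and the parsing covers only the simple layout used in this article, not every syntax the two files accept.

```python
import re
import shlex


def cluster_name(gmond_conf):
    """Extract the name from the cluster { ... } section of gmond.conf text."""
    match = re.search(r'cluster\s*\{[^}]*?\bname\s*=\s*"([^"]*)"', gmond_conf)
    return match.group(1) if match else None


def data_sources(gmetad_conf):
    """Return {source name: [host:port, ...]} from gmetad.conf text."""
    sources = {}
    for line in gmetad_conf.splitlines():
        tokens = shlex.split(line, comments=True)  # drops '#' comments
        if len(tokens) >= 3 and tokens[0] == "data_source":
            addresses = tokens[2:]
            # An optional polling interval (a bare integer) may follow the name.
            if addresses[0].isdigit():
                addresses = addresses[1:]
            sources[tokens[1]] = addresses
    return sources


if __name__ == "__main__":
    gmond = 'cluster {\n  name = "hadoop"\n  owner = "unspecified"\n}\n'
    gmetad = 'data_source "hadoop" 10.82.58.211:8649\n'
    print(cluster_name(gmond) in data_sources(gmetad))
```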
3.4 Hadoop configuration
hadoop-metrics2.properties must be configured on every node that runs Hadoop, as shown below:
# syntax: [prefix].[source|sink|jmx].[instance].[options]
# See package.html for org.apache.hadoop.metrics2 for details

*.sink.file.class=org.apache.hadoop.metrics2.sink.FileSink

#namenode.sink.file.filename=namenode-metrics.out
#datanode.sink.file.filename=datanode-metrics.out
#jobtracker.sink.file.filename=jobtracker-metrics.out
#tasktracker.sink.file.filename=tasktracker-metrics.out
#maptask.sink.file.filename=maptask-metrics.out
#reducetask.sink.file.filename=reducetask-metrics.out

# Below are for sending metrics to Ganglia
#
# for Ganglia 3.0 support
# *.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink30
#
# for Ganglia 3.1 support
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31

*.sink.ganglia.period=10

# default for supportsparse is false
*.sink.ganglia.supportsparse=true

*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40

namenode.sink.ganglia.servers=10.82.58.211:8649
datanode.sink.ganglia.servers=10.82.58.211:8649
jobtracker.sink.ganglia.servers=10.82.58.211:8649
tasktracker.sink.ganglia.servers=10.82.58.211:8649
maptask.sink.ganglia.servers=10.82.58.211:8649
reducetask.sink.ganglia.servers=10.82.58.211:8649
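The per-daemon servers entries all follow one pattern, so they can be generated instead of typed by hand. The sketch below is only an illustration: the helper name ganglia_server_lines is hypothetical, and the daemon list is simply the six prefixes used in this article.

```python
# The six daemon prefixes that appear in the configuration above.
DAEMONS = ["namenode", "datanode", "jobtracker",
           "tasktracker", "maptask", "reducetask"]


def ganglia_server_lines(server, daemons=DAEMONS):
    """Return one '<daemon>.sink.ganglia.servers=<server>' line per daemon."""
    return ["%s.sink.ganglia.servers=%s" % (daemon, server) for daemon in daemons]


if __name__ == "__main__":
    print("\n".join(ganglia_server_lines("10.82.58.211:8649")))
```

Generating the lines keeps the address consistent across all daemons when the master node changes.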
3.5 HBase configuration
On every node that runs HBase, configure the metrics properties file in the HBase conf directory (hadoop-metrics.properties in hbase-0.94.5) as shown below. The "null" settings for the jvm and rpc contexts are commented out here so that each context has a single active definition.

# HBase-specific configuration to reset long-running stats (e.g. compactions)
# If this variable is left out, then the default is no expiration.
hbase.extendedperiod = 3600

# Configuration of the "hbase" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
hbase.period=10
hbase.servers=10.82.58.211:8649

# Configuration of the "jvm" context for null
# jvm.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
# jvm.period=10

# Configuration of the "jvm" context for file
# jvm.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# jvm.fileName=/tmp/metrics_jvm.log

# Configuration of the "jvm" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=10.82.58.211:8649

# Configuration of the "rpc" context for null
# rpc.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
# rpc.period=10

# Configuration of the "rpc" context for file
# rpc.class=org.apache.hadoop.hbase.metrics.file.TimeStampingFileContext
# rpc.fileName=/tmp/metrics_rpc.log

# Configuration of the "rpc" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rpc.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rpc.period=10
rpc.servers=10.82.58.211:8649

# Configuration of the "rest" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext
rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
rest.period=10
rest.servers=10.82.58.211:8649
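With several contexts in one file it is easy to lose track of which ones actually report to Ganglia. The sketch below reads the properties text and lists every context whose class is a GangliaContext variant, together with its servers value. The helper names load_properties and ganglia_contexts are hypothetical, and the parser is deliberately simplified: it handles only key=value lines and '#' comments, not the full Java properties syntax.

```python
def load_properties(text):
    """Parse simple key=value lines, skipping blank lines and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            # A later duplicate key overwrites an earlier one.
            props[key.strip()] = value.strip()
    return props


def ganglia_contexts(props):
    """Return {context: servers} for contexts whose class is a GangliaContext."""
    result = {}
    for key, value in props.items():
        context, _, option = key.partition(".")
        if option == "class" and "GangliaContext" in value:
            result[context] = props.get(context + ".servers")
    return result


if __name__ == "__main__":
    sample = (
        "hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext31\n"
        "hbase.servers=10.82.58.211:8649\n"
    )
    print(ganglia_contexts(load_properties(sample)))
```

A context that appears with a servers value of None has a Ganglia class but no target address, which is worth fixing before restarting HBase.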
4 Startup and inspection
Restart Hadoop and HBase first. Then start the gmond service on each node; the master node also needs to start the gmetad service.
# sudo service ganglia-monitor start
# sudo service gmetad start
You can then view the monitoring pages at http://10.82.58.211/ganglia.
Figure 4 Viewing cluster information on the web
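If the page stays empty, one quick thing to rule out is that the address given in data_source is not reachable from the master node. The sketch below is a generic TCP reachability check, not a Ganglia tool: the helper name port_open is hypothetical, and it assumes gmond accepts TCP connections on the same host and port that gmetad polls.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address taken from the data_source line in this article.
    print(port_open("10.82.58.211", 8649))
```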