Ganglia is a distributed monitoring tool for the nodes of clusters and grids. Its web interface shows the state of every node and can render the data as graphs. Ganglia is an open source project started at UC Berkeley and designed to monitor thousands of nodes. Each computer runs a daemon named gmond that collects and sends metric data such as processor speed and memory usage. The data is gathered from the operating system and sent to a designated host. A host that receives all of the metric data can display it and can pass a condensed form of it up a hierarchy. This hierarchical design is what allows Ganglia to scale well. gmond puts very little load on the system, so it can run on every computer in the cluster without affecting user performance.
Terminology
Metrics: the data measured on a running computer. The original English term is used throughout this article.
Node: a single computer, possibly with multiple CPUs.
Cluster: a group of nodes. The nodes are usually connected by high-bandwidth links, up to gigabit speed. Within a cluster, each node multicasts its own data, so every node holds the state of the whole cluster; this redundant design improves the robustness of the cluster. Nodes in a cluster usually share the same operating system and architecture and are managed by the same administrator.
Grid: a group of clusters. Grids bring together disparate clusters across wide area networks. Document 3 also introduces the concept of planetary-scale systems: a global network, typically deployed at the root nodes of the backbone network, in which bandwidth is assumed to be scarce and expensive and congestion is common. This is a grid at Berkeley, California: http://monitor.millennium.berkeley.edu You can view a variety of data there by selecting a grid or a cluster.
Ganglia's components and functions
gmond (Ganglia Monitor Daemon): the data collection daemon. Its configuration file, /etc/gmond.conf, is located on each node.
gmetad (Ganglia Metadata Daemon): the data aggregation daemon. Its configuration file is /etc/gmetad.conf. It collects gmond data by polling, aggregates the information from the cluster, and stores it in a local RRDtool database. Ideally each cluster runs its own gmetad, so that a multilevel network can be built.
Web front end: a set of PHP scripts that visualizes the data and draws charts. It can run on any web server that supports PHP, SSL, and XML; the Apache 2 web server is the usual choice.
Additional Advanced Tools
gmetric adds extra node state that you want to monitor, and gstat reads Ganglia data directly. Both must be available on each node that needs these functions.
How Ganglia works
Within a cluster, the nodes multicast compact XDR-encoded data over UDP, so every node holds the information of all nodes in the cluster. When gmetad fails to poll one node of the cluster, it can therefore poll another node instead. gmetad sends the data of its cluster to the upper-level gmetad node over TCP.
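The failover behaviour described above can be sketched in a few lines. This is only an illustration, not gmetad's actual code: the function name fetch_cluster_xml and the host names in the usage note are hypothetical, and the sketch assumes that the polled node writes its XML dump to the TCP connection and then closes it.

```python
import socket


def fetch_cluster_xml(sources, timeout=5.0):
    """Return the XML dump from the first reachable (host, port) pair.

    Every node holds the state of the whole cluster, so any node that
    answers is good enough; unreachable nodes are skipped.
    """
    last_error = None
    for host, port in sources:
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                chunks = []
                while True:
                    data = conn.recv(4096)
                    if not data:  # peer closed the connection: dump is complete
                        break
                    chunks.append(data)
                return b"".join(chunks).decode("utf-8", errors="replace")
        except OSError as exc:  # refused, unreachable, or timed out
            last_error = exc
    raise ConnectionError(f"no node answered: {last_error}")
```

A call could look like fetch_cluster_xml([("node1", 8649), ("node2", 8649)]), where node1 and node2 are placeholder host names and 8649 is the port used in the gmond.conf shown later in this article.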
The gmond program consists of multiple threads:
Collect-and-publish threads collect the metrics of the node and multicast them to the group.
Listening threads listen on the multicast port and store the received metrics in a multilevel hash table in memory.
XML export threads answer TCP requests by sending out the metrics of the cluster.
gmond does not persist the data; it only listens, keeps the metrics in memory, and sends them on request. Nodes use heartbeat messages to detect whether their peers are alive: if a node has not multicast any metrics for a period of time, it is considered down. Each time gmond starts, it multicasts its start time; when a neighbor learns from this that the machine has restarted, it deletes all the metrics it had saved for that node.
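The bookkeeping described above can be pictured with a small in-memory table. This is a sketch under stated assumptions, not gmond's implementation: the class name ClusterState, its method names, and the 60-second down_after threshold are all made up for illustration.

```python
import time


class ClusterState:
    """In-memory view of peers: last contact, start time, and metrics."""

    def __init__(self, down_after=60.0):
        self.down_after = down_after  # assumed threshold, in seconds
        self.hosts = {}  # host -> {"boot": ..., "last_seen": ..., "metrics": {}}

    def _entry(self, host, now):
        return self.hosts.setdefault(
            host, {"boot": None, "last_seen": now, "metrics": {}}
        )

    def on_heartbeat(self, host, boot_time, now=None):
        now = time.time() if now is None else now
        entry = self._entry(host, now)
        if entry["boot"] is not None and entry["boot"] != boot_time:
            # A different start time means the peer restarted:
            # everything saved for it is stale and is dropped.
            entry["metrics"].clear()
        entry["boot"] = boot_time
        entry["last_seen"] = now

    def on_metric(self, host, name, value, now=None):
        now = time.time() if now is None else now
        entry = self._entry(host, now)
        entry["metrics"][name] = value
        entry["last_seen"] = now

    def is_down(self, host, now=None):
        now = time.time() if now is None else now
        entry = self.hosts.get(host)
        return entry is None or now - entry["last_seen"] > self.down_after
```

Passing now explicitly keeps the sketch deterministic; a real daemon would rely on the system clock.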
gmetad periodically polls each data source and assigns one thread to each source. The collected metrics are parsed with a SAX XML parser and placed in a gperf hash table, which makes the data easy to process. Finally, the processed data is stored with RRDtool.
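As a rough illustration of the parsing step, the sketch below uses the SAX parser from the Python standard library as a stand-in and a plain dict in place of the gperf hash table. The names MetricHandler and parse_metrics are hypothetical, and only the CLUSTER, HOST, and METRIC elements with their NAME and VAL attributes are handled.

```python
import xml.sax


class MetricHandler(xml.sax.ContentHandler):
    """Collect metric values keyed by (cluster, host, metric name)."""

    def __init__(self):
        super().__init__()
        self.metrics = {}
        self._cluster = None
        self._host = None

    def startElement(self, name, attrs):
        if name == "CLUSTER":
            self._cluster = attrs.get("NAME")
        elif name == "HOST":
            self._host = attrs.get("NAME")
        elif name == "METRIC":
            key = (self._cluster, self._host, attrs.get("NAME"))
            self.metrics[key] = attrs.get("VAL")  # kept as the raw string


def parse_metrics(xml_text):
    """Parse an XML dump and return the collected metrics as a dict."""
    handler = MetricHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.metrics
```

A SAX parser reports elements as they stream past, so the whole document never has to be held as a tree, which suits large cluster dumps.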
Composition of the metrics
Metric data is obtained by gmond's built-in collectors or by the gmetric program. It is generally encoded in XDR (External Data Representation) and stored compactly as a (key, value) pair, where the key is 4 bytes and the value is 4 to 8 bytes. The number of metrics, their collection frequency, and their transmit interval are defined in gmond.conf. gmond maintains a collection table in which each metric has its own attributes.
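To make the sizes concrete, here is a minimal sketch of packing a (key, value) pair in XDR style, meaning big-endian fields in multiples of 4 bytes. It is not gmond's real wire format: the function names and the choice of an unsigned integer key with a float or double value are assumptions made for illustration.

```python
import struct


def pack_metric(key, value, double=False):
    """Pack a 4-byte unsigned key followed by a 4- or 8-byte float value."""
    fmt = ">Id" if double else ">If"  # ">" = big-endian, as XDR requires
    return struct.pack(fmt, key, value)


def unpack_metric(payload):
    """Inverse of pack_metric; the payload length selects the value width."""
    fmt = ">Id" if len(payload) == 12 else ">If"
    return struct.unpack(fmt, payload)
```

With these formats a packed pair occupies 8 bytes for a 4-byte value and 12 bytes for an 8-byte value, which is why the encoding is so cheap to multicast.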
Data flow of a multi-cluster heterogeneous Ganglia network
There are four types of clusters in the diagram:
Yellow cluster: has both local nodes and the front end for display. It provides a web server for viewing Ganglia data, which covers not only its own local nodes (optional) but also the data of the blue and green clusters.
Light green cluster: provides only the front-end web display and typically has no local nodes.
Blue cluster: has no local data collector. Its nodes share all their data (gmond sends it by multicast, so sharing is easy), and one of the nodes passes the data to the upper-level data collector. The gmetad service of the yellow cluster collects and stores it; if it is not stored there, the data is lost.
Dark green cluster: has its own local data collector and data store. The green nodes also share their data, but it is collected and stored by the cluster head node, which sends it to the upper yellow cluster over TCP when polled.
General Networking Recommendations:
1. The network consists of many dark green clusters and yellow clusters with local nodes
2. The network consists of many blue clusters and yellow clusters without local nodes
Configuration of the various clusters
Configuration of green clusters
Generate the default gmond configuration as gmond.conf:
gmond -t > /etc/gmond.conf
Then modify gmond.conf as follows:
/* This configuration is as close to 2.5.x default behavior as possible
   The values closely match ./gmond/metric.h definitions in 2.5.x */
globals {
  daemonize = yes
  setuid = yes
  user = nobody
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
}
/* If a cluster attribute is specified, then all gmond hosts are wrapped inside
 * of a <CLUSTER> tag. If you do not specify a cluster tag, then all <HOSTS> will
 * NOT be wrapped inside of a <CLUSTER> tag. */
cluster {
  name = "Green"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}
/* The host section describes attributes of the host, like the location */
host {
  location = "unspecified"
}
/* Feel free to specify as many udp_send_channels as you like. Gmond
   used to only support having a single channel. */
udp_send_channel {
  mcast_join = Green_header
  port = 8649
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  port = 8649
  family = inet4
}
...
For the mcast_join parameter, Green_header is the host name of the cluster head node; you can also specify its IP address instead.
Then restart the gmond service.
On the cluster head node, add the following line to /etc/gmetad.conf:
data_source "Green" localhost
Configuration of blue clusters
The blue cluster is configured like the green cluster: you only need to set the name of the cluster and the name of the cluster head node, and then restart the gmond service on all nodes.
Configuration of yellow clusters
Most of the configuration is the same as for green clusters, with the following lines added to /etc/gmetad.conf:
data_source "Yellow" localhost
data_source "Blue" Blue_header
data_source "Green" Green_header
With this configuration, gmetad will:
1. Contact the local gmond and get the status data of all yellow nodes.
2. Contact the gmond on the Blue_header node and get the status data of all blue nodes. This data is stored in the local RRDtool database.
3. Contact the Green_header node and obtain the aggregated data that its gmetad has collected in RRDtool. Note that this data is not stored in RRDtool on the yellow cluster, so every time the front-end web server refreshes, it requests the updated data from Green_header again.
In addition, you can set the name of the grid in /etc/gmetad.conf: gridname "Rainbow"
The Ganglia web page will now show a grid called Rainbow with three clusters: Yellow, Green, and Blue.
Some advanced topics
Using gmetric
Add a firmware metric:
gmetric --name firmware --value $(lsattr -El sys0 -a modelname -F value) --type string
Add the number of disks:
gmetric --name number_of_disks --value $(lspv | wc -l) --type int32
Add monitoring of any other value (name is the metric name, the value is produced by the myget program, and type declares the type of the number):
gmetric --name tpm --value $(/usr/local/bin/myget) --type double
Each of these commands reports the value only once. For a long-term display, run the command every 60 seconds. After a few minutes, the data will appear on the web page.
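One way to repeat a gmetric call every 60 seconds is a small loop like the sketch below; a scheduler such as cron would work just as well. The helper names gmetric_command and publish_every are hypothetical, and the sketch assumes the gmetric binary is on the PATH. Only the --name, --value, and --type options shown above are used.

```python
import subprocess
import time


def gmetric_command(name, value, metric_type):
    """Build the argument list for one gmetric call."""
    return ["gmetric", "--name", name, "--value", str(value), "--type", metric_type]


def publish_every(name, read_value, metric_type, interval=60.0, rounds=None,
                  run=subprocess.run, sleep=time.sleep):
    """Publish read_value() repeatedly; rounds=None means run forever.

    run and sleep are injectable so the loop can be tried without
    gmetric installed and without actually waiting.
    """
    done = 0
    while rounds is None or done < rounds:
        run(gmetric_command(name, read_value(), metric_type), check=True)
        done += 1
        sleep(interval)
```

For the tpm example above, read_value would be a function that returns the number that /usr/local/bin/myget prints.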
For more information on gmetric, see http://ganglia.wiki.sourceforge.net/ganglia_readme.
Here are a few custom gmetric scripts for reference: http://ganglia.sourceforge.net/gmetric/
Using gstat to get data
The gstat command displays the data directly, for example:
# gstat
CLUSTER INFORMATION
       Name: my_hadoop
      Hosts: 3
Gexec Hosts: 0
 Dead Hosts: 0
  Localtime: Tue Feb 14 20:40:05 2012

There are no hosts running gexec at this time
You can get more information by adding parameters:
# gstat --all --single_line
CLUSTER INFORMATION
       Name: my_hadoop
      Hosts: 3
Gexec Hosts: 0
 Dead Hosts: 0
  Localtime: Tue Feb 14 20:39:43 2012

CLUSTER HOSTS
Hostname                     LOAD                       CPU              Gexec
 CPUs (Procs/Total) [     1,     5, 15min] [  User,  Nice, System, Idle, Wio]
rac2    1 (    0/  481) [  0.04,  0.14,  0.11] [   2.3,   0.0,   0.4,  97.3,   0.1] OFF
rac3    1 (    0/  406) [  0.07,  0.04,  0.01] [   0.2,   0.0,   0.4,  99.4,   0.0] OFF
rac1    1 (    0/  777) [  0.09,  0.43,  0.42] [   2.7,   0.0,   0.9,  96.3,   0.0] OFF
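If the single-line host listing above needs to be consumed by a script, each host line can be split with a regular expression. The sketch below is based only on the sample output shown in this article; the name parse_host_line and the dictionary keys are made up for illustration, and other gstat versions may print a different layout.

```python
import re

# host, CPU count, (running/total processes), [load averages], [CPU %], gexec
HOST_LINE = re.compile(
    r"^\s*(?P<host>\S+)\s+(?P<cpus>\d+)\s+"
    r"\(\s*(?P<running>\d+)\s*/\s*(?P<total>\d+)\s*\)\s+"
    r"\[(?P<load>[^\]]*)\]\s+\[(?P<cpu>[^\]]*)\]\s+(?P<gexec>\S+)\s*$"
)


def _floats(text):
    return [float(part) for part in text.split(",")]


def parse_host_line(line):
    """Return a dict for a host line, or None for headers and other lines."""
    match = HOST_LINE.match(line)
    if match is None:
        return None
    return {
        "host": match["host"],
        "cpus": int(match["cpus"]),
        "procs_running": int(match["running"]),
        "procs_total": int(match["total"]),
        "load": _floats(match["load"]),  # 1, 5, and 15 minute averages
        "cpu": _floats(match["cpu"]),    # user, nice, system, idle, wio
        "gexec": match["gexec"],
    }
```

Header lines do not match the pattern and yield None, so the function can be applied to every line of the output.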