"Tools" Ganglia monitoring technology analysis

Source: Internet
Author: User
Tags: rrdtool

Ganglia is a distributed monitoring tool for the nodes of grids and clusters. Its Web interface shows the state of each node and can render the data as graphs. Ganglia is an open source monitoring project started at UC Berkeley and designed to scale to thousands of nodes. Each computer runs a daemon named gmond that collects metric data, such as processor speed and memory usage, from the operating system and sends it to a specified host. Hosts that receive all of the metric data can display it and can pass a condensed form of it up a hierarchy. This hierarchical design is what lets Ganglia scale well. gmond adds very little system load, so it can run on every computer in the cluster without affecting user performance.

Terminology

Metrics: Measurements taken from a running computer, such as load or memory usage. The English term is kept untranslated throughout this article.

Node: A single computer, possibly with multiple CPUs.

Cluster: A group of nodes. The nodes of a cluster are usually connected by a high-bandwidth network, up to gigabit speeds. Within a cluster each node multicasts its own data, so every node holds the state of the whole cluster; this redundancy makes the cluster more robust. Nodes in a cluster usually share the same operating system and architecture and are managed by the same administrator.

Grid: A group of clusters. A grid brings together separate clusters across a wide-area network. Document 3 also describes planetary-scale systems: global networks that are typically deployed on nodes at the roots of backbone networks, under the assumption that bandwidth is scarce and expensive and that congestion is common. One example is the grid at Berkeley, California: http://monitor.millennium.berkeley.edu, where you can view a variety of data by selecting a grid or a cluster.

Ganglia's components and their functions

gmond (Ganglia Monitoring Daemon): The data collection service. It runs on each node, and its configuration file is /etc/gmond.conf.

gmetad (Ganglia Meta Daemon): The data aggregation service; its configuration file is /etc/gmetad.conf. It collects gmond data by polling, aggregates the information from the cluster, and stores it in a local RRDtool database. Ideally each cluster has its own gmetad, so that a multilevel network can be built.

Web front end: A set of PHP scripts that visualizes the data and draws charts. It runs on any Web server that supports PHP, SSL, and XML; Apache 2 is the usual choice.

Additional advanced tools

gmetric adds extra node state that you want to monitor, and gstat reads Ganglia data directly. Both tools must be available on each node that needs these functions.

How Ganglia works

Within a cluster, the nodes multicast their metric data over UDP in XDR, a compact binary format, so each node holds the information of all nodes in the cluster. If gmetad fails to poll one node of the cluster, it can poll another node instead. gmetad sends the cluster's data to an upper-level gmetad node over TCP.
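Because every node holds the full cluster state, a poller only needs one reachable node. The following is a minimal Python sketch of that failover idea, not gmetad's actual implementation. The function names are hypothetical, and port 8649 is assumed because it is the port used in the configuration shown in this article.

```python
import socket
from typing import Callable, Iterable, Tuple


def fetch_xml_over_tcp(host: str, port: int = 8649, timeout: float = 5.0) -> str:
    """Read everything a node sends on its TCP port until it closes the connection."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def poll_with_failover(hosts: Iterable[str],
                       fetch: Callable[[str], str] = fetch_xml_over_tcp) -> Tuple[str, str]:
    """Return (host, payload) from the first node that answers.

    Any node is good enough, because each one holds the whole cluster state.
    """
    errors = []
    for host in hosts:
        try:
            return host, fetch(host)
        except OSError as exc:  # connection refused, timeout, DNS failure, ...
            errors.append(f"{host}: {exc}")
    raise ConnectionError("no node answered: " + "; ".join(errors))
```

Passing the fetch function as a parameter keeps the failover logic separate from the network code, so it can be exercised without a running cluster.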

The gmond program consists of multiple threads:

The collect-and-publish thread collects the node's metrics and multicasts them to the group.

The listening threads listen on the multicast port and store the received metrics in a multilevel hash table in memory. A pool of XML export threads answers TCP requests by sending the metrics of the cluster.

gmond does not persist data; it only listens, keeps the current state in memory, and sends it on request. Nodes use heartbeat messages to detect whether their peers are alive: if a node has not multicast any metrics for some time, it is considered down. Each time gmond starts, it also multicasts its start time. When a neighbor sees from this that a node has restarted, it deletes all the metrics it had saved for that node.
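The in-memory bookkeeping described above can be sketched in Python as follows. This is an illustration only, not gmond's C implementation: the class name, the method names, and the 80-second timeout are all assumptions made for the example.

```python
import time


class ClusterState:
    """In-memory view of a cluster: host -> start time, last-heard time, metrics."""

    def __init__(self, host_timeout: float = 80.0):
        # host_timeout is a hypothetical value chosen for this sketch.
        self.host_timeout = host_timeout
        self.hosts = {}

    def _entry(self, host, now):
        return self.hosts.setdefault(
            host, {"started": None, "last_heard": now, "metrics": {}}
        )

    def on_heartbeat(self, host, started, now=None):
        """Record a heartbeat carrying the sender's start time."""
        now = time.time() if now is None else now
        entry = self._entry(host, now)
        if entry["started"] is not None and entry["started"] != started:
            # A different start time means the node restarted: drop its old metrics.
            entry["metrics"].clear()
        entry["started"] = started
        entry["last_heard"] = now

    def on_metric(self, host, name, value, now=None):
        """Store the latest value of one metric for one host."""
        now = time.time() if now is None else now
        entry = self._entry(host, now)
        entry["metrics"][name] = value
        entry["last_heard"] = now

    def is_up(self, host, now=None):
        """A host counts as up if it was heard from within host_timeout seconds."""
        now = time.time() if now is None else now
        entry = self.hosts.get(host)
        return entry is not None and now - entry["last_heard"] <= self.host_timeout
```

Nothing is written to disk, which mirrors the point that the state is rebuilt purely from what the node hears on the network.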

gmetad periodically polls its data sources and assigns one thread to each source. The collected metrics are parsed with a SAX XML parser and looked up through a gperf-generated hash table, which makes the data easy to process. The processed data is finally stored with RRDtool.
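The SAX parsing step can be illustrated with Python's standard xml.sax module. The XML below is a hand-written sample that follows the general GANGLIA_XML / CLUSTER / HOST / METRIC nesting; the host name, address, and values are made up, and real output carries more attributes. The handler and function names are hypothetical.

```python
import xml.sax

# Hand-written sample for illustration; not captured from a real cluster.
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
 <CLUSTER NAME="Green" LOCALTIME="1329223205">
  <HOST NAME="node1" IP="10.0.0.1" REPORTED="1329223200">
   <METRIC NAME="load_one" VAL="0.04" TYPE="float" UNITS=""/>
   <METRIC NAME="cpu_num" VAL="1" TYPE="uint16" UNITS="CPUs"/>
  </HOST>
 </CLUSTER>
</GANGLIA_XML>"""


class MetricHandler(xml.sax.ContentHandler):
    """Collect metrics into {(cluster, host): {metric name: raw value string}}."""

    def __init__(self):
        super().__init__()
        self.cluster = None
        self.host = None
        self.metrics = {}

    def startElement(self, name, attrs):
        # SAX delivers elements as a stream, so track the enclosing cluster and host.
        if name == "CLUSTER":
            self.cluster = attrs["NAME"]
        elif name == "HOST":
            self.host = attrs["NAME"]
        elif name == "METRIC":
            key = (self.cluster, self.host)
            self.metrics.setdefault(key, {})[attrs["NAME"]] = attrs["VAL"]


def parse_metrics(xml_text: str) -> dict:
    handler = MetricHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.metrics
```

A streaming parser suits this job because the document for a large cluster can be big, and only the current cluster and host need to be remembered while reading it.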

Composition of metrics

Metric data is gathered by gmond's built-in collectors or by the gmetric program. It is generally stored compactly in XDR (External Data Representation) form as (key, value) pairs, where the key is 4 bytes and the value is 4 to 8 bytes. The set of metrics, their collection frequency, and their transmit interval are defined in gmond.conf. gmond maintains a collection table in which each metric has its own attributes.
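To make the sizes concrete, here is a small Python sketch that packs a (key, value) pair the way XDR lays out basic types: big-endian, with 4-byte integers and floats and 8-byte doubles. It only illustrates the 4-byte key and 4 to 8 byte value described above; it is not gmond's actual wire format, and the function names and the kind labels are assumptions.

```python
import struct

# XDR basic types are big-endian: 4-byte int, 4-byte float, 8-byte double.
# The first field (">I") is the 4-byte unsigned key.
_FORMATS = {"int": ">Ii", "float": ">If", "double": ">Id"}


def pack_metric(key: int, value, kind: str) -> bytes:
    """Pack a 4-byte key followed by a 4-byte or 8-byte value."""
    return struct.pack(_FORMATS[kind], key, value)


def unpack_metric(data: bytes, kind: str):
    """Inverse of pack_metric: return (key, value)."""
    return struct.unpack(_FORMATS[kind], data)


print(len(pack_metric(7, 0.25, "float")))   # 4-byte key + 4-byte value
print(len(pack_metric(7, 0.25, "double")))  # 4-byte key + 8-byte value
```

A fixed binary layout like this is much smaller than the equivalent XML, which matters when every node multicasts its metrics to every other node.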

Data flow in a multi-cluster heterogeneous Ganglia network

There are four types of clusters in the diagram:

Yellow cluster: has both local nodes and a front end for display. It provides a Web server for viewing Ganglia data, which covers not only its own local nodes (optional) but also the data of the blue and green clusters.

Light green cluster: a front-end Web display service, typically with no local nodes.

Blue cluster: has no local data collector. Its nodes share all data with one another (which is easy, because gmond sends data by multicast), and one of the nodes then sends the data to the upper-level data collector. The gmetad service of the yellow cluster collects and stores the data; if it were not saved there, the data would be lost.

Dark green cluster: has a local data collector and data store. The green nodes also share data with one another, but the data is collected and stored by a cluster head node, which sends it to the upper-level yellow cluster over TCP when asked.


General networking recommendations:

1. A network made up of many dark green clusters and yellow clusters with local nodes.

2. A network made up of many blue clusters and yellow clusters without local nodes.


Cluster configurations

Configuration of green clusters

Write the default gmond configuration to gmond.conf:

gmond -t > /etc/gmond.conf

Then modify gmond.conf as follows:

/* This configuration is as close to 2.5.x default behavior as possible
   The values closely match ./gmond/metric.h definitions in 2.5.x */
globals {
  daemonize = yes
  setuid = yes
  user = nobody
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  host_dmax = 0 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
}

/* If a cluster attribute is specified, then all gmond hosts are wrapped inside
 * of a <CLUSTER> tag. If you do not specify a cluster tag, then all <HOSTS> will
 * NOT be wrapped inside of a <CLUSTER> tag. */
cluster {
  name = "Green"
  owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}

/* The host section describes attributes of the host, like the location */
host {
  location = "unspecified"
}

/* Feel free to specify as many udp_send_channels as you like. Gmond
   used to only support having a single channel */
udp_send_channel {
  mcast_join = Green_header
  port = 8649
}

/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
  port = 8649
  family = inet4
}

...


For the mcast_join parameter, Green_header is the host name of the cluster head node; you can also specify its IP address.

Then restart the gmond service.

On the cluster head node, add the following line to /etc/gmetad.conf:

data_source "Green" localhost

Configuration of blue clusters

The blue cluster is configured like the green cluster: set the cluster name and the name of the cluster head node, then restart the gmond service on all nodes.

Configuration of yellow clusters

Most of the configuration is the same as for green clusters. Add the following lines to /etc/gmetad.conf:

data_source "Yellow" localhost

data_source "Blue" Blue_header

data_source "Green" Green_header

gmetad will then:

1. Contact the local gmond to get the status data of all yellow nodes.

2. Contact gmond on the Blue_header node to get the status data of all blue nodes. This data is stored in the local RRDtool database.

3. Contact the Green_header node to obtain the aggregated RRDtool data collected by its gmetad. Note that this data is not saved by RRDtool in the yellow cluster, so whenever the front-end Web page is refreshed, the updated data is requested from Green_header again.


In addition, you can set the name of the grid in /etc/gmetad.conf:

gridname "Rainbow"

The Ganglia Web page will now show a grid called Rainbow with three clusters: Yellow, Green, and Blue.
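To check what such a gmetad.conf declares, the data_source and gridname lines can be read with a few lines of Python. This is a minimal sketch that handles only these two directives plus an optional numeric polling interval after the source name; the function name and the shape of the returned dictionary are assumptions, and the real file supports more options.

```python
import shlex


def parse_gmetad_conf(text: str) -> dict:
    """Extract the grid name and the data sources from gmetad.conf text."""
    result = {"gridname": None, "sources": {}}
    for raw in text.splitlines():
        # shlex handles the quoted names and drops '#' comments.
        tokens = shlex.split(raw, comments=True)
        if not tokens:
            continue
        keyword = tokens[0].lower()
        if keyword == "data_source" and len(tokens) >= 3:
            name, rest = tokens[1], tokens[2:]
            interval = None
            if rest[0].isdigit():
                # An optional polling interval in seconds may precede the hosts.
                interval, rest = int(rest[0]), rest[1:]
            result["sources"][name] = {"interval": interval, "hosts": rest}
        elif keyword == "gridname" and len(tokens) >= 2:
            result["gridname"] = tokens[1]
    return result


conf = '''
data_source "Yellow" localhost
data_source "Blue" Blue_header
data_source "Green" Green_header
gridname "Rainbow"
'''
print(parse_gmetad_conf(conf))
```

Several hosts may follow one source name, which fits the failover behavior described earlier: any listed node of a cluster can answer for the whole cluster.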

Some advanced topics

Using gmetric

Add the firmware model as a metric:

gmetric --name firmware --value `lsattr -El sys0 -a modelname -F value` --type string

Add the number of disks:

gmetric --name number_of_disks --value `lspv | wc -l` --type int32

Add monitoring of an arbitrary value (--name is the metric's name, --value is obtained from the myget program, and --type gives the type of the value):

gmetric --name TPM --value `/usr/local/bin/myget` --type double

Each of the commands above reports its value only once. For a long-term display, run the command repeatedly, once every 60 seconds. After a few minutes the data will appear on the Web page.
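One way to repeat a gmetric call every 60 seconds is a small wrapper loop. The Python sketch below assumes gmetric is on the PATH and uses only the --name, --value, and --type options shown above; the function names and the example reader are hypothetical.

```python
import subprocess
import time


def build_gmetric_cmd(name: str, value, metric_type: str) -> list:
    """Build the argument list for one gmetric call."""
    return ["gmetric", "--name", name, "--value", str(value), "--type", metric_type]


def publish_forever(name, read_value, metric_type, interval: float = 60.0):
    """Call read_value() and publish the result with gmetric every interval seconds."""
    while True:
        cmd = build_gmetric_cmd(name, read_value(), metric_type)
        # check=False: one failed publish should not stop the loop.
        subprocess.run(cmd, check=False)
        time.sleep(interval)


# Example (not run here): publish a constant as a stand-in for a real reading.
# publish_forever("number_of_disks", lambda: 4, "int32")
```

Passing the arguments as a list avoids shell quoting problems when a value contains spaces.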

For more information on gmetric, see http://ganglia.wiki.sourceforge.net/ganglia_readme.

A few custom gmetric scripts are available for reference at http://ganglia.sourceforge.net/gmetric/

Using gstat

gstat displays the data directly on the command line, for example:

# gstat
CLUSTER INFORMATION
Name: my_hadoop
Hosts: 3
Gexec Hosts: 0
Dead Hosts: 0
Localtime: Tue Feb 14 20:40:05 2012

There are no hosts running gexec at this time

You can get more information by adding parameters:

# gstat --all --single_line
CLUSTER INFORMATION
Name: my_hadoop
Hosts: 3
Gexec Hosts: 0
Dead Hosts: 0
Localtime: Tue Feb 14 20:39:43 2012

CLUSTER HOSTS
Hostname LOAD CPU Gexec
CPUs (Procs/Total) [1, 5, 15min] [User, Nice, System, Idle, Wio]

RAC2 1 (0/481) [0.04, 0.14, 0.11] [2.3, 0.0, 0.4, 97.3, 0.1] OFF
RAC3 1 (0/406) [0.07, 0.04, 0.01] [0.2, 0.0, 0.4, 99.4, 0.0] OFF
Rac1 1 (0/777) [0.09, 0.43, 0.42] [2.7, 0.0, 0.9, 96.3, 0.0] OFF
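Single-line host records like the ones above are easy to process in a script. The following Python sketch parses one such line into a dictionary. It assumes the column layout shown in this article (host, CPU count, running/total processes, three load averages, five CPU percentages, gexec flag); the function and field names are hypothetical, and other gstat versions may format the line differently.

```python
import re

# One host record from single-line gstat output, as laid out in this article.
HOST_LINE = re.compile(
    r"^\s*(?P<host>\S+)\s+(?P<cpus>\d+)\s+"
    r"\(\s*(?P<running>\d+)\s*/\s*(?P<total>\d+)\s*\)\s+"
    r"\[(?P<load>[^\]]*)\]\s+\[(?P<cpu>[^\]]*)\]\s+(?P<gexec>\S+)\s*$"
)

CPU_FIELDS = ("user", "nice", "system", "idle", "wio")


def parse_host_line(line: str):
    """Return a dict for a host record, or None if the line is not a host record."""
    match = HOST_LINE.match(line)
    if match is None:
        return None
    load = [float(x) for x in match["load"].split(",")]
    cpu = [float(x) for x in match["cpu"].split(",")]
    return {
        "host": match["host"],
        "cpus": int(match["cpus"]),
        "procs_running": int(match["running"]),
        "procs_total": int(match["total"]),
        "load": load,                      # 1, 5 and 15 minute load averages
        "cpu": dict(zip(CPU_FIELDS, cpu)),
        "gexec": match["gexec"],
    }


record = parse_host_line(
    "RAC2 1 (0/481) [0.04, 0.14, 0.11] [2.3, 0.0, 0.4, 97.3, 0.1] OFF"
)
print(record["host"], record["load"], record["cpu"]["idle"])
```

Header lines and the cluster summary do not match the pattern, so the function returns None for them and a caller can simply skip those lines.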


