When reprinting, please cite the source: http://blog.csdn.net/cywosp/article/details/39701245
Note: The code in this article was verified on 64-bit CentOS 6.5 with Ganglia 3.1. For detailed steps on installing with yum, see: http://blog.csdn.net/cywosp/article/details/39701141
1. Overview
The Ganglia project, started at the University of California, Berkeley, has become widely used cluster monitoring software. It can monitor and display state information for the nodes in a cluster, such as CPU, memory, and disk utilization, I/O load, and network traffic, and it renders historical data as graphs on a PHP page.
Ganglia is also highly extensible: users can add whatever status information they want to monitor. The visualized data makes it easy to judge the health of a cluster and to find areas where the cluster can be optimized.
Collecting all of this data repeatedly can affect node performance.
Network "jitter" occurs when a large number of small messages arrive at the same time; keeping the nodes' clocks consistent through the NTP service helps avoid this problem.
For more on how Ganglia works, see http://flyerlee.diandian.com/post/2013-06-03/40051002657. This article focuses on how to use the Python interface provided by Ganglia to develop the metrics you want.
2. Ganglia's metrics
What is a metric? The dictionary translates it as a standard or means of measurement. The graphs most commonly seen in Ganglia's web interface look like the following:
These graphs are not really metrics; each one is a summary of several metrics drawn by RRDtool so that the whole can be observed at a glance. The metrics discussed in this article (CPU metrics) look like this: each small chart represents one type of CPU-related data. The data are collected by program modules deployed on each cluster node, and developing such a collection module is the main subject of this article.
3. Developing custom metrics
There are two ways to add a custom metric to Ganglia: one is to run gmetric from the command line, and the other is to use the C and Python extension module interface that Ganglia provides to add your own modules.
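The first approach, running gmetric from the command line, can be sketched from Python as follows. This is only an illustration: the helper names build_gmetric_command and publish are hypothetical, and the sketch assumes the gmetric binary is on the PATH and a gmond daemon is running on the node.

```python
import subprocess


def build_gmetric_command(name, value, value_type="uint32", units=""):
    """Return the argument list for a single gmetric invocation."""
    cmd = ["gmetric", "--name", name, "--value", str(value),
           "--type", value_type]
    if units:
        cmd += ["--units", units]
    return cmd


def publish(name, value, value_type="uint32", units=""):
    """Run gmetric once; needs the gmetric binary and a running gmond."""
    return subprocess.call(build_gmetric_command(name, value, value_type, units))


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works anywhere.
    print(" ".join(build_gmetric_command("random_number_1", 12, units="C")))
```

A script like this would typically be scheduled (for example from cron), since gmetric publishes one value per call; the module approach described next lets gmond do the polling itself.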
The following develops a simple metric in Python. Create random_number.py in /usr/lib64/ganglia/python_modules/ and add the following code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import random
import time

descriptors = list()


def random_number_1(name):
    # The lower bound is missing in the source; 0 is an assumed value.
    return int(random.uniform(0, 5)) + 10


def random_number_2(name):
    # The range is missing in the source; 20 to 40 is an assumed value.
    return int(random.randrange(20, 40))


def metric_init(params):
    global descriptors
    random.seed()
    print(params)
    # time_max is missing in the source; 90 is assumed here to match
    # time_threshold in random_number.conf.
    d1 = {'name': 'random_number_1',
          'call_back': random_number_1,
          'time_max': 90,
          'value_type': 'uint',
          'units': 'C',
          'slope': 'both',
          'format': '%u',
          'description': 'Random a number',
          'groups': 'example,random'}
    d2 = {'name': 'random_number_2',
          'call_back': random_number_2,
          'time_max': 90,
          'value_type': 'uint',
          'units': 'C',
          'slope': 'both',
          'format': '%u',
          'description': 'Random a number',
          'groups': 'example,random'}
    descriptors = [d1, d2]
    return descriptors


def metric_cleanup():
    pass


# This code is for debugging and unit testing
if __name__ == '__main__':
    metric_init({})
    while True:
        for d in descriptors:
            v = d['call_back'](d['name'])
            print(('value for %s is ' + d['format']) % (d['name'], v))
        time.sleep(5)
In the code above, Ganglia calls two functions, metric_init and metric_cleanup. As the names suggest, the first does the initialization work and the second releases resources when the module is finished. Both are called only by Ganglia, when it loads the module and runs it. The __main__ entry point is there purely for debugging.
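The load, collect, and clean-up cycle described above can be imitated outside gmond with a small harness, which is handy for testing a module before deploying it. The sketch below is an assumption about how such a harness might look, not Ganglia's own loader: run_once and the demo_metric descriptor are hypothetical names, and the descriptor carries only the keys the harness needs.

```python
import random


def metric_init(params):
    """Return one descriptor in the same shape as the article's d1."""
    random.seed()
    return [{
        'name': 'demo_metric',  # hypothetical metric name
        'call_back': lambda name: random.randint(10, 14),
        'format': '%u',
    }]


def metric_cleanup():
    """Nothing to release in this demo."""
    pass


def run_once(init, cleanup, params=None):
    """Imitate one init / collect / cleanup cycle; return {name: value}."""
    descriptors = init(params or {})
    try:
        # Call every callback once, passing the metric name as gmond does.
        return dict((d['name'], d['call_back'](d['name'])) for d in descriptors)
    finally:
        # Cleanup runs even if a callback raises.
        cleanup()


values = run_once(metric_init, metric_cleanup)
for metric_name, value in values.items():
    print('%s = %u' % (metric_name, value))
```

The same run_once function can be pointed at the metric_init and metric_cleanup of random_number.py to check that every callback returns a value before the module is handed to gmond.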
Create the file random_number.conf in /etc/ganglia/conf.d/ and add the following configuration:
modules {
  module {
    # The name here must match the file name of
    # /usr/lib64/ganglia/python_modules/random_number.py,
    # otherwise the module will not be executed correctly
    name = "random_number"
    language = "python"
  }
}

collection_group {
  collect_every = 2
  time_threshold = 90

  metric {
    # The name here must match the name in d1 in random_number.py
    name = "random_number_1"
    title = "Random number 1"
    value_threshold = 0
  }

  metric {
    # The name here must match the name in d2 in random_number.py
    name = "random_number_2"
    title = "Random number 2"
    value_threshold = 0
  }
}
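Because the metric names in the configuration file must match the names in the Python descriptors, a quick consistency check can catch typos before gmond is restarted. The following is a rough sketch under stated assumptions: the function names are hypothetical, the regular expressions only handle simple files like the one above (no braces inside comments), and it is not a full gmond.conf parser.

```python
import re


def metric_names_in_conf(conf_text):
    """Collect the name = "..." value from every metric { ... } block."""
    names = []
    for body in re.findall(r'\bmetric\s*\{([^}]*)\}', conf_text):
        match = re.search(r'\bname\s*=\s*"([^"]+)"', body)
        if match:
            names.append(match.group(1))
    return names


def missing_metrics(conf_text, descriptor_names):
    """Return conf metric names that no descriptor provides."""
    return sorted(set(metric_names_in_conf(conf_text)) - set(descriptor_names))


SAMPLE_CONF = '''
modules {
  module {
    name = "random_number"
    language = "python"
  }
}
collection_group {
  collect_every = 2
  time_threshold = 90
  metric {
    name = "random_number_1"
    title = "Random number 1"
    value_threshold = 0
  }
  metric {
    name = "random_number_2"
    title = "Random number 2"
    value_threshold = 0
  }
}
'''

print(metric_names_in_conf(SAMPLE_CONF))
print(missing_metrics(SAMPLE_CONF, ['random_number_1']))
```

In practice the descriptor names would come from calling metric_init in random_number.py, and the configuration text from reading /etc/ganglia/conf.d/random_number.conf.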
Restart the services after the configuration file is in place:
service gmond restart
service gmetad restart
service httpd restart
4. Result
Enter 127.0.0.1/ganglia in the browser to see the monitoring result. If nothing went wrong, two new options appear in the Metric drop-down box, as in the following: Selecting one of them shows its graph at the bottom of the current page, for example:
5. Summary
This article covers only the simplest case of Ganglia metric development. For a deeper understanding, consult Ganglia's official documentation or read the source code directly at https://github.com/ganglia, where the gmond_python_modules project provides sample metric modules for a number of common projects.
Ganglia gives high visibility into a cluster. It lets operations staff clearly understand the current state of the cluster, and it also lets developers understand how the system is running so that they can make better targeted optimizations.
Progress a little bit each day: Ganglia's Python extension module development