Ganglia: distributed monitoring system
1. Environment installation and configuration
1.1 Dependency download
Ganglia is cluster monitoring software developed at UC Berkeley. It monitors and displays status information for the nodes in a cluster, such as CPU, memory, and hard disk utilization, I/O load, and network traffic. Historical data can also be displayed as curves on a PHP page.
Ganglia depends on a web server to display the cluster status. rrdtool is used to store data and generate graphs. expat is required for XML parsing, and libconfuse is required for parsing the configuration files. You also need to install Apache httpd with support for PHP 4 or later, as well as some other dependencies.
On Red Hat, run the following command to install the dependencies:
yum -y install apr-devel apr-util check-devel cairo-devel pango-devel libxml2-devel rpmbuild glib2-devel kernel-devel freetype-devel fontconfig-devel gcc-c++ expat-devel python-devel libXrender-devel
libconfuse can be obtained with the following commands:
wget http://download.fedora.redhat.com/pub/epel/5/x86_64/libconfuse-2.5-4.el5.x86_64.rpm
wget http://download.fedora.redhat.com/pub/epel/5/x86_64/libconfuse-devel-2.5-4.el5.x86_64.rpm
1.2 Installation and configuration steps
1.2.1 Installation
Here we compile and install from source. Go to http://ganglia.info, download the latest version of Ganglia, and unpack it.
./configure --with-librrd=/rrd/path --with-gmetad --prefix=/usr/local/ganglia
make
make install
If a dependency problem occurs along the way, install the missing package. After the installation is complete, Ganglia must be configured. The gmetad configuration file is usually stored in the /etc/ganglia directory and named gmetad.conf. There is no strict requirement on the path, because the configuration file to use can be specified when gmetad is started.
After Ganglia is installed, you also need to install the Apache server with PHP module support; otherwise the front-end pages will not display properly. Installing both with yum install httpd php is recommended, because with a manual setup Apache may not be correctly associated with PHP.
After installation, write a PHP test page and open http://localhost/test.php to check whether the installation succeeded.
1.2.2 Configuration
If installed from source, Ganglia is placed in the /usr/local/ganglia directory, as given by the --prefix option above.
First, create a directory to store the Ganglia web pages.
mkdir -p /var/www/html/ganglia/
This directory holds the web pages that are used to display the data.
Because Ganglia was compiled from source, gmetad and gmond are not registered as services. Run the following commands from the source directory:
cp gmetad/gmetad.init /etc/rc.d/init.d/gmetad    # copy the gmetad service startup script
cp gmond/gmond.init /etc/rc.d/init.d/gmond    # copy the gmond service startup script
mkdir /etc/ganglia    # create the main directory for the configuration files
gmond -t | tee /etc/ganglia/gmond.conf    # generate the gmond configuration file
cp gmetad/gmetad.conf /etc/ganglia    # copy the gmetad configuration file
mkdir -p /var/lib/ganglia/rrds    # create the rrd file storage directory
chown nobody:nobody /var/lib/ganglia/rrds    # owner and group are both nobody
chkconfig --add gmetad    # hand the service to chkconfig for management
chkconfig --add gmond    # same as above
In the configuration file /etc/ganglia/gmetad.conf, you only need to modify the following parameter:
data_source "Clustername" host1 host2
Change the cluster name to your own. host1 and host2 are the data sources from which gmetad obtains the XML describing the cluster. If no port is written, port 8649 is used by default. By default, gmetad downloads the XML from the host every 15 seconds over a TCP connection. A source can therefore be port 8649 of a gmond or port 8651 of another gmetad; both serve the cluster information as XML.
host1 and host2 are alternatives (an "or" relationship): if the data cannot be downloaded from host1, gmetad tries host2. They should therefore be nodes in the same cluster that hold the same data. In multicast mode, every gmond node holds the monitoring data of all nodes in the cluster, so you do not need to list every node in data_source. Listing at least two is recommended, so that when host1 goes down gmetad automatically fetches the data from host2.
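To make the default-port and failover rules concrete, here is a minimal Python sketch of how such a line could be interpreted. It is an illustration, not gmetad's own parser: the function names parse_data_source and pick_source are hypothetical, the optional polling interval before the host list is assumed from the stock gmetad.conf format, and the is_up callback stands in for a real TCP connection attempt.

```python
import shlex

DEFAULT_PORT = 8649      # port assumed when a source is written without ":port"
DEFAULT_INTERVAL = 15    # default polling interval in seconds


def parse_data_source(line):
    """Return (cluster_name, interval_seconds, [(host, port), ...])."""
    tokens = shlex.split(line)          # keeps a quoted "my cluster" as one token
    if len(tokens) < 3 or tokens[0] != "data_source":
        raise ValueError("not a data_source line: %r" % line)
    name, rest = tokens[1], tokens[2:]
    interval = DEFAULT_INTERVAL
    if rest[0].isdigit():               # assumed optional polling interval
        interval, rest = int(rest[0]), rest[1:]
    sources = []
    for item in rest:
        host, sep, port = item.partition(":")
        sources.append((host, int(port) if sep else DEFAULT_PORT))
    return name, interval, sources


def pick_source(sources, is_up):
    """Return the first (host, port) for which is_up(host, port) is true, else None."""
    for host, port in sources:
        if is_up(host, port):
            return host, port
    return None
```

For example, parse_data_source('data_source "Clustername" host1 host2:8651') yields the sources ("host1", 8649) and ("host2", 8651), and pick_source falls through to host2 only when the check for host1 fails.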
In addition, gmetad has the following settings:
RRD database storage definition
RRAs "RRA:AVERAGE:0.5:1:244" "RRA:AVERAGE:0.5:24:244" "RRA:AVERAGE:0.5:168:244" "RRA:AVERAGE:0.5:672:244" "RRA:AVERAGE:0.5:5760:374"
Access control
trusted_hosts address1 address2 ... DN1 DN2 ...
all_trusted off/on
RRD directory for storing data
rrd_rootdir "/var/lib/ganglia/rrds"
Network
xml_port 8651    # telnet to this port to obtain the XML from gmetad
interactive_port 8652    # port used by the PHP pages for data interaction
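The RRAs setting is easier to read once each archive is converted into a time span. The following Python sketch does that arithmetic; it assumes a base step of 15 seconds (matching the default polling interval), and the helper name rra_retention is hypothetical.

```python
STEP_SECONDS = 15  # assumed base step of the RRD files


def rra_retention(rra, step=STEP_SECONDS):
    """Return (seconds_per_row, total_seconds_kept) for 'RRA:CF:xff:steps:rows'."""
    kind, cf, xff, steps, rows = rra.split(":")
    if kind != "RRA":
        raise ValueError("not an RRA definition: %r" % rra)
    per_row = int(steps) * step          # time covered by one consolidated row
    return per_row, per_row * int(rows)  # rows kept before the archive wraps


if __name__ == "__main__":
    for rra in ("RRA:AVERAGE:0.5:1:244", "RRA:AVERAGE:0.5:24:244",
                "RRA:AVERAGE:0.5:168:244", "RRA:AVERAGE:0.5:672:244",
                "RRA:AVERAGE:0.5:5760:374"):
        per_row, total = rra_retention(rra)
        print("%-26s one row per %6d s, keeps %.1f days" % (rra, per_row, total / 86400.0))
```

Under that assumption the five archives cover roughly one hour, one day, one week, four weeks, and 374 days, each at a coarser resolution than the one before.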
1.2.3 PHP page configuration
In the /var/www/html/ganglia/ directory, find conf.php and check the following settings:
$gmetad_root = "/var/lib/ganglia";    # path of the rrd database written by gmetad
$rrds = "$gmetad_root/rrds";
$ganglia_ip = "localhost";    # address of the gmetad server
$ganglia_port = 8652;    # interactive monitoring data port provided by the gmetad server
By default, the web front end refreshes every 300 seconds (5 minutes). You can change the refresh interval in the conf.php file, which contains all the Ganglia web parameters.
1.2.4 Ganglia client configuration
vi /etc/ganglia/gmond.conf
The cluster name, udp_send_channel, and udp_recv_channel must be modified. Note the difference between unicast and multicast modes. In multicast mode, every node that has joined the multicast group receives the data of all other nodes in the group, so each node is in effect a backup. In unicast mode, data is sent point to point, to one specific host; in this mode a central collection node is usually used.
cluster {
  name = "Cluster1"    # cluster the current node belongs to
  owner = "chifeng"    # owner of the node
  latlong = "unspecified"    # geographic coordinates (latitude and longitude)
  url = "unspecified"
}
udp_send_channel {    # channel on which UDP packets are sent
  mcast_join = 239.2.11.71    # multicast group 239.2.11.71. For unicast, write host = host1 (the target host that receives the data) instead; in unicast mode you can configure several udp_send_channel sections.
  port = 8649    # destination port
  ttl = 1
}
udp_recv_channel {    # channel on which UDP packets are received
  mcast_join = 239.2.11.71    # also the 239.2.11.71 group. For unicast, omit mcast_join; any address the channel is bound to must be an IP address of the local machine.
  port = 8649    # listening port
  bind = 239.2.11.71    # address to bind to
}
tcp_accept_channel {
  port = 8649    # TCP listening port. A remote client obtains the monitoring data by connecting to port 8649; gmetad fetches its XML through this port.
}
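As a sketch of what a remote client does with the tcp_accept_channel port, the Python snippet below connects, reads the XML until the daemon closes the connection, and lists the hosts per cluster. The function names are hypothetical, and the element and attribute names (CLUSTER, HOST, NAME) are assumptions about the XML layout that should be checked against the actual output of your gmond.

```python
import socket
import xml.etree.ElementTree as ET


def fetch_xml(host="localhost", port=8649, timeout=5.0):
    """Connect and read everything sent until the remote side closes the socket."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:        # empty read: the daemon has closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def hosts_by_cluster(xml_bytes):
    """Map each CLUSTER name to the sorted list of HOST names found inside it."""
    root = ET.fromstring(xml_bytes)
    return {
        cluster.get("NAME"): sorted(h.get("NAME") for h in cluster.iter("HOST"))
        for cluster in root.iter("CLUSTER")
    }


if __name__ == "__main__":
    print(hosts_by_cluster(fetch_xml("localhost", 8649)))
```

Pointing fetch_xml at port 8651 of a gmetad host should work the same way, since both ports serve the cluster state as XML.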
There are other configuration items that normally do not need to be modified. Their meanings are as follows:
collection_group section:
collect_once - marks the group as a group of static metrics, collected only once
collect_every - collection interval in seconds (only valid for non-static groups)
time_threshold - maximum interval between two sends of the data
metric section:
name - metric name (see "gmond -m")
value_threshold - metric variance threshold (the value is sent if the change exceeds it)
Example:
collection_group {
  collect_every = 80
  time_threshold = 950
  metric {
    name = "proc_run"
    value_threshold = "1.0"
  }
  metric {
    name = "proc_total"
    value_threshold = "1.0"
  }
}
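The interplay of value_threshold and time_threshold can be sketched as a small decision function. This is a simplified model written for illustration, not gmond's actual code (which, among other things, may vary the timing); the name should_send and its parameters are hypothetical.

```python
def should_send(last_value, new_value, seconds_since_send,
                value_threshold, time_threshold):
    """Decide whether a freshly collected metric value should be sent now."""
    if last_value is None:                      # nothing has been sent yet
        return True
    if seconds_since_send >= time_threshold:    # maximum silence reached
        return True
    # otherwise send only if the value moved by more than the threshold
    return abs(new_value - last_value) > value_threshold
```

With the example group above, proc_run is collected every 80 seconds, sent at once when it changes by more than 1.0, and sent in any case once 950 seconds have passed since the last send.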
1.3 Command set
Note: the command set is the list of command-line commands I used during installation and configuration. These commands can serve as a basis for automated deployment; the automated deployment steps can be written later.
Server:
1) Install expat-2.0.1.tar.gz
tar xvzf expat-2.0.1.tar.gz
cd expat*; ./configure --prefix=/usr/local/expat; make install
2) Install confuse-2.6
./configure --prefix=/usr/local/confuse-2.6 CFLAGS=-fPIC --disable-nls; make install
3) Install apr
tar xvjf apr-1.3.2.tar.bz2
cd apr-1.3.2; ./configure --prefix=/usr/local/apr; make install
Install apr-util-1.3.2.tar.bz2
tar xvjf apr-util-1.3.2.tar.bz2
cd apr-util-1.3.2; ./configure --with-apr=/usr/local/apr --with-expat=/usr/local/expat
make; make install
cp /usr/local/apr-1.3.2/include/apr-1/* /usr/local/apr-1.3.2/include/    # during installation Ganglia looks for the apr header files under /usr/local/apr/include by default
4) Install rrdtool-1.2.27.tar.gz
tar xvzf rrdtool-1.2.27.tar.gz
cd rrdtool-1.2.27; ./configure --prefix=/usr/local/rrdtool
make; make install
5) cp /usr/local/apr/bin/apr-1* /usr/local/bin/
This copy is needed; otherwise compilation fails with the following error:
checking for apr
checking for apr-1-config... no
configure: error: apr-1-config binary not found in path
6) Install ganglia
./configure --with-librrd=/opt/rrdtool-1.4.4 --with-gmetad --prefix=/usr/local/ganglia --with-libconfuse=/usr/local/confuse-2.6
7) make; make install
8) Install the Apache server and PHP support
yum -y install httpd mysqld php-mysql php
Client:
wget http://download.fedora.redhat.com/pub/epel/5/x86_64/libconfuse-2.5-4.el5.x86_64.rpm
wget http://download.fedora.redhat.com/pub/epel/5/x86_64/libconfuse-devel-2.5-4.el5.x86_64.rpm
scp apr-*.* 10.250.13.45:~/
scp libconfuse-*.* 10.250.13.45:~/
scp ganglia-*.gz 10.250.13.45:~/
scp ganglia-devel-*.rpm 10.250.13.45:~/
scp *.conf 10.250.13.45:~/
ssh 10.250.13.45
sudo su -
yum install expat
cd /home/admin
tar -xvf apr-1.4.*.gz
cd apr*
./configure --prefix=/usr/local/apr
make
make install
cd ..
tar -xvf apr-util-1.3.9.*
cd apr-util*
./configure --with-apr=/usr/local/apr
make
make install
cd ..
rpm -ivh libconfuse-2.5-4.el5.x86_64.rpm
rpm -ivh libconfuse-devel-2.5-4.el5.x86_64.rpm
tar -xvf ganglia-3.1.*.gz
cd ganglia*
cp /usr/local/apr/bin/apr-1* /usr/local/bin/
./configure --with-apr=/usr/local/apr
find / -name "libpython2.5*"
cp /usr/local/lib/libpython2.5.so /usr/lib/libpython2.5.so
make
make install
cd ..
rpm -ivh ganglia-devel-3.1.1-1.x86_64.rpm --nodeps
cd /etc
mkdir ganglia
cp /home/admin/*.conf /etc/ganglia/
cd /etc/ganglia
vi gmond.conf    # edit the udp send and recv hosts
vi /usr/local/etc/gmond.conf
gmond --debug=10
ps -e | grep gmond
kill -9 PID    # PID is the process id shown by ps
gmond
Modify gmond.conf again if necessary.
scp test 10.250.13.42:~/
scp test 10.250.13.43:~/
scp test 10.250.13.44:~/
scp test 10.250.13.45:~/
vi /etc/profile
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"/usr/local/lib64/"
source /etc/profile
1.4 Problems and Solutions
1.4.1 Installation Problems
◎ A library file is missing. This error usually occurs during make, when ld cannot find a library it needs, such as libpython2.5.so.
Solution: use the find command to locate the files, then create symbolic links to them with ln -s.
find / -name "libpython*"
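The find-then-link fix can also be scripted. The Python sketch below mirrors find -name and ln -s; the helper names find_files and link_into are hypothetical, and which directory the linker actually searches (for example /usr/lib) depends on your system.

```python
import fnmatch
import os


def find_files(root, pattern):
    """Yield paths under root whose file name matches pattern, like find -name."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)


def link_into(target, link_dir):
    """Create link_dir/<file name of target> as a symlink to target, like ln -s."""
    link = os.path.join(link_dir, os.path.basename(target))
    if not os.path.lexists(link):     # leave an existing file or link untouched
        os.symlink(target, link)
    return link
```

A typical use would be to call find_files("/usr/local", "libpython2.5*") and pass each result to link_into(path, "/usr/lib"), run as root.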
◎ A dependency error occurs during installation. This usually happens during configure.
Solution: first use find to locate the path of the missing file. If you find it, read the README and check whether configure has a parameter for specifying the path; if it does not, copy the file to the default directory. When installing an rpm, you can add the --nodeps parameter if you do not want to resolve the dependency. If the library is not on the system at all, download it; it is usually included in the corresponding devel package. You may need to search online for the package that contains the library and then install it.
1.4.2 Configuration and operation problems
◎ Test whether gmond and gmetad are running
telnet localhost 8649
telnet localhost 8651
If there is no response:
Solution: the service is probably not started, or it is not using the default port. Run ps -e | grep gmond to check whether the service is started, and check gmond.conf for the TCP port it listens on.
If you still cannot find the cause, start gmond in debug mode to see what is wrong.
gmond --debug=10
If a port binding error occurs (for example, a UDP port is reported as already bound), check whether the port is already in use with lsof -i :port.
The configuration file may also be wrong. For example, when I set the address of udp_recv_channel to the same value as in udp_send_channel, a port error occurred: the address of udp_recv_channel must be an IP address of the local machine (one machine may have several IP addresses). If permission is denied, check the current user identity or switch to root.
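A bind failure can be reproduced outside gmond with a few lines of Python. The sketch below tries to bind a UDP socket and reports whether the port is taken; the function name udp_port_in_use is hypothetical, and the check only approximates what gmond does, since gmond may set socket options that this sketch does not.

```python
import errno
import socket


def udp_port_in_use(port, address="0.0.0.0"):
    """Try to bind a UDP socket; return True if the address/port is already taken."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((address, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        # other errors are different problems, e.g. the address is not local
        # to this machine, or the user lacks permission for the port
        raise
    finally:
        sock.close()
    return False


if __name__ == "__main__":
    print("UDP 8649 in use:", udp_port_in_use(8649))
```

Passing an address that does not belong to the local machine makes bind fail with a different error, which corresponds to the misconfiguration described above.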
◎ Test whether the PHP front end works
http://localhost/ganglia
◎ The PHP page is shown as a plain file, or the browser prompts you to download it
Solution: the PHP module of Apache is not properly installed or configured. Install the PHP module again (with yum install, or by downloading it) and configure it in the Apache conf file.
◎ The page is displayed without images
First, check whether SELinux is disabled.
Check whether the rrdtool path in conf.php is correct and whether the file exists. Note that this is the path of the rrdtool executable, not the installation directory.
Check whether /var/lib/ganglia/rrds exists and is writable.
chown nobody:nobody /var/lib/ganglia/rrds    # make sure RRDTool can write here
Check whether the gmetad path and port in conf.php are correct.
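Two of the checks above (the rrds directory and the rrdtool executable) lend themselves to a quick script. This is a minimal Python sketch with hypothetical function names; it tests permissions for the user running the script, so for a meaningful result it should be run as the user that gmetad and the web server run as.

```python
import os


def check_rrd_dir(path="/var/lib/ganglia/rrds"):
    """Return a list of problems found with the RRD directory (empty if none)."""
    problems = []
    if not os.path.isdir(path):
        problems.append("%s does not exist or is not a directory" % path)
    elif not os.access(path, os.W_OK | os.X_OK):
        problems.append("%s is not writable by the current user" % path)
    return problems


def is_executable_file(path):
    """True if path is a regular file the current user may execute."""
    return os.path.isfile(path) and os.access(path, os.X_OK)


if __name__ == "__main__":
    print(check_rrd_dir() or "rrds directory looks fine")
```

is_executable_file returns False for a directory, which catches the common mistake of putting the rrdtool installation directory rather than the executable into the configuration.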