Ganglia: distributed monitoring system

Source: Internet
Author: User
Tags: rrd, rrdtool

1. Environment installation and configuration
1.1 Downloading the dependent software

Ganglia is cluster monitoring software developed at UC Berkeley. It monitors and displays status information for the nodes in a cluster, such as CPU, memory, disk utilization, I/O load, and network traffic. Historical data can also be displayed as graphs on PHP pages.

Ganglia depends on a web server to display the cluster status. rrdtool is used to store data and generate graphs. XML parsing requires expat, and libconfuse is required for parsing the configuration files. You also need Apache httpd with support for PHP 4 or later, plus some other dependent packages.

On Red Hat, run the following command to install the dependent software:

yum -y install apr-devel apr-util check-devel cairo-devel pango-devel libxml2-devel rpmbuild glib2-devel kernel-devel freetype-devel fontconfig-devel gcc-c++ expat-devel python-devel libXrender-devel

libconfuse can be downloaded with the following commands:

wget http://download.fedora.redhat.com/pub/epel/5/x86_64/libconfuse-2.5-4.el5.x86_64.rpm
wget http://download.fedora.redhat.com/pub/epel/5/x86_64/libconfuse-devel-2.5-4.el5.x86_64.rpm

1.2 Installation and configuration steps
1.2.1 Installation

Here Ganglia is compiled and installed from source. Download the latest version from http://ganglia.info and unpack it, then run:

./configure --with-librrd=/rrd/path --with-gmetad --prefix=/usr/local/ganglia

make

make install

If a dependency problem occurs along the way, install the missing package. After the installation is complete, Ganglia must be configured. The configuration file is usually gmetad.conf in the /etc/ganglia directory. The path is not strictly required, because gmetad can be told which configuration file to use at startup.

After Ganglia is installed, you also need to install the Apache server with PHP module support; otherwise the final pages cannot be displayed properly. Installing with yum install httpd php is recommended, because a manual installation that is not configured correctly may leave Apache unable to hand pages to PHP.

After installation, write a PHP test page and open http://localhost/test.php to check whether the installation succeeded.
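
One way to create such a test page is sketched below. The helper name write_php_test_page is hypothetical, and the default directory /var/www/html is an assumption (the usual Apache document root on Red Hat); phpinfo() is a standard PHP function that prints the PHP configuration.

```shell
#!/bin/sh
# Sketch: write a minimal PHP test page into a web root.
# write_php_test_page is a hypothetical helper; /var/www/html is an assumed
# default document root and may differ on your system.
write_php_test_page() (
    docroot="${1:-/var/www/html}"
    # Quoted heredoc delimiter: nothing inside is expanded by the shell.
    cat > "$docroot/test.php" <<'EOF'
<?php phpinfo(); ?>
EOF
)
```

After calling write_php_test_page, opening http://localhost/test.php should show the PHP information page if Apache passes .php files to the PHP module. If the raw text of the file appears instead, PHP is not wired into Apache yet.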

1.2.2 Configuration

If Ganglia was installed from source, it is located under /usr/local/ganglia, according to the --prefix given earlier.

First, create a directory to store the Ganglia web pages.

mkdir -p /var/www/html/ganglia/

This directory holds the web pages used to display the data.

Because Ganglia was compiled from source, gmetad and gmond are not registered as services. Run the following commands:

cp gmetad/gmetad.init /etc/rc.d/init.d/gmetad    # copy the gmetad service startup script
cp gmond/gmond.init /etc/rc.d/init.d/gmond       # copy the gmond service startup script
mkdir /etc/ganglia                                # create the configuration directory
gmond -t | tee /etc/ganglia/gmond.conf            # generate the gmond configuration file
cp gmetad/gmetad.conf /etc/ganglia                # copy the gmetad configuration file
mkdir -p /var/lib/ganglia/rrds                    # create the rrd file storage directory
chown nobody:nobody /var/lib/ganglia/rrds         # owner and group are both nobody
chkconfig --add gmetad                            # register the service with chkconfig
chkconfig --add gmond                             # same as above

In the gmetad configuration file /etc/ganglia/gmetad.conf, you normally only need to change the following parameter:

data_source "Clustername" host1 host2

Change the cluster name to your own. host1 and host2 are the data sources from which gmetad fetches the XML that describes the cluster. If no port is written, port 8649 is used. By default, gmetad downloads the XML from a host every 15 seconds over a TCP connection. A source can therefore be port 8649 of a gmond or port 8651 of another gmetad; both serve the cluster information as XML.

host1 and host2 are in an "or" relationship: if the data cannot be downloaded from host1, gmetad tries host2. They should therefore be nodes in the same cluster that hold the same data. In multicast mode, every gmond node holds the monitoring data of all nodes in the cluster, so you do not need to list every node in data_source. Listing at least two is recommended, so that when host1 goes down gmetad automatically falls back to host2.
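
As an illustration of the format described above, the sketch below builds a data_source line with an explicit polling interval and explicit ports. make_data_source is a hypothetical helper and the host names are placeholders. The optional interval field after the cluster name and the host:port form follow the comments shipped in gmetad.conf, so check them against your own copy.

```shell
#!/bin/sh
# Sketch: print a gmetad data_source line.
# usage: make_data_source NAME INTERVAL_SECONDS HOST...
# Hosts given without a port get :8649 appended (gmond's default port).
make_data_source() (
    name=$1
    interval=$2
    shift 2
    line="data_source \"$name\" $interval"
    for host in "$@"; do
        case $host in
            *:*) ;;                    # port already given, keep it
            *)   host="$host:8649" ;;  # add the default port explicitly
        esac
        line="$line $host"
    done
    printf '%s\n' "$line"
)

make_data_source "Cluster1" 15 host1 host2:8649
```

Writing the port out changes nothing functionally when it is 8649, but it makes it obvious which sources are gmond nodes and which are other gmetad instances on 8651.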

In addition, gmetad has the following settings:

RRD database storage definition

RRAs "RRA:AVERAGE:0.5:1:244" "RRA:AVERAGE:0.5:24:244" "RRA:AVERAGE:0.5:168:244" "RRA:AVERAGE:0.5:672:244" "RRA:AVERAGE:0.5:5760:374"
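
Each entry has the form RRA:AVERAGE:xff:steps:rows: every stored row is the average of "steps" samples, and "rows" rows are kept. The sketch below works out the time span each archive covers. It assumes a base step of 15 seconds (an assumption, tied to the default polling interval mentioned earlier), and rra_span_seconds is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: how much history does one RRA hold?
# usage: rra_span_seconds BASE_STEP_SECONDS RRA_DEFINITION
# An RRA definition is RRA:CF:xff:steps:rows; span = base step * steps * rows.
rra_span_seconds() (
    base=$1
    steps=$(printf '%s' "$2" | cut -d: -f4)
    rows=$(printf '%s' "$2" | cut -d: -f5)
    echo $(( base * steps * rows ))
)

# Assumed base step of 15 seconds.
for rra in RRA:AVERAGE:0.5:1:244 RRA:AVERAGE:0.5:24:244 \
           RRA:AVERAGE:0.5:168:244 RRA:AVERAGE:0.5:672:244 \
           RRA:AVERAGE:0.5:5760:374; do
    echo "$rra covers $(rra_span_seconds 15 "$rra") seconds"
done
```

Under that assumption the five archives cover roughly one hour, one day, one week, four weeks, and 374 days, with coarser averaging the further back they reach.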

Access control

trusted_hosts address1 address2 ... DN1 DN2 ...

all_trusted off|on

RRD files location (directory for storing the rrd data)

rrd_rootdir "/var/lib/ganglia/rrds"

Network

xml_port 8651 # telnet to this port to obtain gmetad's XML

interactive_port 8652 # port used by the PHP pages to query data

1.2.3 PHP page configuration

In the /var/www/html/ganglia/ directory, find the file conf.php and check the following settings:

$gmetad_root = "/var/lib/ganglia"; # path of the rrd databases written by gmetad

$rrds = "$gmetad_root/rrds";

$ganglia_ip = "localhost"; # address of the gmetad server

$ganglia_port = 8652; # interactive data port provided by the gmetad server

By default, the web front end refreshes every 300 seconds (5 minutes). You can change the refresh interval in the config.php file, which contains all the Ganglia web parameters.

1.2.4 Ganglia client configuration

vi /etc/ganglia/gmond.conf

The cluster name, udp_send_channel, and udp_recv_channel must be modified. Note the difference between unicast and multicast mode. In multicast mode, every node that joins the multicast group receives the data of all other nodes in the group, so each node effectively acts as a backup. In unicast mode, data is sent point to point to one specific host; a central collection node is usually used in this mode.

cluster {
  name = "Cluster1"         # cluster the current node belongs to
  owner = "chifeng"         # owner of the node
  latlong = "unspecified"   # location on Earth (latitude and longitude)
  url = "unspecified"
}

udp_send_channel {          # channel for sending UDP packets
  # Multicast: send to group 239.2.11.71. In unicast mode, write
  # host = host1 instead (the target host that receives the data).
  # In unicast mode you can configure several udp_send_channel sections.
  mcast_join = 239.2.11.71
  port = 8649               # destination port
  ttl = 1
}

udp_recv_channel {          # channel for receiving UDP packets
  # Also joins group 239.2.11.71. In unicast mode, write host = localip,
  # which must be an IP address of the local machine.
  mcast_join = 239.2.11.71
  port = 8649               # listening port
  bind = 239.2.11.71        # address to bind to
}

tcp_accept_channel {
  # TCP listening port. A remote end can fetch the monitoring data by
  # connecting to port 8649; gmetad obtains its XML data through this port.
  port = 8649
}
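
For unicast mode, the notes above say to replace mcast_join with host and that several udp_send_channel sections may be configured. The sketch below prints one such section per collector host, ready to paste into gmond.conf. unicast_send_channels is a hypothetical helper, and collector1 and collector2 are placeholder host names.

```shell
#!/bin/sh
# Sketch: print one unicast udp_send_channel section per collector host.
# usage: unicast_send_channels HOST...
unicast_send_channels() (
    for host in "$@"; do
        printf 'udp_send_channel {\n'
        printf '  host = %s\n' "$host"   # unicast target instead of mcast_join
        printf '  port = 8649\n'
        printf '}\n'
    done
)

unicast_send_channels collector1 collector2
```

Sending to two collectors gives unicast mode some of the redundancy that multicast provides for free, at the cost of duplicate traffic.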

There are other configuration items that normally do not need to be changed. Their meanings are as follows:

collection_group section:

collect_once: marks the group as static metrics, which are collected only once
collect_every: collection interval in seconds (only valid for non-static groups)
time_threshold: maximum interval between sends, in seconds

metric section:

name: metric name (see "gmond -m")
value_threshold: metric change threshold (the value is sent if the change exceeds it)

Example:

collection_group {
  collect_every = 80
  time_threshold = 950
  metric {
    name = "proc_run"
    value_threshold = "1.0"
  }
  metric {
    name = "proc_total"
    value_threshold = "1.0"
  }
}
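
The interplay of the two thresholds can be pictured with the simplified model below: a value is sent when it has changed by more than value_threshold since the last send, or when time_threshold seconds have passed without a send. This is a sketch of the idea only, not gmond's actual implementation. should_send is a hypothetical helper, and it uses integer values to stay within plain shell arithmetic.

```shell
#!/bin/sh
# Simplified model of the send decision, NOT gmond's real code.
# usage: should_send LAST_SENT_VALUE NEW_VALUE SECONDS_SINCE_SEND \
#                    VALUE_THRESHOLD TIME_THRESHOLD
# Exit status 0 means "send", non-zero means "keep quiet". Integers only.
should_send() (
    last=$1 new=$2 elapsed=$3 value_threshold=$4 time_threshold=$5
    diff=$(( new - last ))
    [ "$diff" -lt 0 ] && diff=$(( -diff ))
    # Send if the value moved enough, or if we have been silent too long.
    [ "$diff" -gt "$value_threshold" ] || [ "$elapsed" -ge "$time_threshold" ]
)

should_send 10 11 80 1 950 && echo send || echo quiet    # small change, recent send
should_send 10 14 80 1 950 && echo send || echo quiet    # large change
should_send 10 10 950 1 950 && echo send || echo quiet   # time_threshold reached
```

Seen this way, value_threshold keeps quiet metrics from generating traffic, while time_threshold guarantees that even an unchanged metric is reported often enough for the receivers to know the node is alive.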

1.3 Command set

Note: the command set is the list of command-line commands I used during installation and configuration. They can serve as a basis for automated deployment, which could be scripted later.

Server:

1) Install expat-2.0.1.tar.gz

tar xvzf expat-2.0.1.tar.gz
cd expat*; ./configure --prefix=/usr/local/apr; make install

2) Install confuse-2.6

./configure --prefix=/usr/local/confuse-2.6 CFLAGS=-fPIC --disable-nls; make install

3) Install apr

tar xvjf apr-1.3.2.tar.bz2
cd apr-1.3.2; ./configure --prefix=/usr/local/apr; make install

Install apr-util-1.3.2.tar.bz2

tar xvjf apr-util-1.3.2.tar.bz2
cd apr-util-1.3.2; ./configure --with-apr=/usr/local/apr --with-expat=/usr/local/expat
make; make install

cp /usr/local/apr-1.3.2/include/apr-1/* /usr/local/apr-1.3.2/include/

This copies the apr files into the include directory. During installation, Ganglia looks for the apr files under /usr/local/apr/include by default.

4) Install rrdtool-1.2.27.tar.gz

tar xvzf rrdtool-1.2.27.tar.gz
cd rrdtool-1.2.27; ./configure --prefix=/usr/local/rrdtool
make; make install

5) Copy the apr-1 binaries

cp /usr/local/apr/bin/apr-1* /usr/local/bin/

Without this copy, compilation fails with the following error:

checking for apr
checking for apr-1-config... no
configure: error: apr-1-config binary not found in path

6) Install Ganglia

./configure --with-librrd=/opt/rrdtool-1.4.4 --with-gmetad --prefix=/usr/local/ganglia --with-libconfuse=/usr/local/confuse-2.6

7) make; make install

8) Install the Apache server and PHP support

yum -y install httpd mysqld php-mysql php

Client:

wget http://download.fedora.redhat.com/pub/epel/5/x86_64/libconfuse-2.5-4.el5.x86_64.rpm
wget http://download.fedora.redhat.com/pub/epel/5/x86_64/libconfuse-devel-2.5-4.el5.x86_64.rpm

scp apr-*.* 10.250.13.45:~/
scp libconfuse-*.* 10.250.13.45:~/
scp ganglia-*.gz 10.250.13.45:~/
scp ganglia-devel-*.rpm 10.250.13.45:~/
scp *.conf 10.250.13.45:~/

ssh 10.250.13.45
sudo su -
yum install expat
cd /home/admin

tar -xvf apr-1.4.*.gz
cd apr*
./configure --prefix=/usr/local/apr
make
make install
cd ..

tar -xvf apr-util-1.3.9.*
cd apr-util*
./configure --with-apr=/usr/local/apr
make
make install
cd ..

rpm -ivh libconfuse-2.5-4.el5.x86_64.rpm
rpm -ivh libconfuse-devel-2.5-4.el5.x86_64.rpm

tar -xvf ganglia-3.1.*.gz
cd ganglia*
cp /usr/local/apr/bin/apr-1* /usr/local/bin/
./configure --with-apr=/usr/local/apr
find / -name "libpython2.5*"
cp /usr/local/lib/libpython2.5.so /usr/lib/libpython2.5.so
make
make install
cd ..

rpm -ivh ganglia-devel-3.1.1-1.x86_64.rpm --nodeps

cd /etc
mkdir ganglia
cp /home/admin/*.conf /etc/ganglia/
cd /etc/ganglia
vi gmond.conf                   # edit the udp send and recv hosts
vi /usr/local/etc/gmond.conf

gmond --debug=10
ps -e | grep gmond
kill -9 id                      # id is the gmond process ID shown by ps
gmond

Modify gmond.conf again if necessary.

scp test 10.250.13.42:~/
scp test 10.250.13.43:~/
scp test 10.250.13.44:~/
scp test 10.250.13.45:~/

vi /etc/profile
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"/usr/local/lib64/"
source /etc/profile

1.4 Problems and solutions
1.4.1 Installation problems

◎ A library file is missing. This error usually occurs during make, when ld cannot find a library such as libpython2.5.so.

Solution: use find to locate the missing files, then create symbolic links to them with ln -s. For example: find / -name "libpython*"
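
The find and ln -s steps can be combined as in the sketch below. link_library is a hypothetical helper; the search root and target directory are arguments, and the values you pass should match where your linker actually looks.

```shell
#!/bin/sh
# Sketch: find a library by name and symlink it into a directory ld searches.
# usage: link_library LIBNAME SEARCH_ROOT TARGET_DIR
link_library() (
    lib=$1 root=$2 target=$3
    # Take the first regular file (not a symlink) with that name.
    found=$(find "$root" -name "$lib" -type f 2>/dev/null | head -n 1)
    if [ -z "$found" ]; then
        echo "$lib not found under $root" >&2
        exit 1
    fi
    ln -s "$found" "$target/$lib"
)
```

For the example in this article, a call such as link_library libpython2.5.so /usr/local /usr/lib would link a copy found under /usr/local into /usr/lib. Searching a subtree rather than / keeps the find fast and avoids picking up unrelated copies.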

◎ A dependency error occurs during installation, usually during configure.

Solution: first use find to look for the file. If it is found, read the README and check whether there is a parameter for specifying its path; if there is none, copy the file to the default directory. When installing an rpm, you can also add the --nodeps parameter. If the file is not found, download the library, which is usually included in the corresponding devel package: search online for the package that contains the library, then install it.

1.4.2 Configuration and operation problems
◎ Test whether gmond and gmetad are running

telnet localhost 8649

telnet localhost 8651

If there is no response:

Solution: the service is probably not started, or it is not using the default port. Run ps -e | grep gmond to check whether the service is started, and check gmond.conf for the port configured in tcp_accept_channel.

If you cannot find the cause, start the daemon in debug mode to see what goes wrong.

gmond --debug=10

If a port binding error occurs (for example on the UDP port, reporting that it is already bound), check whether the port is in use with lsof -i :port.

The configuration file may also be wrong. For example, when I set the host of udp_recv_channel to the same value as in udp_send_channel, a port error occurred; the host of udp_recv_channel must be an IP address of the local machine (one machine may have several IP addresses). If permission is denied, check the current user identity or switch to root.
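
The telnet test above can also be scripted. The sketch below pulls host names out of the XML that gmond or gmetad serves. It assumes each node appears as a <HOST NAME="..."> element in the dump, and list_hosts is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: list host names found in a Ganglia XML dump read from stdin.
# Assumes hosts appear as <HOST NAME="..."> elements.
list_hosts() {
    grep -o '<HOST NAME="[^"]*"' | sed 's/^<HOST NAME="//; s/"$//'
}
```

With netcat installed (an assumption), nc localhost 8649 | list_hosts would print the nodes that gmond knows about, and the same pipe against port 8651 would do so for gmetad. An empty result or a refused connection points back to the causes listed above.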

◎ Test whether the PHP front end works

http://localhost/ganglia

◎ The PHP page is displayed as plain text, or the browser offers to download it

Solution: Apache's PHP module is not installed or configured correctly. Install the PHP module again with yum install (or download and install it), and configure it in the Apache conf file.

◎ The page is displayed without images

First, check whether SELinux is disabled.

Check whether the rrdtool path in the conf.php file is correct and whether the file exists. Note that this is the path of the rrdtool executable, not the installation directory.

Check whether /var/lib/ganglia/rrds exists and whether it is writable. chown nobody:nobody /var/lib/ganglia/rrds # make sure RRDtool can write here.

Check whether the gmetad path and port in conf.php are correct.
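
The directory check in this list can be scripted as below. check_rrd_dir is a hypothetical helper. Note that the -w test reflects the user running the script, so the answer is only meaningful when it is run as the user gmetad runs as (nobody in this setup).

```shell
#!/bin/sh
# Sketch: verify that the rrd directory exists and is writable by the
# current user. Run it as the user gmetad runs as for a meaningful answer.
check_rrd_dir() (
    dir="${1:-/var/lib/ganglia/rrds}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        exit 1
    fi
    if [ ! -w "$dir" ]; then
        echo "not writable: $dir"
        exit 1
    fi
    echo "ok: $dir"
)
```

Called without arguments it checks /var/lib/ganglia/rrds, the rrd_rootdir used in this article; pass another path if your gmetad.conf points elsewhere.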
