Deploying Ganglia to monitor Hadoop and HBase

Performance problems come up often during Hadoop operations and maintenance (O&M), and they usually cannot be diagnosed from the web pages and logs alone; many metrics are needed. Ganglia is one of the more practical monitoring tools for collecting them.

Many people have already shared their experience of deploying Ganglia (it is easy to find via Baidu). This article combines that experience with the problems encountered during installation.

1. Prepare two machines

Server
192.168.0.11 (gmetad, web, gmond-master)
Client
192.168.0.12 (gmond)

2. Software packages to install on the server

  • Install the EPEL repository: yum install -y epel-release (without it, yum cannot find some of the packages below)
  • Install gmetad: yum install -y ganglia-gmetad ganglia-devel
  • Install gmond: yum install -y ganglia-gmond-python
  • Install rrdtool: yum install -y rrdtool-devel
  • Install the httpd server: yum install -y httpd
  • Install ganglia-web and PHP: yum install -y ganglia-web php
  • Install other dependencies: yum install -y apr-devel zlib-devel libconfuse-devel expat-devel pcre-devel

3. Software packages to install on the monitored nodes

  • Install the EPEL repository: yum install -y epel-release (without it, yum cannot find some of the packages below)
  • Install gmond: yum install -y ganglia-gmond-python

4. Installation directories

  • Ganglia configuration directory: /etc/ganglia
  • RRD database directory: /var/lib/ganglia/rrds
  • httpd main site directory: /var/www/html
  • ganglia-web installation directory: /usr/share/ganglia
  • ganglia-web httpd configuration file: /etc/httpd/conf.d/ganglia.conf

5. Disable SELinux

# vi /etc/selinux/config
Change SELINUX=enforcing to SELINUX=disabled.
Restart the machine.
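
The edit in step 5 can also be scripted. The following is a minimal sketch, assuming GNU sed (for the -i option); the function name set_selinux_disabled is hypothetical, and it takes the file path as an argument so that it can be tried on a copy of the file first.

```shell
# Hypothetical helper: rewrite the SELINUX= line of an SELinux config file.
# $1 is the path to the file (normally /etc/selinux/config).
set_selinux_disabled() {
    # Replace whatever mode is currently set with "disabled".
    # Only the line starting with "SELINUX=" is touched, so a
    # "SELINUXTYPE=" line is left alone.
    sed -i 's/^SELINUX=.*/SELINUX=disabled/' "$1"
}

# Example (run as root, then restart the machine):
#   set_selinux_disabled /etc/selinux/config
```

As in the manual procedure, the change only takes effect after a restart.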

6. Disable the firewall

# chkconfig iptables off
# chkconfig iptables --list
iptables 0:off 1:off 2:off 3:off 4:off 5:off 6:off

7. Configure /etc/ganglia/gmetad.conf

Modify data_source so that it points at the gmond address and port (the tcp_accept_channel) from which gmetad collects data:

data_source "testcluster" 192.168.0.11:8650

8. Configure gmond

Edit /etc/ganglia/gmond.conf:

cluster {
  name = "testcluster"   # set the cluster name
  # owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}
# The address and port of the target gmond to send to (unicast)
udp_send_channel {
  host = 192.168.0.11
  port = 8649
  ttl = 1
}
# UDP receiving port
udp_recv_channel {
  port = 8649
}
# Port on which gmetad's data collection requests are answered
tcp_accept_channel {
  port = 8650
  gzip_output = no
}

9. Configure the web front end

Using a symbolic link:

# ln -s /usr/share/ganglia /var/www/ganglia

Alternatively, copy the contents of /usr/share/ganglia to /var/www/ganglia directly.

10. Modify /etc/httpd/conf.d/ganglia.conf:

Alias /ganglia /usr/share/ganglia
<Location /ganglia>
  Order deny,allow
  Allow from all
</Location>

11. Start the services

# service gmetad start
# service gmond start
# service httpd restart
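
One way to confirm that gmond is answering on its tcp_accept_channel port (8650 in this article) is to read the XML it sends on connection and count the hosts it reports. This is only a sketch: it assumes nc (netcat) is installed, and count_ganglia_hosts is a hypothetical helper name.

```shell
# Count <HOST ...> elements in a Ganglia XML dump read from stdin.
count_ganglia_hosts() {
    # grep -o prints each match on its own line, so wc -l counts every
    # occurrence even if several elements share one line. The trailing
    # space in the pattern keeps <HOSTS ...> summary elements out.
    grep -o '<HOST ' | wc -l | tr -d ' '
}

# Example against the server used in this article:
#   nc 192.168.0.11 8650 | count_ganglia_hosts
```

A count of zero, or no output from nc at all, suggests that gmond is not running or that the port is blocked.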

At this point, the Ganglia server is deployed.

Configure the client:

12. Only gmond needs to be configured on the client (install it first with yum install -y ganglia-gmond-python)

Edit /etc/ganglia/gmond.conf:

cluster {
  name = "testcluster"   # set the cluster name
  # owner = "unspecified"
  latlong = "unspecified"
  url = "unspecified"
}
# The address and port of the target gmond to send to (unicast)
udp_send_channel {
  host = 192.168.0.11    # the server (gmond-master) from step 1
  port = 8649
  ttl = 1
}
# UDP receiving port
udp_recv_channel {
  port = 8649
}
# Port on which gmetad's data collection requests are answered
tcp_accept_channel {
  port = 8650
  gzip_output = no
}

13. Integrate HDFS and YARN with Ganglia

Modify hadoop-metrics2.properties:

# for Ganglia 3.1 support
*.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
*.sink.ganglia.period=10
# default for supportsparse is false
*.sink.ganglia.supportsparse=true
*.sink.ganglia.slope=jvm.metrics.gcCount=zero,jvm.metrics.memHeapUsedM=both
*.sink.ganglia.dmax=jvm.metrics.threadsBlocked=70,jvm.metrics.memHeapUsedM=40
# For the host and port, refer to the definition in gmond.conf.
# Keep this comment on its own line: text after the value on the same line becomes part of the value.
namenode.sink.ganglia.servers=192.168.0.11:8649
datanode.sink.ganglia.servers=192.168.0.11:8649
resourcemanager.sink.ganglia.servers=192.168.0.11:8649
nodemanager.sink.ganglia.servers=192.168.0.11:8649
mrappmaster.sink.ganglia.servers=192.168.0.11:8649
jobhistoryserver.sink.ganglia.servers=192.168.0.11:8649
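
The six servers lines in step 13 differ only in the daemon prefix, so they can be generated instead of typed. A minimal sketch follows; the function name print_ganglia_servers is hypothetical, and the prefix list is simply the one used in this article.

```shell
# Print one "<daemon>.sink.ganglia.servers=<host:port>" line per Hadoop daemon.
# $1 is the gmond address in host:port form, e.g. 192.168.0.11:8649.
print_ganglia_servers() {
    for daemon in namenode datanode resourcemanager nodemanager \
                  mrappmaster jobhistoryserver; do
        printf '%s.sink.ganglia.servers=%s\n' "$daemon" "$1"
    done
}

# Example: append the lines to a properties file
#   print_ganglia_servers 192.168.0.11:8649 >> hadoop-metrics2.properties
```

Generating the lines from one argument makes it harder for a single daemon to end up pointing at a different address.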

14. Integrate HBase with Ganglia

Modify hadoop-metrics2-hbase.properties:

*.sink.file*.class=org.apache.hadoop.metrics2.sink.FileSink
# default sampling period
*.period=10
*.source.filter.class=org.apache.hadoop.metrics2.filter.GlobFilter
*.record.filter.class=${*.source.filter.class}
*.metric.filter.class=${*.source.filter.class}
hbase.sink.ganglia.record.filter.exclude=*Regions*
hbase.sink.ganglia.class=org.apache.hadoop.metrics2.sink.ganglia.GangliaSink31
hbase.sink.ganglia.tagsForPrefix.jvm=ProcessName
*.sink.ganglia.period=20
# For the host and port, see the definition in gmond.conf.
hbase.sink.ganglia.servers=192.168.0.11:8649

15. Copy the configuration files to every machine to be monitored

Copy hadoop-metrics2.properties to the $HADOOP_HOME/etc/hadoop/ directory.

Copy hadoop-metrics2-hbase.properties to the $HBASE_HOME/conf directory.

Restart Hadoop and HBase for the changes to take effect.
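
With more than a few nodes, the copy in step 15 is easier to do in a loop. The sketch below assumes SSH access to every node and that the target directory is the same path on each of them; push_config and the DRY_RUN variable are hypothetical names introduced here for illustration.

```shell
# Copy one file into the same directory on each listed host.
# $1 = local file, $2 = remote directory, remaining arguments = hosts.
# Set DRY_RUN=1 to print the scp commands instead of running them.
push_config() {
    file=$1
    dest=$2
    shift 2
    for host in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "scp $file $host:$dest/"
        else
            scp "$file" "$host:$dest/"
        fi
    done
}

# Example (the expanded $HADOOP_HOME must be valid on the remote side too):
#   push_config hadoop-metrics2.properties "$HADOOP_HOME/etc/hadoop" 192.168.0.12
```

Running it once with DRY_RUN=1 shows exactly which files would go where before anything is copied.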

16. Start gmond on the monitored nodes

# service gmond start

Problem summary:

1. The clients are sending data: the overall CPU load and other cluster-wide information is visible.

2. However, the page for each individual node is empty and shows "no matching metrics detected or rrds not readable".

3. Check the RRD directory:

# cd /var/lib/ganglia/rrds
# ll
drwxr-xr-x 5 ganglia 4096 Jan 17 azcluster
drwxr-xr-x 2 ganglia 36864 Jan 17 __SummaryInfo__

4. Inside the cluster directory, the host folder names are all in lower case:

# ll
drwxr-xr-x 2 ganglia 32768 Jan 17 azcbetadnl05.envazure.com
drwxr-xr-x 2 ganglia 4096 Jan 17 azcbetaldapl01.envazure.com
drwxr-xr-x 2 ganglia 36864 Jan 17 __SummaryInfo__

5. The data itself has all arrived:

# ls azcbetadnl05.envazure.com/ | more
boottime.rrd
bytes_in.rrd
bytes_out.rrd
cpu_aidle.rrd
disk_free_absolute_data1.rrd
disk_free_absolute_data2.rrd
disk_free_absolute_data3.rrd
disk_free_absolute_data4.rrd
disk_free_absolute_data5.rrd
disk_free_absolute_dev_shm.rrd
disk_free_absolute_mnt_resource.rrd
......

6. Cause: the per-node folders in /var/lib/ganglia/rrds are all in lower case. If a node's hostname contains upper-case letters, its data cannot be found.

Solution: modify gmetad.conf and set case_sensitive_hostnames to 1.

# ls /etc/ganglia/
drwxr-xr-x 2 root 4096 Jan 17 08:36 conf.d
-rw-r -- 1 root 171 Oct 12 2015 conf.php
-rw-r -- 1 root 9834 Jan 17 08:44 gmetad.conf
-rw-r -- 1 root 8756 Jan 17 08:45 gmond.conf

# vi gmetad.conf

# In earlier versions of gmetad, hostnames were handled in a case
# sensitive manner
# If your hostname directories have been renamed to lower case,
# set this option to 0 to disable backward compatibility.
# From version 3.2, backwards compatibility will be disabled by default.
# default: 1 (for gmetad < 3.2)
# default: 0 (for gmetad >= 3.2)
case_sensitive_hostnames 1

With the value set to 1, upper-case letters in hostnames are no longer converted to lower case.

7. After the modification, check the RRD directory again. Nothing has changed yet:

# cd /var/lib/ganglia/rrds/azcluster
# ls -al
drwxr-xr-x 2 ganglia 32768 Jan 17 azcbetadnl05.envazure.com
drwxr-xr-x 2 ganglia 4096 Jan 17 azcbetaldapl01.envazure.com
drwxr-xr-x 2 ganglia 36864 Jan 17 __SummaryInfo__

8. Restart gmetad to make the configuration take effect.

# service gmetad restart
Shutting down GANGLIA gmetad: [OK]
Starting GANGLIA gmetad: [OK]

9. Folders named after the upper-case hostnames have now been created (marked with <).

# ls -al
drwxr-xr-x 2 ganglia 32768 Jan 18 azcbetadnl05.envazure.com
drwxr-xr-x 2 ganglia 4096 Jan 18 AZcbetadnL05.envazure.com <
drwxr-xr-x 2 ganglia 4096 Jan 17 azcbetaldapl01.envazure.com
drwxr-xr-x 2 ganglia 4096 Jan 18 AZcbetaLDAPL01.envazure.com <
drwxr-xr-x 2 ganglia 36864 Jan 18 __SummaryInfo__

10. The data is now arriving in the new folders.

# ls -l AZcbetaLDAPL01.envazure.com
-rw-1 ganglia 630760 Jan 18 boottime.rrd
-rw-1 ganglia 630760 Jan 18 bytes_in.rrd
-rw-1 ganglia 630760 Jan 18 bytes_out.rrd
-rw-1 ganglia 630760 Jan 18 cpu_aidle.rrd

11. Check the web page again. The per-node metrics now display normally.
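
The check and the fix from step 6 of the problem summary can be sketched as two small shell functions. Both names (has_uppercase, enable_case_sensitive_hostnames) are hypothetical, and the second one assumes GNU sed for the -i option.

```shell
# Succeed if the given name contains at least one upper-case letter.
has_uppercase() {
    case "$1" in
        *[[:upper:]]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Set case_sensitive_hostnames to 1 in a gmetad.conf.
# $1 is the path to the file (normally /etc/ganglia/gmetad.conf).
enable_case_sensitive_hostnames() {
    if grep -q '^case_sensitive_hostnames' "$1"; then
        # Directive already present: rewrite its value in place.
        sed -i 's/^case_sensitive_hostnames.*/case_sensitive_hostnames 1/' "$1"
    else
        # Directive missing: append it.
        echo 'case_sensitive_hostnames 1' >> "$1"
    fi
}

# Example:
#   has_uppercase "$(hostname)" && echo "this node is affected"
#   enable_case_sensitive_hostnames /etc/ganglia/gmetad.conf
#   service gmetad restart
```

Running has_uppercase on each monitored node shows in advance whether the cluster is affected by the problem described above.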

Permanent link to this article: https://www.bkjia.com/Linux/2018-03/151488.htm
