Discover monitoring a Hadoop cluster with Nagios: articles, news, trends, analysis, and practical advice about monitoring Hadoop clusters with Nagios on alibabacloud.com.
1. Be sure to confirm with the application side which processes each node actually needs monitored; do not blindly assume that ZooKeeper, JournalNode and the other components are deployed identically on every Hadoop cluster. Remember this!
2. A monitored node only needs nagios-plugins and NRPE installed, plus xinetd if required.
3. Verify that Nagios is not instal...
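As a rough sketch of what point 2 can look like in practice (the plugin path and the daemon list below are assumptions for illustration, not taken from the article), NRPE command definitions on a monitored Hadoop node might use check_procs to verify that the expected Java daemons are running:
# excerpt from /etc/nagios/nrpe.cfg on a monitored node; define only the checks
# that the application side confirmed for this particular host
command[check_namenode]=/usr/lib64/nagios/plugins/check_procs -c 1:1 -a 'org.apache.hadoop.hdfs.server.namenode.NameNode'
command[check_datanode]=/usr/lib64/nagios/plugins/check_procs -c 1:1 -a 'org.apache.hadoop.hdfs.server.datanode.DataNode'
command[check_zookeeper]=/usr/lib64/nagios/plugins/check_procs -c 1:1 -a 'org.apache.zookeeper.server.quorum.QuorumPeerMain'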
Ganglia is cluster monitoring software developed at Berkeley. It can monitor and display all kinds of node status information in the cluster, such as CPU, memory and disk utilization, I/O load and network traffic, and historical data can be presented as curves via PHP pages. Ganglia relies on a web server to display the state of the...
Nagios Enterprise Cluster Monitoring
Nagios is a monitoring system that monitors system running status and network information. Nagios can monitor specified local or remote hosts and services and provide exception notification...
module, run the easy_install pymongo command to install it. For example, see the following:
[email protected] objects]# easy_install pymongo
Searching for pymongo
Reading http://pypi.python.org/simple/pymongo/
Best match: pymongo 2.7.2
......
zip_safe flag not set; analyzing archive contents...
Adding pymongo 2.7.2 to easy-install.pth file
Installed /usr/lib/python2.6/site-packages/pymongo-2.7.2-py2.6-linux-x86_64.egg
Processing dependencies for pymongo
Finished processing dependencies for pymongo
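Once easy_install reports success, a quick sanity check confirms the module is importable (a minimal sketch; the interpreter assumed here is the system Python 2.6 shown in the install log):
# verify the freshly installed pymongo module loads and report its version
python -c "import pymongo; print pymongo.version"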
Apache Ambari is a web-based tool that supports provisioning, managing, and monitoring Apache Hadoop clusters. Ambari currently supports most Hadoop components, including HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, Sqoop, and HCatalog. Apache Ambari supports centralized management of HDFS, MapReduce, Hive, Pig, HBase, ZooKeeper, Sqoop, and HCatalog. It is also...
a reliable fault-tolerance mechanism, and it is easy to extend with custom metrics. One of the better-known users of Ganglia is Wikipedia; you can visit their Ganglia installation to see how the Wikipedia cluster is running. Ganglia + Nagios is the combination we chose for our monitoring system. One problem with this, however, is that both have their own surveillance...
sync_h_script  # In fact, these two commands are aliases for my own salt commands; see /opt/hadoop_scripts/profile.d/hadoop.sh
III. Monitoring
A common solution is Ganglia plus Nagios monitoring: Ganglia collects a large number of metrics and presents them graphically, while Nagios...
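To illustrate how the two tools divide the work in such a setup (a sketch under assumed host names and plugin paths, not configuration from the article), Nagios typically polls remote NRPE agents for pass/fail checks while Ganglia keeps collecting metrics; a remote check such as the check_datanode command sketched earlier can be exercised by hand from the Nagios server first:
# run a remote NRPE check manually before wiring it into a Nagios service definition
/usr/lib64/nagios/plugins/check_nrpe -H hadoop-dn01 -c check_datanode
# a healthy node returns something like: PROCS OK: 1 process with args 'DataNode'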
server via direct routing, VS/DR (Virtual Server via Direct Routing), which can greatly improve the scalability of the system. VS/NAT, VS/TUN and VS/DR are the three IP load-balancing technologies implemented in an LVS cluster.
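As a minimal sketch of the VS/DR mode mentioned above (the addresses and scheduler are made-up illustrations, not values from the article), the director is configured with ipvsadm, where -g selects direct routing:
# define a virtual service on the director, round-robin scheduling
ipvsadm -A -t 192.168.0.100:80 -s rr
# add a real server in direct-routing (VS/DR) mode
ipvsadm -a -t 192.168.0.100:80 -r 192.168.0.11:80 -g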
Nagios Introduction
Nagios is a monitoring system that monitors system operation status...
Hadoop consists of two parts:
Distributed File System (HDFS)
Distributed computing framework (MapReduce)
The Distributed File System (HDFS) is mainly used for distributed storage of large-scale data, while MapReduce is built on top of the distributed file system to perform distributed computing on the data stored in it.
The functions of each node are described in detail below.
NameNode:
1. There is only one NameNode in the...
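Because the NameNode is central to HDFS, it is a natural first target when monitoring a cluster; as a hedged example (the exact command form varies between Hadoop versions), its view of the cluster can be pulled from the shell and wrapped in a check script:
# summary of live/dead DataNodes, capacity and replication health as seen by the NameNode
hadoop dfsadmin -report
# file-system consistency check (use sparingly on large clusters)
hadoop fsck /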
Nagios monitoring Heartbeat
After Heartbeat is set up, it also needs to be monitored; next we will look at how to do that.
First, take a look at the following commands. They are added automatically when Heartbeat is installed, and our monitoring script will use them.
[root@usvr-210 libexec]# which cl_status
/usr/bin/cl_status
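The monitoring logic shown further below relies on a few cl_status subcommands; a brief hedged sketch of how they behave on a healthy node (the second host name is illustrative):
cl_status hbstatus            # is the heartbeat daemon running on this node?
cl_status listnodes           # list the nodes in the cluster
cl_status nodestatus usvr-211 # per-node state such as active, ping or dead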
Deploy check_mysql_health on Nagios to monitor MySQL
This monitoring is based on active checks from the Nagios server, using check_mysql_health to implement a variety of monitoring modes: connection-time (time to connect to the server)...
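As an illustration of the connection-time mode (the credentials and thresholds below are placeholders, not values from the article), the plugin can be run by hand before defining the Nagios command and service:
# warn if connecting takes more than 1 second, go critical above 5 seconds
./check_mysql_health --hostname 127.0.0.1 --username nagios --password secret --mode connection-time --warning 1 --critical 5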
... not running on this node"
    exit $CRITICAL
fi
declare -i I=0
declare -i A=0
NODES=`$CL_ST listnodes`
for node in $NODES
do
    status=`$CL_ST nodestatus $node`
    let I=$I+1
    # By default only nodes in the "active" state were counted, but the "ping"
    # state is also normal, so the condition below was changed accordingly:
    # if [ $status == "active" ]
    if [ $status == "active" -o $status == "ping" ]
    then
        let A=$A+1
    fi
done
if [ $A -eq 0 ]
then
    echo "Heartbeat CRITICAL: $A/$I"
    exit $CRITICAL
elif [ $A -ne $I ]
then
    echo "Heartbeat ...
The following software is widely used in the Internet industry, but its pronunciation is often a case of "same name, everyone saying it their own way".
Nagios is an IT infrastructure monitoring tool; home page: http://www.nagios.org/
(As pronounced by Ethan, the author of Nagios ):
http://community.nagios.org/audio/nagiospronunciation.mp3
Cacti is a graphing tool for ne...
BKJIA exclusive article] I am a Linux/Unix system engineer who uses Nagios to automatically monitor the company's intranet development environment and Internet-facing application environment. Nagios has powerful alerting functions, but sometimes our systems group has an additional need, especially when the system is busy: we want to keep logs for later analysis, to tell whether we are under attack, whether a developer has misconfigured something, or whether the...
system. In practical application scenarios, administrators tune Linux kernel parameters to improve job running efficiency. The following are some useful adjustments.
(1) Increase the limits on the number of simultaneously open file descriptors and network connections. In a Hadoop cluster, because a large number of jobs and tasks are involved, the operating system kernel's limit on the number of file descriptors...
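A hedged sketch of adjustment (1); the user name and numbers below are illustrative examples, not recommendations from the article:
# check the current per-process open-file limit for the account running the Hadoop daemons
ulimit -n
# raise the limit persistently via /etc/security/limits.conf (log out and back in to apply)
echo "hadoop  soft  nofile  65536" >> /etc/security/limits.conf
echo "hadoop  hard  nofile  65536" >> /etc/security/limits.conf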
and DataNodes need to report block information to both the active NN and the standby NN. Advantages: information is not lost and recovery is fast (seconds). Disadvantages: Facebook developed it on Hadoop 0.2, so deployment is a bit troublesome; additional machine resources are required, and NFS becomes another single point (although one with a low failure rate). 4. Hadoop 2.0 directly supports a standby NN, drawing on Facebook's Avatar and then making some improvements: information is not lost, recovery is fast (seconds), sim...
node. When a job is submitted, the JobTracker receives the job and its configuration information, distributes the configuration to the nodes, dispatches tasks, and monitors the execution of the TaskTrackers.
As can be seen from the above introduction, HDFS and MapReduce together form the core of the Hadoop distributed system architecture. HDFS implements the distributed file system on the...
Is the cluster set up correctly? The best way to answer this question is empirical: run some jobs and confirm that you get the expected results. Benchmarks make good tests, as you also get numbers that you can compare with other clusters as a sanity check on whether your new cluster is performing roughly as expected, and you can tune a cluster using benchmark r...
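A commonly used smoke test of this kind is the TeraGen/TeraSort pair shipped in the Hadoop examples jar; a hedged sketch (the jar name, paths and data size differ between releases):
# generate ten million 100-byte rows, sort them, then validate the sorted output
hadoop jar hadoop-examples.jar teragen 10000000 /benchmarks/terasort-input
hadoop jar hadoop-examples.jar terasort /benchmarks/terasort-input /benchmarks/terasort-output
hadoop jar hadoop-examples.jar teravalidate /benchmarks/terasort-output /benchmarks/terasort-report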