Monitoring a MongoDB Sharded Cluster with Nagios: A Practical Guide

Source: Internet
Author: User
Tags: mongodb commands, mongodb monitoring



1. Download the monitoring plug-in
The MongoDB plug-in is available at git://github.com/mzupan/nagios-plugin-mongodb.git. I did not have a Git environment installed at the time, so I asked the user grassroots to download it for me and then uploaded it to the CSDN resources page: http://download.csdn.net/detail/mchdba/8019077


2. Add the MongoDB monitoring commands

The MongoDB service shares a physical machine with a MySQL slave, and basic Nagios and MySQL monitoring had already been set up on it, so only the MongoDB commands and services need to be added to the existing configuration. For Nagios monitoring of MySQL, see http://blog.itpub.net/26230597/viewspace-760141/ and http://blog.itpub.net/26230597/viewspace-1217246/. The MongoDB monitoring commands to add are as follows:

    cd /usr/local/nagios/etc/objects
    vim commands.cfg

    define command {
        command_name    check_mongodb
        command_line    $USER1$/nagios-plugin-mongodb/check_mongodb.py -H $HOSTADDRESS$ -A $ARG1$ -P $ARG2$ -W $ARG3$ -C $ARG4$
    }

    define command {
        command_name    check_mongodb_database
        command_line    $USER1$/nagios-plugin-mongodb/check_mongodb.py -H $HOSTADDRESS$ -A $ARG1$ -P $ARG2$ -W $ARG3$ -C $ARG4$ -d $ARG5$
    }

    define command {
        command_name    check_mongodb_collection
        command_line    $USER1$/nagios-plugin-mongodb/check_mongodb.py -H $HOSTADDRESS$ -A $ARG1$ -P $ARG2$ -W $ARG3$ -C $ARG4$ -d $ARG5$ -c $ARG6$
    }

    define command {
        command_name    check_mongodb_replicaset
        command_line    $USER1$/nagios-plugin-mongodb/check_mongodb.py -H $HOSTADDRESS$ -A $ARG1$ -P $ARG2$ -W $ARG3$ -C $ARG4$ -r $ARG5$
    }

    define command {
        command_name    check_mongodb_query
        command_line    $USER1$/nagios-plugin-mongodb/check_mongodb.py -H $HOSTADDRESS$ -A $ARG1$ -P $ARG2$ -W $ARG3$ -C $ARG4$ -q $ARG5$
    }
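To make the link between a service's check_command string and these command definitions concrete, here is a minimal Python sketch of how the "!"-separated fields fill the $ARGn$ macros. It only imitates the substitution for illustration and is not how Nagios itself is implemented; the host address and the $USER1$ directory used in the demo are assumed values, not taken from this setup.

```python
# Command template copied from the check_mongodb definition.
CHECK_MONGODB = (
    "$USER1$/nagios-plugin-mongodb/check_mongodb.py "
    "-H $HOSTADDRESS$ -A $ARG1$ -P $ARG2$ -W $ARG3$ -C $ARG4$"
)


def expand_check_command(template, check_command, host_address, user1):
    """Expand a Nagios-style check_command string into a command line.

    check_command looks like "check_mongodb!connect!30000!2!5": the part
    before the first "!" names the command, the rest become $ARG1$..$ARGn$.
    """
    _name, *args = check_command.split("!")
    macros = {"$USER1$": user1, "$HOSTADDRESS$": host_address}
    for i, value in enumerate(args, start=1):
        macros[f"$ARG{i}$"] = value
    line = template
    for macro, value in macros.items():
        line = line.replace(macro, value)
    return line


if __name__ == "__main__":
    # Assumed values for illustration only.
    print(
        expand_check_command(
            CHECK_MONGODB,
            "check_mongodb!connect!30000!2!5",
            host_address="192.0.2.10",
            user1="/usr/local/nagios/libexec",
        )
    )
```

Running the expanded command by hand as the nagios user is a quick way to test a check before wiring it into a service.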


3. Add the MongoDB monitoring services
The MongoDB services also need to be added one by one, as follows:
    # Check the MongoDB connection time: warning above 2 seconds, critical above 5 seconds
    define service {
        host_name               dbm1slave1
        service_description     Mongo Connect Check
        check_command           check_mongodb!connect!30000!2!5
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the percentage of MongoDB connections in use: warning at 70%, critical at 80%
    define service {
        host_name               dbm1slave1
        service_description     Mongo Free Connections
        check_command           check_mongodb!connections!27017!70!80
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check replication lag to make sure the primary and secondaries stay in sync
    define service {
        host_name               dbm1slave1
        service_description     Mongo Replication Lag
        check_command           check_mongodb!replication_lag!27017!15!30
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check MongoDB memory usage. Set the thresholds according to the total memory of the machine hosting MongoDB
    define service {
        host_name               dbm1slave1
        service_description     Mongo Memory Usage
        check_command           check_mongodb!memory!27017!20!28
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check MongoDB mapped memory usage. Set the thresholds according to the total memory of the machine hosting MongoDB
    define service {
        host_name               dbm1slave1
        service_description     Mongo Mapped Memory Usage
        check_command           check_mongodb!memory_mapped!27017!20!28
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the lock time percentage: warning when lock time exceeds 5% of the mongo runtime, critical above 10%
    define service {
        host_name               dbm1slave1
        service_description     Mongo Lock Percentage
        check_command           check_mongodb!lock!27017!5!10
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the average flush time of the mongo server
    define service {
        host_name               dbm1slave1
        service_description     Mongo Flush Average
        check_command           check_mongodb!flushing!27017!100!200
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the last flush time: warning above 200 ms, critical above 400 ms
    define service {
        host_name               dbm1slave1
        service_description     Mongo Last Flush Time
        check_command           check_mongodb!last_flush_time!27017!200!400
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the status of the MongoDB replica set
    define service {
        host_name               dbm1slave1
        service_description     MongoDB State
        check_command           check_mongodb!replset_state!27017!0!0
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the index miss ratio
    define service {
        host_name               dbm1slave1
        service_description     MongoDB Index Miss Ratio
        check_command           check_mongodb!index_miss_ratio!27017!.005!.01
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the number of databases and the number of collections
    define service {
        host_name               dbm1slave1
        service_description     MongoDB Number of Databases
        check_command           check_mongodb!databases!27017!300!500
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    define service {
        host_name               dbm1slave1
        service_description     MongoDB Number of Collections
        check_command           check_mongodb!collections!27017!300!500
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the size of a database
    define service {
        host_name               dbm1slave1
        service_description     MongoDB Database size your-database
        check_command           check_mongodb_database!database_size!27017!300!500!your-database
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the index size of a database
    define service {
        host_name               dbm1slave1
        service_description     MongoDB Database index size your-database
        check_command           check_mongodb_database!database_indexes!27017!50!100!your-database
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the index size of a collection
    define service {
        host_name               dbm1slave1
        service_description     MongoDB Collection index size your-collection
        check_command           check_mongodb_collection!collection_indexes!27017!50!100!your-database!your-collection
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the primary server of the replica set
    define service {
        host_name               dbm1slave1
        service_description     MongoDB Replicaset Master Monitor: your-replicaset
        check_command           check_mongodb_replicaset!replica_primary!27017!0!1!your-replicaset
        # Example: check_command check_mongodb_replicaset!replica_primary!27017!0!1!shard2
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the number of queries per second (here, updates)
    define service {
        host_name               dbm1slave1
        service_description     MongoDB Updates per Second
        check_command           check_mongodb_query!queries_per_second!27017!200!150!update
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the connection time to the primary of the replica set: warning above 2 seconds, critical above 4 seconds
    define service {
        host_name               dbm1slave1
        service_description     Mongo Connect Primary Check
        check_command           check_mongodb!connect_primary!27017!2!4
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }

    # Check the collection state on each host in the mongo cluster. This is useful for checking the
    # availability of critical collections (locks, timeouts, unavailable config servers).
    # An alarm is raised if the query fails.
    define service {
        host_name               dbm1slave1
        service_description     Mongo Collection State
        check_command           check_mongodb!collection_state!27017!your-database!your-collection
        max_check_attempts      5
        normal_check_interval   3
        retry_check_interval    2
        check_period            24x7
        notification_interval   10
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          ops
    }
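Every service definition in this section repeats the same scheduling and notification fields, and only the description and check_command change. As an optional convenience, a small Python helper along the following lines could generate the blocks from a short table. This is a sketch, not part of the plug-in or of Nagios; the function name render_service and the column layout are illustrative choices, and the shared values are the ones used in this article.

```python
# Fields shared by every MongoDB service definition in this article.
COMMON_FIELDS = [
    ("max_check_attempts", "5"),
    ("normal_check_interval", "3"),
    ("retry_check_interval", "2"),
    ("check_period", "24x7"),
    ("notification_interval", "10"),
    ("notification_period", "24x7"),
    ("notification_options", "w,u,c,r"),
    ("contact_groups", "ops"),
]


def render_service(host, description, check_command):
    """Return one 'define service { ... }' block as text."""
    rows = [
        ("host_name", host),
        ("service_description", description),
        ("check_command", check_command),
    ] + COMMON_FIELDS
    width = max(len(key) for key, _ in rows) + 4
    body = "\n".join(f"    {key.ljust(width)}{value}" for key, value in rows)
    return "define service {\n" + body + "\n}\n"


if __name__ == "__main__":
    checks = [
        ("Mongo Lock Percentage", "check_mongodb!lock!27017!5!10"),
        ("Mongo Flush Average", "check_mongodb!flushing!27017!100!200"),
    ]
    for description, command in checks:
        print(render_service("dbm1slave1", description, command))
```

Generating the blocks also makes it easier to keep each service_description unique per host, which Nagios expects.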




4. View the results of the monitoring items

The Nagios-side configuration is now complete. Restart the service with "service nagios restart"; after a few minutes, the Nagios monitoring interface shows the full set of MongoDB services.




5. Determine the MongoDB architecture from the ps output

    ps -eaf | grep mongo

    mongodb   2457     1  0  2013 ?        2-03:39:08 ./mongod --configsvr --dbpath /home/data/mongodb/config --port 20000 --logpath /home/data/mongodb/config.log --logappend --fork
    mongodb   2804     1  0  2013 ?        1-10:02:33 mongos --configdb 192.168.12.62:20000,192.168.12.63:20000,192.168.12.72:20000 --port 30000 --chunkSize 64 --logpath /home/data/mongodb/mongos.log --logappend --fork
    mongodb   3072     1  0  2013 ?        1-10:17:20 mongod --shardsvr --replSet shard1 --port 27017 --dbpath /home/data/mongodb/shard11 --oplogSize 2048 --logpath /home/data/mongodb/shard11.log --logappend --fork
    root     11179  9391  0 11:14 pts/1    00:00:00 grep mongo
    mongodb  30414     1  0 Feb14 ?        1-06:20:50 mongod --shardsvr --replSet shard2 --port 27018 --dbpath /home/data/mongodb/shard21 --oplogSize 2048 --logpath /home/data/mongodb/shard21.log --logappend --fork

The output shows four mongo processes:

a) The process started with "--configdb" is the entry point of the cluster (the route server described in d).

b) Shard server: the processes started with "--shardsvr --replSet" are the shards of the cluster, which store the actual data chunks. These are the MongoDB instances on ports 27017 and 27018. To find out whether the instance on port 27017 is the primary or a secondary, log in to it and run rs.status().

c) Config server: the process started with "--configsvr" stores the metadata of the whole cluster, including chunk information. This is the MongoDB instance on port 20000.

d) Route server: the process started with "mongos --configdb" is the front-end router that clients connect to. It makes the whole cluster look like a single database, so front-end applications can use it transparently. This is the MongoDB instance on port 30000.
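The classification rules in this list can be written down as a short Python sketch. It works purely on command-line text such as the ps output in this section. The helper names are illustrative, and primary_member assumes a dictionary shaped like the rs.status() result (a "members" list whose entries carry "name" and "stateStr"); the host names in the demo are made up.

```python
def classify_mongo_process(cmdline):
    """Return the cluster role implied by a mongod/mongos command line."""
    args = cmdline.split()
    if not args:
        return "unknown"
    binary = args[0].rsplit("/", 1)[-1]  # "./mongod" -> "mongod"
    if binary == "mongos":
        return "route server"
    if binary == "mongod":
        if "--configsvr" in args:
            return "config server"
        if "--shardsvr" in args:
            return "shard server"
    return "unknown"


def option_value(cmdline, option):
    """Return the value following an option such as --port, or None."""
    args = cmdline.split()
    if option in args:
        idx = args.index(option)
        if idx + 1 < len(args):
            return args[idx + 1]
    return None


def primary_member(status):
    """Return the name of the PRIMARY member in an rs.status()-style dict."""
    for member in status.get("members", []):
        if member.get("stateStr") == "PRIMARY":
            return member.get("name")
    return None


if __name__ == "__main__":
    line = "mongod --shardsvr --replSet shard1 --port 27017 --fork"
    print(classify_mongo_process(line), option_value(line, "--port"))
    # Hypothetical status document with made-up host names.
    status = {
        "members": [
            {"name": "host-a:27017", "stateStr": "SECONDARY"},
            {"name": "host-b:27017", "stateStr": "PRIMARY"},
        ]
    }
    print(primary_member(status))
```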



6. Errors encountered during debugging

Error 1:

    tail -f /usr/local/nagios/var/nagios.log

    [1412819956] Warning: Return code of for check of service 'Mongo Memory Usage' on host 'dbm1slave1' is out of bounds.
    [1412819956] SERVICE ALERT: dbm1slave1;Mongo Memory Usage;CRITICAL;SOFT;1;(Return Code of Bounds)
    [1412819975] Warning: Return code of for check of service 'Mongodb Connect check' on host 'dbm1slave1' is out of bounds.
    [1412819975] SERVICE ALERT: dbm1slave1;Mongodb Connect Check;CRITICAL;SOFT;1;(Return Code of Bounds)
    [1412820058] Warning: Return code of for check of service 'Mongo free Connections' on host 'dbm1slave1' is out of bounds.

The fix is to make the nagios user the owner of the plug-in and give it execute permission:

    chmod 770 /usr/lib/nagios/plugins/check_mongodb.py
    chown -R nagios.nagios /usr/lib/nagios/plugins/check_mongodb.py
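For background, Nagios accepts only the plugin exit codes 0 to 3 (OK, WARNING, CRITICAL, UNKNOWN) and reports anything else as out of bounds. The log excerpt in this section does not show the numeric code; 126 and 127 are the usual shell exit codes for "found but not executable" and "command not found", which is why a permission fix is a common remedy. The Python sketch below illustrates that mapping and a simple owner-execute check; the function names are illustrative and not part of Nagios.

```python
import os
import stat

# Exit codes Nagios accepts from a plugin; anything else is "out of bounds".
VALID_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def describe_return_code(code):
    """Translate a plugin exit code into a short description."""
    if code in VALID_STATES:
        return VALID_STATES[code]
    if code == 126:
        return "out of bounds (file found but not executable)"
    if code == 127:
        return "out of bounds (command not found)"
    return "out of bounds"


def mode_allows_owner_execute(mode):
    """True if the owner-execute bit is set in a permission mode."""
    return bool(mode & stat.S_IXUSR)


def owner_can_execute(path):
    """True if the file at path has its owner-execute bit set."""
    return mode_allows_owner_execute(os.stat(path).st_mode)


if __name__ == "__main__":
    # Path taken from the chmod/chown commands in this section.
    plugin = "/usr/lib/nagios/plugins/check_mongodb.py"
    if os.path.exists(plugin):
        print(plugin, "owner-executable:", owner_can_execute(plugin))
    print(describe_return_code(126))
```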

Error 2:

The Status Information column of the monitoring interface shows the error message "No module named pymongo".

This message means that the pymongo module is not installed. Install it with the easy_install pymongo command, as follows:

    easy_install pymongo

    Searching for pymongo
    Reading http://pypi.python.org/simple/pymongo/
    Best match: pymongo 2.7.2
    ......
    zip_safe flag not set; analyzing archive contents...
    Adding pymongo 2.7.2 to easy-install.pth file
    Installed /usr/lib/python2.6/site-packages/pymongo-2.7.2-py2.6-linux-x86_64.egg
    Processing dependencies for pymongo
    Finished processing dependencies for pymongo
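After installing, it can help to confirm that the interpreter that runs the plug-in can actually import pymongo. The following Python sketch does that. It assumes Python 3.4 or later (the original setup used Python 2.6, where importlib.util.find_spec is not available), and the helper name module_available is illustrative.

```python
import importlib.util


def module_available(name):
    """True if the named top-level module can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if module_available("pymongo"):
        import pymongo

        print("pymongo is installed, version", pymongo.version)
    else:
        print("No module named pymongo: install it for the interpreter that runs the plug-in")
```

Run the check as the nagios user, since that account's Python environment is the one the plug-in sees.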

----------------------------------------------------------------------------------------------------------------

<All rights reserved. This article may be reprinted, but a link to the source address must be included; otherwise legal liability will be pursued.>
Original blog address: http://blog.itpub.net/26230597/viewspace-1293589/
Hara Douglas Fir (MCHDBA)

----------------------------------------------------------------------------------------------------------------


Reference: https://github.com/mzupan/nagios-plugin-mongodb/blob/master/README.md

