MongoDB Replica Set troubleshooting
1. Check the status of the Replica Set
Run db.runCommand({"replSetGetStatus": 1}); against the admin database, or use the shell helper rs.status();
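As a minimal sketch of how the status document can be screened, the function below lists members that are unhealthy or in an unexpected state. It assumes the members[].name, members[].health and members[].stateStr fields that rs.status() returns. The sample document, the set name rs0 and the third member m3.example.net:30003 are hypothetical values used only for illustration.

```javascript
// Sketch: report members that are unhealthy or not in a normal state.
// `status` is expected to have the shape returned by rs.status().
function findProblemMembers(status) {
  const normalStates = new Set(["PRIMARY", "SECONDARY", "ARBITER"]);
  return status.members
    .filter((m) => m.health !== 1 || !normalStates.has(m.stateStr))
    .map((m) => ({ name: m.name, health: m.health, stateStr: m.stateStr }));
}

// Hypothetical sample data; in the mongo shell you would pass rs.status().
const sampleStatus = {
  set: "rs0",
  members: [
    { name: "m1.example.net:30001", health: 1, stateStr: "PRIMARY" },
    { name: "m2.example.net:30002", health: 1, stateStr: "SECONDARY" },
    { name: "m3.example.net:30003", health: 0, stateStr: "(not reachable/healthy)" },
  ],
};

console.log(findProblemMembers(sampleStatus));
```

In the mongo shell the same function could be called as findProblemMembers(rs.status()).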
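2. Check the replication lag
Run rs.printSlaveReplicationInfo() (rs.printSecondaryReplicationInfo() in newer releases) to see how far each secondary is behind. Sample output:
source: m1.example.net:30001
    syncedTo: Tue Oct 02 2012 11:33:40 GMT-0400 (EDT)
    = 7475 secs ago (2.08hrs)
source: m2.example.net:30002
    syncedTo: Tue Oct 02 2012 11:33:40 GMT-0400 (EDT)
    = 7475 secs ago (2.08hrs)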
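The lag figure in that output is simply the difference between a member's last synced time and a reference time. The sketch below reproduces the arithmetic for the sample values. The reference time 13:38:15 is an assumption derived from the "7475 secs ago" shown above, and the function names are illustrative, not part of MongoDB.

```javascript
// Sketch: compute replication lag in seconds between two Date values.
function secondsBehind(syncedTo, reference) {
  return Math.round((reference.getTime() - syncedTo.getTime()) / 1000);
}

// Format the lag the way the shell helper prints it.
function formatLag(seconds) {
  return `${seconds} secs ago (${(seconds / 3600).toFixed(2)}hrs)`;
}

// syncedTo is taken from the sample output; checkedAt is an assumed "now".
const syncedTo = new Date("2012-10-02T11:33:40-04:00");
const checkedAt = new Date("2012-10-02T13:38:15-04:00");

const lagSeconds = secondsBehind(syncedTo, checkedAt);
console.log(formatLag(lagSeconds)); // 7475 secs ago (2.08hrs)
```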
Common causes of replication lag include the following.
Network Latency
Use ping and traceroute to check the network path between members.
Disk Throughput
If a secondary cannot flush data to disk as quickly as the primary, it falls behind and cannot stay in sync. Use iostat or vmstat to check disk usage on the secondary.
Concurrency
In some cases, long-running operations on the primary block replication on the secondaries. Consider a write concern that requires acknowledgement from secondaries, so that writes cannot outpace replication. Then check whether slow queries exist.
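Appropriate Write Concern
Large bulk loads that use unacknowledged write concern can write to the primary faster than the secondaries can apply the oplog. Requesting acknowledgement at regular intervals gives the secondaries a chance to catch up. Related topics in the MongoDB documentation: Replica Acknowledged Write Concern and Replica Set Write Concern.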
3. Test connections between all members
Every member of the replica set must be able to reach every other member. If a connection fails, check the firewall settings.
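One way to test reachability from a given host is to open a plain TCP connection to each member's address. The following is a rough sketch using Node's built-in net module. The function names and the 3000 ms timeout are assumptions chosen for illustration, and a successful TCP connect only shows that the port is open, not that mongod is healthy.

```javascript
const net = require("net");

// Split "host:port" into its parts (the last colon separates the port).
function parseMember(address) {
  const idx = address.lastIndexOf(":");
  if (idx === -1) throw new Error(`missing port in "${address}"`);
  const port = Number(address.slice(idx + 1));
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port in "${address}"`);
  }
  return { host: address.slice(0, idx), port };
}

// Resolve to true if a TCP connection opens within timeoutMs, else false.
function canConnect(address, timeoutMs = 3000) {
  const { host, port } = parseMember(address);
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}

// Check every address and pair it with its result.
async function checkMembers(addresses) {
  const results = await Promise.all(addresses.map((a) => canConnect(a)));
  return addresses.map((a, i) => ({ member: a, reachable: results[i] }));
}

// Usage (run on each member's host in turn):
// checkMembers(["m1.example.net:30001", "m2.example.net:30002"]).then(console.log);
```

Because members must reach each other in both directions, the check is only meaningful when it is repeated from every host in the set.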
4. Socket exceptions when restarting multiple secondaries
When you restart several members of a replica set, make sure enough members stay available for the set to elect a primary. If the application hits socket connection errors during maintenance, check the TCP keepalive settings.
On Linux, read the current value with:
cat /proc/sys/net/ipv4/tcp_keepalive_time
tcp_keepalive_time defaults to 7200 seconds (two hours). You can lower it to 300 seconds on every server that hosts a MongoDB instance:
echo 300 > /proc/sys/net/ipv4/tcp_keepalive_time
A value set this way is lost on reboot. To make it persistent, set net.ipv4.tcp_keepalive_time = 300 in /etc/sysctl.conf and then run sysctl -p.
5. Check the oplog size
The larger the oplog, the more replication lag a secondary can tolerate before it becomes too stale to catch up.
Use db.printReplicationInfo(); to view the oplog size and the time range it covers:
db.printReplicationInfo();
configured oplog size:   50278.6203125MB
log length start to end: 143109secs (39.75hrs)
oplog first event time:  Wed Mar 18 2015 00:36:53 GMT+0800 (CST)
oplog last event time:   Thu Mar 19 2015 16:22:02 GMT+0800 (CST)
now:                     Thu Mar 19 2015 17:32:42 GMT+0800 (CST)
If you resize the oplog, resize it on every member so that all members use the same size.
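The "log length start to end" figure is the time between the first and last oplog events, and it can be compared with the current lag. The sketch below recomputes it from the two timestamps in the sample output. The helper names are illustrative, and treating "lag smaller than the oplog window" as the catch-up condition is a simplification.

```javascript
// Sketch: time span covered by the oplog, in seconds.
function oplogWindowSeconds(firstEvent, lastEvent) {
  return Math.round((lastEvent.getTime() - firstEvent.getTime()) / 1000);
}

// Simplified rule of thumb: a secondary can catch up only while its lag
// is smaller than the window the oplog still covers.
function canCatchUp(lagSecs, windowSecs) {
  return lagSecs < windowSecs;
}

// Timestamps taken from the sample db.printReplicationInfo() output.
const firstEvent = new Date("2015-03-18T00:36:53+08:00");
const lastEvent = new Date("2015-03-19T16:22:02+08:00");

const windowSecs = oplogWindowSeconds(firstEvent, lastEvent);
console.log(`${windowSecs}secs (${(windowSecs / 3600).toFixed(2)}hrs)`); // 143109secs (39.75hrs)
console.log(canCatchUp(7475, windowSecs)); // true
```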
6. Oplog entry timestamp error
If the following errors appear in the log, the ts field of the most recent oplog entry may have the wrong data type (it should be a Timestamp):
replSet error fatal couldn't query the local.oplog.rs collection.  Terminating mongod after 30 seconds.
<timestamp> [rsStart] bad replSet oplog entry?