Popular text: MongoDB election process.
MongoDB's replica set automatically tolerates the downtime of some nodes. When the replica set runs into a problem, an election is triggered to switch the master and slave roles automatically (current MongoDB documentation calls these roles primary and secondary).
Each replica set member runs heartbeat threads in the background for all other nodes in the replica set. The status detection process is triggered in either of the following cases:
- The heartbeat result for a replica set member changes, for example when a node fails or a new node is added.
- No status detection process has run for more than 4 s.
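The two triggers above can be sketched as a single predicate. This is only an illustration: the function and parameter names are hypothetical and are not taken from the MongoDB source, and times are plain seconds.

```python
STATUS_CHECK_INTERVAL_S = 4.0  # from the text: no status check for more than 4 s


def should_run_status_check(heartbeat_changed: bool,
                            last_check_at: float,
                            now: float) -> bool:
    """Return True when either trigger described in the text applies."""
    # Trigger 1: a member's heartbeat result changed (node failed, node added).
    # Trigger 2: more than 4 s have passed since the last status check.
    return heartbeat_changed or (now - last_check_at) > STATUS_CHECK_INTERVAL_S
```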
The status detection process includes the following steps:
- Check whether this node is already in an election. If so, exit the process.
- Maintain the standby list of the master node. Any node in the list may be elected master. Each node checks whether it meets the following node-level and global conditions:
  - A majority of the replica set is online.
  - Its own priority is greater than 0.
  - It is not an arbiter.
  - Its opTime does not lag behind the most up-to-date node by more than 10 s.
  - The cluster configuration it stores is up to date.
If all the conditions are met, add the node to the standby list of the master node. Otherwise, remove the node from the list.
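As a rough sketch of the eligibility check above, the conditions can be combined into one function. The `Member` type and all field names are assumptions made for illustration, opTime is simplified to whole seconds, and "majority" is counted per node (one vote each), which is a simplification.

```python
from dataclasses import dataclass
from typing import List, Set

MAX_OPTIME_LAG_S = 10  # from the text: at most 10 s behind the latest node


@dataclass
class Member:  # hypothetical model of a replica set member
    name: str
    priority: float
    optime: int          # simplified: seconds
    config_version: int
    is_arbiter: bool = False
    online: bool = True


def is_eligible(node: Member, members: List[Member]) -> bool:
    """Check the node-level and global conditions listed in the text."""
    online_count = sum(1 for m in members if m.online)
    majority_online = 2 * online_count > len(members)
    latest_optime = max(m.optime for m in members)
    latest_config = max(m.config_version for m in members)
    return (majority_online
            and node.priority > 0
            and not node.is_arbiter
            and latest_optime - node.optime <= MAX_OPTIME_LAG_S
            and node.config_version == latest_config)


def update_standby_list(standby: Set[str], node: Member,
                        members: List[Member]) -> None:
    """Add the node to the standby list if eligible, otherwise remove it."""
    if is_eligible(node, members):
        standby.add(node.name)
    else:
        standby.discard(node.name)
```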
- If all of the following conditions are met, the master node is downgraded to a slave node (if the master to be downgraded is this node itself, the downgrade method is called directly; otherwise the replSetStepDown command is sent to downgrade the replica set's master to a slave):
  - A master node exists in the cluster.
  - The standby list of the master node contains a node with a higher priority than the current master.
  - The opTime of the highest-priority node in the standby list of the master node is less than 10 s behind the latest opTime of all other nodes.
- Check whether this node is itself the master. If it is and it cannot see a majority of the replica set online, it downgrades itself to a slave node.
- If this node sees no master in the cluster, check whether it is in the standby list of the master node. If it is not, write a log entry and exit the process.
- If this node is in the standby list of the master node, determine whether it may ask the replica set to elect it as master. The checks are:
  - It can see a majority of the replica set online.
  - It is in the standby list of the master node.
If both conditions are met, set "you are already in the election process" to true, and enter the "elect yourself as the master node" method.
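The downgrade rule for a lower-priority master and the decision to stand for election can be sketched as two small predicates. The `Candidate` type and the function names are hypothetical, opTime is simplified to seconds, and majority is counted per node.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Candidate:  # hypothetical entry in the standby list of the master node
    name: str
    priority: float
    optime: int  # simplified: seconds


def master_should_step_down(master: Optional[Candidate],
                            standby: List[Candidate],
                            latest_optime: int) -> bool:
    """A master yields to a higher-priority candidate that is nearly caught up."""
    if master is None or not standby:
        return False
    best = max(standby, key=lambda c: c.priority)
    # The candidate must outrank the master and be less than 10 s behind.
    return best.priority > master.priority and latest_optime - best.optime < 10


def can_stand_for_election(node_name: str, standby_names: Set[str],
                           online_count: int, total_count: int) -> bool:
    """The node sees a majority online and is itself in the standby list."""
    return 2 * online_count > total_count and node_name in standby_names
```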
- The method verifies that the following conditions hold:
  - This thread obtains the thread lock.
  - This node has no slaveDelay option configured, or the configured slaveDelay is 0.
  - This node is not configured as an arbiter.
If they hold, the environment check is called. If any of the following conditions is triggered, the node does not send the "elect me as the master node" request:
  - The current time is earlier than the end of the steppedDown freeze period (the time steppedDown was executed plus the configured freeze time; the internal call uses 60 s).
  - Its opTime is not the latest among all nodes:
    - If another node has a newer opTime than this node, exit the process directly.
    - If the other most up-to-date nodes are at most as new as this node, each of them sleeps for a random period and then checks again.
  - It is still within 5 minutes of cluster startup and not all nodes in the replica set are online.
- If nothing else blocks it, the node obtains its own vote count. As part of this it checks whether it has already voted within the last 30 s; if it has, it exits the entire process.
- After all of these checks, the node finally sends the "elect me as the master node" vote request to the replica set.
- After sending the request, it collects the votes from all nodes. If the votes total no more than half, it does not become master; if they total more than half, it sets itself as the master node.
After the vote is over, set "you are already in the election process" to false.
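The environment check and the final tally described above might look roughly like this. All names are hypothetical, times and opTimes are plain seconds, and the random sleep used when several nodes are equally up to date is left out to keep the sketch short.

```python
from typing import Optional

STEPDOWN_FREEZE_S = 60   # from the text: freeze time used by the internal call
STARTUP_GRACE_S = 300    # from the text: 5 minutes after cluster startup


def may_request_votes(now: float,
                      stepped_down_at: Optional[float],
                      own_optime: int,
                      newest_other_optime: int,
                      all_online: bool,
                      started_at: float) -> bool:
    """Return False if any blocking condition from the text is triggered."""
    # Still inside the freeze period that follows a step-down.
    if stepped_down_at is not None and now < stepped_down_at + STEPDOWN_FREEZE_S:
        return False
    # Another node has a newer opTime than this node.
    if newest_other_optime > own_optime:
        return False
    # Shortly after startup, wait until every node is online.
    if not all_online and now - started_at < STARTUP_GRACE_S:
        return False
    return True


def wins_election(votes_received: int, total_votes: int) -> bool:
    """The node becomes master only with strictly more than half of the votes."""
    return 2 * votes_received > total_votes
```

For example, in a replica set with four votes, two votes are exactly half, so `wins_election(2, 4)` is false and the node stays a slave.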
Some of the checks above are repeated, but this does not affect the final result. The repetition probably stems from the complexity of the logic: every condition is verified again before each decision so that none is missed.
When a node in the replica set receives an "elect me as the master node" vote request from another node, it makes the following checks:
- If the replica set configuration version stored on this node is too low, it does not vote.
- If the replica set configuration version stored on the requesting node is too low, it votes against.
- If the replica set does not contain the node that requested the vote, it votes against.
- If a master node already exists in the replica set, it votes against.
- If a node eligible for election has a higher priority than the requesting node, it votes against.
If all the checks pass, the node obtains its own vote count (it also checks whether it has already voted within the last 30 s, and if so it does not vote again) and casts those votes.
Note that a single objection reduces the final vote count by 10000. In most cases, therefore, one objecting node is enough to prevent the requesting node from becoming master.
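To make the effect of an objection concrete, here is a minimal sketch of how a voter could answer and how the requester could add up the replies. The function names and the reply encoding (None for "does not vote", -10000 for an objection) are assumptions for illustration, not the actual wire format.

```python
from typing import List, Optional

VETO = -10000  # from the text: an objection reduces the total by 10000


def vote_reply(my_config_version: int, their_config_version: int,
               requester_known: bool, master_exists: bool,
               higher_priority_candidate: bool, voted_recently: bool,
               my_votes: int = 1) -> Optional[int]:
    """Return None (no vote), VETO (objection), or the number of votes cast."""
    if my_config_version < their_config_version:
        return None   # own configuration is too old: do not vote
    if their_config_version < my_config_version:
        return VETO   # requester's configuration is too old
    if not requester_known or master_exists or higher_priority_candidate:
        return VETO
    if voted_recently:
        return 0      # already voted within the last 30 s: no new vote
    return my_votes


def tally(replies: List[Optional[int]]) -> int:
    """Sum all replies, ignoring nodes that did not vote."""
    return sum(r for r in replies if r is not None)
```

With three yes votes and one objection, `tally([1, 1, 1, VETO])` is far below zero, so the requester cannot reach a majority.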
The election process is complex. In practice it comes down to the following:
- Electing a master generally takes about 5 s.
- If the newly elected master crashes immediately, it takes at least 30 s to elect a new one.
Original article address: MongoDB election process. Thank you for sharing it.