The primary NameNode manages all metadata, which is usually kept in memory so that access to it is efficient. But there is a hidden danger: if the primary node crashes or loses power, all of the in-memory metadata is lost. If we keep one copy of the metadata in memory and save another copy on the hard disk, the data can be recovered even if the power is lost.
The checkpoint mechanism is what keeps the copy of the metadata on the hard disk up to date.
First, we introduce several key concepts:
Edits: A log file that records every operation that causes the metadata to change.
Fsimage: An image file of the metadata, which can be understood as a copy of the metadata saved on disk.
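To make the relationship between the two files concrete, here is a minimal sketch in Python. It is a toy model, not HDFS code: the namespace is a plain dict, each edit is a tuple, and the names `apply_edit` and `replay` are hypothetical. It only illustrates the idea that the current metadata equals the fsimage plus the edits logged after it.

```python
# Toy model: the namespace is a dict mapping path -> metadata.
# apply_edit and replay are hypothetical names, not HDFS APIs.

def apply_edit(namespace, edit):
    """Apply one logged operation to the namespace in place."""
    op, path = edit[0], edit[1]
    if op == "create":
        namespace[path] = {"replication": edit[2]}
    elif op == "delete":
        namespace.pop(path, None)
    elif op == "rename":
        namespace[edit[2]] = namespace.pop(path)
    else:
        raise ValueError(f"unknown operation: {op}")


def replay(fsimage, edits):
    """Rebuild the current namespace from an image plus the log written after it."""
    # Copy the image so the saved snapshot itself is left untouched.
    namespace = {path: dict(meta) for path, meta in fsimage.items()}
    for edit in edits:
        apply_edit(namespace, edit)
    return namespace


fsimage = {"/a.txt": {"replication": 3}}
edits = [
    ("create", "/b.txt", 2),
    ("rename", "/a.txt", "/c.txt"),
    ("delete", "/b.txt"),
]
recovered = replay(fsimage, edits)
print(recovered)  # {'/c.txt': {'replication': 3}}
```

The longer the edits log grows, the more work a replay like this takes, which is one reason to fold the log into a fresh fsimage from time to time.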
Question 1: An fsimage is an image of the metadata at a single moment, but the metadata is constantly changing, so how is this image kept up to date?
Question 2: How can a new fsimage be generated while the primary NameNode keeps serving clients normally?
The checkpoint steps are as follows:
Step 1: The Secondary NameNode asks the NameNode to stop writing to edits; new operations are temporarily recorded in the edits.new file.
Step 2: The Secondary NameNode copies fsimage and edits from the NameNode to its local disk.
Step 3: The Secondary NameNode merges fsimage and edits into fsimage.ckpt.
Step 4: The Secondary NameNode sends fsimage.ckpt to the NameNode.
Step 5: The NameNode replaces the old fsimage with the new fsimage, and the old edits with the new edits.
Step 6: The checkpoint time is updated.
At this point the fsimage update is complete: the primary NameNode kept serving normally the whole time, and the fsimage was brought up to date as well.
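The six steps above can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions: the classes `NameNode` and `SecondaryNameNode`, their methods, and the `during` parameter (client operations that arrive while the merge is running) are all hypothetical, and a dict named `files` stands in for the files on disk. Real HDFS transfers the files over the network and persists them; none of that is modeled here.

```python
import time


def apply_edit(namespace, edit):
    """Apply one logged operation (hypothetical tuple format) in place."""
    op, path = edit[0], edit[1]
    if op == "create":
        namespace[path] = {"replication": edit[2]}
    elif op == "delete":
        namespace.pop(path, None)


class NameNode:
    def __init__(self):
        self.memory = {}                            # live metadata served to clients
        self.files = {"fsimage": {}, "edits": []}   # stand-in for on-disk files
        self.checkpoint_time = None

    def log(self, edit):
        # While a checkpoint is running, operations go to edits.new.
        target = "edits.new" if "edits.new" in self.files else "edits"
        self.files[target].append(edit)
        apply_edit(self.memory, edit)

    def roll_edits(self):
        # Step 1: stop using edits, record new operations in edits.new.
        self.files["edits.new"] = []

    def install_checkpoint(self, ckpt):
        self.files["fsimage.ckpt"] = ckpt                        # step 4: receive
        self.files["fsimage"] = self.files.pop("fsimage.ckpt")   # step 5: new image
        self.files["edits"] = self.files.pop("edits.new")        # step 5: new edits
        self.checkpoint_time = time.time()                       # step 6


class SecondaryNameNode:
    def checkpoint(self, nn, during=()):
        nn.roll_edits()                                          # step 1
        # Step 2: copy fsimage and edits to "local" storage.
        image = {p: dict(m) for p, m in nn.files["fsimage"].items()}
        edits = list(nn.files["edits"])
        for edit in during:      # simulated client traffic during the merge
            nn.log(edit)
        for edit in edits:                                       # step 3: merge
            apply_edit(image, edit)
        nn.install_checkpoint(image)                             # steps 4 to 6


nn = NameNode()
nn.log(("create", "/a.txt", 3))
nn.log(("create", "/b.txt", 2))
SecondaryNameNode().checkpoint(nn, during=[("delete", "/b.txt")])
print(nn.files["fsimage"])
print(nn.files["edits"])
```

In this run the delete arrives during the merge, so it lands in edits.new and is not part of the new fsimage; it becomes the first entry of the new edits file. Replaying the new edits on top of the new fsimage gives exactly the metadata held in memory, which is why the NameNode never has to stop serving.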