The timeline is a concept distinctive to PostgreSQL (PG) that appears in the backup and recovery documentation. Detailed explanations of it are scarce and it is often poorly understood, so let's take a closer look.
Why timelines were introduced
To understand why timelines were introduced, consider what goes wrong without them. Take the example of recovering a database to an earlier point in time. Suppose the DBA dropped a critical table at 12:00 noon on Wednesday, but the mistake was not discovered until noon on Friday. The DBA takes the base backup together with the WAL files in the archive directory and restores the database to its state at 11:00am on Wednesday, after which it starts and runs normally. Later, however, the DBA realizes that this was the wrong choice and that the database should have been restored to its Thursday 8:00am state, only to find that this is no longer possible: as the restored database keeps running, it produces WAL files with the same names as the old ones, and when these reach the archive directory they overwrite the originals, so the WAL files needed for the second recovery are gone. To avoid this, the WAL files generated in the database's original history must be distinguishable from the identically named WAL files generated after the recovery completes. The whole scenario is shown in Figure 1.
To solve this problem, PostgreSQL introduced the concept of timelines. Whenever an archive recovery completes, a new timeline is created to identify the WAL records generated from then on. A WAL file name is made up of the timeline ID and the log sequence number. In the source code the name is built as follows:
#define XLogFileName(fname, tli, log, seg) \
    snprintf(fname, XLOG_DATA_FNAME_LEN + 1, "%08X%08X%08X", tli, log, seg)
For example:
ls -1
00000002.history
00000003.history
00000003000000000000001A
00000003000000000000001B
The timeline ID is part of the WAL file name, so a new timeline does not overwrite the WAL generated by earlier timelines. As shown in Figure 2, each timeline resembles a branch: operations on the current timeline do not affect the WAL of other timelines, and with timelines we can recover to any earlier point in time.
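The naming scheme can be sketched in Python. This is only an illustration, not PostgreSQL code: the function names wal_file_name and parse_wal_file_name are made up for this example, and the only thing taken from the source is the layout of three 8-digit hexadecimal fields.

```python
from typing import Tuple


def wal_file_name(tli: int, log: int, seg: int) -> str:
    """Build a 24-character WAL segment file name from its three parts."""
    return "%08X%08X%08X" % (tli, log, seg)


def parse_wal_file_name(name: str) -> Tuple[int, int, int]:
    """Split a WAL segment file name back into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError("not a WAL segment file name: %r" % name)
    return int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)


# The same WAL position on two timelines gives two different file names,
# so a file written after recovery cannot overwrite the archived one.
old_name = wal_file_name(2, 0, 0x1A)
new_name = wal_file_name(3, 0, 0x1A)
print(old_name)  # 00000002000000000000001A
print(new_name)  # 00000003000000000000001A
```

Because the timeline ID occupies the leading field, an alphabetical listing of the archive directory also groups the segments by timeline.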
What happens at the end of recovery?
- End of recovery means the point where the database opens up for writing
- A new timeline is chosen
- A timeline history file is written
- The partial last WAL file on the previous timeline is copied with the new timeline's ID
- A checkpoint record is written on the new timeline
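The file names produced by these steps can be sketched as follows. This is a simplified model, not PostgreSQL's implementation: the function end_of_recovery_files and its parameters are hypothetical, and it assumes the new timeline ID is one higher than the highest timeline ID already known.

```python
def end_of_recovery_files(existing_tlis, old_tli, log, seg):
    """Return the names of the files created when recovery ends on old_tli.

    Assumption: the new timeline ID is the highest known timeline ID plus one.
    (log, seg) identify the partial last WAL segment on the old timeline.
    """
    new_tli = max(existing_tlis) + 1
    return {
        "new_tli": new_tli,
        # The timeline history file is named after the new timeline.
        "history_file": "%08X.history" % new_tli,
        # The partial last segment is copied under the new timeline's ID.
        "copied_from": "%08X%08X%08X" % (old_tli, log, seg),
        "first_wal_file": "%08X%08X%08X" % (new_tli, log, seg),
    }


result = end_of_recovery_files([1], old_tli=1, log=0x13, seg=0xE3)
for key in ("new_tli", "history_file", "copied_from", "first_wal_file"):
    print(key, "=", result[key])
```

With timeline 1 as the only existing timeline and segment 13/E3 as the last one replayed, the sketch yields timeline 2, 00000002.history and 0000000200000013000000E3.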
Example: end of recovery
LOG: database system was interrupted; last known up at … EET
LOG: starting archive recovery
LOG: redo starts at 13/E00000C8
LOG: could not open file "pg_xlog/0000000100000013000000E4": No such file or directory
LOG: redo done at 13/E3D389A0
LOG: last completed transaction was at log time …
LOG: selected new timeline ID: 2
LOG: archive recovery complete
LOG: database system is ready to accept connections
After recovery, the directory contains the files below. 00000002.history is the timeline history file, and 0000000200000013000000E3 is the first WAL file with the new timeline:
0000000100000013000000E1
0000000100000013000000E2
0000000100000013000000E3
0000000100000013000000E4
0000000100000013000000E5
00000002.history
0000000200000013000000E3
0000000200000013000000E4
0000000200000013000000E5
When a new timeline is created
Under what circumstances does a new timeline appear?
1. Point-in-time recovery (PITR)
Configure the recovery.conf file:
restore_command = 'cp /mnt/server/archivedir/%f %p'   # restore WAL files from the archive directory
recovery_target_time = ''                             # target point in time; if not specified, recover to the last completed transaction before the failure
recovery_target_timeline = 'latest'                   # target timeline; 'latest' means the newest timeline branch, or give a timeline ID
standby_mode = 'off'
Once recovery.conf is in place, starting the database creates a new timeline and a new history file. By default, recovery proceeds along the same timeline that was current when the base backup was taken. To recover into a different timeline, set the target timeline in recovery.conf with recovery_target_timeline. Note that you cannot recover into a timeline that branched off earlier than the base backup.
2. Standby promotion
Set up a PG master with a standby, then stop the master and run the following on the standby:
$ pg_ctl promote -D $PGDATA
The standby is then promoted to master, and a new timeline is created along with a new history file.
History file
Each time a new timeline is created, PostgreSQL creates a timeline history file named after the new timeline ID with the suffix .history. Its contents are the contents of the parent timeline's history file plus one record for the current timeline switch. For example, if a recovery switches the database to a new timeline with ID 5, the file is named 00000005.history. It records from which timeline, at which point, and for what reason each timeline branched off. The file may contain multiple records, one per line, each in the following format:
 * <parentTLI> <switchpoint> <reason>
 *
 *  parentTLI    ID of the parent timeline
 *  switchpoint  WAL location where the switch happened
 *  reason       human-readable explanation of why the timeline was changed
For example:
cat 00000004.history
1 0/140000C8 no recovery target specified
2 0/19000060 no recovery target specified
3 0/1F000090 no recovery target specified
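Reading such a file can be sketched in a few lines of Python. This is a minimal sketch, not PostgreSQL's own parser: the names TimelineSwitch, parse_lsn and parse_history are invented for the example, and it assumes whitespace-separated fields with blank lines and lines starting with # ignored.

```python
from typing import List, NamedTuple


class TimelineSwitch(NamedTuple):
    parent_tli: int   # ID of the parent timeline
    switchpoint: int  # WAL location of the switch, as one 64-bit integer
    reason: str       # human-readable explanation


def parse_lsn(text: str) -> int:
    """Convert a WAL location written as 'hi/lo' (hex) to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_history(content: str) -> List[TimelineSwitch]:
    """Parse the text of a .history file into a list of switch records."""
    entries = []
    for line in content.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 2)
        if len(fields) < 2:
            raise ValueError("malformed history line: %r" % line)
        reason = fields[2] if len(fields) > 2 else ""
        entries.append(TimelineSwitch(int(fields[0]), parse_lsn(fields[1]), reason))
    return entries


sample = """\
1 0/140000C8 no recovery target specified
2 0/19000060 no recovery target specified
3 0/1F000090 no recovery target specified
"""
for entry in parse_history(sample):
    print(entry.parent_tli, hex(entry.switchpoint), entry.reason)
```

Storing the switchpoint as a single integer makes it easy to compare WAL locations when deciding which timeline a given position belongs to.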
When a database is recovered from an archive that contains multiple timelines, these history files allow the system to pick the right WAL files. Like WAL files, history files can also be archived to the WAL archive directory. History files are just small text files, so it is cheap to keep them.
When recovery.conf specifies a target timeline tli for recovery, the program first looks for the corresponding .history file. Based on the timeline branches recorded in that file, it finds the WAL files belonging to every timeline between startTLI (taken from pg_control) and tli, and then replays them.
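The lookup can be illustrated with a small Python model. It is a sketch under stated assumptions rather than the actual recovery code: the functions timelines_for_recovery and timeline_at are hypothetical, the history is given as (parent timeline, switchpoint) pairs ordered oldest first, and a WAL location exactly at a switchpoint is treated as belonging to the child timeline.

```python
def timelines_for_recovery(history, target_tli, start_tli):
    """Return the timelines to walk through, from start_tli up to target_tli.

    history is the list of (parent_tli, switchpoint) pairs read from the
    target timeline's .history file, oldest first.
    """
    chain = [parent for parent, _ in history] + [target_tli]
    if start_tli not in chain:
        raise ValueError("target timeline does not descend from start timeline")
    return chain[chain.index(start_tli):]


def timeline_at(history, target_tli, lsn):
    """Return the timeline whose WAL covers the given WAL location."""
    for parent, switchpoint in history:
        if lsn < switchpoint:
            return parent
    return target_tli


# History of timeline 4, matching the example history file shown earlier.
history = [(1, 0x140000C8), (2, 0x19000060), (3, 0x1F000090)]
print(timelines_for_recovery(history, target_tli=4, start_tli=2))  # [2, 3, 4]
print(timeline_at(history, 4, 0x10000000))  # 1
print(timeline_at(history, 4, 0x20000000))  # 4
```

A start timeline that does not appear in the target's history raises an error here, which mirrors the rule that recovery cannot reach a timeline unrelated to the one the base backup was taken on.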
Summary
The timeline mechanism in PG makes it easy to recover a database to any point in time covered by the base backup and archived WAL, which makes it an important part of database backup. By taking base backups and archiving WAL sensibly while the database is in use, we can use the timeline mechanism to methodically recover the data we need once data is lost or corrupted.
Reference:
http://mysql.taobao.org/monthly/2015/07/03/
https://wiki.postgresql.org/images/e/e5/FOSDEM2013-Timelines.pdf