3.4 Replay Transaction log
Once we have created an initial base backup of our own, we can collect the Xlog created by the database. When the time comes, we can use all these xlog files and perform the recovery process we expect. This works as described in this section.
Perform basic recovery
In PostgreSQL, the entire recovery process has a file management called recover.conf, which resides primarily in the home directory of the underlying backup. It is read at startup and tells the database server where to find the Xlog archive, when to terminate the replay, and so on.
To get you started, we decided to include a simple recovery.conf sample file for the execution of a basic backup process:
Restore_command = ' cp/archive/%f%p '
Recovery_target_time = ' 2013-10-10 13:43:12 '
Restore_command is basically the command that corresponds to the Archive_command command you've seen before. Archive_command should put the data into the archive, Restore_command should use a single file to provide data for the recovery instance. Again, it is a simple shell command that provides one xlog block after another, or a simple shell script. Your options here are limited only by imagination; PostgreSQL is going to check the return code of the code you wrote and provide the data through your script.
Just like in postgresql.conf, we use%p and%f as placeholders, and the two placeholders mean exactly the same as before.
To tell the system when to stop the recovery, we can set the Recovery_target_time. The variable is optional. If it is not specified, PostgreSQL will resume until it runs out of XLOG. In many cases, simply consuming the entire xlog is a very desirable process; if something fails, you need to recover as much data as possible. However, this is not always the case. If you want PostgreSQL to stop your recovery at a specific time, you have to put the critical logs here. The crucial part here is actually, to know how much you want to replay xlog; In a practical work scenario, this has been proven to be a tricky question to answer.
[If you happen to have a recovery_target_time in the future, don't worry, PostgreSQL will start at the closest available transaction in your xlog and simply stop the recovery.] The DB instance will remain consistent and ready to take action. You cannot terminate PostgreSQL, however, in the case of data loss you may terminate your application because the Xlog is missing. ]
Before you start PostgreSQL, you must perform chmod 700 on the directory that contains the underlying backup, otherwise PostgreSQL will output an error:
imac:target_directoryhs$ pg_ctl-d/target_directory\
Start
Server starting
Fatal:data directory "/target_directory" have group or world access
Detail:permissions should be u=rwx (0700).
This additional security check should ensure that your data directory cannot be accidentally read by some users. Therefore, from a security standpoint, an explicit permission change is definitely an advantage (be prepared).
Now that all the work is ready, we can start the replay process by starting PostgreSQL:
imac:target_directoryhs$ pg_ctl–d/target_directory \
Start
Server starting
Log:database system was interrupted; Last known on 2013-03-10
18:04:29 CET
log:creating missing WAL directory "Pg_xlog/archive_status"
Log:starting point-in-time Recovery to 2013-10-10 13:43:12+02
log:restored log File "000000010000000000000006" from archive
Log:redo starts at 0/6000020
Log:consistent recovery state reached at 0/60000B8
log:restored log File "000000010000000000000007" from archive
log:restored log File "000000010000000000000008" from archive
log:restored log File "000000010000000000000009" from archive
log:restored log File "00000001000000000000000A" from archive
CP:/tmp/archive/00000001000000000000000b:no such file or
Directory
Log:could not open File "pg_xlog/00000001000000000000000b" (log
File 0, segment): No such file or directory
Log:redo done at 0/ad5ce40
Log:last completed transaction was at log time 2013-03-10
18:05:33.852992+01
log:restored log File "00000001000000000000000A" from archive
CP:/tmp/archive/00000002.history:no such file or directory
log:selected New Timeline Id:2
CP:/tmp/archive/00000001.history:no such file or directory
log:archive Recovery Complete
Log:database system is ready to accept connections
Log:autovacuum Launcher started
The amount of log generated by the database tells us everything we need to know about the recovery process, which is definitely worth investigating in detail.
The first line indicates that PostgreSQL has found that it has been terminated and that it must be restarted. From the point of view of a DB instance, the underlying backup looks more or less like a crash that needs to be cared for immediately by replaying Xlog, which is exactly what we want.
Next few lines (recover log file ...) Indicates that we are replaying a Xlog file created by the underlying backup one by one. It is worth mentioning that the replay process starts at the beginning of the sixth file. The underlying backup knows where to start, so PostgreSQL will automatically find the appropriate Xlog file.
The information displayed after PostgreSQL arrives at the sixth file (consistent recovery state reached at 0/60000B8) is very important. PostgreSQL indicates that it has reached a consistent state. This is very important. The reason for this is that the data files in the underlying backup are actually explicitly interrupted (see our previous section on Xlog), but the data files are not corrupted and cannot be repaired. As long as we have enough xlog to recover, we have no problem. If you cannot reach a consistent state, your DB instance will not be available, and your recovery may not work without providing additional xlog.
[Realistically speaking, the inability to reach a consistent state usually indicates a problem with your archive process and your system settings.] So far, if everything is working, there is no reason not to reach a consistent state. ]
Once we reach a consistent state, one file after another will be successfully replayed until the system eventually finds the 00000001000000000000000B file. The problem is that the file is not created by the source DB instance. Logically, an error will pop up.
[It is perfectly normal to not find the last file; If Recovery_target_time does not let PostgreSQL stop recovering this type of error before reaching the end of the Xlog stream, it is expected. Don't worry, your system is actually good. Before the error message, you have successfully replayed everything that appears in the file. ]
Once all the xlog have been consumed and the error message has been discussed earlier, PostgreSQL reports the last transaction that could or should be replayed and started. Now that you have a full recovery db instance, you can connect to the database immediately. Once the restore is complete, PostgreSQL will rename the recovery.conf to Recovery.done to ensure that no damage is done when a new instance is restarted at a later point in time.
More advanced positioning in the Xlog
So far, we have used 16MB of transaction log blocks to recover a database to the latest available time. We also see that you can define the desired recovery timestamp. But now the question is: How do you know which time point to perform the recovery? Imagine that someone deleted a table during the day. What if you can't easily determine the recovery timestamp? What if you were to revert to a particular transaction?
Recovery.conf has all you need. If you want to replay until a specific transaction, you can refer to Recovery_target_xid. Simply specify the transaction you need and configure the recovery_target_inclusive to include this very special transaction or not configure it. As mentioned earlier, it is fairly easy to use this setting technically, so far it is not easy to find the right place to replay.
In a typical setup, the best way to find a reasonable point to terminate the recovery is to use Pause_at_recovery_target. If it is set to Ture, PostgreSQL will not automatically turn into a production instance when the recovery point is reached. Instead, he waits for instructions from the database administrator. This is useful if you do not know where to replay to. You can replay, log in, see where the database is, change to the next target time, and continue to replay in small steps.
[You must set Hot_standby = On in postgresql.conf to allow reading in recovery.] ]
Return to recovery after terminating PostgreSQL by invoking a simple SQL statement: SELECT pg_xlog_replay_resume (). It moves the instance to the next location that you set in recovery.conf. Once you have found the correct location, you can set Pause_at_recovery_target to false and call Pg_xlog_replay_resume. Alternatively, you can simply use pg_ctl–d ... promote to stop the recovery and make the instance work.
Is this explanation too complicated? Let's make it a list:
• Add Restore_command to the recovery.conf file.
• Add Recovery_target_time to the recovery.conf file.
• In the recovery.conf file, set Pause_at_recovery_target to True.
• In the postgresql.conf file, set Hot_standby to ON.
• Start the recovery instance.
• Once a consistent state is reached and the recovery is stopped, connect to the instance immediately.
• Check that you are already in the place you are going.
• If you are not in:
° Change recovery_target_time.
° Run Select Pg_xlog_replay_resume ().
° If necessary, check again and repeat the paragraph.
[Remember, once the recovery is complete, once PostgreSQL is started as a normal DB instance, you cannot replay Xlog later, such as 9.2.] Without this process, you can of course use file system snapshots directly. The file system snapshot will always work with PostgreSQL, because when you restart a snapshot DB instance, it will simply assume that it has crashed before and is recovering normally. ]
Mid-way Cleanup Xlog
Once you have configured the archive, you must store the Xlog created by the source server. Logically, this is never going to happen. Sometimes you have to get rid of these xlog; for your files, it is necessary to have a sound and sustainable cleanup strategy.
However, keep in mind that you must maintain enough xlog so that you can always perform recovery from the most recent underlying backup. However, if you are sure that a particular base backup is no longer needed, you can safely clean up any xlog that is earlier than the xlog of the underlying backup you want to keep.
How does the administrator find out what to delete? The simplest way is to look at your archive directory:
000000010000000000000005
000000010000000000000006
000000010000000000000006.00000020.backup
000000010000000000000007
000000010000000000000008
Check the file name in the middle of the list. The backup file has been created by the underlying backup. It contains some ways to create a basic backup and tells the system where to continue replaying xlog. If the backup file belongs to the earliest base backup you need, you can safely clear all xlog below the number 6, in which case the 5th file can be safely removed.
In our example, 000000010000000000000006.00000020.backup contains the following information:
START WAL location:0/6000020 (file 000000010000000000000006)
STOP WAL location:0/60000e0 (file 000000010000000000000006)
CHECKPOINT location:0/6000058
BACKUP method:streamed
BACKUP From:master
START time:2013-03-10 18:04:29 CET
Label:pg_basebackup Base Backup
STOP time:2013-03-10 18:04:30 CET
The. backup file will also provide you with relevant information, such as when the underlying backup was created. It is an ordinary file, so it should be easy for the average user to read this information. As a choice to delete all xlog files at a certain point, it is also possible to clear them during replay. One way to do this is to hide an RM command in your restore_command. While technically possible, it is not necessarily advisable to do so (what if you want to recover again?). )。
You can also add the Recovery_end_command command to your recovery.conf file. The goal of Recovery_end_command is to allow you to automatically trigger some actions as soon as the recovery is complete. Again, PostgreSQL will invoke a script to do what you want to do. When a database declares itself as a master library, you can easily use this setting to clean up older xlog.
Toggle Xlog File
If you want to do a recovery based on the Xlog file, you have seen that each xlog will be archived to 16MB. What happens if you don't try to create a 16MB change? If you are a small supermarket, this will only produce 50 trades per day. Finally, your system will never be able to fill up to 16MB.
However, if your system crashes, potential data loss can be seen as the amount of data in the Xlog file that you did not finish at last. Maybe it's not good enough for you.
A postgresql.conf setting in the source database may be helpful for this issue.
Archive_timeout tells PostgreSQL to create a new Xlog file at least every x seconds. So, if you are in this small supermarket, you can have the database create a new Xlog file every day before you go home. In this case, you can be sure that the data for the day will be securely on your backup device.
It is also possible to manually make PostgreSQL switch to the next Xlog file. The server provides a process called pg_switch_xlog () to do this work:
test=# SELECT Pg_switch_xlog ();
Pg_switch_xlog
----------------
0/17c0ef8
(1 row)
You might call this process when some important patching work is done or you want to make sure that a particular database is securely present in your Xlog archive.
3.5 Summary
In this chapter, you've learned about instant recovery, a safe and easy way to recover your PostgreSQL database to any time you need to ouch. Instant recovery will help you implement a better backup strategy and make your settings more robust.
In the next chapter, we'll expand on this topic and go to asynchronous replication. You will learn how to use the transaction logs of PostgreSQL to replicate the entire DB instance.
The third chapter of PostgreSQL Replication Understanding Instant Recovery (4)