Like anything else that contains valuable data, PostgreSQL databases should be backed up regularly. Although the procedure is essentially simple, it is important to understand the underlying techniques and assumptions.
There are three completely different methods to back up PostgreSQL data:
SQL dump
File System-level backup
Online backup
Each type of backup has its own advantages and disadvantages.
SQL dump
The idea behind the SQL dump method is to generate a text file of SQL commands which, when fed back to the server, will recreate the database in the same state as it was at the time of the dump. PostgreSQL provides the utility program pg_dump for this purpose. The basic usage of this command is:
pg_dump dbname > outfile
As you can see, pg_dump writes its result to standard output. We will see below how this can be useful.
pg_dump is a regular PostgreSQL client application (albeit a particularly clever one). This means that you can perform this backup procedure from any remote host that has access to the database. But remember that pg_dump does not operate with special permissions. In particular, it must have read access to all the tables you want to back up, so in practice you almost always have to run it as a database superuser.
To specify which database server pg_dump should contact, use the command line options -h host and -p port. The default host is the local host or whatever your PGHOST environment variable specifies. Similarly, the default port is indicated by the PGPORT environment variable or, failing that, by the compiled-in default. (Conveniently, the server will normally have the same compiled-in default.)
Like any other PostgreSQL client application, pg_dump will by default connect with the database user name that is equal to the current operating system user name. To override this, either specify the -U option or set the environment variable PGUSER. Remember that pg_dump connections are subject to the normal client authentication mechanisms, just like any other client application.
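For example, a dump taken from a remote server might look like the following; the host name, user name, and database name here are only placeholders:
pg_dump -h db.example.com -p 5432 -U backup_user mydb > mydb.sql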
The backups created by pg_dump are internally consistent; that is, updates made to the database while pg_dump is running will not be in the dump. pg_dump does not block other operations on the database while it is working. (Exceptions are those operations that need an exclusive lock, such as VACUUM FULL.)
Important: If your database schema relies on OIDs (for instance as foreign keys), you must instruct pg_dump to dump the OIDs as well. To do this, use the -o command line option. pg_dump also does not dump large objects by default; if you are using large objects, see the pg_dump reference page.
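For example, to include OIDs in a dump of a hypothetical database mydb:
pg_dump -o mydb > mydb.sql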
Recover from dump
The text files created by pg_dump are intended to be read by the psql program. The general command form to restore a dump is:
psql dbname < infile
where infile is what you used as outfile for the pg_dump command. The database dbname will not be created by this command; you must create it yourself from template0 before executing psql (for example, with createdb -T template0 dbname). psql supports options similar to pg_dump for specifying the database server to connect to and the user name to use. See the psql reference page for more information.
Before restoring, the target database and all the users who own objects in the dumped database, or were granted permissions on its objects, must already exist. If they do not, the restore will fail to recreate the objects with their original ownership and/or permissions. (Sometimes this is what you want, but usually it is not.)
Once the restore is complete, it is wise to run ANALYZE on each database so the query optimizer has useful statistics. An easy way is to run vacuumdb -a -z to VACUUM ANALYZE all databases; this is equivalent to running VACUUM ANALYZE manually.
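Putting these steps together, a complete restore of a hypothetical database mydb from a dump file mydb.sql could look roughly like this (the names are placeholders; adjust connection options as needed):
createdb -T template0 mydb
psql mydb < mydb.sql
vacuumdb -a -z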
pg_dump and psql can write to or read from pipes, which makes it possible to dump a database directly from one server to another, for example:
pg_dump -h host1 dbname | psql -h host2 dbname
Important: The dumps produced by pg_dump are relative to template0. This means that any languages, procedures, and so on that were added to template1 will also be dumped by pg_dump. As a result, when restoring, if you are using a customized template1, you must create the empty database from template0, as in the example above.
For advice on how to load large amounts of data into PostgreSQL efficiently, see the documentation on populating a database.
Use pg_dumpall
The above method is awkward and inconvenient when backing up an entire database cluster, so the pg_dumpall program is provided. pg_dumpall backs up each database in a given cluster, and also preserves cluster-wide data such as users and groups. The basic usage of this command is:
pg_dumpall > outfile
The resulting dump can be restored with psql:
psql template1 < infile
(Actually, you can specify any existing database to connect to, but if you are loading into an empty cluster, template1 is usually your only choice.) Restoring a pg_dumpall dump always requires database superuser privileges, because it is needed to restore the user and group information.
Handling large databases
Since PostgreSQL allows tables to be larger than the maximum file size your operating system allows, it may be impossible to dump such a table to a file, because the resulting file could exceed the maximum size allowed by your system. Since pg_dump writes to standard output, you can use standard Unix tools to work around this problem:
Use compressed dumps. Use your favorite compression program, for example gzip:
pg_dump dbname | gzip > filename.gz
Run the following command to restore data:
createdb dbname
gunzip -c filename.gz | psql dbname
Or
cat filename.gz | gunzip | psql dbname
Use split. The split command allows you to split the output into pieces that are acceptable in size to the underlying file system. For example, to make chunks of 1 megabyte:
pg_dump dbname | split -b 1m - filename
Run the following command to restore data:
createdb dbname
cat filename* | psql dbname
Use the custom dump format. If PostgreSQL was built on a system with the zlib compression library installed, the custom dump format will compress data as it writes it to the output file. It produces dump files of roughly the same size as gzip, with the added advantage that tables can be restored selectively. The following command dumps a database using the custom dump format:
pg_dump -Fc dbname > filename
A custom-format dump is not a script and cannot be fed to psql; instead it must be restored with pg_restore. See the pg_dump and pg_restore reference pages for details.
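As a sketch of how selective restoration could work, assuming the dump was made in the custom format and the target database already exists (the database and table names here are invented for illustration):
pg_restore -d mydb mydb.dump             # restore the entire dump
pg_restore -d mydb -t orders mydb.dump   # or restore only the table "orders"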
Note:
For reasons of backward compatibility, pg_dump does not dump large objects by default. To dump large objects you must use either the custom or the tar output format and give the -b option to pg_dump. See the pg_dump reference page for details. The directory contrib/pg_dumplo in the PostgreSQL source tree also contains a program that can dump large objects.
File system level backup
An alternative backup strategy is to directly copy the files that PostgreSQL uses to store the data in the database, for example:
tar -cf backup.tar /usr/local/pgsql/data
However, this method is subject to two restrictions that make it less practical, or at least inferior to the pg_dump method:
The database server must be shut down in order to get a usable backup. Half-measures such as disallowing all connections will not work, because there is always some buffered data that has not been written out. (This is mainly because tar and similar tools do not take an atomic snapshot of the state of the file system at backup time.) A minimal sketch of this procedure is given after this list.
If you have dug into the details of the file system layout of the data, you may be tempted to back up or restore only certain individual tables or databases from their respective files or directories. This will not work, because the information contained in these files is only half the story. The other half is in the commit log files pg_clog/*, which contain the commit status of all transactions. A table file is only usable together with this information. Of course it is also futile to try to restore only a table and its associated pg_clog data, because that would render all the other tables in the database cluster useless. So file system backups only work for complete restoration of an entire database cluster.
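For the first restriction above (shutting the server down), a minimal sketch might look like the following, assuming the data directory is /usr/local/pgsql/data and pg_ctl is on the PATH:
pg_ctl -D /usr/local/pgsql/data stop
tar -cf backup.tar /usr/local/pgsql/data
pg_ctl -D /usr/local/pgsql/data start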
An alternative file system backup approach is to make a "consistent snapshot" of the data directory, if the file system supports that functionality (and you are willing to trust that it is implemented correctly). The typical procedure is to make a "frozen snapshot" of the volume containing the database, then copy the whole database directory (not just parts; see above) from the snapshot to a backup device, and then release the frozen snapshot. This will work even while the database server is running. However, a backup created this way saves the database files in a state in which the server was not properly shut down; therefore, when you start the database server on the backed-up data, it will think the server had crashed and will replay the WAL log. This is not a problem, just be aware of it (and be sure to include the WAL files in your backup).
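As a rough illustration only (the volume group, snapshot size, and mount point below are assumptions, and the details depend entirely on your snapshot-capable volume manager or file system), a frozen-snapshot backup using LVM could look like:
lvcreate --size 1G --snapshot --name pg_snap /dev/vg00/pgdata
mount /dev/vg00/pg_snap /mnt/pg_snap
tar -cf backup.tar -C /mnt/pg_snap .
umount /mnt/pg_snap
lvremove -f /dev/vg00/pg_snap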
If your database is spread across multiple volumes (for example, data files and WAL logs on different disks), there may be no way to obtain exactly simultaneous frozen snapshots of all the volumes. Read your file system documentation very carefully before trusting the consistent-snapshot technique in such a situation. The safest approach is to shut down the database server for long enough to establish all the frozen snapshots.
Note also that a file system backup will not necessarily be smaller than an SQL dump; on the contrary, it will most likely be larger. (pg_dump does not need to dump the contents of indexes, for example, just the commands to recreate them.)
Online backup
At all times, PostgreSQL maintains a write-ahead log (WAL) in the pg_xlog/ subdirectory of the cluster's data directory. The log records every change made to the database's data files. This log exists primarily for crash-safety purposes: if the system crashes, the database can be restored to consistency by "replaying" the log entries made since the last checkpoint. However, the existence of the log makes it possible to use a third strategy for backing up databases: we can combine a file system backup with backup of the WAL files. If recovery is needed, we restore the backup and then replay the backed-up WAL files to bring the backup up to current time. This approach is obviously more complex to administer than either of the previous approaches, but it has some significant benefits:
We do not need a perfectly consistent backup as the starting point. Any internal inconsistency in the backup will be corrected by log replay (this is not significantly different from what happens during crash recovery). So we do not need file system snapshot capability, just tar or a similar archiving tool.
Since we can string together an indefinitely long sequence of WAL files for replay, continuous backup can be achieved simply by continuing to archive the WAL files. This is particularly valuable for large databases, where it may not be convenient to take a full backup frequently.
We do not have to replay the WAL entries all the way to the end. We could stop the replay at any point of interest and have a consistent snapshot of the database as it was at that time. Thus, this technique supports point-in-time recovery: it is possible to restore the database to its state at any time since the base backup was taken.
If we continuously feed the series of WAL files to another machine that has been loaded with the same base backup file, we have a "hot standby" system: at any point we can bring up the second machine, and it will have a nearly current copy of the database.
As with the plain file system backup technique, this method can only support restoration of an entire database cluster, not a subset. Also, it requires a lot of archival storage: the base backup may be bulky, and a busy system will generate many megabytes of WAL traffic that must be archived. Still, it is the preferred backup technique in many situations where high reliability is needed.
To recover successfully using an online backup, you need a continuous sequence of archived WAL files that extends back at least as far as the start time of your backup. So to get started, you should set up and test your procedure for archiving WAL files before you take your first base backup. Accordingly, we first discuss the mechanics of archiving WAL files.
1. Setting up WAL archiving
In an abstract sense, a running PostgreSQL system produces an indefinitely long sequence of WAL records. Physically, the system divides this sequence into WAL segment files, which are normally 16 MB apiece (the size can be altered when building PostgreSQL). The segment files are given numeric names that reflect their position in the abstract WAL sequence. When not using WAL archiving, the system normally creates just a few segment files and then "recycles" them by renaming no-longer-needed segment files to higher segment numbers. It is assumed that a segment file whose contents precede the latest checkpoint is no longer of interest and can be recycled.
When archiving WAL data, we want to capture the contents of each segment file once it is filled, and save that data somewhere before the segment file is recycled for reuse. Depending on the application and the available hardware, there could be many different ways of "saving the data somewhere": we could copy the segment files to an NFS-mounted directory on another machine, write them onto a tape drive (making sure you have a way of restoring the files with their original names), batch them together and burn them onto CDs, or something else entirely. To give the database administrator as much flexibility as possible, PostgreSQL tries not to make any assumptions about how the archiving will be done. Instead, PostgreSQL lets the administrator specify a shell command to be executed to copy a completed segment file to wherever it needs to go. The command could be as simple as a cp, or it could invoke a complex shell script; it is all up to the administrator.
The shell command to use is specified by the archive_command configuration parameter, which in practice will always be placed in the postgresql.conf file. In this string, %p is replaced by the absolute path of the file to archive, while %f is replaced by the file name only. Write %% if you need to embed an actual % character in the command. The simplest useful command is something like:
archive_command = 'cp -i %p /mnt/server/archivedir/%f </dev/null'
which copies archivable WAL segments to the directory /mnt/server/archivedir. (This is an example, not a recommendation, and may not work on all platforms.)
The archive command is executed under the ownership of the same user that the PostgreSQL server is running as. Since the archived WAL files effectively contain everything in your database, you will want to be sure that the archived data is protected from prying eyes; for example, archive into a directory that does not have group or world read access.
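For example, assuming the server runs as the operating system user postgres (a common but not universal setup), the archive directory could be prepared like this:
mkdir -p /mnt/server/archivedir
chown postgres /mnt/server/archivedir
chmod 700 /mnt/server/archivedir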
It is important that the archive command return a zero exit status if and only if it succeeded. On getting a zero result, PostgreSQL will assume that the WAL segment file has been successfully archived, and may later remove it or recycle it by overwriting it with new data. A nonzero status, however, tells PostgreSQL that the file was not archived, so it will retry periodically until the file is archived successfully.
The archive command should generally be designed to refuse to overwrite any pre-existing archive file. This is an important safety feature that preserves the integrity of your archive in case of administrator error (such as sending the output of two different servers to the same archive directory). It is advisable to test your proposed archive command to ensure that it indeed does not overwrite an existing file, and that it returns a nonzero status in this case. We have found that cp -i does this correctly on some platforms but not on others. If the chosen command does not itself handle this case correctly, you should add a command that tests in advance for the existence of the archive file. For example, something like:
archive_command = 'test ! -f .../%f && cp %p .../%f'
works correctly on almost all Unix variants.
While designing your archiving setup, consider what will happen if the archive command fails repeatedly because some aspect requires operator intervention or the archive runs out of space. This could happen, for example, if you write to a tape drive without an autochanger; when the tape fills, nothing further can be archived until the tape is swapped. You should ensure that any error condition, or any request for operator intervention, is reported appropriately so that the situation can be resolved reasonably quickly. Otherwise the pg_xlog/ directory will keep filling with WAL segment files until the situation is resolved.
The speed of the archiving command is unimportant, as long as it can keep up with the average rate at which your server generates WAL data. Normal operation continues even if the archiving process falls a little behind. If archiving falls significantly behind, it will increase the amount of data that would be lost in the event of a disaster. It will also mean that the pg_xlog/ directory will contain large numbers of not-yet-archived segment files, which could eventually exceed the available disk space. You are advised to monitor the archiving process to ensure that it is working as you intend.
If you are concerned about being able to recover right up to the current instant, you may want to take several extra steps to ensure that the current, partially filled WAL segment is also copied somewhere. This is particularly important if your server generates only little WAL traffic (or has slack periods where it does so), since it could take a long time before a WAL segment file is completely filled and ready to archive. One possible way to handle this is to set up a cron job that periodically (once a minute, perhaps) identifies the current WAL segment file and saves it to a safe place. The combination of the archived WAL segments and the saved current segment is then enough to ensure that you can always restore to within a minute of current time. This behavior is not built into PostgreSQL, because we did not want to complicate the definition of archive_command by requiring it to keep track of successively archiving the same WAL file with different contents at different times. archive_command is only invoked on completed WAL segments; except in the case of retrying a failure, it will be called only once for any given file name.
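One way to sketch such a cron job is an entry like the following in the crontab of the user the server runs as; the paths are placeholders, and this simplistic version just assumes the newest file under pg_xlog/ is the current segment:
* * * * * cp "$(ls -1t /usr/local/pgsql/data/pg_xlog/0* | head -n 1)" /mnt/server/current_wal/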
When writing your archive command, you should assume that the file names to be archived may be up to 64 characters long and may contain any combination of ASCII letters, digits, and dots. It is not necessary to remember the original full path (%p), but it is necessary to remember the file name (%f).
Note that although WAL archiving will allow you to restore any modifications made to the data in your PostgreSQL database, it will not restore changes made to configuration files (that is, postgresql.conf, pg_hba.conf, and pg_ident.conf) after the initial base backup, since those are edited manually rather than through SQL operations. You may therefore wish to keep the configuration files in a location that is covered by your normal file system backup procedures.
2. Making a base backup
The procedure for making a base backup is relatively simple:
Ensure that WAL archiving is enabled and working.
Connect to the database as a database superuser and issue the command:
SELECT pg_start_backup('label');
where label is any string you want to use to uniquely identify this backup operation. (One good practice is to use the full path of where you intend to put the backup dump file.) pg_start_backup creates a backup label file, called backup_label, in your cluster directory, containing information about your backup.
It does not matter which database within the cluster you connect to in order to issue this command. You can ignore the result returned by the function; but if it reports an error, deal with that before proceeding.
Perform the backup, using any convenient file system backup tool such as tar or cpio. It is neither necessary nor desirable to stop normal operation of the database while you do this.
Again connect to the database as a database superuser, and issue the command:
SELECT pg_stop_backup();
If this returns successfully, you are done.
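Put together, a base backup could be scripted roughly as follows; the label, archive path, and data directory are placeholders, and the tar options anticipate the advice about pg_xlog/ given below:
psql -U postgres -d template1 -c "SELECT pg_start_backup('/mnt/server/backups/base.tar');"
tar --exclude=pg_xlog -cf /mnt/server/backups/base.tar -C /usr/local/pgsql data
psql -U postgres -d template1 -c "SELECT pg_stop_backup();"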
It is not necessary to be very concerned about the amount of time elapsed between pg_start_backup and the start of the actual backup, nor between the end of the backup and pg_stop_backup; a few minutes' delay will not hurt anything. You must, however, make sure that these operations are carried out in sequence and do not overlap.
Be certain that your backup dump includes all of the files under the database cluster directory (for example, /usr/local/pgsql/data). If you are using tablespaces that do not reside underneath this directory, be careful to include them as well (and be sure that your backup dump archives symbolic links as links; otherwise the restore will mess up your tablespaces).
You may, however, omit the pg_xlog/ subdirectory of the cluster directory from the backup dump. This slight complication is worthwhile because it reduces the risk of mistakes during recovery. It is easy to arrange if pg_xlog/ is a symbolic link pointing to somewhere outside the cluster directory, which is a common setup anyway for performance reasons.
To make use of this backup, you will need to keep all the WAL segment files generated during and after the file system backup. To aid you in doing this, the pg_stop_backup function creates a backup history file that is immediately stored into the WAL archive area. This file is named after the first WAL segment file that you need when using the backup. For example, if the starting WAL file is 000000000001234000055cd, the backup history file will be named something like 000000000001234000055cd.007c9330.backup. (The second part of the file name stands for an exact position within the WAL file and can ordinarily be ignored.) Once you have safely archived the backup dump file, you can delete all archived WAL segments whose numeric names precede this one. The backup history file is just a small text file. It contains the label string you gave to pg_start_backup, as well as the starting and ending times of the backup. If you used the label to identify where the associated dump file is kept, then the archived history file is enough to tell you which dump file to restore, should that become necessary.
Since you have to keep around all the archived WAL files back to your last base backup, the interval between base backups is usually chosen based on how much storage you want to spend on archived WAL files. You should also consider how long you are prepared to spend on recovery: if recovery is needed, the system will have to replay all those WAL segments, and that could take a while if it has been a long time since the last base backup.
It is also worth noting that the pg_start_backup function creates a file named backup_label in the database cluster directory, which is then removed again by pg_stop_backup. This file will of course be archived as part of your backup dump file. The backup label file includes the label string you gave to pg_start_backup, the time at which pg_start_backup was run, and the name of the starting WAL file. In case of confusion, it is therefore possible to look inside a backup dump file and determine exactly which backup session the dump file came from.
It is also possible to make a backup dump while the postmaster is stopped. Under those circumstances you obviously cannot use pg_start_backup or pg_stop_backup, so you are left to keep track on your own of which backup dump is which and how far back the associated WAL files go. It is generally better to follow the online backup procedure above.
3. Recovering from an online backup
Okay, the worst has happened and you need to recover from your backup. Here is the procedure:
Stop the postmaster, if it is still running.
If you have enough space, copy the whole cluster data directory and any tablespaces to a temporary location in case you need them later. Note that this precaution requires that you have enough free space on your system to hold two copies of your existing database. If you do not have enough space, you must at least copy the contents of the pg_xlog subdirectory of the cluster data directory to a safe place, as it may contain logs that were not archived when the system went down.
Then clean out all existing files under the cluster data directory, and under the root directories of any tablespaces you are using.
Restore the database files from your backup dump. Be careful that they are restored with the right ownership (the database system user, not root!) and with the right permissions. If you are using tablespaces, you may want to verify that the symbolic links in pg_tblspc/ were correctly restored.
Remove any files still present in pg_xlog/; these came from the backup dump and are therefore probably older than the current ones. If you did not archive pg_xlog/ at all, then recreate it, along with the subdirectory pg_xlog/archive_status/.
If you saved unarchived WAL segment files in step 2, copy them into pg_xlog/. (It is best to copy them rather than move them, so you still have the unmodified files if something goes wrong and you have to start over.)
Create a recovery command file recovery.conf in the cluster data directory (see the recovery settings below). You may also want to temporarily modify pg_hba.conf to prevent ordinary users from connecting until you are sure the recovery has worked.
Start the postmaster. The postmaster will go into recovery mode and proceed to read through the archived WAL files it needs. Upon completion of the recovery process, the postmaster will rename recovery.conf to recovery.done (to prevent accidentally re-entering recovery mode after a later crash) and then commence normal database operations.
Inspect the contents of the database to make sure you have recovered to where you want to be. If not, return to step 1. If all is well, restore pg_hba.conf to normal and let your users log in.
The key part of all this is to set up a recovery command file that describes how you want to recover and how far the recovery should run. You can use recovery.conf.sample (normally installed in the share/ subdirectory of the installation directory) as a prototype. The one thing that you absolutely must specify in recovery.conf is restore_command, which tells the system how to retrieve archived WAL file segments. Like archive_command, this is a shell command string. It may contain %f, which is replaced by the name of the desired log file, and %p, which is replaced by the absolute path to copy the log file to. Write %% if you need to embed an actual % character in the command. The simplest useful command is something like:
restore_command = 'cp /mnt/server/archivedir/%f %p'
which will copy previously archived WAL segments from the directory /mnt/server/archivedir. You could of course use something much more complicated, perhaps even a shell script that asks the operator to mount the appropriate tape.
It is important that the command return a nonzero exit status on failure. The system will ask for log files that are not present in the archive; when asked for such a file, it must return nonzero. This is not an error condition. Be aware also that the base name of the %p path will differ from %f; do not expect them to be interchangeable.
WAL segments that cannot be found in the archive will be sought in pg_xlog/; this allows use of recent segments that have not yet been archived. However, segments that are available from the archive will be used in preference to files in pg_xlog/. The system will not overwrite the existing contents of pg_xlog/ when retrieving archived files.
Normally, recovery will proceed through all available WAL segments, thereby restoring the database to the current point in time (or as close to it as the available WAL segments allow). But if you want to recover to some previous point in time (say, right before the junior DBA dropped your main transaction table), just specify the required stopping point in recovery.conf. You can specify the stopping point, known as the "recovery target", either by date/time or by the completion of a specific transaction ID. As of this writing only the date/time option is very usable, since there are no tools to help you identify with any accuracy which transaction ID to use.
Note: The stopping point must be after the ending time of the base backup (that is, the time of pg_stop_backup). You cannot use a base backup to recover to a time when that backup was still in progress. (To recover to such a time, you must go back to your previous base backup and roll forward from there.)
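As an illustration, a minimal recovery.conf that stops the replay at a particular moment might contain no more than the following; the path and the timestamp are placeholders:
restore_command = 'cp /mnt/server/archivedir/%f %p'
recovery_target_time = '2005-01-10 12:00:00'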
3.1. Recovery settings
These settings can only be made in recovery.conf, and apply only for the duration of the recovery. They must be reset for any subsequent recovery you wish to perform. They cannot be changed once recovery has begun.
restore_command (string)
The shell command to execute to retrieve an archived segment of the WAL file series. This parameter is required. Any %f in the string is replaced by the name of the file to retrieve from the archive, and any %p is replaced by the absolute path to copy it to on the server. Write %% to embed an actual % character in the command.
It is important for the command to return a zero exit status only if it succeeds. The command will be asked for file names that are not present in the archive; it must return nonzero when so asked. Examples:
restore_command = 'cp /mnt/server/archivedir/%f "%p"'
restore_command = 'copy /mnt/server/archivedir/%f "%p"'  # Windows
recovery_target_time (timestamp)
This parameter specifies the time stamp up to which recovery will proceed. At most one of recovery_target_time and recovery_target_xid can be specified. The default is to recover to the end of the WAL log. The precise stopping point is also influenced by recovery_target_inclusive.
recovery_target_xid (string)
This parameter specifies the transaction ID up to which recovery will proceed. Keep in mind that while transaction IDs are assigned sequentially at transaction start, transactions can complete in a different numeric order. The transactions that will be recovered are those that committed before (and optionally including) the specified one. At most one of recovery_target_xid and recovery_target_time can be specified. The default is to recover to the end of the WAL log. The precise stopping point is also influenced by recovery_target_inclusive.
recovery_target_inclusive (boolean)
Specifies whether to stop just after the specified recovery target (true), or just before it (false). Applies to both recovery_target_time and recovery_target_xid, whichever one is specified. It controls whether transactions having exactly the target commit time or transaction ID, respectively, will be included in the recovery. The default is true.
recovery_target_timeline (string)
Specifies recovery into a particular timeline. The default is to recover along the same timeline that was current when the base backup was taken. You only need to set this parameter in complex re-recovery situations, where you need to return to a state that was itself reached after a point-in-time recovery. See the discussion of timelines below.
4. Timelines
The ability to restore the database to a previous point in time creates some complexities that are akin to science-fiction stories about time travel and parallel universes. In the original history of the database, perhaps you dropped a critical table on Tuesday evening. Unfazed, you got out your backup and restored to a point in time earlier that evening. In this history of the database universe, you never dropped the table at all. But suppose you later realize this was not such a great idea after all, and would like to return to some point later in the original history. You cannot do that, because while the database was running it overwrote some of the WAL segment files that led up to the point you now wish to return to. So you really need to distinguish the series of WAL records generated after a point-in-time recovery from those that were generated in the original database history.
To deal with these problems, PostgreSQL has a notion of timelines. Each time you recover to a point in time earlier than the end of the WAL sequence, a new timeline is created to identify the series of WAL records generated after that recovery. (If recovery proceeds all the way to the end of WAL, however, a new timeline is not started; we just extend the existing one.) The timeline ID is part of WAL segment file names, so a new timeline does not overwrite the WAL data generated by previous timelines. In fact, it is possible to archive many different timelines. While that might seem like a useless feature, it can often be a lifesaver. Consider the situation where you are not quite sure what point in time to recover to, and so have to do several point-in-time recoveries by trial and error until you find the best place to branch off from the old history. Without timelines this process would soon generate an unmanageable mess. With timelines, you can recover to any prior state, including states in timeline branches that you later abandoned.
Every time a new timeline is created, PostgreSQL creates a "timeline history" file that shows which timeline it branched off from and when. These history files are necessary to allow the system to pick the right WAL segment files when recovering from an archive that contains multiple timelines. Therefore, they are archived into the WAL archive area just like WAL segment files. The history files are just small text files (unlike the segment files, which are large), so it is cheap and appropriate to keep them around indefinitely. You can, if you like, add comments to a history file to record how and why this particular timeline came to be. Such comments are especially valuable when you have a thicket of different timelines to choose from and analyze.
The default behavior of recovery is to recover along the same timeline that was current when the base backup was taken. If you want to recover into some child timeline (that is, you want to return to a state that was itself generated after a recovery attempt), you need to specify the target timeline ID in recovery.conf. You cannot recover into timelines that branched off earlier than the base backup.
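For example, if a previous recovery attempt created a child timeline with ID 2 (the ID here is hypothetical; the archived timeline history files show the real ones), recovery into it could be requested with:
recovery_target_timeline = '2'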
5. Caveats
At this writing, there are several limitations of the online backup technique. These will probably be fixed in future releases:
Operations on non-B-tree indexes (hash, R-tree, and GiST indexes) are not presently WAL-logged, so replay will not update these index types. The recommended workaround is to manually REINDEX each such index after completing a recovery operation.
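For example, a GiST index could be rebuilt with a command like the following; the index name is a placeholder:
REINDEX INDEX my_gist_idx;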
It should also be noted that the present WAL format is quite bulky, since it includes many disk page snapshots. These page snapshots are needed for crash recovery, because we may have to fix partially written disk pages. PITR operations, however, do not need to store so many page copies. One area for future development is to compress archived WAL data by removing the unnecessary page copies.