Data backup has always been considered a database of life, that is, one of the main skills of a DBA, this article is about the SQL Server backup principle, the SQL Server database is divided into data files and log files. To enable the database to recover a consistent point, the backup not only needs to copy the contents of the data data file, but also copies the contents of the log file. Then , depending on the target of each backup, we can divide the backup into data backup and log backup.
The scope of a data backup can be a complete database, a subset of databases, a set of files, or filegroups. Therefore, according to the scope of the backed up data file, it is divided into full database backup, file backup and partial backup.
Full database backup
A full database backup is a copy of all the information in the database, and with a single full backup, the database can be restored to a certain time state. However, because database backup is an online operation. A large full database backup can take up to one hours or longer. The database will also change during this time period. So a full database backup also backs up some of the transaction logs so that the database can be restored to a transactionally consistent state.
Full database backups are easy to use. It contains all the data in the database. For small data that can be backed up quickly, the best approach is to use a full database backup. However, as the database grows, full backups take more time to complete and require more storage space. Only a full backup may not meet the user's needs.
File backup
A file backup refers to the backup of all data in one or more files or filegroups. Under the full recovery model, a complete set of file backups combined with log backups across all file backups is equivalent to a full database backup. Using a file backup can speed up recovery by restoring only corrupted files without restoring the rest of the database. For example, if a database consists of several files located on a different disk, in which one of the disks fails, only the backup of the files on the failed disk needs to be restored, and the files on the other disks need not be restored. This will shorten the restore time.
Partial backup
A partial backup is a new feature in SQL Server2005. A partial backup is similar to a full database backup, but a partial backup defaults to only the parts that the database reads and writes, and the read-only files for the database are not backed up. Because the read-only part is not changed. Always going to back it up a bit wasteful. So a partial backup is useful when you want the backup to not include read-only filegroups.
A partial backup can be said to be an intermediate type between a database part and a file backup. If there is no read-only file in a database, then there is no difference between a partial backup and a database backup.
Database files are often very large. With the trend in the popular data set, databases with terabytes of storage are now commonplace. For such a database, to do a database backup, even if the file backup is a very expensive thing, may not be able to do every day. Then there's the background: differential backup.
From whether to copy all the data, data backup can be divided into full backup and differential backup.
Differential backups are based on differences, and backups require that a full backup be done before the database. A differential backup captures only data that has changed since the full backup. This full backup is referred to as the "benchmark" for differential backups. A differential backup includes only the data that was changed after the differential base. Differential backups are smaller and faster than differential baselines, making it easy to perform frequent backups, reducing the risk of data loss.
For full database backups, file backups, and partial backups, the 3 forms of data backup allow SQL Server to do full and differential backups. So, there are 6 different data backup modes: Full database backup, full file backup and full partial backup, differential database backup, differential file backup, and differential partial backup.
Data backup focuses on the backup of the data files. For log files, there is a corresponding transaction log backup. Each log backup includes a portion of the transaction log that was active when the backup was created, and all the log records that were not backed up in the previous log backup. An uninterrupted sequence of log backups contains a backup of the database (that is, a continuous) log chain. In the full recovery model (or at some point under the bulk-logged recovery model), a continuous log chain allows you to restore the database to any point in time.
After SQL Server2005, a new type of backup mode has been added, that is, copy-only backups. Copy-only backup is a SQL Server backup that is independent of the regular SQL backup sequence. Typically, making a backup changes the database and affects the restore sequence of subsequent backups. However, backups that are made for special purposes are sometimes useful in situations where the full database backup and restore process is not affected. To do this, SQL Server2005 introduces the following two copy-only backups:
1. Copy only full backups
Copying only full backups also backs up the entire database content. The difference between it and a normal full backup is that the differential backup benchmark does not change after it is done, so it does not affect the differential backup sequence.
2. Copy-only log backups
A copy-only log backup backs up only the existing content in the current log file, but does not truncate the log, so the next time you do a normal log backup, the content restore is backed up again, without affecting the sequence of regular log backups. This backup is mainly used on the database already has a backup scheduled task running, but now need to do a log backup, without affecting the original backup sequence.
The above two methods SSMS does not support the graphical operation, only need to add the Copy only option after the BACKUP statement
Now summarize the 11 main backup methods of SQL Server
Grade |
Data backup |
Log backup |
Database-level |
Full database backup |
Copy only full database backups |
Differential database backup |
General Log backup |
Copy-only log backups |
FILE-level |
Full file backup |
Copy only full file backups |
Differential file backup |
Part |
Full Partial backup |
Copy only full partial backups |
Differential partial backup |
There are many ways to backup, in fact we often use a few important ways.
First, the only way to replicate backups is to create a way to break the existing backup strategy for a backup that will be done, for example, for a database that has established a strict backup rule (such as log Shipping) and now needs to make a log backup to another folder. A normal log backup destroys the log chain maintained by the existing backup file system. Copy-only backups are not compromised. All this method is used only in the occasional special case. Not considered at the beginning of the specified backup strategy.
Second, in reality, few databases specialize in maintaining a read-only file or set of files. (This method is expensive to maintain and will only show an advantage on a very large database.) So part of the backup is rarely used.
So the above backup method can be simplified into a few more traditional, but also the most commonly used backup method
Grade |
Data backup |
Log backup |
Database-level |
Full database backup |
Differential database backup |
General Log backup |
FILE-level |
Full file backup |
Differential file backup |
Of course, there is a kind of violent backup method that is to copy the database file directly, and then use the file Attach (Attach) way to back up and restore the database, this way when the case is paralyzed, helpless can try to take, but this way is not recommended as a standard way.
There are several reasons for not recommending the following:
1, SQL Server at the time of operation, the file has been placed in an exclusive lock, through the general method is not directly copied files. Unless you pass some backup software, you can only stop the SQL Server service or shut down the database to back up files.
2, SQL Server in theory, only guaranteed by running the sp_detach_db statement to get the database file, it must be successfully attached. If the user obtains the file by pausing the SQL Server service or other method, SQL Server cannot guarantee that it will be attached.
3, some users only copy data files, do not copy the practice of log files, is very non-standard. It is easy to cause the database to not recover properly. Data loss.
Copy the method of the file is not sure to use, the author when the disaster recovery, if the database is not very large, will first do a database backup, in a file-level backup, in order to double insurance. The file copy occurs after SQL Server has been successfully shut down, or after sp_detach_db, and all files are backed up, including log files.
How to select a backup policy and recovery model
SQL Server provides enough technology to make a variety of database backups. As a database administrator, you should choose which backup method to use, depending on two issues:
1. How long can the database tolerate data loss?
2. How much manpower and resources will be invested to do database backup and recovery strategy?
In fact, in order to achieve the best results, we need more investment. This is especially true for database backup strategies. Regardless of the mirroring technology (including SQL Server's own database mirroring and physical disk-level mirroring), SQL Server cannot always make a database backup, with a certain interval between each backup. Data changes between this time interval are not protected until the next backup. So in the end, the maximum time period for data loss is the time interval between these two backups. protecting data with backup data recovery mechanisms is not a guarantee that data is not lost at all. if a user asks to ask for no data loss, it is necessary to communicate with the user so that they understand that it is unrealistic to use data backup technology only, and more investment is needed to introduce mirroring technology.
Since the maximum time period for data loss is two times between backups, the more backups are made, the less data is lost. However, the more frequently you do backups, the more you need to invest. The more factors involved are:
1. The more backups you have, the more backup files you want to manage, and the more files you will recover when the database is restored. A suitable system needs to be established.
2, although the backup will block the normal operation of the data, but will produce a series of disk read and write. If the server itself has more frequent IO, backup actions can further affect the performance of the database. It is necessary to enhance the read and write ability of the hard disk of the server to avoid this problem.
3, backup will inevitably because of a variety of factors fail. The more frequently you backup, the more chances you will encounter failure. Administrators are expected to handle errors in a timely manner, restoring the backup task to normal. This is also a higher requirement for administrators.
When you know what you're willing to invest in, you can decide what kind of backup strategy to use.
With log backups, you can recover a database to a failure or to a specific point in time. So log backups play a very important role in the backup strategy. However, log backups can only be performed on databases with the full recovery model and some bulk-logged recovery models.
To specify a backup strategy, first decide whether you want to make a log backup. If a log backup is required, the database recovery model is selected as full mode. (The bulk recovery model does not guarantee a successful log backup, so it is generally not recommended for use in a production environment.) If you do not log backup, the database schema is set to simple. Otherwise, you will encounter an infinite log file growth issue.
One, backup in simple recovery mode
A log backup cannot be done under the simple recovery model. So it only supports the simplest backup and restore modes. It's easy to manage. However, if you do not have a log backup, you can only restore the database to the end of the last backup. In the event of a disaster, modifications made after the last backup of the data are lost.
Shows the simplest backup and restore strategy under the simple Recovery model, which uses only full database backups that contain all the data in the database. There are five full database backups, but you only need to restore the most recent backup (a backup performed at T5 point in time), and restoring this backup restores the database to T5 point in time, because all subsequent updates represented by the T6 box are lost.
And in the simple recovery model, the transaction log is automatically truncated to remove the inactive virtual log files. Truncation usually occurs after each checkpoint, but in some cases it is delayed.
Under the simple recovery model, the risk of loss of work increases over time until the next full or differential backup. Therefore, it is advisable to schedule enough frequencies to avoid losing large amounts of data. At the same time, the frequency is not too high to make the backup difficult to manage.
Shows the risk of work loss for this backup plan. So this rough is only suitable for small databases that are used for frequent backups.
To mitigate the risk, SQL Server introduced a differential backup.
Shows a backup strategy that uses a differential database backup to complement a full database backup to mitigate the risk of work loss. After the first database backup, 3 of this differential backup was established consecutively. After the 3rd differential backup, a full database backup is made and a new differential benchmark is established. Because differential backups are typically less expensive than full backups, they can be run more often. Such a backup strategy can be used on a database with a slightly larger amount of data that can tolerate a longer loss of time.
The advantages of these two backup strategies are simple to manage, whether for backup or recovery. However, whether the database full backup, or differential backup, can not be more frequent frequency, usually only in the evening. If the database is large, or does not allow long-time data loss, such a backup strategy cannot meet the requirements. Log backups must be introduced.
Second, backup under the full recovery model
If the database is the full recovery model, you can use log backups. Because log backups only copy all of the log records since the last log backup, the overhead is much smaller than the database backup. You can define backups to be done at a very frequent frequency (5 minutes or less) to achieve maximum protection against loss of data in the event of a failure. The advantage of using log backups is that it allows you to restore the database to any point in the log backup ("Time-out recovery"). Assuming that the activity log can be backed up after a critical failure, the database may be restored to a point where no data loss occurred. The disadvantage of using log backups is that they are numerous, and when you restore backups, you need to restore them in exactly the order in which they were produced. There can be no backup missing or jumping in the middle. So the more log backups you make, the more time you restore. Management complexity is also higher.
Shows the simplest backup strategy under the full recovery model. Backup db_1 and two routine log backups log_1 and log_2 have been completed. At some time after the Log_2 log backup, the database fails. Before restoring these 3 backups, the database administrator must back up the activity log (the tail of the log). The db_1, log_1, and log_2 are then restored, and the database is not restored. The database administrator then restores and restores the tail (Tail) log backup. This step will allow the database to be restored to the point of failure. To recover all data. If the tail log can be successfully backed up and restored. This disaster may not even bring any data loss. If a disaster destroys a log file. So that the tail log can not be successfully backed up and restored, the disaster caused by the data loss is from Log_2 after all the changes.
Therefore, after the first full database backup is complete, and after the regular log backup is started, the potential job loss is at the time of the database corruption only. To the time of the last regular log backup, it is recommended that log backups be performed frequently to limit the risk of work loss to the extent permitted by the business requirements.
after a failure, you can try to back up the "log Tail" (the log that has not been backed up). if the tail-log backup succeeds, you can avoid any work loss by restoring the database to the point of failure. So the advantages of this backup plan are also obvious.
The backup policy is displayed and a series of routine log backups are used to supplement the full database backup. Using transaction log backups reduces the time that a potential work loss risk exists, so that the risk exists only after the most recent log backup. After the first database backup is complete, a differential database backup is made every day, and several log backups are made during business hours.
Before the first database backup is created, the database has a potential job loss risk (from time t0 to time T1). After the backup is established, routine log backups reduce the risk of work loss to the changes made since the most recent log backup (in this figure, the last backup time is t14). If a failure occurs, the database administrator should immediately attempt to back up the activity log (the tail of the log). If this "tail log backup" succeeds. The database can be restored to the point of failure.
However, there is a major flaw in the backup plan mentioned above, which is that there are too many log files to recover after a disaster. Suppose you make a log backup every hour, do a database backup once a week, and if the disaster happens in Friday, you have to recover hundreds of log backups. This amount of work and the time it takes is very large. Therefore, in order to minimize the restore time, the same database can be supplemented by a series of differential backups.
Third, file or filegroup backup
A full file backup refers to backing up all the data in one or more files or filegroups. Under the full recovery model, a complete set of full file backups and log backups of all file backups are combined. Equivalent to a full database backup. Using a file backup allows you to restore only corrupted files without restoring the rest of the database, thereby speeding up recovery. For example, if a database consists of several files located on a different disk, only the files on the failed disk should be restored if one of the disks fails.
In SQL Server7.0 and SQL Server2000, file backups and differential file backups do not contain log records. Log backups must be explicitly restored to recover their data. Therefore, in these two versions. Only file backups can be used in conjunction with the full and bulk-logged recovery models. After SQL Server 2005, file backups contain enough log records by default to roll forward files to the end of the backup operation. (However, in the simple recovery model, you must back up all read/write files together instead of specifying each read/write file or filegroup individually.) )
Compared to database backups, file backups have the following advantages:
1, can recover from the isolated media failure faster. You can quickly restore a damaged file.
2. File backups increase the flexibility of planning and media processing compared to full database backups (which can become difficult to manage for very large databases). Higher flexibility in file or filegroup backups is also useful for large databases that contain data with different update characteristics.
Compared to full database backups, the main disadvantage of file backup is that it is more complex to manage. If a damaged file is not backed up, a media failure can cause the entire database to be unrecoverable. Therefore, a complete set of file backups must be maintained, and one or more log backups must be maintained for the full/bulk-logged recovery model. These log backups cover at least the time interval between the first full file backup and the last full backup.
Maintaining and tracking these full backups is a time-consuming task that may require more space than a full database backup. So this kind of backup strategy in the actual application is still relatively small. And now storage has become very convenient, but this method in the management of large databases, can play its irreplaceable advantage.
Under the full recovery model, a complete set of files is backed up with sufficient log backups that cover all file backups starting with the first file backup equivalent to a full data backup.
Shows how the file backup works, and if it might be best to perform a full database backup and start the log backup before the first file backup begins. Shows the policy of performing a full database backup (at T1 time) immediately after the creation of the database (at t0 time). After you have created the first database backup, you can begin to perform transaction log backups. The transaction log backup schedule is performed at set intervals. File backups are performed at intervals that are most appropriate for the database business requirements. This figure shows 4 filegroups, one at a time, for each filegroup. Their backup order (a, C, B, a) reflects the business requirements of the database.
Under the full recovery model, restoring a filegroup backup requires not only restoring the filegroup backup itself, but also restoring all log backups from the last full database backup to the target point in time of recovery. To ensure that the file remains consistent with the rest of the database. So the number of transaction log backups to recover will be large. To avoid this scenario, consider using a differential file backup. However, this makes the entire backup plan more difficult to manage. This is the reason why file backups are not used very often. However, this may be the only option when managing a very large database.
Different libraries set up different backup schemes, which can be chosen by themselves.
SQL Server Enterprise Platform Management Practice book notes--about how SQL Server databases are backed up