Transaction Logs in SQL Server (V) - The Role of Logs in High Availability and Disaster Recovery


This is the fifth article in the series and a supplement to the previous four. If you are not familiar with the basic concepts of transaction logs, refer to the earlier articles in the series:

Transaction Logs in SQL Server (I) - Physical and Logical Architecture of the Transaction Log

Transaction Logs in SQL Server (II) - The Role of the Transaction Log in Data Modification

Transaction Logs in SQL Server (III) - The Role of the Log in the Simple Recovery Model

Transaction Logs in SQL Server (IV) - The Role of the Log in the Full Recovery Model

 

Introduction

Logs guarantee durability and data consistency: with the log, SQL Server can Undo and Redo data modifications. Logs therefore enable not only disaster recovery; log Redo is also the basis of high availability. This article describes the role logs play in SQL Server's high availability features and in disaster recovery.

 

Log corruption

Logs can be damaged by faults in the I/O subsystem. When that happens, a basic understanding of how logs work may let you rescue the data :-). Let's look at the recovery options in several log-corruption scenarios.

 

1. The database was shut down cleanly and the log is damaged.

When the database was shut down cleanly, log corruption matters far less: by that point, CheckPoint has written all dirty pages belonging to committed transactions to the physical disk, so there is no data inconsistency to repair. Therefore, as long as the MDF and NDF files are intact, you can attach the database directly with the FOR ATTACH_REBUILD_LOG option, as shown in Figure 1.

Figure 1. If the database was shut down cleanly, simply attach it
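For reference, a minimal sketch of such an attach (the file path is a placeholder for your own environment):

-- Attach the intact data file; SQL Server rebuilds a new, empty log file.
CREATE DATABASE AdventureWorks2012
ON (FILENAME = N'D:\Data\AdventureWorks2012.mdf')
FOR ATTACH_REBUILD_LOG;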

 

However, note that attaching the database this way recreates the log file from scratch: the new log is 0.5 MB (two VLFs) and auto-grows by 10%, so you should manually resize the log afterwards to avoid ending up with an excessive number of VLFs.
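A hedged sketch of that manual resize follows; the logical log file name and the sizes are placeholders (check yours in sys.database_files):

-- Grow the rebuilt log to a sensible size in one step to keep the VLF count low.
ALTER DATABASE AdventureWorks2012
MODIFY FILE (NAME = N'AdventureWorks2012_log', SIZE = 1024MB, FILEGROWTH = 256MB);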

 

2. The database was shut down abnormally and the log is damaged.

Before discussing this case, let's first look at the states a database can be in, as shown in Figure 2.

Figure 2. Database states

The transition rules among these states are beyond the scope of this article, but two of them are closely related to log corruption: RECOVERY_PENDING and SUSPECT.

If the database was not shut down cleanly, some data has not yet been written to disk by CheckPoint, so the database must go through the Recovery process before it can start. If the log is damaged, Recovery cannot run, and data inconsistency may result.

At this point, the database will be in one of the following two states (the query after this list shows how to check):

  • RECOVERY_PENDING: crash recovery needs to run but cannot start because a required resource is unavailable; for example, the log is completely damaged.
  • SUSPECT: crash recovery has started but cannot complete.
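You can confirm which state a database is actually in by querying sys.databases; a minimal sketch:

-- state_desc reports ONLINE, RECOVERY_PENDING, SUSPECT, EMERGENCY, and so on.
SELECT name, state_desc
FROM sys.databases
WHERE name = N'AdventureWorks2012';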

 

To handle this situation, you must first decide whether data loss is acceptable in your business environment. You can either restore from a backup or switch the database into the EMERGENCY state. EMERGENCY mode means the database skips the crash recovery phase; the database becomes accessible, but transactional inconsistency is possible. If the interrupted transactions touched only data pages, just a few pages may be inconsistent; if a transaction was modifying the table structure, however, the schema itself may end up inconsistent. If you do not have a suitable backup, this is the only way to get at the data. Putting a database into EMERGENCY mode is simple, as shown in Code List 1.

ALTER DATABASE AdventureWorks2012 SET EMERGENCY

Code List 1. Set the database to emergency mode

 

An option related to this mode is REPAIR_ALLOW_DATA_LOSS. With this option, the crash recovery process still runs, but it skips the damaged log records and repairs data consistency problems as far as possible; it then creates a new log file and brings the database ONLINE. A simple example of using this option is shown in Code List 2.

-- Repair requires exclusive access to the database.
ALTER DATABASE AdventureWorks2012 SET SINGLE_USER

DBCC CHECKDB(AdventureWorks2012, REPAIR_ALLOW_DATA_LOSS)

-- Restore normal access once the repair completes.
ALTER DATABASE AdventureWorks2012 SET MULTI_USER

Code List 2. Use the REPAIR_ALLOW_DATA_LOSS Option

 

It is worth noting that, as a DBA, you should always have a fallback (above all, good backups); the operations above should be considered only when you find yourself without adequate preparation.

 

3. The database is online and the log is damaged.

In this case, if the log SQL Server needs while running is corrupted (for example, during a rollback), SQL Server takes the database offline and puts it into the SUSPECT state.

If there is no backup, you can only consider using the EMERGENCY mode.

Another approach is to switch the database to the SIMPLE recovery model, manually issue a CheckPoint to truncate the log (which may clear out the damaged portion), and then switch the database back to the FULL recovery model. Note, however, that this method breaks the log backup chain.
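A hedged sketch of this workaround (the backup path is a placeholder; a new full backup is needed afterwards because the log chain is broken):

ALTER DATABASE AdventureWorks2012 SET RECOVERY SIMPLE
GO
USE AdventureWorks2012
GO
-- Under the SIMPLE model, a manual CheckPoint truncates the inactive log.
CHECKPOINT
GO
ALTER DATABASE AdventureWorks2012 SET RECOVERY FULL
GO
-- Restart the log backup chain.
BACKUP DATABASE AdventureWorks2012 TO DISK = N'D:\Backup\AdventureWorks2012_full.bak'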

 

 

The Role of Logs in High Availability

 

Database Mirroring and AlwaysOn

Both of these high availability technologies maintain a copy of a database based on the log. The log is shipped to the replica in real time and continuously REDOne there, keeping the replica synchronized with the primary database in near real time.

With database mirroring, the log is sent to a single mirror copy, either synchronously or asynchronously.

With an AlwaysOn availability group, the log can be shipped synchronously to up to two replicas plus asynchronously to two more (it is said that SQL Server 2014 doubles this, that is, up to four synchronous and up to four asynchronous replicas; since it has not been released yet, treat that as hearsay).

"Synchronous" here means that the primary replica can commit a transaction locally only after the log has been sent to the secondary replica and the secondary has returned an ACK, which can introduce significant latency and hurt performance. Note that the primary does not wait for the transaction to be committed on the secondary; it only needs to receive the ACK from the secondary.
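As a sketch, the commit mode can be set or inspected as follows (the database name is a placeholder):

-- Database mirroring: FULL safety means synchronous commit; OFF means asynchronous.
ALTER DATABASE AdventureWorks2012 SET PARTNER SAFETY FULL

-- AlwaysOn: check each replica's commit mode.
SELECT replica_server_name, availability_mode_desc
FROM sys.availability_replicas;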

With either of these high availability features, you must monitor the send queue and the REDO queue. A long send queue means a large amount of data could be lost during a failover, and it also prevents log truncation. A long REDO queue means that during a failover, the RECOVERY phase takes longer, increasing the downtime. Both queues can be monitored with Performance Monitor, as shown in Figure 3.

Figure 3. Monitoring queue counters
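For availability groups, the same queues are also exposed through a DMV; a minimal sketch:

-- Sizes are reported in KB; large values warn of potential data loss or a slow failover.
SELECT database_id, log_send_queue_size, redo_queue_size
FROM sys.dm_hadr_database_replica_states;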

 

Log Shipping

Compared with the other high availability features, log shipping is relatively simple: it is essentially an endless cycle of backing up, copying, and restoring the log. Log shipping is also very well suited to continuously verifying that your log backups are valid.
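Stripped of the agent jobs that automate it, one cycle looks roughly like this (the paths are placeholders):

-- On the primary: back up the log.
BACKUP LOG AdventureWorks2012 TO DISK = N'\\FileShare\LogShip\AW.trn'

-- (The backup file is then copied to the secondary server.)

-- On the secondary: restore it, leaving the database ready for the next log backup.
RESTORE LOG AdventureWorks2012 FROM DISK = N'\\FileShare\LogShip\AW.trn'
WITH NORECOVERY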

Another point worth noting about log shipping: before performing large batch operations, consider switching to the BULK_LOGGED recovery model to avoid shipping a huge volume of log over the network.

Log shipping is the easiest way to maintain a redundant copy of a database. Although the copy is not guaranteed to be real-time, it is very useful in certain business scenarios.

 

Transaction Replication

Unlike the previous high availability features, transactional replication cannot ship the log directly. The schema on the publisher and the subscriber can be completely different: a subscriber may subscribe to only one or a few tables, to some columns of a table, or to a subset of the rows. Because the table structure and data can differ between publisher and subscriber, the publisher's log cannot simply be replayed at the subscriber.

Therefore, transactional replication works by having the Log Reader Agent periodically read the publisher's log, extract the changes to the published articles, and turn them into logical operations that the subscriber can replay, thereby achieving data synchronization.

It is worth noting that if the Log Reader Agent has not yet scanned the latest modifications, transactional replication can prevent the log on the publisher from being truncated.
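A quick way to diagnose this is to check what is holding up log truncation on the publisher; a minimal sketch:

-- 'REPLICATION' means the Log Reader Agent is behind and the log cannot be truncated yet.
SELECT name, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'AdventureWorks2012';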

 

 

Summary

As a supplement to the previous four articles, this article described the principles and roles of logs in disaster recovery and high availability. Understanding these principles is essential for designing a good backup plan and a sound high availability strategy.
