SQL Server AlwaysOn Concept Summary

Last Update:2017-10-24 Source: Internet

Author: User

Tags database issues failover readable

Developer on Alibaba Coud: Build your first app with APIs, SDKs, and tutorials on the Alibaba Cloud. Read more ＞

First, the AlwaysOn concept

An availability group supports a failover environment for a discrete set of user databases, known as availability databases, which collectively implement failover. An availability group supports a set of primary databases and a corresponding secondary database of one to eight groups (including one primary replica and two synchronous-commit secondary replicas). The secondary database is not a backup, and you should continue to back up your database and its transaction logs regularly.

Each set of availability databases is hosted by an availability replica. There are two types of availability replicas: one primary replica and one to four secondary replicas. It hosts the primary database and one to eight secondary replicas, each of which hosts a set of secondary databases and serves as a potential failover target for the availability group. The availability group fails over at the availability replica level . Availability replicas provide redundancy at the database level only-for that group of databases in an availability group. Failover is not caused by database issues such as database corruption due to loss of data files or transaction log, which makes databases a suspect database.

The primary replica makes the primary database available to read and write connections to clients. In addition, it is used in the process called "Data Synchronization" and is synchronized at the database level. The primary replica sends transaction log records for each primary database to each secondary database. Each secondary replica caches transaction log records ("hardened" logs) and then applies them to the appropriate secondary database. The primary database synchronizes data independently with each connected secondary database. Therefore, one secondary database can hang or fail without affecting other secondary databases, and one primary database can hang or fail without affecting other primary databases.

Alternatively, you can configure one or more secondary replicas to support read-only access to the secondary database, and you can configure any secondary replicas to allow backups of the secondary database.

Deploying always on availability groups requires a Windows Server failover cluster (WSFC) cluster. Each availability replica for a given availability group must be on a different node in the same WSFC cluster. The only exception is when migrating to another WSFC cluster, at which point an availability group may temporarily span two of clusters.

Create a WSFC resource group for each availability group that you create. The WSFC cluster monitors this resource group to assess the health of the primary replica. The quorum for Always on availability groups is based on all nodes in the WSFC cluster, regardless of whether a given cluster node hosts any availability replicas. In contrast to database mirroring, there is no witness server role in always on availability groups.

Ii. availability Mode

The availability mode is a property of each availability replica, and the availability mode determines whether the primary replica needs to wait for the secondary replica to write the transaction log to disk.

1. Asynchronous Commit mode

Asynchronous-commit mode is a disaster-recovery solution that is suitable for situations where the availability replicas are distributed far away. If each secondary replica is running in asynchronous-commit mode, the primary replica does not wait for any secondary replicas to force write to the log, and the transaction acknowledgement is sent to the client immediately after the log record is written to the local log file. The primary replica uses the minimum transaction lag run associated with the secondary replica configured for asynchronous-commit mode.

Under asynchronous commit mode, the secondary replica is never synchronized with the primary replica. Although a given secondary database may catch up to the corresponding primary database, any secondary database may fall behind at any point in time. Asynchronous-commit mode can be useful for scenarios where the primary and secondary replicas are far apart and you do not want a small error to affect the primary replica's disaster recovery scenario, or if performance is more important than synchronous data protection. Also, because the primary replica does not wait for confirmation from the secondary replica, the problem on the secondary replica never affects the primary replica.

The asynchronous-commit secondary replica tries to match the log records that receive the autonomous copy. However, asynchronous-commit secondary databases tend to remain unsynchronized and may lag slightly behind the corresponding primary database. Typically, this time difference between the asynchronous-commit secondary database and the corresponding primary database is small. However, if the server that hosts the secondary replica is running too high or the network is slow, the time difference becomes larger.

The only form of failover supported by the asynchronous commit pattern is forced failover, which can result in data loss. Forced failover is a last resort where only the current primary replica remains unavailable for a long time and the immediate availability of the primary database is more important than the risk of data loss. The failover target must be a copy of its role in the secondary or resolving state. The failover target is converted to the primary role, and its database copy becomes the master database. Any remaining secondary databases and the previous primary databases that become available will be suspended until you manually restore them individually. In asynchronous-commit mode, any transaction logs that the original primary replica has not yet sent to the previous secondary replica are lost. This means that some or all of the new primary databases may be missing the most recently committed transaction

2. Synchronous Commit Mode

Synchronous-commit mode emphasizes high availability in relation to performance, and the cost of this is increased transaction latency. In synchronous-commit mode, the transaction waits until the secondary replica has forced the log to be written to disk before the transaction acknowledgement is sent to the client.

In synchronous-commit availability mode, after a replica is joined to an availability group, the secondary database is consistent with the corresponding primary database and enters the SYNCHRONIZED (synchronized) state. As long as data synchronization is ongoing, the secondary database remains in SYNCHRONIZED state. This ensures that each transaction submitted to the primary database is also applied to the corresponding secondary database. After each secondary database on the secondary replica is synchronized, the synchronization run state of the secondary replica will be HEALTHY overall.

Attention:

1. If the asynchronous-commit availability mode is configured for the current primary replica, all secondary replicas are collected asynchronously committing transactions, regardless of their respective availability modes, so the primary and secondary replicas need to configure synchronous-commit mode to ensure synchronous-commit mode.

2. If the primary replica timed out with a synchronous secondary session, temporarily switch the secondary replica to asynchronous-commit mode. After the secondary replica is reconnected with the primary replica, they will resume synchronous-commit mode.

Three, the mode of failure transfer

The lead color and worker roles of an availability replica are usually interchangeable during the process known as failover. There are three types of failover: automatic failover (no data loss), planned manual failover (no data loss), and forced manual failover (possibly losing data). The last form is often referred to as a "forced failover"

1. Conditions required for automatic failover

Automatic failover occurs only under the following conditions:

There is an automatic failover set. This automatic failover set is made up of primary and secondary replicas (automatic failover targets), and both primary and secondary replicas are configured for synchronous-commit mode and are set to automatic failover. Automatic failover does not occur if the primary replica is set to manual failover, even if the secondary replica is set to auto-fail over
The automatic failover target has a healthy synchronization state (this indicates that each secondary database on the failover target is synchronized with its corresponding primary database).
The Windows Server failover cluster (WSFC) cluster has quorum.
The primary replica has become unavailable and the failover condition level defined by a flexible failover policy has been met.

Attention:

1. At the database level, database issues such as database corruption due to loss of data files, deletion of databases, or corrupt transaction logs do not cause an availability group to fail over

2. AlwaysOn Availability Groups monitor the health of the two replicas in an automatic failover set. If either replica fails, the health state of the availability group is set to critical. If the secondary replica fails, automatic failover is not feasible because the automatic failover target is not available. If the primary replica fails, the availability group fails over to the secondary replica. There will be no automatic failover targets until the previous primary replica goes online. In either case, in order to ensure availability in the unlikely event of a continuous failure, we recommend that you configure the other secondary replicas as an automatic failover target.

3. To set the failover mode to "automatic" is the premise that the availability mode is "synchronous commit".

4. If the primary replica is set to manual failover, automatic failover cannot occur even if the secondary replica is set to auto-failover.

5. Only one automatic failover secondary replica can be set

Four, readable auxiliary copy 1. Connection access types supported by the worker role

1. No connection
No user connections are allowed. The secondary database is not available for read access. This is the default behavior in the worker role.

2. Read-Only Intent connection
The secondary database applies only to connections whose applicationintent Connection property is set to ReadOnly ( read-Intent connection ).

3. Allow any read-only connection
The secondary database is all available for read access connections. This option allows a lower version of the client to connect.

2. Connection access types supported by the primary role

1. Allow all connections
The primary database allows both read and write connections and read-only connections. This is the default behavior of the primary role.

2. Allow only read/write connections
This connection is allowed when the applicationintent Connection property is set to ReadWrite or not set. A connection whose applicationintent connection string keyword is not allowed to be set to ReadOnly . Allowing only read and write connections can help prevent your customers from mistakenly connecting read-intent workloads to the primary replica.

Note: All restrictions are for the availability database only, the non-availability database is not limited by these connections, and the configuration read-write separation ensures that there are at least two readable copies, if only one readable copy becomes the primary replica and causes the read-only intent to be connected without replicas.

Five, AlwaysOn synchronization principle

1. Any SQL Server has a thread called log writer, when any SQL user submits a data modification transaction, it will be responsible for recording the log information of this modification into a memory of the log buffer, and then write to the physical log file (log cure), So for any database, there will be records of all data changes in the log file.

2. For a database configured as an AlwaysOn primary replica, SQL Server establishes a worker thread called Log scanner, which is specifically responsible for reading the log records from the log buffer or log file, packaging them into log blocks, and sending them to each secondary replica. Because of its uninterrupted work, the data on the primary replica is changed and can be propagated continuously to the secondary replica.

3. On the secondary replica, there will also be two threads to complete the corresponding data update actions, which are cured (harden) and Redo (Redo). The Cure thread writes the log block from the primary replica log scanner to the log file on the secondary replica's disk (this process is called "curing").

While the redo thread is responsible for reading the log blocks from the disk, translating the log records into data modification operations and completing them on the secondary replica's database. When the redo thread finishes its work, the database on the secondary replica is consistent with the primary replica. AlwaysOn is the mechanism by which the synchronization between replicas is maintained. A redo thread communicates with the primary replica at every fixed point in time, informing it of its own work progress. The primary replica is able to know how far apart the data on both sides is.

These threads are independent in their work to achieve higher efficiency. Logscanner is responsible for transferring log blocks without waiting for log writer to complete the journal Cure; the secondary replica completes the log cure and sends a message to the primary replica, informing that the data has been delivered without waiting for the redo to complete. Its design goal is to minimize the performance impact of the extra operations that AlwaysOn brings on normal database operations.

The synchronization operation is maintained in the following ways:

When a transaction is received from the client, the primary replica writes the log of the transaction to the transaction log and sends the log record to the secondary replica.
After a log record is written to the transaction log of the primary database, the transaction cannot be undone unless the secondary replica has not received the log at this time. The primary replica waits for confirmation from the synchronous-commit secondary replica.
The secondary replica forces the log to be written (cured) and returns the acknowledgment message to the primary replica.
When a confirmation is received from the secondary replica, the primary replica completes the commit process and sends a confirmation message to the client.

Vi. Session Timeout mechanism

Because soft errors cannot be detected directly by the server instance, soft errors can cause an availability replica to wait indefinitely for the response of another availability replica in the session. To prevent this, always on availability groups implement the session-timeout mechanism, which is based on the following criteria: The connected availability replica sends pings at regular intervals on each open connection. Receiving a ping within the time-out period indicates that the connection is still open and that the server instance is communicating through this connection. When you receive a ping, the replica resets the timeout counter on this connection. The primary and secondary replicas ping each other to indicate that they are still active, and the session time-out limit is a user-configurable replica property with a default value of 10 seconds.

If a ping from another replica is not received within the session time-out period, the connection will time out, the connection will be closed, and a time-out copy enters the disconnected state. Even if it is a copy of synchronous-commit mode, the transaction will not wait for the replica to reconnect temporarily to switch the secondary replica to asynchronous-commit mode. After the secondary replica is reconnected with the primary replica, they will resume synchronous-commit mode.

Summary

Understanding mastering these concepts is very helpful in deploying and maintaining AlwaysOn clusters, and can be combined with a more in-depth understanding of the concepts of testing.

Note:

pursuer.chen

Blog:http://www.cnblogs.com/chenmh

This site all the essays are original, welcome to reprint, but reprint must indicate the source of the article, and at the beginning of the article clearly to the link, otherwise reserves the right to hold responsibility.

Welcome to the exchange of discussions

SQL Server AlwaysOn Concept Summary

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

Get Started for Free

Sales Support

1 on 1 presale consultation

Chat Contact Sales
After-Sales Support

24/7 Technical Support 6 Free Tickets per Quarter Faster Response

Open a Ticket
Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.

Learn More