An In-Depth Discussion of Database Availability Groups


High Availability in Exchange Server 2010



Mailbox databases and the data they contain are critical to any Exchange organization. To ensure high availability of mailbox databases, Exchange 2007 provided a variety of replication and clustering options, including local continuous replication, single copy clusters, and clustered mailbox servers. Although these features were improvements over earlier offerings, they still posed many implementation challenges. For starters, each approach to high availability is managed differently. With a single copy cluster, all Mailbox servers in the cluster use shared storage. Implementing a cluster means that the Exchange administrator must configure a Windows failover cluster, which is quite complex, and it can take an administrator a lot of time to achieve high uptime. With continuous replication, Exchange 2007 uses built-in asynchronous replication to create copies of data and then maintains those copies using transaction log shipping and replay. You use local continuous replication to create local copies in a non-clustered environment, whereas in a clustered environment you must use cluster continuous replication or standby continuous replication, and each type of continuous replication is managed differently.

Exchange Server 2010 takes a completely different approach to high availability: it integrates high availability into its core architecture, producing an end-to-end solution that provides service availability, data availability, and automatic recovery. As a result, a single key high availability solution replaces the many different solutions used previously. That solution is the database availability group (DAG).

A DAG provides automatic failover and recovery at the database level (rather than at the server level). You do not need to build a cluster when deploying multiple Mailbox servers that host multiple mailbox database copies. Because of these changes, cluster hardware and advanced cluster configuration are no longer required to build a highly available Mailbox server solution. The DAG itself provides the basic components for high availability, and failover is automatic for mailbox databases that belong to the same DAG. DAGs can be extended across multiple Active Directory sites, and architectural changes to the Mailbox server allow a single mailbox database to move between Active Directory sites. In this way, a single mailbox database in one Active Directory site can fail over to another Active Directory site.

Keep in mind that database copies apply only to mailbox databases. To get redundancy and high availability for public folder databases, use public folder replication. Unlike with cluster continuous replication, where multiple copies of a public folder database could not exist in the same cluster, you can replicate a public folder database between servers in a DAG.

Before going into the details of DAGs, let's take a look at the other ways in which the high availability options have changed in Exchange 2010.

A Brief Introduction to the High Availability Features of Exchange Server 2010

In earlier versions, Exchange ran as a clustered application using the cluster resource management model. With this approach, you first created a Windows failover cluster and then ran Exchange Setup in clustered mode to make the Mailbox server highly available. As part of the Setup process, the Exchange cluster resource DLL (exres.dll) was registered to allow the creation of a clustered mailbox server. In contrast, Exchange 2010 does not run as a clustered application, and the cluster resource management model is no longer used for high availability. The Exchange cluster resource DLL and all the cluster resources it provided no longer exist. Instead, Exchange 2010 uses its own internal high availability model. Some components of Windows failover clustering are still used in this model, but they are now managed exclusively by Exchange 2010.

Interestingly, many of the basic replication technologies are retained, but they have been improved and now work in significantly different ways. Because storage groups have been removed from Exchange 2010, continuous replication operates at the database level. Rather than using Server Message Block (SMB) for log shipping and seeding, Exchange 2010 uses a single administrator-defined TCP port for data transfer. Instead of the passive copy pulling closed log files from the active copy, the active copy pushes log files to the passive copy. The data stream can be encrypted to secure it and compressed to reduce the amount of data replicated. Whereas in earlier versions of Exchange only the active copy of a database could be used for seeding and reseeding, in Exchange Server 2010 both active and passive copies of a mailbox database can be specified as the source for seeding and reseeding, which makes it easy to add database copies to other Mailbox servers.

Another major change relates to how data is replicated. In Exchange 2007, the Microsoft Exchange Replication service replays logs into passive database copies and builds up a cache of read/write operations that reduces read I/O operations. However, when a passive copy of the database is activated, this database cache is lost, because the Microsoft Exchange Information Store service that mounts the database does not have the cache available to it. This means that the passive copy comes online in a cold state, with no cache ready. This cold state is the same state the database cache is in after a server restart or a restart of the services that hold the cache. In the cold state, the server has no cached read/write operations, which usually increases the number of read I/O operations required until the cache grows large enough to reduce disk I/O on the server. In Exchange 2010, the Microsoft Exchange Information Store service both replays logs and handles mount operations, which ensures that the cache is available when a passive copy is activated and brought online. Therefore, after a switchover or failover, the server is more likely to be able to use the cache to reduce read I/O operations.

With highly available Mailbox servers, a message is protected once it arrives in the mailbox. Protecting mail in transit, however, is another matter. If a Hub Transport server fails while processing mail and cannot be recovered, the mail may be lost. As a safeguard against data loss, Exchange 2007 introduced the transport dumpster feature, in which the Hub Transport server maintains a queue of messages recently delivered to recipients whose mailboxes are protected by local continuous replication or cluster continuous replication. Messages are retained in the transport dumpster until an administrator-defined time limit or size limit is reached. When a failover occurs, the clustered mailbox server automatically requests that every Hub Transport server in the Active Directory site resubmit mail from its transport dumpster queue. This prevents the loss of mail sent during the time it takes for the cluster to fail over. Although this approach works, it applies only to message delivery in a continuous replication environment, and it does not address potential mail loss in transit between Hub Transport servers and Edge Transport servers.

Exchange 2010 addresses these shortcomings in several ways. The transport dumpster now receives feedback to determine which messages have been delivered and replicated. The Hub Transport server retains copies of messages delivered to a replicated mailbox database in a DAG. Each copy is retained in the transport queue (mail.que) until the Hub Transport server is notified that the transaction logs representing the message have been successfully replicated to, and inspected by, all copies of the mailbox database. The message is then truncated from the transport dumpster, which ensures that the transport dumpster queue holds only copies of messages whose transaction logs have not yet been replicated. In addition, when a mailbox database in one Active Directory site fails over to another Active Directory site, redelivery requests are sent to the transport dumpsters in both the original site and the new site.
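
The feedback loop described above can be sketched in a few lines of Python. This is only an illustrative model, not Exchange code: the class and method names (TransportDumpster, retain, confirm, redelivery_candidates) are hypothetical, and a real Hub Transport server tracks far more state than this.

```python
class TransportDumpster:
    """Hypothetical model: keep a message until every database copy confirms it."""

    def __init__(self):
        # message_id -> set of database copies that have not yet confirmed
        # that the matching transaction logs were replicated and inspected
        self._pending = {}

    def retain(self, message_id, copies):
        """Keep a copy of a delivered message for the given database copies."""
        self._pending[message_id] = set(copies)

    def confirm(self, message_id, copy_name):
        """Record feedback from one copy; truncate once all copies confirmed."""
        waiting = self._pending.get(message_id)
        if waiting is None:
            return
        waiting.discard(copy_name)
        if not waiting:
            del self._pending[message_id]

    def redelivery_candidates(self):
        """Messages that would be resubmitted if a failover happened now."""
        return sorted(self._pending)
```

In this sketch, a message retained for two copies stays in the dumpster after the first confirmation and is truncated only after the second, so only unreplicated messages remain candidates for redelivery.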

To provide redundancy for messages throughout their time in transit, Exchange 2010 adds the shadow redundancy feature. Shadow redundancy uses an approach similar to the transport dumpster. The difference is that deletion of a message from the transport database is delayed until the transport server verifies that delivery to all of the message's next hops has completed. If the transport server cannot verify delivery to a next hop, it resubmits the message for delivery to that hop. This approach uses less network bandwidth than creating duplicate copies of a message on multiple servers. The only additional network traffic generated is the exchange of discard status messages between transport servers. A discard status message is generated by the Shadow Redundancy Manager to indicate when a message is ready to be discarded from the transport database.
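
As a rough illustration of the delayed deletion described above, the following Python sketch models the shadow copies one transport server holds for a single next hop. The names (ShadowQueue, sent_to_next_hop, on_discard_status, resubmit_all) are made up for this example and do not correspond to any Exchange API.

```python
class ShadowQueue:
    """Hypothetical model of the shadow copies held for one next-hop server."""

    def __init__(self):
        self._shadow = {}  # message_id -> message body

    def sent_to_next_hop(self, message_id, body):
        # Deletion is delayed: keep a shadow copy after handing the message off.
        self._shadow[message_id] = body

    def on_discard_status(self, message_ids):
        # The next hop reports which messages it has finished delivering,
        # so their shadow copies can be discarded.
        for message_id in message_ids:
            self._shadow.pop(message_id, None)

    def resubmit_all(self):
        # Called when the next hop cannot be verified: hand back every
        # message that still has a shadow copy so it can be sent again.
        pending = dict(self._shadow)
        self._shadow.clear()
        return pending
```

Only message identifiers travel in the discard status exchange in this model, which mirrors why the approach is cheaper than copying every message to multiple servers.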

Shadow redundancy is an extension of the Simple Mail Transfer Protocol (SMTP) service. It can be used as long as both servers in an SMTP connection support the feature. When redundant message paths exist in your routing topology, shadow redundancy eliminates dependence on the state of any specific Hub Transport or Edge Transport server, which makes any transport server disposable. If a transport server fails, or you want to take it offline for maintenance, you can remove, replace, or upgrade it at any time without draining its queues or worrying about losing mail.

The Shadow Redundancy Manager uses a heartbeat to determine the availability of the servers for which shadow messages are queued. The initiating server sends an XQUERYDISCARD message, and the target server responds with discard notifications. This exchange of notifications is the heartbeat.

If a server cannot establish a connection with the primary server within the heartbeat timeout interval (300 seconds by default), it resets the timer and tries again, up to three times (the default heartbeat retry count). If the primary server still has not responded when the retry limit is reached, the server concludes that the primary server has failed, takes ownership of the shadow messages, and resubmits them. The messages are then delivered to their destinations. In some cases, such as when the original server comes back online with its original database, duplicate messages may be delivered. Because Exchange has duplicate message detection, Exchange mailbox users do not see the duplicates. However, recipients on non-Exchange mail servers may receive duplicate copies.
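
The timeout-and-retry decision can be written down as a small Python function. This is a sketch under stated assumptions: the constants reuse the defaults quoted above, the function name primary_failed is hypothetical, and the retry count is interpreted here as the number of consecutive heartbeat intervals that may pass without a response before the primary is declared failed.

```python
HEARTBEAT_TIMEOUT_SECONDS = 300  # default interval quoted in the text
HEARTBEAT_RETRY_COUNT = 3        # default retry count quoted in the text


def primary_failed(probe_results, retry_count=HEARTBEAT_RETRY_COUNT):
    """Decide whether the primary server should be treated as failed.

    probe_results holds one boolean per heartbeat interval: True if the
    primary responded within the interval, False if the interval timed out.
    """
    consecutive_timeouts = 0
    for responded in probe_results:
        if responded:
            consecutive_timeouts = 0  # contact re-established, start over
        else:
            consecutive_timeouts += 1
            if consecutive_timeouts >= retry_count:
                return True  # take ownership of shadow messages and resubmit
    return False


def worst_case_detection_seconds(timeout=HEARTBEAT_TIMEOUT_SECONDS,
                                 retry_count=HEARTBEAT_RETRY_COUNT):
    """Time spent waiting before a silent primary is declared failed."""
    return timeout * retry_count
```

Under this interpretation and the default values, a primary that stops responding is declared failed after 900 seconds of silence.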

In-Depth Exploration of DAGs

Although many of the high availability enhancements I have covered so far are important, none of them changes the way Exchange 2010 is managed more than database availability groups do. The DAG is the basic high availability component in Exchange 2010, and its rules are simple. Each DAG can have up to 16 Mailbox servers as members. Each Mailbox server can be a member of only one DAG and can host only one copy of any given database. The hosted copy can be either an active copy or a passive copy. An active copy differs from a passive copy in that it is the copy that is currently in use and being accessed by users, rather than a standby copy. You cannot create two copies of the same database on the same server. Otherwise, any server in a DAG can host a copy of any mailbox database from any other server in the DAG. Although multiple databases can be active at the same time, only one copy of any specific database can be active at a time, and up to 15 passive copies of that database can reside on other servers in the DAG.
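
The membership and copy rules above are easy to express as checks. The following Python sketch is a toy model only; the Dag class and its methods are hypothetical, and it covers a single DAG, so the rule that a server can belong to only one DAG is not modeled.

```python
MAX_DAG_MEMBERS = 16  # limit stated in the text


class Dag:
    """Toy model of one DAG that enforces the membership and copy rules."""

    def __init__(self, name):
        self.name = name
        self.members = []
        self.copies = {}  # database -> {server: "active" or "passive"}

    def add_member(self, server):
        if server in self.members:
            raise ValueError(f"{server} is already a member of {self.name}")
        if len(self.members) >= MAX_DAG_MEMBERS:
            raise ValueError("a DAG can have at most 16 Mailbox servers")
        self.members.append(server)

    def add_copy(self, database, server):
        if server not in self.members:
            raise ValueError(f"{server} is not a member of {self.name}")
        hosts = self.copies.setdefault(database, {})
        if server in hosts:
            raise ValueError("a server can host only one copy of a database")
        # The first copy created is the active one; later copies are passive.
        hosts[server] = "passive" if hosts else "active"

    def activate(self, database, server):
        hosts = self.copies[database]
        if server not in hosts:
            raise ValueError(f"{server} has no copy of {database}")
        # Only one copy of a database can be active at a time.
        for host in hosts:
            hosts[host] = "passive"
        hosts[server] = "active"
```

With 16 members, a database in this model can have one active copy and at most 15 passive copies, which matches the limits described above.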

When you create the first DAG in an Exchange organization, Exchange creates a Windows failover cluster, but the cluster has no Exchange cluster group and no storage resources. A DAG uses only the cluster heartbeat, cluster network, and cluster database features of Windows failover clustering. The cluster heartbeat is used to detect failures. Each DAG requires at least one network for replication and at least one network for MAPI and other traffic. The cluster database stores database state changes and other important information. When you add other servers to the DAG, they are joined to the underlying cluster, and the cluster's quorum model is adjusted automatically as needed based on the number of member servers.
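
The text does not spell out how the quorum model depends on the member count. The Python sketch below assumes the commonly documented rule for DAGs (Node Majority for an odd number of members, Node and File Share Majority for an even number); treat that rule, and the function name quorum_model, as assumptions rather than something stated in this article.

```python
def quorum_model(member_count):
    """Return the quorum model assumed for a DAG with this many members."""
    if member_count < 1:
        raise ValueError("a DAG needs at least one member")
    if member_count % 2 == 1:
        # Odd number of members: the nodes alone can always form a majority.
        return "Node Majority"
    # Even number of members: a witness file share breaks ties.
    return "Node and File Share Majority"
```

Adding or removing one member therefore flips the model in this sketch, which is why the adjustment has to happen automatically as the DAG grows.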

Active Manager is the Exchange 2010 component that provides the resource model and failover management. Active Manager runs on every Mailbox server that is a member of a DAG. For a particular database, it acts either as the primary role holder (Primary Active Manager) or as a standby secondary role holder (Standby Active Manager). The primary role holder decides which database copies are active and which copies to activate. It receives topology change notifications and reacts to server failures. The primary role holder also owns the cluster quorum resource. If the server acting as the primary role holder fails, the primary role automatically moves to another server in the DAG, and that server takes ownership of the cluster quorum resource.

The secondary role holder detects failures of local replicated databases and of the local Information Store, sends failure notifications to the primary role holder, and asks the primary role holder to initiate failover. The secondary role holder does not decide which server takes over, and it does not update the database location state. The primary role holder performs those tasks. When an active database fails, Active Manager uses a best copy selection algorithm to choose the database copy to activate. This algorithm determines the best database copy to activate based on the database status, content index status, copy queue length, and replay queue length of each database copy. If more than one database copy meets the selection criteria, the activation preference value is used, and the copy with the lowest preference value is activated and mounted.
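
A heavily simplified version of this selection can be sketched in Python. It is not the actual Active Manager algorithm: the DatabaseCopy fields, the select_best_copy function, and the queue-length thresholds are illustrative assumptions that only mirror the criteria named above.

```python
from dataclasses import dataclass


@dataclass
class DatabaseCopy:
    server: str
    healthy: bool                 # database copy status is healthy
    content_index_healthy: bool   # content index status is healthy
    copy_queue_length: int        # logs waiting to be copied
    replay_queue_length: int      # logs waiting to be replayed
    activation_preference: int    # lower value is preferred


def select_best_copy(copies, max_copy_queue=10, max_replay_queue=50):
    """Pick the passive copy to activate, or None if no copy qualifies.

    The two thresholds are illustrative defaults chosen for this sketch.
    """
    candidates = [
        c for c in copies
        if c.healthy
        and c.content_index_healthy
        and c.copy_queue_length < max_copy_queue
        and c.replay_queue_length < max_replay_queue
    ]
    if not candidates:
        return None
    # Several copies qualify: the lowest activation preference value wins.
    return min(candidates, key=lambda c: c.activation_preference)
```

For example, given one copy with a long copy queue and two healthy copies with activation preference values 2 and 3, this sketch skips the lagging copy and returns the copy with preference 2.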

After you add servers to a DAG, you can replicate the active databases on each server to other servers in the DAG, and you can configure other DAG properties, such as network encryption or network compression for database replication. Within a DAG, transaction logs are copied to every member server that hosts a copy of the mailbox database and are replayed into that copy. After you have created multiple database copies, you can use the Exchange Management Console and the Exchange Management Shell to monitor replication and the health of the DAG. Database failover can occur automatically when a failure occurs, or you can start a switchover manually. During a switchover, the active copy is dismounted, and a passive copy on another server in the DAG is mounted and becomes the active copy.

Really simplified

As you have seen, Exchange 2010 includes many important availability enhancements, including the integration of high availability features into the core product and architectural changes that improve availability. Of all the new and changed features, my favorite is the DAG. DAGs truly simplify the implementation of clustering, allowing you to focus on what matters most: your data. I hope this article helps, and I suggest you read my new books: Exchange Server 2010 Administrator's Pocket Consultant, Windows 7 Administrator's Pocket Consultant, and Windows Server 2008 Administrator's Pocket Consultant, 2nd Edition.
