Plan high availability and site recovery

Microsoft Exchange Server 2010 includes a new unified framework for mailbox resiliency, which includes new features such as the database availability group (DAG) and mailbox database copies. Although these features can be deployed quickly and easily, careful planning is needed first to ensure that any high availability and site recovery solution built on them meets its intended purpose and your business requirements.

During the planning phase, system architects, administrators, and other key stakeholders should determine the requirements for the deployment, in particular the requirements for high availability and site recovery. Deploying these features requires meeting general requirements as well as hardware, software, and networking requirements. For more information about DAG storage requirements, see Mailbox Server Storage Design.

General requirements

Before deploying a DAG and creating mailbox database copies, make sure that the following system-wide requirements are met:

  • The Domain Name System (DNS) must be running. Ideally, the DNS server should accept dynamic updates. If the DNS server does not accept dynamic updates, you must create a DNS host (A) record for each Exchange server. Otherwise, Exchange will not function properly.
  • Each Mailbox server in a DAG must be a member server in the same domain.
  • You cannot add an Exchange 2010 Mailbox server that is also a directory server to a DAG.
  • The name assigned to the DAG must be a valid, available, and unique computer name of no more than 15 characters.
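
The naming rule in the last item can be checked before the DAG is created. The following is a minimal Python sketch, not part of Exchange. The function name check_dag_name is hypothetical, and "valid computer name" is approximated here as letters, digits, and hyphens, which is an assumption rather than the complete Windows rule.

```python
import re

# Assumption: a valid computer name is approximated as 1-15 characters drawn
# from letters, digits, and hyphens, not starting or ending with a hyphen.
_NAME_PATTERN = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,13}[A-Za-z0-9])?$")


def check_dag_name(name, existing_computer_names):
    """Return a list of problems with a proposed DAG name (empty list = OK)."""
    problems = []
    if len(name) > 15:
        problems.append("name is longer than 15 characters")
    if not _NAME_PATTERN.match(name):
        problems.append("name is not a valid computer name")
    # Computer names are compared case-insensitively.
    taken = {existing.lower() for existing in existing_computer_names}
    if name.lower() in taken:
        problems.append("name is already in use")
    return problems


if __name__ == "__main__":
    print(check_dag_name("DAG1", ["EX1", "EX2"]))
    print(check_dag_name("ExchangeDatabaseGroup", ["EX1"]))
```

An empty list means the name passed all three checks. Whether the name is really available still has to be confirmed against Active Directory and DNS.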

Hardware requirements

Generally, there are no special hardware requirements specific to DAGs or mailbox database copies. The servers used must meet all of the requirements set forth in the Exchange 2010 Prerequisites and Exchange 2010 System Requirements topics. For hardware planning information, see the following topics:

  • Understanding Processor Configurations and Exchange Performance
  • Understanding Memory Configurations and Exchange Performance
  • Mailbox Server Storage Design
  • Understanding Server Role Ratios and Exchange Performance

Software requirements

DAGs are available in both Exchange 2010 Standard Edition and Exchange 2010 Enterprise Edition. In addition, a DAG can contain a mix of servers running Exchange 2010 Standard Edition and Exchange 2010 Enterprise Edition.

Every member of a DAG must also run the same operating system. Exchange 2010 is supported on both Windows Server 2008 and Windows Server 2008 R2, but all members of a given DAG must run either Windows Server 2008 or Windows Server 2008 R2. A DAG cannot contain a combination of the two.

In addition to the prerequisites for installing Exchange 2010, there are operating system requirements that must be met. DAGs use Windows failover clustering technology, so they require the Enterprise edition of Windows Server.

Network requirements

Each DAG and each DAG member must meet specific networking requirements. DAG networks are similar to the public, mixed, and private networks used in earlier versions of Exchange. However, unlike previous versions, using a single network in each DAG member is a supported configuration. In addition, the terminology has changed. Instead of public, private, or mixed networks, each DAG has a single "MAPI network" (used by other servers, such as other Exchange 2010 servers and directory servers, to communicate with the DAG member) and zero or more "replication networks" (networks dedicated to log shipping and seeding).

Although a single network is supported, we recommend that each DAG have at least two networks: one MAPI network and one replication network. This provides redundancy for the network and the network path, and enables the system to distinguish between a server failure and a network failure. Using a single network adapter prevents the system from distinguishing between these two types of failures.

Note:
The product documentation in this content area is written under the assumption that each DAG member contains at least two network adapters, that each DAG is configured with one MAPI network and at least one replication network, and that the system can distinguish between a network failure and a server failure.

Consider the following when designing a network infrastructure for a DAG:

  • Each member of a DAG must have at least one network adapter that can communicate with all other DAG members. If you are using a single network path, we recommend that you use Gigabit Ethernet. When using a single network adapter in each DAG member, the DAG network must be enabled for replication and should be configured as a MAPI network. Because there are no other networks, the system also uses the MAPI network as the replication network. In addition, when using a single network adapter in each DAG member, we recommend that you design the overall solution with the single network adapter and path in mind.
  • Using two network adapters in each DAG member provides one MAPI network and one replication network, with the following recovery behavior:
    • If a failure affects the MAPI network, a server failover occurs (assuming there are healthy mailbox database copies that can be activated).
    • If a failure affects the replication network and the MAPI network is unaffected, log shipping and seeding operations revert to using the MAPI network. When the failed replication network is restored, log shipping and seeding operations revert to the replication network.
  • Each DAG member must have the same number of networks. For example, if you plan to use a single network adapter in a DAG member, all members of the DAG must also use a single network adapter.
  • Each DAG must have no more than one MAPI network. The MAPI network must provide connectivity to other Exchange servers and to other services, such as Active Directory and DNS.
  • You can add further replication networks as needed. Individual network adapters can be teamed, or a similar technology can be used, to prevent a single network adapter from being a single point of failure (SPOF). However, even with teaming, the network itself can still be a single point of failure.
  • Each network in each DAG member server must be on its own network subnet. Each server in the DAG can be on a different subnet, but the MAPI and replication networks must be routable and provide connectivity such that:
    • Each network in each DAG member server is on its own network subnet, separate from the subnet used by every other network in the server.
    • The MAPI network of each DAG member server can communicate with the MAPI network of every other DAG member.
    • The replication network of each DAG member server can communicate with the replication network of every other DAG member.
    • There is no direct routing that allows heartbeat traffic to pass from the replication network on one DAG member server to the MAPI network on another DAG member server, or vice versa, and no direct routing between multiple replication networks in the DAG.
  • Regardless of the geographic location of each DAG member relative to the other DAG members, the round-trip network latency between members must not exceed 250 milliseconds (ms).
  • For multi-data center configurations, the round-trip latency requirement may not be the most stringent network bandwidth and latency requirement. You must evaluate the total network load, including client access, Active Directory, transport, continuous replication, and other application traffic, to determine the network requirements for your environment.
  • DAG networks support Internet Protocol version 4 (IPv4) and IPv6. IPv6 is supported only when IPv4 is also used; a pure IPv6 environment is not supported. IPv6 addresses and IP address ranges are supported only when both IPv6 and IPv4 are enabled on the computer and the network supports both IP address versions. If Exchange 2010 is deployed in this configuration, all server roles can send data to and receive data from devices, servers, and clients that use IPv6 addresses.
  • Automatic Private IP Addressing (APIPA) is a feature of Microsoft Windows that automatically assigns IP addresses when no Dynamic Host Configuration Protocol (DHCP) server is available on the network. APIPA addresses (including addresses manually assigned from the APIPA address range) are not supported for use by DAGs or by Exchange 2010.
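
Several of the rules in this list are mechanical enough to check against a planning worksheet. The following Python sketch is illustrative only: the data layout, the function names check_member and check_dag, and the sample addresses are assumptions, and the latency figures are expected to come from your own measurements. It checks the single MAPI network rule, the APIPA restriction, the one-subnet-per-network rule, the equal network count rule, and the 250 ms round-trip limit. It was written with IPv4 plans in mind.

```python
import ipaddress
from itertools import combinations

APIPA_RANGE = ipaddress.ip_network("169.254.0.0/16")
MAX_ROUND_TRIP_MS = 250


def check_member(member):
    """Return a list of problems found in one DAG member's planned networks."""
    name, networks = member["name"], member["networks"]
    problems = []
    mapi_count = sum(1 for net in networks if net["role"] == "MAPI")
    if mapi_count != 1:
        problems.append(f"{name}: expected exactly one MAPI network, found {mapi_count}")
    interfaces = [ipaddress.ip_interface(net["address"]) for net in networks]
    for interface in interfaces:
        if interface.ip in APIPA_RANGE:
            problems.append(f"{name}: {interface.ip} is in the APIPA range")
    # Every network on a server must be on its own subnet.
    for first, second in combinations(interfaces, 2):
        if first.network.overlaps(second.network):
            problems.append(f"{name}: {first.network} and {second.network} overlap")
    return problems


def check_dag(members, round_trip_ms):
    """Check all members plus the measured round-trip latency between them."""
    problems = [p for member in members for p in check_member(member)]
    if len({len(member["networks"]) for member in members}) > 1:
        problems.append("members do not all have the same number of networks")
    for (source, target), latency in round_trip_ms.items():
        if latency > MAX_ROUND_TRIP_MS:
            problems.append(f"{source} to {target}: {latency} ms exceeds {MAX_ROUND_TRIP_MS} ms")
    return problems


if __name__ == "__main__":
    plan = [
        {"name": "EX1", "networks": [{"role": "MAPI", "address": "10.0.1.11/24"},
                                     {"role": "Replication", "address": "10.0.2.11/24"}]},
        {"name": "EX2", "networks": [{"role": "MAPI", "address": "10.0.1.12/24"},
                                     {"role": "Replication", "address": "10.0.2.12/24"}]},
    ]
    print(check_dag(plan, {("EX1", "EX2"): 40}))
```

The sketch does not test routing or heartbeat isolation between networks. Those still need to be verified on the network itself.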

DAG name and IP address requirements

During creation, each DAG is given a unique name and is either assigned one or more static IP addresses or configured to use DHCP. Whether static or dynamically assigned addresses are used, any IP address assigned to the DAG must be on the MAPI network.

Each DAG requires at least one IP address on the MAPI network. A DAG requires additional IP addresses when the MAPI network extends across multiple subnets. The following figure shows a DAG in which all nodes have the MAPI network on the same subnet.

Database availability group with the MAPI network on the same subnet

In this example, the MAPI network in each DAG member is on the 172.19.18.x subnet. Therefore, the DAG requires a single IP address on that subnet.

The following figure shows a DAG whose MAPI network spans two subnets: 172.19.18.x and 172.19.19.x.

Database availability group with the MAPI network on multiple subnets

In this example, the MAPI network in each DAG member is on a separate subnet. Therefore, the DAG requires two IP addresses, one for each subnet on the MAPI network.

Each time the DAG's MAPI network is extended across an additional subnet, an additional IP address for that subnet must be configured for the DAG. Each IP address configured for the DAG is assigned to, and used by, the DAG's underlying failover cluster. The name of the DAG is also used as the name of the underlying failover cluster.
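
The rule of one DAG IP address per MAPI subnet can be worked out from the members' MAPI addresses. The following Python sketch is an illustration only. The function name and host addresses are hypothetical, and the /24 prefix length is an assumption, since the examples above give only the 172.19.18.x and 172.19.19.x ranges.

```python
import ipaddress


def dag_subnets(mapi_addresses):
    """Return the distinct MAPI subnets; the DAG needs one IP address in each."""
    subnets = {ipaddress.ip_interface(address).network for address in mapi_addresses}
    return sorted(subnets)


if __name__ == "__main__":
    single_subnet = ["172.19.18.11/24", "172.19.18.12/24"]
    two_subnets = ["172.19.18.11/24", "172.19.19.11/24"]
    for label, addresses in (("same subnet", single_subnet), ("two subnets", two_subnets)):
        needed = dag_subnets(addresses)
        print(label, "->", len(needed), "DAG IP address(es):", [str(s) for s in needed])
```

The length of the returned list is the number of IP addresses to assign to the DAG, whether they are static or obtained through DHCP.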

At any given time, the cluster for the DAG uses only one of the assigned IP addresses. When the cluster IP address and Network Name resources are brought online, Windows failover clustering registers this IP address in DNS. In addition to the IP address and network name, a cluster name object (CNO) is created in Active Directory. The system uses the cluster name, IP address, and CNO internally to secure the DAG and for internal communication. Administrators and end users do not need to interface with or connect to the DAG name or IP address.

Note:
Although the cluster's IP address and network name are used internally by the system, Exchange 2010 has no hard dependency on these resources being available. Even if the underlying cluster's IP address and Network Name resources are offline, internal communication still occurs within the DAG by using the DAG member server names. However, we recommend that you regularly monitor the availability of these resources to ensure that they are not offline for more than 30 days. If the underlying cluster is offline for more than 30 days, the cluster CNO account may be invalidated by the garbage collection mechanism in Active Directory.

Network adapter configuration for DAGs

Each network adapter must be configured correctly for its intended use. A network adapter used for the MAPI network is configured differently from a network adapter used for a replication network. In addition to configuring each network adapter correctly, you must also configure the network connection order in Windows so that the MAPI network is at the top of the connection order. For more information about how to modify the network connection order, see Modify the Protocol Bindings Order.

MAPI network adapter configuration

Network adapters for MAPI networks should be configured as described in the following table.

Networking feature                                   Setting
Client for Microsoft Networks                        Enabled
QoS Packet Scheduler                                 Optional
File and Printer Sharing for Microsoft Networks      Enabled
Internet Protocol Version 6 (TCP/IPv6)               Optional
Internet Protocol Version 4 (TCP/IPv4)               Enabled
Link-Layer Topology Discovery Mapper I/O Driver      Enabled
Link-Layer Topology Discovery Responder              Enabled

The TCP/IPv4 properties of a MAPI network adapter are configured as follows:

  • The IP address can be assigned manually or obtained through DHCP. If DHCP is used, we recommend using a persistent reservation for the server's IP address.
  • A default gateway is typically used on the MAPI network, although one is not required.
  • At least one DNS server address must be configured. For redundancy, we recommend using multiple DNS servers.
  • The "Register this connection's addresses in DNS" check box should be selected.

Replication network adapter configuration

Network adapters for replication networks should be configured as described in the following table.

Networking feature                                   Setting
Client for Microsoft Networks                        Disabled
QoS Packet Scheduler                                 Optional
File and Printer Sharing for Microsoft Networks      Disabled
Internet Protocol Version 6 (TCP/IPv6)               Optional
Internet Protocol Version 4 (TCP/IPv4)               Enabled
Link-Layer Topology Discovery Mapper I/O Driver      Enabled
Link-Layer Topology Discovery Responder              Enabled

The TCP/IPv4 properties of a replication network adapter are configured as follows:

  • The IP address can be assigned manually or obtained through DHCP. If DHCP is used, we recommend using a persistent reservation for the server's IP address.
  • Replication networks typically do not have a default gateway. If the MAPI network has a default gateway, no other network should have one. Network traffic on a replication network can be routed by using persistent static routes to the corresponding network on the other DAG members, using gateway addresses that can route between the replication networks. All other traffic that does not match such a route is handled by the default gateway configured on the MAPI network adapter.
  • DNS server addresses should not be configured.
  • The "Register this connection's addresses in DNS" check box should not be selected.
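
The differences between the two adapter roles can be captured as data and compared against what is actually configured on a server. The following Python sketch is a hypothetical helper, not an Exchange or Windows tool. The feature names are written as they appear in the Windows adapter properties dialog, only the mandatory settings are listed (optional ones are omitted), and gathering the actual settings from a server is left out.

```python
# Mandatory settings per adapter role, taken from the configuration described above.
EXPECTED_SETTINGS = {
    "MAPI": {
        "Client for Microsoft Networks": "Enabled",
        "File and Printer Sharing for Microsoft Networks": "Enabled",
        "Internet Protocol Version 4 (TCP/IPv4)": "Enabled",
        "Register this connection's addresses in DNS": "Enabled",
    },
    "Replication": {
        "Client for Microsoft Networks": "Disabled",
        "File and Printer Sharing for Microsoft Networks": "Disabled",
        "Internet Protocol Version 4 (TCP/IPv4)": "Enabled",
        "Register this connection's addresses in DNS": "Disabled",
    },
}


def find_deviations(role, actual):
    """Return {setting: (expected, found)} for every setting that differs."""
    deviations = {}
    for setting, expected in EXPECTED_SETTINGS[role].items():
        found = actual.get(setting, "Missing")
        if found != expected:
            deviations[setting] = (expected, found)
    return deviations


if __name__ == "__main__":
    # A replication adapter that was left with MAPI-style defaults.
    observed = {
        "Client for Microsoft Networks": "Enabled",
        "File and Printer Sharing for Microsoft Networks": "Enabled",
        "Internet Protocol Version 4 (TCP/IPv4)": "Enabled",
        "Register this connection's addresses in DNS": "Enabled",
    }
    for setting, (expected, found) in find_deviations("Replication", observed).items():
        print(f"{setting}: expected {expected}, found {found}")
```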

Witness server requirements

A "witness server" is a server outside the DAG that is used to achieve and maintain quorum when the DAG has an even number of members. DAGs with an odd number of members do not use the witness server. All DAGs with an even number of members must use a witness server. The witness server can be any computer running Windows Server. The version of the Windows Server operating system on the witness server is not required to match the operating system used by the DAG members.

Quorum is maintained at the cluster level, underneath the DAG. A DAG has quorum when the majority of its members are online and can communicate with the other online members of the DAG. This notion of quorum is one aspect of the concept of quorum in Windows failover clustering. A related and necessary aspect of quorum in failover clusters is the "quorum resource". The quorum resource is a resource inside the failover cluster that provides a means of arbitration leading to cluster state and membership decisions. The quorum resource also provides persistent storage for configuration information. A companion to the quorum resource is the "quorum log", which is the configuration database for the cluster. The quorum log contains information such as which servers are members of the cluster, which resources are installed in the cluster, and the state of those resources (for example, online or offline).

It is critical that each DAG member have a consistent view of how the DAG's underlying cluster is configured. The quorum acts as the definitive repository for all configuration information relating to the cluster. The quorum is also used as a tiebreaker to avoid "split brain" syndrome. Split brain syndrome is a condition that occurs when DAG members cannot communicate with each other but are up and running. Split brain syndrome is prevented by always requiring a majority of the DAG members (and, for DAGs with an even number of members, the DAG witness server) to be available and interacting for the DAG to be operational.
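
The majority rule described above comes down to simple vote counting. The following Python sketch is a simplified model for illustration, not the actual Windows failover clustering algorithm. It assumes each online member contributes one vote and that the witness contributes one vote only when the member count is even. The function names are hypothetical.

```python
def total_votes(member_count):
    """Every member has a vote; the witness adds one when the member count is even."""
    return member_count + (1 if member_count % 2 == 0 else 0)


def has_quorum(member_count, members_online, witness_available=True):
    """Return True when the online voters form a strict majority of all voters."""
    votes = members_online
    if member_count % 2 == 0 and witness_available:
        votes += 1
    return votes > total_votes(member_count) // 2


if __name__ == "__main__":
    # Four-member DAG: two members plus the witness is a majority of five votes.
    print(has_quorum(4, 2, witness_available=True))
    # The same two members without the witness hold only two of five votes.
    print(has_quorum(4, 2, witness_available=False))
    # Three-member DAG: no witness is used, two of three votes is a majority.
    print(has_quorum(3, 2))
```

The even-member case shows why the witness acts as a tiebreaker: if the DAG splits into two equal halves, only the half that can reach the witness keeps a majority.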

Plan site recovery

A growing number of businesses recognize that day-to-day access to a reliable and available messaging system is fundamental to their success. For many organizations, the messaging system is part of the business continuity plan, and site recovery should be considered when designing the messaging service deployment. Fundamentally, many site recovery solutions involve deploying hardware in a second data center.

Ultimately, the overall design of a DAG, including the number of DAG members and the number of mailbox database copies, depends on each organization's recovery service level agreements (SLAs) covering various failure scenarios. During the planning phase, the solution's architects and administrators determine the requirements for the deployment, in particular the requirements for site recovery. They identify the locations to be used and the required recovery SLA targets. The SLA identifies two specific elements that should be the basis for the design of a high availability and site recovery solution: the recovery time objective (RTO) and the recovery point objective (RPO). Both values are measured in minutes. The RTO is how long it takes to restore service. The RPO refers to how current the data is after the recovery operation has completed. An SLA can also be defined for restoring the primary data center to full service after its problems are corrected.

The solution's architects and administrators also determine which sets of users require site recovery protection, and whether the multi-site solution will be an active/passive or an active/active configuration. In an active/passive configuration, no users are normally hosted in the standby data center. In an active/active configuration, users are hosted in both locations, and some percentage of the total number of databases in the solution has its preferred active location in the second data center. When service for the users of one data center fails, those users are activated in the other data center.

When constructing an appropriate SLA, you usually need to consider the following basic issues:

  • What level of service is required when the primary data center fails?
  • Do users need their data or only messaging services?
  • How quickly is the data needed?
  • How many users must be supported?
  • How do users access their own data?
  • What is the SLA for activating the standby data center?
  • How is service moved back to the primary data center?
  • Are resources dedicated to the site recovery solution?

By answering these questions, you begin to shape a site recovery design for your messaging solution. A core requirement of recovery from a site failure is a solution that gets the necessary messaging data to the backup data center that hosts the backup messaging service.

Namespace Planning

Exchange 2010 changes the way you plan your namespace design when deploying a site recovery configuration. Proper namespace planning is key to a successful data center switchover. From a namespace perspective, each data center used in a site recovery configuration is considered to be active. As a result, each data center requires its own unique namespace for the various Exchange 2010 services in that site, including the namespaces for Outlook Web App, Outlook Anywhere, Exchange ActiveSync, Exchange Web Services, RPC Client Access, Post Office Protocol version 3 (POP3), Internet Message Access Protocol version 4 (IMAP4), and Simple Mail Transfer Protocol (SMTP). In addition, one of the data centers also hosts the namespace for Autodiscover. This design also enables you to switch over a single database from the primary data center to the second data center, so that you can validate the configuration of the second data center as part of validating and practicing a data center switchover.

As a best practice, we recommend using "split DNS" for the Exchange host names used by clients. Split DNS refers to a DNS server configuration in which internal DNS servers return an internal IP address for a host name, and external (Internet-facing) DNS servers return a public IP address for the same host name. Because split DNS allows the same host names to be used internally and externally, this strategy minimizes the number of host names required.

The following figure illustrates namespace planning for a site recovery configuration.

Namespaces for a site recovery DAG deployment

As shown above, each data center uses a separate and unique namespace, and each contains DNS servers in a split DNS configuration for those namespaces. The Redmond data center, which is considered the primary data center, is configured with a namespace of protocol.contoso.com. The Portland data center is configured with a namespace of protocol.standby.contoso.com. Namespaces can include a standby designation, as in the example figure, they can be based on region (for example, protocol.portland.contoso.com), or they can follow any other naming convention that suits the needs of your organization. Whatever naming convention is used, the key requirement is that each data center must have its own unique namespace.
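
A namespace plan of this kind can be generated and checked for collisions. The following Python sketch is illustrative only. The data center keys ("primary", "standby"), the protocol labels, and the function names are hypothetical, and contoso.com is the example domain used above.

```python
def build_namespaces(base_domain, datacenter_labels, protocols):
    """Map each data center to its per-protocol host names.

    datacenter_labels maps a data center name to its namespace label. An empty
    label means the data center uses the base domain directly.
    """
    plan = {}
    for datacenter, label in datacenter_labels.items():
        suffix = f"{label}.{base_domain}" if label else base_domain
        plan[datacenter] = {proto: f"{proto}.{suffix}" for proto in protocols}
    return plan


def namespaces_are_unique(plan):
    """Return True when no host name is shared between data centers."""
    hosts = [host for per_site in plan.values() for host in per_site.values()]
    return len(hosts) == len(set(hosts))


if __name__ == "__main__":
    plan = build_namespaces(
        "contoso.com",
        {"primary": "", "standby": "standby"},
        ["mail", "smtp", "pop", "imap"],
    )
    for datacenter, hosts in plan.items():
        print(datacenter, hosts)
    print("unique:", namespaces_are_unique(plan))
```

With split DNS, each generated host name would be published on both the internal and the external DNS servers, resolving to an internal or a public IP address respectively.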

Certificate Planning

When deploying a DAG in a single data center, there are no unique or special design considerations for certificates. However, when you extend a DAG across multiple data centers in a site recovery configuration, certificates require specific attention. Generally, your certificate design depends on the clients in use and on the certificate requirements of other applications that use certificates. However, there are specific recommendations and best practices to follow regarding the type and number of certificates.

As a best practice, minimize the number of certificates used for Client Access servers, reverse proxy servers, and transport servers (Edge and Hub). We recommend using a single certificate for all of these service endpoints in each data center. This approach minimizes the number of certificates required, which reduces both the cost and the complexity of the solution.

For Outlook Anywhere clients, we recommend using a single subject alternative name (SAN) certificate for each data center and including multiple host names in the certificate. To ensure Outlook Anywhere connectivity after a database, server, or data center switchover, you must use the same certificate subject name on each certificate and configure the Outlook Provider configuration object in Active Directory with the same subject name in Microsoft-Standard Form (msstd). For example, if you use the certificate subject name mail.contoso.com, you would configure the attribute as follows:

 

Set-OutlookProvider EXPR -CertPrincipalName "msstd:mail.contoso.com"

Some applications that integrate with Exchange have specific certificate requirements that may call for additional certificates. Exchange 2010 can coexist with Office Communications Server (OCS). OCS requires certificates of 1024 bits or greater that use the OCS server name as the certificate subject name. Because using an OCS server name as the certificate subject name would prevent Outlook Anywhere from working properly, you need to use an additional, separate certificate for the OCS environment.

For more information about using SAN certificates for Exchange 2010 client access, see Configure SSL Certificates to Use Multiple Client Access Server Host Names.
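
The two certificate rules for Outlook Anywhere described in this section (the same subject name in every data center, and a SAN list that covers every host name clients use) can be checked against a certificate inventory. The following Python sketch works on plain dictionaries and does not read real certificates. The data layout, function name, and host names are assumptions for illustration, and wildcard names are not handled.

```python
def check_certificates(certificates, required_hosts):
    """Return a list of problems with the per-data-center certificate plan.

    certificates:   {datacenter: {"subject": str, "san": [str, ...]}}
    required_hosts: {datacenter: [host names that clients use in that site]}
    """
    problems = []
    subjects = {cert["subject"].lower() for cert in certificates.values()}
    if len(subjects) > 1:
        problems.append("certificate subject names differ between data centers")
    for datacenter, hosts in required_hosts.items():
        covered = {name.lower() for name in certificates[datacenter]["san"]}
        for host in hosts:
            if host.lower() not in covered:
                problems.append(f"{datacenter}: {host} is missing from the SAN list")
    return problems


if __name__ == "__main__":
    certificates = {
        "primary": {"subject": "mail.contoso.com",
                    "san": ["mail.contoso.com", "autodiscover.contoso.com"]},
        "standby": {"subject": "mail.contoso.com",
                    "san": ["mail.contoso.com", "mail.standby.contoso.com"]},
    }
    required_hosts = {
        "primary": ["mail.contoso.com", "autodiscover.contoso.com"],
        "standby": ["mail.standby.contoso.com"],
    }
    print(check_certificates(certificates, required_hosts))
```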

Network Planning

In addition to the specific networking requirements that must be met for each DAG and for each server that is a member of a DAG, there are requirements and recommendations specific to site recovery configurations. As with all DAGs, whether the DAG members are deployed in a single site or in multiple sites, the round-trip network latency between DAG members must not exceed 250 milliseconds (ms). In addition, there are specific configuration recommendations for DAGs that are extended across multiple sites:

  • MAPI networks should be isolated from replication networks. Windows network policies, Windows Firewall policies, or router access control lists (ACLs) should be used to block traffic between the MAPI network and the replication networks. This configuration is necessary to prevent network heartbeat cross talk.
  • Client-facing DNS records should have a Time to Live (TTL) of 5 minutes. The amount of downtime that clients experience depends not only on how quickly a switchover can occur, but also on how quickly DNS replication occurs and how quickly clients query for updated DNS information. DNS records for all Exchange client services, including Outlook Web App, Exchange ActiveSync, Exchange Web Services, Outlook Anywhere, SMTP, POP3, IMAP4, and RPC Client Access, should be set with a TTL of 5 minutes on both the internal and the external DNS servers.
  • Use static routes to configure connectivity across replication networks. To provide network connectivity between each of the replication network adapters, use persistent static routes. When static IP addresses are used, this is a quick, one-time configuration performed on each DAG member. If DHCP is used to obtain IP addresses for the replication networks, you can also use it to assign static routes for replication, which simplifies the configuration process.
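
The 5-minute TTL recommendation is easy to audit once the client-facing records have been exported from the internal and external DNS servers. The following Python sketch assumes the records are already available as (host name, DNS view, TTL in seconds) tuples. How they are exported is outside its scope, and the function name and sample records are hypothetical.

```python
MAX_CLIENT_TTL_SECONDS = 5 * 60  # 5 minutes, as recommended for client-facing records


def records_exceeding_ttl(records, max_ttl=MAX_CLIENT_TTL_SECONDS):
    """Return (host, view) pairs whose TTL is longer than the allowed maximum.

    records is an iterable of (host_name, dns_view, ttl_seconds) tuples, where
    dns_view distinguishes the internal and external DNS servers.
    """
    return [(host, view) for host, view, ttl in records if ttl > max_ttl]


if __name__ == "__main__":
    exported = [
        ("mail.contoso.com", "internal", 300),
        ("mail.contoso.com", "external", 3600),
        ("smtp.contoso.com", "external", 300),
    ]
    for host, view in records_exceeding_ttl(exported):
        print(f"{host} ({view}) has a TTL longer than 5 minutes")
```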

General site recovery planning

In addition to the high availability requirements listed above, there are further recommendations for deploying Exchange 2010 in a site recovery configuration, for example, when extending a DAG across multiple data centers. What you do during the planning phase directly affects the success of the site recovery solution. For example, poor namespace design can cause certificate problems, and an incorrect certificate configuration can prevent users from accessing services.

To minimize the time it takes to activate the second data center, and to allow the second data center to host the service endpoints of the failed data center, appropriate planning must be completed. For example:

  • The service level agreement (SLA) goals for the site recovery solution must be well understood and documented.
  • The servers in the second data center must have sufficient capacity to host the combined user population of both data centers.
  • The second data center must have all of the services that are provided in the primary data center enabled (unless a service is not included as part of the site recovery SLA). This includes Active Directory, networking infrastructure (DNS, TCP/IP, and so on), telephony services (if Unified Messaging is in use), and site infrastructure (power, cooling, and so on).
  • For some services to be able to serve users from the failed data center, the proper server certificates must already be configured for them. Some services do not allow instancing (for example, POP3 and IMAP4) and allow the use of only a single certificate. In these cases, either the certificate must be a subject alternative name (SAN) certificate that includes multiple names, or the multiple names must be similar enough that a wildcard certificate can be used (assuming the organization's security policies allow the use of wildcard certificates).
  • The necessary services must be defined in the second data center. For example, if the first data center has three different SMTP URLs on different transport servers, the appropriate configuration must be defined in the second data center so that at least one (if not all three) transport servers can host the workload.
  • The network configuration needed to support the data center switchover must be in place. This may mean making sure that load balancing is configured, that global DNS is configured, and that the Internet connection is enabled with the appropriate routing configured.
  • The strategy for making the DNS changes needed to support a data center switchover must be understood. The specific DNS changes, including their TTL settings, must be defined and documented to support the SLA in effect.
  • A strategy for testing the solution must also be established and factored into the SLA. Periodic validation of the deployment is the only way to ensure that the quality and viability of the deployment do not degrade over time. After the deployment is validated, we recommend that you explicitly document the parts of the configuration that directly affect the success of the solution. In addition, we recommend that you strengthen your change management processes around those parts of the deployment.

Plan data center switchover

Proper planning and preparation involves not only deploying the second data center resources (such as Client Access and Hub Transport servers), but also pre-configuring those resources to minimize the changes required as part of a data center switchover operation.

Note:
Client Access and Hub Transport services are required in the second data center, even when automatic activation of mailbox databases in the second data center is blocked. These services are needed to perform database switchovers and to test and validate the services and data in the second data center.

To better understand how a data center switchover works, it is helpful to understand the basic operation of an Exchange 2010 data center switchover.

As shown in the following figure, a site recovery deployment consists of a DAG that has members in both data centers.

Database availability group with members in two data centers

When you extend a DAG across multiple data centers, you should design it so that either the majority of the DAG members are located in the primary data center or, when each data center has the same number of members, the primary data center hosts the witness server. This design guarantees that service is provided in the primary data center even if network connectivity between the two data centers fails. However, it also means that when the primary data center fails, the members in the second data center lose quorum.

Partial data center failures are also possible. If enough functionality is lost in the primary data center to hinder effective service and management, a data center switchover should be performed to activate the second data center. The activation process involves the administrator configuring the surviving servers in the partially failed data center to stop service. Activation can then proceed in the second data center. This prevents both sets of services from trying to operate at the same time.

Because of the loss of quorum, the DAG members in the second data center cannot come online automatically. Therefore, activating the Mailbox servers in the second data center also requires a step in which the DAG member servers are forced to create quorum, at which point the servers in the failed data center are removed (only temporarily) from the DAG. This provides a partially stable service solution that can survive some level of additional failure and continue to function.

Note:
A prerequisite for surviving additional failures is that the DAG has at least four members, and that the four members are spread across two Active Directory sites (that is, at least two members in each data center).
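
The reasoning behind this prerequisite can be illustrated with vote arithmetic. The following Python sketch is a simplified model, not the actual cluster logic. It assumes that after the failed data center's servers are removed, only the surviving members vote, and that a witness reachable from the second data center adds one vote when the surviving member count is even. The availability of such a witness is an assumption not covered in this section, and the function name is hypothetical.

```python
def tolerates_one_more_failure(surviving_members, witness_available=True):
    """Return True if the shrunken DAG keeps a majority after losing one more member."""
    if surviving_members < 1:
        return False
    uses_witness = surviving_members % 2 == 0 and witness_available
    total_votes = surviving_members + (1 if uses_witness else 0)
    votes_needed = total_votes // 2 + 1
    votes_after_failure = total_votes - 1  # one surviving member goes down
    return votes_after_failure >= votes_needed


if __name__ == "__main__":
    # Two members in the second data center plus a witness: three votes, two needed.
    print(tolerates_one_more_failure(2, witness_available=True))
    # A single surviving member has nothing left after one more failure.
    print(tolerates_one_more_failure(1))
```

Under these assumptions, a second data center with only one DAG member cannot absorb any further failure, which is consistent with the requirement of at least two members in each data center.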

The steps described above are the basic process for re-establishing Mailbox role functionality in the second data center. Activating the other roles in the second data center does not involve explicit actions on the affected servers in the second data center. Instead, the servers in the second data center become the service endpoints for the services that are normally hosted by the primary data center. For example, a user who is normally hosted in the primary data center might connect to Outlook Web App by using https://mail.contoso.com/owa. After the data center failure, these service endpoints are moved to endpoints in the second data center as part of the switchover operation. During the switchover, the service endpoints of the primary data center are re-pointed to alternate IP addresses for the same services in the second data center. This minimizes the number of changes that must be made to the configuration information stored in Active Directory during the switchover. Generally, this step can be completed in one of two ways:

  • Update the DNS records; or
  • Reconfigure global DNS and load balancing to enable and disable the alternate IP addresses, thereby moving the services between data centers.

A strategy for testing the solution must be established and must be factored into the SLA. Periodic validation of the deployment is the only way to ensure that the deployment does not degrade over time.

Carefully completing these planning steps has a direct effect on the success of a data center switchover. For example, poor namespace design can cause certificate problems, and an incorrect certificate configuration can prevent users from accessing services.

After the deployment is validated, we recommend that you explicitly document all configuration that directly affects the success of a data center switchover. In addition, the change management processes around those parts of the deployment should be strengthened.

For more information about data center switchovers, including activating the second data center and re-activating a failed primary data center, see Datacenter Switchovers.
