Before you build a cluster in a Windows Server R2 DataCenter environment, you first need a basic understanding of Windows Server Failover Clustering (Windows Server Failover Cluster, referred to as WSFC). WSFC must be deployed in a domain-managed environment. A cluster consists of multiple servers, each of which is called a node, and the Windows Server Failover Cluster service runs on every node. The cluster as a whole tolerates some nodes going offline, failing, or becoming corrupted without affecting the proper functioning of the entire system. The cluster automatically monitors the health of each node; once the active node becomes unavailable, another node automatically takes over from the failed server, is promoted to active server, and continues processing tasks. This takeover of a failed server by another server is called "failover".
One, the basic components of a Windows failover cluster
Node and active node (Active Node): Each server that makes up a cluster is called a node. At any given time, only one node handles user requests and provides the service; that node is called the active node. The cluster decides which node is active, and this is completely transparent to the user;
Virtual server: All nodes in the cluster together make up one virtual server. Seen from outside the cluster, there is only a single server, not the group of nodes behind it. The virtual server has its own machine name and IP address, known as the "Virtual Network Name" and the "Virtual IP", and users access the cluster through them. The virtual network name and virtual IP are registered on the DNS server in the same way as the network name and IP address of a physical server;
Shared array: All resources that need to be shared between nodes, such as SQL Server data files and error logs, are stored on a shared array. Files that do not need to be shared are stored on the local disk of each node;
Private network and public network: The nodes that make up a cluster are connected through both a private network and a public network. Nodes send "heartbeat" signals to each other across the private network to check that the other nodes are working properly. The public network is the one used by clients outside the cluster; external clients access the cluster nodes through it;
Two, the features provided by the Windows cluster
Windows clustering does not provide load balancing. At any time, only one node in the cluster handles user requests while the other nodes are idle. The node that processes user requests is called the active node; it is chosen by the cluster, and the choice is completely transparent to the user.
1, health detection and automatic failover
AlwaysOn high-availability technology relies on the health detection and automatic failover features of Windows Server failover clusters, so AlwaysOn must be built on top of WSFC:
Health detection: The nodes send heartbeat signals to each other over the private network to check whether the other nodes are working properly; such a signal is called a "heartbeat". Once a server fails and can no longer respond to the signal, the remaining nodes consider it "dead" and exclude it from the current cluster. The overall health of the cluster is determined by the quorum vote of the cluster nodes.
Automatic failover: The health of each node is monitored through the heartbeat. If the primary node (Primary Node) does not respond to the heartbeat, another server is automatically promoted to primary node and continues processing tasks. The failover process does not affect the application, and the user is unaware of the failover inside the virtual server;
WSFC provides many other features, but deploying AlwaysOn only requires understanding these two mechanisms; the other WSFC features can be set aside for now. Chapters Three, Four, and Five below walk through building WSFC step by step.
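The two mechanisms can be illustrated with a minimal Python sketch. It is a simplified model rather than how the WSFC service is implemented: the `Node` class, the `choose_active` helper, the node names, and the 5-second timeout are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time (in seconds) the last heartbeat was received


def healthy_nodes(nodes, now, timeout):
    """Return the nodes whose last heartbeat arrived within `timeout` seconds."""
    return [n for n in nodes if now - n.last_heartbeat <= timeout]


def choose_active(nodes, current_active, now, timeout=5.0):
    """Keep the current active node if it is healthy; otherwise promote another.

    Returns the name of the active node, or None if no node is healthy.
    The 5-second timeout is an illustrative value, not a WSFC default.
    """
    alive_names = [n.name for n in healthy_nodes(nodes, now, timeout)]
    if current_active in alive_names:
        return current_active
    return alive_names[0] if alive_names else None


nodes = [
    Node("node1", last_heartbeat=90.0),  # silent for 10 s when now=100
    Node("node2", last_heartbeat=99.0),
    Node("node3", last_heartbeat=98.5),
]
active = choose_active(nodes, current_active="node1", now=100.0)
print(active)  # node2
```

Here node1 has missed its heartbeat window, so the next healthy node is promoted; a client that only knows the virtual server name would not notice the change.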
2, quorum configuration for the cluster
Quorum voting (Quorum Voting): "Quorum" means the required minimum number. In quorum mode, the quorum configuration determines how many failed nodes the cluster can tolerate while still providing service normally. The cluster continues to provide service until the number of failed nodes reaches the limit set by the quorum configuration.
WSFC performs health checks and quorum voting among the nodes in the cluster. Each node periodically sends a heartbeat signal, checks the health of the other nodes, and shares health data with them. A node that cannot respond to the heartbeat signal is considered to be in an abnormal state, and all healthy nodes in the cluster quickly learn that it has failed.
The quorum node set (Quorum Node Set) consists of the voting nodes plus a witness (Witness). The outcome of a vote is decided by the majority (Majority), and the overall health state of the cluster is determined by the outcome of the periodic quorum votes. Based on the result, WSFC either performs an automatic failover or takes the cluster offline: if the vote of the quorum node set shows that a majority of nodes are healthy, the cluster fails over and continues to serve; if only a minority are healthy, the cluster goes offline. For the quorum configuration of the cluster, please refer to my essay: Quorum for Failover cluster.
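The majority rule can be written down as a short Python sketch. It assumes a simple static model in which every node and the optional witness carry exactly one vote; the function names and node names are hypothetical, and real WSFC quorum modes have more options than this.

```python
def has_quorum(healthy_votes: int, total_votes: int) -> bool:
    """True when strictly more than half of all votes come from healthy voters."""
    return healthy_votes > total_votes // 2


def cluster_state(node_health: dict, witness_healthy=None) -> str:
    """Return "online" or "offline" for one round of voting.

    node_health maps a node name to True (healthy) or False (failed).
    witness_healthy is None when no witness is configured.
    """
    votes = list(node_health.values())
    if witness_healthy is not None:
        votes.append(witness_healthy)
    return "online" if has_quorum(sum(votes), len(votes)) else "offline"


four_nodes = {"node1": True, "node2": True, "node3": False, "node4": False}

print(cluster_state(four_nodes))                        # offline: 2 of 4 votes
print(cluster_state(four_nodes, witness_healthy=True))  # online: 3 of 5 votes
```

The second call shows why a witness helps a cluster with an even number of nodes: it breaks the tie when exactly half of the nodes are lost.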
3, cluster resource group
Resource group: A resource group is a set of one or more resources, and failover happens at the level of the resource group. At any time, each resource group belongs to exactly one node in the cluster, and that node is the active node. When you configure a resource group, all the resources that a given resource depends on must be placed in the same resource group as that resource; dependencies across resource groups are not allowed.
The active node is the node that owns the cluster's resource groups and can therefore handle client requests, that is, the node that has the resources needed to serve users. The active node is also known as the primary node (Primary Node), and the other nodes in the cluster are called secondary nodes (Secondary Nodes). When the primary node fails, the cluster automatically transfers the resources to one of the secondary nodes. The automatic failover process is controlled by the health detection policy; failover is in fact a transfer of resource ownership (Resource Ownership).
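The idea that failover is a transfer of ownership of a whole resource group can be sketched in Python as follows. The `ResourceGroup` class, the `fail_over` helper, and the resource and node names are hypothetical and only model the concept.

```python
from dataclasses import dataclass


@dataclass
class ResourceGroup:
    name: str
    resources: list  # resources that depend on each other stay in one group
    owner: str       # exactly one node owns the group at any time


def fail_over(group: ResourceGroup, failed_node: str, healthy_nodes: list) -> str:
    """Move ownership of the whole group away from a failed owner.

    Returns the name of the node that owns the group afterwards.
    """
    if group.owner != failed_node:
        return group.owner  # the failure does not affect this group
    if not healthy_nodes:
        raise RuntimeError("no healthy node available to own the group")
    group.owner = healthy_nodes[0]  # all resources move together
    return group.owner


group = ResourceGroup(
    name="sql-group",
    resources=["virtual IP", "virtual network name", "shared disk"],
    owner="node1",
)
new_owner = fail_over(group, failed_node="node1", healthy_nodes=["node2", "node3"])
print(new_owner, group.resources)
```

The resources themselves are unchanged by the failover; only the `owner` field moves, which mirrors the description of failover as a change in the ownership relationship.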
Three, install the Windows Server Failover Clustering (WSFC) service
Every node server in the cluster must be in the same domain and must have the Windows Server Failover Clustering (WSFC) service installed. The installation is simple and takes only the few steps described in this section.
1. Open Server Manager, select "Add Roles and Features"
2, in the Add Features Wizard, tick "Failover Clustering" and click "Next" to start the installation
3, in the "Confirmation" tab, confirm the selection and click the "Install" button to run the installation
Four, configuring the failover cluster
1. Open Failover Cluster Manager
After installing the Failover Clustering feature, open Server Manager and select "Failover Cluster Manager" in the Tools menu to open Failover Cluster Manager
2, in Failover Cluster Manager, create the cluster
In Failover Cluster Manager, you can manage the failover clusters that have been created, view cluster information, monitor cluster status, and validate (Validate) the cluster configuration. Click "Create Cluster" to start creating a new cluster.
3. Add the cluster node servers
Enter the names of the cluster's node servers; the node servers must be on the same network segment and able to reach each other;
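Before entering the node names, the "same network segment" requirement can be checked quickly with Python's standard `ipaddress` module. This is only a sketch: the addresses and the /24 prefix length are made-up examples, and it does not test whether the nodes can actually reach each other.

```python
import ipaddress


def same_segment(addresses, prefix_len):
    """True when all addresses fall into one subnet of the given prefix length."""
    networks = {
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in addresses
    }
    return len(networks) == 1


# Hypothetical node addresses, for illustration only.
print(same_segment(["192.168.1.10", "192.168.1.11"], 24))  # True
print(same_segment(["192.168.1.10", "192.168.2.11"], 24))  # False
```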
4, validation warning
To validate that the cluster's basic environment, including the hardware, meets the WSFC requirements, choose "Yes". This example chooses "No" and skips validation
5, define "access points for administering the cluster"
Name the cluster. The cluster name is actually the network name of the virtual server, and the cluster's IP address, which the system configures automatically, is actually the IP address of the virtual server;
6, confirm the configuration information, click "Next", create a new cluster
Five, configuring the cluster quorum setting
When a node in a cluster fails, the other nodes continue to provide the service. However, when communication between nodes breaks down, or when most nodes fail, the cluster stops serving. How many failed nodes can the cluster tolerate? This is determined by the quorum configuration (Quorum Configuration), which follows the majority principle: as long as the number of healthy nodes in the cluster reaches the quorum (a majority of the voting nodes), the cluster continues to provide service; otherwise the cluster stops serving. During a service outage, the healthy nodes keep checking whether the failed nodes have recovered, and once the number of healthy nodes is back at the quorum, the cluster returns to normal and resumes service.
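To make the majority principle concrete, the following Python sketch works out how many failures a cluster of a given size tolerates and shows service stopping and resuming as nodes fail and recover. It assumes one vote per voter and a fixed number of voters; features that adjust votes at run time are deliberately ignored, and the function names are hypothetical.

```python
def max_tolerated_failures(total_votes: int) -> int:
    """Largest number of failed voters that still leaves a strict majority."""
    return (total_votes - 1) // 2


def is_serving(healthy_votes: int, total_votes: int) -> bool:
    """The cluster serves only while healthy voters form a strict majority."""
    return healthy_votes > total_votes // 2


# Tolerated failures for clusters with 2 to 7 voters.
tolerated = {n: max_tolerated_failures(n) for n in range(2, 8)}
print(tolerated)  # {2: 0, 3: 1, 4: 1, 5: 2, 6: 2, 7: 3}

# A 5-voter cluster losing nodes one by one, then one node recovering.
timeline = [5, 4, 3, 2, 3]
states = ["serving" if is_serving(h, 5) else "stopped" for h in timeline]
print(states)  # ['serving', 'serving', 'serving', 'stopped', 'serving']
```

Note that going from 3 to 4 voters does not raise the number of tolerated failures, which is one reason an odd number of votes is usually preferred.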
1. Return to Failover Cluster Manager; the successfully created cluster is shown in the "Failover Cluster Manager" drop-down list
2, right-click the cluster node, click "More Actions" in the context menu, and select "Configure Cluster Quorum Settings" in the extended menu to configure quorum for this cluster.
3. Open the wizard to configure cluster quorum
4, select the quorum configuration option: use the default quorum configuration and let the cluster determine the quorum management options
Microsoft recommends configuring a quorum witness to help achieve the highest availability of the cluster. If you are not familiar with quorum configuration, you can use the default option and let the cluster determine the quorum configuration.
5. Confirm the Quorum configuration option and click "Next" to start configuring quorum settings for the cluster
Now that the Windows Server failover cluster is built, you can deploy AlwaysOn high-availability technology on top of WSFC and create availability groups (Availability Group).