Once the Cluster service is installed and running on the server, the server can join the cluster. Clustering can reduce the number of single points of failure and enable high availability of clustered resources. The following sections briefly describe node behavior in cluster creation and cluster operations.
Note: For information about installing a clustered server, see the Help and Deployment guide for the Windows Server 2003 product family.
Creating a Cluster
The server cluster product includes a cluster installation utility that installs the cluster software on a server and creates a new cluster. To create a new cluster, you first run the utility on the computer that you selected as the first member of the cluster. The first step defines the new cluster: it determines the cluster name and creates the cluster database and the initial cluster member list. Windows Server 2003 adds a cluster administration setup wizard and the ability to create a cluster, including from a remote computer, using the Cluster.exe command-line interface.
The second step in creating a cluster is to add the shared data storage devices that will be available to all cluster members. The result is a new cluster with a single node that has its own local data storage devices and the cluster shared resources, typically disk or data storage and connection media resources.
The final step in creating a cluster is to run the installation utility on every other computer that will be a member of the cluster. Whenever a new node is added to a cluster, the new node automatically obtains a copy of the existing cluster database from the original member of the cluster. When a node joins or forms a cluster, the Cluster service updates the node's private copy of the configuration database.
Forming a Cluster
If a server is running the Cluster service and cannot locate any other nodes in the cluster, it can form a cluster by itself. To form a cluster, the node must be able to acquire exclusive ownership of the quorum resource.
When the cluster is initially formed, the first node in the cluster holds the cluster configuration database. Each time a new node joins the cluster, the new node obtains a copy of the cluster configuration database and maintains it locally. The quorum resource stores the most recent version of the configuration database in the form of recovery logs, which contain node-independent cluster configuration and state data.
In a cluster operation, the Cluster service uses the quorum recovery log to perform the following actions:
Ensure that only one group of active, communicating nodes can form a cluster
Allow a node to form a cluster only if it can gain control of the quorum resource
Allow a node to join or remain in an existing cluster only if it can communicate with the node that controls the quorum resource
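The three rules above can be illustrated with a small model. This is only a sketch for intuition, not how the Cluster service is implemented: the QuorumResource and Cluster classes, their methods, and the node names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class QuorumResource:
    """Hypothetical stand-in for the quorum resource: one owner at a time."""
    owner: Optional[str] = None

    def try_acquire(self, node: str) -> bool:
        # Ownership is exclusive: succeed only if the resource is free
        # or is already owned by the requesting node.
        if self.owner is None or self.owner == node:
            self.owner = node
            return True
        return False


@dataclass
class Cluster:
    """Hypothetical model of the membership rules tied to the quorum resource."""
    quorum: QuorumResource
    members: Set[str] = field(default_factory=set)

    def form(self, node: str) -> bool:
        # A node may form a cluster only if it gains control of the quorum resource.
        if self.quorum.try_acquire(node):
            self.members = {node}
            return True
        return False

    def may_join_or_remain(self, reachable: Set[str]) -> bool:
        # A node may join or remain only if it can communicate with
        # the node that controls the quorum resource.
        return self.quorum.owner is not None and self.quorum.owner in reachable
```

Because only one node can own the quorum resource, a second node that loses contact with the first cannot form a competing cluster; its call to form() fails until the resource is released.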
From the perspective of the other nodes in the cluster and of the Cluster service management interfaces, each node in the cluster can be in one of three distinct states when the cluster is formed. These states are recorded by the Event Processor and replicated by the Event Log Manager to the other nodes in the cluster. The Cluster service states are:
Offline. The node is not a fully active cluster member. The node and its Cluster service may or may not be running.
Online. The node is a fully active cluster member. It honors cluster database updates, contributes votes to the quorum algorithm, maintains heartbeats, and can own and run resource groups.
Paused. The node is a fully active cluster member. It honors cluster database updates, contributes votes to the quorum algorithm, and maintains heartbeats, but it cannot accept resource groups. It can only support the resource groups that it currently owns. The paused state is provided to allow certain maintenance to be performed. Most server cluster components treat online and paused as equivalent states.
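The differences between the three states can be summarized as a small capability table in code. The following is a minimal sketch; the NodeState enum and the helper function names are hypothetical and are not part of any Cluster service API.

```python
from enum import Enum


class NodeState(Enum):
    OFFLINE = "offline"
    ONLINE = "online"
    PAUSED = "paused"


def is_active_member(state: NodeState) -> bool:
    """Online and paused nodes are both fully active cluster members."""
    return state in (NodeState.ONLINE, NodeState.PAUSED)


def can_accept_new_groups(state: NodeState) -> bool:
    """Only an online node can take on resource groups it does not already own."""
    return state is NodeState.ONLINE


def can_run_owned_groups(state: NodeState) -> bool:
    """Online and paused nodes keep supporting the groups they currently own."""
    return is_active_member(state)
```

The only function that distinguishes online from paused is can_accept_new_groups, which matches the statement that most components treat the two states as equivalent.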
Joining a Cluster
To join an existing cluster, a server must be running the Cluster service and must successfully locate another node in the cluster. After it finds another node, the joining server must be authenticated for cluster membership and must obtain a copy of the cluster configuration database.
The process of joining an existing cluster begins when the Windows Server 2003 Service Control Manager starts the Cluster service on a node. During startup, the Cluster service configures and mounts the node's local data devices. It does not attempt to bring the shared cluster data devices online, because those devices may be in use by the existing cluster.
To locate other nodes, the node starts a discovery process. When the node discovers any cluster member, it performs an authentication sequence. The existing cluster member authenticates the newcomer and returns a success status if the new server is authenticated. If authentication fails (the joining node is not recognized as a cluster member, or it uses an invalid account password), the request to join the cluster is rejected.
After successful authentication, the first node online in the cluster checks the copy of the configuration database on the joining node. If that copy is out of date, the cluster node that is authenticating the joining server sends it an updated copy of the database. After receiving the replicated database, the node that has just joined the cluster can use it to find shared resources and bring them online as needed.
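The join sequence described above (authenticate, compare database copies, replicate if needed) can be sketched as follows. This is an illustrative model only. The Sponsor and ConfigDatabase types, the join_cluster function, and the use of an integer version number to detect an out-of-date copy are all assumptions made for the example; the source does not say how the Cluster service compares database copies.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class ConfigDatabase:
    version: int            # assumed: a higher number means a newer copy
    data: Dict[str, str]


@dataclass
class Sponsor:
    """Hypothetical existing member that authenticates a joining node."""
    known_members: Set[str]
    valid_password: str
    db: ConfigDatabase

    def authenticate(self, node: str, password: str) -> bool:
        # Reject nodes that are not recognized as members or that
        # present an invalid account password.
        return node in self.known_members and password == self.valid_password


def join_cluster(sponsor: Sponsor, node: str, password: str,
                 local_db: ConfigDatabase) -> ConfigDatabase:
    """Return the database copy the joining node should use after joining."""
    if not sponsor.authenticate(node, password):
        raise PermissionError(f"join request from {node} rejected")
    if local_db.version < sponsor.db.version:
        # The local copy is out of date: receive a replica from the sponsor.
        return ConfigDatabase(sponsor.db.version, dict(sponsor.db.data))
    return local_db
```

Only after join_cluster returns would the joining node use the database to find shared resources and bring them online.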
Leaving a Cluster
A node can leave a cluster when the node is shut down or when the Cluster service is stopped. A node can also be forced to leave (evicted from) the cluster when it fails to perform cluster operations, such as failing to commit an update to the cluster configuration database.
When a node leaves a cluster in a planned way, it sends a ClusterExit message to all other cluster members, notifying them that it is about to leave. The node does not wait for any response; it immediately shuts down its resources and closes all cluster connections. Because the remaining nodes received the exit message, they do not perform the regroup process that re-establishes cluster membership. That process is needed only when a node fails unexpectedly or when network communication stops.
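The difference between a planned departure and an unexpected failure can be sketched as two event handlers on a remaining node. This is a simplified illustration: the Membership class, its method names, and the event strings are hypothetical, and the real regroup process involves more than removing a name from a set.

```python
from typing import List, Set


class Membership:
    """Hypothetical view of cluster membership held by a remaining node."""

    def __init__(self, members: Set[str]) -> None:
        self.members = set(members)
        self.events: List[str] = []

    def on_cluster_exit(self, node: str) -> None:
        # Planned departure: the node announced that it is leaving,
        # so it is simply dropped and no regroup is needed.
        self.members.discard(node)
        self.events.append(f"exit:{node}")

    def on_heartbeat_lost(self, node: str) -> None:
        # Unplanned departure: no exit message arrived, so the remaining
        # nodes must regroup to re-establish cluster membership.
        self.events.append(f"regroup:{node}")
        self.members.discard(node)
```

In this model, a regroup event is recorded only on the failure path, which mirrors why an announced exit is cheaper for the remaining nodes than a silent failure.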