When an administrator creates a Database
Availability Group [DAG], it is initially created as an empty object in
Active Directory [AD]. This object is used to store relevant information
about the DAG, such as server membership information. When the first
server is added to the DAG, a failover cluster is automatically created
for the DAG and used exclusively by the DAG. DAGs make limited use of
Windows failover clustering technology, such as the cluster heartbeat,
cluster networks and the cluster database (for storing data like
database state changes from active to passive or vice versa, or from
mounted to dismounted and vice versa). As each subsequent server
is added to the DAG, it is joined to the underlying cluster, the
cluster's quorum model is automatically adjusted by Exchange, and the
server is added to the DAG object in AD.
Failover clusters use the concept of
quorum, which uses a consensus of voters to ensure that only one subset
of the cluster members (which could be all members or a majority of
members) is functioning at one time. Highly available Mailbox servers in
previous versions of Exchange also used failover clustering and its
concept of quorum, so this is not a new concept. Quorum represents a
shared view of members and resources, and the term quorum is
also used to describe the physical data that represents the
configuration within the cluster that is shared between all cluster
members. As a result, all DAGs require their underlying failover cluster
to have quorum. If the cluster loses quorum, all DAG operations
terminate and all mounted databases hosted in the DAG are dismounted.
Quorum is important to 1) ensure consistency so each of the members always has a view of the cluster that is consistent with the other members; and 2)
to act as a tie-breaker to avoid partitioning (such as split brain
syndrome scenarios) and to make sure that only one collection of the
members in the DAG is considered official.
Majority Node Set Clustering
Majority Node Set [MNS] is a Windows
Clustering model used since early versions of Exchange. This model
requires more than 50% of the voters (servers and/or one file share
witness) to be up and running.
DAGs with an even number of members use the failover cluster's Node and File Share Majority
quorum mode, which uses an external witness server that acts as a
tie-breaker. In this quorum mode, each DAG member gets a vote. In
addition, the witness server is used to provide one DAG member with a
weighted vote. The cluster quorum data is stored by default on the
system disk of each member of the DAG and is kept consistent across
those disks. A file on the witness server (thus the name File Share)
is used to keep track of which member has the most updated copy of the
data - the witness server does not have a copy of the cluster quorum
data.
In this mode, a majority of the voters must
be operational and able to communicate with each other to maintain
quorum. If a majority of the voters cannot communicate with each other,
the DAG's underlying cluster loses quorum and the DAG will require
administrator intervention to become operational again. When the witness
server is needed for quorum, any member of the DAG that can communicate
with the witness server can place a Server Message Block [SMB] lock on
the witness server's witness.log file. The DAG member that
locks the witness server (the locking node) retains an additional vote
for quorum purposes. The DAG members in contact with the locking node
are in the majority and maintain quorum. Any DAG members that cannot
contact the locking node are in the minority and therefore lose quorum.
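The tie-breaking effect of the witness lock can be illustrated with a small model. This is only a sketch and not Exchange or failover cluster code: the function name `partition_has_quorum` and the server names are hypothetical, and it assumes that every DAG member carries one vote and that the locking node carries one additional vote.

```python
def partition_has_quorum(partition, all_members, locking_node=None):
    """Return True if a group of members that can still talk to each other
    holds a strict majority of all votes.

    partition    -- set of members that can communicate with each other
    all_members  -- set of every DAG member (assumed: one vote each)
    locking_node -- member holding the lock on the witness (assumed: one
                    extra vote), or None when no witness vote is in play
    """
    total_votes = len(all_members) + (1 if locking_node is not None else 0)
    votes = len(partition)
    if locking_node is not None and locking_node in partition:
        votes += 1  # the witness vote follows the locking node
    return votes * 2 > total_votes


# Hypothetical four-member DAG split evenly by a network failure.
members = {"MBX1", "MBX2", "MBX3", "MBX4"}
side_a = {"MBX1", "MBX2"}  # MBX1 managed to lock the witness
side_b = {"MBX3", "MBX4"}

print(partition_has_quorum(side_a, members, locking_node="MBX1"))  # True
print(partition_has_quorum(side_b, members, locking_node="MBX1"))  # False
```

In this model the side that contains the locking node holds three of the five votes and keeps quorum, while the other side holds two and loses it.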
Consider a DAG with four members. Because
this DAG has an even number of members, an external witness server is
used to provide one of the cluster members with a fifth, tie-breaking
vote. To maintain a majority of voters (and therefore quorum), at least
three voters must be able to communicate with each other. At any time, a
maximum of two voters can be offline without disrupting service and
data access. If three or more voters are offline, the DAG loses quorum
and all databases are dismounted.
Figure 1.1: Database Availability Group with an Even Number of Members
The following formula helps administrators
calculate how many voters must remain available for the
cluster to stay online: (n / 2) + 1, where n is the
number of voters (the DAG members plus the witness server, when one is
used) and n / 2 is always rounded
down. So, in this example, we have: (5 / 2) + 1 = 2 + 1 = 3.
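The formula is easy to check in a few lines of code. The sketch below assumes that n counts every voter, as the worked example with five voters does; the function names `required_votes` and `tolerated_failures` are made up for this illustration.

```python
def required_votes(voters: int) -> int:
    """Minimum number of voters that must stay online: (n / 2) + 1,
    with n / 2 rounded down."""
    if voters < 1:
        raise ValueError("a cluster needs at least one voter")
    return voters // 2 + 1


def tolerated_failures(voters: int) -> int:
    """Number of voters that can be offline before quorum is lost."""
    return voters - required_votes(voters)


# Four DAG members plus the witness vote give five voters.
print(required_votes(5))      # 3
print(tolerated_failures(5))  # 2
```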
DAGs with an odd number of members use the failover cluster's Node Majority
quorum mode. In this mode, each member gets a vote and each member's
local system disk is used to store the cluster quorum data. If the
configuration of the DAG changes, that change is reflected across the
different disks. The change is only considered to have been committed
and made persistent if that change is made to the disks on half the
members (rounding down) plus one. For example, in a three-member DAG,
the change must be made on one plus one members, or two members in
total. In this scenario, and using the formula above, only one server
can be down at one time. If a second server is also offline, the entire
cluster will be brought offline.
Figure 1.2: Database Availability Group with an Odd Number of Members
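To compare the two quorum modes side by side, the rules described so far can be combined into one small helper. This is a sketch under the assumptions stated in the text (an even number of members adds one witness vote, an odd number adds none); the function name `dag_quorum_summary` and the dictionary keys are hypothetical.

```python
def dag_quorum_summary(members: int) -> dict:
    """Summarise the quorum mode and vote arithmetic for a DAG of a given size."""
    if members < 1:
        raise ValueError("a DAG needs at least one member")
    uses_witness = members % 2 == 0            # even member count: witness breaks ties
    voters = members + (1 if uses_witness else 0)
    required = voters // 2 + 1                 # half the voters, rounded down, plus one
    return {
        "members": members,
        "mode": "Node and File Share Majority" if uses_witness else "Node Majority",
        "voters": voters,
        "required": required,
        "tolerated_failures": voters - required,
    }


if __name__ == "__main__":
    for size in range(2, 7):
        s = dag_quorum_summary(size)
        print(f"{s['members']} members: {s['mode']}, {s['voters']} voters, "
              f"{s['required']} required, {s['tolerated_failures']} may fail")
```

For the two examples in this article, the helper reports two tolerated failures for the four-member DAG and one for the three-member DAG.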
Windows Server 2012
Windows Server 2012 introduced a new model called Failover Clustering Dynamic Quorum, which we can use with Exchange. When using Dynamic Quorum,
the cluster dynamically manages the vote assignment to nodes based on
the state of each node. When a node shuts down or crashes, it loses its
quorum vote. When a node successfully re-joins the cluster, it regains
its quorum vote. By dynamically adjusting the assignment of quorum
votes, the cluster can increase or decrease the number of quorum votes
that are required to keep it running. This enables the cluster to
maintain availability during sequential node failures or shutdowns.
With a dynamic quorum, the cluster quorum
majority is determined by the set of nodes that are active members of
the cluster at any time. This is an important distinction from the
cluster quorum in Windows Server 2008 R2 where the quorum majority is
fixed, based on the initial cluster configuration.
Important:
The advantage of this is that a cluster can now keep running even if fewer than 50% of its nodes remain. By dynamically adjusting the quorum majority requirement, the cluster can sustain sequential node shutdowns down to a single node and still keep running. It cannot, however, sustain a simultaneous failure of a majority of voting members. To continue running, the cluster must always have a quorum majority at the time of a node shutdown or failure.
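The difference between sequential and simultaneous failures can be shown with a deliberately simplified model. The sketch below is not how the cluster service is implemented: it assumes one vote per active node, no witness, and a strict-majority check after every event, and the function name `survives` is hypothetical. It does not try to reproduce the final step from two nodes to one, where a strict majority of two votes is impossible and the real cluster relies on vote adjustments that are beyond this sketch.

```python
def survives(initial_nodes, events, dynamic):
    """Return True if the cluster keeps quorum through every event.

    initial_nodes -- number of nodes (and votes) in the original cluster
    events        -- list of ints: how many nodes are lost at each step
    dynamic       -- True: the majority is recalculated from the nodes active
                     before each event; False: it stays based on initial_nodes
    """
    active = initial_nodes
    for lost in events:
        if lost > active:
            raise ValueError("cannot lose more nodes than are active")
        votes = active if dynamic else initial_nodes
        active -= lost
        if active * 2 <= votes:  # survivors are not a strict majority
            return False
    return True


# Five nodes, three of them shut down one after another.
print(survives(5, [1, 1, 1], dynamic=False))  # False: 2 of 5 is not a majority
print(survives(5, [1, 1, 1], dynamic=True))   # True: the majority shrinks each step

# Five nodes, three of them failing at the same moment.
print(survives(5, [3], dynamic=True))         # False: 2 of 5 at the time of failure
```

In the dynamic case the cluster is still running with two of the original five nodes, which is less than 50%, while the simultaneous loss of three nodes brings it down in both models.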
The cluster-assigned dynamic vote of a node can be verified with the DynamicWeight property of the cluster node by using the Get-ClusterNode
cmdlet. A value of 0 indicates that the node does not have a quorum
vote, while a value of 1 indicates that the node has a quorum vote:
Figure 1.3: Dynamic Weight Property of a Dynamic Quorum
To change the quorum configuration in a failover cluster by using the Failover Cluster Manager, follow these steps:
- In Failover Cluster Manager, select the cluster that you want to change;
- With the cluster selected, under Actions, click More Actions, and then click Configure Cluster Quorum Settings:
Figure 1.4: Configure Cluster Quorum Settings Option
- The Configure Cluster Quorum Wizard appears. Click Next:
Figure 1.5: Configure Cluster Quorum Wizard
- On the Select Quorum Configuration Option page, the default is to allow the cluster to automatically configure the quorum settings that are optimal for our current cluster configuration (Use typical settings). To configure quorum management settings and to add or change the quorum witness, click Advanced quorum configuration and witness selection and then click Next:
Figure 1.6: Select Quorum Configuration Option
- On the Select Voting Configuration page, select All Nodes and click Next. For certain scenarios, you might want to assign votes only to a subset of the nodes or even to No Nodes. This is generally not recommended, because it does not allow nodes to participate in quorum voting and it requires configuring a disk witness which becomes the single point of failure for the cluster.
Figure 1.7: Select Voting Configuration
- On the Configure Quorum Management page, you can enable or disable the Allow cluster to dynamically manage the assignment of node votes option. Selecting this option enables dynamic quorum which increases the availability of the cluster by allowing it to continue running in failure scenarios that are not possible when this option is disabled. This option is enabled by default and it is strongly recommended not to disable it:
Figure 1.8: Configure Quorum Management
- On the Select Quorum Witness page, select an option to configure a disk witness or a file share witness. The wizard indicates the witness selection options that are recommended for our cluster. In this case, because the current DAG has an odd number of members, no witness is required:
Figure 1.9: Select Quorum Witness
- Click Next. Confirm your selections on the confirmation page that appears and then click Next:
Figure 1.10: Confirmation
After the wizard runs and the Summary page appears, if you want to view a report of the tasks that the wizard performed, click View Report. The most recent report will remain in the systemroot\Cluster\Reports folder with the name QuorumConfiguration.mht.
You can also use the Shell to check if dynamic quorum is being used by running the following cmdlet:
Figure 1.11: Checking Dynamic Quorum Configuration
To enable or disable dynamic quorum through the Shell, simply set the DynamicQuorum property to 1 (enabled) or to 0 (disabled) by running:
(Get-Cluster "cluster_name").DynamicQuorum = 0
Conclusion
In the first part of this article series,
we took a high-level look at the importance of quorum in a Windows
cluster and how it affects Database Availability Groups. We also looked
at the advantages of the new Dynamic Quorum in Windows Server 2012.