Friday, 20 March 2015

Transport High Availability in Exchange 2013 (Part 2)

Configuring Shadow Redundancy

Shadow redundancy is a global setting which is enabled by default (it is not possible to enable or disable Shadow Redundancy on a per-server basis). To disable it, we use the Set-TransportConfig cmdlet to set the ShadowRedundancyEnabled parameter to $False:

Image
Figure 2.1: Disabling Shadow Redundancy
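In text form, the command can be sketched as follows. This is a minimal example that assumes an Exchange Management Shell session opened with an account allowed to change organization-wide transport settings; the Get-TransportConfig line is only there to confirm the result.

```powershell
# Disable Shadow Redundancy for the whole organization (it cannot be set per server).
Set-TransportConfig -ShadowRedundancyEnabled $false

# Confirm the current value.
Get-TransportConfig | Format-List ShadowRedundancyEnabled
```

Setting the parameter back to $true re-enables the feature.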
When Exchange tries to create a redundant copy of an e-mail, the SMTP connection between the primary and shadow servers, or the one between the sending and primary servers, may time out. If this happens, or if for some other reason a Mailbox server is unable to generate a redundant copy, the e-mail is not rejected by default. However, if generating a redundant copy of every e-mail entering the organization is paramount, Exchange can be configured to reject an e-mail when no redundant copy of it can be generated. This is done by using the Set-TransportConfig cmdlet to set the RejectMessageOnShadowFailure parameter to $True:
Image
Figure 2.2: Enabling Reject Message On Shadow Failure
In this scenario, the e-mail is rejected with a “451 4.4.0 Message failed to be made redundant” SMTP response. Because this is a transient (4xx) error, the sending server can retry the transmission later.
Because Shadow Redundancy is unable to protect e-mails in a single-server environment, this setting should only be enabled when strictly necessary and in an environment with multiple Mailbox servers available.
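As a rough sketch, enabling the rejection behavior and reviewing it alongside the global Shadow Redundancy switch could look like this (again assuming an Exchange Management Shell session with rights to change the transport configuration):

```powershell
# Reject inbound e-mails when no shadow copy can be created.
# Only sensible with Shadow Redundancy enabled and more than one Mailbox server.
Set-TransportConfig -RejectMessageOnShadowFailure $true

# Review both related settings together.
Get-TransportConfig | Format-List ShadowRedundancyEnabled, RejectMessageOnShadowFailure
```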
If a shadow copy of the e-mail gets created but the SMTP session between the sending and primary servers times out, the e-mail is accepted and processed by the primary server, but the sending server will try to redeliver the unacknowledged e-mail. Exchange’s duplicate e-mail detection kicks in and prevents the recipient from receiving two identical e-mails, even though the primary server generates another shadow copy of the e-mail upon resubmission.
The following Set-TransportConfig parameters control how shadow e-mails are created:
  • ShadowMessagePreferenceSetting controls where a shadow copy of an e-mail gets generated. It can be set to PreferRemote (default value) to try a different AD site first, LocalOnly to only create a copy in the local AD site, or RemoteOnly to only create a copy in a different AD site. This parameter is used only for members of a DAG spread across multiple AD sites;
  • MaxRetriesForRemoteSiteShadow specifies how many times (4 by default) a Mailbox server attempts to generate a shadow copy of an e-mail in a remote AD site. With ShadowMessagePreferenceSetting = RemoteOnly, these are the only attempts made; with PreferRemote, the server falls back to the local AD site once they are exhausted. If no shadow copy can be created, the e-mail will either be rejected or accepted depending on the setting of RejectMessageOnShadowFailure;
  • MaxRetriesForLocalSiteShadow specifies how many times (2 by default) a Mailbox server attempts to generate a shadow copy of an e-mail in the local AD site. This parameter is used when ShadowMessagePreferenceSetting = PreferRemote or LocalOnly, when the Mailbox server is not a DAG member, or when its DAG is in a single AD site. If no shadow copy can be created, the e-mail will either be rejected or accepted depending on the setting of RejectMessageOnShadowFailure.
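These three parameters can be combined in a single Set-TransportConfig call. The sketch below uses illustrative values only (LocalOnly and three local retries are not recommendations), chosen to show the syntax:

```powershell
# Illustrative values only: keep shadow copies in the local AD site and
# allow three local attempts instead of the default two.
Set-TransportConfig -ShadowMessagePreferenceSetting LocalOnly -MaxRetriesForLocalSiteShadow 3

# Inspect the settings that govern how shadow copies are created.
Get-TransportConfig | Format-List ShadowMessagePreferenceSetting, MaxRetriesForRemoteSiteShadow, MaxRetriesForLocalSiteShadow, RejectMessageOnShadowFailure
```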
Both Receive and Send connectors have a ConnectionInactivityTimeOut setting that can be used to specify for how long an SMTP connection can remain idle before it is closed. Receive connectors have an additional parameter, ConnectionTimeout, to specify for how long an SMTP connection can remain open before it is closed even if data is still being transmitted.
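As a sketch of how these timeouts can be inspected and changed, the commands below first list the current values and then adjust them. The connector names ("MBX01\Default MBX01" and "Internet Mail") and the timeout values are hypothetical and only illustrate the syntax.

```powershell
# List the current timeouts on all Receive and Send connectors.
Get-ReceiveConnector | Format-Table Identity, ConnectionTimeout, ConnectionInactivityTimeout
Get-SendConnector | Format-Table Identity, ConnectionInactivityTimeOut

# Hypothetical connector names and illustrative values (hh:mm:ss).
Set-ReceiveConnector -Identity "MBX01\Default MBX01" -ConnectionTimeout 00:15:00 -ConnectionInactivityTimeout 00:05:00
Set-SendConnector -Identity "Internet Mail" -ConnectionInactivityTimeOut 00:10:00
```

On a Receive connector, ConnectionTimeout has to be greater than ConnectionInactivityTimeout, so the two values should be changed with that relationship in mind.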

Maintaining Shadow E-mails

The work of Shadow Redundancy does not stop when a shadow e-mail is created, as the primary and shadow servers have to remain in contact so they can track the status and progress of the e-mail. When the primary server transmits the e-mail successfully to the next hop, and the next hop acknowledges receiving it, the primary server updates the discard status of the e-mail to delivery complete. This discard status is simply a message containing a list of e-mails being monitored by Shadow Redundancy.
When an e-mail is delivered successfully, there is no need to keep it in a shadow queue. Therefore, when the shadow server realizes the e-mail was successfully transmitted to the next hop by the primary server, it moves the shadow e-mail into Safety Net.
To determine the discard status of the shadow e-mails in its own shadow queues, a shadow server queries the primary server by issuing an XQDISCARD command. The primary server then responds with the discard notifications for the e-mails that apply to that shadow server. Discard notifications are stored on disk rather than in memory, so they persist if the server or the Exchange Transport service restarts.
This communication between the primary and shadow servers is also used as a heartbeat to determine the servers’ availability. If the shadow server is unable to establish a session with the primary server after a certain amount of time (3 hours by default, configurable using the ShadowResubmitTimeSpan parameter), then the shadow server will promote itself to primary server, promote the shadow e-mails to primary e-mails and transmit them to the next hop.
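One way to see the shadow e-mails a server is currently holding is to look at its shadow queues. The following is a minimal sketch; MBX01 is a hypothetical server name, and the client-side filter on DeliveryType is just one possible way to pick out shadow queues.

```powershell
# MBX01 is a hypothetical Mailbox server name.
# Show only the shadow queues held by that server.
Get-Queue -Server MBX01 |
    Where-Object { $_.DeliveryType -eq "ShadowRedundancy" } |
    Format-Table Identity, MessageCount, NextHopDomain
```

For a shadow queue, NextHopDomain should identify the primary server that the shadow e-mails are being held for.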
Several parameters of the Set-TransportConfig cmdlet are used to control how shadow e-mails are maintained:
  • ShadowHeartbeatFrequency specifies how long (2 minutes by default) a shadow server waits before checking for discard status by establishing a session with the primary server;
  • ShadowResubmitTimeSpan specifies how long (3 hours by default) a server waits before determining a primary server failed and assuming control of shadow e-mails in the shadow queue for the unreachable primary server;
  • ShadowMessageAutoDiscardInterval specifies how long (2 days by default) to retain discard events of e-mails successfully delivered. A primary server queues discard events until the shadow server queries it. If it is not queried during the time specified in this parameter, the queued discard events are deleted by the primary server;
  • SafetyNetHoldTime specifies how long (2 days by default) successfully processed e-mails are kept in Safety Net. Shadow e-mails that are not acknowledged expire from Safety Net after SafetyNetHoldTime + MessageExpirationTimeout.
MessageExpirationTimeout, a per-server setting configured with the Set-TransportService cmdlet (and displayed by Get-TransportService), specifies how long (2 days by default) e-mails can remain in a queue before expiring.
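The maintenance settings above can be reviewed and set as sketched below. The values passed to Set-TransportConfig are simply the defaults described in this article, written in d.hh:mm:ss format to illustrate the syntax.

```powershell
# Illustrative: these are the default values described above (d.hh:mm:ss).
Set-TransportConfig -ShadowHeartbeatFrequency 00:02:00 -ShadowResubmitTimeSpan 03:00:00 -ShadowMessageAutoDiscardInterval 2.00:00:00 -SafetyNetHoldTime 2.00:00:00

# Organization-wide shadow maintenance settings.
Get-TransportConfig | Format-List ShadowHeartbeatFrequency, ShadowResubmitTimeSpan, ShadowMessageAutoDiscardInterval, SafetyNetHoldTime

# Per-server queue expiration.
Get-TransportService | Format-List Name, MessageExpirationTimeout
```

With the defaults, unacknowledged shadow e-mails therefore expire from Safety Net after 2 days + 2 days = 4 days.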

Shadow Redundancy After an Outage

When a server outage occurs, Shadow Redundancy will minimize e-mail loss. When the failed Mailbox server is brought back online, two scenarios are possible:
  • The server has a new transport database and, therefore, it is recognized by other transport servers in the organization as being a new route. As such, servers with shadow e-mails queued for the recovered server assume ownership of those e-mails and resubmit them. The e-mails are then delivered to their destinations. E-mails are delayed for a maximum amount of time specified by ShadowHeartbeatFrequency;
  • The server has the same transport database but was offline long enough for the shadow server to take ownership of the e-mails and resubmit them. When the server is brought back online, it delivers the e-mails in its queues, resulting in duplicate delivery as the shadow server had already delivered these e-mails. However, duplicate e-mail detection prevents recipients from receiving duplicate e-mails. E-mails are delayed for a maximum amount of time specified by ShadowResubmitTimeSpan.

Conclusion

In this second article we finished looking at how Shadow Redundancy works to ensure no e-mail is lost while in transit. In the next article we will start exploring the new version of Transport Dumpster, Safety Net.
