Tuesday, October 16, 2007

Mail Delivery Slow: Messages Waiting to be Routed Queue Filled up

This issue struck us two weeks ago and kept us busy for quite a while. We suddenly found that sent emails were arriving after long delays. Looking at the Exchange Server, I saw thousands of messages in the Messages Waiting to be Routed queue for the Default SMTP Virtual Server; the number kept increasing, and messages were going out very slowly.

Slow mail delivery and mass queuing of mail in the Messages Waiting to be Routed queue in Exchange is typically caused by either Anti-Virus software on the Exchange server, by Distribution List expansion problems, or by connectivity problems between Exchange and Active Directory.

Antivirus was ruled out early: we tried disabling it, including from the registry, but with no effect.

I opened a case with MS PSS and turned up diagnostics logging for MSExchangeTransport and MSExchangeDSAccess, but the logs showed nothing conclusive.

We monitored the LDAP read and search times and the SMTP categorizer queue length, as it looked like a performance issue. Here is an excellent MS guide for Troubleshooting Exchange Server Performance Issues.

Then we collected the Hang Dumps for Store.exe and Inetinfo.exe.

In the Store dump we saw that we were waiting on LDAP calls to the GCs.
From the Inetinfo dumps, we were waiting in HrCheckRestrictions, which means that the mail was probably addressed to a DL that had restrictions placed on it.

So we figured that delivery restrictions might be the cause of the slow mail delivery, as we had applied delivery restrictions to some Distribution Lists quite recently.

This problem occurs when lots of Lightweight Directory Access Protocol (LDAP) searches are initiated. Lots of LDAP searches are initiated when you send mail to distribution groups that include lots of users who have delivery restrictions configured on their mailboxes.

If you send a message to a group that contains many recipients, and if each of those recipients is also configured with a delivery restriction to reject messages from the members of a distribution group that contains many members, Exchange 2000/2003 Server must expand the restricted distribution group one time for each member of the group to which you sent the message. Also, if a failure that can be retried occurs during this process, Exchange Server stops the group expansion process, and then retries the connection an hour later. This causes the messages to be held in the categorizer queues, delays message processing, puts load on transports' Advanced Queuing and SMTP components, and eventually causes system queues to start backing up.
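To get a feel for why this blows up, here is a rough back-of-the-envelope sketch in Python. It is a simplified model, not a measurement of what Exchange actually does: it assumes one full expansion of the restricted group per recipient, and that each expansion touches every member once. The function name and the sample numbers are made up for illustration; real LDAP query counts depend on paging and caching.

```python
def restriction_lookups(recipients: int, restricted_group_size: int) -> int:
    """Rough count of member lookups under the old expansion logic.

    Assumption (simplified model): the restricted group is expanded once
    per recipient, and each expansion touches every member of that group.
    """
    return recipients * restricted_group_size


if __name__ == "__main__":
    # Hypothetical numbers: a 500-member DL whose members each reject mail
    # from the members of a 2000-member group.
    print(restriction_lookups(500, 2000))
```

Even with modest group sizes the product grows quickly, which fits the LDAP waits we saw in the dumps.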

Here is the excellent MS Exchange Team Blog post on performance issues due to connector restrictions.

The solution was the hotfix (already included in Exchange SP2) plus a change to one registry entry that enables the new expansion logic for restricted Distribution Groups.

Here are the details:

1. Locate and then click the following registry subkey:
2. On the Edit menu, point to New, and then click Key.
3. Type Parameters, and then press ENTER to name the new registry subkey.
4. Right-click Parameters, point to New, and then click DWORD Value.
5. Type RestrictionMethod, and then press ENTER to name the new registry entry.
6. Right-click RestrictionMethod, and then click Modify.
7. Type 2, and then press ENTER.
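
If you have to apply this on several servers, the steps above could be scripted instead of clicked through. Below is a minimal sketch in Python that only builds the text of a .reg file; it does not touch the registry itself. The parent subkey is deliberately left as a placeholder because it has to be taken from the KB article referenced below. The function name build_reg_file and the placeholder path are my own and not from Microsoft.

```python
def build_reg_file(parent_key: str, value: int = 2) -> str:
    """Return .reg file text that creates <parent_key>\\Parameters and
    sets the DWORD value RestrictionMethod under it."""
    key = parent_key.rstrip("\\") + "\\Parameters"
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        # DWORD values in .reg files are written as 8 hex digits.
        f'"RestrictionMethod"=dword:{value:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Placeholder only: substitute the subkey given in KB 895407.
    parent = r"HKEY_LOCAL_MACHINE\<subkey from KB 895407>"
    print(build_reg_file(parent))
```

Save the output to a .reg file, review it, and import it on a test server first.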

Reference: http://support.microsoft.com/default.aspx?scid=kb;EN-US;895407