6
votes

I have multiple existing applications that send and receive messages using MSMQ via the System.Messaging API. The queues are generally non-transactional and are a mixture of MSMQ 3 and 4.

The receiving applications currently handle poison messages in a limited way: on the first occurrence of any exception, the message is put on an error queue for manual intervention. It turns out that the vast majority of manual intervention consists of simply moving the message back to the main queue for another try, at which point it succeeds. To automate that process, I want to add a retry feature to the receiver so that messages are moved back to the main queue a given number of times, with a given delay between attempts.

Rather than reinventing the wheel, I want to leverage anything I can that MSMQ provides out of the box, as well as any popular or best-practice patterns around this. To that end, there is a lot written about the additional support for poison messages in MSMQ 4, but those features don't appear to be easily accessible via .NET. Furthermore, the only references I can find to using them are via WCF with an MSMQ binding.

Can anyone suggest any patterns or point to any examples that implement retry if one is not using WCF?


2 Answers

7
votes

I wasn't able to find a single, popular pattern for doing this. But after poking around a bit in System.Messaging, I was able to leverage Message properties and MSMQ behavior in what I think was an appropriate way to get the job done with a minimum of moving parts.

Here is what I implemented. It turned out to be fairly simple and lightweight - not much code and easy to maintain:

I created an object called RetryLevel that has three properties:

int Order, int NumberOfRetries, TimeSpan Delay

The configuration of the receiver application now has a list of RetryLevel. So the new feature basically supports n-level retries.

Then I created an object called RetryInfo. This object has two properties:

int Attempts, string SourceQueuePath

An instance of the RetryInfo object is serialized and stored in the Extension property of each Message that ends up being retried. This allows me to track the current retry state on the message itself, thus eliminating the need to maintain a separate retry metadata store and all the overhead of reconciling message Ids, keeping the data in sync, etc.
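The round trip through the Extension property can be sketched as follows. This is a minimal, language-neutral model in Python rather than System.Messaging code: the JSON encoding, the function names and the queue path are assumptions made for illustration, and the answer does not say which serializer was actually used. The only fact relied on is that Extension holds a byte array.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class RetryInfo:
    attempts: int = 0            # rejections recorded so far
    source_queue_path: str = ""  # queue the message should return to


def to_extension(info: RetryInfo) -> bytes:
    """Encode retry state as bytes suitable for a byte-array message field."""
    return json.dumps(asdict(info)).encode("utf-8")


def from_extension(raw: bytes, default_source: str) -> RetryInfo:
    """Decode retry state; an empty field means the message was never retried."""
    if not raw:
        return RetryInfo(attempts=0, source_queue_path=default_source)
    return RetryInfo(**json.loads(raw.decode("utf-8")))


# Hypothetical queue path, used only for this example.
info = from_extension(b"", r".\private$\orders")
info.attempts += 1
restored = from_extension(to_extension(info), r".\private$\orders")
```

Because the state travels with the message, a receiver that sees an empty Extension can treat the message as a first attempt without consulting any external store.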

Finally, I added a wait queue path to the receiver's configuration. This queue is where messages will be dropped while they are in "timeout".

So now, when a message handler rejects a message, the receiver deserializes its RetryInfo, if there is one, and looks at the number of (previous) Attempts to determine which of the configured RetryLevels it has reached.
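One way to map Attempts to a RetryLevel is sketched below in Python as a language-neutral model. It assumes the levels are consumed cumulatively in Order, so a level with NumberOfRetries = 3 covers the first three retries and the next level starts after that. The answer does not spell out this rule, so treat it, along with the function name and the sample values, as an assumption.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional


@dataclass(frozen=True)
class RetryLevel:
    order: int
    number_of_retries: int
    delay: timedelta


def level_for(levels: List[RetryLevel], attempts: int) -> Optional[RetryLevel]:
    """Return the level covering the next retry, or None when all are used up."""
    remaining = attempts
    for level in sorted(levels, key=lambda l: l.order):
        if remaining < level.number_of_retries:
            return level
        remaining -= level.number_of_retries
    return None


# Sample configuration (assumed values): 3 quick retries, then 2 slow ones.
LEVELS = [
    RetryLevel(order=2, number_of_retries=2, delay=timedelta(minutes=5)),
    RetryLevel(order=1, number_of_retries=3, delay=timedelta(seconds=10)),
]
```

With this configuration, a message with 0 to 2 previous attempts gets the 10-second delay, one with 3 or 4 gets the 5-minute delay, and one with 5 or more has no level left.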

The receiver then sets the Message's TimeToBeReceived (TTBR) property to the Delay value of the appropriate RetryLevel (TTBR is a TimeSpan measured from the moment the message is sent, not an absolute DateTime). It then sets the AdministrationQueue property to a queue created from the RetryInfo's SourceQueuePath property and sets the Message's AcknowledgeType to AcknowledgeTypes.NegativeReceive. Finally, it puts the Message on the wait queue.

From here, MSMQ watches the Message's TTBR. When it times out, MSMQ puts the message back on the queue named in its AdministrationQueue property, which is the queue the message originally came from. Should the message continue to be rejected by handlers, it simply works its way up the RetryLevels.

If a Message's Attempts exceeds the total NumberOfRetries across all configured RetryLevels, the message's TTBR property is set to TimeSpan.Zero, the UseDeadLetterQueue property is set to true, and the message is put on the wait queue just like any other retry. This time, however, it times out immediately and MSMQ ships it to the system dead-letter queue (DLQ) on the machine hosting the wait queue, where it can be dealt with manually.
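The two outcomes described above (another timed wait, or dead-lettering) can be summarized as a small decision function. This is a Python model only: the dictionary keys mirror the System.Messaging.Message property names, but the function itself, its name and its return shape are hypothetical, and the actual send to the wait queue is not shown.

```python
from datetime import timedelta
from typing import Any, Dict, Optional


def wait_queue_settings(delay: Optional[timedelta],
                        source_queue_path: str) -> Dict[str, Any]:
    """Map a retry decision to the message properties described above.

    delay is the Delay of the matching RetryLevel, or None when the
    configured retries are exhausted.
    """
    if delay is None:
        # Retries exhausted: expire at once and let MSMQ dead-letter the message.
        return {"TimeToBeReceived": timedelta(0), "UseDeadLetterQueue": True}
    # Still retrying: expire after the delay and report back to the source queue.
    return {
        "TimeToBeReceived": delay,
        "AdministrationQueue": source_queue_path,
        "AcknowledgeType": "NegativeReceive",
    }
```

Keeping the decision separate from the queue I/O makes the retry policy easy to unit test without an MSMQ installation.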

0
votes

Since you say you do not want to reinvent the wheel, I suggest using one of the many available frameworks, such as MassTransit.

Personally, I have had good experience with NServiceBus, which sits on top of MSMQ.

It makes configuring error handling pretty easy. You define the actual working queue for your application, and you can additionally define a dedicated error queue. You also configure how many times the application will retry, automatically and transparently to your code, before it moves the poison message into the queue of your choice.

This allows you to easily configure something like: "If I cannot process this message correctly within 5 retries, then move it into my error queue."

This functionality is available out of the box.

Example configuration (from official documentation), note the app.config of the Publisher part:

Basic Publisher/Subscriber configuration from official documentation http://images.nservicebus.com/basic_pubsub.png

As for the non-WCF part, NServiceBus works just as well for any .NET application. You can simply reference the DLLs, use NServiceBus as a container to host your application as a local service, or register it permanently as a Windows service. And if you change your mind one day and switch to WCF, that is supported, too. :-)