First Some info about the setup:
- We are running NServiceBus v4.6.3
- Distributors are running on clustered machines with clustered queues.
- The TimeoutManager is disabled on the workers and enabled on the distributor.
What we are experiencing
- When a Saga-timeout is handled, the number of messages in the storage queue is increased by one.
- When a message is deferred, the number of messages in the storage queue is increased by one.
What do we know
The timeout message is delivered directly to a specific local queue (on the server which requested the timeout) and not to the clustered queue. This bypasses the regular path which involves removing a message from the storage queue and deliver the destination queue on the worker. The worker does not know this and reports back that it is ready to handle a new message after handling the timeout message.
The reason for this behavior is that the destination-queue stored in the TimeoutData is set to be the local queue on the worker and not the clustered queue. Here is an example:
{
"Destination": {
"Queue": "DemoService",
"Machine": "STAGESVC04"
},
"SagaId": "40bf1976-dab0-48a3-8e20-a39d00b59bb9",
"State": "PD94bWwgdmVyc2lvbj0iMS4wIiA/Pg0KPE1lc3NhZ2VzIHhtbG5zOnhzaT0iaHR0cDovL3d3dy53My5vcmcvMjAwMS9YTUxTY2hlbWEtaW5zdGFuY2UiIHhtbG5zOnhzZD0iaHR0cDovL3d3dy53My5vcmcvMjAwMS9YTUxTY2hlbWEiIHhtbG5zPSJodHRwOi8vdGVtcHVyaS5uZXQvUmlrc3RvdG8uU2VydmljZS5CZXR0aW5nLk5TZXJ2aWNlQnVzSGFuZGxlcnMuU2FnYXMuVGVzdFNhZ2EiPgo8SVRoZVRpbWVvdXQ+CjwvSVRoZVRpbWVvdXQ+CjwvTWVzc2FnZXM+DQo=",
"Time": "2014-09-05T09:02:14.2110980Z",
"CorrelationId": "60558fe4-7111-4fe3-a4ff-a39d00b5c060\\0",
"OwningTimeoutManager": "DemoService",
"Headers": {
"NServiceBus.MessageId": "60558fe4-7111-4fe3-a4ff-a39d00b5c060",
"NServiceBus.CorrelationId": "60558fe4-7111-4fe3-a4ff-a39d00b5c060",
"NServiceBus.OriginatingEndpoint": "DemoService",
"$.diagnostics.originating.hostid": "23d0f908b1691ba8111e8ac282989a3d",
"NServiceBus.MessageIntent": "Send",
"NServiceBus.Version": "4.6.3",
"NServiceBus.TimeSent": "2014-09-05 09:01:44:211098 Z",
"NServiceBus.OriginatingMachine": "STAGESVC04",
"NServiceBus.SagaId": "40bf1976-dab0-48a3-8e20-a39d00b59bb9",
"NServiceBus.IsSagaTimeoutMessage": "True",
"NServiceBus.SagaType": "DemoService.NServiceBusHandlers.Sagas.TestSaga.TestSaga, DemoService.NServiceBusHandlers, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null",
"NServiceBus.OriginatingSagaId": "40bf1976-dab0-48a3-8e20-a39d00b59bb9",
"NServiceBus.OriginatingSagaType": "DemoService.NServiceBusHandlers.Sagas.TestSaga.TestSaga, DemoService.NServiceBusHandlers, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null",
"NServiceBus.ContentType": "text/xml",
"NServiceBus.EnclosedMessageTypes": "DemoService.NServiceBusHandlers.Sagas.TestSaga.ITheTimeout, DemoService.NServiceBusHandlers, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null",
"NServiceBus.RelatedTo": "232aa1a9-dea2-4d98-9251-a39d00b59bc7",
"NServiceBus.ConversationId": "4661130d-a961-4b1d-abda-a39d00b59beb",
"WinIdName": "",
"CorrId": "60558fe4-7111-4fe3-a4ff-a39d00b5c060\\0",
"NServiceBus.IsDeferredMessage": "True",
"NServiceBus.Temporary.DelayDeliveryWith": "00:00:30",
"NServiceBus.Timeout.Expire": "2014-09-05 09:02:14:211098 Z",
"NServiceBus.Timeout.RouteExpiredTimeoutTo": "DemoService@STAGESVC04",
"NServiceBus.Timeout.ReplyToAddress": "DemoService@STAGEMSMQ01"
}
}
I'm not sure what we are experiencing is a bug in NServiceBus or a misconfiguration. Somebody knows?