We have two Windows services running on a corporate on-premises server that continually send messages to Azure Service Bus in the cloud. The messages do reach the service bus eventually, but there are long stretches during which they do not get through at all.
This causes delays for us, because we depend on each message arriving on the service bus and being processed within a minute. However, as can be seen below, a message can be 'blocked' for up to 30-40 minutes before it makes its way through to Azure Service Bus. This happens every day, at some point during almost every hour.
The errors are mainly one of the following (example logs at end of this post):
A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 191.239.XX.XXX:443
Error during communication with Service Bus. Check the connection information, then retry.
No such host is known
The request operation did not complete within the allotted timeout of 00:01:10. The time allotted to this operation may have been a portion of a longer timeout. TrackingId:f2db6377-e17d-401a-b339-11fbb51c7bf7, Timestamp:19/05/2017 12:47:36 AM
We send messages to the service bus as follows (simplified):
private TopicClient _azureTopic;
...
// <Begin Loop>
    // Create the client lazily on first use and reuse it on later iterations.
    if (_azureTopic == null)
    {
        var connectionString = "Endpoint=sb://mynamespace.servicebus.windows.net/;SharedAccessKeyName=managerfiddev;SharedAccessKey=AABBCCDDEEFFGGHHHASDFADFAadfadfdfz=;EntityPath=mytopic";
        _azureTopic = TopicClient.CreateFromConnectionString(connectionString);
        // Deliberately no client-side retry (see note below).
        _azureTopic.RetryPolicy = RetryPolicy.NoRetry;
    }

    var brokeredMessage = new BrokeredMessage(message.Message)
    {
        MessageId = message.Id.ToString()
    };
    brokeredMessage.Properties["ReceivedTimestamp"] = DateTime.Now;

    _azureTopic.Send(brokeredMessage);
// <End Loop>
Note: The NoRetry policy is deliberate. Without wanting to add too much noise to the question, a message that fails is tried again in the next iteration of the loop (it sends the message to subscribers in a round-robin fashion).
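To make the retry behaviour concrete, here is a minimal sketch of the "try again in the next iteration" pattern. It is not our actual code: SendLoop, Run, pending, send and log are hypothetical names, and the send delegate stands in for the _azureTopic.Send(brokeredMessage) call so that the sketch has no Service Bus dependency.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the real loop; all names here are illustrative.
public static class SendLoop
{
    // Tries the message at the head of the queue once per iteration.
    // A failed message stays at the head, so the next iteration retries it.
    // Returns the number of messages sent successfully.
    public static int Run(Queue<string> pending, Action<string> send,
                          int iterations, Action<string> log)
    {
        int sent = 0;
        for (int i = 0; i < iterations && pending.Count > 0; i++)
        {
            string message = pending.Peek();
            try
            {
                send(message);      // in the real service: _azureTopic.Send(brokeredMessage)
                pending.Dequeue();  // remove only after the send has succeeded
                sent++;
            }
            catch (Exception ex)
            {
                // No retry inside this iteration; the same message is attempted again next time.
                log("error trying to synchronise message with Azure. " + ex.Message);
            }
        }
        return sent;
    }
}
```

With this shape, one message that keeps failing also holds back every message queued behind it, which matches the same Message ID appearing repeatedly in the log below.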
Example log of errors during a small window of time.
20:31:51 Event.WindowsService Event.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1191251
Error during communication with Service Bus. Check the connection information, then retry.
20:32:00 Event.WindowsService Event.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1191251
No such host is known
20:32:00 RFID.WindowsService RFID.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1930029
No such host is known
20:32:10 RFID.WindowsService RFID.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1930029
No such host is known
20:32:10 Event.WindowsService Event.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1191251
No such host is known
20:32:10 RFID.WindowsService RFID.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1930029
No such host is known
20:34:00 RFID.WindowsService RFID.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1930034
Error during communication with Service Bus. Check the connection information, then retry.
20:38:34 Event.WindowsService Event.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1191269
Error during communication with Service Bus. Check the connection information, then retry.
20:38:51 RFID.WindowsService RFID.WindowsService::PublishAzureServiceBusTopicMessage() error trying to synchronise message with Azure. Message ID: 1930043
Error during communication with Service Bus. Check the connection information, then retry.