0
votes

I have a Premium Service Bus namespace in one region and have created a second one in another region as a secondary. I enabled Geo-Recovery on the primary and configured the pairing with the secondary. To test it, I continuously send messages to a Topic, and a receiving application is subscribed to it. The sender logs "Sending message: Message {number in sequence}" and the receiver logs "Received message: SequenceNumber:{SB assigned sequence number} Body:Message {number in sequence}". When I initiated a failover to the secondary through the Portal, the sender kept sending messages, but the receiver missed some of them while the failover completed. Please see below:

Logs from Sender:

Sending message: Message 244
Sending message: Message 245
Sending message: Message 246
Sending message: Message 247
Sending message: Message 248
Sending message: Message 249
Sending message: Message 250
Sending message: Message 251
Sending message: Message 252
Sending message: Message 253
Sending message: Message 254
Sending message: Message 255
Sending message: Message 256
Sending message: Message 257
Sending message: Message 258
Sending message: Message 259
Sending message: Message 260

Logs from Receiver:

Received message: SequenceNumber:255 Body:Message 244
Received message: SequenceNumber:256 Body:Message 245
Received message: SequenceNumber:257 Body:Message 246
Received message: SequenceNumber:258 Body:Message 247
Message handler encountered an exception Microsoft.Azure.ServiceBus.UnauthorizedException: Connection rejected after GeoDRFailOver. TrackingId:7bb0b78d-2bf5-4807-8bcb-c831b00c6692, SystemTracker:AmqpGatewayProvider, Timestamp:2019-08-12T17:42:38
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<OnReceiveAsync>d__86.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<>c__DisplayClass64_0.<<ReceiveAsync>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.ServiceBus.RetryPolicy.<RunOperation>d__19.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
   at Microsoft.Azure.ServiceBus.RetryPolicy.<RunOperation>d__19.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<ReceiveAsync>d__64.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.ServiceBus.Core.MessageReceiver.<ReceiveAsync>d__62.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at Microsoft.Azure.ServiceBus.MessageReceivePump.<MessagePumpTaskAsync>d__11.MoveNext().
Exception context for troubleshooting:
- Endpoint: rjpremium.servicebus.windows.net
- Entity Path: topic1/Subscriptions/sub1
- Executing Action: Receive
Received message: SequenceNumber:1 Body:Message 254
Received message: SequenceNumber:2 Body:Message 255
Received message: SequenceNumber:3 Body:Message 256
Received message: SequenceNumber:4 Body:Message 257
Received message: SequenceNumber:5 Body:Message 258
Received message: SequenceNumber:6 Body:Message 259
Received message: SequenceNumber:7 Body:Message 260

Messages 248 through 253 were dropped: the sender sent all of them, but the receiver never received them. With Geo-Recovery enabled, shouldn't these messages also be delivered to the receiver?
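The gap can be confirmed mechanically from the two logs above. This is a minimal Python sketch that assumes the exact log line formats shown in this post; the helper names (`message_numbers`, `missing_messages`) are made up for illustration, and the exception line is abbreviated.

```python
import re

# Patterns for the two log line formats shown in the logs above.
SENT = re.compile(r"^Sending message: Message (\d+)$")
RECEIVED = re.compile(r"^Received message: SequenceNumber:\d+ Body:Message (\d+)$")


def message_numbers(lines, pattern):
    """Collect the message numbers from every line that matches the pattern."""
    numbers = set()
    for line in lines:
        match = pattern.match(line.strip())
        if match:
            numbers.add(int(match.group(1)))
    return numbers


def missing_messages(sender_lines, receiver_lines):
    """Return the sorted message numbers that were sent but never received."""
    sent = message_numbers(sender_lines, SENT)
    received = message_numbers(receiver_lines, RECEIVED)
    return sorted(sent - received)


# Rebuild the logs from this post: 244..260 sent, 244..247 received before
# the failover, 254..260 received after it (sequence numbers restart at 1).
sender_log = [f"Sending message: Message {n}" for n in range(244, 261)]
receiver_log = (
    [f"Received message: SequenceNumber:{n + 11} Body:Message {n}" for n in range(244, 248)]
    + ["Message handler encountered an exception (abbreviated)"]
    + [f"Received message: SequenceNumber:{n - 253} Body:Message {n}" for n in range(254, 261)]
)

print(missing_messages(sender_log, receiver_log))
# prints [248, 249, 250, 251, 252, 253]
```

Lines that match neither pattern, such as the exception and its stack trace, are simply ignored.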

2
When you paired the secondary namespace, did you update sender/receiver to use the alias obtained after pairing namespaces? - Sean Feldman

2 Answers

1
votes

When using the geo-disaster recovery feature of Azure Service Bus (Premium), you first have to pair the primary and secondary namespaces. Once that's done, you get an alias to use from that point on. The alias ensures that applications connected to the primary namespace continue to function when a failover takes place. Make sure you use the issued alias in both your sender and receiver applications. For details, see the documentation.
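As a quick sanity check, you can verify that the connection string each application loads points at the alias host rather than at a concrete namespace. The following is a rough Python sketch, not an official tool: the alias name `my-alias`, the key placeholders, and the helper names are hypothetical, and it assumes the usual `Endpoint=sb://<host>/;...` connection string layout. In practice you would copy the alias connection string from the Portal rather than build it by hand.

```python
def endpoint_host(connection_string):
    """Extract the host from the Endpoint=sb://<host>/ part of a connection string."""
    for part in connection_string.split(";"):
        key, _, value = part.partition("=")
        if key.strip().lower() == "endpoint":
            # value looks like sb://<host>/ ; drop the scheme and slashes.
            return value.strip().split("://", 1)[-1].strip("/").lower()
    raise ValueError("connection string has no Endpoint part")


def uses_alias(connection_string, alias_name):
    """True if the connection string targets <alias_name>.servicebus.windows.net."""
    return endpoint_host(connection_string) == f"{alias_name.lower()}.servicebus.windows.net"


# Hypothetical values for illustration only. "rjpremium" is the endpoint that
# appears in the exception context in the question; whether it is a namespace
# name or the alias cannot be told from the log alone.
logged = "Endpoint=sb://rjpremium.servicebus.windows.net/;SharedAccessKeyName=<policy>;SharedAccessKey=<key>"
via_alias = "Endpoint=sb://my-alias.servicebus.windows.net/;SharedAccessKeyName=<policy>;SharedAccessKey=<key>"

print(uses_alias(logged, "my-alias"))     # prints False
print(uses_alias(via_alias, "my-alias"))  # prints True
```

Running a check like this at startup in both the sender and the receiver makes it obvious when one of them is still bound directly to a namespace.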

1
votes

First, quoting from the docs (as Sean Feldman also points out in parallel):

"Geo-Disaster recovery currently only ensures that the metadata (Queues, Topics, Subscriptions, Filters) are copied over from the primary namespace to secondary namespace when paired."

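That means the messages themselves are not (yet) copied, so anything still sitting in the primary namespace at failover time will not show up in the secondary.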

Second, GeoDR is for rare scenarios where you have to move out of a region because something there is completely broken. That means it's very unlikely that your test scenario above reflects reality. In a real failover you will be facing a crisis: there is an outage, and you make a very deliberate decision to abandon the region and fail over not only Service Bus but everything else you run there.