We are seeing exceptionally strange behaviour on our Consumption plan Function App. The following exceptions appear repeatedly:
Microsoft.Azure.EventHubs.ReceiverDisconnectedException
(New receiver with higher epoch of '2' is created hence current receiver with epoch '1' is getting disconnected.)
System.Net.WebException
(Exception of type 'Microsoft.ServiceBus.Messaging.LeaseLostException' was thrown.)
We get these exceptions whenever we stress the functions, i.e. go from 0 to 50,000 events in a matter of moments. They are all tagged with the cloud_role matching our Function App, which leads me to believe that this is a host error.
Reading various docs (e.g. https://docs.microsoft.com/en-us/azure/event-hubs/event-hubs-features), I think I understand how the Event Hub receiver is meant to work, though honestly I am reading between the lines as it's quite unclear: my one receiver relies on a consumer group to manage reading batches of messages from the Event Hub partitions (of which I am using 32).
My hypothesis was that, under load, there were too many function instances for the single consumer group to 'cope' with, and the partition leases were simply being switched repeatedly between instances. However, in my testing scenario I removed all logic from the functions apart from relaying messages between Event Hubs, and the errors persisted even with only 4 partitions on the Event Hub.
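To make that hypothesis concrete, here is a toy Python model of what I imagine is happening. It is not the Event Hubs SDK and not how the Functions host is actually implemented. The names `Partition` and `simulate`, the round-robin assignment, and the "bump the epoch by one on every takeover" rule are all my own assumptions for illustration. The only behaviour taken from the error text is that a new receiver with a higher epoch disconnects the current one.

```python
class Partition:
    """Toy model of one partition as seen by a single consumer group."""

    def __init__(self, partition_id):
        self.partition_id = partition_id
        self.owner = None  # instance currently holding the lease (assumed)
        self.epoch = 0     # epoch of the receiver currently attached

    def acquire(self, instance):
        """A new instance takes the lease with a higher epoch.

        Returns (evicted_owner, old_epoch); evicted_owner is None when
        nobody was reading the partition yet.
        """
        evicted, old_epoch = self.owner, self.epoch
        self.owner = instance
        self.epoch = old_epoch + 1  # assumption: epoch grows by one per takeover
        return evicted, old_epoch


def simulate(partition_count, instance_counts):
    """Rebalance partitions round-robin each time the host scales.

    Returns one tuple per forced disconnect:
    (partition_id, evicted_instance, old_epoch, new_instance, new_epoch)
    """
    partitions = [Partition(i) for i in range(partition_count)]
    disconnects = []
    for count in instance_counts:
        for p in partitions:
            wanted = f"instance-{p.partition_id % count}"
            if wanted == p.owner:
                continue  # lease stays where it is
            evicted, old_epoch = p.acquire(wanted)
            if evicted is not None:
                disconnects.append(
                    (p.partition_id, evicted, old_epoch, p.owner, p.epoch)
                )
    return disconnects


if __name__ == "__main__":
    # Scale from 1 to 2 to 4 instances over 4 partitions.
    for pid, old, old_epoch, new, new_epoch in simulate(4, [1, 2, 4]):
        print(
            f"partition {pid}: {new} opened a receiver with epoch {new_epoch}, "
            f"so {old} (epoch {old_epoch}) is disconnected"
        )
```

In this model every scale-out step that moves a partition to a different instance produces one "higher epoch" disconnect, so even 4 partitions generate this noise as soon as the instance count changes. Whether the real host behaves this way is exactly what I am asking.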
In a desperate bid to see whether this was resolved in later versions, I mocked up exactly the same functionality in Functions v2, and I receive what I assume are the .NET Core equivalents:
Microsoft.Azure.EventHubs.ReceiverDisconnectedException
(New receiver with higher epoch of '2' is created hence current receiver with epoch '1' is getting disconnected.)
Microsoft.WindowsAzure.Storage.StorageException
(The lease ID specified did not match the lease ID for the blob.)
System.ArgumentOutOfRangeException
(Ignoring out of date checkpoint with offset 1184072/sequence number 1038 because..)
So, can someone please:
- explain what on earth is actually going on under the covers?
- help me to suppress these, if they are not 'real' errors and are just the host managing things?
These exceptions are really annoying because they make it quite tricky to see genuine unhandled exceptions.