6
votes

I'm a bit confused regarding the EventHubTrigger for Azure functions.

I've got an IoT Hub, and am using its eventhub-compatible endpoint to trigger an Azure function that is going to process and store the received data.

However, if my function fails (i.e. throws an exception), the message or messages being processed during that function call are lost. I would actually expect the Azure Functions runtime to process those messages again at a later time. Specifically, I would expect this behavior because the EventHubTrigger keeps checkpoints in the Function App's storage account to track where in the event stream it has to continue.

The documentation of the EventHubTrigger even states that

If all function executions succeed without errors, checkpoints are added to the associated storage account

But even when I deliberately throw exceptions in my function, the checkpoints still get updated and the messages are not received again.

Is my understanding of the EventHubTrigger's documentation wrong, or is the EventHubTrigger's implementation (or its documentation) wrong?

2 Answers

7
votes

This piece of documentation is indeed confusing. I guess they mean errors in the Function App host itself, not in your code. An exception inside a function execution doesn't stop processing or checkpointing progress.

The fact is that Event Hubs are not designed for individual message retries. The processor works in batches, and it can either mark the whole batch as processed (i.e. create a checkpoint after it), or retry the whole batch (e.g. if the process crashed).
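To make the batch behavior concrete, here is a toy model in plain Python. It is not the Functions host or any Azure SDK; `run_batches` and its parameters are hypothetical names, and the model only encodes the behavior described above: the checkpoint moves past the whole batch whether or not the handler threw.

```python
from typing import Callable, List, Tuple


def run_batches(
    events: List[str],
    batch_size: int,
    handler: Callable[[List[str]], None],
) -> Tuple[int, List[str]]:
    """Toy model of batch checkpointing (not the real Functions host).

    Returns the final checkpoint position and the events that belonged
    to batches whose handler raised an exception.
    """
    checkpoint = 0
    unprocessed: List[str] = []
    while checkpoint < len(events):
        batch = events[checkpoint:checkpoint + batch_size]
        try:
            handler(batch)
        except Exception:
            # The failure is only recorded; the batch is not retried.
            unprocessed.extend(batch)
        # The checkpoint advances past the whole batch either way.
        checkpoint += len(batch)
    return checkpoint, unprocessed
```

In this model, a handler that throws on one event causes every event in that batch to be skipped, which matches what the question observes.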

See this forum question and answer.

If you still need to re-process failed events from Event Hub (and errors don't happen too often), you could implement such a mechanism yourself. E.g.

  1. Add an output Queue binding to your Azure Function.
  2. Add a try-catch around the processing code.
  3. If an exception is thrown, add the problematic event to the Queue.
  4. Have another Function with a Queue trigger to process those events.

Note that the downside of this is that you will lose the ordering guarantee provided by Event Hubs (since a Queue message will be processed later than its neighbors).
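The four steps above can be sketched roughly as follows. This is a minimal, self-contained simulation rather than real binding code: the `deque` stands in for the Queue output binding, and `handle_batch`, `handle_retry_message`, and the `process` callback are hypothetical names chosen for illustration.

```python
import json
from collections import deque
from typing import Callable, Deque, Dict, List

Event = Dict[str, object]


def handle_batch(
    events: List[Event],
    process: Callable[[Event], None],
    retry_queue: Deque[str],
) -> None:
    """Stand-in for the Event Hub triggered function (steps 1-3)."""
    for event in events:
        try:
            process(event)
        except Exception as exc:
            # Park the failing event instead of losing it; the rest of
            # the batch continues to be processed.
            retry_queue.append(json.dumps({"event": event, "error": str(exc)}))


def handle_retry_message(message: str, process: Callable[[Event], None]) -> None:
    """Stand-in for the Queue triggered function (step 4)."""
    payload = json.loads(message)
    # If this raises again, a real queue would redeliver the message.
    process(payload["event"])
```

Storing the error text next to the event is an assumption made here for easier diagnosis, not part of the original suggestion. Because the parked event is handled after its neighbors, the sketch also shows why ordering is lost.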

0
votes

Quick fix: a retry policy will not help if the downstream system is down for a few hours. Instead, you can call Process.GetCurrentProcess().Kill(); in your exception handling. This stops the checkpoint from moving forward. I have tested this with a consumption-based function app. You will not see anything in the logs, so I added an email notification to tell me that something went wrong and that I killed the function instance to avoid data loss. Hope this helps. I plan to write a blog post about this and about the other part of the workflow, where I use a Logic App to stop the function in case of continuous failure of the downstream system.