
I have some questions about scaling up Azure Functions with the Event Hub trigger. I understand that an AF app stores a checkpoint somewhere (Azure Storage, I think), but I'm wondering what that checkpoint is tied to (the affinity) when the AF app scales up or changes.

  1. If I have an existing event hub with messages and redeploy the AF app, will it restart from the beginning with a new checkpoint, or will it start where it left off? Seems like it should start from the last checkpoint for that AF app somehow.
  2. If I change the name of a function in the AF app which uses an event hub trigger, does it keep the same checkpoint or start over?
  3. If I stop/start the AF app, does it lose its position? Seems like it shouldn't since it's stored externally, but I don't know how they map to each other.
  4. When AF scales up, do the multiple instances all share the same checkpoint, or do subsequent instances start at the beginning and have their own? The latter seems unlikely. From what I've read, a new EventProcessorHost is created for each AF app instance, but do they all share the same checkpoint?
  5. When scaling up, do the new AF app instances start at the existing/initial instance's checkpoint, or does the new EPH start from the beginning?

In case it matters, I am using C# DLLs and VS2015 for dev, and VSTS for building/deploying.


1 Answer


Event Hub checkpoints are saved per Partition per Consumer Group. So, if your hub has 2 partitions and 3 consumer groups, it will have 6 checkpoints.
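Concretely, for the v1 runtime these checkpoints are blobs in the function app's storage account (the one in `AzureWebJobsStorage`). The layout is roughly as follows — container and path names here are illustrative, and the exact naming can vary by runtime version:

```
azure-webjobs-eventhub/                    <- default blob container
  mynamespace.servicebus.windows.net/      <- Event Hub namespace
    myhub/                                 <- Event Hub name
      $Default/                            <- consumer group
        0                                  <- one blob per partition,
        1                                     holding the offset and lease info
```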

You can explicitly define a consumer group for your function trigger:

[EventHubTrigger("myhub", ConsumerGroup = "mygroup")]

otherwise the default consumer group $Default will be used. So, to your questions:
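For context, a complete trigger declaration in a C# class library might look like the sketch below (assuming the attribute-based model from the Functions SDK; the function name `ProcessEvents`, hub name `myhub`, group `mygroup`, and app setting name `EventHubConnection` are all placeholders):

```csharp
// Sketch of an Event Hub-triggered function in a C# class library.
// "EventHubConnection" must match an app setting containing the
// Event Hub connection string.
[FunctionName("ProcessEvents")]
public static void Run(
    [EventHubTrigger("myhub",
        ConsumerGroup = "mygroup",
        Connection = "EventHubConnection")]
    string message,
    TraceWriter log)
{
    log.Info($"Event received: {message}");
}
```

Renaming `ProcessEvents` does not affect checkpointing; changing `ConsumerGroup` does, because checkpoints are keyed by consumer group and partition.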

  1. If checkpoints already exist for the given consumer group, processing will resume from them, not from the beginning.

  2. The function name doesn't matter; checkpoints are keyed by consumer group (and partition), so renaming the function keeps the same checkpoint.

  3. After a restart, the app will start from the checkpoints.

  4. Multiple instances all share the same checkpoints. How it works: each partition will be locked by one of the instances, so events from that partition will only be processed by a single instance at any given time. The same instance will update the corresponding checkpoint to new offsets.
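The lease-and-checkpoint mechanism described above is the one exposed by the Event Processor Host. A simplified sketch of what each instance effectively runs per locked partition (using the classic `IEventProcessor` interface; the Functions runtime wires this up for you, so this is illustrative only):

```csharp
// Simplified sketch of the per-partition loop behind EventProcessorHost.
// Only the instance holding the lease for a partition executes this
// for that partition, so checkpoint updates never race across instances.
class SketchProcessor : IEventProcessor
{
    public Task OpenAsync(PartitionContext context) => Task.CompletedTask;

    public async Task ProcessEventsAsync(PartitionContext context,
                                         IEnumerable<EventData> messages)
    {
        foreach (var msg in messages)
        {
            // ... the runtime invokes your function with msg here ...
        }

        // Persist the offset of the last processed event to Azure Storage;
        // a restarted or newly scaled instance resumes from this point.
        await context.CheckpointAsync();
    }

    public Task CloseAsync(PartitionContext context, CloseReason reason)
        => Task.CompletedTask;
}
```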

  5. The new instance will start from the existing checkpoint, as soon as it manages to acquire the lock on the corresponding partition.

Note that the number of partitions limits the number of instances that can process events in parallel (for one consumer group).