We have an Azure Function with a Cosmos DB trigger that needs to process items within a partition sequentially, though not in any specific order. My understanding is that the trigger always sends all changes for a partition to one function instance at a time. However, I am seeing changes for one partition being processed by multiple function instances within a few seconds of each other, so the change feed is not being distributed to function instances the way I expect.

This function app runs on the latest V2 function host. The function is a durable function. We use a 'leases' collection with a specific prefix to manage the leases for this change feed.

[FunctionName("ProcessChanges")]
public static async Task RunAsync([CosmosDBTrigger(
    databaseName: "MyDatabase",
    collectionName: "MyCollection",
    ConnectionStringSetting = "AzureWebJobsCosmosDBConnectionString",
    LeaseCollectionName = "leases",
    LeaseCollectionPrefix = "chgproc",
    CreateLeaseCollectionIfNotExists = true)]IReadOnlyList<Document> documents,
    [OrchestrationClient]DurableOrchestrationClient starter,
    ILogger log)
{
    // Processing code that calls the orchestrator function
}

I expect all changes for a given partition at a given time to go to one function instance, but sometimes they go to multiple function instances.
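To confirm which instance handles which partition, it can help to log one line per batch. The following is a minimal sketch, not part of the original function: `ChangeFeedDiagnostics`, `CountByPartitionKey` and `DescribeBatch` are hypothetical helper names, and the property name `partitionKey` in the usage comment is an assumption about the document schema.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only.
public static class ChangeFeedDiagnostics
{
    // Counts the documents in one trigger batch per partition key value.
    // Documents without a key are grouped under "(none)".
    public static Dictionary<string, int> CountByPartitionKey<T>(
        IEnumerable<T> documents, Func<T, string> partitionKeySelector)
    {
        return documents
            .GroupBy(d => partitionKeySelector(d) ?? "(none)")
            .ToDictionary(g => g.Key, g => g.Count());
    }

    // Builds a single log line such as "instance=abc batch: A=2, B=1".
    public static string DescribeBatch<T>(
        IEnumerable<T> documents, Func<T, string> partitionKeySelector, string instanceId)
    {
        var counts = CountByPartitionKey(documents, partitionKeySelector)
            .OrderBy(kv => kv.Key, StringComparer.Ordinal)
            .Select(kv => $"{kv.Key}={kv.Value}");
        return $"instance={instanceId} batch: {string.Join(", ", counts)}";
    }
}

// Possible usage inside the trigger function (assumes the partition key
// is stored in a property named "partitionKey"):
//
//   var instanceId = Environment.GetEnvironmentVariable("WEBSITE_INSTANCE_ID") ?? "local";
//   log.LogInformation(ChangeFeedDiagnostics.DescribeBatch(
//       documents, d => d.GetPropertyValue<string>("partitionKey"), instanceId));
```

If the same partition key value shows up under two different instance IDs within a short window, that would support the observation above; comparing those log lines with the Owner property on the lease documents may show whether a lease changed hands in between.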

A lease can only be owned by one instance at a time. Each lease represents a Partition Key Range, not a physical partition. You can verify this in your leases collection with the Owner property: docs.microsoft.com/en-us/azure/cosmos-db/…. Do you by any chance have multiple Triggers with different LeaseCollectionPrefix values, like this: docs.microsoft.com/en-us/azure/cosmos-db/…? - Matias Quaranta
@MatiasQuaranta Yes, we have two different triggers on this collection, and each one has a different LeaseCollectionPrefix. - Joe Falkenburg
That means each change in your collection will be sent once to each Function and thus land in two instances (in your case, because you have 2 Functions). The goal of the LeaseCollectionPrefix is to allow different independent Functions to process changes independently and in parallel. Were you expecting a change to be processed by both Functions in parallel? - Matias Quaranta
@MatiasQuaranta I'm fine with the two different functions processing the same document change at the same time. The issue I'm experiencing is that document A in partition A is processed by the MyFunc function on one function instance, and document B in partition A is processed by the MyFunc function on a different function instance within one second. I need these two documents, which are in the same partition, to be processed sequentially by the MyFunc function. - Joe Falkenburg
Glad to help. I have never tried the Trigger with Durable Functions, so maybe the orchestrator is just scaling instances. In any case, even across different instances, the changes are received in order (doc A before doc B); relying on the same instance to receive the changes in a serverless environment might not be correct. - Matias Quaranta

1 Answer

I can't find any documentation on how the distribution is actually supposed to work. Do you have a source for that?

Does your leases collection use /id as its partition key, as mentioned in the docs?

The lease container: The lease container maintains state across multiple and dynamic serverless Azure Function instances and enables dynamic scaling. This lease container can be manually or automatically created by the Azure Cosmos DB Trigger. To automatically create the lease container, set the CreateLeaseCollectionIfNotExists flag in the configuration. Partitioned lease containers are required to have a /id partition key definition.
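If you create the lease container manually rather than through CreateLeaseCollectionIfNotExists, a rough sketch with the Microsoft.Azure.DocumentDB SDK (the one that provides the Document type used in the question) could look like the following. `LeaseContainerSetup` and its method names are hypothetical, and the database and collection names are whatever your trigger is configured with.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Hypothetical helper for illustration only.
public static class LeaseContainerSetup
{
    // Builds a collection definition partitioned on /id, as the quoted
    // documentation requires for partitioned lease containers.
    public static DocumentCollection BuildLeaseCollection(string collectionId)
    {
        var collection = new DocumentCollection { Id = collectionId };
        collection.PartitionKey.Paths.Add("/id");
        return collection;
    }

    // Creates the lease collection only if it does not already exist.
    // An existing collection is returned unchanged, so a wrong partition
    // key on an existing lease collection is NOT corrected by this call.
    public static async Task EnsureLeaseCollectionAsync(
        DocumentClient client, string databaseName, string leaseCollectionName)
    {
        await client.CreateDocumentCollectionIfNotExistsAsync(
            UriFactory.CreateDatabaseUri(databaseName),
            BuildLeaseCollection(leaseCollectionName));
    }
}
```

With the names from the question this would be called as `EnsureLeaseCollectionAsync(client, "MyDatabase", "leases")`. Whether the partition key of the lease collection has anything to do with the behavior you are seeing is something I have not verified.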