0
votes

I have a list of documents that belong to a partitioned collection. Instead of querying for every document from the .NET client and either do update or insert, I thought I could use a Stored Procedure to accomplish this.

What I did not initially realize is that Stored Procedures are executed in the transaction scope of a single partition key. So I am getting PartitionKey value must be supplied for this operation.

The thing is that the documents (that I am trying to upsert) may belong to different partitions. How can I accomplish this in the Stored Procedure? In my case, the SP is useless unless it can operate on multiple partitions.

This is how I constructed my SP:

function upsertEcertAssignments(ecerts) {
    var collection = getContext().getCollection();
    var collectionLink = collection.getSelfLink();
    var response = getContext().getResponse();

    // Validate input
    if (!ecerts) throw new Error("The ecerts is null or undefined");
    if (ecerts.length == 0) throw new Error("The ecerts list size is 0");

    // Recursively call the 'process' function
    processEcerts(ecerts, 0);

    function processEcerts(ecerts, index) {
        if (index >= ecerts.length) {
            response.setBody(index);
            return; 
        }               

        var query = {query: "SELECT * FROM DigitalEcerts c WHERE c.code = @code AND c.collectionType = @type", parameters: [{name: "@code", value: ecerts[index].code}, {name: "@type", value: 0}]};
        var isQueryAccepted = collection.queryDocuments(collectionLink, query, {partitionKey: ecerts[index].code}, function(err, foundDocuments, foundOptions) {
            if (err) throw err;

            if (foundDocuments.length > 0) {
                var existingEcert = foundDocuments[0];
                ecerts[index].id = existingEcert.id;
                var isAccepted = __.replaceDocument(existingEcert._self, ecerts[index], function(err, updatedEcert, replacedOptions) {
                    if (err) throw err;

                    processEcerts(ecerts, index + 1);        
                });
                if (!isAccepted) {
                    response.setBody(index);                
                }
            } else {
                var isAccepted = __.createDocument(__.getSelfLink(), ecerts[index], function(err, insertedEcert, insertedOptions) {
                    if (err) throw err;

                    processEcerts(ecerts, index + 1);        
                });
                if (!isAccepted) {
                    response.setBody(index);                
                }
            }
        });

        if (!isQueryAccepted)
            response.setBody(index);                
    }
}

From .NET, if I call it like this, I get the partitionKey value problem:

var continuationIndex = await _docDbClient.ExecuteStoredProcedureAsync<int>(UriFactory.CreateStoredProcedureUri(_docDbDatabaseName, _docDbDigitalEcertsCollectionName, "UpsertDigitalMembershipEcertAssignments"), digitalEcerts);

If I call it with a partition key, it works...but it is useless:

var continuationIndex = await _docDbClient.ExecuteStoredProcedureAsync<int>(UriFactory.CreateStoredProcedureUri(_docDbDatabaseName, _docDbDigitalEcertsCollectionName, "UpsertDigitalMembershipEcertAssignments"), new RequestOptions { PartitionKey = new PartitionKey(digitalEcerts[0].Code) }, digitalEcerts.Take(1).ToList());

I appreciate any pointer.

Thanks.

2
The real question is why did you use a stored procedure and not the Upsert methods provided by the SDK?Nick Chapsas
This is because I need the document id to do upsert! Unless I am totally wrong, the id is how Cosmos knows that the document is the same. The documents that I have are not pulled from Cosmos DB....but they have the same properties. So I end up querying for each doc using the PK and type. If the document exists, I use the id to upsert. This is taking along time especially that I have about 100 docs to partially update..Khaled Hikmat
The id is not a unique property in CosmosDB. It is only unique inside it's own partition.Nick Chapsas
Ok...thank you...but I am not sure I understand. Let us say I have an entity that has the following 3 properties: id = '', code = '123' and type = 0. The id is unknown. If I do upsert, it will always insert the document because the id does not match any existing document ...right? Unless I query first using code and type, there is no way I know to update or insert. This is my problem. I was hoping that I can do the query and either replace or insert on the server using SP to avoid round trips.Khaled Hikmat
Are query and type enough to give you a unique id? If yes then why don't you make the id to be a combination of code and type?Nick Chapsas

2 Answers

1
votes

By the sound of it, your unique id is a combination of code and type. I would recommend making your id property to be the combination of two.

This guarantees that your id is unique but also eliminates the need to query for it.

0
votes

If the collection the stored procedure is registered against is a single-partition collection, then the transaction is scoped to all the documents within the collection. If the collection is partitioned, then stored procedures are executed in the transaction scope of a single partition key. Each stored procedure execution must then include a partition key value corresponding to the scope the transaction must run under.

You could refer to the description above which mentioned here. We can query documents cross partitions via setting EnableCrossPartitionQuery to true in FeedOptions parameter. However, the RequestOptions doesn't have such properties against executing stored procedure.

So, It seems you have to provide partition key when you execute sp. Of course, it can be replaced by upsert function. It is useless from the perspective of the business logic, but if bulk operations, the SP can release some of the performance pressure because the SP is running on the server side.

Hope it helps you.