
I am paging through a CreateDocumentQuery using MaxItemCount, HasMoreResults and ExecuteNextAsync, as described in other posts.

My issue is that sometimes - particularly after a large update to the DocumentDB - looping through every document gives inconsistent results, with up to half the documents being skipped.

This ONLY happens if I include a SQL query in the query setup - as I only need to process some fields/columns. If I allow all fields to come back it works 100%. But this is inefficient as I am exporting a couple of columns only and there are close to a million records.

I need to use C# as it is a scheduled job linked up with other C# modules.

Has anyone been able to consistently loop through a large collection using paging?

Code extract below - with the sql included - if I remove the sql from the query there is no issue.

sql = "select d.field1, d.field2 from doc d";
var query = client.CreateDocumentQuery("dbs/" + database.Id + "/colls/" + documentCollection.Id, sql,
            new FeedOptions { MaxItemCount = 1000 }
            ).AsDocumentQuery();

while (query.HasMoreResults)
{
    FeedResponse<Document> res;
    while (true)
    {
        try
        {
            res = await query.ExecuteNextAsync<Document>();
            break; // success!
        }
        catch (DocumentClientException ex) when ((int?)ex.StatusCode == 429)
        {
            // DocumentDB is throttling requests (HTTP 429, request rate too large) -
            // wait for the interval the service suggests and retry - this will resolve eventually
            await Task.Delay(ex.RetryAfter);
        }
        catch (Exception)
        {
            errorcount++;
            throw; // rethrow, preserving the original stack trace
        }
    }
    if (res.Any())
    {
        foreach (var liCurrent in res)
        {
            try
            {
                // Convert the Document to a CSV line item
                // DO THE FILE LINE CREATION HERE
                fileLineItem = "test";

                // Write the line to the file
                writer.WriteLine(fileLineItem);
            }
            catch (Exception)
            {
                errorcount++;
                throw; // rethrow, preserving the original stack trace
            }
            totalrecords++;
        }
    }
} 
What is the consistency level for your database? For more on consistency level please check azure.microsoft.com/en-us/documentation/articles/… - Arnab Chakraborty
Thanks - changing consistency to consistent rather than lazy seems to have solved it. - smulldino
You should post that as an answer. Could be useful to someone else. - Arnab Chakraborty

1 Answer


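The solution, as Arnab pointed out, was to set the collection to consistent rather than lazy. (Strictly speaking, the consistent/lazy option is the collection's indexing mode rather than the account's consistency level.) I had done this previously, but I had since deleted and recreated the collection. In my case the setting was lazy again on the newly created collection, so you need to either specify it on creation or alter it later.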
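
A minimal sketch of both options with the DocumentDB .NET SDK, assuming the consistent/lazy setting in question is the collection's indexing mode (IndexingPolicy.IndexingMode). The class name, method names and the databaseId/collectionId parameters are placeholders for illustration and are not taken from the original job:

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Hypothetical helper class, for illustration only.
public static class IndexingModeSketch
{
    // Option 1: specify the indexing mode when the collection is created.
    public static async Task<DocumentCollection> CreateWithConsistentIndexingAsync(
        DocumentClient client, string databaseId, string collectionId)
    {
        var collection = new DocumentCollection { Id = collectionId };
        collection.IndexingPolicy.IndexingMode = IndexingMode.Consistent;

        ResourceResponse<DocumentCollection> response =
            await client.CreateDocumentCollectionAsync(
                UriFactory.CreateDatabaseUri(databaseId), collection);
        return response.Resource;
    }

    // Option 2: alter an existing collection by reading it,
    // changing the indexing mode and replacing the definition.
    public static async Task<DocumentCollection> SwitchToConsistentIndexingAsync(
        DocumentClient client, string databaseId, string collectionId)
    {
        ResourceResponse<DocumentCollection> existing =
            await client.ReadDocumentCollectionAsync(
                UriFactory.CreateDocumentCollectionUri(databaseId, collectionId));

        DocumentCollection collection = existing.Resource;
        collection.IndexingPolicy.IndexingMode = IndexingMode.Consistent;

        ResourceResponse<DocumentCollection> replaced =
            await client.ReplaceDocumentCollectionAsync(collection);
        return replaced.Resource;
    }
}
```

Either helper would be called once from the job's setup code, before the export loop runs. Changing the mode on a large existing collection may take a while to settle, so it is probably worth re-checking the document count after the change.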

Thanks, Arnab