2
votes

I may (probably) be doing this all wrong. But I made a simple benchmark after having some strange results in a web-app I'm making using DocumentDb.

        var id = "62734c17-e939-43bd-8719-08e4c7b51f75";
        var client = new DocumentClient(new Uri(LocalDbConfig.Endpoint),LocalDbConfig.Key);
        var sw = new Stopwatch();
        var collectionLink = UriFactory.CreateDocumentCollectionUri(LocalDbConfig.DatabaseId,
            LocalDbConfig.CollectionId);

        var tsResults = new List<long>();
        var docResults = new List<long>();

        for (int i = 0; i < 10; i++)
        {
            sw.Restart();
            var result = client.CreateDocumentQuery(collectionLink, $"select c._ts from c where c.id = \"{id}\"").AsEnumerable().FirstOrDefault()._ts;
            sw.Stop();
            tsResults.Add(sw.ElapsedMilliseconds);
            Console.WriteLine($"TS: {sw.ElapsedMilliseconds}");
        }

        for (int i = 0; i < 10; i++)
        {
            sw.Restart();
            var doc = client.CreateDocumentQuery(collectionLink).Where(q => q.Id == id).AsEnumerable().FirstOrDefault();
            sw.Stop();
            docResults.Add(sw.ElapsedMilliseconds);
            Console.WriteLine($"DOC: {sw.ElapsedMilliseconds}");
        }

        Console.WriteLine($"TS: Average: {tsResults.Average()} - Longest: {tsResults.Max()} - Shortest: {tsResults.Min()}");
        Console.WriteLine($"DOC: Average: {docResults.Average()} - Longest: {docResults.Max()} - Shortest: {docResults.Min()}");


        Console.ReadLine();

This is the Main() function of a C# console app. The document I'm requesting is about 5MB of JSON.

In the first loop I'm just requesting a single property of the document, it's timestamp. (I started this to try to implement av very naive caching startegy) And the other loop returns the complete document.

This is the output:

TS: 1450
TS: 17
TS: 18
TS: 22
TS: 548
TS: 156
TS: 532
TS: 174
TS: 519
TS: 182
DOC: 557
DOC: 1725
DOC: 1916
DOC: 1868
DOC: 1876
DOC: 1832
DOC: 1861
DOC: 1851
DOC: 1881
DOC: 1870
TS: Average: 361,8 - Longest: 1450 - Shortest: 17
DOC: Average: 1723,7 - Longest: 1916 - Shortest: 557

As you can see, returning the timestamp is all over the place, but returning the complete document is sort of stable after the first request, and why the first request is 400% faster than the rest I can't figure out either.

A caveat is that the test is run on the DocumentDb emulator, but the original problem was present in the Azure DocumentDb service also.

Anyone got any idea why getting the timestamp is so all over the place? And why the first fetch of the document is 500ms, while the rest are almost 2000ms?

1
Can you run this from the same data center as your DocumentDB rather than locally in the emulator? I know you say the problem was present, but I think more people would be interested in helping if your data was from a production environment. - Larry Maccherone
I'd also recommend turning retry on throttling off. When initializing DocumentClient, set ConnectionPolicy.RetryOptions.MaxRetryAttemptsOnThrottle = 0 to eliminate any variation from throttling. - Aravind Krishna R.
Also, if you can share the collection contents to [email protected], we can debug/analyze and post findings back to SO. - Aravind Krishna R.

1 Answers

0
votes

Based on the information provided, you have probably reached the provisioned throughput of the collection and running into rate limiting. The SDKs automatically retry when rate limitied, so you would perceive this as longer latencies especially for the requests with small payloads like SELECT c._ts.

You can get predictable single-digit millisecond latency by increasing the provisioned throughput of the collection. Emailing Azure DocumentDB support at [email protected] with your endpoint details will also get you a conclusive answer.