0
votes

When I run the following query directly in Data Explorer in Azure Portal it takes about 2 seconds to finish. The same query when run using DocumentDB SDK it takes about 1 minute to finish. Why is there such a wide gap in performance between two methods and how can I achieve the same perf using the SDK?

select distinct c.docId from c where c.documentType = 'Article' and LOWER(c.userId) = 'someone' and c.docId in (812,838,901,1557,1558,1372,1367,1592,1081,1590,1364,1441)

The code we use:

    public async Task> GetItemsAsync(string expr, int pageSize = -1, string continueToken = null)
    {
        using (var bench = new Benchmark(expr))
        {
            var queryable = _Client.CreateDocumentQuery(UriFactory.CreateDocumentCollectionUri(DatabaseId, CollectionId)
            , expr
            , new FeedOptions { MaxItemCount = pageSize, RequestContinuation = continueToken, EnableCrossPartitionQuery = true })
            .AsDocumentQuery();
    
            List results = new List();
            while (queryable.HasMoreResults)
            {
                var queryResult = await queryable.ExecuteNextAsync();
                results.AddRange(queryResult);
            }
    
            return results;
        }
    }

We use package Microsoft.Azure.DocumentDB.2.11.2. _Client is opened with ConnectionMode.Direct and Protocol.Tcp

RequestDiagnosticsString:

QueryMetrics: Partition 0
Retrieved Document Count                 :           1,847             
Retrieved Document Size                  :       1,166,051 bytes       
Output Document Count                    :               0             
Output Document Size                     :              49 bytes       
Index Utilization                        :            0.00 %           
  Utilized Indexes
  Potential Indexes
Total Query Execution Time               :           21.20 milliseconds
  Query Preparation Times
    Query Compilation Time               :            0.22 milliseconds
    Logical Plan Build Time              :            0.22 milliseconds
    Physical Plan Build Time             :            0.12 milliseconds
    Query Optimization Time              :            0.01 milliseconds
  Index Lookup Time                      :            0.87 milliseconds
  Document Load Time                     :           13.38 milliseconds
  Runtime Execution Times
    Query Engine Times                   :            5.68 milliseconds
    System Function Execution Time       :            1.35 milliseconds
    User-defined Function Execution Time :            0.00 milliseconds
  Document Write Time                    :            0.00 milliseconds
Client Side Metrics
  Retry Count                            :               0             
  Request Charge                         :           74.85 RUs         

  Partition Execution Timeline
  ┌────────────┬────────────────────────────────────┬────────────────┬──────────────┬──────────────────┬───────────────────┬───────────┐
  │Partition Id│Activity Id                         │Start Time (UTC)│End Time (UTC)│Duration (ms)     │Number of Documents│Retry Count│
  ├────────────┼────────────────────────────────────┼────────────────┼──────────────┼──────────────────┼───────────────────┼───────────┤
  │           0│54693cfe-843d-404b-82da-446347d5a71f│  02:11:28.1185Z│02:11:28.3639Z│            245.38│                  0│          0│
  └────────────┴────────────────────────────────────┴────────────────┴──────────────┴──────────────────┴───────────────────┴───────────┘

  Scheduling Metrics
  ┌────────────┬────────────────────┬────────────────────┬────────────────────┬────────────────────┬─────────────────────┐
  │Partition Id│Response Time (ms)  │Run Time (ms)       │Wait Time (ms)      │Turnaround Time (ms)│Number of Preemptions│
  ├────────────┼────────────────────┼────────────────────┼────────────────────┼────────────────────┼─────────────────────┤
  │           0│               84.02│              245.38│              106.94│              352.32│                    1│
  └────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────────┴─────────────────────┘


ClientSideRequestStatistics: {
0:[

RequestStartTime: 2020-07-25T02:11:28.3895792Z, RequestEndTime: 2020-07-25T02:11:28.3895792Z,  Number of regions attempted:1

]}

==

QueryMetrics: Partition 21
Retrieved Document Count                 :          21,747             
Retrieved Document Size                  :      14,783,785 bytes       
Output Document Count                    :               7             
Output Document Size                     :             171 bytes       
Index Utilization                        :            0.00 %           
  Utilized Indexes
  Potential Indexes
Total Query Execution Time               :          187.95 milliseconds
  Query Preparation Times
    Query Compilation Time               :            1.10 milliseconds
    Logical Plan Build Time              :            1.83 milliseconds
    Physical Plan Build Time             :            0.17 milliseconds
    Query Optimization Time              :            0.01 milliseconds
  Index Lookup Time                      :            1.99 milliseconds
  Document Load Time                     :          123.47 milliseconds
  Runtime Execution Times
    Query Engine Times                   :           52.73 milliseconds
    System Function Execution Time       :           11.06 milliseconds
    User-defined Function Execution Time :            0.00 milliseconds
  Document Write Time                    :            0.01 milliseconds
Client Side Metrics
  Retry Count                            :               0             
  Request Charge                         :          849.38 RUs         

  Partition Execution Timeline
  ┌────────────┬────────────────────────────────────┬────────────────┬──────────────┬──────────────────┬───────────────────┬───────────┐
  │Partition Id│Activity Id                         │Start Time (UTC)│End Time (UTC)│Duration (ms)     │Number of Documents│Retry Count│
  ├────────────┼────────────────────────────────────┼────────────────┼──────────────┼──────────────────┼───────────────────┼───────────┤
  │          21│e0745d8f-ec34-4a40-9dac-556ffc8e8603│  02:16:14.5334Z│02:16:14.8970Z│            363.53│                  7│          0│
  └────────────┴────────────────────────────────────┴────────────────┴──────────────┴──────────────────┴───────────────────┴───────────┘

  Scheduling Metrics
  ┌────────────┬────────────────────┬────────────────────┬────────────────────┬────────────────────┬─────────────────────┐
  │Partition Id│Response Time (ms)  │Run Time (ms)       │Wait Time (ms)      │Turnaround Time (ms)│Number of Preemptions│
  ├────────────┼────────────────────┼────────────────────┼────────────────────┼────────────────────┼─────────────────────┤
  │          21│            54314.14│              363.53│            54314.22│            54677.76│                    1│
  └────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────────┴─────────────────────┘


ClientSideRequestStatistics: {
21:[

RequestStartTime: 2020-07-25T02:16:14.8967387Z, RequestEndTime: 2020-07-25T02:16:14.8967387Z,  Number of regions attempted:1

]}

==

QueryMetrics: Partition 20
Retrieved Document Count                 :          44,693             
Retrieved Document Size                  :      30,777,951 bytes       
Output Document Count                    :               9             
Output Document Size                     :             256 bytes       
Index Utilization                        :            0.00 %           
  Utilized Indexes
  Potential Indexes
Total Query Execution Time               :          434.05 milliseconds
  Query Preparation Times
    Query Compilation Time               :            2.37 milliseconds
    Logical Plan Build Time              :            4.01 milliseconds
    Physical Plan Build Time             :            0.46 milliseconds
    Query Optimization Time              :            0.04 milliseconds
  Index Lookup Time                      :            2.61 milliseconds
  Document Load Time                     :          283.05 milliseconds
  Runtime Execution Times
    Query Engine Times                   :          126.52 milliseconds
    System Function Execution Time       :           23.76 milliseconds
    User-defined Function Execution Time :            0.00 milliseconds
  Document Write Time                    :            0.02 milliseconds
Client Side Metrics
  Retry Count                            :               0             
  Request Charge                         :        1,740.93 RUs         

  Partition Execution Timeline
  ┌────────────┬────────────────────────────────────┬────────────────┬──────────────┬──────────────────┬───────────────────┬───────────┐
  │Partition Id│Activity Id                         │Start Time (UTC)│End Time (UTC)│Duration (ms)     │Number of Documents│Retry Count│
  ├────────────┼────────────────────────────────────┼────────────────┼──────────────┼──────────────────┼───────────────────┼───────────┤
  │          20│5b5045a4-613d-408e-9d94-d44fd7a8e5d3│  02:16:13.8872Z│02:16:14.3596Z│            472.41│                  5│          0│
  │          20│45c14bfb-f9bf-48f8-ab0d-7680f7aeac99│  02:16:14.3602Z│02:16:14.5278Z│            167.61│                  4│          0│
  └────────────┴────────────────────────────────────┴────────────────┴──────────────┴──────────────────┴───────────────────┴───────────┘

  Scheduling Metrics
  ┌────────────┬────────────────────┬────────────────────┬────────────────────┬────────────────────┬─────────────────────┐
  │Partition Id│Response Time (ms)  │Run Time (ms)       │Wait Time (ms)      │Turnaround Time (ms)│Number of Preemptions│
  ├────────────┼────────────────────┼────────────────────┼────────────────────┼────────────────────┼─────────────────────┤
  │          20│            53667.87│              640.03│            53668.52│            54308.55│                    2│
  └────────────┴────────────────────┴────────────────────┴────────────────────┴────────────────────┴─────────────────────┘


ClientSideRequestStatistics: {
20:[

RequestStartTime: 2020-07-25T02:16:14.3599211Z, RequestEndTime: 2020-07-25T02:16:14.3599211Z,  Number of regions attempted:1
,

RequestStartTime: 2020-07-25T02:16:14.5274725Z, RequestEndTime: 2020-07-25T02:16:14.5274725Z,  Number of regions attempted:1

]}
1
Is this the first operation you do on the client or are you warming up the connections with a previous one (or calling OpenAsync)?Matias Quaranta
The connection is open with _Client.OpenAsync() during initialization. I repeated the run multiple times and it's consistently slow.samsu
Try capturing the Network diagnostics with PopulateQueryMetrics = true in the FeedOptions and reading them from the response's RequestDiagnosticsString. That and docs.microsoft.com/en-us/azure/cosmos-db/…Matias Quaranta
I edited my post with the output from RequestDiagnosticsString. Can you comment on what I should be looking for in the output?samsu
The data in your posts only shows a very small query duration. Are you certain the time is spent in the SDK? I suggest stepping through the code with a debugger and find the culprit for the durations.Alex AIT

1 Answers

0
votes

The attached diagnostics seem to show that you are doing a cross partition query (it seems to be hitting 3 partitions to get the results for your query), you can see the "Run time" in each partition statistics (245ms, 363ms, 640ms).

It seems the high number is due to also high number of results (44K in one of the queries).

We don't know which type of query is the one you are executing, but it does look like a scan (reference). This seems to align with your use of LOWER (also see Index Utilization: 0.00 % in your diagnostics).

Another useful link is the guide to tune query performance that has some good tips (please see link for explanation:

  • Set MaxItemCount to -1
  • Set MaxDegreeOfParallelism to -1

But overall it seems the problem is the fact that you are using no indexes due to the use of LOWER.