DocumentDB Query Requires Unexpected High RUs

Question

I have about 200MB of data stored in a S3 document collection (the highest one). Each document is about 300KB in size.

But when I ran my query, I am surprised to see that it requires 7245.81 RUs. Because S3 is at 2500 RUs/Second, this performance won't scale for my application.

I just want to ask if I am doing anything wrong and if there is anything that I can do to improve it?

My query is like this:

SELECT item.Id,item.Priority, va.something, wa.something... 11 fields in total
FROM model.Item item 
JOIN va in item.Content.Children 
JOIN wa in va.Children 
WHERE item.State != 5

I am using "Range" as the index type. Would that impact the performance of this query?

The docs say "index type of Range enables range queries (using >, <, >=, <=, !=)", but you might want to try WHERE item.State > 5 OR item.State < 5 Also try switching it to a hash index. For a single value, I would think that a hash index would be as efficient or more than a range index. — Larry Maccherone

Aravind Krishna R. Aravind Krishna R. · Accepted Answer · 2015-05-30T23:46:28

Queries using != will require a scan, since all index entries have to be looked up to eliminate the values that are not equal to 5. This will lead to a high RUs. Whenever possible, please try rewriting the query as an equality or a range query.

If the number of possible values of item.State is limited, e.g., 1-10, then you can enumerate the valid values using IN, i.e., item.State IN (1,2,3,4,6,7,8,9,10)
If that's not possible, the please use ranges e.g., item.State > 5 OR item.State < 5 like suggested by Larry above.

Hope this helps. Please email me at arramac at microsoft dot com in case you'd like to discuss in more detail.

DocumentDB Query Requires Unexpected High RUs

1 Answers