0
votes

I have a few million documents (about 800 MB) in a collection with session consistency and the default range indexes, partitioned on /id.

I run these two queries, which return the same results, but the time and RUs they consume differ by orders of magnitude.

This takes 16K RUs and >= 30 seconds.

SELECT * FROM c WHERE ( c.Status = 'SomethingDistinctive' AND c.User != null)

This takes 40 RUs and about 1 second.

SELECT * FROM c WHERE ( c.Status = 'SomethingDistinctive' AND c.User.Email != null)

Basically, any document that has a User also has a User.Email, so both filters match the same documents.

Could someone familiar with CosmosDB, or someone from Microsoft, provide some insight or guidelines? Should I be adding extra indexes? I collected query metrics for the slow query and found that the index ratio is 0, which seems to indicate that the index is not used at all!
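For anyone who wants to check the same numbers, here is a minimal sketch of reading the query metrics, assuming they arrive as a semicolon-separated `key=value` string with fields named `retrievedDocumentCount`, `outputDocumentCount` and `indexUtilizationRatio`. The field names, the helper functions, the threshold and the sample values are all assumptions for illustration, so compare them against what your SDK actually returns.

```python
def parse_query_metrics(header: str) -> dict:
    """Parse 'key=value;key=value' metrics text into a dict of floats."""
    metrics = {}
    for pair in header.split(";"):
        if not pair.strip():
            continue  # skip empty segments, e.g. from a trailing ';'
        key, _, value = pair.partition("=")
        metrics[key.strip()] = float(value)
    return metrics


def looks_like_scan(metrics: dict, threshold: float = 0.1) -> bool:
    """Flag a query that returned only a tiny fraction of the documents it loaded."""
    retrieved = metrics.get("retrievedDocumentCount", 0.0)
    output = metrics.get("outputDocumentCount", 0.0)
    if retrieved == 0:
        return False  # nothing was loaded, so there is nothing to compare
    return output / retrieved < threshold


# Made-up sample values, not taken from the queries in this question.
sample = "retrievedDocumentCount=2500000;outputDocumentCount=12;indexUtilizationRatio=0.00"
metrics = parse_query_metrics(sample)
print(metrics["indexUtilizationRatio"], looks_like_scan(metrics))
```

A query that loads millions of documents to return a handful is most likely filtering after loading rather than through the index, which would match an index ratio of 0.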

Thanks for any help!

1
In my experience, such a huge difference in RUs consumed from such a small change in the query usually means Cosmos is performing a scan. You can check whether this is the case by opting out of scans in the FeedOptions you provide: docs.microsoft.com/en-us/azure/cosmos-db/… - Joshua Krstic
@JoshuaKrstic - Both cross-partition queries and scans on non-indexed properties are disabled by default; they have to be opted in to, not out of. - David Makogon

1 Answer

-1
votes

What is the partition key defined as on this collection? I am guessing the partition key is /user/email rather than /user.

You can check the partition key and ID in Data Explorer, under the Documents tab.