Retrieving a document by id is slow across partitions in Cosmos DB

Question

I have a scenario where I need to retrieve a single document by its id property from Azure Cosmos DB. The only problem is that I don't know the partition key and thus cannot use the document URI to access it.

From my understanding, writing a simple query like

SELECT * from c WHERE c.id = "id here"

should be the way to go, but I'm experiencing severe performance issues with this query. Most queries take 30 to 60 seconds to complete and seem to consume insane amounts of RU/s. When executing 10 concurrent queries, the max RU/s per partition went as high as 30.000 (10.00 per partition was provisioned), resulting in throttling and even slower responses.

The collection comprises 10 partitions with around 3 MB per partition, so 30 MB in total and around 1,00,000 documents. My indexing policy looks like this:

{
    "indexingMode": "lazy",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*",
            "indexes": [
                {
                    "kind": "Range",
                    "dataType": "Number",
                    "precision": -1
                },
                {
                    "kind": "Hash",
                    "dataType": "String",
                    "precision": 3
                }
            ]
        }
    ],
    "excludedPaths": []
}

And the consistency is set to EVENTUAL since I don't really care about read/write order. The collection is under some write pressure with about 30 writes per minute and there's a TTL of 1 year for each document, yet this doesn't seem to produce a measurable impact on the RU/s. I experience this sort of problem only when querying documents.

Has anyone had similar problems and can offer a fix or mitigation? Am I doing something wrong with my query or indexing policy? I don't know why my query is consuming so many resources.

I've experienced the same problem, but have no write pressure at all. Querying by any other indexed field is actually faster than querying by id. — Mike W.

Grey Wolf · Accepted Answer · 2017-12-14T02:25:32

My test DB has about 300k records. When I try to select by id only, like this:

SELECT * FROM c where c.id = 'xxx'

it takes a lot of time and RUs.

But when I include the partition key in the query:

SELECT * FROM c where c.id = 'xxx' AND c.partitionField = 'yyy'

it's very fast.
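To illustrate why the second query can be so much cheaper, here is a toy model in Python. It is not the Cosmos DB engine and says nothing about real RU charges; `ToyContainer`, its methods, and the `tenant-N` values are made up for this sketch. It only shows the routing difference: without the partition key every partition has to be asked, with it only one.

```python
import zlib


class ToyContainer:
    """Toy stand-in for a partitioned collection (hypothetical, not Cosmos DB)."""

    def __init__(self, partition_count=10):
        # Each partition maps document id -> document.
        self.partitions = [{} for _ in range(partition_count)]

    def _route(self, partition_value):
        # Deterministic hash routing on the partition key value.
        return zlib.crc32(partition_value.encode("utf-8")) % len(self.partitions)

    def upsert(self, doc):
        self.partitions[self._route(doc["partitionField"])][doc["id"]] = doc

    def find_by_id(self, doc_id):
        """Like WHERE c.id = ...: no routing info, so every partition is asked."""
        touched = 0
        found = None
        for partition in self.partitions:
            touched += 1
            found = partition.get(doc_id, found)
        return found, touched

    def find_by_id_and_key(self, doc_id, partition_value):
        """Like WHERE c.id = ... AND c.partitionField = ...: one partition is asked."""
        partition = self.partitions[self._route(partition_value)]
        return partition.get(doc_id), 1


container = ToyContainer(partition_count=10)
for i in range(1000):
    container.upsert({"id": f"doc-{i}", "partitionField": f"tenant-{i % 50}"})

doc, touched_all = container.find_by_id("doc-123")
same_doc, touched_one = container.find_by_id_and_key("doc-123", "tenant-23")
print(touched_all, touched_one)  # prints: 10 1
```

With 10 concurrent id-only lookups on 10 partitions, this model performs 100 partition visits instead of 10, which is the same direction as the throttling described in the question.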

So I think you need to restructure your DB and think about which field to use as the partition key.
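Once the partition key value is available to the caller, the lookup can become a point read instead of a query. The following is only a sketch, assuming a container object shaped like `ContainerProxy` from the azure-cosmos Python SDK (v4), which exposes `read_item` and `query_items`; the helper name `get_document` is hypothetical, and the thread itself does not say which SDK is in use.

```python
def get_document(container, doc_id, partition_key=None):
    """Fetch one document by id, preferring a point read.

    `container` is assumed to behave like azure-cosmos's ContainerProxy.
    """
    if partition_key is not None:
        # Point read: routed straight to the owning partition.
        # In the real SDK this raises an exception if the item does not exist.
        return container.read_item(item=doc_id, partition_key=partition_key)

    # Fallback: parameterized query fanned out to all partitions.
    # This is the slow, RU-heavy path described in the question.
    items = container.query_items(
        query="SELECT * FROM c WHERE c.id = @id",
        parameters=[{"name": "@id", "value": doc_id}],
        enable_cross_partition_query=True,
    )
    return next(iter(items), None)
```

If the partition key can be chosen as the id itself, a caller could pass `partition_key=doc_id` and never need the fallback; whether that key fits the rest of the workload is a separate design question.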
