I have a Cosmos DB collection with numerous partitions based on a device ID. I frequently have use cases that require retrieving the most recent document by a specific device ID. I'm currently using the SELECT TOP 1
functionality available in the DocumentDB API as shown below to accomplish this:
SELECT TOP 1 *
FROM c
WHERE c.deviceId = 5
ORDER BY c.timeStamp DESC
This approach results in increased RU/s consumption and decreased performance as the collection and individual partitions grow in size, as one would expect . As a temporary remedy to this issue, I have added additional where clauses to limit the scope of the query by timestamp:
SELECT TOP 1 *
FROM c
WHERE c.deviceId = 5
AND c.timeStamp >= 1506608558 --timestamps are unix/epoch based to optimize indexing
AND c.timeStamp <= 1506694958
ORDER BY c.timeStamp DESC
I would like to know if there's a better way to select the latest document by partition id, as the addition of this where clause could result in unexpected or missing results.