Efficient way of returning item list from azure cosmos db

Question

I want to store data of the following form in Azure Cosmos DB:

{
  "id": "guid",
  "name": "a name",
  "tenantId": "guid",
  "filter1": true,
  "filter2": false,
  "hierarchicalData": {}
}

Each document will be up to a few megabytes in size.

I need to be able to return a {id, name} list (100 < count < 10k, per tenant) for a given search by {tenantId,filter1,filter2}.

From the documentation, I see I can do a SQL query with a projection, but I am not sure if there is a better way.

Is there an ideal way to do the above while making efficient use of RUs?

A few MB? There's a 2MB document limit. And if your document hierarchy can grow unbounded, you'll eventually run out of room and your model will be broken. That said: without seeing an example of the output you're looking for, it's impossible to offer a solution. Please edit your question accordingly. — David Makogon
Did you try to test it out? What query/index did you use? What you consider acceptable RU usage? What is your actual (unacceptable) RU usage? Note that "Ideal" and "Efficient" solutions always start from "it depends". — Imre Pühvel
Thanks @JayGong, your answer does help. Most of the hierarchical data doesn't need to be indexed, so excluding that is a good idea. — Aaron0
@DavidMakogon good pickup, I wasn't aware of the 2MB limit. We are going to need to do some modelling of the maximum possible size, as that limit may rule out our use anyway. — Aaron0

Jay Gong · Accepted Answer · 2018-10-01T08:44:10

Is there an ideal way to do the above while making efficient use of RUs?

It is hard to say that there is a single best way to use RUs efficiently and improve query performance.

Based on your situation, you can of course use a SQL query with specific filters to get the data. Here are several ways to improve your query performance:

1. Add a partition key.

If your data is partitioned and you provide the partition key with the SQL query, only that specific partition is scanned, which saves RUs. Please refer to the partitioning documentation.
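As a minimal sketch of the above, assuming the container is partitioned on /tenantId (the question does not say so) and that `container` is a container client from the azure-cosmos Python SDK, a projected single-partition query could look like this. The helper names `build_parameters` and `list_item_names` are hypothetical.

```python
from typing import Any, Dict, List

# Projection: only id and name are returned, not the large hierarchicalData.
QUERY = (
    "SELECT c.id, c.name FROM c "
    "WHERE c.tenantId = @tenantId "
    "AND c.filter1 = @filter1 AND c.filter2 = @filter2"
)


def build_parameters(tenant_id: str, filter1: bool, filter2: bool) -> List[Dict[str, Any]]:
    """Build the parameter list for the parameterized query above."""
    return [
        {"name": "@tenantId", "value": tenant_id},
        {"name": "@filter1", "value": filter1},
        {"name": "@filter2", "value": filter2},
    ]


def list_item_names(container: Any, tenant_id: str, filter1: bool, filter2: bool) -> List[Dict[str, str]]:
    """Return the {id, name} pairs for one tenant and filter combination.

    Assumption: tenantId is the partition key, so passing partition_key
    keeps the query inside a single logical partition.
    """
    items = container.query_items(
        query=QUERY,
        parameters=build_parameters(tenant_id, filter1, filter2),
        partition_key=tenant_id,
    )
    return [{"id": item["id"], "name": item["name"]} for item in items]
```

With the real SDK, `container` would typically come from `CosmosClient(...).get_database_client(...).get_container_client(...)`. The actual RU charge depends on your data and indexing policy, so it is worth measuring.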

2. Use a recent SDK.

The Azure Cosmos DB SDKs are constantly being improved to provide the best performance. See the Azure Cosmos DB SDK pages to determine the most recent SDK and review improvements.

3. Exclude unused paths from indexing for faster writes.

Cosmos DB's indexing policy also allows you to specify which document paths to include or exclude from indexing by leveraging Indexing Paths (IndexingPolicy.IncludedPaths and IndexingPolicy.ExcludedPaths). The use of indexing paths can offer improved write performance and lower index storage for scenarios in which the query patterns are known beforehand.
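For the document shape in the question, an indexing policy that skips the large hierarchical subtree might be sketched as follows. This assumes nothing under /hierarchicalData is ever queried; the helper name `build_indexing_policy` is hypothetical.

```python
import json
from typing import Any, Dict, List


def build_indexing_policy(excluded_paths: List[str]) -> Dict[str, Any]:
    """Index everything by default, except the given paths."""
    return {
        "indexingMode": "consistent",
        "automatic": True,
        "includedPaths": [{"path": "/*"}],
        "excludedPaths": [{"path": path} for path in excluded_paths],
    }


# Assumption: hierarchicalData is never used in a filter or ORDER BY,
# so the whole subtree can be left out of the index.
policy = build_indexing_policy(["/hierarchicalData/*"])
print(json.dumps(policy, indent=2))
```

In the azure-cosmos Python SDK, a dictionary like this can be passed as the `indexing_policy` argument when creating the container. tenantId, filter1, and filter2 stay indexed because they fall under the default `/*` included path.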

4. Use a continuation token if the data is too large.

Page through the data with a continuation token to improve query performance. Doc: https://www.kevinkuszyk.com/2016/08/19/paging-through-query-results-in-azure-documentdb/
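A rough sketch of paging, assuming the azure-cosmos Python SDK, where `query_items` returns a pageable result whose `by_page()` iterator exposes a `continuation_token`. The helper name `fetch_page` and the default page size are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple


def fetch_page(
    container: Any,
    query: str,
    parameters: List[Dict[str, Any]],
    partition_key: str,
    page_size: int = 100,
    continuation_token: Optional[str] = None,
) -> Tuple[List[Dict[str, Any]], Optional[str]]:
    """Fetch one page of results plus the token for the next page.

    A returned token of None means there are no more pages.
    """
    results = container.query_items(
        query=query,
        parameters=parameters,
        partition_key=partition_key,
        max_item_count=page_size,  # upper bound on items per page
    )
    # Resume from the previous position, or start from the beginning
    # when continuation_token is None.
    pager = results.by_page(continuation_token)
    page = list(next(pager))
    return page, pager.continuation_token
```

A caller would keep passing the returned token back into `fetch_page` until it comes back as `None`, so no single request has to materialize the full 10k-item list.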

For more details, please refer to here.
