How do I use mongodb's sharding in a CosmosDB instance

Question

I am currently evaluating MongoDB for a multi-tenant, multi-document-type application running on top of an Azure CosmosDB infrastructure.

The partitioning pages on the MSFT documentation (i.e. https://docs.microsoft.com/en-us/azure/cosmos-db/partition-data) explain thoroughly how to implement partitioning strategies if you are using DocumentDB to communicate with Cosmos, but they don't go into any detail as to how I should handle things when using the MongoDB API.

My idea would basically be:

A single database
A single collection

Both would map naturally to Cosmos' model for the cheapest experience. I'd aim at 400 RUs as the standard size, as it's the cheapest option.
Multiple types of documents, each with a TenantID property that maps to separate tenants in the application, each with their own concerns (security, users, performance, etc.), and a DocumentType property to allow for easy filtering.

With the DocumentDB API it'd be natural to use the TenantID as a PartitionKey. With MongoDB API, can I just leave it to Azure? Should I do something 'manually'?

I'm using the C# API, if it matters - I assume the configuration would be similar anywhere else.

If you'd like to distribute your data among multiple partitions, you need to specify a shard key when you add the collection. - Fei Han
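As a rough illustration of specifying a shard key up front, here is a minimal Python sketch that builds a `shardCollection` command document with a hashed key on `TenantID`. The database name `appdb` and collection name `documents` are hypothetical, and whether your Cosmos DB account accepts the command in exactly this form is an assumption to check against the current Azure documentation.

```python
# Sketch: build a shardCollection command for a tenant-keyed collection.
# "appdb" and "documents" are hypothetical names used only for illustration.

def build_shard_command(database: str, collection: str, shard_key: str) -> dict:
    """Return a shardCollection command document using a hashed shard key."""
    return {
        # The command name must be the first key in the document.
        "shardCollection": f"{database}.{collection}",
        "key": {shard_key: "hashed"},
    }

command = build_shard_command("appdb", "documents", "TenantID")
print(command)

# Sending it requires a driver and a live connection, for example with
# pymongo (assumed installed). Stock MongoDB expects this command on the
# admin database; treat the Cosmos DB requirement as something to verify:
#   from pymongo import MongoClient
#   client = MongoClient("<your connection string>")
#   client.admin.command(command)
```

The same command document could be sent from any MongoDB driver, including the C# one, since it is only a plain key/value structure.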

dov.amir · Accepted Answer · 2017-12-15T11:57:14

The MongoDB and Cosmos DB sharding mechanisms are built differently, so the shard key should differ between the two systems if you want to get the most out of each platform.

Taken from this site about MongoDB: http://learnmongodbthehardway.com/schema/sharding/

Cardinality

Always consider the number of values your shard key can express. A sharding key that has only 50 possible values, is considered low cardinality, while one that might be able to express several million values might be considered a high cardinality key. High cardinality keys are preferable to low cardinality keys to avoid un-splittable chunks.

So in MongoDB you will want high-cardinality shard keys, to target chunks (logical partitions) of about 64 MB,

whereas in Cosmos DB you will target lower-cardinality partition keys, because a logical partition can hold up to 10 GB.
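To make the size comparison concrete, here is a small Python sketch that estimates how much data one tenant produces and checks it against the two figures quoted above (about 64 MB per MongoDB chunk, up to 10 GB per Cosmos DB logical partition). The tenant sizes are made-up numbers, and both limits are taken from this answer, so treat them as assumptions that may have changed.

```python
# Limits as quoted in the answer above; assumptions, not authoritative values.
MONGO_CHUNK_BYTES = 64 * 1024**2               # about 64 MB per MongoDB chunk
COSMOS_LOGICAL_PARTITION_BYTES = 10 * 1024**3  # about 10 GB per logical partition

def tenant_footprint(doc_count: int, avg_doc_bytes: int) -> dict:
    """Estimate a tenant's data size and compare it with both limits."""
    total = doc_count * avg_doc_bytes
    return {
        "total_bytes": total,
        # Ceiling division: number of 64 MB chunks the data would span.
        "mongo_chunks": -(-total // MONGO_CHUNK_BYTES),
        # True if all of the tenant's data fits in one logical partition.
        "fits_cosmos_partition": total <= COSMOS_LOGICAL_PARTITION_BYTES,
    }

# Hypothetical tenants: 2 million and 10 million documents of 2 KiB each.
small_tenant = tenant_footprint(2_000_000, 2048)
large_tenant = tenant_footprint(10_000_000, 2048)
print(small_tenant)
print(large_tenant)
```

If a single TenantID value could outgrow the logical partition limit, as the larger hypothetical tenant does here, TenantID alone would be too coarse a key for that tenant.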
