Microsoft's documentation of Managing indexing in Azure Cosmos DB's API for MongoDB states that:
Azure Cosmos DB's API for MongoDB server version 3.6 automatically indexes the _id field, which can't be dropped. It automatically enforces the uniqueness of the _id field per shard key.
I'm confused about the reasoning behind "per shard key" part. I see it as "you're unique field won't be globally unique at all" because if I understand it correctly, if I have the Guid field _id as unique and userId field as the partition key then I can have 2 elements with the same ID provided that they happen to belong to 2 different users.
Is it that I fail to pick the right partition key? Because in my understanding partition key should be the field that is the most frequently used for filtering the data. But what if I need to select the data from the database only by having the ID field value? Or query the data for all users?
Is it the inherent limits in distributed systems that I need to accept and therefore remodel my process of designing a database and programming the access to it? Which in this case would be: ALWAYS query your data from this collection not only by _id field but first by userId field? And not treat my _id field alone as an identifier but rather see an identifier as a compound of userId and _id?
_id, and you're using something like a guid method, this shouldn't be an issue to you. However, if you're leaving it to Cosmos DB to generate_id(in other words, you didn't specify a value), then you will be guaranteed to have uniqueness within a shard (partition), but not necessarily globally. - David Makogon_idvalue in different partitions (shards). The enforcement is at the partition boundary. - David Makogon