
I'm trying to develop a user profile service (ASP.NET Core Web API) that uses Azure Cosmos DB as its persistent storage. Even after reading various articles, I have not been able to figure out the appropriate partition key for this service. According to those articles:

the partition key (which defines the logical partition) should be a property with even access patterns. An ideal partition key is one that appears frequently as a filter in your queries and has sufficient cardinality to ensure your solution is scalable.

Below is a sample document as stored in Azure Cosmos DB (SQL API).

{
    "id": <<Id>>,
    "uniqueBusinessId": <<uniqueBusinessId>>,
    "userName": <<userName>>,
    "isActive": <<isActive>>,
    "email": <<email>>,
    "salutation": <<salutation>>,
    "firstName": <<firstName>>,
    "middleName": <<middleName>>,
    "lastName": <<lastName>>,
    "companyName": <<companyName>>,
    "jobTitle": <<jobTitle>>,
    "address": [
        {
            "countryCode": <<Country Code>>,
            "stateProvinceCode": <<StateProvinceCode>>,
            "address1": <<addressLine1>>,
            "address2": null,
            "city": <<city>>,
            "postalCode": <<postalCode>>
        }
    ],
    "phone": [
        {
            "countryCode": <<Country Code>>,
            "areaCode": <<area Code>>,
            "number": <<number>>,
            "extension": <<extension>>
        }
    ]
}

There will be one document per user in the collection, and 99% of queries will fetch data by uniqueBusinessId, which is unique for every user (the system will have around 1 million users).

If I choose uniqueBusinessId as the partition key for this collection, it would create 1 million logical partitions, each holding a single document. Is uniqueBusinessId the right candidate for the partition key? I could instead choose /address/city or another property in the document so that documents group together, but then queries that filter on uniqueBusinessId would become cross-partition scans.

Any suggestions on the appropriate partition key for the above document?

1 Answer


Cardinality is fine to keep in mind, but put business logic and what makes sense for your queries above everything else. You want to eliminate the possibility of a cross-partition query by choosing a key whose value you will always have available when you query.

You DO NOT want to have ANY cross partition queries as part of your day to day workflow in your application.

Choosing uniqueBusinessId would be a good choice if you have access to it 99% of the time. Every such query can then be routed to a single partition, which gives good performance and low-cost operations.
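To illustrate why the choice of key matters for routing, here is a minimal, self-contained simulation. It is not the Cosmos DB SDK and does not use Cosmos DB's real hash function: SHA-256, the four physical partitions, and the helper names (physical_partition, build_store, find_by_business_id) are all assumptions made for the sketch. It only shows that a lookup by uniqueBusinessId touches one partition when that field is the partition key, and has to fan out to every partition when the key is something else, such as a city.

```python
import hashlib
from collections import defaultdict

# Hypothetical number of physical partitions; a real service manages this itself.
PHYSICAL_PARTITIONS = 4


def physical_partition(partition_key_value: str) -> int:
    """Map a partition key value to a partition index (SHA-256 is a stand-in hash)."""
    digest = hashlib.sha256(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % PHYSICAL_PARTITIONS


def build_store(users, key_field):
    """Place each document in the partition chosen by hashing its key_field value."""
    store = defaultdict(list)
    for user in users:
        store[physical_partition(user[key_field])].append(user)
    return store


def find_by_business_id(store, key_field, business_id):
    """Return (document, number of partitions that had to be scanned)."""
    if key_field == "uniqueBusinessId":
        # The filter value is the partition key, so the target partition is known.
        candidates = [physical_partition(business_id)]
    else:
        # The filter value says nothing about placement: scan every partition.
        candidates = list(range(PHYSICAL_PARTITIONS))
    found = None
    for index in candidates:
        for doc in store.get(index, []):
            if doc["uniqueBusinessId"] == business_id:
                found = doc
    return found, len(candidates)


# Made-up sample data: 1000 users spread over 5 cities.
users = [
    {"uniqueBusinessId": f"user-{i}", "city": f"city-{i % 5}"}
    for i in range(1000)
]

by_id = build_store(users, "uniqueBusinessId")
by_city = build_store(users, "city")

doc_a, scanned_a = find_by_business_id(by_id, "uniqueBusinessId", "user-42")
doc_b, scanned_b = find_by_business_id(by_city, "city", "user-42")
print(scanned_a, scanned_b)
```

Both lookups return the same document; the difference is only in how many partitions had to be consulted to find it, which is what drives cost in a partitioned store.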

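Keep in mind, however, that each logical partition has a maximum size (10 GB when this was written; current Azure documentation lists 20 GB). If the data stored under a single uniqueBusinessId value has any chance of reaching that limit, then you cannot use it. With one document per user, as described in the question, each logical partition holds a single small document, so the limit is not a practical concern here.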
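A quick back-of-the-envelope check of the size concern, as a sketch: it uses 10 GB as a deliberately conservative ceiling and assumes the 2 MB maximum item size that Cosmos DB documents for a single item. The function and constant names are made up for the example.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Conservative ceiling for one logical partition (assumption for this sketch).
PARTITION_CEILING_BYTES = 10 * GIB
# Assumed maximum size of a single item.
MAX_ITEM_BYTES = 2 * MIB


def logical_partition_bytes(docs_per_key_value: int, bytes_per_doc: int) -> int:
    """Total bytes stored under one partition key value."""
    return docs_per_key_value * bytes_per_doc


# One document per uniqueBusinessId, even at the largest allowed item size.
worst_case = logical_partition_bytes(1, MAX_ITEM_BYTES)

# How many maximum-size documents one key value could hold before hitting the ceiling.
headroom = PARTITION_CEILING_BYTES // worst_case

print(worst_case, headroom)
```

Under these assumptions a single key value could hold thousands of maximum-size documents before reaching the ceiling, so a one-document-per-user design stays far below it. The picture changes only if many documents are later stored under the same uniqueBusinessId.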