I am working with Azure Cosmos DB, specifically the Gremlin API, and I am a bit stuck on what to select as a partition key.
Since I'm using graph data, not all vertices follow the same schema. If I select a property that not all vertices have in common, Azure won't let me store vertices that don't have a value for the partition key. The problem is that the only property they all have in common is /id, but Azure doesn't allow this property to be used as a partition key.
Does that mean I need to create a property that all my vertices will have in common? Doesn't that somewhat defeat the purpose of graph data? Or is there something I'm missing?
For example, in my case, I want to model an object and its parts. Each object and each part has an /identificationNumber property. Would it be better to use this property as the partition key, or to create a new /partitionKey property dedicated to partitioning? My concern is that if I select /identificationNumber as the partition key and my data model later evolves to include new objects without an /identificationNumber, I will have to artificially add this property to those objects in the data model, which might lead to some confusion.
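To make the dedicated /partitionKey option concrete, here is a minimal Python sketch of what I have in mind. It only builds Gremlin query strings and does not talk to Cosmos DB. The helper names (gremlin_literal, add_vertex_query), the labels, and the sample values are hypothetical, and I am assuming the container was created with /partitionKey as its partition key path.

```python
# Sketch only: builds Gremlin query strings, does not connect to Cosmos DB.
# Assumption: the graph container uses "/partitionKey" as its partition key path.
PARTITION_KEY_PROPERTY = "partitionKey"  # hypothetical dedicated property name


def gremlin_literal(value):
    """Render a Python value as a Gremlin literal (strings, numbers, booleans)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Escape backslashes first, then single quotes, for a single-quoted string.
    escaped = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def add_vertex_query(label, vertex_id, partition_value, properties=None):
    """Build an addV() query that always sets the dedicated partition property."""
    steps = [
        f"g.addV({gremlin_literal(label)})",
        f".property('id', {gremlin_literal(vertex_id)})",
        f".property({gremlin_literal(PARTITION_KEY_PROPERTY)}, "
        f"{gremlin_literal(partition_value)})",
    ]
    # Any schema-specific properties are optional and vary per vertex type.
    for name, value in (properties or {}).items():
        steps.append(f".property({gremlin_literal(name)}, {gremlin_literal(value)})")
    return "".join(steps)


if __name__ == "__main__":
    # An object that has an identificationNumber reuses it as the partition value.
    print(add_vertex_query("object", "obj-1", "A-100",
                           {"identificationNumber": "A-100"}))
    # A future vertex type without identificationNumber still gets a partition value.
    print(add_vertex_query("note", "note-1", "A-100"))
```

With this approach, /identificationNumber stays an ordinary optional property, and only /partitionKey has to exist on every vertex. Whether a part should share its parent object's partition value, as in the example, is a choice I am unsure about.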