
I've built an ATS (Azure Table Storage) application where a medium-sized set of entity instances (maybe 50-1,000 entities) belongs to a given customer.

Currently, I have the following design:

  • Each entity type has its own table. E.g. there's a 'customers' table storing all customers, and a 'things' table storing all things.

  • The customer ID is the partition key of an entity. E.g. a thing belonging to customer ABC is stored in partition ABC.

I mostly query, update and delete entity sets owned by a certain customer, e.g. querying all 'thing' instances a given customer has.
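For context, here's a minimal sketch of that access pattern using the `azure-data-tables` Python SDK; the connection string is a placeholder and the table name follows the design above:

```python
from azure.data.tables import TableServiceClient

# Placeholder connection string; substitute your storage account's.
service = TableServiceClient.from_connection_string("<connection-string>")

# One table per entity type, partitioned by customer ID.
things = service.get_table_client("things")

# All 'thing' entities owned by customer ABC: a single-partition query.
customer_things = list(things.query_entities("PartitionKey eq 'ABC'"))
```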

Is that a good way of organizing rows? Or would it be better to have one table per customer where all of that customer's data is stored?

Thank you in advance

There are several flavors of organizing data in Table Storage, and probably every flavor has its own proponents. It also depends on the structure of, for instance, the 'things' data: if it has categories, those could make an excellent partition key in a things table. If it's relatively simple data, your solution is just fine. So in short: we need more information to give an answer, and even then it will be a primarily opinion-based one. – rickvdbosch
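As a sketch of the comment's suggestion (category as partition key), again with the `azure-data-tables` Python SDK; the category, row key, and field names are all invented for illustration:

```python
from azure.data.tables import TableServiceClient

service = TableServiceClient.from_connection_string("<connection-string>")
things = service.get_table_client("things")

# Category as PartitionKey instead of customer ID; all values are made up.
things.create_entity({
    "PartitionKey": "hardware",  # the thing's category
    "RowKey": "thing-001",       # unique within the category
    "CustomerId": "ABC",
})
```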

1 Answer


One difference between the 2 alternative designs in your question would be that you lose the batch operation support accross entire set of entities belonging to a specific customer id with your current solution because they are distributed accross different tables. You did not mention great deal of detail on your requirements but you may end up inconsistincies like deleting a customer from your customer table but not being able to delete that customers entities from corresponding entitiy tables because azure table storage only supports atomic batch operations on the same partition key and within the same table. Whereas if you put all your entities in the same table with customer id as PK and may be a category field as RK you could run batch operations accross entire partition key. Caveat is that this pattern would likely bloat your partitions with time and you may hit other perf related issues so.. this answer is more like informational than definitive. Definitive answer would require looking at your full set of requirements, data models, frequent use cases, throughput targets etc.