4
votes

What do you recommend in the following scenario:

I have an azure table called Users where as columns are:

  • PrimaryKey
  • RowKey
  • Timestamp
  • FirstName
  • LastName
  • Email
  • Phone

Then there are different types of tasks for each user let's call them TaskType1 and TaskType2.

Both task types have common columns but then have also type specific columns like this:

  • PrimaryKey (this is the same as the Users PrimaryKey to find all tasks belonging to one user)
  • RowKey
  • Timestamp
  • Name
  • DueDate
  • Description
  • Priority

then TaskType1 has additional columns:

  • EstimationCompletionDate
  • IsFeasible

and TaskType2 has it's own specific column:

  • EstimatedCosts

I know I can store both types in the same table and my question is:

If I use different tables for TaskType1 and TaskType2 what will be the impact in transactions costs? I will guess that if I have 2 tables for each task type and then I will issue a query like: get me all tasks where the task Primarykey is equal to a specific user from Users table PrimaryKey then I will have to run 2 queries for each types (because users can have both tasks type) that means more transactions ... instead if both tasks are in the same table then it will be like 1 query (in the limit of 1000 after pagination transactions) because I will get all the rows where the PartitionKey is the user PartitionKey so the partition is not split that means 1 transaction right?

So did I understood it right that I will have more transactions if I store the tasks in different tables .. ?

1

1 Answers

6
votes

Your understanding is completely correct. Having the tasks split into 2 separate tables would mean 2 separate queries thus 2 transactions (let's keep more than 1000 entities out of equation for now). Though transaction cost is one reason to keep them in the same table, there are other reasons too:

  • By keeping them in the same table, you would be making full use of schema-less nature of Azure Table Storage.
  • 2 tables means 2 network calls. Though the service is highly available but you would need to take into consideration a scenario when call to 1st table is successful however call to 2nd table fails. How would your application behave in that scenario? Do you discard the result from the 1st table also? By keeping them in just one table saves you from this scenario.
  • Assuming that you have a scenario in your application where a user could subscribe to both Task 1 and 2 simultaneously. If you keep them in the same table, you can make use of Entity Group Transaction as both entities (one for Task 1 and other for Task 2) will have the same PartitionKey (i.e. User Id). if you keep them in separate tables, you would not be able to take advantage of entity group transactions.

One suggestion I would give is to have a "TaskType" attribute in your Tasks table. That way you would have an easier way of filtering by tasks as well.