2 votes

I want to move some of my Azure SQL tables to Table storage. As far as I understand, I can save everything in the same table, separating it using PartitionKey and keeping it unique within each partition using RowKey. Now, I have a table with a compound key:

  • ParentId: (uniqueidentifier)
  • ReportTime: (datetime)

I also understand RowKeys have to be strings. Will I need to combine these into a single string? Or can I combine multiple keys some other way? Do I need to make a new key, perhaps?

Any help is appreciated.

UPDATE

My idea is to take data from several (three for now) database tables and put it in the same storage table, separating them with the partition key.

I will query using the ParentId and a WeekNumber (another column). This table has about 1 million rows that are deleted weekly from the db. My two other tables have about 6 million and 3.5 million rows.


2 Answers

3 votes

This question is pretty broad and there is no right answer.

To the specific question: yes, you can use compound keys with Azure Table Storage, but this involves manually serializing/deserializing your object's properties. You can achieve that by overriding TableEntity's ReadEntity and WriteEntity methods. Check this detailed blog post on how you can override these methods to use your own custom serialization/deserialization.
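As a minimal sketch of that idea, assuming the classic Microsoft.WindowsAzure.Storage .NET SDK (the entity name ReportEntity, the "Reports" partition, and the timestamp format are all illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical entity whose compound key (ParentId + ReportTime) is packed
// into the string RowKey. The fixed-width timestamp keeps lexicographic
// RowKey order identical to chronological order.
public class ReportEntity : TableEntity
{
    private const string TimeFormat = "yyyyMMddHHmmssfff";

    // [IgnoreProperty] keeps these out of the stored columns; they live
    // only inside the RowKey and are rebuilt from it on read.
    [IgnoreProperty]
    public Guid ParentId { get; set; }
    [IgnoreProperty]
    public DateTime ReportTime { get; set; }

    public ReportEntity() { }   // parameterless ctor required by the SDK

    public ReportEntity(Guid parentId, DateTime reportTime)
    {
        ParentId = parentId;
        ReportTime = reportTime;
        PartitionKey = "Reports";   // e.g. one partition per source table
        RowKey = parentId.ToString("N") + "_" +
                 reportTime.ToString(TimeFormat, CultureInfo.InvariantCulture);
    }

    // Deserialization hook: unpack the compound key after the normal
    // column properties have been read.
    public override void ReadEntity(IDictionary<string, EntityProperty> properties,
                                    OperationContext operationContext)
    {
        base.ReadEntity(properties, operationContext);
        string[] parts = RowKey.Split('_');
        ParentId = Guid.ParseExact(parts[0], "N");
        ReportTime = DateTime.ParseExact(parts[1], TimeFormat,
                                         CultureInfo.InvariantCulture);
    }
}
```

Here only ReadEntity needs overriding, because the RowKey itself is built in the constructor; a fully custom column layout would override WriteEntity in the same way.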

Beyond that, here is my view on your broader question.

First of all, why do you want to put data from 3 (SQL) tables into one Azure Table? Just have 3 Azure tables.

Second, as Fabrizio points out, think about how you are going to query the records, because the Windows Azure Table service has only one index, and that is the PartitionKey + RowKey pair of properties (columns). If you are pretty sure you will mostly query data by a known PartitionKey and RowKey, then Azure Tables suits you perfectly! However, you say your combination for the RowKey is ParentId + WeekNumber. That means a record is uniquely identified by this combination; if that is true, then you are even more ready to go.
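As a hedged illustration of such a query, reusing the hypothetical ReportEntity above: this scans one partition for all rows of a given ParentId via a RowKey prefix range ('_' is 0x5F and '`' is 0x60, so the half-open range below matches exactly the RowKeys starting with that prefix):

```csharp
using System;
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage.Table;

// Range scan over the only index Table storage has: PartitionKey plus
// a RowKey prefix expressed as [prefix + "_", prefix + "`").
static IEnumerable<ReportEntity> QueryByParent(CloudTable table, Guid parentId)
{
    string prefix = parentId.ToString("N");
    string filter = TableQuery.CombineFilters(
        TableQuery.GenerateFilterCondition(
            "PartitionKey", QueryComparisons.Equal, "Reports"),
        TableOperators.And,
        TableQuery.CombineFilters(
            TableQuery.GenerateFilterCondition(
                "RowKey", QueryComparisons.GreaterThanOrEqual, prefix + "_"),
            TableOperators.And,
            TableQuery.GenerateFilterCondition(
                "RowKey", QueryComparisons.LessThan, prefix + "`")));

    return table.ExecuteQuery(new TableQuery<ReportEntity>().Where(filter));
}
```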

Next, you say you are going to delete records every week! You should know that the DELETE operation acts on a single entity. You can use Entity Group Transactions to delete multiple entities at once, but there are limits: (a) all entities in a batch operation must have the same PartitionKey, (b) the maximum number of entities per batch is 100, and (c) the maximum size of a batch operation is 4 MB. Say you have 1M records, as you mention. In order to delete them, you first have to retrieve them in groups of 100, then delete them in groups of 100. That is, in the best possible case, 10k operations for retrieval and another 10k for deletion. Even if this only costs 0.002 USD, think about the time it takes to execute 10k operations against a REST API.
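A rough sketch of that retrieve-then-delete loop, under the same assumptions as the snippets above:

```csharp
using Microsoft.WindowsAzure.Storage.Table;

// Delete everything in one partition, 100 entities per Entity Group
// Transaction (the batch limit). ExecuteQuery follows continuation
// tokens automatically, so this streams through all rows.
static void DeletePartition(CloudTable table, string partitionKey)
{
    var query = new TableQuery<ReportEntity>().Where(
        TableQuery.GenerateFilterCondition(
            "PartitionKey", QueryComparisons.Equal, partitionKey));

    var batch = new TableBatchOperation();
    foreach (ReportEntity entity in table.ExecuteQuery(query))
    {
        batch.Delete(entity);
        if (batch.Count == 100)
        {
            table.ExecuteBatch(batch);
            batch = new TableBatchOperation();
        }
    }
    if (batch.Count > 0)
        table.ExecuteBatch(batch);
}
```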

Since you have to delete entities on a regular basis that is fixed to a WeekNumber, let's say, I suggest you create your tables dynamically and include the week number in the table name (see the sketch after this list). Thus you will achieve:

  • Even better partitioning of information
  • Easier and more granular information backup / delete
  • Deletion of millions of entities with just one operation (delete the table)
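One possible shape for this, with an illustrative naming scheme and week-number rule:

```csharp
using System;
using System.Globalization;
using Microsoft.WindowsAzure.Storage.Table;

// One table per week, e.g. "Reports2014W07". Table names must be
// alphanumeric, so the week number is zero-padded rather than dashed.
static CloudTable GetWeekTable(CloudTableClient client, DateTime date)
{
    int week = CultureInfo.InvariantCulture.Calendar.GetWeekOfYear(
        date, CalendarWeekRule.FirstFourDayWeek, DayOfWeek.Monday);
    CloudTable table = client.GetTableReference(
        string.Format("Reports{0}W{1:D2}", date.Year, week));
    table.CreateIfNotExists();   // cheap no-op once the table exists
    return table;
}

// Dropping last week's data is then a single call:
//   GetWeekTable(client, DateTime.UtcNow.AddDays(-7)).DeleteIfExists();
```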
1 vote

There is no unique solution to your problem. Yes, you can use ParentID as the PartitionKey and ReportTime as the RowKey (or invert the assignment). But the two big questions are: how do you query your data, and with what conditions? And how much data do you store: 1,000 items, 1 million items, 1,000 million items? The total storage used is important, but it is also very important to consider the number of transactions you will generate against the storage.
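As a small, hedged sketch of that assignment (the zero-padded Ticks format is just one way to make string RowKeys sort chronologically):

```csharp
// ParentID as PartitionKey, ReportTime as RowKey. Ticks padded to 19
// digits sort lexicographically in chronological order; using
// (DateTime.MaxValue.Ticks - reportTime.Ticks) instead returns newest first.
string partitionKey = parentId.ToString();
string rowKey = reportTime.Ticks.ToString("D19");
```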