I have been reading about how to partition Azure Table Storage for high performance. I would like to know whether my proposed strategy supports efficient, scalable inserts and simple queries against the data store.
I have 1000 different processes, each uploading a small packet of data (~50 bytes) to Azure Table Storage every 30 seconds. My queries will virtually always filter by process and time. For example, I want to fetch all of process A's logs from 7PM to 9PM on a given date.
My proposed strategy is to create a table for each process (1000 tables) and then partition the rows so that each partition holds 6 hours of data (4 new partitions per day, 720 rows per partition). Partition key 'NOV82012-0' would contain the 720 rows from midnight on November 8 until 6AM, 'NOV82012-1' would cover 6AM to noon, and so on.
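To make the scheme concrete, here is a rough sketch of how I'd build the keys in Python. The zero-padded HHMMSS row key is just a placeholder on my part; I haven't settled on a row key format yet:

```python
from datetime import datetime, timezone

def partition_key(ts: datetime) -> str:
    """Build a partition key like 'NOV82012-0':
    month abbreviation + day + year, plus the 6-hour slot (0-3)."""
    return f"{ts.strftime('%b').upper()}{ts.day}{ts.year}-{ts.hour // 6}"

def row_key(ts: datetime) -> str:
    """Placeholder row key: zero-padded HHMMSS, so rows sort by time
    within a partition when compared as strings."""
    return ts.strftime("%H%M%S")

# Example: a packet logged at 7:15:30 PM UTC on November 8, 2012
ts = datetime(2012, 11, 8, 19, 15, 30, tzinfo=timezone.utc)
print(partition_key(ts))  # NOV82012-3  (slot 3 covers 6PM-midnight)
print(row_key(ts))        # 191530
```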
This should ensure that any partition holds fewer than 1000 rows, so a query scoped to a single partition can never return more than 1000 entities and I don't have to worry about continuation tokens. I can also easily 'filter' by process, since each process writes to its own table.
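For instance, the 7PM-9PM query for process A would touch a single partition. Here is a sketch using the azure-data-tables Python SDK; the connection string, the "processA" table name, and the HHMMSS row-key format from the sketch above are all assumptions of mine:

```python
from azure.data.tables import TableClient

# Placeholder connection string; one table per process is assumed.
CONN_STR = "<storage-account-connection-string>"
table = TableClient.from_connection_string(CONN_STR, table_name="processA")

# 7PM-9PM on Nov 8, 2012 falls entirely inside the 6PM-midnight slot,
# so this is a range query against a single partition: 'NOV82012-3'.
entities = table.query_entities(
    "PartitionKey eq @pk and RowKey ge @start and RowKey lt @end",
    parameters={"pk": "NOV82012-3", "start": "190000", "end": "210000"},
)
for entity in entities:
    print(entity["RowKey"], dict(entity))
```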
Is this the ideal strategy for this case? Am I missing anything?