We're seeing some very variable latencies when querying our Azure Table Storage data. We have a number of items each fetching time series data which is broken up by day as follows:
Partition key: {DATA_TYPE}_{YYYYMMdd} - 4 different data types with about 2 years of data in total
Row Key: {DataObjectId} - About 3-4,000 records per day.
A record itself is a JSON encoded array of dateTime objects spread out every 15 minutes.
I want to retrieve time series data for a specific object over the last few days, so I constructed the following query:
string.Format("(PartitionKey ge '{0}') and (PartitionKey le '{1}') and (RowKey eq '{2}')", lowDate, highDate, DataObjectId);
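To make the shape of that filter concrete, here is a minimal sketch of how the keys and the filter string could be put together. The class and method names (TimeSeriesQuery, PartitionKeyFor, BuildRangeFilter, PartitionKeysInRange) are made up for illustration, and the yyyyMMdd date format is an assumption based on the partition key layout described above. It only builds strings and does not call the storage SDK.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper, not part of any Azure SDK.
public static class TimeSeriesQuery
{
    // Assumed key layout: "{DATA_TYPE}_{yyyyMMdd}".
    public static string PartitionKeyFor(string dataType, DateTime day)
    {
        return string.Format(CultureInfo.InvariantCulture, "{0}_{1:yyyyMMdd}", dataType, day);
    }

    // Same filter as in the question: a partition key range plus an exact row key.
    // The range comparison on PartitionKey is a string comparison.
    public static string BuildRangeFilter(string dataType, DateTime from, DateTime to, string dataObjectId)
    {
        string lowDate = PartitionKeyFor(dataType, from);
        string highDate = PartitionKeyFor(dataType, to);
        return string.Format(
            "(PartitionKey ge '{0}') and (PartitionKey le '{1}') and (RowKey eq '{2}')",
            lowDate, highDate, dataObjectId);
    }

    // Lists the day partitions the range covers, one key per calendar day.
    public static IEnumerable<string> PartitionKeysInRange(string dataType, DateTime from, DateTime to)
    {
        for (DateTime day = from.Date; day <= to.Date; day = day.AddDays(1))
        {
            yield return PartitionKeyFor(dataType, day);
        }
    }
}
```

For a made-up data type "TEMP" and object id "obj42" between 1 and 3 March 2015, BuildRangeFilter gives (PartitionKey ge 'TEMP_20150301') and (PartitionKey le 'TEMP_20150303') and (RowKey eq 'obj42'), and PartitionKeysInRange lists the three day partitions that filter spans.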
As noted above, we have records going back 2-3 years now.
On the whole the query time is fairly speedy, at 600-800 ms. However, once or twice we have seen queries take a very long time to retrieve data from these partitions, i.e. one or two queries have taken 50+ seconds to return data.
We are not aware of the system being under dramatic load. In fact, frustratingly, all the graphs we've found in the portal suggest no real problems.
Some suggestions that come to mind:
1.) Add the year component first, making the partition keys immediately more selective.
However, the most frustrating thing is the variation in the time taken to run the queries.
The Azure storage latency shown in the Azure portal averages about 117.2 ms, and the maximum reported is 294 ms. I have interpreted this as network latency.
Any suggestions gratefully received. The most vexing thing is that the execution time is so variable. In a very small number of cases we see our application resorting to continuation tokens because the query has taken over 5 seconds to complete.
https://msdn.microsoft.com/en-us/library/azure/dd179421.aspx