My Query
Should we expect slow retrieval from `Table Storage` when a query spans several partitions and a specified period of time (say 1 hour), if each partition within `Table Storage` is expected to hold a very large number of rows (say, millions)?
About My App
My web app deals with receiving data for different signals from different devices.
Each device can send data as often as once per minute.
The data received is posted to `Table Storage` and displayed on a dashboard as it arrives. The data for one or more signals within a selected period of time can also be queried and displayed on the page.
My problem
Currently the app is in testing, and data arrives only while tests are running. Even with this small amount of data, a query against `Table Storage` takes ~30 secs to fetch ~10,000 rows.
I have been reading other posts here, such as "Very Slow on Azure Table Storage Query on PartitionKey/RowKey List", which say that there is some delay in getting data from `Table Storage`.
So my query is
- When there are millions of rows within a partition of `Table Storage`, will a query to `Table Storage` make a complete table scan, leading to a heavy performance issue?
- One of my expected queries to retrieve data to display on my page is

    (((((((((((((PartitionKey eq 'D4AS1') or (PartitionKey eq 'D4AS2')) or (PartitionKey eq 'D4AS3')) or (PartitionKey eq 'D4AS4')) or (PartitionKey eq 'D4AS5')) or (PartitionKey eq 'D4AS6')) or (PartitionKey eq 'D4AS7')) or (PartitionKey eq 'D4AS8')) or (PartitionKey eq 'D4AS9')) or (PartitionKey eq 'D4AS10')) or (PartitionKey eq 'D4AS11')) or (PartitionKey eq 'D4AS133')) and (TimeReceived ge datetime'2018-02-21T23:53:40.4622407Z')) and (TimeReceived le datetime'2018-02-22T23:53:40.4622407Z')

  Should the above query be reframed for better performance? If so, please suggest how it should be changed.
- What is the maximum delay we can expect when querying `Table Storage` (with simple or complex queries like the one above)?
Is `TimeReceived` a row key? If not, you're doing a full partition scan. And since you specified multiple partition keys, you're doing multiple partition scans, if `TimeReceived` is just an additional property. - David Makogon

splitting it up into concurrent queries. The above query is formed dynamically: from an input `List<string>` of partition keys, a method frames the query with Or\And conditions accordingly. Kindly elaborate on your suggestion. - jAntoni

`TimeReceived` is not the row key. Will changing the query to use the `Timestamp` property of `Table Storage` instead of the current `TimeReceived` help? Or do we have any other way to improve the performance? Please suggest. - jAntoni

`Timestamp` is not a rowkey. You'd need to store your `TimeReceived` property in the row key and try again. But... I can't help you design your table, especially without knowing all the types of queries you'll execute. The best thing you can do is look at the Table Storage Design Guide to better understand how tables and related queries work. - David Makogon
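
The two ideas raised in the comments (storing `TimeReceived` in the row key, and splitting the OR-chain into concurrent queries) could be combined roughly as follows. This is only a minimal sketch in Python, not the asker's actual code: the `RowKey` format produced by `row_key_for`, the helper names, and the `run_query` callable are all assumptions made for illustration. `run_query` stands in for whatever SDK call executes a filter string against the table.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone
from typing import Callable, Iterable, List


def row_key_for(ts: datetime) -> str:
    """Hypothetical RowKey format: a fixed-width UTC timestamp.

    Fixed width means string comparison order matches time order,
    which is what makes a RowKey range filter meaningful.
    Pass timezone-aware datetimes; naive ones are treated as local time.
    """
    return ts.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S.%fZ")


def partition_filter(partition_key: str, start: datetime, end: datetime) -> str:
    """Build a filter for ONE partition, ranged on RowKey instead of TimeReceived."""
    safe_pk = partition_key.replace("'", "''")  # OData escapes ' by doubling it
    return (
        f"PartitionKey eq '{safe_pk}' "
        f"and RowKey ge '{row_key_for(start)}' "
        f"and RowKey le '{row_key_for(end)}'"
    )


def query_partitions(
    partition_keys: List[str],
    start: datetime,
    end: datetime,
    run_query: Callable[[str], Iterable[dict]],
    max_workers: int = 8,
) -> List[dict]:
    """Run one query per partition concurrently and merge the results.

    run_query is an assumed wrapper: it takes a filter string and
    returns the matching entities for that filter.
    """
    filters = [partition_filter(pk, start, end) for pk in partition_keys]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps results in the same order as the input filters
        per_partition = list(pool.map(run_query, filters))
    return [row for rows in per_partition for row in rows]
```

With a table laid out this way, each per-partition filter names both `PartitionKey` and a `RowKey` range, so no query has to scan a partition on a non-key property. Note that `RowKey` must be unique within a partition, so a timestamp-only key assumes no two rows for the same partition share the same microsecond; existing rows would also need to be rewritten with the new key.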