I need help.
I want to store articles from a lot of feeds in Azure Table Storage, and I'm expecting somewhere around 100 million rows there. Initially I thought that Azure Table Storage would fit my requirements, since I can design the table like this:
- PartitionKey (will be hash of feed url)
- RowKey (will be hash of article url)
- Data (JSON data of article)
- PublishedOn (DateTime when article was published)
Then retrieving a single article is really fast, because I access it by PartitionKey and RowKey.
And that worked as expected.
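To make the layout concrete, here is a minimal sketch in Python. The SHA-256 hash, the `hash_key` and `make_entity` helpers, the example URLs, and the in-memory dict standing in for the table are all assumptions for illustration only; no real Azure SDK call is made.

```python
import hashlib
import json
from datetime import datetime, timezone


def hash_key(url: str) -> str:
    """Hash a URL into a key string.

    SHA-256 is an assumption. A hex digest contains only 0-9 and a-f,
    so it is always a valid PartitionKey or RowKey value.
    """
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


def make_entity(feed_url: str, article_url: str, article: dict,
                published_on: datetime) -> dict:
    """Build one entity with the four properties listed above."""
    return {
        "PartitionKey": hash_key(feed_url),
        "RowKey": hash_key(article_url),
        "Data": json.dumps(article),
        "PublishedOn": published_on,
    }


feed_url = "https://example.com/feed.xml"          # hypothetical feed
article_url = "https://example.com/articles/1"     # hypothetical article

entity = make_entity(
    feed_url,
    article_url,
    {"title": "Hello"},
    datetime(2024, 1, 1, tzinfo=timezone.utc),
)

# In-memory stand-in for the table: a point lookup is a single
# lookup on the (PartitionKey, RowKey) pair.
table = {(entity["PartitionKey"], entity["RowKey"]): entity}
found = table[(hash_key(feed_url), hash_key(article_url))]
```

This is the access pattern that works well: both keys are known up front, so no scan is needed.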
Now I'm trying to query by a list of PartitionKeys (hashed feed urls) plus pagination parameters (pageSize + currentPage). The first page of results should contain the most recent articles, so the results need to be ordered somehow by the PublishedOn column, newest first.
With the design above I would need to get all rows from the requested partitions, put them in one list, sort them by PublishedOn, take only the rows that belong to the requested page, and return those...
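The naive approach described above can be sketched as follows. This is only an illustration under assumptions: `get_page` is a hypothetical helper, the `partitions` dict is an in-memory stand-in for the table, the short keys replace real hashes, and currentPage is assumed to be 1-based.

```python
from datetime import datetime, timezone


def get_page(partitions: dict, partition_keys: list,
             page_size: int, current_page: int) -> list:
    """Naive paging: load every row from the requested partitions,
    sort all of them by PublishedOn (newest first), then slice one page."""
    rows = []
    for key in partition_keys:
        # Stand-in for "read the whole partition" from the table.
        rows.extend(partitions.get(key, []))
    rows.sort(key=lambda row: row["PublishedOn"], reverse=True)
    start = (current_page - 1) * page_size  # current_page assumed 1-based
    return rows[start:start + page_size]


def day(d: int) -> datetime:
    """Helper for sample data: a UTC date in January 2024."""
    return datetime(2024, 1, d, tzinfo=timezone.utc)


# Sample data: two feeds with two articles each.
partitions = {
    "feedA": [
        {"RowKey": "a1", "PublishedOn": day(1)},
        {"RowKey": "a2", "PublishedOn": day(3)},
    ],
    "feedB": [
        {"RowKey": "b1", "PublishedOn": day(2)},
        {"RowKey": "b2", "PublishedOn": day(4)},
    ],
}

page1 = get_page(partitions, ["feedA", "feedB"], page_size=2, current_page=1)
page2 = get_page(partitions, ["feedA", "feedB"], page_size=2, current_page=2)
```

Every page request reads and sorts all rows of all requested partitions, so the cost grows with the total number of articles in those feeds, not with pageSize. That is the part that worries me.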
Is this even possible to accomplish with Azure Table Storage, or should I move to Azure SQL? Could I expect better performance for such a query there with 100 million records?
Thanks,