1
votes

Had trouble with best fit for this question

With Azure Table Storage you just have fixed key of partition, rownum. They charge based on size and number of operations.

Have a LAN document manage application with a WPF client and a SQL database

Taking it to Azure and I want to minimize cost

Most applications would be like 100,000 documents
But on the high end could be 4 million documents

Most SQL search are cheap (0.1 seconds) but some can be expensive (e.g. 60 seconds).

On the LAN version what I do is return the detail for 100 but but the raw identifiers for 10,000. So as the client needs detail for the next page it just sends back the identifiers (int) for the next 100 and a very cheap query for the detail. I don't keep the detail for the prior page and it would use too much client memory. When it gets to 10,000 then I run a search for the next 10,000.

In an Azure environment is there any reason to change that? I am paying for bandwidth but int is pretty darn small. I am paying for SQL CPU and IO also. If anything I am thinking about sending 100,000 identifier to the client.

I looked at Table Storage but if I have 100 clients doing 100 searches a day (and some do) and 100,000,000 returned the transaction cost adds up (over $700) and Table Storage inserts are relatively slow. The database is only $15.

I could store the results in a SQL table
userID (smallint), rowNum (int), docIdentifier (int)
clustered index with fill factor 50 on userID, rowNum
My problem there is a 4,000,000 documents with 1,000 login could put me into 16 GB on that table alone

I am thinking send 100,000 int at a time to the client.
Is there a better design?

2

2 Answers

1
votes

One of the benefits of using Azure is that you have multiple choices for storing data. From Azure tables, Document DB, all the way to SQL DB and SQL DW. Each service has great documentation that describes how they are different at what they are best at.

Because you have choices, you'll probably want to choose the cheapest storage option in Azure. That would be tables. However as you correct point out, the cheapest isn't always the easiest to search, and that's typically where SQL has benefits. Searching/selecting data from SQL is often easier.

So it's a tradeoff between storage cost, data throughput and easy of access/programming.

0
votes

I would say that Azure Table Storage is your cheapest solution. The point is, what exactly you'll return to your users? 100 records for each time of a amount of 100.000, I'm not sure if your users will browse on every page. Maybe you should consider another way to report the information.