I'm trying to bulk load about 25 million rows from an Azure SQL table into three different tables in Azure Table Storage. I'm currently managing to process about 50-100 rows / second, which means that at current speeds, it'll take me about 70-140 hours to finish the load. That's a long time, and it seems like it ought to be possible to speed that up.
Here's what I'm doing:
- Kick off 10 separate tasks
- For each task, read the next 10,000 unprocessed records from the SQL DB
- For each of the three destination ATS tables, group the 10,000 records by that table's partition key
- In parallel (up to 10 simultaneously), for each partition key, split the partition into segments of at most 100 rows (the ATS batch limit)
- In parallel (up to 10 simultaneously), for each segment, create a new TableBatchOperation (a stripped-down sketch of this step follows the list)
- For each row in the segment, add a batch.InsertOrReplace() operation (because some of the data has already been loaded, and I don't know which rows)
- Execute the batch asynchronously
- Rinse and repeat (with lots of flow control, error checking, etc.)
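To make the structure concrete, here's a stripped-down sketch of the per-chunk batching for one destination table. SourceRecord, the Payload property, and ToTableEntity are placeholders for my real record type and per-table mapping; retries and error handling are omitted, and the real code executes the batches in parallel rather than awaiting them one at a time (see the throttling sketch further down).

```
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

// Placeholder for the shape of a row read from the SQL DB.
class SourceRecord
{
    public string PartitionKey { get; set; }
    public string RowKey { get; set; }
    public string Payload { get; set; }
}

static class ChunkBatcher
{
    // Placeholder mapping from a source record to this table's entity shape.
    static ITableEntity ToTableEntity(SourceRecord r)
    {
        var entity = new DynamicTableEntity(r.PartitionKey, r.RowKey);
        entity.Properties["Payload"] = EntityProperty.GeneratePropertyForString(r.Payload);
        return entity;
    }

    public static async Task LoadChunkAsync(CloudTable table, IReadOnlyList<SourceRecord> chunk)
    {
        // Group the 10K records by this table's partition key, then split each
        // partition into segments of at most 100 rows (the ATS batch limit).
        foreach (var partition in chunk.GroupBy(r => r.PartitionKey))
        {
            var rows = partition.ToList();
            for (int i = 0; i < rows.Count; i += 100)
            {
                var batch = new TableBatchOperation();
                foreach (var row in rows.Skip(i).Take(100))
                    batch.InsertOrReplace(ToTableEntity(row)); // upsert, since some rows may already exist

                // Awaited sequentially here only to keep the sketch short.
                await table.ExecuteBatchAsync(batch);
            }
        }
    }
}
```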
Some notes:
- I've tried this several different ways, with lots of different parameters for the various numbers up above, and I'm still not getting it down to less than 10-20 ms / row.
- It doesn't seem to be CPU-bound, as the VM doing the load is averaging about 10-20% CPU.
- It doesn't seem to be SQL-bound, as the SQL select statement is the fastest part of the operation by at least two orders of magnitude.
- It's presumably not network-bound, as the VM executing the batch is in the same data center (US West).
- I'm getting reasonable partition density, i.e., each 10K set of records is getting broken up into a couple hundred partitions for each table.
- With perfect partition density, I'd have up to 3000 tasks running simultaneously (10 master tasks * 3 tables * 10 partitions * 10 segments). But they're executing asynchronously, and they're nearly all I/O bound (waiting on ATS), so I don't think we're hitting any threading limits on the VM executing the process. (A sketch of the kind of throttling I'm using follows these notes.)
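In case it matters, the throttling looks roughly like this: pre-built batches are executed with at most a fixed number of ExecuteBatchAsync calls in flight, so each pending batch is an awaited I/O operation rather than a blocked thread. The tuple shape and the maxInFlight default are just for illustration; in the real code this is interleaved with the batch building above.

```
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage.Table;

static class BatchRunner
{
    // Execute pre-built batches with at most maxInFlight ExecuteBatchAsync calls
    // pending at any time. Pending batches are awaited I/O, not blocked threads.
    public static async Task RunAsync(
        IEnumerable<(CloudTable Table, TableBatchOperation Batch)> batches,
        int maxInFlight = 10)
    {
        var throttle = new SemaphoreSlim(maxInFlight);
        var tasks = new List<Task>();

        foreach (var (table, batch) in batches)
        {
            await throttle.WaitAsync();          // wait for a free slot once the cap is reached
            tasks.Add(ExecuteOneAsync(table, batch, throttle));
        }

        await Task.WhenAll(tasks);
    }

    static async Task ExecuteOneAsync(CloudTable table, TableBatchOperation batch, SemaphoreSlim throttle)
    {
        try
        {
            await table.ExecuteBatchAsync(batch);
        }
        finally
        {
            throttle.Release();                  // free the slot for the next batch
        }
    }
}
```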
The only other obvious idea I can come up with is one that I tried earlier on, namely, to order by partition key in the SQL select statements, so that we get perfect partition density for the batch inserts. For various reasons that has proven to be difficult, as the table's indexes aren't quite set up for that. And while I would expect some speed-up on the ATS side from that approach, given that I'm already grouping the 10K records by their partition keys, I wouldn't expect much additional performance improvement.
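For what it's worth, the appeal of the ordered approach is that a partition-ordered stream lets every batch fill up to the 100-row limit, flushing only when the partition key changes. A rough sketch of what that would look like (the tuple input and the bare DynamicTableEntity are stand-ins for the real rows and mapping):

```
using System.Collections.Generic;
using Microsoft.WindowsAzure.Storage.Table;

static class OrderedBatcher
{
    // Given rows already ordered by partition key, emit batches that are as full
    // as possible: a batch is flushed when it reaches 100 rows or when the
    // partition key changes.
    public static IEnumerable<TableBatchOperation> ToFullBatches(
        IEnumerable<(string PartitionKey, string RowKey)> orderedRows)
    {
        TableBatchOperation batch = null;
        string currentKey = null;

        foreach (var row in orderedRows)
        {
            if (batch == null || row.PartitionKey != currentKey || batch.Count == 100)
            {
                if (batch != null && batch.Count > 0)
                    yield return batch;
                batch = new TableBatchOperation();
                currentKey = row.PartitionKey;
            }

            batch.InsertOrReplace(new DynamicTableEntity(row.PartitionKey, row.RowKey));
        }

        if (batch != null && batch.Count > 0)
            yield return batch;
    }
}
```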
Any other suggestions for speeding this up? Or is this about as fast as anybody else has been able to get?