Cosmos DB Table requests stall when parallelized

Question

When I launch operations in parallel, they usually end up timing out or failing.

I'm using:

Azure Cosmos DB Table API
.NET Core 2.0 (console app)
WindowsAzure.Storage (9.2.0) NuGet package
Standard CloudStorageAccount.Parse(...).CreateCloudTableClient().GetTableReference("...") setup

This code fires off 8999 tasks. Each task is a TableOperation.Retrieve specifying both PartitionKey and RowKey. I injected some code to track task completion states and any IRetryPolicy hits. There are no 429 errors.
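
The tracking code itself isn't shown here. As a rough sketch of how a per-second completion count could be produced, something like the following would do; `ProgressCounter`, `Track` and `ReportUntil` are hypothetical names for illustration, not the code that was actually used:

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: counts finished tasks and prints a running total.
public sealed class ProgressCounter
{
    private int _done;

    public int Done => Volatile.Read(ref _done);

    // Wraps a task so the counter is bumped when it finishes,
    // whether it succeeded, faulted, or was canceled.
    public async Task<T> Track<T>(Task<T> task)
    {
        try
        {
            return await task.ConfigureAwait(false);
        }
        finally
        {
            Interlocked.Increment(ref _done);
        }
    }

    // Prints "m:ss - N done" roughly once per second until `all` completes.
    public async Task ReportUntil(Task all)
    {
        var clock = Stopwatch.StartNew();
        while (await Task.WhenAny(all, Task.Delay(1000)).ConfigureAwait(false) != all)
        {
            Console.WriteLine(clock.Elapsed.ToString(@"m\:ss") + " - " + Done + " done");
        }
    }
}
```

Under that assumption, each call would be wrapped as `counter.Track(table.ExecuteAsync(opGetter.Get(x)))`, and `counter.ReportUntil(Task.WhenAll(...))` would run alongside the batch.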

Here's the output from a recent run:

await Task.WhenAll(stuff.Select(x => table.ExecuteAsync(opGetter.Get(x))));
0:01 - 12 done
0:02 - 228 done
0:03 - 313 done
0:04 - 435 done
0:05 - 1010 done
0:06 - 1883 done
0:07 - 2833 done
0:08 - 3014 done
0:09 - 3878 done
0:10 - 5447 done
0:11 - 5569 done
0:12 - 6492 done
0:13 - 6527 done
0:14 - 6532 done
0:15 - 6541 done
0:16 - 6543 done
0:17 - 6547 done
0:18 - 6552 done
0:19 - 6554 done
0:20 - 6951 done
0:21 - 8105 done
0:22 - 8128 done
0:23 - 8591 done
0:24 - 8907 done
0:25 - 8908 done
0:29 - 8994 done
0:32 - 8996 done
1:14 - 8997 done
2:26 - 8998 done
5:02 - StatusCode: 0 "An error occurred while sending the request."
5:05 - All 8999 Done

(Sometimes I get a client-side timeout instead of that particular error, or multiple errors.)

These 8999 retrievals should ideally take a few seconds at most.

How can I stop them from stalling?

Note:

I haven't fiddled with any ServicePoint settings such as max connections.
I can't use "Direct Mode" or "TCP" (vs Gateway/Https) because there's no SDK supporting the Cosmos DB Table API for .NET Core or .NET Standard.
I suspect the trouble is client-side.
This is being run from a local server (not on Azure).
This isn't "occasional". It's what happens nearly every time with a large batch like this.
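
If the trouble really is client-side, one thing that could be tried is capping the number of requests in flight instead of starting all 8999 at once. The following is only a sketch under that assumption, not a confirmed fix for this stall; `Throttled`, `RunAsync` and `maxInFlight` are hypothetical names, and the helper uses nothing beyond `SemaphoreSlim` and `Task.WhenAll`:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: bounded fan-out over a sequence of items.
public static class Throttled
{
    // Runs `work` for every item, but never with more than `maxInFlight`
    // calls outstanding at once. Results come back in input order.
    public static async Task<TResult[]> RunAsync<T, TResult>(
        IEnumerable<T> items,
        Func<T, Task<TResult>> work,
        int maxInFlight)
    {
        using (var gate = new SemaphoreSlim(maxInFlight))
        {
            var tasks = items.Select(async item =>
            {
                // Wait for a free slot before issuing the request.
                await gate.WaitAsync().ConfigureAwait(false);
                try
                {
                    return await work(item).ConfigureAwait(false);
                }
                finally
                {
                    // Free the slot even if the request failed.
                    gate.Release();
                }
            }).ToList();

            // WhenAll only completes once every task has finished,
            // so the semaphore is not disposed while still in use.
            return await Task.WhenAll(tasks).ConfigureAwait(false);
        }
    }
}
```

With the call from the run above, that would look like `await Throttled.RunAsync(stuff, x => table.ExecuteAsync(opGetter.Get(x)), 64);`. The value 64 is an arbitrary starting point and would need tuning against the actual account.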

EDIT: Also posted as a GitHub issue. https://github.com/Azure/azure-documentdb-dotnet/issues/517

Adam Smith - Microsoft Azure · Accepted Answer · 2018-06-08T17:19:03

I escalated your issue to the product team. They'll be reaching out to you on GitHub, since you have an open issue there as well (https://github.com/Azure/azure-documentdb-dotnet/issues/517). I will update this thread once a resolution is found.
