I am evaluating the performance of Service Fabric with a Reliable Dictionary of ~1 million keys. I'm getting fairly disappointing results, so I wanted to check if either my code or my expectations are wrong.
I have a dictionary initialized with:
dict = await _stateManager.GetOrAddAsync<IReliableDictionary2<string, string>>("test_" + id);
where id is unique for each test run.
I populate it with a list of strings like "1-1-1-1-1-1-1-1-1", "1-1-1-1-1-1-1-1-2", "1-1-1-1-1-1-1-1-3", ... up to 576,000 items. The value in the dictionary is not used; I'm currently just storing "1".
It takes about 3 minutes to add all the items to the dictionary. I have to split the work into transactions of 100,000 items each, otherwise it seems to hang forever. (Is there a limit to the number of operations in a transaction before you need to call CommitAsync()?)
// take100_000 is the next 100,000 items from the original list of 576,000
using (var tx = _stateManager.CreateTransaction())
{
    foreach (var tick in take100_000)
    {
        await dict.AddAsync(tx, tick, "1");
    }
    await tx.CommitAsync();
}
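The splitting into groups of 100,000 could be sketched as follows. This is an illustrative helper rather than the exact code from the test: Batching, InBatches and allTicks are hypothetical names, and the helper uses only the base class library, nothing from Service Fabric.

```csharp
using System;
using System.Collections.Generic;

public static class Batching
{
    // Splits a sequence into consecutive lists of at most batchSize items.
    // Hypothetical helper for illustration; not part of any Service Fabric API.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        return Iterate(source, batchSize);
    }

    private static IEnumerable<List<T>> Iterate<T>(IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        // Emit the final, possibly smaller, batch.
        if (batch.Count > 0) yield return batch;
    }
}

// Intended use around the transaction shown above (allTicks is a hypothetical
// name for the full list of 576,000 keys):
//
// foreach (var take100_000 in Batching.InBatches(allTicks, 100_000))
// {
//     using (var tx = _stateManager.CreateTransaction())
//     {
//         foreach (var tick in take100_000)
//         {
//             await dict.AddAsync(tx, tick, "1");
//         }
//         await tx.CommitAsync();
//     }
// }
```

With 576,000 keys and a batch size of 100,000 this gives six transactions, the last one holding 76,000 items.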
After that, I need to iterate through the dictionary to visit each item:
using (var tx = _stateManager.CreateTransaction())
{
    var enumerator = (await dict.CreateEnumerableAsync(tx)).GetAsyncEnumerator();
    try
    {
        while (await enumerator.MoveNextAsync(ct))
        {
            var tick = enumerator.Current.Key;
            // do something with tick
        }
    }
    catch (Exception)
    {
        // "throw;" rather than "throw ex;" so the original stack trace is preserved
        throw;
    }
}
This takes 16 seconds.
I'm not so concerned about the write time; I know the data has to be replicated and persisted. But why does it take so long to read? 576,000 17-character string keys should be no more than 11.5 MB in memory, and the values are only a single character and are ignored. Aren't Reliable Collections cached in RAM? Iterating through a regular Dictionary of the same values takes 13 ms.
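A rough check of that size estimate, ignoring per-object overhead. The bytes-per-character figures are assumptions about encoding (.NET strings are UTF-16 in memory, so 2 bytes per character applies there); the last line shows that 11.5 MB corresponds to about 20 bytes per key.

```latex
\begin{align*}
576{,}000 \times 17 &= 9{,}792{,}000 \ \text{characters} \\
9{,}792{,}000 \times 1\ \text{byte} &\approx 9.8\ \text{MB} \\
9{,}792{,}000 \times 2\ \text{bytes} &\approx 19.6\ \text{MB} \\
576{,}000 \times 20\ \text{bytes} &= 11.52\ \text{MB}
\end{align*}
```

Whichever figure applies, the raw key data is on the order of 10 to 20 MB, which should fit comfortably in memory.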
I then called ContainsKeyAsync 576,000 times on an empty dictionary (in 1 transaction). This took 112 seconds. On practically any other in-memory data structure this would take ~0 ms.
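Expressed per operation, using only the timings quoted above (and taking "about 3 minutes" as 180 s, which is an approximation):

```latex
\begin{align*}
\text{AddAsync:} \quad & \frac{180\ \text{s}}{576{,}000} \approx 313\ \mu\text{s per key} \\
\text{Reliable Dictionary enumeration:} \quad & \frac{16\ \text{s}}{576{,}000} \approx 27.8\ \mu\text{s per key} \\
\text{ContainsKeyAsync:} \quad & \frac{112\ \text{s}}{576{,}000} \approx 194\ \mu\text{s per call} \\
\text{regular Dictionary iteration:} \quad & \frac{13\ \text{ms}}{576{,}000} \approx 22.6\ \text{ns per key}
\end{align*}
```

So enumeration is roughly 1,200 times slower per key than a regular Dictionary, and a ContainsKeyAsync call on an empty dictionary costs about seven times as much as visiting one key during enumeration.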
This is on a local 1-node cluster. I got similar results when deployed to Azure.
Are these results plausible? Any configuration I should check? Am I doing something wrong, or are my expectations wildly inaccurate? If so, is there something better suited to these requirements? (~1 million tiny keys, no values, persistent transactional updates)