EDIT: question summary:
- I want to expose an endpoint that can return portions of XML data selected by query parameters.
- I have a stateful service that keeps the XML data, converted to DTOs, in a reliable dictionary.
- I use a single named partition (I can't tell from the query parameters which partition would hold the data, so I can't implement a smarter partitioning strategy).
- I am using service remoting for communication between the stateless Web API service and the stateful one.
- The XML data may reach 500 MB.
- Everything is OK when the XML is only around 50 MB.
- When the data gets larger, Service Fabric complains about MaxReplicationMessageSize.
And the summary of my questions from below: how can one store a large amount of data in a reliable dictionary?
TL;DR
Apparently, I am missing something...
- I want to parse huge XML files and load them into a reliable dictionary for later queries over them.
- I am using a single, named partition.
I have an XMLData stateful service that loads these XMLs into a reliable dictionary in its RunAsync method via this piece of code:

    var myDictionary = await this.StateManager.GetOrAddAsync<IReliableDictionary<string, List<HospitalData>>>("DATA");

    using (var tx = this.StateManager.CreateTransaction())
    {
        var result = await myDictionary.TryGetValueAsync(tx, "data");
        ServiceEventSource.Current.ServiceMessage(this, "data status: {0}", result.HasValue ? "loaded" : "not loaded yet, starts loading");

        // Load the XML only once; later activations find the value already stored.
        if (!result.HasValue)
        {
            Stopwatch timer = new Stopwatch();
            timer.Start();

            var converter = new DataConverter(XmlFolder);
            List<HospitalData> data = converter.LoadData();

            // The whole parsed data set is stored as a single value under one key.
            await myDictionary.AddOrUpdateAsync(tx, "data", data, (key, value) => data);

            timer.Stop();
            ServiceEventSource.Current.ServiceMessage(this, string.Format("Loading of data finished in {0} ms", timer.ElapsedMilliseconds));
        }

        await tx.CommitAsync();
    }
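Since the code above stores the entire list under one key, my guess is that a single replicated write has to carry the whole data set. One alternative I am considering is to split the list into smaller chunks and store each chunk under its own key (for example "data-0", "data-1", ...), each in its own transaction. A minimal sketch of the splitting part only, in plain C# without any Service Fabric types; `Chunker` and `Split` are names I made up for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class Chunker
{
    // Splits `source` into consecutive lists of at most `chunkSize` items.
    // The last chunk may be shorter than `chunkSize`.
    public static IEnumerable<List<T>> Split<T>(IReadOnlyList<T> source, int chunkSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        for (int offset = 0; offset < source.Count; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, source.Count - offset);
            var chunk = new List<T>(count);
            for (int i = 0; i < count; i++)
            {
                chunk.Add(source[offset + i]);
            }
            yield return chunk;
        }
    }
}
```

Each chunk would then be written with its own `CreateTransaction` / `CommitAsync` pair, so that no single commit has to carry the full data set. The chunk size would have to be picked so that a serialized chunk stays well below the replication limit; I have not measured what that would be for my DTOs.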
I have a stateless WebApi service that is communicating with the above stateful one via service remoting and querying the dictionary via this code:

    ServiceUriBuilder builder = new ServiceUriBuilder(DataServiceName);

    // The proxy is typed by the remoting interface and targets the single named partition.
    IDataService DataServiceClient = ServiceProxy.Create<IDataService>(
        builder.ToUri(),
        new Microsoft.ServiceFabric.Services.Client.ServicePartitionKey("My.single.named.partition"));

    try
    {
        var data = await DataServiceClient.QueryData(SomeQuery);
        return Ok(data);
    }
    catch (Exception ex)
    {
        ServiceEventSource.Current.Message("Web Service: Exception: {0}", ex);
        throw;
    }
It works really well when the XMLs do not exceed 50 MB.
- After that I get errors like:
System.Fabric.FabricReplicationOperationTooLargeException: The replication operation is larger than the configured limit - MaxReplicationMessageSize ---> System.Runtime.InteropServices.COMException
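From what I have been able to find so far, the limit named in the exception seems to belong to the replicator settings of the service's reliable state manager, and it appears to default to roughly 50 MB, which would match what I see. Below is a sketch of how I understand it could be raised in code. It needs the Service Fabric SDK packages to compile, `XmlDataService` is a placeholder class name, the 1 GiB value is arbitrary, and I have not verified that this is the recommended approach:

```csharp
using System.Fabric;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Services.Runtime;

internal sealed class XmlDataService : StatefulService
{
    // Passes custom replicator settings to the state manager at construction time.
    public XmlDataService(StatefulServiceContext context)
        : base(context, new ReliableStateManager(context,
            new ReliableStateManagerConfiguration(
                new ReliableStateManagerReplicatorSettings
                {
                    // Assumption: value is in bytes; 1 GiB chosen only for illustration.
                    MaxReplicationMessageSize = 1024L * 1024 * 1024
                })))
    {
    }
}
```

If that understanding is right, the limit is a per-service replicator setting, so adding partitions would not change the limit itself, only how much data each partition has to replicate.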
Questions:
- I am almost certain that this is about the partitioning strategy and that I need to use more partitions. But how do I reference a particular partition from within the RunAsync method of the stateful service? (The stateful service is invoked via RPC from the Web API, where I explicitly specify a partition, so there I can easily choose among partitions if I use the ranged partitioning strategy. But how do I do that during the initial loading of data in RunAsync?)
- Are these thoughts of mine correct: the code in a stateful service operates on a single partition, so loading a huge amount of data and partitioning it should happen outside the stateful service (for example in an Actor)? Then, after determining the partition key, I just invoke the stateful service via RPC, pointing it to that particular partition.
- Is this actually a partitioning problem at all, and what (where, who) defines the size of a replication message? I.e., does the partitioning strategy influence replication message sizes?
- Would extracting the loading logic into a stateful Actor help in any way?
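To make the partitioning idea more concrete, this is how I currently picture it with ranged partitions: every replica runs the same RunAsync, reads the key range of its own partition (as far as I can tell, `this.Partition.PartitionInfo` can be cast to `Int64RangePartitionInformation`, which exposes `LowKey` and `HighKey`), and keeps only the records whose hashed key falls into that range. The sketch below shows just the key-to-range check in plain C#. `PartitionKeys`, `Hash` and `IsOwnedBy` are names I invented, and FNV-1a is simply one stable hash I picked for illustration, not something Service Fabric prescribes:

```csharp
using System;
using System.Text;

public static class PartitionKeys
{
    // 64-bit FNV-1a over the UTF-8 bytes of the key, reinterpreted as a signed long
    // so it can be compared against Int64 partition range bounds.
    public static long Hash(string key)
    {
        unchecked
        {
            ulong hash = 14695981039346656037UL; // FNV offset basis
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 1099511628211UL; // FNV prime
            }
            return (long)hash;
        }
    }

    // True when the hashed key lies inside the inclusive range [lowKey, highKey].
    public static bool IsOwnedBy(string key, long lowKey, long highKey)
    {
        long h = Hash(key);
        return lowKey <= h && h <= highKey;
    }
}
```

The Web API side would compute the same `Hash` value for a record key and pass it as the partition key when creating the proxy. This only helps if a query can be mapped to a record key, which is exactly what I cannot do with my current query parameters.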
For any help on this - thanks a lot!