I am trying to find a way to further improve the performance of my console app (which already works correctly).
I have a CSV file containing a list of about 100k addresses. I need to query a Web API whose POST response contains the geographical coordinates of each address. I then write a GeoJSON file to the file system with the address data enriched with those coordinates (latitude and longitude).
My current solution splits the data into batches of 1000 records and sends async POST requests to the Web API using HttpClient (.NET Core 3.1 console app plus a class library targeting .NET Standard 2.0). GeoJSON is my DTO class:
    public class GeoJSON
    {
        public string Locality { get; set; }
        public string Street { get; set; }
        public string StreetNumber { get; set; }
        public string ZIP { get; set; }
        public string Latitude { get; set; }
        public string Longitude { get; set; }
    }
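The batching method looks like this:
    // batchSize is a class-level value (1000) defined elsewhere.
    public static async Task<List<GeoJSON>> GetAddressesInParallel(List<GeoJSON> geos)
    {
        var geoJSONs = new List<GeoJSON>(geos.Count);
        // Calculate the number of batches based on batchSize.
        int numberOfBatches = (int)Math.Ceiling((double)geos.Count / batchSize);
        for (int i = 0; i < numberOfBatches; i++)
        {
            // Send one batch concurrently and wait for all of its requests
            // to complete before starting the next batch.
            var currentBatch = geos.Skip(i * batchSize).Take(batchSize);
            var tasks = currentBatch.Select(geo => SendPOSTAsync(geo));
            geoJSONs.AddRange(await Task.WhenAll(tasks));
        }
        return geoJSONs;
    }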
My async POST method looks like this:
    // client (HttpClient) and URL are fields defined elsewhere in the class.
    public static async Task<GeoJSON> SendPOSTAsync(GeoJSON geo)
    {
        string payload = JsonConvert.SerializeObject(geo);
        HttpContent c = new StringContent(payload, Encoding.UTF8, "application/json");
        using HttpResponseMessage response = await client.PostAsync(URL, c).ConfigureAwait(false);
        if (response.IsSuccessStatusCode)
        {
            // Copy the coordinates from the response onto the input record.
            string body = await response.Content.ReadAsStringAsync().ConfigureAwait(false);
            var address = JsonConvert.DeserializeObject<GeoJSON>(body);
            geo.Latitude = address.Latitude;
            geo.Longitude = address.Longitude;
        }
        // On a non-success status the record is returned without coordinates.
        return geo;
    }
The Web API runs on my local machine as a self-hosted x86 application. The whole application finishes in under 30 s, and the most time-consuming part is the async POST phase (about 25 s). The Web API accepts only one address per POST; otherwise I would send multiple addresses in one request.
Any ideas on how to improve the performance of the requests against the Web API?
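For context on the timing, a figure like the 25 s above can be taken with a Stopwatch around the batch loop. The following is only a minimal sketch under assumptions: PostTimingSketch, FakePostAsync and TimeBatchesAsync are hypothetical names, and FakePostAsync simulates a request with a short delay instead of calling the real Web API. It is not the exact measurement code from my app.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class PostTimingSketch
{
    // Hypothetical stand-in for SendPOSTAsync: simulates one request
    // with a short delay instead of a real network call.
    private static async Task<int> FakePostAsync(int id)
    {
        await Task.Delay(1).ConfigureAwait(false);
        return id;
    }

    // Runs the batches one after another (requests within a batch run
    // concurrently) and returns the elapsed wall-clock time.
    public static async Task<TimeSpan> TimeBatchesAsync(int itemCount, int batchSize)
    {
        var stopwatch = Stopwatch.StartNew();
        for (int offset = 0; offset < itemCount; offset += batchSize)
        {
            int size = Math.Min(batchSize, itemCount - offset);
            var tasks = Enumerable.Range(offset, size).Select(i => FakePostAsync(i));
            await Task.WhenAll(tasks).ConfigureAwait(false);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    public static async Task Main()
    {
        const int itemCount = 5000; // assumed sample size for the sketch
        TimeSpan elapsed = await TimeBatchesAsync(itemCount, batchSize: 1000);
        double perSecond = itemCount / elapsed.TotalSeconds;
        Console.WriteLine($"POST phase: {elapsed.TotalSeconds:F2} s, {perSecond:F0} requests/s");
    }
}
```

Applied to the numbers in this question, 100,000 requests in about 25 s works out to roughly 4,000 requests per second, or about 0.25 ms per request on average.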