
Suppose I have a file that represents a car dealer's current (huge) inventory of cars, each unique by some criterion, and a new file is generated daily reflecting the inventory, price, etc. of each car.

Parsing the file results in a list of Car objects. The uniqueness of each car is represented in the Car object by some value that would be the basis of a unique key in a regular RDBMS setup.

I want the data in CosmosDB to be a queryable version of the data in the file. It should only hold data from the latest parsed file, and not data from previous files.

I have an Azure function that parses the file when it is uploaded to blob storage and inserts the parsed data into CosmosDB. However, always inserting will just grow the database over time with stale data.


  • How can I UPSERT the parsed Car objects instead of always inserting them?

  • Can the upsert be declared as part of the Azure function, instead of using ICollector<Car>?

I guess CosmosDB could be used as an input to the Azure function, comparing the Car objects there with the ones in the file and updating/inserting when necessary, but I would prefer it if CosmosDB or Azure Functions had a neat way of achieving this.


The Azure function:

[FunctionName("ParseCarsFromFile")]
public static void Run(
  [BlobTrigger("data", Connection = "StorageConnection")] TextReader textReader,
  [CosmosDB("data", "car", ConnectionStringSetting = "CosmosDb", CreateIfNotExists = true)] ICollector<Car> documentOutputBinding)
{
  var cars = CarsFileParser.Parse(textReader);

  foreach (var car in cars)
  {
    documentOutputBinding.Add(car);
  }
}
What's the document ID in Cosmos? Is it the Car id or auto-generated? - Mikhail Shilkov
It is auto-generated. Can it be the car id and still be of type Car? - kasperhj

1 Answer


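To upsert documents, change your Car class to have an id property and set it to the unique car identifier. A document written with an id that already exists in the collection then replaces the existing one instead of creating a new one.

ICollector<Car> should be fine for this case.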
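As a minimal sketch of what that could look like, assuming the function uses Newtonsoft.Json for serialization (as the CosmosDB binding shown in the question typically does): the class below maps a C# property onto the lowercase id field that Cosmos DB uses as the document identifier. The Price property and the way Id gets populated are hypothetical, since the real Car type is not shown in the question.

```csharp
using Newtonsoft.Json;

public class Car
{
    // Cosmos DB identifies a document by its lowercase "id" field.
    // Assumption: the parser sets this to the value that makes a car unique.
    [JsonProperty("id")]
    public string Id { get; set; }

    // Hypothetical payload property, for illustration only.
    [JsonProperty("price")]
    public decimal Price { get; set; }
}
```

With this in place the function body can stay as it is: each documentOutputBinding.Add(car) call writes a document whose id is the car identifier. Note that upserting only adds or replaces documents; cars that have disappeared from the latest file would still need to be removed separately.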