
Suppose I have data in a file that represents a car dealer's current (huge) inventory of cars, each unique by some criterion, and a new file is generated daily reflecting the inventory, price, etc. of each car.

Parsing the file results in a list of Car objects. The uniqueness of each car is represented in the Car object by some value that would be the basis of a unique key in a regular RDBMS setup.

I want the data in CosmosDB to be a queryable version of the data in the file. It should only hold data from the latest parsed file, and not data from previous files.

I have an Azure function that parses the file when it is uploaded to blob storage and inserts the parsed data into CosmosDB. However, because it always inserts, the database just grows over time with stale data from earlier files.

  • How can I UPSERT the parsed Car objects instead of always inserting them?

  • Can the upsert be declared as part of the Azure function binding instead of using ICollector<Car>?

I guess that CosmosDB could be used as an input to the Azure function, so that the Car objects already stored there are compared with the ones in the file and updated or inserted as necessary, but I would prefer it if CosmosDB or Azure Functions had a neat way of achieving this.

The Azure function:

public static void Run(
  [BlobTrigger("data", Connection = "StorageConnection")] TextReader textReader,
  [CosmosDB("data", "car", ConnectionStringSetting = "CosmosDb", CreateIfNotExists = true)] ICollector<Car> documentOutputBinding)
{
  var cars = CarsFileParser.Parse(textReader);

  foreach (var car in cars)
  {
    // Each Add call writes one Car document through the output binding.
    documentOutputBinding.Add(car);
  }
}

What's the document ID in Cosmos? Is it the Car id or auto-generated? – Mikhail Shilkov
It is auto-generated. Can it be the car id and still be of type Car? – kasperhj

1 Answer


To upsert documents, change your Car class to have an id property and set it to the unique car identifier.

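ICollector<Car> should be OK for this case: once each Car carries its own id, adding a Car whose id already exists updates that document instead of creating a new one.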
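
A minimal sketch of what the Car class could look like, assuming the binding serializes with Newtonsoft.Json (Json.NET) and that the unique car identifier is available as a string. The real Car class is not shown in the question, so every member other than id is hypothetical:

```csharp
using Newtonsoft.Json;

public class Car
{
    // Cosmos DB uses the JSON property "id" as the document ID.
    // Assumption: this is set to the unique car identifier when the file is parsed.
    [JsonProperty("id")]
    public string Id { get; set; }

    // Hypothetical field, shown only to illustrate an ordinary property.
    [JsonProperty("price")]
    public decimal Price { get; set; }
}
```

With the id populated, the existing foreach loop can stay as it is. Note that an upsert only adds or updates documents, so cars that are missing from the newest file would still need to be removed separately.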