0
votes

I have an input CSV file with columns position_Id, Asofdate, etc., which has to be loaded into a staging table. In my table, the columns Position_Id and AsofDate form the primary key. We receive this input file every 2 hours. For example, we received a file at 10 AM today, and that file was loaded into the table. Two hours later we received another file containing the same data as the file we received 2 hours earlier, and that data was loaded into the table as well.

Now my table contains the data from the files that we received at 10 AM and 12 PM. At 12:10 PM we received a modified input file with different data in it. My actual requirement is that, before the data from the latest input file (12:10 PM) is loaded into the table, it is checked so that only new and updated rows are loaded.

1 Answer

2
votes

Have you heard of the term upsert? It means inserting new records and updating existing ones in a single load. Here are examples of how to do it:

  1. This blog post walks you through upserting using a lookup in a dataflow.
  2. This Stack Overflow answer provides links that explain a merge and how to set one up.
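
To make the lookup idea concrete, here is a minimal sketch in Python with an in-memory SQLite database. It is not the code from either link, only an illustration of the pattern: look up each incoming row by its key, insert it if the key is missing, update it if the values changed, and skip it if nothing changed. The table name `staging` and the `Quantity` column are assumptions standing in for your real table and the columns hidden behind "etc"; the sample file contents are made up.

```python
import csv
import io
import sqlite3


def create_staging(conn):
    # Hypothetical staging table; Quantity stands in for the non-key columns.
    conn.execute(
        """CREATE TABLE staging (
               Position_Id INTEGER NOT NULL,
               AsofDate    TEXT    NOT NULL,
               Quantity    INTEGER,
               PRIMARY KEY (Position_Id, AsofDate)
           )"""
    )


def upsert_positions(conn, csv_text):
    """Insert new keys, update changed rows, skip identical rows.

    Returns a tuple (inserted, updated, skipped).
    """
    inserted = updated = skipped = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        position_id = int(row["Position_Id"])
        asof_date = row["AsofDate"]
        quantity = int(row["Quantity"])

        # Lookup by primary key, like a lookup component in a data flow.
        existing = conn.execute(
            "SELECT Quantity FROM staging WHERE Position_Id = ? AND AsofDate = ?",
            (position_id, asof_date),
        ).fetchone()

        if existing is None:
            conn.execute(
                "INSERT INTO staging (Position_Id, AsofDate, Quantity) VALUES (?, ?, ?)",
                (position_id, asof_date, quantity),
            )
            inserted += 1
        elif existing[0] != quantity:
            conn.execute(
                "UPDATE staging SET Quantity = ? WHERE Position_Id = ? AND AsofDate = ?",
                (quantity, position_id, asof_date),
            )
            updated += 1
        else:
            skipped += 1
    conn.commit()
    return inserted, updated, skipped


# Made-up sample files mirroring the 10 AM, 12 PM and 12:10 PM loads.
FILE_10AM = "Position_Id,AsofDate,Quantity\n1,2024-01-01,100\n2,2024-01-01,200\n"
FILE_1210 = (
    "Position_Id,AsofDate,Quantity\n"
    "1,2024-01-01,100\n2,2024-01-01,250\n3,2024-01-01,300\n"
)

conn = sqlite3.connect(":memory:")
create_staging(conn)
first_load = upsert_positions(conn, FILE_10AM)     # all rows are new
repeat_load = upsert_positions(conn, FILE_10AM)    # same file again, nothing to do
modified_load = upsert_positions(conn, FILE_1210)  # one new, one changed, one same
```

A row-by-row lookup is easy to follow but slow for large files. On SQL Server the set-based equivalent is usually to bulk load the file into a separate staging table and then run a single MERGE (or an UPDATE followed by an INSERT) against the target.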