I am still wondering whether there is any way to copy data from one data lake folder to another data lake folder using a Copy activity or a Data Flow activity in Azure Data Factory.
Suppose I have a file in my input folder, say input/employee.csv, which has data up to March 21, and I have loaded that data into my data lake in the output folder as output/employee.csv.
From the next run onwards, I only want to pull data that was modified in the last 7 days. To do that, I queried the source system and received those 7 days of data in a file. Now I need to compare that delta against my sink and merge it into the sink accordingly (update/insert/delete).
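To make the intended merge clearer, here is a minimal sketch (in pandas, outside of ADF) of the result I am trying to reproduce. The key column "EmployeeID", the delta file name, and the "ChangeType" column with I/U/D values are assumptions for illustration only, not my actual schema:

```python
import pandas as pd

# Sketch of the desired merge, assuming the delta carries a ChangeType column (I/U/D)
# and EmployeeID is the business key -- both are illustrative assumptions.
sink = pd.read_csv("output/employee.csv")
delta = pd.read_csv("input/employee_last7days.csv")

# Remove rows whose keys were deleted in the source.
deleted_keys = delta.loc[delta["ChangeType"] == "D", "EmployeeID"]
sink = sink[~sink["EmployeeID"].isin(deleted_keys)]

# Upsert: drop the old versions of changed rows, then append the new/updated rows.
upserts = delta[delta["ChangeType"].isin(["I", "U"])].drop(columns=["ChangeType"])
sink = sink[~sink["EmployeeID"].isin(upserts["EmployeeID"])]
merged = pd.concat([sink, upserts], ignore_index=True)

merged.to_csv("output/employee.csv", index=False)
```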
I checked the upsert option in Data Flow, but it looks like it requires the sink to be a database, whereas here my sink is the data lake and even my source is a file stored in the data lake.
I think I could also use the merge-files copy behaviour or a hash-based comparison, but I am not sure which would be the more optimized approach.
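For the hashing idea, this is roughly the comparison I have in mind, similar to computing sha2() over the non-key columns in a Data Flow derived column. Again, "EmployeeID" and the file names are assumptions for the sake of the example:

```python
import hashlib
import pandas as pd

def row_hash(df: pd.DataFrame, key: str = "EmployeeID") -> pd.Series:
    # Hash all non-key columns so changed rows get a different fingerprint.
    cols = [c for c in df.columns if c != key]
    return df[cols].astype(str).agg("|".join, axis=1).map(
        lambda s: hashlib.sha256(s.encode()).hexdigest()
    )

sink = pd.read_csv("output/employee.csv")
delta = pd.read_csv("input/employee_last7days.csv")

sink["hash"] = row_hash(sink)
delta["hash"] = row_hash(delta)

# Keys missing from the sink are inserts; keys present with a different hash are updates;
# matching hashes can be skipped. (Deletes cannot be detected by hashing a delta alone.)
joined = delta.merge(sink[["EmployeeID", "hash"]], on="EmployeeID",
                     how="left", suffixes=("", "_sink"))
inserts = joined[joined["hash_sink"].isna()]
updates = joined[joined["hash_sink"].notna() & (joined["hash"] != joined["hash_sink"])]
```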
Please suggest.