I have the following problem: I need to copy files from a source ADLS (Azure Data Lake Store) to a sink ADLS, but only the most recent file. Each hour, a .csv file arrives at the source, and this file has to be copied to the sink data lake. For instance:
- Hour 1: file_01.csv arrives at the source. Task: copy file_01.csv to the sink data lake.
- Hour 2: file_02.csv arrives at the source. Task: copy file_02.csv to the sink data lake.
- And so on.
Is there any way to create an event-based trigger (on the arrival of a new file in the source)? That was my first thought.
Another way would be to create a job run by Azure Data Lake Analytics. There I would get the system date & time (I don't know how to do this), choose the most recent file, and copy that file into the sink data lake. How can I declare a variable containing the date & time using U-SQL? How can I copy data into a data lake using U-SQL?
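To make that second approach concrete, here is a rough, untested U-SQL sketch of what I have in mind. The account URIs, folder names, and column names are all placeholders, and I'm assuming the files are named by hour (file_01.csv, file_02.csv, ...) and that both ADLS accounts are linked to the ADLA account:

```
// Get the current date & time via a C# expression (U-SQL allows C# on the right-hand side)
DECLARE @now DateTime = DateTime.Now;

// Build the paths of the most recent hourly file from @now
// (adl:// URIs and account names are placeholders for the source and sink lakes)
DECLARE @inputFile string =
    "adl://sourceaccount.azuredatalakestore.net/landing/file_" + @now.Hour.ToString("00") + ".csv";
DECLARE @outputFile string =
    "adl://sinkaccount.azuredatalakestore.net/landing/file_" + @now.Hour.ToString("00") + ".csv";

// Read the CSV and write it back out unchanged (col1/col2 are made-up column names;
// U-SQL has no raw file copy, so the file is round-tripped through EXTRACT/OUTPUT)
@data =
    EXTRACT col1 string,
            col2 string
    FROM @inputFile
    USING Extractors.Csv();

OUTPUT @data
TO @outputFile
USING Outputters.Csv();
```

Is something like this the right way to do it, or is there a better option?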
Summary: how can I make an incremental/updated copy between data lakes?
Thanks