2
votes

I have a Copy activity that copies data from Blob storage to Azure Data Lake. The blobs are written by an Azure Function with an Event Hub trigger. Each blob file has a UNIX timestamp appended, which is the event's enqueued time in the Event Hub. Azure Data Factory is triggered every hour to merge the files and move them over to Data Lake.


Out of the box, the source dataset lets me filter by Last Modified date in UTC. I can use this, but it limits me to the blob's Last Modified date. I want to use my own date filters and decide where to apply them. Is this possible in Data Factory? If so, can you please point me in the right direction?


1

1 Answer

1
votes

Within ADF, the only idea that comes to mind is a combination of the Lookup activity, the ForEach activity and the Filter activity. It may be somewhat complex.

1. Use a Lookup activity to retrieve the data from the blob file.

2. Use a ForEach activity to loop over the result and apply your date-time filters.

3. Inside the ForEach activity, run the copy task.
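
The filtering logic behind these steps can be sketched outside of ADF in plain Python. This is only an illustration of the idea, not ADF code: it assumes a hypothetical naming convention where each blob name ends in `_<unix seconds>.json`, and the function names `enqueued_time` and `blobs_in_window` are made up for this example. Inside a pipeline, the same comparison would have to be written with ADF expressions in the Filter or ForEach activity.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumption: blob names look like "<prefix>_<unix seconds>.json".
# Adjust the pattern to whatever convention the Azure Function really uses.
TIMESTAMP_SUFFIX = re.compile(r"_(\d+)\.json$")


def enqueued_time(blob_name):
    """Return the UTC datetime encoded in the blob name, or None if absent."""
    match = TIMESTAMP_SUFFIX.search(blob_name)
    if match is None:
        return None
    return datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)


def blobs_in_window(blob_names, window_start, window_end):
    """Keep blobs whose encoded time falls in [window_start, window_end)."""
    selected = []
    for name in blob_names:
        ts = enqueued_time(name)
        if ts is not None and window_start <= ts < window_end:
            selected.append(name)
    return selected


# Example: select the files for the 10:00-11:00 UTC hourly run.
window_start = datetime(2019, 1, 1, 10, 0, tzinfo=timezone.utc)
window_end = window_start + timedelta(hours=1)

names = [
    "events_1546333200.json",  # 09:00 UTC, before the window
    "events_1546336800.json",  # 10:00 UTC, inside (start is inclusive)
    "events_1546338600.json",  # 10:30 UTC, inside
    "events_1546340400.json",  # 11:00 UTC, outside (end is exclusive)
    "readme.txt",              # no timestamp suffix, skipped
]

to_copy = blobs_in_window(names, window_start, window_end)
print(to_copy)
```

Using a half-open window (start inclusive, end exclusive) means consecutive hourly runs never pick up the same file twice.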

Please refer to this blog to get some clues.

Looking at the whole workflow you describe, I suggest taking a look at the Azure Stream Analytics (ASA) service. Whether the data source is Event Hub or Azure Blob Storage, ASA supports it as an input, and it supports ADL as an output.

You could create a job, configure its input and output, and then use its SQL-like query language to filter your data however you want, for example with the WHERE clause or the date and time functions.
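
To make the idea of filtering on event time concrete, here is a minimal sketch of what such a WHERE-style condition does, written in plain Python rather than in the Stream Analytics query language. The field name `enqueued_unix` and the function `where_enqueued_between` are assumptions made up for this example.

```python
from datetime import datetime, timezone


def where_enqueued_between(events, start, end):
    """Keep events whose enqueued time lies in [start, end).

    Assumption: each event is a dict with an "enqueued_unix" field
    holding the enqueued time as UNIX seconds.
    """
    kept = []
    for event in events:
        enqueued = datetime.fromtimestamp(event["enqueued_unix"], tz=timezone.utc)
        if start <= enqueued < end:
            kept.append(event)
    return kept


events = [
    {"id": "a", "enqueued_unix": 1546336800},  # 2019-01-01 10:00 UTC
    {"id": "b", "enqueued_unix": 1546340400},  # 2019-01-01 11:00 UTC
]
start = datetime(2019, 1, 1, 10, 0, tzinfo=timezone.utc)
end = datetime(2019, 1, 1, 11, 0, tzinfo=timezone.utc)

kept_ids = [e["id"] for e in where_enqueued_between(events, start, end)]
print(kept_ids)
```

The point is that the filter runs on the event's own time rather than on the blob's Last Modified date.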