I am using ADLS Gen2, and the data is stored in a date hierarchy structure: a year folder, then a month folder, then a day folder. Each day folder contains 2 txt files, which I have to process into one SQL table.

Example of the ADLS folder structure:

raw/data

/2020

   /02

     /01

          A.txt
          B.txt     

 . . . . 
 . . . .
 
/2021

   /03

     /01

          A.txt
          B.txt

          
/2021  ( Year Folder )

   /03  ( Month Folder )

     /02 ( Day Folder )

          A.txt
          B.txt 

I can process all the files easily in ADF, but the data covers the last 10 years, so now I want to process files based on a date parameter filter.

The parameters are:
    From_Date - '2020/03/01'
    To_Date -'2020/03/01'

So ADF should process all folders within this date range.

I would like to iterate only over the folders and files that satisfy the input date filter.

How do I iterate over a specific date hierarchy folder and get the files inside it?

What would be the best approach to achieve it?

Thanks in advance.

1 Answer

I haven't found an easy way to do this, because it is hard to filter a date hierarchy structure in ADF. Here is one solution, though its performance may not be good.

Solution:

  1. Get all dates (format: yyyy/MM/dd) between From_Date and To_Date and put them into an array. You can do this with an Azure Function activity or directly in ADF (doing it directly in ADF can be complex).

  2. Loop over this array with a ForEach activity and copy data from Azure Data Lake Gen2 to the SQL table using a Copy activity. (It is better to first use a Get Metadata activity to check whether the folder exists.)
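
For step 1, the date expansion itself is small. Below is a minimal sketch of the logic an Azure Function could run, written in plain Python. The function name `date_folders` and its `root` parameter are hypothetical, and the default root `raw/data` is only taken from the example path in the question; wiring this into an actual Azure Function and returning the array to the pipeline is not shown.

```python
from datetime import datetime, timedelta
from typing import List


def date_folders(from_date: str, to_date: str, root: str = "raw/data") -> List[str]:
    """Return one folder path per day from from_date to to_date, inclusive.

    Dates are expected as 'yyyy/MM/dd' strings, matching the pipeline parameters.
    The name and the default root are illustrative assumptions.
    """
    start = datetime.strptime(from_date, "%Y/%m/%d").date()
    end = datetime.strptime(to_date, "%Y/%m/%d").date()
    if start > end:
        raise ValueError("from_date must not be after to_date")

    folders = []
    current = start
    while current <= end:
        # Zero-padded year/month/day mirrors the /2020/02/01 folder layout.
        folders.append(f"{root}/{current.strftime('%Y/%m/%d')}")
        current += timedelta(days=1)
    return folders


if __name__ == "__main__":
    for path in date_folders("2020/02/28", "2020/03/01"):
        print(path)
```

The resulting array can then be passed to the ForEach activity in step 2, with each item used as the folder path of the source dataset. Days that have no folder will still appear in the array, which is why the Get Metadata existence check is useful.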