3 votes

I'm quite new to Data Factory and Logic Apps (though I have many years of experience with SSIS).
I succeeded in loading a folder with 100 text files into Azure SQL with Data Factory,
but the files themselves are untouched.

Now, another requirement is that I loop through the folders to get all files with a certain file extension. In the end I should move (= copy & delete) all the files from the 'To_be_processed' folder to the 'Processed' folder.

I cannot find where to put wildcards and such.
For example: get all files with file extensions .001, .002, .003, .004, .005, ...until..., .996, .997, .998, .999 (a thousand files), also searching in the subfolders.

Is it possible to call a Data Factory pipeline from within a Logic App? (Although this seems unnecessary.)

Please find some more detailed information in this screenshot:

Thanks in advance for helping me out exploring this new technology!


2 Answers

1 vote

Interesting situation.

I agree that using Logic Apps just for this additional layer of file handling seems unnecessary, but Azure Data Factory may currently be unable to deal with exactly what you need...

In terms of adding wildcards to your Azure Data Factory datasets, you have three attributes available within the JSON typeProperties block, as follows.

Folder Path - specifies the directory. This can work with a partition by clause for a time-slice start and end. Required.

File Name - specifies the file. Again, this can work with a partition by clause for a time-slice start and end. Not required.

File Filter - this is where wildcards can be used: (*) for multiple characters and (?) for a single character. Not required.

More info here: https://docs.microsoft.com/en-us/azure/data-factory/data-factory-onprem-file-system-connector

I have to say that, taken separately, none of the above is ideal for what you require. I've already fed back to Microsoft that we need a more flexible attribute that combines the three values above into one, allowing wildcards in various places and a partition-by condition that works with more than just date/time values.

That said, try something like the below.

"typeProperties": {
  "folderPath": "TO_BE_PROCESSED",
  "fileFilter": "17-SKO-??-MD1.*" //looks like 2 middle values in image above
  }
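For the question's .001 through .999 requirement, the single-character wildcard can cover a fixed-length numeric extension. This is only a sketch, assuming every extension is exactly three characters; subfolder traversal is handled separately, so check the connector documentation linked below for recursive behaviour:

"typeProperties": {
  "folderPath": "TO_BE_PROCESSED",
  "fileFilter": "*.???"
}

Be aware that this also matches any other three-character extension (e.g. .txt), so a tighter filter, or filtering downstream, may be needed.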

On a side note: there is already a Microsoft feedback item that's been raised for a file move activity, which is currently under review.

See here: https://feedback.azure.com/forums/270578-data-factory/suggestions/13427742-move-activity

Hope this helps

0 votes

We have used a C# application that we call through App Services -> WebJobs. It is much easier to iterate through folders that way. To load the data into SQL we used a bulk insert.
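A minimal sketch of that approach, assuming SqlBulkCopy and text files you can parse into a DataTable (the connection string, table name, and folder paths below are placeholders, and the file-parsing step is left out because it is format-specific):

using System.Data;
using System.Data.SqlClient;
using System.IO;

class Loader
{
    static void Main()
    {
        var connectionString = "...";                  // placeholder
        var source = @"C:\To_be_processed";            // placeholder
        var target = @"C:\Processed";                  // placeholder

        // SearchOption.AllDirectories also covers the subfolders.
        foreach (var file in Directory.EnumerateFiles(
            source, "*.*", SearchOption.AllDirectories))
        {
            var table = new DataTable();
            // ... parse the text file into 'table' here (format-specific) ...

            using (var bulk = new SqlBulkCopy(connectionString))
            {
                bulk.DestinationTableName = "dbo.StagingTable"; // placeholder
                bulk.WriteToServer(table);
            }

            // "Move" = copy & delete, as described in the question.
            File.Move(file, Path.Combine(target, Path.GetFileName(file)));
        }
    }
}

The same wildcard filtering from the question (e.g. "*.???") can be passed as the search pattern instead of "*.*".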