
Here is what I am currently trying to do in Data Flow:

  1. Read table from SQL DWH (FileName, ProductID, MachinesCount, UsersCount, LastUsed)
  2. In my Sink settings, I have set the “File name option” to “As data in column” and selected the FileName column.

In my Sink dataset (Azure Blob Storage), I have set the container to “referencedata” and the folderPath to “mostused/accounts/newaccounts”. However, whenever the Data Flow runs, it writes the data to the root of the blob container (“referencedata”) and ignores the folderPath.

Extra information that might be useful: while the Data Flow is actually running, it writes the data inside “mostused/accounts/newaccounts”, just as Azure Databricks does, but then moves it to the root of the blob container. I suspect the logic that consolidates the multi-partitioned Databricks output files into more readable files is buggy and is moving the files to the root of the container.


1 Answer


Found the solution. When choosing “As data in column”, you need to specify the folder path relative to the container in the column value itself. So if your file name is file.json and you want to write it to the directory “mostused/accounts/newaccounts”, the column value needs to be “mostused/accounts/newaccounts/file.json”.
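For example, here is a minimal sketch of a source query that builds the full path in the FileName column before it reaches the sink (the table name dbo.MostUsedFiles is a hypothetical placeholder; the column names come from the question):

```sql
-- Prepend the container-relative folder path to the file name so that the
-- sink's "As data in column" option receives the full path, not just the name.
SELECT
    'mostused/accounts/newaccounts/' + FileName AS FileName,
    ProductID,
    MachinesCount,
    UsersCount,
    LastUsed
FROM dbo.MostUsedFiles;  -- hypothetical source table
```

Alternatively, a Derived Column transformation placed before the sink with an expression such as concat('mostused/accounts/newaccounts/', FileName) achieves the same result without changing the source query.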

I understand this is a confusing experience and Microsoft is working to improve it.