
Is "Derived Column" step in Data Flow of Azure Data Factory able to dynamically detect empty value and replace with NULL value while looping through several files with different numbers of columns?

What I am trying to do is copy 10 CSV files straight into 10 tables, without any manipulation other than replacing empty values with NULL. Each of these 10 files contains 5-25 columns. I know I can build 10 data flows to achieve this, but I was wondering if there is a smarter way: LOOKUP the file list and, FOREACH file, dynamically replace the empty values?

Update:

Thanks to Joseph's answer, it seems the data flow automatically treats empty values as NULL. But I then ran into another parsing problem: the date / time / datetime columns arrive as NULL in the sink.

Source: the positions of the date / time / datetime columns are not fixed in each file, nor do they share the same column names. Since the source is in CSV format, all columns are treated as string type, so I cannot use "type" for pattern matching: https://i.stack.imgur.com/X1C7z.png

Sink: in the example above, all the highlighted columns arrive as NULL in the sink: https://i.stack.imgur.com/PecRb.png

Do your csv files have headers? In my answer, they must have. – Joseph Xu
Yes, they have headers. – Ben S

1 Answer


Yes, we can. In my test, I copied 2 csv files into 2 tables with a single data flow.

  1. I created an Array variable named Objects. Its format is like this: [{"source":"nullValue1.csv","sink":"TestNull"},{"source":"nullValue2.csv","sink":"TestNull2"}]. In your case, you can add 10 objects to the array, one per file (a sketch follows).

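As a rough sketch, the default value of the Objects variable for 10 files could look like the following (these file and table names are only placeholders for your own):

    [
      { "source": "file01.csv", "sink": "Table01" },
      { "source": "file02.csv", "sink": "Table02" },
      { "source": "file03.csv", "sink": "Table03" }
    ]

and so on, up to the 10th file/table pair.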

  2. Then we can use a ForEach activity to loop over the Objects array (the relevant setting is sketched below):

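In the ForEach activity's settings, the Items field simply points at the variable. A hedged sketch of the relevant pipeline JSON fragment (the activity name is illustrative):

    "name": "ForEachFile",
    "type": "ForEach",
    "typeProperties": {
        "items": {
            "value": "@variables('Objects')",
            "type": "Expression"
        }
    }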

  3. Inside the ForEach activity, we can call a Data Flow activity.


  4. In the source settings we can enter the dynamic content @item().source to specify the filename dynamically (one way to wire this up is sketched after these steps).


  5. Here comes the key point: in the Derived Column transformation, we can add a column pattern and use iifNull($$, toString(null())) to detect the empty value in each column and replace it with NULL (a script sketch follows).

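For reference, this is roughly how that column pattern reads in the underlying data flow script; match(true()) applies the expression to every incoming column, and the stream names are illustrative:

    source1 derive(each(match(true()),
        $$ = iifNull($$, toString(null())))) ~> ReplaceEmptyWithNull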

  6. In the sink, we can enter the dynamic content @item().sink to specify the table name in SQL.

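Because @item() is only resolved at the pipeline level, the filename and table name have to be passed into the Data Flow activity, for example through dataset parameters. A hedged sketch of a parameterized CSV source dataset (the linked service, container and parameter name are assumptions for illustration):

    {
      "name": "SourceCsv",
      "properties": {
        "type": "DelimitedText",
        "linkedServiceName": { "referenceName": "AzureBlobStorage1", "type": "LinkedServiceReference" },
        "parameters": { "fileName": { "type": "string" } },
        "typeProperties": {
          "location": {
            "type": "AzureBlobStorageLocation",
            "fileName": { "value": "@dataset().fileName", "type": "Expression" },
            "container": "input"
          },
          "columnDelimiter": ",",
          "firstRowAsHeader": true
        }
      }
    }

The Data Flow activity then supplies @item().source (and, on the sink dataset, @item().sink) as the parameter values.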



Update

My original date format is like this: 11/04/2020 1:17:40.

According to this document, we can use the name option in the column pattern to pick out the columns that need the date format conversion.


Here we can use the expression toString(toTimestamp(toString($$),'MM/dd/yyyy HH:mm:ss'),'yyyy-MM-dd HH:mm:ss') to convert the date format.

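Putting the name pattern and the expression together, the date-conversion pattern could read roughly like this in data flow script (Column4 is the column from my test; the stream names are illustrative):

    source1 derive(each(match(name == 'Column4'),
        $$ = toString(toTimestamp(toString($$), 'MM/dd/yyyy HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss'))) ~> ConvertDates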

In the result we can see that Column4 is converted correctly.

Part 2

We should also replace the / in the dates with -. The Derived Column can do this with several column patterns.

The first pattern matches every column except the date/time ones: 1==1 && name!='ClockingDate' && name!='CreationDate' && name!='ClockingTime'. The remaining patterns match the date/time columns by name and apply the format conversion (a sketch follows).
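A sketch of how these patterns could sit together in one Derived Column (the conversion expression for the date columns follows the format above and is an assumption; ClockingTime would need its own time format):

    source1 derive(
        each(match(1==1 && name != 'ClockingDate' && name != 'CreationDate' && name != 'ClockingTime'),
            $$ = iifNull($$, toString(null()))),
        each(match(name == 'ClockingDate' || name == 'CreationDate'),
            $$ = toString(toTimestamp(toString($$), 'MM/dd/yyyy HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss'))
    ) ~> FixDatesAndNulls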

The data preview then shows the converted values.