
When using a data flow in Azure Data Factory to move data, I've noticed that the records at the sink are missing any columns that contain NULL values. When I use the copy activity to copy the same data, those columns are present in the sink with their NULL values.

Record after a copy activity: [image]

Record after a data flow: [image]

The source is Parquet and the sink is Azure Cosmos DB. My goal is to avoid defining any schemas, as I simply want to copy all of the data "as is". I've enabled the "allow schema drift" option on both the source and the sink.

I would just use the copy activity, but it doesn't appear to have a way to cap the write speed (RU consumption) like the data flow does, so the copy activity ends up consuming all of the Cosmos DB's RUs very quickly (as described here).


EDIT:

The sink data preview shows all columns: [image: sink data tab]

The sink Inspect tab shows all columns: [image: sink inspect tab]

Inside your data flow designer, when you click "data preview" on the sink transformation, do you see the columns there that are NULL? Also, check the Inspect tab to make sure those columns are present in the metadata. - Mark Kromer MSFT
@MarkKromerMSFT - yeah, I see the columns in the data preview as well as the Inspect tab for the sink. I'll add a picture to the OP - UnknownBeef
Updated my earlier answer with repro findings, see if it helps - KarthikBhyresh-MT

1 Answer


Data flows always skip writing JSON properties whose values are NULL. There is currently no workaround other than using the copy activity.
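
If you stay with the data flow for the throughput control, one option is to handle the difference on the read side, treating a property that is absent from a document as NULL. This does not change what the data flow writes to Cosmos DB. It is only a minimal sketch of how a consumer could normalize documents after reading them. The function name, the column list, and the sample documents are hypothetical, not taken from the question.

```python
from typing import Any, Dict, Iterable


def fill_missing_with_null(doc: Dict[str, Any], expected_columns: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of doc in which every expected column exists.

    Columns absent from the document (as written by the data flow)
    are added with the value None; existing values are left untouched.
    """
    normalized = dict(doc)
    for column in expected_columns:
        normalized.setdefault(column, None)
    return normalized


# Hypothetical documents: the copy activity keeps the NULL column,
# the data flow omits it entirely.
copy_activity_doc = {"id": "1", "name": "widget", "description": None}
data_flow_doc = {"id": "1", "name": "widget"}

# Hypothetical column list; in practice this would come from the source schema.
expected = ["id", "name", "description"]

normalized = fill_missing_with_null(data_flow_doc, expected)
print(normalized == copy_activity_doc)
```

The trade-off is that the reader needs to know the expected column names, which partly works against the goal of not defining a schema anywhere.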