I am implementing a feature that validates a dataset schema. I know that ADF has a checkbox in the Mapping Data Flow section that does this automatically: validation fails if the columns or the column types do not match the reference. For CSV it works fine for me, but with JSON I noticed two things:

1. When I created a Dataset for the JSON, its schema is different from what I see in the Mapping Data Flow, although I use the same dataset. I tried every option in both places (Dataset and Data Flow), such as import from sample file, import projection, etc., without success.

sameDataSetDifferentSchema.png

In the end I changed the Dataset manually in its code view by editing the JSON's JSON :)

changeSchemaOfDataset.png

2. Even that did not solve the problem: with the same schema in both places, it still fails during schema validation.

ErrorMsg.png

However, the error message itself displays the same type, as you can see in the screenshot:

Found: ArrayType(StructType(StructField(**Description1**,StringType,true),...etc Required: ArrayType(StructType(StructField(**Description1**,StringType,true),...etc
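Since the message above is cut off at "...etc", the two types may differ in a field that is not shown. One way to check is to compare the two schemas field by field outside ADF. Below is a minimal sketch, assuming both schemas have been copied by hand into nested Python structures (a dict for a struct, a one-element list for an array, a string for a leaf type). The function `diff_schemas` and the `Amount` field are hypothetical, for illustration only; they are not part of ADF or of the schema in the question.

```python
def diff_schemas(found, required, path="$"):
    """Return a list of readable differences between two nested schemas.

    Assumed representation (not an ADF format): dict = struct,
    one-element list = array of that element, string = leaf type name.
    """
    diffs = []
    if isinstance(found, dict) and isinstance(required, dict):
        # Compare structs field by field, reporting missing fields on either side.
        for name in sorted(set(found) | set(required)):
            child = f"{path}.{name}"
            if name not in required:
                diffs.append(f"{child}: only in found")
            elif name not in found:
                diffs.append(f"{child}: only in required")
            else:
                diffs.extend(diff_schemas(found[name], required[name], child))
    elif isinstance(found, list) and isinstance(required, list):
        # Compare the element schemas of two arrays.
        for f_elem, r_elem in zip(found, required):
            diffs.extend(diff_schemas(f_elem, r_elem, f"{path}[]"))
    elif found != required:
        # Leaf types differ, or one side is a struct/array and the other is not.
        diffs.append(f"{path}: found {found!r}, required {required!r}")
    return diffs


# Hypothetical example: "Amount" is an invented field used only to show the output.
found = [{"Description1": "string", "Amount": "string"}]
required = [{"Description1": "string", "Amount": "integer"}]

for line in diff_schemas(found, required):
    print(line)
```

With the invented schemas above, the only reported difference is at `$[].Amount`. This sketch ignores field order and nullability, either of which could also matter to the validator.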

1 Answer

Instead of changing the data type in the dataset JSON, just override it in the data flow.

In the Projection tab of the Source transform, click "Import Projection" to override the dataset schema.

If you're not getting the schema that you want, then modify it using a Derived Column with toInteger() for the string you wish to convert.