0
votes

I have an Azure Data Factory pipeline that fetches data from a third-party API and stores it in the data lake in .json format. When I click Import schema, it shows the correct data types.

When I set that data lake dataset as the source of a Data Flow activity, the Int64 column is converted to Boolean. I checked the Microsoft documentation and learned that if the values are only 0 or 1, the column is automatically converted to Boolean. How can I avoid this data type conversion?



2 Answers

1
votes

First, check whether 'Infer drifted column types' is enabled under Source Settings.

Data Factory detects the data type as Boolean if the values in the source column are only 1 or 0. This may be a bug.

One workaround, since you are using Data Flow: add a Derived Column that uses a case expression to output 1 or 0 based on the Boolean value.
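
A minimal sketch of that derivation in the data flow expression language, assuming the inferred Boolean column is named setNum (a name borrowed from the other answer's example, not from the question) and that the target type is long (Int64):

```
// Derived Column expression: map the inferred Boolean back to 1 / 0 as long
iif(setNum, toLong(1), toLong(0))

// Equivalent form using case(); the last argument is the default value
case(setNum == true(), toLong(1), toLong(0))
```

Either expression can be assigned to the same column name so that the Boolean column is overwritten with the numeric one.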

1
votes

The easiest way is to reset the whole schema to String, that is, don't convert the data types in the source dataset.

For example, in my source dataset, all the values in setNum are 1 or 0.

In the Data Flow Source Projection, the data type of setNum is first inferred as Boolean.


Reset schema: all the data types become string.


Data Factory will then convert the data types at the sink level. This is similar to copying data from a CSV file.

Update:

You can first reset the schema to String.

Then use a Derived Column to convert the data type as you want.

Use the expressions below:

toShort()
toString()
toShort()
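
As a hedged illustration of how these functions are applied, assuming the column is named setNum as in the example above and should end up as a 64-bit integer as in the question, the Derived Column expression might look like this:

```
// setNum arrives as a string after the schema reset; convert it to long (Int64)
toLong(setNum)

// For a 16-bit integer column, toShort() is used the same way
toShort(setNum)
```

Which conversion function to pick depends on the type the sink expects; toLong() is only chosen here because the question mentions Int64.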


This will solve the problem.