1
votes

I want to merge / union multiple files (.csv) with different schema into a single file (.csv) using Azure Data Factory / Synapse Analytics.

Example:

File 1, schema:

C1 C2 C3 C4 C5
01 02 03 04 05
01 02 03 04 05

File 2, schema

C1 C5 C3
01 05 03
01 05 03

File 3, schema:

C1 C4 C2
01 04 02
01 04 02

My expexted output should be:

C1 C2 C3 C4 C5
01 02 03 04 05
01 02 03 04 05
01 03 05
01 03 05
01 02 04
01 02 04

All the files are in the same folder

But I am having the following issue, due to some columns are not listed in all files

{ "errorCode": "2200", "message": "ErrorCode=UserErrorInvalidColumnMappingColumnNotFound,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Column 'C2' specified in column mapping cannot be found in source data.,Source=Microsoft.DataTransfer.ClientLibrary,'", "failureType": "UserError", "target": "Copy_24a", "details": [] }

I tried using Schema drift but no result as same.

Can anyone please let me know how to achieve this?

Update

Now, I am trying Union Transformation from Data Flow. My Source is a folder where I have several/multiple files. As following:

enter image description here

But, still the problem. Actually, if I check the preview, the information shown is not correct:

enter image description here

The curious thing is when a take a look to the preview directly from the origin is looking good, as I want the result (Delimited Text):

enter image description here

Hope anyone could help!

Thanks in advance!

1

1 Answers