I want to merge / union multiple files (.csv) with different schema into a single file (.csv) using Azure Data Factory / Synapse Analytics.
Example:
File 1, schema:
C1 | C2 | C3 | C4 | C5 |
---|---|---|---|---|
01 | 02 | 03 | 04 | 05 |
01 | 02 | 03 | 04 | 05 |
File 2, schema
C1 | C5 | C3 |
---|---|---|
01 | 05 | 03 |
01 | 05 | 03 |
File 3, schema:
C1 | C4 | C2 |
---|---|---|
01 | 04 | 02 |
01 | 04 | 02 |
My expexted output should be:
C1 | C2 | C3 | C4 | C5 |
---|---|---|---|---|
01 | 02 | 03 | 04 | 05 |
01 | 02 | 03 | 04 | 05 |
01 | 03 | 05 | ||
01 | 03 | 05 | ||
01 | 02 | 04 | ||
01 | 02 | 04 |
All the files are in the same folder
But I am having the following issue, due to some columns are not listed in all files
{ "errorCode": "2200", "message": "ErrorCode=UserErrorInvalidColumnMappingColumnNotFound,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Column 'C2' specified in column mapping cannot be found in source data.,Source=Microsoft.DataTransfer.ClientLibrary,'", "failureType": "UserError", "target": "Copy_24a", "details": [] }
I tried using Schema drift but no result as same.
Can anyone please let me know how to achieve this?
Update
Now, I am trying Union Transformation from Data Flow. My Source is a folder where I have several/multiple files. As following:
But, still the problem. Actually, if I check the preview, the information shown is not correct:
The curious thing is when a take a look to the preview directly from the origin is looking good, as I want the result (Delimited Text):
Hope anyone could help!
Thanks in advance!