1
votes

I have a blob with below format. First row gives header details and next 2 rows as data record and final row as trailer record which contains data records count. While record the file I want to define my schema as single field and once I remove my trailer record I want to convert it into proper schema format with "|" as delimiter. Could you let me know how can I achieve this please.

DeptID|DeptNAme
1|A
2|B
2

Thanks in Advance Kumar

1

1 Answers

0
votes

update:

  1. After SurrogateKey1 activity mentioned previously at Step4, we can use Select activity to select the column DeptID|DeptNAme. enter image description here

  2. Then we can use DerivedColumn1 activity, expressions split({DeptID|DeptNAme},'|')[1] and split({DeptID|DeptNAme},'|')[2] to generate new columns manually. enter image description here

  3. The data preview is as follows: enter image description here


@Kumar G we can use data-flow in ADF to achieve that.
For example, I created a simple test.

  1. I created a bolb in Azure Data Lake Gen2 as follows: enter image description here

  2. I created a data source of this blob , select Pipe (|) as Column delimiter and First row as header. enter image description here The schema is as follows: enter image description here

  3. I created a mapping data flow in ADF and the source data preview is as follows: enter image description here

  4. In SurrogateKey1, type in Row_No as Key Column, 1 as Start Value. enter image description here The data preview is as follows: enter image description here

  5. In Conditional split1, use Row_No < 3 to exclude the last line. enter image description here

  6. In Select1, not select Row_No column, enter image description here The data preview is as follows: enter image description here

That's all!