I get CSV files of different lengths from different sources. The columns differ from file to file; the only constant is that every CSV file has an id column, which can be used to tie together records across files. Two such CSV files are processed at a time. The process is to take the id column from the first file, match it against the rows in the second file, and create a third file that contains the contents of both. The id can be repeated in the first file; an example is given below.
Please note that the first file might have 18 to 19 different combinations of data columns, so I cannot hardcode the transformation in DataWeave, and there is a chance that new files will be added over time as well. I want a dynamic approach, so that once written, the logic works even if a new file is added. These files can also get pretty big.
The sample files are given below.
CSV1.csv
--------
id,col1,col2,col3,col4
1,dat1,data2,data3,data4
2,data5,data6,data6,data6
2,data9,data10,data11,data12
2,data13,data14,data15,data16
3,data17,data18,data19,data20
3,data21,data22,data23,data24
CSV2.csv
--------
id,objectid,resid,remarks
1,obj1,res1,rem1
2,obj2,res2,rem2
3,obj3,res3,rem3
Expected output file - CSV3.csv
-------------------------------
id,col1,col2,col3,col4,objectid,resid,remarks
1,dat1,data2,data3,data4,obj1,res1,rem1
2,data5,data6,data6,data6,obj2,res2,rem2
2,data9,data10,data11,data12,obj2,res2,rem2
2,data13,data14,data15,data16,obj2,res2,rem2
3,data17,data18,data19,data20,obj3,res3,rem3
3,data21,data22,data23,data24,obj3,res3,rem3
I was thinking of using pluck to get the column names from the first file. The idea was to get the columns in the transformation without hardcoding them, but I am getting an error. After this, I still have the task of searching for the id and getting the values from the second file.
{(
using(keys = payload pluck $$)
(
payload map
( (value, index) ->
{
(keys[index]) : value
}
)
)
)}
I am getting the following error when using pluck:
Type mismatch for 'pluck' operator
found :array, :function
required :object, :function
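Judging by the message, the parsed CSV payload is an array of row objects, while pluck expects a single object. To show what I mean by getting the columns without hardcoding them, here is a rough sketch in Python rather than DataWeave, purely as an illustration. The helper name column_names and the inline sample text are made up for this example; it simply reads the column names from the header row of whatever file is passed in.

```python
import csv
import io

# Illustrative sample data, copied from the first rows of CSV1.csv above.
CSV1 = """id,col1,col2,col3,col4
1,dat1,data2,data3,data4
2,data5,data6,data6,data6
"""


def column_names(csv_text):
    """Return the header names of a CSV document, in file order (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # fieldnames is taken from the header row, so no column name is hardcoded.
    return list(reader.fieldnames or [])


print(column_names(CSV1))
```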
I am thinking of using groupBy on id for the second file to make the lookup faster, but I need suggestions on how to append the contents in a single transformation to produce the third file.
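To pin down the behaviour I am after, here is the logic sketched in Python rather than DataWeave: build a lookup from the second file keyed by id, then walk the first file row by row and append the matching columns. The names merge_csv and id_column are hypothetical. The sketch also assumes that each id appears at most once in the second file (as in the sample) and that ids missing from the second file produce empty cells; both are my assumptions, not requirements I have confirmed.

```python
import csv
import io


def merge_csv(first_text, second_text, id_column="id"):
    """Append the columns of the second CSV to each row of the first, matched on id_column."""
    first = csv.DictReader(io.StringIO(first_text))
    second = csv.DictReader(io.StringIO(second_text))

    # Columns of the second file other than the id, discovered from its header.
    extra_columns = [name for name in (second.fieldnames or []) if name != id_column]

    # Index the second file by id so that each lookup is a dictionary access.
    # Assumption: ids are unique in the second file; a later duplicate overwrites an earlier one.
    lookup = {row[id_column]: row for row in second}

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=list(first.fieldnames or []) + extra_columns,
        lineterminator="\n",
    )
    writer.writeheader()
    # Rows of the first file are handled one at a time, so repeated ids need no special case.
    for row in first:
        match = lookup.get(row[id_column], {})
        for name in extra_columns:
            # Assumption: an id missing from the second file yields empty cells.
            row[name] = match.get(name, "")
        writer.writerow(row)
    return out.getvalue()
```

In this sketch only the lookup built from the second file is kept as parsed rows, while the rows of the first file are written out as they are read. Since the files get pretty big, I would like the DataWeave version to have the same property if possible.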