I have input like this.

Input:

+----------+-----------+---------+
|customerId|Header     |Line     |
+----------+-----------+---------+
|1001      |1001aa     |1001aa1  |
|1001      |1001aa     |1001aa2  |
|1001      |1001aa     |1001aa3  |
|1001      |1001aa     |1001aa4  |
|1002      |1002bb     |1002bb1  |
|1002      |1002bb     |1002bb2  |
|1002      |1002bb     |1002bb3  |
|1002      |1002bb     |1002bb4  |
|1003      |1003cc     |1003cc1  |
|1003      |1003cc     |1003cc2  |
|1003      |1003cc     |1003cc3  |
+----------+-----------+---------+

I want the output to be of this type: [image]

Using a DataFrame and a UDF I am able to produce this: [image]. But I would like the struct data type to carry those column names as well. Any help is appreciated.

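import org.apache.spark.sql.functions.{col, collect_list, udf}

// Pairs the header with every line; the result is an array of arrays without field names.
val udfHeaderLineList1 = udf((header: String, line: Seq[String]) => {
  line.map(record => List(header, record)).toList
})

val eventingDFtable = my_dataframe_data_Table
  .groupBy(col("customerId"), col("header"))
  .agg(collect_list(col("Line")).alias("Line"))
  .withColumn("TransHeaderStruct", udfHeaderLineList1(col("header"), col("Line")))

// Call printSchema separately so eventingDFtable stays a DataFrame rather than Unit.
eventingDFtable.printSchema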

1 Answer

I solved this by creating a case class, so the UDF returns an array of structs whose fields are named after the case class fields:

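import org.apache.spark.sql.functions.udf

// The case class field names become the field names of the struct.
case class simpleCaseClass(header: String, line: String)

// Pair the header with every line of the group as a named struct.
val udfHeaderLineList3 = udf((header: String, line: Seq[String]) => {
  line.map(record => simpleCaseClass(header, record))
})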
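
As a possible alternative, the same named struct can probably be built without a UDF by wrapping the columns in `struct` before `collect_list`. The sketch below is not the code from the question: the object name `StructListSketch`, the helper `withStructList`, and the local `SparkSession` settings are assumptions made for illustration, and the sample rows are a subset of the input shown in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, struct}

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object StructListSketch {

  // Groups by customerId and Header, and collects one struct per row.
  // The aliases "header" and "line" become the struct field names.
  def withStructList(df: DataFrame): DataFrame =
    df.groupBy(col("customerId"), col("Header"))
      .agg(
        collect_list(
          struct(col("Header").as("header"), col("Line").as("line"))
        ).as("TransHeaderStruct")
      )

  def main(args: Array[String]): Unit = {
    // Assumed local session; a real job would reuse its existing session.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("struct-list-sketch")
      .getOrCreate()
    import spark.implicits._

    // A subset of the rows from the question.
    val input = Seq(
      ("1001", "1001aa", "1001aa1"),
      ("1001", "1001aa", "1001aa2"),
      ("1002", "1002bb", "1002bb1"),
      ("1003", "1003cc", "1003cc1")
    ).toDF("customerId", "Header", "Line")

    val result = withStructList(input)
    result.printSchema() // TransHeaderStruct is an array of structs with fields header and line
    result.show(false)

    spark.stop()
  }
}
```

Built-in functions such as `struct` and `collect_list` stay visible to the Catalyst optimizer, whereas a UDF is opaque to it, so this version may be preferable when the only goal is to attach field names.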