I have a column in my Spark DataFrame (Scala) that was generated by aggregating multiple columns using:

 agg(collect_list(struct(col("abc"), col("aaa"))).as("def"))

I want to pass this column to a UDF for further processing, so that it can work on one of the indexes of this aggregated column.

When I pass the columns to my UDF like this:

.withColumn("def", remove(col("xyz"), col("def")))

The UDF declares the aggregated column's parameter as Seq[Row]:

 val removeUnstableActivations: UserDefinedFunction = udf((xyz: java.util.Date, def: Seq[Row])

I get the error:

Exception encountered when invoking run on a nested suite - Schema for type org.apache.spark.sql.Row is not supported

How should I pass this column, and what should its data type be in the UDF?
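
For reference, here is a minimal sketch of one way this is commonly handled. It rests on assumptions not confirmed above: that the error comes from the UDF's return type being `Seq[Row]` (Spark cannot derive a schema for `Row` by reflection) rather than from its input type. The `Entry` case class, the `keepLarge` UDF, the sample data, the `key` column, and the column types are all hypothetical, and `java.sql.Timestamp` stands in for `java.util.Date`. The input stays `Seq[Row]`, but the UDF returns a sequence of case class instances so that Spark can infer the result schema.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, collect_list, struct, udf}

// Hypothetical result element type; field names mirror the struct's columns.
case class Entry(abc: String, aaa: Int)

object CollectListUdfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("collect-list-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data; the column types are assumptions.
    val ts = java.sql.Timestamp.valueOf("2020-01-01 00:00:00")
    val df = Seq(("k1", ts, "a", 1), ("k1", ts, "b", 2))
      .toDF("key", "xyz", "abc", "aaa")

    // Same aggregation shape as in the question: a list of structs named "def".
    val grouped = df
      .groupBy(col("key"), col("xyz"))
      .agg(collect_list(struct(col("abc"), col("aaa"))).as("def"))

    // The array-of-struct column arrives as Seq[Row]; fields are read by name.
    // The return type is Seq[Entry], not Seq[Row], so the schema can be inferred.
    val keepLarge = udf((xyz: java.sql.Timestamp, items: Seq[Row]) =>
      items
        .map(r => Entry(r.getAs[String]("abc"), r.getAs[Int]("aaa")))
        .filter(_.aaa > 1) // placeholder logic; xyz is unused in this sketch
    )

    grouped.withColumn("def", keepLarge(col("xyz"), col("def"))).show(false)
    spark.stop()
  }
}
```

If the result really has to be built from `Row` objects, the `udf` overload that takes an explicit return `DataType` is another option, although it is deprecated in newer Spark versions.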