I have a DataFrame with a complex column of type ArrayType containing a StructType. To transform this DataFrame I created a UDF that consumes this column as an Array[case class] parameter. The main bottleneck is this: when I create the case class to match the StructType, the StructField names contain special characters, for example "##field". So I give the case class field the same name (written in backquotes, since the compiler requires it) and wire the case class into the UDF parameter. After compilation, the case class field name in the UDF definition changes to "$hash$hashfield". The transform on this DataFrame then fails because of this mismatch. Please help ...
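Roughly, the setup looks like this (a minimal sketch of what I'm doing; Foo and the column contents are illustrative):

import org.apache.spark.sql.functions.udf

// Case class mirroring the struct; the special-character field name
// needs backquotes to compile:
case class Foo(`##field`: String)

// UDF consuming the array-of-struct column as Seq[Foo]:
val transformFoo = udf { (xs: Seq[Foo]) => xs.map(_.`##field`) }

// At runtime the lookup fails: the JVM stores the field under the
// encoded name "$hash$hashfield", which no longer matches "##field".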
1 Answer
Due to JVM limitations, Scala stores identifiers in an encoded form, and currently Spark can't map ##field to its encoded counterpart $hash$hashfield.
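You can see the encoding directly on the JVM (a small illustration; Foo is a throwaway name):

case class Foo(`##field`: String)

// The JVM-level field name is the encoded form:
classOf[Foo].getDeclaredFields.map(_.getName)    // Array($hash$hashfield)

// scala.reflect.NameTransformer performs the same mapping:
scala.reflect.NameTransformer.encode("##field")  // "$hash$hashfield"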
One possible solution is to extract the fields manually from the raw Row (but you need to know the order of the fields in the DataFrame; df.schema will give you that):
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.udf

// Output type with a plain, un-encoded field name:
case class Foo(a: String)

val myUdf = udf { (struct: Row) =>
  // Pattern match the struct:
  struct match {
    case Row(a: String) => Foo(a)
  }
  // ... or extract values from the Row by position instead:
  // val `##a` = struct.getAs[String](0)
}