I'm using GraphFrames with Spark 2.0 and Scala.

I need to remove double quotes from the columns that are of string type (out of many columns). I'm trying to do this with a UDF, as follows:

import org.apache.spark.sql.functions.udf

val removeDoubleQuotes = udf( (x:Any) =>
    x match{
      case s:String => s.replace("\"","")
      case other => other
    }
  )

I get the following error, since the type Any is not supported as a UDF type:

java.lang.UnsupportedOperationException: Schema for type Any is not supported

What is a workaround for that?

Do your columns have mixed types? Why not just write it only for strings and apply it only to the string columns? - Joe K
@JoeK Because I have many columns and I'm trying to avoid finding the string columns manually. - MehrdadAP

1 Answer

I don't think you have a column of type Any, and a UDF can't return different data types depending on its input. It needs a single return type.

If your column is a String column, you can define the UDF as:

import org.apache.spark.sql.functions.udf

val removeDoubleQuotes = udf( (x:String) => x.replace("\"",""))
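
To avoid picking out the string columns by hand, one option is to read the column types from the DataFrame schema and rewrite only the StringType columns. The following is a minimal sketch, not tested against your data: the names StripQuotes and stripQuotesFromStringColumns are made up for illustration, and it uses the built-in regexp_replace function instead of a UDF, which also leaves null values as null.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, regexp_replace}
import org.apache.spark.sql.types.StringType

// Hypothetical helper object; the names are illustrative, not from the question.
object StripQuotes {

  // Removes double quotes from every top-level StringType column.
  // Columns of any other type are returned unchanged.
  def stripQuotesFromStringColumns(df: DataFrame): DataFrame = {
    // Collect the names of the string columns from the schema.
    val stringCols = df.schema.fields.collect {
      case f if f.dataType == StringType => f.name
    }

    // Replace each string column in turn with its quote-free version.
    // Backticks keep column names containing dots from being parsed as nested fields.
    stringCols.foldLeft(df) { (acc, name) =>
      acc.withColumn(name, regexp_replace(col(s"`$name`"), "\"", ""))
    }
  }
}
```

Assuming your graph is a GraphFrame called g, you could apply the helper to g.vertices and g.edges and build a new GraphFrame from the two results. It only handles top-level string columns; strings nested inside structs or arrays would need extra handling.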