I'm interested in taking a column in my dataframe called mapColumn
+-------------------+
| mapColumn |
+-------------------+
| Map(KEY -> VALUE) |
+-------------------+
and creating a stringColumn that contains the key and value of the map column, formatted as "KEY,VALUE":
+-------------------+
| stringColumn |
+-------------------+
| KEY,VALUE |
+-------------------+
I have tried creating a UDF and passing the map column to it as follows:
var getStringColumn = udf(mapToString _)
df.withColumn("stringColumn",
  when(col("mapColumn").isNotNull,
    getStringColumn(col("mapColumn")))
  .otherwise(lit(null: String)))
def mapToString(row: Row): String = {
  if (null == row || row.isNullAt(FirstItemIndex)) {
    return null
  }
  return row.getValuesMap[Any](row.schema.fieldNames).mkString(",")
}
I keep getting the following error:
Failed to execute user defined function($anonfun$1: (map) => string)
Cause: java.lang.ClassCastException: scala.collection.immutable.Map$Map1 cannot be cast to org.apache.spark.sql.Row
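For context, the exception seems to say that Spark hands a MapType column to a UDF as a Scala Map, not as a Row. A minimal sketch of what I understand the matching UDF signature would look like (assuming string keys and values, and reusing the df and mapColumn names from above):

```scala
import org.apache.spark.sql.functions.{col, lit, udf, when}

// Sketch: accept the column as Map[String, String] instead of Row.
// Each entry is rendered as "KEY,VALUE"; multiple entries are joined with ",".
def mapToString(m: Map[String, String]): String =
  if (m == null) null
  else m.map { case (k, v) => s"$k,$v" }.mkString(",")

val getStringColumn = udf(mapToString _)

val result = df.withColumn("stringColumn",
  when(col("mapColumn").isNotNull, getStringColumn(col("mapColumn")))
    .otherwise(lit(null: String)))
```

This is only a sketch of the signature change implied by the error, not a confirmed fix.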