Is it possible to cast a StringType column to an ArrayType column in a Spark DataFrame?
df.printSchema()
gives this
Schema ->
a: string (nullable = true)
Now I want to convert this to
a: array (nullable = true)
As elisiah commented, you have to split your string. You can use a UDF:
df.printSchema

import org.apache.spark.sql.functions._

// UDF that splits a space-separated string into an Array[String]
val toArray = udf[Array[String], String](_.split(" "))

val featureDf = df.withColumn("a", toArray(df("a")))
featureDf.printSchema
The two printSchema calls show the schema before and after the conversion:
root
|-- a: string (nullable = true)
root
|-- a: array (nullable = true)
| |-- element: string (containsNull = true)
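As a side note, if the values in a are simply space-separated, the built-in split function from org.apache.spark.sql.functions gives the same result without defining a UDF. This is only a minimal sketch under that assumption:

import org.apache.spark.sql.functions.split

// split on the space character; the result is an array<string> column
val featureDf = df.withColumn("a", split(df("a"), " "))
featureDf.printSchema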