I need to append multiple columns to an existing Spark DataFrame. The column names are given in a List, and the values for the new columns are constant. For example, given these column names and input data:
val columnsNames=List("col1","col2")
val data = Seq(("one", 1), ("two", 2), ("three", 3), ("four", 4))
After appending both columns, assuming the constant values are "val1" for col1 and "val2" for col2, the output DataFrame should be:
+-----+---+----+----+
|   _1| _2|col1|col2|
+-----+---+----+----+
|  one|  1|val1|val2|
|  two|  2|val1|val2|
|three|  3|val1|val2|
| four|  4|val1|val2|
+-----+---+----+----+
I have written a function to append the columns:
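import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

// Recursively adds one column per name; each column's value is its own name.
def appendColumns(cols: List[String], ds: DataFrame): DataFrame = {
  cols match {
    case Nil => ds
    case h :: Nil => appendColumns(Nil, ds.withColumn(h, lit(h)))
    case h :: tail => appendColumns(tail, ds.withColumn(h, lit(h)))
  }
}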
Is there a better, more functional way to do this?
Thanks.
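One direction that might fit, offered as a minimal sketch rather than a tested solution: fold over (name, value) pairs instead of recursing by hand. The helper names appendConstantColumns and appendConstantColumnsSelect and the list columnValues are hypothetical, introduced only for illustration, and the local SparkSession setup is an assumption about the environment (in spark-shell, getOrCreate would simply return the existing session).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit}

// Hypothetical helper: adds one constant column per (name, value) pair,
// threading the DataFrame through foldLeft as the accumulator.
def appendConstantColumns(cols: Seq[(String, String)], ds: DataFrame): DataFrame =
  cols.foldLeft(ds) { case (acc, (name, value)) => acc.withColumn(name, lit(value)) }

// Hypothetical variant: a single select that keeps every existing column
// and appends all the constant columns in one projection.
def appendConstantColumnsSelect(cols: Seq[(String, String)], ds: DataFrame): DataFrame =
  ds.select(col("*") +: cols.map { case (name, value) => lit(value).as(name) }: _*)

// Assumed local session, for trying the sketch outside a cluster.
val spark = SparkSession.builder().master("local[*]").appName("append-columns").getOrCreate()
import spark.implicits._

val columnsNames = List("col1", "col2")
val columnValues = List("val1", "val2") // assumed constants, taken from the expected output
val data = Seq(("one", 1), ("two", 2), ("three", 3), ("four", 4))

val result = appendConstantColumns(columnsNames.zip(columnValues), data.toDF())
result.show()
```

If the value should simply equal the column name, as in appendColumns, passing columnsNames.zip(columnsNames) would reproduce that behaviour. The select variant may be preferable when the list of names is long, since each withColumn call adds another projection to the query plan.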
In appendColumns the column name is the same as the column value, while in the expected output dataframe the value for e.g. col1 is val1. Can it be the same (column name and value), or do you want them to be separate? - Shaido