
Suppose I want to add several columns together into one column. Given a DataFrame df:

col1 col2 col3
1    2    3
3    4    5

I want to add a SUM column:

col1 col2 col3 SUM
1    2    3    6
3    4    5    12

df.withColumn("SUM", col("col1") + col("col2") + col("col3"))

but I would like to do this dynamically:

colarray = ["col1", "col2", "col3"]

df.withColumn("SUM", *[col(x) for x in colarray])

but I am not sure where to place the plus '+' in there.

Does anyone have an idea? (mytabi)

1 Answer


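You can use the following commands (Scala):

import org.apache.spark.sql.functions.col
// Build one Column per schema field, then fold them together with +.
val colList = df.schema.fields.map(field => col(field.name))
val newDf = df.withColumn("Sum", colList.reduce(_ + _))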

How these commands work:

  1. The first command builds a list of all the columns in the DataFrame.
  2. The second command creates a new DataFrame with an extra column named Sum, which holds the sum of all the columns.
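
Since the question is written in PySpark, the same fold can probably be expressed in Python with functools.reduce. The sketch below is only an illustration: sum_columns is a hypothetical helper name, and Expr is a made-up stand-in for a Spark Column so the snippet runs without a Spark session. It only records where each '+' ends up.

```python
from functools import reduce
from operator import add


def sum_columns(names, col):
    """Fold col(name) for every name into one expression joined by +."""
    # reduce(add, [a, b, c]) evaluates (a + b) + c
    return reduce(add, [col(name) for name in names])


class Expr:
    """Hypothetical stand-in for a Spark Column; it only records the expression text."""

    def __init__(self, text):
        self.text = text

    def __add__(self, other):
        return Expr(f"({self.text} + {other.text})")


colarray = ["col1", "col2", "col3"]
result = sum_columns(colarray, Expr)
print(result.text)  # ((col1 + col2) + col3)
```

With real PySpark, assuming `from pyspark.sql.functions import col`, the call would be `df.withColumn("SUM", sum_columns(colarray, col))`. Note that reduce raises a TypeError on an empty list, so colarray needs at least one column name.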