
I have a String variable containing a few column names separated by commas. For example:

val temp = "Col2, Col3, Col4"

I have a DataFrame that I want to group by certain columns, including the columns stored in the temp variable. For example, my groupBy statement should behave like the following statement:

DF.groupBy("Col1", "Col2", "Col3", "Col4")

The temp variable may hold any column names, so I want to build a groupBy statement that picks up the value of temp dynamically, along with the column names I supply manually.

I tried the following statement, but to no avail:

DF.groupBy("Col1", temp)

Then I split the value of temp on the comma, stored the result in another variable, and tried to pass that to the groupBy statement. That fails as well.

val temp1 = temp.split(",")

DF.groupBy("Col1", temp1)

Any ideas on how I can enclose the values of a List variable in double quotes and pass them to a groupBy statement?

DF.groupBy("Col1", temp1: _*) This assumes that groupBy() takes any number of String arguments via the standard varargs syntax. - jwvh
Thanks @jwvh for your valuable input. - JKC

1 Answer


Use varargs:

df.groupBy("Col1", temp1: _*)

or

import org.apache.spark.sql.functions.col

df.groupBy(("Col1" +: temp1).map(col): _*)
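
To see the varargs expansion in isolation, here is a minimal sketch in plain Scala that does not need Spark. The function groupByNames is a hypothetical stand-in that only mirrors the parameter shape of groupBy(col1: String, cols: String*); it is not part of any library. The sketch also trims each name after splitting, on the assumption that the leading spaces left by split(",") on "Col2, Col3, Col4" are not part of the real column names.

```scala
object VarargsSketch {
  // Hypothetical stand-in with the same parameter shape as
  // groupBy(col1: String, cols: String*); it just returns the names it received.
  def groupByNames(col1: String, cols: String*): Seq[String] = col1 +: cols

  val temp = "Col2, Col3, Col4"

  // Split on commas and trim each piece, so " Col3" becomes "Col3".
  val temp1: Array[String] = temp.split(",").map(_.trim)

  // `: _*` expands the array into individual String arguments.
  val grouped: Seq[String] = groupByNames("Col1", temp1: _*)

  def main(args: Array[String]): Unit =
    println(grouped.mkString(", ")) // prints "Col1, Col2, Col3, Col4"
}
```

With a real DataFrame the same temp1 can be passed as df.groupBy("Col1", temp1: _*). Without the trim, the names " Col3" and " Col4" would keep their leading space and would likely not match the DataFrame's columns.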