I'm using spark-sql 2.4.1 with Java 8. I have a scenario where I need to add a when() condition on a column, but only if that column exists in the respective dataframe. How can this be done?
Ex :
val df = ... // may contain column "abc", plus "x" or "y" or both, depending on some business logic
val result_df = df
.withColumn("new_column", when(col("abc") === "a" , concat(col("x"),lit("_"),col("y"))))
// The problem: sometimes df does not contain (or fetch) the "x" column, and then the
// statement above throws an error because "x" is not present in df at that point.
// In that case result_df should just use the "y" value.
So how can I check whether a column (i.e. "x") is present, use it in the concat() if so, and otherwise fall back to the remaining column (i.e. "y")?
The reverse is also possible: only col("x") is present in the df but not col("y"). When both columns x and y are available in the df, the statement works fine.
Question: how do I apply the when clause only when the column is present in the dataframe?
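To make the check concrete: the existence test itself is just Scala collection logic over df.columns (which Spark exposes as an Array[String]). A minimal sketch, with a plain Seq standing in for the dataframe's columns and a hypothetical helper name of my own choosing:

```scala
// Sketch: hasColumn is a hypothetical helper; in real Spark code the
// columns Seq would come from df.columns (an Array[String]).
object ColumnCheck {
  def hasColumn(columns: Seq[String], name: String): Boolean =
    columns.contains(name)

  def main(args: Array[String]): Unit = {
    val cols = Seq("abc", "y") // e.g. df.columns when "x" was not fetched
    println(hasColumn(cols, "x")) // false: skip any expression that references "x"
    println(hasColumn(cols, "y")) // true
  }
}
```

The idea would be to gate each withColumn call on such a check rather than referencing a possibly missing column inside the expression.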
One correction to the question: if a column is not there, I should not enter that withColumn call at all.
Ex :
If column x is present:
val result_df = df
.withColumn("new_x", when(col("abc") === "a" , concat(col("x"))))
If column y is present:
val result_df = df
.withColumn("new_y", when(col("abc") === "a" , concat(col("y"))))
If both columns x and y are present:
val result_df = df
.withColumn("new_x", when(col("abc") === "a" , concat(col("x"))))
.withColumn("new_y", when(col("abc") === "a" , concat(col("y"))))
.withColumn("new_x_y", when(col("abc") === "a" , concat(col("x"),lit("_"),col("y"))))
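The three cases above can be sketched in plain Scala (Spark-free so it runs standalone): given the columns actually present, decide which new columns to derive. In real code each name in the result would correspond to one .withColumn(...) call chained onto df; the object and method names here are my own illustration, not an existing API:

```scala
// Sketch of the branching: "cols" stands in for df.columns.
// Each returned name maps to one conditional .withColumn(...) call.
object NewColumnPlan {
  def plannedColumns(cols: Seq[String]): Seq[String] = {
    val hasX = cols.contains("x")
    val hasY = cols.contains("y")
    Seq(
      if (hasX) Some("new_x") else None,           // .withColumn("new_x", ...)
      if (hasY) Some("new_y") else None,           // .withColumn("new_y", ...)
      if (hasX && hasY) Some("new_x_y") else None  // .withColumn("new_x_y", ...)
    ).flatten
  }

  def main(args: Array[String]): Unit = {
    println(plannedColumns(Seq("abc", "x")))      // List(new_x)
    println(plannedColumns(Seq("abc", "y")))      // List(new_y)
    println(plannedColumns(Seq("abc", "x", "y"))) // List(new_x, new_y, new_x_y)
  }
}
```

Because withColumn returns a new DataFrame, the conditional calls could presumably be chained by reassigning a var or folding over the planned columns.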