I'm trying to implement a CASE expression on a DataFrame I have, but I'm getting an error. I want to implement this with built-in Spark functions (withColumn, when, otherwise):
CASE WHEN vehicle="BMW"
AND MODEL IN ("2020","2019","2018","2017")
AND value> 100000 THEN 1
ELSE 0 END AS NEW_COLUMN
Currently I have this:
DF.withColumn(NEW_COLUMN, when(col(vehicle) === "BMW"
and col(model) isin(listOfYears:_*)
and col(value) > 100000, 1).otherwise(0))
But I'm getting a data type mismatch error (boolean and string). I understand that my condition mixes booleans and strings, which is causing the error. What's the correct syntax for executing a CASE like that one? Also, I was using && instead of and, but the third && was giving me a "cannot resolve symbol &&" error.
Thanks for the help!
NEW_COLUMN, vehicle, model etc. are variables of type String? If so, this code runs fine. Do you have implicits imported? - Raphael Roth
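One likely explanation for both errors is Scala's infix operator precedence rather than the when/otherwise syntax itself. Alphanumeric infix operators such as and and isin all share the lowest precedence and associate to the left, so `a === "BMW" and b isin (...)` is read as `(a === "BMW" and b) isin (...)`. That would apply a boolean operation to the string column, which matches the reported mismatch. Calling isin with dot notation and combining the conditions with && avoids the ambiguity. Below is a minimal sketch under some assumptions: the column names are written as string literals, and the object name CaseWhenSketch, the helper flagRecentBmw and the sample rows are hypothetical, not taken from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Hypothetical wrapper object, used only to keep the sketch self-contained.
object CaseWhenSketch {

  // Adds NEW_COLUMN = 1 when all three conditions hold, otherwise 0.
  // Dot notation on isin keeps it bound to col("model") only.
  def flagRecentBmw(df: DataFrame, listOfYears: Seq[String]): DataFrame =
    df.withColumn(
      "NEW_COLUMN",
      when(
        col("vehicle") === "BMW" &&
          col("model").isin(listOfYears: _*) &&
          col("value") > 100000,
        1
      ).otherwise(0)
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("case-when-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows for illustration.
    val df = Seq(
      ("BMW", "2019", 150000),
      ("BMW", "2015", 150000),
      ("Audi", "2019", 150000),
      ("BMW", "2020", 50000)
    ).toDF("vehicle", "model", "value")

    flagRecentBmw(df, Seq("2020", "2019", "2018", "2017")).show()
    spark.stop()
  }
}
```

If you prefer to keep and, wrapping each condition in its own parentheses, for example `(col("model").isin(listOfYears: _*))`, should have the same effect.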