I wrote the function below:
object AgeClassification {
  // Maps an age in years to a coarse category label.
  def AgeCategory(age: Int): String = {
    if (age <= 30) "Young"
    else if (age >= 65) "Older"
    else "Mid-age"
  }
}
and I am trying to pass a DataFrame column to it as the parameter:
val df_new = df
.withColumn("Age_Category", AgeClassification.AgeCategory(df("age")))
but I get this error:
<console>:33: error: type mismatch;
found : org.apache.spark.sql.Column
required: Int
val df_new = df.withColumn("Age_Category",AgeClassification.AgeCategory(df("age")))
How can I pass a column as the parameter?
I also tried the following variations, and each fails with a different error:
val df_new = df
.withColumn("Age_Category",AgeClassification.AgeCategory(df.age.cast(IntegerType)))
<console>:33: error: value age is not a member of org.apache.spark.sql.DataFrame
val df_new = df.withColumn("Age_Category",AgeClassification.AgeCategory(df.age.cast(IntegerType)))
val df_new = df
.withColumn("Age_Category", AgeClassification.AgeCategory(df("age").cast(Int)))
<console>:33: error: overloaded method value cast with alternatives:
(to: String)org.apache.spark.sql.Column
(to: org.apache.spark.sql.types.DataType)org.apache.spark.sql.Column
cannot be applied to (Int.type)
val df_new = df.withColumn("Age_Category",AgeClassification.AgeCategory(df("age").cast(Int)))
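For reference, all three errors above come from the same mismatch: AgeCategory expects a plain Int, while df("age") is a Column expression that has no value until Spark evaluates it row by row. A minimal sketch of two common ways around this, assuming Spark 2.x or later and a local SparkSession (the sample data, the AgeCategoryExample object, and the names ageCategoryUdf, viaUdf and viaWhen are hypothetical, introduced only for illustration):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf, when}

object AgeClassification {
  // Same logic as the original function, operating on a plain Int.
  def AgeCategory(age: Int): String =
    if (age <= 30) "Young"
    else if (age >= 65) "Older"
    else "Mid-age"
}

object AgeCategoryExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session for demonstration purposes only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("AgeCategoryExample")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for the real df.
    val df = Seq(25, 40, 70).toDF("age")

    // Option 1: wrap the Scala function in a UDF so it accepts a Column.
    val ageCategoryUdf = udf(AgeClassification.AgeCategory _)
    val viaUdf = df.withColumn(
      "Age_Category",
      ageCategoryUdf(col("age").cast("int")) // cast takes a type name or a DataType, not Int
    )

    // Option 2: express the same rules with built-in column functions.
    val viaWhen = df.withColumn(
      "Age_Category",
      when(col("age") <= 30, "Young")
        .when(col("age") >= 65, "Older")
        .otherwise("Mid-age")
    )

    viaUdf.show()
    viaWhen.show()
    spark.stop()
  }
}
```

The when/otherwise form is usually preferable when the rules are this simple, since the built-in functions stay visible to Spark's optimizer, whereas a UDF is treated as opaque. The UDF form is mainly useful when the existing Scala function should be reused unchanged.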