I have a DataFrame with two ArrayType columns, and I want to find the difference between them. column1 will always have values, while column2 may be an empty array. I created the following UDF, but it is not working.
df.show()
gives the following records.
Sample data:
["Test", "Test1", "Test3", "Test2"], ["Test", "Test1"]
Code:
sc.udf.register("diff", (value: Column, value1: Column) => {
  value.asInstanceOf[Seq[String]].diff(value1.asInstanceOf[Seq[String]])
})
Expected output:
["Test2","Test3"]
Spark version: 1.4.1. Any help will be appreciated.
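For reference, a likely cause is the parameter types: a UDF receives the values stored in each row, not Column objects, so an ArrayType(StringType) column would normally be declared as Seq[String] and no cast is needed. Also, udf.register lives on SQLContext rather than SparkContext, so this only applies if sc above is really a SparkContext. The following is a minimal sketch under those assumptions. The names ArrayDiff, arrayDiff, register and diffUdf are hypothetical, and the null guard is an extra precaution that the question does not ask for.

```scala
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.udf

object ArrayDiff {
  // Elements of `left` that are not in `right`, using Seq.diff
  // (a multiset difference that keeps the order of `left`).
  // Assumption: a null `right` is treated like an empty array.
  def arrayDiff(left: Seq[String], right: Seq[String]): Seq[String] =
    if (right == null) left else left.diff(right)

  // Registers the function under the name "diff" for use in SQL expressions.
  def register(sqlContext: SQLContext): Unit =
    sqlContext.udf.register("diff", arrayDiff _)

  // The same function wrapped for the DataFrame API.
  val diffUdf = udf(arrayDiff _)
}
```

With the column names from the question, this could be called as df.withColumn("diff", ArrayDiff.diffUdf(df("column1"), df("column2"))), or as df.selectExpr("diff(column1, column2)") after ArrayDiff.register(sqlContext). Note that Seq.diff keeps the order of the first sequence, so the sample row would give Test3 before Test2.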
value – undefined_variable
collection.SeqLike.diff – Ram Ghadiyaram