Hi, I'm new to Spark and would like to know how to do string manipulation across columns: essentially Column2 - Column1 to get Column3.
Note: my data is in a DataFrame.
Basically, I have two string columns, and I want to keep only the values that exist in Column2 but not in Column1, so that I can produce them as Column3.
Column1
SAMPLE_OUT_3_APPLE|BANANA|GUAVA|ORANGE
Column2
SAMPLE_OUT_3_APPLE|BANANA|GUAVA|GRAPES|ORANGE|BERRY
Then Column3 should be...
Column3
GRAPES,BERRY
For Column1 and Column2, I also want to show the values with the SAMPLE_OUT_3_ prefix removed and comma-delimited instead of pipe-delimited, e.g. for Column1:
APPLE,BANANA,GUAVA,ORANGE
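
To make the expected result concrete, here is a minimal plain-Python sketch of the logic I'm after. It is not Spark code. The helper names (`to_items`, `clean`, `difference`) are made up for illustration, and it assumes the prefix is always the literal `SAMPLE_OUT_3_` and the values are always pipe-delimited, as in the sample rows.

```python
# Assumed constant prefix, taken from the sample rows above.
PREFIX = "SAMPLE_OUT_3_"


def to_items(raw):
    """Strip the assumed prefix and split the pipe-delimited value into a list."""
    if raw.startswith(PREFIX):
        raw = raw[len(PREFIX):]
    return [item for item in raw.split("|") if item]


def clean(raw):
    """Return the value without the prefix, comma-delimited."""
    return ",".join(to_items(raw))


def difference(col1, col2):
    """Return the items present in col2 but not in col1, keeping col2's order."""
    seen = set(to_items(col1))
    return ",".join(item for item in to_items(col2) if item not in seen)


column1 = "SAMPLE_OUT_3_APPLE|BANANA|GUAVA|ORANGE"
column2 = "SAMPLE_OUT_3_APPLE|BANANA|GUAVA|GRAPES|ORANGE|BERRY"

print(clean(column1))                # APPLE,BANANA,GUAVA,ORANGE
print(clean(column2))                # APPLE,BANANA,GUAVA,GRAPES,ORANGE,BERRY
print(difference(column1, column2))  # GRAPES,BERRY
```

In Spark, I assume this could map onto a UDF wrapping these functions, or onto built-in column functions such as `regexp_replace`, `split`, `array_except` (Spark 2.4+), and `concat_ws`, but I have not verified that part.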