I have a Spark DataFrame like the following:
+----------+-------------------------------------------------+
|col1 |words |
+----------+-------------------------------------------------+
|An |[An, attractive, ,, thin, low, profile] |
|attractive|[An, attractive, ,, thin, low, profile] |
|, |[An, attractive, ,, thin, low, profile] |
|thin |[An, attractive, ,, thin, low, profile] |
|rail |[An, attractive, ,, thin, low, profile] |
|profile |[An, attractive, ,, thin, low, profile] |
|Lighter |[Lighter, than, metal, ,, Level, ,, and, tes] |
|than |[Lighter, than, metal, ,, Level, ,, and, tww] |
|steel |[Lighter, than, metal, ,, Level, ,, and, test] |
|, |[Lighter, than, metal, ,, Level, ,, and, Test] |
|Level |[Lighter, than, metal, ,, Level, ,, and, test] |
|, |[Lighter, than, metal, ,, Level, ,, and, ste] |
|and |[Lighter, than, metal, ,, Level, ,, and, ste] |
|Test |[Lighter, than, metal, ,, Level, ,, and, Ste] |
|Renewable |[Renewable, resource] |
|Resource |[Renewable, resource] |
|No |[No1, Bal, testme, saves, time, and, money] |
+----------+-------------------------------------------------+
I want to filter the rows on the words column above, matching case-insensitively. Currently I am doing it like this:
df.filter(array('words, "level")).show(false)
But it does not return any rows. Please help me resolve the issue.
words column (that seems to be of array type)? Why not use col1 instead since it's already available? - Jacek Laskowski
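
One possible approach, sketched under some assumptions: if the filter in the question is meant as an exact membership test (for example with array_contains), it is case sensitive, so "level" would not match the element "Level". The sketch below lowercases each array element before comparing. It assumes Spark 3.0 or later, where the Scala function exists(column, predicate) is available in org.apache.spark.sql.functions. The object name CaseInsensitiveArrayFilter, the helper containsIgnoreCase, and the small sample DataFrame are hypothetical and only stand in for the data shown in the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, exists, lit, lower}

object CaseInsensitiveArrayFilter {

  // Hypothetical helper: keeps rows whose `words` array has at least one
  // element equal to `word`, ignoring case. Requires Spark 3.0+ for `exists`.
  def containsIgnoreCase(df: DataFrame, word: String): DataFrame =
    df.filter(exists(col("words"), w => lower(w) === lit(word.toLowerCase)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("case-insensitive-array-filter")
      .getOrCreate()
    import spark.implicits._

    // Small stand-in for the DataFrame in the question (not the real data).
    val df = Seq(
      ("An", Seq("An", "attractive", ",", "thin", "low", "profile")),
      ("Level", Seq("Lighter", "than", "metal", ",", "Level", ",", "and", "test")),
      ("Renewable", Seq("Renewable", "resource"))
    ).toDF("col1", "words")

    // Only the second row contains "Level", which matches "level" ignoring case.
    containsIgnoreCase(df, "level").show(false)

    spark.stop()
  }
}
```

On Spark 2.4, where the Scala exists function is not available, the same predicate can probably be written as a SQL expression instead: df.filter(expr("exists(words, w -> lower(w) = 'level')")). On older versions a UDF over the array would be the usual fallback.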