0
votes

I have a pandas DataFrame with a list-valued column, loaded from this CSV:

id,classes,text
71,`["performer_146", "performer_42"]`,`adipiscing urna. molestie `
72,["performer_42"],`a ligula odio elementum, neque suscipit. egestas Maecenas`
73,["performer_146"],`vestibulum orci nec vestibulum, ligula orci et mauris lobortis, et Aliquam`
74,["performer_0"],tincidunt non interdum nunc ultrices mi accumsan elementum arcu venenatis
75,`["performer_146", "performer_42"]`, orci elementum non finibus dolor. Cras
76,`["performer_42", "performer_146"]`,`mi lectus Maecenas eleifend neque amet, `
77,["performer_146"],` platea placerat. odio Morbi rutrum, eu Cras`

I read this CSV and convert the values of the "classes" column into lists:

import pandas as pd
import ast

df = pd.read_csv(filename, quotechar='`')
df['classes'] = df['classes'].apply(ast.literal_eval)

Now I want to select the rows whose "classes" list contains "performer_0", like this:

df['performer_0' in df['classes']]

But this code doesn't work:

Traceback (most recent call last):
  File "d:\pyenv\pandas\lib\site-packages\pandas\core\indexes\base.py", line 2657, in get_loc
    return self._engine.get_loc(key)
  File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "", line 1, in
  File "d:\pyenv\pandas\lib\site-packages\pandas\core\frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "d:\pyenv\pandas\lib\site-packages\pandas\core\indexes\base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas\_libs\index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: False

How can I do that?


2 Answers

0
votes

The easiest way I found is to combine apply with boolean indexing:

df[df['classes'].apply(lambda x: 'performer_0' in x)]
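For context, here is a minimal sketch of why the expression in the question fails and how the apply mask differs. It uses a small hypothetical DataFrame in place of the CSV. Applied to a Series, `in` tests the index labels rather than the values, so the expression collapses to a single False, and df[False] is then looked up as a column name, which would explain the KeyError: False.

```python
import pandas as pd

# Hypothetical stand-in for the CSV from the question (three of its rows)
df = pd.DataFrame({
    "id": [71, 72, 74],
    "classes": [
        ["performer_146", "performer_42"],
        ["performer_42"],
        ["performer_0"],
    ],
})

# `in` on a Series checks the index labels (0, 1, 2), not the lists inside it,
# so this is one plain boolean rather than a per-row mask
label_check = "performer_0" in df["classes"]

# apply evaluates the membership test once per row and returns a boolean Series
mask = df["classes"].apply(lambda x: "performer_0" in x)

# A boolean Series used as an indexer keeps only the rows where it is True
selected = df[mask]
print(selected)
```

The lambda runs in Python for every row, so this is simple rather than fast, which should be fine for a column of short lists.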
0
votes

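If you are on pandas 0.25 or later, you can use explode:

df[df['classes'].explode().eq('performer_0').groupby(level=0).any()]

explode puts each list element on its own row while repeating the original index, so grouping by the index (level=0) and calling any() gives one boolean per original row. On pandas versions before 2.0, .any(level=0) can be used in place of .groupby(level=0).any().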
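A runnable sketch of the explode approach on a small hypothetical DataFrame, showing the intermediate steps. It assumes pandas 0.25 or later and collapses the exploded rows with groupby(level=0).any(), since the level argument of Series.any is not available in pandas 2.x.

```python
import pandas as pd

# Hypothetical stand-in for the CSV from the question (three of its rows)
df = pd.DataFrame({
    "id": [71, 72, 74],
    "classes": [
        ["performer_146", "performer_42"],
        ["performer_42"],
        ["performer_0"],
    ],
})

# One row per list element; the original index label is repeated per element
exploded = df["classes"].explode()

# Compare every element, then reduce back to one boolean per original row
mask = exploded.eq("performer_0").groupby(level=0).any()

# The mask is indexed like df, so it can be used directly as a row selector
result = df[mask]
print(result)
```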