Selecting rows with a string index that contains a bracket

Question

My table review_cp is indexed on beer names. I got the top three beer names through the following code.

top_3_spacy = review_cp.groupby('Name')['Average Evaluation Score'].mean().sort_values(ascending=False).index[:3].tolist()

The result is ['Rodenbach Caractère Rouge', 'Dorothy (Wine Barrel Aged)', 'Doubleganger'].

However, when I tried to select rows using review_cp.loc[top_3_spacy[0]], it raised a KeyError.

KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'Rodenbach Caractère Rouge'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
----> 1 review_cp.loc[top_3_spacy[0]]

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   1422
   1423             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1424             return self._getitem_axis(maybe_callable, axis=axis)
   1425
   1426     def _is_scalar_access(self, key: Tuple):

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
   1848         # fall thru to straight lookup
   1849         self._validate_key(key, axis)
-> 1850         return self._get_label(key, axis=axis)
   1851
   1852

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _get_label(self, label, axis)
    158             raise IndexingError("no slices here, handle elsewhere")
    159
--> 160         return self.obj._xs(label, axis=axis)
    161
    162     def _get_loc(self, key: int, axis: int):

~\Anaconda3\lib\site-packages\pandas\core\generic.py in xs(self, key, axis, level, drop_level)
   3735             loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
   3736         else:
-> 3737             loc = self.index.get_loc(key)
   3738
   3739             if isinstance(loc, np.ndarray):

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2897                 return self._engine.get_loc(key)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2901         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'Rodenbach Caractère Rouge'

I tried another approach, review_cp[review_cp['Name'].str.contains(top_3_spacy[0])]. It worked for 'Rodenbach Caractère Rouge' and 'Doubleganger', but not for 'Dorothy (Wine Barrel Aged)'. Could the brackets be the cause?

Can you please post the full error message? It would help to understand the issue. — Dishin H Goyani

DDD1 · Accepted Answer · 2020-10-03T15:50:43

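The bracket is not a problem for exact matching, since it is simply part of the string: as long as the string matches a name in the "Name" column, the row will be found. It does matter for str.contains, which treats its argument as a regular expression by default, so the parentheses are read as a group rather than as literal characters. If you want the rows for your top 3 list, instead of using loc, you can use: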

review_cp[review_cp['Name'].isin(top_3_spacy)]

That will isolate your top 3 names (and it should include Dorothy).
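To make the difference concrete, here is a minimal sketch on a toy stand-in for review_cp. The scores and the extra "Some Other Beer" row are invented for illustration, and the real table may well be shaped differently. It compares exact matching with isin against str.contains, and shows two ways of making str.contains match the parentheses literally. The Int64Engine frames in the traceback also suggest that review_cp has an integer index rather than being indexed on "Name", so the last line assumes that and sets the index before calling loc.

```python
import re
import pandas as pd

# Toy stand-in for review_cp; the scores are made up for illustration.
review_cp = pd.DataFrame({
    "Name": [
        "Rodenbach Caractère Rouge",
        "Dorothy (Wine Barrel Aged)",
        "Doubleganger",
        "Some Other Beer",
    ],
    "Average Evaluation Score": [4.7, 4.6, 4.5, 3.0],
})
top_3_spacy = ["Rodenbach Caractère Rouge", "Dorothy (Wine Barrel Aged)", "Doubleganger"]

# Exact matching: the parentheses are ordinary characters here.
exact = review_cp[review_cp["Name"].isin(top_3_spacy)]

# str.contains interprets the pattern as a regex by default, so "(...)" is a
# group and the literal parentheses in the name are not matched.
# (pandas may emit a UserWarning about match groups here.)
as_regex = review_cp[review_cp["Name"].str.contains(top_3_spacy[1])]

# Two ways to match the text literally instead.
literal = review_cp[review_cp["Name"].str.contains(top_3_spacy[1], regex=False)]
escaped = review_cp[review_cp["Name"].str.contains(re.escape(top_3_spacy[1]))]

# .loc looks up index labels, so "Name" has to be the index first
# (assumption: the original frame has a default integer index).
by_name = review_cp.set_index("Name").loc[top_3_spacy]
```

In this sketch exact, literal, escaped and by_name all include the Dorothy row, while as_regex comes back empty.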
