1
votes

My table review_cp is indexed on beer names. I got the top three beer names through the following code.

top_3_spacy = review_cp.groupby('Name')['Average Evaluation Score'].mean().sort_values(ascending=False).index[:3].tolist()

The results are ['Rodenbach Caractère Rouge', 'Dorothy (Wine Barrel Aged)', 'Doubleganger']

However, when I tried to select rows using review_cp.loc[top_3_spacy[0]], it raised a KeyError.

KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2896             try:
-> 2897                 return self._engine.get_loc(key)
   2898             except KeyError:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'Rodenbach Caractère Rouge'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
----> 1 review_cp.loc[top_3_spacy[0]]

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   1422
   1423             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1424             return self._getitem_axis(maybe_callable, axis=axis)
   1425
   1426     def _is_scalar_access(self, key: Tuple):

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _getitem_axis(self, key, axis)
   1848         # fall thru to straight lookup
   1849         self._validate_key(key, axis)
-> 1850         return self._get_label(key, axis=axis)
   1851
   1852

~\Anaconda3\lib\site-packages\pandas\core\indexing.py in _get_label(self, label, axis)
    158             raise IndexingError("no slices here, handle elsewhere")
    159
--> 160         return self.obj._xs(label, axis=axis)
    161
    162     def _get_loc(self, key: int, axis: int):

~\Anaconda3\lib\site-packages\pandas\core\generic.py in xs(self, key, axis, level, drop_level)
   3735             loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
   3736         else:
-> 3737             loc = self.index.get_loc(key)
   3738
   3739         if isinstance(loc, np.ndarray):

~\Anaconda3\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2897                 return self._engine.get_loc(key)
   2898             except KeyError:
-> 2899                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2900         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2901         if indexer.ndim > 1 or indexer.size > 1:

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas\_libs\index_class_helper.pxi in pandas._libs.index.Int64Engine._check_type()

KeyError: 'Rodenbach Caractère Rouge'


I then tried another approach, review_cp[review_cp['Name'].str.contains(top_3_spacy[0])]. It worked for 'Rodenbach Caractère Rouge' and 'Doubleganger', but not for 'Dorothy (Wine Barrel Aged)'. I wonder if that is because of the brackets?

2
Can you please post the full error message? It would help to understand the issue. - Dishin H Goyani

2 Answers

0
votes

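The brackets are only a problem for str.contains, which treats its argument as a regular expression by default: "(Wine Barrel Aged)" is read as a group rather than as literal brackets, so the pattern no longer matches the name. Passing regex=False, or escaping the pattern with re.escape, makes it a literal match. With exact matching the brackets are simply part of the string, and as long as the string equals a name in the "Name" column there shouldn't be a problem. If you want to get the rows of your top 3 list, instead of using loc, you can use: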

review_cp[review_cp['Name'].isin(top_3_spacy)]

That will select the rows for all of your top 3 names (and it should include 'Dorothy (Wine Barrel Aged)').
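To make the difference between exact and regex matching concrete, here is a minimal sketch. The small DataFrame is a made-up stand-in for review_cp (the scores and the fourth beer are invented), and it assumes 'Name' is an ordinary column rather than the index.

```python
import re
import pandas as pd

# Hypothetical stand-in for review_cp: default integer index, 'Name' as a column.
review_cp = pd.DataFrame({
    "Name": [
        "Rodenbach Caractère Rouge",
        "Dorothy (Wine Barrel Aged)",
        "Doubleganger",
        "Some Other Beer",
    ],
    "Average Evaluation Score": [4.9, 4.8, 4.7, 3.0],
})
top_3_spacy = ["Rodenbach Caractère Rouge", "Dorothy (Wine Barrel Aged)", "Doubleganger"]

# Exact matching: brackets are just characters in the string.
top_rows = review_cp[review_cp["Name"].isin(top_3_spacy)]

# Default str.contains: the pattern is a regex, so "(...)" is a group and the
# literal brackets in the column value are never matched.
# (pandas may emit a UserWarning about match groups here.)
regex_hits = review_cp[review_cp["Name"].str.contains(top_3_spacy[1])]

# Two ways to get a literal substring match instead.
literal_hits = review_cp[review_cp["Name"].str.contains(top_3_spacy[1], regex=False)]
escaped_hits = review_cp[review_cp["Name"].str.contains(re.escape(top_3_spacy[1]))]

print(len(top_rows), len(regex_hits), len(literal_hits), len(escaped_hits))
```

Note that str.contains does substring matching, so even with regex=False it would also pick up longer names that merely contain the search string; isin only matches whole names.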

0
votes

I think the problem is that Name is not your index but an ordinary column; otherwise review_cp['Name'] would not work. The Int64Engine frames in your traceback point the same way: the index is most likely the automatic integer index. .loc looks up labels in the index, so it can't find a beer name among the integer labels of your DataFrame.

You can resolve this by using a boolean mask, as in DDD1's answer.

review_cp['Name'].isin(top_3_spacy)

This creates a mask that selects the rows whose name is in the list.
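If you would rather keep using .loc with beer names, one option is to make Name the index first. The following is only a sketch on invented data (the frame below is a hypothetical stand-in for review_cp with a default integer index), showing both the failing lookup and the set_index variant.

```python
import pandas as pd

# Hypothetical stand-in for review_cp: one row per review, default integer index.
review_cp = pd.DataFrame({
    "Name": [
        "Rodenbach Caractère Rouge",
        "Rodenbach Caractère Rouge",
        "Dorothy (Wine Barrel Aged)",
        "Doubleganger",
        "Some Other Beer",
    ],
    "Average Evaluation Score": [4.9, 4.9, 4.8, 4.7, 3.0],
})
top_3_spacy = ["Rodenbach Caractère Rouge", "Dorothy (Wine Barrel Aged)", "Doubleganger"]

# .loc searches the index labels (0..4 here), so a beer name is not found.
try:
    review_cp.loc[top_3_spacy[0]]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# With Name as the index, label-based selection works.
by_name = review_cp.set_index("Name")
first_beer = by_name.loc[[top_3_spacy[0]]]  # list form always returns a DataFrame
top_rows = by_name.loc[top_3_spacy]         # all rows for the three names

print(lookup_failed, len(first_beer), len(top_rows))
```

Passing a list to .loc keeps the result a DataFrame even when a name occurs only once, which avoids having to handle a Series in that case.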