0
votes

I am often confused by pandas indexing operations. For example:

import pandas as pd
raw_data = {'regiment': ['Nighthawks', 'Nighthawks', 'Nighthawks', 'Nighthawks', 'Dragoons', 'Dragoons', 'Dragoons', 'Dragoons', 'Scouts', 'Scouts', 'Scouts', 'Scouts'], 
    'company': ['1st', '1st', '2nd', '2nd', '1st', '1st', '2nd', '2nd','1st', '1st', '2nd', '2nd'], 
    'name': ['Miller', 'Jacobson', 'Ali', 'Milner', 'Cooze', 'Jacon', 'Ryaner', 'Sone', 'Sloan', 'Piger', 'Riani', 'Ali'], 
    'preTestScore': [4, 24, 31, 2, 3, 4, 24, 31, 2, 3, 2, 3],
    'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}
df = pd.DataFrame(raw_data, columns = ['regiment', 'company', 'name', 'preTestScore', 'postTestScore'])

def get_stats(group):
    return {'min': group.min(), 'max': group.max(), 'count': group.count(), 'mean': group.mean()}
bins = [0, 25, 50, 75, 100]
group_names = ['Low', 'Okay', 'Good', 'Great']
df['categories'] = pd.cut(df['postTestScore'], bins, labels=group_names)
des = df['postTestScore'].groupby(df['categories']).apply(get_stats).unstack()
des.at['Good','mean']

I got the following error:

TypeError                                 Traceback (most recent call last)
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()

TypeError: an integer is required

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
in ()
----> 1 des.at['Good','mean']

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexing.py in __getitem__(self, key)
   1867
   1868         key = self._convert_key(key)
-> 1869         return self.obj._get_value(*key, takeable=self._takeable)
   1870
   1871     def __setitem__(self, key, value):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\frame.py in _get_value(self, index, col, takeable)
   1983
   1984             try:
-> 1985                 return engine.get_value(series._values, index)
   1986             except (TypeError, ValueError):
   1987

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

KeyError: 'Good'

How can I do this?

Thanks in advance.

2
Do what exactly? - Mad Physicist

2 Answers

0
votes

The problem is with this line:

des = df['postTestScore'].groupby(df['categories']).apply(get_stats).unstack()

After grouping on 'postTestScore', you get a Series, not a DataFrame, as shown below.

Now, when you try to access a scalar by label with ".at" on des, it doesn't recognize the label 'Good', since it doesn't exist in the Series.

des.at['Good','mean']

Just try printing des and you will see the resulting series.

           count   max   mean   min
categories
Low           2.0  25.0  25.00  25.0
Okay          0.0   NaN    NaN   NaN
Good          8.0  70.0  63.75  57.0
Great         2.0  94.0  94.00  94.0
0
votes

It isn't working because of the CategoricalIndex:

des.index
# Out[322]: CategoricalIndex(['Low', 'Okay', 'Good', 'Great'], categories=['Low', 'Okay', 'Good', 'Great'], ordered=True, name='categories', dtype='category')

Try converting the index to a plain list of labels, like this:

des.index = des.index.tolist()
des.at['Good','mean']
# Out[326]: 63.75
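
For reference, here is a minimal, self-contained sketch of the same idea. It is an illustration under stated assumptions rather than a drop-in fix: the `scores` frame holds only the question's 'postTestScore' values, and `agg()` is swapped in for the question's `apply(get_stats).unstack()` to build an equivalent summary table. It shows two ways to read the value: a label lookup with `.loc` on the CategoricalIndex, and `.at` after converting the index to plain labels as suggested above.

```python
import pandas as pd

# Scores copied from the question's 'postTestScore' column.
scores = pd.DataFrame(
    {'postTestScore': [25, 94, 57, 62, 70, 25, 94, 57, 62, 70, 62, 70]}
)
bins = [0, 25, 50, 75, 100]
group_names = ['Low', 'Okay', 'Good', 'Great']
scores['categories'] = pd.cut(scores['postTestScore'], bins, labels=group_names)

# Assumption: agg() stands in for the question's apply(get_stats).unstack().
# observed=False keeps the empty 'Okay' category in the result.
des = (
    scores.groupby('categories', observed=False)['postTestScore']
    .agg(['min', 'max', 'count', 'mean'])
)

# Option 1: label-based lookup with .loc on the CategoricalIndex.
good_mean_loc = des.loc['Good', 'mean']

# Option 2: replace the CategoricalIndex with plain labels, then use .at.
plain = des.copy()
plain.index = plain.index.tolist()
good_mean_at = plain.at['Good', 'mean']

print(good_mean_loc, good_mean_at)
```

Working on a copy (`plain`) leaves the categorical ordering of `des` intact, which can matter if you later sort or slice by category.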