python - Multiple aggregations of the same column using pandas GroupBy.agg()

Question

Is there a pandas built-in way to apply two different aggregating functions f1, f2 to the same column df["returns"], without having to call agg() multiple times?

Example dataframe:

import pandas as pd
import datetime as dt
import numpy as np

np.random.seed(0)  # `pd.np` was removed in pandas 2.0; call NumPy directly
df = pd.DataFrame({
         "date"    :  [dt.date(2012, x, 1) for x in range(1, 11)], 
         "returns" :  0.05 * np.random.randn(10), 
         "dummy"   :  np.repeat(1, 10)
})

The intuitive, but non-working, way to do it would be:

# Assume `f1` and `f2` are defined for aggregating.
df.groupby("dummy").agg({"returns": f1, "returns": f2})

Obviously, a Python dict cannot hold duplicate keys: the second "returns" entry silently overwrites the first, so only f2 would be applied. Is there another way to express the input to agg()? Perhaps a list of tuples [(column, function)] would work better, to allow multiple functions to be applied to the same column? But agg() seems to accept only a dictionary.

Is there a workaround for this besides defining an auxiliary function that just applies both of the functions inside of it? (How would this work with aggregation anyway?)

From 0.25 onwards, pandas provides a more intuitive syntax for multiple aggregations, as well as renaming output columns. See the documentation on Named Aggregations. — cs95
FYI this question was asked way back on pandas 0.8.x in 9/2012 — smci
FYI the accepted answer is also deprecated - don't pass agg() a dict of dicts. — cs95
@cs95: I know it's deprecated, I'm saying SO is becoming littered with old stale solutions from old versions. SO doesn't have a way of marking that - other than comments. — smci

bmu · Accepted Answer · 2012-11-27T20:57:33

You can simply pass the functions as a list:

In [20]: df.groupby("dummy").agg({"returns": [np.mean, np.sum]})
Out[20]:         
           mean       sum
dummy                    
1      0.036901  0.369012

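or as a dictionary that also renames the output columns (as the comments above note, this nested-dict form is deprecated; it was deprecated in pandas 0.20 and removed in 1.0):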

In [21]: df.groupby('dummy').agg({'returns':
                                  {'Mean': np.mean, 'Sum': np.sum}})
Out[21]: 
        returns          
           Mean       Sum
dummy                    
1      0.036901  0.369012


Pandas >= 0.25: Named Aggregation
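
As a minimal sketch of what named aggregation looks like on the question's data (the output names returns_mean and returns_sum are chosen here for illustration, and the date column is left out for brevity), each keyword argument names an output column and maps it to a (source column, aggregation) pair:

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (dates omitted).
np.random.seed(0)
df = pd.DataFrame({
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

# Named aggregation: keyword = output column name,
# value = (source column, aggregation function or its name).
result = df.groupby("dummy").agg(
    returns_mean=("returns", "mean"),
    returns_sum=("returns", "sum"),
)
print(result)
```

The same column can appear in as many pairs as needed, which sidesteps the duplicate-key problem from the question. A custom callable such as f1 can be passed in place of the string name.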

Pandas < 0.25
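
For older versions, one option that avoids the deprecated dict-of-dicts is to select the column first and pass a list of (output name, function) pairs to the grouped Series. The sketch below assumes that form and uses the illustrative names Mean and Sum from the accepted answer:

```python
import numpy as np
import pandas as pd

# Rebuild a frame shaped like the one in the question (dates omitted).
np.random.seed(0)
df = pd.DataFrame({
    "returns": 0.05 * np.random.randn(10),
    "dummy": np.repeat(1, 10),
})

# Select the column, then give (output name, function) pairs.
# The pairs control the resulting column labels.
renamed = df.groupby("dummy")["returns"].agg([("Mean", "mean"), ("Sum", "sum")])
print(renamed)
```

Because the column is selected before aggregating, the result has flat column labels rather than a two-level header.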