26
votes

I have a pandas dataframe defined as:

A   B   SUM_C      
1   1   10     
1   2   20   

I would like to compute the cumulative sum of SUM_C and add it as a new column to the same dataframe. In other words, my end goal is a dataframe that looks like this:

A   B   SUM_C   CUMSUM_C       
1   1   10      10     
1   2   20      30   

The question "Using cumsum in pandas on group()" shows how to generate a new dataframe in which the SUM_C column is replaced with its cumulative sum. However, what I need is to add the cumulative sum as a new column to the existing dataframe.

Thank you

2 Answers

47
votes

Just apply cumsum on the pandas.Series df['SUM_C'] and assign it to a new column:

df['CUMSUM_C'] = df['SUM_C'].cumsum()

Result:

df
Out[34]: 
   A  B  SUM_C  CUMSUM_C
0  1  1     10       10
1  1  2     20       30
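If the running total should restart for each value of A (the grouped case the question mentions), the same assignment pattern should work with a groupby. A minimal sketch: the A == 1 rows come from the question, while the A == 2 rows and the GROUP_CUMSUM_C column name are invented here purely for illustration.

```python
import pandas as pd

# Rows with A == 1 come from the question; rows with A == 2 are
# hypothetical, added only to show the total restarting per group.
df = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": [1, 2, 1, 2],
    "SUM_C": [10, 20, 5, 7],
})

# Whole-column running total, as in the answer above.
df["CUMSUM_C"] = df["SUM_C"].cumsum()

# Running total that restarts within each group of A.
# GROUP_CUMSUM_C is an illustrative column name.
df["GROUP_CUMSUM_C"] = df.groupby("A")["SUM_C"].cumsum()

print(df)
```

Both assignments keep the original index, so the new columns line up with the existing rows without any merge step.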
0
votes

Overview: you can use DataFrame.aggregate and pass it a user-defined function:

import numpy as np
import pandas as pd

def accumulate(values):
    """Return the running totals of a column.

    args: values, a pandas Series (aggregate passes each column in turn)
    """
    offset = 0
    totals = []
    for i in np.arange(len(values)):
        offset += 1
        # Sum the first `offset` entries of the column.
        totals.append(values.iloc[:offset].sum())

    return totals

A = pd.DataFrame(np.arange(1, 101), columns=['value'])

# Plotting requires matplotlib to be installed.
A.aggregate(accumulate).plot()
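Note that the loop above re-sums a growing slice on every step, so its cost grows quadratically with the length of the column. As a quick check, the sketch below compares that approach against the built-in cumsum on the same 1..100 column; accumulate_loop is a hypothetical stand-in for the function above, not part of pandas.

```python
import numpy as np
import pandas as pd

def accumulate_loop(values):
    """Running totals computed by re-summing each prefix (hypothetical helper)."""
    return [values.iloc[:stop].sum() for stop in range(1, len(values) + 1)]

A = pd.DataFrame(np.arange(1, 101), columns=["value"])

loop_result = accumulate_loop(A["value"])
builtin_result = A["value"].cumsum()

# Check that both approaches agree element by element.
print(loop_result == builtin_result.tolist())
```

Since the results match, df['SUM_C'].cumsum() from the first answer is probably the simpler choice unless the custom function needs to do something cumsum cannot.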