1
votes

I have the following DataFrame in Python:

import pandas as pd

months = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
data1 = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200]
df = pd.DataFrame({
    'month': months,
    'd1': data1,
    'd2': 0,  # placeholder column, to be computed
})

I want to calculate column d2 so that each row is the previous row's d2 plus the current row's d1 and month:

    month    d1      d2
0       1   100   101.0
1       2   200   303.0
2       3   300   606.0
3       4   400  1010.0
4       5   500  1515.0
5       6   600  2121.0
6       7   700  2828.0
7       8   800  3636.0
8       9   900  4545.0
9      10  1000  5555.0
10     11  1100  6666.0
11     12  1200  7878.0

I am doing it in the following way:

df['d2'] = (df['d2'].shift(1) + df['d1']) + df['month']

but the result is not what I expected:

    month    d1      d2
0       1   100     NaN
1       2   200   202.0
2       3   300   303.0
3       4   400   404.0
4       5   500   505.0
5       6   600   606.0
6       7   700   707.0
7       8   800   808.0
8       9   900   909.0
9      10  1000  1010.0
10     11  1100  1111.0
11     12  1200  1212.0

I am not sure whether my request is clear. Thanks to anyone who can help.


2 Answers

0
votes

What you need is a cumulative sum :)

df['d2'] = df.d1.cumsum()
print(df)

    month    d1    d2
0       1   100   100
1       2   200   300
2       3   300   600
3       4   400  1000
4       5   500  1500
5       6   600  2100
6       7   700  2800
7       8   800  3600
8       9   900  4500
9      10  1000  5500
10     11  1100  6600
11     12  1200  7800
0
votes

IIUC, you're looking for cumsum:

df['d2'] = (df.d1+df.month).cumsum()

>>> df
    month    d1    d2
0       1   100   101
1       2   200   303
2       3   300   606
3       4   400  1010
4       5   500  1515
5       6   600  2121
6       7   700  2828
7       8   800  3636
8       9   900  4545
9      10  1000  5555
10     11  1100  6666
11     12  1200  7878
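
Why the shift attempt in the question does not work: the right-hand side is evaluated once, on the whole column, while d2 is still all zeros. So shift(1) contributes 0 for every row (and NaN for the first), and each row ends up as just d1 + month. The intended recurrence, d2[i] = d2[i-1] + d1[i] + month[i], unrolls to a running total of d1 + month, which is what cumsum computes. A minimal sketch, assuming the same data as in the question; the names running and loop_result are made up here for illustration, and the loop is only there to check the vectorized version:

```python
import pandas as pd

df = pd.DataFrame({
    'month': list(range(1, 13)),
    'd1': list(range(100, 1300, 100)),
})

# Vectorized form: running total of (d1 + month).
df['d2'] = (df['d1'] + df['month']).cumsum()

# Explicit loop form of the same recurrence, for comparison only:
# d2[i] = d2[i-1] + d1[i] + month[i], starting from 0.
running = 0
loop_result = []
for d1, month in zip(df['d1'], df['month']):
    running += d1 + month
    loop_result.append(running)

# Both approaches should agree row by row.
assert df['d2'].tolist() == loop_result
print(df)
```

The loop is fine for checking, but for real data the cumsum version is the one to keep, since it avoids iterating over rows in Python.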