3
votes

The following is what I have:

ID Year Score
1 1999 10
1 2000 11
1 2001 14
1 2002 22
2 2000 19
2 2001 17
2 2002 22
3 1998 10
3 1999 12

The following is what I would like to do:

ID Year Score Total
1 1999 10 10
1 2000 11 21
1 2001 14 35
1 2002 22 57
2 2000 19 19
2 2001 17 36
2 2002 22 48
3 1998 10 10
3 1999 12 22

The amount of years and the specific years vary for each Id.

I have a feeling that it's some advanced options in ddply but I have not been able to find the answer. I've also tried working with for/while loops but since these are dreadfully slow in R and my data-set is large, it's not working all that well.

Thanks in advance!

2

2 Answers

2
votes

You can use the sumsum function and apply it with ave to all subgroups.

transform(dat, Total = ave(Score, ID, FUN = cumsum))

  ID Year Score Total
1  1 1999    10    10
2  1 2000    11    21
3  1 2001    14    35
4  1 2002    22    57
5  2 2000    19    19
6  2 2001    17    36
7  2 2002    22    58
8  3 1998    10    10
9  3 1999    12    22
2
votes

If your data is large, then ddply will be slow.

data.table is the way to go.

 library(data.table)
 DT <- data.table(dat)
 # create your desired column in `DT`
 DT[, agg.Score := cumsum(Score), by = ID]