I have a dataset that looks something like this
df <- data.frame("id" = c("Alpha", "Alpha", "Alpha","Alpha","Beta","Beta","Beta","Beta"),
"Year" = c(1970,1971,1972,1973,1974,1977,1978,1990),
"Group" = c(1,NA,1,NA,NA,2,2,NA),
"Val" = c(2,3,3,5,2,5,3,5))
And I would like to create a cumulative sum of "Val". I know how to do the simple cumulative sum
df <- df %>% group_by(id) %>% mutate(cumval=cumsum(Val))
However, I would like my final data to look like this
final <- data.frame("id" = c("Alpha", "Alpha", "Alpha","Alpha","Beta","Beta","Beta","Beta"),
"Year" = c(1970,1971,1972,1973,1974,1977,1978,1990),
"Group" = c(1,NA,1,NA,NA,2,2,NA),
"Val" = c(2,3,3,5,2,5,3,5),
"cumval" = c(2,5,6,11,2,7,5,10))
The basic idea is that when two "Val"'s are of the same "Group" the one happening later (Year) substitutes the previous one. For instance, in the sample dataset, observation 3 has a "cumval" of 6 rather than 8 because of the "Val" at time 1972 replaced the "Val" at time 1970. similarly for Beta.
I thank you in advance for your help
df %>% group_by(id) %>% mutate(cumval = cumsum(replace(Val, duplicated(Val), first(Val))))- akrun