Using this data.frame
DATA
df <- read.table(text = "
SiteID measured modelled
site1 50 47
site2 28 30
site3 158 162
site4 247 243
site5 456 463
site6 573 564
site7 634 640", header = TRUE)
I want to create two new columns (measured_diff and modelled_diff). For site1 and site2, the new columns should hold the same values as measured and modelled. For the remaining sites, the values should be calculated as follows:
measured_diff for site3 = measured for site3 - sum(measured for site1 and site2)
measured_diff for site4 = measured for site4 - measured for site3
measured_diff for site5 = measured for site5 - measured for site4
measured_diff for site6 = measured for site6 - measured for site5
measured_diff for site7 = measured for site7 - measured for site6
and the same for modelled_diff
FINAL RESULT
It should be as below
#  SiteID measured modelled measured_diff modelled_diff
#1  site1       50       47            50            47
#2  site2       28       30            28            30
#3  site3      158      162            80            85
#4  site4      247      243            89            81
#5  site5      456      463           209           220
#6  site6      573      564           117           101
#7  site7      634      640            61            76
Any suggestions on how to do this in R using dplyr?
df %>% mutate_at(vars(-SiteID), funs(diff = . - lag(cumsum(.), default = 0))), but the numbers are off. - alistaire
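The numbers in that attempt are off because lag(cumsum(.)) subtracts the running total of all earlier rows from every site, whereas the rule in the question subtracts a sum only for site3 and just the previous row for site4 onwards. A minimal sketch of one way to encode the rule directly is below. It assumes the rows are already in site order and that dplyr 1.0.0 or later is available for across(); the helper name diff_from_previous is made up for illustration and is not part of dplyr.

```r
library(dplyr)

df <- read.table(text = "
SiteID measured modelled
site1 50 47
site2 28 30
site3 158 162
site4 247 243
site5 456 463
site6 573 564
site7 634 640", header = TRUE)

# Hypothetical helper: applies the rule from the question to one column.
# Assumes the vector is ordered site1, site2, site3, ...
diff_from_previous <- function(x) {
  i <- seq_along(x)
  case_when(
    i <= 2 ~ x,                  # site1 and site2 keep their own value
    i == 3 ~ x - (x[1] + x[2]),  # site3 minus the sum of site1 and site2
    TRUE   ~ x - lag(x)          # later sites minus the previous site
  )
}

result <- df %>%
  mutate(across(c(measured, modelled), diff_from_previous,
                .names = "{.col}_diff"))

result$measured_diff
# 50 28 80 89 209 117 61
result$modelled_diff
# 47 30 85 81 220 101 76
```

The .names argument puts the suffix after the column name (measured_diff, modelled_diff); changing it to "diff_{.col}" would give prefixed names instead.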