Here is my toy data; I want to calculate diff_var4 (the last column shows the desired result).
df <- tibble::tribble(
  ~var1, ~var2, ~var3, ~var4, ~diff_var4,
     1L,    1L,    1L,    2L,         NA,
     1L,    1L,    1L,    2L,         NA,
     1L,    2L,    1L,    2L,         0L,
     1L,    2L,    1L,    2L,         0L,
     1L,    4L,    1L,    2L,         0L,
     1L,    5L,    1L,    2L,         0L,
     1L,    6L,    2L,    8L,         6L,
     1L,    6L,    2L,    8L,         6L,
     2L,    4L,    1L,    5L,         NA,
     2L,    5L,    1L,    5L,         0L,
     2L,    5L,    1L,    5L,         0L,
     2L,    6L,    2L,    8L,         3L,
     2L,    6L,    2L,    8L,         3L)
var1 to var4 are the inputs, and I need to calculate diff_var4 so that:
condition 1: within each var1, if var3 is 1 and var2 is the minimum var2, then diff_var4 is var4 - previous(var4), repeated for every observation in which var2 stays the same (there is no previous var4 for the first var2, hence the NA).
condition 2: within each var1, if var3 changes, then diff_var4 is var4 - previous(var4), again repeated for every observation in which var2 stays the same.
I started with
library(dplyr)
df %>% group_by(var1) %>%
  mutate(diff_var4 = var4 - lag(var4))
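but this compares each row with the row immediately before it, so I can't get the desired diff_var4: I get 0 instead of NA in the 2nd row, 0 instead of 6 in the 8th row, and 0 instead of 3 in the last row!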
How can I calculate diff_var4, preferably with a tidyverse solution?
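One possible reading of the two conditions, offered only as a minimal sketch: within each var1, every block of rows sharing the same var2 gets var4 minus the var4 of the row just before that block starts. This sketch rests on assumptions that are not stated above: rows are already sorted by var2 within var1, var4 is constant within a var2 block, and var3 does not need to be tested explicitly (it happens not to matter for the toy data). The name df_in is hypothetical and holds only the input columns.

```r
library(dplyr)

# Input columns only (df_in is a hypothetical name); diff_var4 is rebuilt below.
df_in <- tibble::tribble(
  ~var1, ~var2, ~var3, ~var4,
     1L,    1L,    1L,    2L,
     1L,    1L,    1L,    2L,
     1L,    2L,    1L,    2L,
     1L,    2L,    1L,    2L,
     1L,    4L,    1L,    2L,
     1L,    5L,    1L,    2L,
     1L,    6L,    2L,    8L,
     1L,    6L,    2L,    8L,
     2L,    4L,    1L,    5L,
     2L,    5L,    1L,    5L,
     2L,    5L,    1L,    5L,
     2L,    6L,    2L,    8L,
     2L,    6L,    2L,    8L
)

result <- df_in %>%
  group_by(var1) %>%
  mutate(
    # match(var2, var2) gives, for each row, the position of the first row
    # of its var2 block (within the current var1 group).
    # lag(var4) at that position is the var4 just before the block starts,
    # which is NA for the first block of each var1.
    # Assumption: rows are sorted by var2 within var1.
    diff_var4 = var4 - lag(var4)[match(var2, var2)]
  ) %>%
  ungroup()

print(result)
```

Indexing lag(var4) by the first row of each block is what makes the difference repeat across all rows with the same var2, instead of dropping to 0 on the second row of a block as in the plain lag() attempt. If the real data are not sorted, an arrange(var1, var2) beforehand would presumably be needed.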