I am attempting to create a new column that is a conditional difference based on a column of TRUE and FALSE values. If the lag-1 row is FALSE, the difference should be computed from either the beginning or the last TRUE row, whichever is later in the data frame. If the lag-1 row is TRUE, the difference should be reset.
I would like to use the dplyr::mutate function as much as possible. I'm attempting to use dplyr::lag with an ifelse(), but I'm having a hard time with the conditions.
dat <- data.frame(
  logic_col = c(FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE),
  time_col = c(200, 435, 567, 895, 1012, 1345, 1456, 1700, 1900),
  expected_col_unseen = c(200, 435, 567, 328, 117, 450, 561, 805, 200)
)
The expected column is inconsistent with "cumulative sum". Since row 2 is false, row 3's expected value should be 200+435+567=1202, not 1002 as you have it. From there, it seems as if your expected column is not even close: row 3 is true, so row 4 should be 895. I think you may be trying to subtract the previous row's time_col from the expected value, but even then the cumulative sum doesn't carry over correctly. Can you either fix your expected data or expand on how you are calculating it? – r2evans
With dplyr, I encourage the use of dplyr::if_else (vice base ifelse), as it will guard against common mistakes (protect you from yourself, so to speak). – r2evans
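For what it's worth, here is a minimal sketch of one way to get the expected column inside a single mutate call. It assumes the rule is "subtract the time_col of the most recent TRUE row strictly before the current row, or 0 if there is none", which reproduces expected_col_unseen for the sample data. The helper columns prev_true, last_true_idx, and baseline, and the output name diff_col, are made up for illustration.

```r
library(dplyr)

dat <- data.frame(
  logic_col = c(FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE),
  time_col  = c(200, 435, 567, 895, 1012, 1345, 1456, 1700, 1900)
)

result <- dat %>%
  mutate(
    # Was the previous row TRUE? The first row has no predecessor, so use FALSE.
    prev_true = lag(logic_col, default = FALSE),
    # Index of the most recent TRUE row strictly before the current row (0 if none).
    # cummax carries the latest index forward over the FALSE rows.
    last_true_idx = cummax(if_else(prev_true, row_number() - 1L, 0L)),
    # Look up that row's time; index 0 maps to a baseline of 0 ("the beginning").
    baseline = c(0, time_col)[last_true_idx + 1L],
    diff_col = time_col - baseline
  )

result$diff_col
# 200 435 567 328 117 450 561 805 200
```

Tracking the row index rather than the time itself means this does not rely on time_col being increasing. If the data are grouped, adding group_by() before the mutate should restart the baseline within each group.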