How do I subset a time series from the start up to the first occurrence of a variable meeting a condition?
library(dplyr)  # provides tribble(), filter(), lag() and %>%

# Sample data: timestamps t with two numeric columns x and y
df <- tribble(
~t, ~x, ~y,
as.POSIXct(strptime("2011-03-27 01:30:00", "%Y-%m-%d %H:%M:%S")), -1, 1,
as.POSIXct(strptime("2011-03-27 01:30:01", "%Y-%m-%d %H:%M:%S")), -5, 2,
as.POSIXct(strptime("2011-03-27 03:45:00", "%Y-%m-%d %H:%M:%S")), -3, 5,
as.POSIXct(strptime("2011-03-27 04:20:00", "%Y-%m-%d %H:%M:%S")), -8, 3,
as.POSIXct(strptime("2011-03-27 04:25:00", "%Y-%m-%d %H:%M:%S")), -2, 8
)
For example, all rows from the start up to and including the first occurrence of y > 4 (expecting the first three rows of the sample data).
h3rm4n's solution explained
The simpler case, which does not include the first row that matches the condition, would be:
%>% filter(cumsum(y > 4) == 0)
For every row before the first match, y > 4 is FALSE, which R treats as 0 when summing. So cumsum(y > 4) == 0 returns TRUE (and filter keeps the row) for all rows before the first one that matches y > 4. That row adds 1 to the running sum, so the comparison is FALSE for it and for every row after it.
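The intermediate values make this easier to follow. The sketch below rebuilds only the x and y columns of the sample data (the timestamps are left out for brevity, and the names hits, running and before_match are purely illustrative) and assumes dplyr is installed:

```r
library(dplyr)

# x and y columns from the sample data in the question
df <- tibble(x = c(-1, -5, -3, -8, -2), y = c(1, 2, 5, 3, 8))

hits    <- df$y > 4      # FALSE FALSE TRUE FALSE TRUE
running <- cumsum(hits)  # 0 0 1 1 2

# Keeps only the rows before the first match (rows 1 and 2)
before_match <- df %>% filter(cumsum(y > 4) == 0)
```

The matching row itself (y = 5) is dropped here, because it is the row where the running sum first becomes 1.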
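To have it include the matching row, we additionally wrap y in lag(y, default = 0):
%>% filter(cumsum(lag(y, default = 0) > 4) == 0)
Each row is now tested against the previous row's y, so the running sum only becomes nonzero one row after the first match. The default fills in the missing "previous" value for the first row and must itself fail the condition (0 > 4 is FALSE).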
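A minimal sketch of the lagged version on the same data, again with only the x and y columns and with illustrative variable names (shifted, running, up_to_match). dplyr::lag is spelled out in the standalone call so that it is not confused with stats::lag:

```r
library(dplyr)

# x and y columns from the sample data in the question
df <- tibble(x = c(-1, -5, -3, -8, -2), y = c(1, 2, 5, 3, 8))

shifted <- dplyr::lag(df$y, default = 0)  # 0 1 2 5 3
running <- cumsum(shifted > 4)            # 0 0 0 1 1

# Keeps every row up to and including the first match (rows 1 to 3)
up_to_match <- df %>% filter(cumsum(lag(y, default = 0) > 4) == 0)
```

This returns the first three rows, which is the result the question asks for.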