I'm having trouble wrapping my head around creating a series of intervals from some time series data.
Say I have a data frame (df) with a date, a concentration, and a flag for whether that concentration reached or exceeded a threshold of 5:
df <- structure(list(DATE = structure(c(1356183950, 1356184851, 1356185750,
1356186650, 1356187551, 1356188450, 1356189350, 1356190250, 1356191150,
1356192050, 1356192950, 1356193851, 1356194750, 1356195650, 1356196550,
1356197450), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
CONC = c(3.8, 3.8, 3.7, 4.3, 5, 6, 7.2, 7, 6, 5, 4.3,
3.7, 3.4, 3.3, 3.1, 3), EXCEED = c(0, 0, 0, 0, 1, 1, 1, 1,
1, 1, 0, 0, 0, 0, 0, 0)), .Names = c("DATE", "CONC",
"EXCEED"), row.names = 1070:1085, class = "data.frame")
I want to create an interval for each run of consecutive measurements below or above the threshold and return summary statistics for each run, something like:
                START                 END MAXCONC
1 2012-12-22 13:45:50 2012-12-22 14:30:50     4.3
2 2012-12-22 14:45:51 2012-12-22 16:00:50     7.2
3 2012-12-22 16:15:50 2012-12-22 17:30:50     4.3
I can't figure out how to create the distinct intervals using lubridate. Is there another package I should be using? Thoughts?
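For reference, here is a rough base R sketch of the grouping I have in mind. It does not use lubridate at all. It rebuilds the example data with the concentration column named CONC (an assumption on my part), and the RUN column and the intervals object are names I made up for illustration. The idea is that rle() finds runs of identical EXCEED values, and each run becomes one row of the summary. I am not sure this is the idiomatic way to do it.

```r
# Rebuild the example data; the column name CONC is assumed here.
df <- data.frame(
  DATE = as.POSIXct(
    c(1356183950, 1356184851, 1356185750, 1356186650, 1356187551,
      1356188450, 1356189350, 1356190250, 1356191150, 1356192050,
      1356192950, 1356193851, 1356194750, 1356195650, 1356196550,
      1356197450),
    origin = "1970-01-01", tz = "UTC"
  ),
  CONC = c(3.8, 3.8, 3.7, 4.3, 5, 6, 7.2, 7, 6, 5, 4.3,
           3.7, 3.4, 3.3, 3.1, 3),
  EXCEED = c(0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0)
)

# rle() gives the length of each run of identical EXCEED values;
# rep() expands that into one run id per row (RUN is a hypothetical name).
runs <- rle(df$EXCEED)
df$RUN <- rep(seq_along(runs$lengths), times = runs$lengths)

# Summarise each run: first timestamp, last timestamp, maximum concentration.
intervals <- do.call(rbind, lapply(split(df, df$RUN), function(g) {
  data.frame(START = min(g$DATE), END = max(g$DATE), MAXCONC = max(g$CONC))
}))

print(intervals)
```

If actual interval objects are needed, I assume START and END could then be passed to lubridate's interval(), but I have not got that far.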