I have a large dataframe with four measurments per second over two weeks. Hence the dataframe is quite large.
I also have two vectors with start and end dates, which define certain time ranges within the dataframe and which I have to filter out.
What I want to do is to extract the data between the start and end dates.
My data looks something like this.
library(lubridate)
library(dplyr)
df <- data.frame(datetime = seq(ymd_hms("2020/01/01 00:00:00"),
by = "sec",
length.out = 3600),
var = rnorm(3600))
And my start/end vector look like this. Here I only added two start/end combinations. But the actual vector contain more values.
start = c(ymd_hms("2020/01/01 00:1:00"), ymd_hms("2020/01/01 00:30:00"))
end = c(ymd_hms("2020/01/01 00:1:04"), ymd_hms("2020/01/01 00:30:04"))
I tried to just filter it, using
filtered <- df %>%
filter(datetime >= start & datetime <= end)
head(filtered)
datetime var
1 2020-01-01 00:01:00 -0.2245330
2 2020-01-01 00:01:02 0.5926424
3 2020-01-01 00:01:04 -0.3824533
4 2020-01-01 00:30:01 -0.7202059
5 2020-01-01 00:30:03 -0.5775794
But it seems to downsample the data, as the filtered dataframe has only three measurments for the first time interval, instead of the expected five values.
If I filter only for the first start and end date, I get the five values.
filtered2 <- df %>%
filter(datetime >= start[1] & datetime <= end[1])
head(filtered2)
datetime var
1 2020-01-01 00:01:00 -0.22453305
2 2020-01-01 00:01:01 1.13452854
3 2020-01-01 00:01:02 0.59264239
4 2020-01-01 00:01:03 -0.03700048
5 2020-01-01 00:01:04 -0.38245332
Where I get stuck is:
Why does the first filtering work, but does not return the expected full daterange?
And how can I get the complete data filtered?
I alreadz tried filter(between(datetime, start, end)
. This gave me the expected result, but only for the first daterange. It seeems that dplyr::between
does not accept vecorts.
Any help would be very welcome.
UPDATE
@ekoam has pointed out correctly that data.table::between
works as well. But as dplyr::between
, it doesn't like vectors.
df
. – arg0naut91