Sum across multiple time frames using R

Question

I have two data frames, x and y. Data frame x holds one date range per id (start_date to end_date), while data frame y holds one value per individual date. For each range in x, I want the sum of the values in y whose dates fall within that range. For example, id a would get the sum of all the values from 2019/1/1 through 2019/3/1.

id <- c("a","b","c")
start_date <- as.Date(c("2019/1/1", "2019/2/1", "2019/3/1"))
end_date <- as.Date(c("2019/3/1", "2019/4/1", "2019/5/1"))
x <- data.frame(id, start_date, end_date)

dates <- seq(as.Date("2019/1/1"),as.Date("2019/5/1"),1)
values <- runif(121, min=0, max=7)

y <- data.frame(dates, values)

Desired output

id start_date end_date  sum
a  2019/1/1   2019/3/1  221.8892

ThomasIsCoding · Accepted Answer · 2020-05-01T15:51:10

One base R option is to use apply over the rows of x:

x$sum <- apply(x, 1, function(v) {
  sum(subset(y, dates >= v["start_date"] & dates <= v["end_date"])$values)
})

which gives

> x
  id start_date   end_date      sum
1  a 2019-01-01 2019-03-01 196.0311
2  b 2019-02-01 2019-04-01 185.6970
3  c 2019-03-01 2019-05-01 173.6429
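
A variant of the same idea, offered only as a sketch and not as part of the original answer: apply() hands each row to the function as a character vector, so the comparison above depends on those strings being converted back to dates. Indexing the Date columns directly avoids that round trip. The data construction mirrors the Data section below; the variable name in_range is just an illustrative choice.

```r
# Reproducible data, built the same way as in the Data section
set.seed(1234)
x <- data.frame(
  id         = c("a", "b", "c"),
  start_date = as.Date(c("2019/1/1", "2019/2/1", "2019/3/1")),
  end_date   = as.Date(c("2019/3/1", "2019/4/1", "2019/5/1"))
)
dates <- seq(as.Date("2019/1/1"), as.Date("2019/5/1"), by = "day")
y <- data.frame(dates, values = runif(length(dates), min = 0, max = 7))

# For each row of x, sum the values in y whose date lies in
# [start_date, end_date] (both ends inclusive, as in the apply version)
x$sum <- vapply(seq_len(nrow(x)), function(i) {
  in_range <- y$dates >= x$start_date[i] & y$dates <= x$end_date[i]
  sum(y$values[in_range])
}, numeric(1))

x
```

On the same data this computes the same inclusive-range sums as the apply call. vapply with numeric(1) also fails with an error if a row ever produces something other than a single number.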

Data

set.seed(1234)
id <- c("a","b","c")
start_date <- as.Date(c("2019/1/1", "2019/2/1", "2019/3/1"))
end_date <- as.Date(c("2019/3/1", "2019/4/1", "2019/5/1"))
x <- data.frame(id, start_date, end_date)

dates <- seq(as.Date("2019/1/1"),as.Date("2019/5/1"),1)
values <- runif(121, min=0, max=7)

y <- data.frame(dates, values)
