I have two datasets: DF1 - data frame which lists heads of states (leader_id) of countries (country_code) and an interval of their time in office (office_interval). DF2 - data frame where every observation is an event that has the country (country_code) and date it occurred (event_date).
reproducible data:
library(lubridate)
#Leader DF
leader_id <- c("Adam","Bob","Charlie")
country_code <- c(1,1,2)
office_interval <- c(interval(ymd("1900-01-01"), ymd("1905-01-01")),
interval(ymd("1910-01-01"), ymd("1915-01-01")),
interval(ymd("1920-01-01"), ymd("1925-01-01")))
DF1 <- data.frame(leader_id, country_code, office_interval)
#Event DF
country_code <- c(1,2,2,1)
event_date <- c(as.Date("1901-01-01"),
as.Date("1902-01-01"),
as.Date("1921-01-01"),
as.Date("1901-02-02"))
DF2 <- data.frame(country_code, event_date)
I would like to create a new column, DF1$total_events, that sums the total number of observations in DF2 that occur within the same country_code and office_interval for each leader in DF1. It should look like this:
leader_id country_code1 office_interval total_events
1 Adam 1 1900-01-01 UTC--1905-01-01 UTC 2
2 Bob 1 1910-01-01 UTC--1915-01-01 UTC 0
3 Charlie 2 1920-01-01 UTC--1925-01-01 UTC 1
I've tried to modify some solutions from this similar question, however i can't get anything to work on my data.