I have a data frame with an id variable and start/end dates of follow-up:
library(dplyr)
library(lubridate)
data <- tibble(id = c(01, 02, 03, 04),
               start_date = dmy(c('01-02-1997', '05-03-1998', '09-08-2002', '05-05-1997')),
               end_date = dmy(c('03-04-2002', '06-07-2004', '07-04-2010', '03-04-2008')))
I want to produce a data frame with one indicator column per 3-year period (e.g. y1997.99 = 1997 to 1999), set to 1 if the follow-up between the start and end dates overlaps that period and 0 otherwise. For example, if follow-up starts in 1996 and ends in 2000, that person would be counted in the 1997 to 1999 and 2000 to 2003 periods but not in the 2004 to 2006 period.
I want to produce a df such as:
data <- tibble(id = c(01, 02, 03, 04),
               start_date = dmy(c('01-02-1997', '05-03-1998', '09-08-2002', '05-05-1997')),
               end_date = dmy(c('03-04-2002', '06-07-2004', '07-04-2010', '03-04-2008')),
               y1997.99 = c(1, 1, 0, 1),
               y2000.03 = c(1, 1, 1, 1),
               y2004.06 = c(0, 1, 1, 1),
               y2007.09 = c(0, 0, 1, 1),
               y2010.12 = c(0, 0, 1, 0),
               y2013.15 = c(0, 0, 0, 0))
Does anyone know how this could be achieved? Thanks.
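For reference, here is a minimal sketch of the overlap logic I have in mind, not a definitive solution. It assumes the periods are the calendar-year ranges implied by the column names above, and that only the year of each date matters. The `periods` lookup table and the name `result` are hypothetical, introduced just for illustration.

```r
library(dplyr)
library(lubridate)

data <- tibble(id = c(1, 2, 3, 4),
               start_date = dmy(c('01-02-1997', '05-03-1998', '09-08-2002', '05-05-1997')),
               end_date = dmy(c('03-04-2002', '06-07-2004', '07-04-2010', '03-04-2008')))

# Hypothetical lookup table: first and last calendar year of each period,
# read off the column names in the desired output.
periods <- tibble(name = c('y1997.99', 'y2000.03', 'y2004.06',
                           'y2007.09', 'y2010.12', 'y2013.15'),
                  from = c(1997, 2000, 2004, 2007, 2010, 2013),
                  to = c(1999, 2003, 2006, 2009, 2012, 2015))

result <- data
for (i in seq_len(nrow(periods))) {
  # Follow-up overlaps a period when it starts no later than the period's
  # last year and ends no earlier than the period's first year.
  result[[periods$name[i]]] <- as.numeric(
    year(data$start_date) <= periods$to[i] &
      year(data$end_date) >= periods$from[i]
  )
}

result
```

On the example data this should give the indicator columns shown in the desired output, but it is written as an explicit loop, so a tidier or vectorised approach would be welcome.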