I've been struggling for a few days to solve this task in R (I'm a former SAS user).
The setting/study:
- Observational data on patients with Crohn's disease, collected annually during 2002–2013.
- Patients can be included in any year, and visits may be irregular from year to year.
- I know the exact day of death for each patient. Variable: death_year
- I know the exact day of relapse (the endpoint of interest). Variable: relapse_year
I am interested in the incidence of relapse, so I need to calculate the number of relapses each year divided by the number of individuals alive that year. The problem is that after inclusion, patients attend visits irregularly. I do, however, know for each year whether they are alive and whether they have experienced a relapse.
I could solve this if I could create 12 new variables for each patient, one per calendar year. Each variable should be set to '1' if the patient is alive that year and has not yet experienced the event.
In other words, I need to create 'year variables' that are set to '1' for the year of inclusion and every year thereafter, as long as the person has not died or experienced the event.
An example: Patient X was included in 2005 and died in 2009. For him I would need the following variables: '2005', '2006', '2007', '2008' and '2009' set to '1'. Patient Y was included in 2005 and experienced the event in 2007. For him I would need the following variables: '2005', '2006' and '2007' set to '1'. (Yes, the year of the event/death still needs to be set to '1'.)
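To make the rule concrete, here is a minimal sketch in base R of how the indicator could be computed for a single patient. The function name `at_risk_years`, its arguments, and the default window `2002:2012` are assumptions made for illustration only; years outside the at-risk period are left as `NA`.

```r
# Hypothetical helper (not from any package): returns 1 for each calendar
# year from inclusion up to and including the year of relapse or death,
# and NA for every other year.
at_risk_years <- function(first_visit, relapse_year, death_year,
                          years = 2002:2012) {
  # Last year at risk: earliest of relapse, death, or end of the window.
  last_year <- min(relapse_year, death_year, max(years), na.rm = TRUE)
  flag <- ifelse(years >= first_visit & years <= last_year, 1L, NA_integer_)
  names(flag) <- paste0("YEAR", years)
  flag
}

# Patient X: included 2005, died 2009
x <- at_risk_years(2005, NA, 2009)
# Patient Y: included 2005, relapsed 2007
y <- at_risk_years(2005, 2007, NA)

names(x)[!is.na(x)]  # YEAR2005 through YEAR2009
names(y)[!is.na(y)]  # YEAR2005 through YEAR2007
```

Taking the minimum of the relapse year, the death year and the end of the window covers all three cases (relapse, death, neither) with a single comparison.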
Here is how my data set looks:
data <- read.table(header = TRUE, na.strings = ".", text = "
patient visit first_visit relapse_year death_year
1 2003 2003 . 2010
1 2004 2003 . 2010
1 2009 2003 . 2010
2 2002 2002 2006 .
2 2006 2002 2006 .
2 2006 2002 2006 .
2 2008 2002 2006 .
2 2012 2002 2006 .
3 2004 2004 . .
3 2008 2004 . .
3 2008 2004 . .
")
Here is the DESIRED data set:
desired_data <- read.table(header = TRUE, na.strings = ".", text = "
patient visit first_visit relapse_year death_year YEAR2002 YEAR2003 YEAR2004 YEAR2005 YEAR2006 YEAR2007 YEAR2008 YEAR2009 YEAR2010 YEAR2011 YEAR2012
1 2003 2003 . 2010 . 1 1 1 1 1 1 1 1 . .
1 2004 2003 . 2010 . 1 1 1 1 1 1 1 1 . .
1 2009 2003 . 2010 . 1 1 1 1 1 1 1 1 . .
2 2002 2002 2006 . 1 1 1 1 1 . . . . . .
2 2006 2002 2006 . 1 1 1 1 1 . . . . . .
2 2006 2002 2006 . 1 1 1 1 1 . . . . . .
2 2008 2002 2006 . 1 1 1 1 1 . . . . . .
2 2012 2002 2006 . 1 1 1 1 1 . . . . . .
3 2004 2004 . . . . 1 1 1 1 1 1 1 1 1
3 2008 2004 . . . . 1 1 1 1 1 1 1 1 1
3 2008 2004 . . . . 1 1 1 1 1 1 1 1 1
")
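For illustration, a rough base R sketch of one way the YEAR columns above might be built from the long data, followed by the yearly incidence as defined earlier (relapses in a year divided by the number of patients flagged '1' that year). Several things here are assumptions rather than part of the original data: the window `2002:2012` (chosen to match the YEAR columns shown), reading `.` as `NA` via `na.strings`, and the object names `last_year`, `flags`, `result` and `per_patient`.

```r
# Same example data as above; na.strings = "." makes the dots read as NA.
data <- read.table(header = TRUE, na.strings = ".", text = "
patient visit first_visit relapse_year death_year
1 2003 2003 . 2010
1 2004 2003 . 2010
1 2009 2003 . 2010
2 2002 2002 2006 .
2 2006 2002 2006 .
2 2006 2002 2006 .
2 2008 2002 2006 .
2 2012 2002 2006 .
3 2004 2004 . .
3 2008 2004 . .
3 2008 2004 . .
")

years <- 2002:2012  # assumed window, matching YEAR2002 ... YEAR2012

# Last year at risk per row: earliest of relapse, death, or end of window.
last_year <- pmin(data$relapse_year, data$death_year, max(years),
                  na.rm = TRUE)

# One column per year: 1 from inclusion through last_year, NA otherwise.
flags <- sapply(years, function(y) {
  ifelse(data$first_visit <= y & y <= last_year, 1L, NA_integer_)
})
colnames(flags) <- paste0("YEAR", years)
result <- cbind(data, flags)

# Incidence: keep one row per patient so repeated visits are not
# counted more than once.
per_patient <- result[!duplicated(result$patient), ]
at_risk   <- colSums(per_patient[, colnames(flags)], na.rm = TRUE)
relapses  <- table(factor(per_patient$relapse_year, levels = years))
incidence <- as.vector(relapses) / at_risk
```

In this sketch `result` has the same layout as `desired_data`, and `incidence` is a named vector with one entry per year. A year with nobody at risk would give `NaN`, which would need handling on real data.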
I would be extremely grateful for any advice on this! Thanks in advance!