I'm trying to create a survival array in R. Here is my current data frame (note: the real dataset has over 500 unique IDs):

Hyena DOB  Death
A     1989 1990 
B     1989 1991  
C     1990 1990 

I want to create new columns of years between 1989 and 1991 (the span of the study). Then, for each Hyena, I would like to code whether it was alive ("1") or dead/unborn ("0") for each respective column year (e.g. 1989,1990,1991).

Ex: For Hyena A, DOB <= 1989 and Death >= 1989, so column 1989 gets "1". Likewise, DOB <= 1990 and Death >= 1990, so column 1990 gets "1". However, although DOB <= 1991, Death < 1991, so column 1991 gets "0".

In the end, I'd like my data frame to look like this:

Hyena DOB  Death 1989 1990 1991
A     1989 1990   1    1    0
B     1989 1991   1    1    1
C     1990 1990   0    1    0

I've been trying for loops, but I'm just not savvy enough to know what I'm doing. Any help would be most appreciated! I hope this wasn't too confusing.

1 Answer

You can try

library(qdapTools)
cbind(df1,  mtabulate(Map(seq, df1$DOB, df1$Death)))
#    Hyena  DOB Death 1989 1990 1991
#1     A 1989  1990    1    1    0
#2     B 1989  1991    1    1    1
#3     C 1990  1990    0    1    0

Or an option with data.table

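library(data.table)  # v1.9.5+
# One row per hyena-year alive, counted, cast to wide form, then joined back onto df1
long <- setDT(df1)[, seq(DOB, Death), by = Hyena][, .N, list(V1, Hyena)]
setkey(df1, Hyena)[dcast(long, Hyena ~ V1, value.var = 'N', fill = 0)]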
#  Hyena  DOB Death 1989 1990 1991
#1:     A 1989  1990    1    1    0
#2:     B 1989  1991    1    1    1
#3:     C 1990  1990    0    1    0

Or using base R

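# Years each hyena was alive, as a list of integer sequences
lst <- Map(seq, df1$DOB, df1$Death)
# All years that appear in any sequence, in ascending order
Un1 <- sort(unique(unlist(lst)))
# Tabulate each hyena's years against the full set of years, then bind to df1
cbind(df1, t(sapply(lst, function(x) table(factor(x, levels = Un1)))))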
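Another base R possibility, offered only as a sketch, is to apply the rule from the question (DOB <= year and Death >= year) directly with `outer`. The names `years`, `alive`, and `res` are illustrative, and taking the study span from the data is an assumption; a fixed span such as `1989:1991` could be used instead.

```r
# Example data copied from the question
df1 <- data.frame(Hyena = c("A", "B", "C"),
                  DOB   = c(1989L, 1989L, 1990L),
                  Death = c(1990L, 1991L, 1990L),
                  stringsAsFactors = FALSE)

# Study span (assumption: derived from the data rather than fixed in advance)
years <- seq(min(df1$DOB), max(df1$Death))

# alive[i, j] is TRUE when hyena i was born by years[j] and died no earlier than years[j]
alive <- outer(df1$DOB, years, "<=") & outer(df1$Death, years, ">=")

# Convert TRUE/FALSE to 1/0 and label the columns by year
alive <- alive + 0L
colnames(alive) <- as.character(years)

res <- cbind(df1, alive)
res
```

With the example data this should give the same 0/1 columns as the approaches above. Unlike `Map(seq, ...)`, it does not build a sequence per animal, so the comparison stays a single matrix operation even with several hundred IDs.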

data

df1 <- structure(list(Hyena = c("A", "B", "C"), DOB = c(1989L, 1989L, 
1990L), Death = c(1990L, 1991L, 1990L)), .Names = c("Hyena", 
"DOB", "Death"), class = "data.frame", row.names = c(NA, -3L))
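
Since the question mentions trying for loops, here is a minimal loop-based sketch of the same rule. The copy `out` and the hard-coded span `1989:1991` are assumptions made for illustration; the loop runs over years rather than over hyenas, so each pass fills one whole column.

```r
# Example data copied from the question
df1 <- data.frame(Hyena = c("A", "B", "C"),
                  DOB   = c(1989L, 1989L, 1990L),
                  Death = c(1990L, 1991L, 1990L),
                  stringsAsFactors = FALSE)

# Work on a copy so df1 itself is left unchanged
out <- df1

# One new column per study year: 1 if born by that year and not yet dead, else 0
for (yr in 1989:1991) {
  out[[as.character(yr)]] <- as.integer(df1$DOB <= yr & df1$Death >= yr)
}

out
```

For the three example rows this should reproduce the desired table from the question.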