I have three data frames: df1 holds time intervals, df2 is a list of IDs, and df3 is a list of IDs with an associated date.

df1 <- structure(list(season = structure(c(2L, 1L), .Label = c("summer", 
    "winter"), class = "factor"), mindate = structure(c(1420088400, 
    1433131200), class = c("POSIXct", "POSIXt")), maxdate = structure(c(1433131140, 
    1448945940), class = c("POSIXct", "POSIXt")), diff = structure(c(150.957638888889, 
    183.040972222222), units = "days", class = "difftime")), .Names = c("season", 
    "mindate", "maxdate", "diff"), row.names = c(NA, -2L), class = "data.frame")

df2 <- structure(list(ID = c(23796, 23796, 23796)), .Names = "ID", row.names = c(NA, 
    -3L), class = "data.frame")

df3 <- structure(list(ID = c("23796", "123456", "12134"), time = structure(c(1420909920, 
1444504500, 1444504500), class = c("POSIXct", "POSIXt"), tzone = "US/Eastern")), .Names = c("ID", 
"time"), row.names = c(NA, -3L), class = "data.frame")

The code should check whether df2$ID == df3$ID. If that is true, and df3$time >= df1$mindate and df3$time <= df1$maxdate, the result should be df1$maxdate - df3$time; otherwise it should be df1$maxdate - df1$mindate. I tried the ifelse function. It works when I manually specify individual cells, but that is not what I want, because each data frame has many more rows and the row counts differ.

df1$result <- ifelse(df2[1,1] == df3[1,1] & df3[1,2] >= df1$mindate & df3[1,2] <= df1$maxdate, 
                     difftime(df1$maxdate,df3[1,2],units="days"),
                     difftime(df1$maxdate,df1$mindate,units="days"))

EDIT: The desired output is (when removing last row of df2):

 season    mindate             maxdate          diff   result
1 winter 2015-01-01 2015-05-31 23:59:00 150.9576 days 141.9576
2 summer 2015-06-01 2015-11-30 23:59:00 183.0410 days 183.0410

Any ideas? I don't see how I could merge the data frames to make them the same length. Note that df2 can have any number of rows without affecting the code. Issues arise when df1 and df3 differ in number of rows.

Could you please add your desired output? - sm925

1 Answer

Comparison operators such as > and < are vectorized, so they can be applied to whole columns at once:

transform(df1, result = ifelse(df3$ID %in% df2$ID & df3$time >= mindate & df3$time <= maxdate,
                               difftime(maxdate, df3$time, units = "days"),
                               difftime(maxdate, mindate, units = "days")))
  season             mindate             maxdate          diff   result
1 winter 2014-12-31 21:00:00 2015-05-31 20:59:00 150.9576 days 141.9576
2 summer 2015-05-31 21:00:00 2015-11-30 20:59:00 183.0410 days 183.0410

You can also use the %between% operator from the data.table package:

library(data.table)
transform(df1,result=ifelse(df3$ID%in%df2$ID&df3$time%between%df1[2:3],
               difftime(maxdate,df3$time),difftime(maxdate,mindate)))

  season             mindate             maxdate          diff   result
1 winter 2014-12-31 21:00:00 2015-05-31 20:59:00 150.9576 days 141.9576
2 summer 2015-05-31 21:00:00 2015-11-30 20:59:00 183.0410 days 183.0410
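
Both calls above compare df3 against df1 row by row, so they rely on the two data frames having the same number of rows. As far as I can tell, with the full three-row df3 from the question the ifelse result has three elements and transform cannot attach it to the two-row df1. Below is a minimal sketch of one way around that, assuming you want, for each interval in df1, the earliest df3 time that belongs to an ID present in df2 and falls inside the interval. The "earliest" rule, the helper name hits, and the use of tz = "UTC" are my assumptions, not something stated in the question.

```r
# Rebuild the question's data in a compact form (tz = "UTC" is an assumption;
# the differences in days do not depend on the time zone).
df1 <- data.frame(
  season  = c("winter", "summer"),
  mindate = as.POSIXct(c(1420088400, 1433131200), origin = "1970-01-01", tz = "UTC"),
  maxdate = as.POSIXct(c(1433131140, 1448945940), origin = "1970-01-01", tz = "UTC")
)
df2 <- data.frame(ID = c(23796, 23796, 23796))
df3 <- data.frame(
  ID   = c("23796", "123456", "12134"),
  time = as.POSIXct(c(1420909920, 1444504500, 1444504500), origin = "1970-01-01", tz = "UTC"),
  stringsAsFactors = FALSE
)

# Keep only the df3 rows whose ID appears in df2.
# IDs are compared as character because df2$ID is numeric and df3$ID is character.
hits <- df3[as.character(df3$ID) %in% as.character(df2$ID), , drop = FALSE]

# Loop over the intervals in df1, so df3 may have any number of rows.
df1$result <- vapply(seq_len(nrow(df1)), function(i) {
  lo <- df1$mindate[i]
  hi <- df1$maxdate[i]
  # Matching times that fall inside this interval (bounds inclusive)
  inside <- hits$time[hits$time >= lo & hits$time <= hi]
  # Assumption: if several times qualify, measure from the earliest one;
  # if none qualify, fall back to the full interval length.
  start <- if (length(inside) > 0) min(inside) else lo
  as.numeric(difftime(hi, start, units = "days"))
}, numeric(1))

print(df1)
```

The result column is a plain numeric number of days. If you would rather measure from the latest qualifying time, swap min for max.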