FYI, I have simplified the actual problem a lot.

Say I have a matrix whose row and column names are dates:

            2012-08-06  2012-08-13  2012-08-20  2012-08-27
2012-08-06         1.0         1.0         1.0         1.0
2012-08-13         1.0         1.0         1.0         1.0
2012-08-20         1.0         1.0         1.0         1.0
2012-08-27         1.0         1.0         1.0         1.0

I also have a reference date, and I want to pull out the data in that date's row and column that falls on or after the reference date. For example, if the reference date is 2012-08-13, I want this data:

            2012-08-06  2012-08-13  2012-08-20  2012-08-27
2012-08-06
2012-08-13                     1.0         1.0         1.0
2012-08-20                     1.0
2012-08-27                     1.0

I'm currently doing this by handling rows and columns separately, which works fine for me, i.e. using logic like data[rownames(data) >= reference, colnames(data) == reference] to get the column and something similar to get the row.

However, what I want is a reference lookup with multiple dates rather than a single value. For example, if I had two dates:

reference = c("2012-08-13","2012-08-20")

then the values I need to select are:

            2012-08-06  2012-08-13  2012-08-20  2012-08-27
2012-08-06
2012-08-13                     1.0         1.0         1.0
2012-08-20                     1.0         1.0         1.0
2012-08-27                     1.0         1.0

I eventually want to replace the 1.0's with something else wherever they meet these criteria.

Can someone help me with matching row/column names against a vector of lookups? My end goal is to use the cells I have kept as 1.0 to replace those numbers, in the original matrix, with some other calculated value.

Thanks

1 Answer

Here is a go at it. The idea is to perform the operation for each reference: f keeps the row and column named by the reference and sets every other cell to NA. The matrices obtained for each reference are then combined using function g, and finally the rows and columns before the earliest reference are set to NA.

d <- structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), .Dim = c(4L, 
    4L), .Dimnames = list(c("2012-08-06", "2012-08-13", "2012-08-20", 
    "2012-08-27"), c("2012-08-06", "2012-08-13", "2012-08-20", "2012-08-27"
    )))

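f <- function(d, ref) {
    # Keep only the row and column named `ref`; set every other cell to NA.
    # Logical indexing avoids the -which() pitfall when `ref` matches nothing.
    d[rownames(d) != ref, colnames(d) != ref] <- NA
    d
}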

g <- function(d1, d2) {
    # Fill cells that are NA in d1 with the values kept in d2.
    fill <- is.na(d1) & !is.na(d2)
    d1[fill] <- d2[fill]
    d1
}

refs <- sort(c("2012-08-13",  "2012-08-20")) # must be sorted
dlist <- lapply(refs, f, d = d)
res <- Reduce(g, dlist)
res[rownames(res) < min(refs), ] <- NA
res[, colnames(res) < min(refs)] <- NA
res
#            2012-08-06 2012-08-13 2012-08-20 2012-08-27
# 2012-08-06         NA         NA         NA         NA
# 2012-08-13         NA          1          1          1
# 2012-08-20         NA          1          1          1
# 2012-08-27         NA          1          1         NA
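
As a possible alternative, the same selection can be sketched without the per-reference list by building a logical mask directly. This is only a sketch: it assumes the row and column names are ISO-formatted date strings (so plain string comparison orders them correctly) and follows the rule from the question, i.e. for each reference keep its row for columns on or after it, and its column for rows on or after it. The names select_mask, rn and cn are made up for illustration, and the multiplication at the end is a placeholder for whatever calculated value is needed.

```r
dates <- c("2012-08-06", "2012-08-13", "2012-08-20", "2012-08-27")
d <- matrix(1, nrow = 4, ncol = 4, dimnames = list(dates, dates))

# Hypothetical helper: TRUE where a cell lies in a reference row or column
# and is on or after that reference date.
select_mask <- function(d, refs) {
    rn <- rownames(d)
    cn <- colnames(d)
    mask <- matrix(FALSE, nrow(d), ncol(d), dimnames = dimnames(d))
    for (ref in refs) {
        mask <- mask |
            outer(rn == ref, cn >= ref, "&") |  # reference row, columns on/after ref
            outer(rn >= ref, cn == ref, "&")    # reference column, rows on/after ref
    }
    mask
}

refs <- c("2012-08-13", "2012-08-20")
mask <- select_mask(d, refs)

# Same layout as the result above: unselected cells become NA
res <- d
res[!mask] <- NA

# Replace the selected cells in the original matrix (placeholder calculation)
d[mask] <- d[mask] * 10
```

Because each reference is compared against its own date rather than only against the earliest one, this version should also behave sensibly when the references are not adjacent dates. The mask can be reused for any later replacement, e.g. d[mask] <- some_other_matrix[mask].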