2
votes

I'm having trouble using rapply over a nested list. Here's the structure of one element of the list:

$ F01    :List of 7
  ..$ 0:'data.frame':   16 obs. of  3 variables:
  .. ..$ lengths: Factor w/ 8 levels "1","2","4","5",..: 1 2 3 4 5 6 7 8 1 2 ...
  .. ..$ values : Factor w/ 2 levels "C","N": 1 1 1 1 1 1 1 1 2 2 ...
  .. ..$ Freq   : int [1:16] 1 2 0 1 1 1 1 0 1 3 ...
  ..$ 1:'data.frame':   20 obs. of  3 variables:
  .. ..$ lengths: Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
  .. ..$ values : Factor w/ 2 levels "C","N": 1 1 1 1 1 1 1 1 1 1 ...
  .. ..$ Freq   : int [1:20] 0 1 1 1 1 0 1 0 1 1 ...

I can easily apply a function to one element of the list, say F01, with lapply:

 lapply(data$F01,function(x) x[which(x[['values']]=="C"),])

Then I thought of applying it to the whole nested list with rapply:

rapply(data,function(x) x[which(x[['values']]=="C"),],how="list")
Error in `[[.default`(x, "values") : subscript out of bounds

I don't understand why rapply throws this error, since rapply should apply the function recursively to non-list elements, in this case a data.frame. Is there something obvious that I'm missing?

Here's a sample of two complete elements of the main list:

samp <- list(structure(list(`0` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("1", 
    "2", "7", "8", "13", "18"), class = "factor"), values = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
    "N"), class = "factor"), Freq = c(0L, 1L, 1L, 1L, 1L, 0L, 2L, 
    0L, 0L, 0L, 0L, 1L)), .Names = c("lengths", "values", "Freq"), row.names = c(NA, 
    -12L), class = "data.frame"), `1` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L), .Label = c("1", 
    "2", "3", "5", "8", "12"), class = "factor"), values = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
    "N"), class = "factor"), Freq = c(1L, 1L, 0L, 1L, 1L, 1L, 2L, 
    0L, 1L, 1L, 0L, 0L)), .Names = c("lengths", "values", "Freq"), row.names = c(NA, 
    -12L), class = "data.frame"), `2` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("1", 
    "3", "4", "6", "9", "19", "20"), class = "factor"), values = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
    "N"), class = "factor"), Freq = c(1L, 1L, 1L, 1L, 0L, 1L, 0L, 
    0L, 0L, 3L, 0L, 1L, 0L, 2L)), .Names = c("lengths", "values", 
    "Freq"), row.names = c(NA, -14L), class = "data.frame"), `3` = structure(list(
        lengths = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 
        2L, 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("1", "2", "3", "4", 
        "5", "8", "11", "18"), class = "factor"), values = structure(c(1L, 
        1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
        ), .Label = c("C", "N"), class = "factor"), Freq = c(1L, 
        2L, 1L, 1L, 0L, 1L, 1L, 0L, 1L, 2L, 1L, 1L, 1L, 0L, 0L, 1L
        )), .Names = c("lengths", "values", "Freq"), row.names = c(NA, 
    -16L), class = "data.frame"), `4` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("1", 
    "2", "3", "4", "6", "11", "13"), class = "factor"), values = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
    "N"), class = "factor"), Freq = c(0L, 2L, 0L, 1L, 1L, 0L, 2L, 
    1L, 2L, 2L, 0L, 0L, 1L, 0L)), .Names = c("lengths", "values", 
    "Freq"), row.names = c(NA, -14L), class = "data.frame"), `5` = structure(list(
        lengths = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
        1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("1", "2", 
        "4", "5", "6", "7", "8", "11", "23"), class = "factor"), 
        values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
        2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", "N"), class = "factor"), 
        Freq = c(0L, 3L, 1L, 2L, 0L, 1L, 0L, 0L, 1L, 3L, 2L, 0L, 
        0L, 1L, 0L, 1L, 1L, 0L)), .Names = c("lengths", "values", 
    "Freq"), row.names = c(NA, -18L), class = "data.frame"), `6` = structure(list(
        lengths = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 
        10L, 11L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L), .Label = c("1", 
        "2", "3", "4", "5", "6", "9", "12", "13", "21", "36"), class = "factor"), 
        values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
        1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
        "N"), class = "factor"), Freq = c(2L, 2L, 3L, 1L, 2L, 1L, 
        2L, 1L, 0L, 0L, 0L, 2L, 3L, 1L, 4L, 0L, 1L, 0L, 0L, 1L, 1L, 
        1L)), .Names = c("lengths", "values", "Freq"), row.names = c(NA, 
    -22L), class = "data.frame")), .Names = c("0", "1", "2", "3", 
    "4", "5", "6")), structure(list(`0` = structure(list(lengths = structure(c(1L, 
    2L, 3L, 4L, 1L, 2L, 3L, 4L), .Label = c("2", "13", "17", "25"
    ), class = "factor"), values = structure(c(1L, 1L, 1L, 1L, 2L, 
    2L, 2L, 2L), .Label = c("C", "N"), class = "factor"), Freq = c(1L, 
    1L, 0L, 1L, 0L, 0L, 1L, 1L)), .Names = c("lengths", "values", 
    "Freq"), row.names = c(NA, -8L), class = "data.frame"), `1` = structure(list(
        lengths = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 
        4L, 5L, 6L), .Label = c("1", "2", "3", "4", "5", "8"), class = "factor"), 
        values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
        2L, 2L, 2L), .Label = c("C", "N"), class = "factor"), Freq = c(0L, 
        0L, 1L, 2L, 2L, 0L, 1L, 1L, 0L, 1L, 1L, 1L)), .Names = c("lengths", 
    "values", "Freq"), row.names = c(NA, -12L), class = "data.frame"), 
        `2` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("2", 
        "3", "4", "7", "14", "18", "19"), class = "factor"), values = structure(c(1L, 
        1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
        "N"), class = "factor"), Freq = c(1L, 1L, 2L, 0L, 0L, 0L, 
        0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L)), .Names = c("lengths", "values", 
        "Freq"), row.names = c(NA, -14L), class = "data.frame"), 
        `3` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L), .Label = c("2", 
        "3", "5", "8", "9", "10", "19", "76"), class = "factor"), 
            values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
            2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", "N"), class = "factor"), 
            Freq = c(1L, 1L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 
            1L, 1L, 0L, 1L, 1L)), .Names = c("lengths", "values", 
        "Freq"), row.names = c(NA, -16L), class = "data.frame"), 
        `4` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("2", 
        "5", "7", "8", "9", "16", "35"), class = "factor"), values = structure(c(1L, 
        1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
        "N"), class = "factor"), Freq = c(1L, 1L, 2L, 0L, 1L, 0L, 
        0L, 1L, 0L, 0L, 2L, 0L, 1L, 1L)), .Names = c("lengths", "values", 
        "Freq"), row.names = c(NA, -14L), class = "data.frame"), 
        `5` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("1", 
        "2", "3", "5", "6", "10", "11", "14", "27"), class = "factor"), 
            values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
            1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
            "N"), class = "factor"), Freq = c(2L, 2L, 1L, 1L, 1L, 
            1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 1L)), .Names = c("lengths", 
        "values", "Freq"), row.names = c(NA, -18L), class = "data.frame"), 
        `6` = structure(list(lengths = structure(c(1L, 2L, 3L, 4L, 
        5L, 6L, 7L, 8L, 9L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("1", 
        "2", "3", "4", "5", "6", "11", "21", "51"), class = "factor"), 
            values = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
            1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("C", 
            "N"), class = "factor"), Freq = c(2L, 1L, 2L, 2L, 1L, 
            1L, 0L, 0L, 0L, 3L, 0L, 2L, 0L, 1L, 1L, 1L, 1L, 1L)), .Names = c("lengths", 
        "values", "Freq"), row.names = c(NA, -18L), class = "data.frame")), .Names = c("0", 
    "1", "2", "3", "4", "5", "6")))
1
Why is samp not a nested list if that's what your data is? You're not going to be able to use rapply when your list elements are data.frames, because data.frames are lists, and so rapply will traverse the columns. That's why you're getting the error. It's trying to get the 'values' item from each column in your data.frames. – Matthew Plourde
Use lapply in your expression instead of rapply. – eddi
@eddi, I think Chargaff simply posted the wrong sample data. If you look at the str at the top of the OP, it is clearly nested. – Ricardo Saporta
@MatthewPlourde, samp is only the first element of the nested list, so it isn't nested. I could remove it if it's confusing. – Chargaff
@Chargaff, it is not just that it is confusing; that piece of information is crucial. The entire issue here is the depth of the list. Posting a child without explaining that it is a child doesn't really help anyone assist you. – Ricardo Saporta

1 Answer

3
votes

I don't believe you actually want to use rapply here, as you do not seem to want total recursion. That is, you are not trying to apply a function to lengths and then to values, etc.
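To see that total recursion in action, here is a minimal sketch on a made-up list (the object `toy` and its contents are hypothetical, not the asker's data). It records the class of every object that rapply passes to the function. Because a data.frame is itself a list, rapply should keep descending and hand over the individual columns, never the data.frame as a whole:

```r
# Hypothetical toy data: a list containing a list containing one data.frame
toy <- list(
  F01 = list(
    `0` = data.frame(values = c("C", "N"), Freq = c(1L, 2L),
                     stringsAsFactors = FALSE)
  )
)

# Ask rapply to report the class of each object it visits
seen <- rapply(toy, function(x) class(x)[1], how = "unlist")
print(seen)

# The function is called once per column, so "data.frame" never shows up
print(any(seen == "data.frame"))
```

This is why `x[['values']]` fails inside rapply: at that point `x` is a single column, not a data.frame with a `values` column.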

Instead, simply try two nested lapply calls:

 lapply(data, lapply, function(x) x[which(x[['values']]=="C"),])
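
As a rough illustration, the nested lapply pattern can be tried on a small made-up two-level list. The names `toy` and `keep_c` and all the values are assumptions for this sketch only; the real data would be the full nested list from the question:

```r
# Hypothetical two-level list: outer elements -> inner elements -> data.frames
toy <- list(
  F01 = list(
    `0` = data.frame(values = c("C", "C", "N"), Freq = c(1L, 0L, 2L),
                     stringsAsFactors = FALSE),
    `1` = data.frame(values = c("N", "C"), Freq = c(3L, 1L),
                     stringsAsFactors = FALSE)
  ),
  F02 = list(
    `0` = data.frame(values = c("N", "N"), Freq = c(1L, 1L),
                     stringsAsFactors = FALSE)
  )
)

# Keep only the rows whose 'values' column equals "C"
keep_c <- function(x) x[which(x[["values"]] == "C"), ]

# The outer lapply walks F01, F02, ...; the inner lapply walks each data.frame
res <- lapply(toy, lapply, keep_c)
str(res)
```

The outer call passes `lapply` itself as the function, with `keep_c` forwarded as its argument, so each data.frame is handed to `keep_c` whole. This only fits a list that is exactly two levels deep; a deeper or irregular structure would need a different approach.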