
After much experimenting and googling... and subsequent experimenting again, I am finally resorting to asking my first question on StackOverflow :)

I have a data.frame, and want to apply a custom function, expandBases, to each row of the data.frame. expandBases returns a data.frame consisting of 1 or more rows (and this will vary depending on the data supplied to it). expandBases actually returns more columns than in the toy example below -- but for the sake of illustration:

dat <- structure(list(id = structure(1:3, .Label = c("a", "b", "c"), class = "factor"),
startpos = c(1, 2, 3), len = c(1, 2, 3)), .Names = c("id",
"startpos", "len"), row.names = c(NA, -3L), class = "data.frame")


expandBases <- function(startpos, len)
{
    # '=' names the column cy; '<-' inside data.frame() would assign a variable instead
    return(data.frame(cy = startpos + 0:(len - 1)))
}

I would like the id factor to be replicated for each row of the returned data.frame. I've been told to use lapply + do.call(rbind). I was wondering if there is a plyr-based solution to this?

Thanks in advance.

There is. I know my answer to the immediately prior question, stackoverflow.com/questions/11895796/…, does this, and maybe @csgillespie's does as well. -- IRTFM

1 Answer


I have to guess slightly at exactly what you want, but here is how to go about using base R (do.call + lapply) as well as plyr:

The helper function that creates the data frame:

expandBases <- function(x){
  # x is a single row of dat; with() exposes its columns by name
  with(x,
    data.frame(
      id = rep(id, len-1),
      cy = startpos + seq_len(len-1)
    )
  )
}

Using base R:

do.call(rbind, lapply(seq_len(nrow(dat)), function(i) expandBases(dat[i, ])))
  id cy
1  b  3
2  c  4
3  c  5
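
As an aside, here is a base-R sketch that keeps the two-argument signature from the question instead of passing a whole row. It assumes the column is named with `cy =`, and it walks the columns in parallel with `Map()`. The names `expandBasesQ`, `pieces` and `res` are illustrative only and not from the original post. Unlike the helper above, this version keeps all `len` positions per row.

```r
# Toy data from the question
dat <- data.frame(id = factor(c("a", "b", "c")),
                  startpos = c(1, 2, 3),
                  len = c(1, 2, 3))

# Two-argument helper in the spirit of the question (hypothetical name)
expandBasesQ <- function(startpos, len) {
  data.frame(cy = startpos + seq_len(len) - 1)
}

# Walk startpos, len and id in parallel; cbind() recycles the id
# across the rows returned for that input row
pieces <- Map(function(startpos, len, id) {
  cbind(id = id, expandBasesQ(startpos, len))
}, dat$startpos, dat$len, as.character(dat$id))

# Stack the per-row pieces into one data.frame
res <- do.call(rbind, pieces)
rownames(res) <- NULL
res
```

Passing the id as a character vector is a choice made for this sketch; convert `res$id` back with `factor()` if the factor type matters downstream.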

Using plyr:

library(plyr)
adply(dat, 1, expandBases)[-(1:2)]
  id cy
1  b  3
2  c  4
3  c  5

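Note that I implemented the function as I understood your question, which means one row per id always goes missing (id a, with len = 1, drops out entirely). I suspect that's not quite what you wanted. If so, use rep(id, len) and startpos + seq_len(len) - 1 in the helper to keep every position.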
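
If every position should be kept, a variant of the helper could look like the sketch below. It is only a guess at the intended behaviour: `expandBasesAll` is a hypothetical name, and the toy data is rebuilt here so the snippet runs on its own.

```r
# Toy data from the question
dat <- data.frame(id = factor(c("a", "b", "c")),
                  startpos = c(1, 2, 3),
                  len = c(1, 2, 3))

# Hypothetical variant of the helper: keeps all `len` positions,
# starting at startpos itself
expandBasesAll <- function(x) {
  with(x,
    data.frame(
      id = rep(id, len),
      cy = startpos + seq_len(len) - 1
    )
  )
}

# Same do.call + lapply pattern as in the base R approach
res <- do.call(rbind, lapply(seq_len(nrow(dat)), function(i) expandBasesAll(dat[i, ])))
res
```

The same helper should also drop into the `adply(dat, 1, ...)` call in place of `expandBases`, although that has not been checked here.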