1
votes

I have a list of data frames and would like to use the R package caret to add a new column, folds, to each data frame. So far I have tried writing a custom function and applying it to my list with map. Any suggestions would be appreciated.

library(caret)
library(purrr)  # provides map()

one = airquality[1:10,]
two = airquality[11:20,]
listdf <- list(one, two)

foldfunc <- function(x) {
  folds <- createFolds(1:nrow(x), k=10,list = F)
  x$folds <- folds
}

map(listdf, foldfunc)

2 Answers

1
votes

You just need to make your function return the data frame:

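foldfunc <- function(x) {
  # Assign each row to a fold; list = FALSE gives an integer vector of fold ids
  folds <- createFolds(seq_len(nrow(x)), k = 10, list = FALSE)
  x$folds <- folds
  # Return the modified data frame, not the value of the assignment above
  return(x)
}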

In your code, the function returns the folds vector. Without an explicit return(), an R function returns the value of the last expression it evaluates. Here that expression is the assignment x$folds <- folds, whose value is the assigned vector (returned invisibly), which is why you were getting numeric vectors (the folds calculated by createFolds) instead of data frames.
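To see the return-value behaviour in isolation, here is a minimal base R sketch that does not depend on caret. The helper names add_col_no_return and add_col_with_return and the small data frame are hypothetical, made up only for this illustration.

```r
# Hypothetical helpers, for illustration only (not part of caret or purrr).

# Last expression is an assignment, so the function returns the assigned value.
add_col_no_return <- function(df) {
  df$flag <- seq_len(nrow(df))
}

# Last expression is the data frame itself, so that is what gets returned.
add_col_with_return <- function(df) {
  df$flag <- seq_len(nrow(df))
  df
}

small <- data.frame(a = c(10, 20, 30))

res1 <- add_col_no_return(small)
res2 <- add_col_with_return(small)

is.data.frame(res1)  # FALSE: res1 is the integer vector 1:3
is.data.frame(res2)  # TRUE: res2 has columns a and flag
```

Because the value of an assignment is invisible, calling add_col_no_return(small) at the console prints nothing, which can make the problem easy to miss until the result is stored or printed.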

If you try print(foldfunc(listdf[[1]])) with your function, you will see this:

print(foldfunc(listdf[[1]]))
# [1]  1  2  3  4  5  6  7  8  9 10

With the new version, each call returns the data frame with the folds column added.
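Putting it together, the corrected function can be applied to the whole list roughly as follows. This is a sketch that assumes caret and purrr are installed; the name result is introduced here only for illustration.

```r
library(caret)
library(purrr)

# Same example data as in the question
one <- airquality[1:10, ]
two <- airquality[11:20, ]
listdf <- list(one, two)

foldfunc <- function(x) {
  x$folds <- createFolds(seq_len(nrow(x)), k = 10, list = FALSE)
  x  # return the data frame, not the value of the assignment
}

# map() returns a list with one modified data frame per input element
result <- map(listdf, foldfunc)

# Inspect the first data frame to confirm the new column is present
head(result[[1]])
```

Base R's lapply(listdf, foldfunc) should give the same list if you prefer not to load purrr.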

1
votes

Using the tidyverse:

library(dplyr)
library(purrr)
listdf <- map(listdf, ~ .x %>%
  mutate(folds = createFolds(row_number(), k = 10, list = FALSE)))