2
votes

I want to apply a function element-wise to a list of data frames. I can apply a simple function, but not a more complex one, because I am not sure how to refer to the arguments.

I am able to do the following action on a data frame:

df1 <- data.frame(
  A = c(1, 2),
  B = c(1, 3)
)
centered <- apply(df1, 2, function(x) x - mean(x)) 
scaled <- apply(centered, 2, function(x) x/sqrt(sd(x)))

Then I create a list of two data frames (they will have the same number of rows but a different number of columns):

df1 <- data.frame(
  A = c(1, 2),
  B = c(1, 3)
)
df2 <- data.frame(
  A = c(1, 2, 3, 4),
  B = c(1, 2, 3, 4)
)
l <- list(df1, df2)

I have learned that mapply seems to do what I want, but how do I apply the actions from above with it? Here is a mapply call with a simple function(x, y). Instead of that function, I would like to apply the centering and scaling steps from above:

l_output <- mapply(function(x, y) x * y, x = 2, y = l, SIMPLIFY = FALSE)
What type of structure do you want for your output? - camille
I want a list of data frames, same as the input but with a scaling function applied. Ronak's answer works! - micecanfly

2 Answers

1
votes

Use lapply to apply the same functions to each data frame in the list. This version does the centering and the scaling in a single step.

lapply(l, function(y) apply(y, 2, function(x) {
  x <- x - mean(x)
  x / sqrt(sd(x))
}))

#[[1]]
#              A          B
#[1,] -0.5946036 -0.8408964
#[2,]  0.5946036  0.8408964

#[[2]]
#              A          B
#[1,] -1.3201676 -1.3201676
#[2,] -0.4400559 -0.4400559
#[3,]  0.4400559  0.4400559
#[4,]  1.3201676  1.3201676

If you want the centered and scaled results separately:

centered <- lapply(l, function(y) apply(y, 2, function(x) x - mean(x)))
scaled <- lapply(centered, function(y) apply(y, 2, function(x) x/sqrt(sd(x))))
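
Note that apply() returns a matrix for each element, which is why the printed rows are labelled [1,], [2,]. If a list of data frames is needed, as the comment on the question asks for, a minimal base R sketch could look like the following. It assumes every column is numeric. The names center_and_scale and l_df are chosen here for illustration only.

```r
# Example data copied from the question
df1 <- data.frame(A = c(1, 2), B = c(1, 3))
df2 <- data.frame(A = c(1, 2, 3, 4), B = c(1, 2, 3, 4))
l <- list(df1, df2)

# Same two steps as above: centre, then divide by sqrt(sd)
center_and_scale <- function(x) {
  x <- x - mean(x)
  x / sqrt(sd(x))
}

l_df <- lapply(l, function(d) {
  # Assigning into d[] replaces the columns but keeps d a data frame,
  # with its original column names
  d[] <- lapply(d, center_and_scale)
  d
})

# Check the class of each element of the result
sapply(l_df, class)
```

The inner lapply works column by column because a data frame is a list of columns, so no conversion to a matrix takes place.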
0
votes

One option is with purrr::map to iterate over the data frames and dplyr::mutate_all to apply a function to all columns in each data frame.

purrr::map(l, function(d) {
  dplyr::mutate_all(d, function(x) {
    x <- x - mean(x)
    x / sqrt( sd(x) )
  })
})
#> [[1]]
#>            A          B
#> 1 -0.5946036 -0.8408964
#> 2  0.5946036  0.8408964
#> 
#> [[2]]
#>            A          B
#> 1 -1.3201676 -1.3201676
#> 2 -0.4400559 -0.4400559
#> 3  0.4400559  0.4400559
#> 4  1.3201676  1.3201676

Or, if you define that function separately first, the call fits on one line:

center_and_scale <- function(x) {
  x <- x - mean(x)
  x / sqrt( sd(x) )
}

purrr::map(l, dplyr::mutate_all, center_and_scale)
# same output
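
In dplyr 1.0.0 and later, mutate_all is superseded by across(). A sketch of the same idea with across() might look like this, assuming dplyr is installed. The requireNamespace guard and the name l_across are additions for illustration, not part of the answer above.

```r
# Example data copied from the question
df1 <- data.frame(A = c(1, 2), B = c(1, 3))
df2 <- data.frame(A = c(1, 2, 3, 4), B = c(1, 2, 3, 4))
l <- list(df1, df2)

center_and_scale <- function(x) {
  x <- x - mean(x)
  x / sqrt(sd(x))
}

# Assumption: dplyr >= 1.0.0 is available; the block is skipped otherwise
if (requireNamespace("dplyr", quietly = TRUE)) {
  l_across <- lapply(l, function(d) {
    # across(everything(), f) applies f to every column of d
    dplyr::mutate(d, dplyr::across(dplyr::everything(), center_and_scale))
  })
  str(l_across)
}
```

Base lapply is used for the outer loop here only to keep the dependencies to one package; purrr::map would work the same way.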