0
votes

I am having a bit of difficulty understanding how to pass a column from a nested tibble into a function argument. As an example, the following code returns the mean of "cyl" grouped by "am":

library(dplyr)
library(tidyr)
library(purrr)

test <- mtcars %>% 
  group_by(am) %>% 
  nest()

get_mean <- function (df) {
  return (mean(df$cyl))
}

test <- test %>% 
  mutate(mean = map_dbl(data, get_mean))

But what if I wanted the mean of a column other than cyl, and wanted to pass that column into the function as an argument? I know this code is wrong, but I would try to write something like this:

test <- mtcars %>% 
  group_by(am) %>% 
  nest()

get_mean <- function (df, column) {
  return (mean(df${{column}}))
}

test <- test %>% 
  mutate(mean = map_dbl(data, get_mean, column))

Any help around this would be appreciated. How would I get column into the map function and how am I supposed to write df${{column}}?

2 Answers

0
votes

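This should do what you want. You can extract a column from a data frame by a name stored in a variable (as a string) with the [[ ]] operator instead of $. Any extra arguments given to map_dbl() after the function, here the column name, are passed on to get_mean().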

library(purrr)
library(dplyr)
library(tidyr)

nest_data <- mtcars %>% 
  group_by(am) %>% 
  nest()

get_mean <- function (df, column) {
  return (mean(df[[column]]))
}

test_cyl <- nest_data %>% 
  mutate(mean = map_dbl(data, get_mean, "cyl"))
    
test_mpg <- nest_data %>% 
  mutate(mean = map_dbl(data, get_mean, "mpg"))
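
To see why [[ ]] works where $ does not, here is a minimal base R sketch. The toy data frame and the variable names (df, via_dollar, via_brackets, group_means) are made up for illustration, and sapply() only stands in for map_dbl() to show how the extra argument is forwarded.

```r
# Toy data frame; `column` holds the NAME of the column we want, as a string
df <- data.frame(cyl = c(4, 6, 8), mpg = c(30, 20, 15))
column <- "mpg"

# `$` takes the name literally, so it looks for a column called "column"
via_dollar <- df$column        # NULL: there is no such column

# `[[` evaluates its argument first, so it looks up the column named "mpg"
via_brackets <- df[[column]]   # the mpg values: 30, 20, 15

# Same helper as in the answer above
get_mean <- function(df, column) {
  mean(df[[column]])
}

# Extra arguments after the function are forwarded to it, much like
# map_dbl(data, get_mean, "cyl") forwards "cyl" to get_mean()
group_means <- sapply(list(df, df[1:2, ]), get_mean, "cyl")
```

With purrr, an equivalent and slightly more explicit spelling of the call in the answer would be map_dbl(data, function(d) get_mean(d, "cyl")).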
0
votes

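You can also pass the column name unquoted and forward it with {{ }}, which works inside tidy-evaluation functions such as select():

library(dplyr)
library(purrr)

get_mean <- function (df, column) {
  df %>%
    # keep only the requested column, then average its values
    select({{column}}) %>%
    unlist %>% mean
}

test %>%  ungroup %>% mutate(mean = map_dbl(data, get_mean, cyl))

#     am data                mean
#  <dbl> <list>             <dbl>
#1     1 <tibble [13 × 10]>  5.08
#2     0 <tibble [19 × 10]>  6.95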
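
On the second part of the question: {{ }} is only understood by tidy-evaluation functions such as select() or pull(), so df${{column}} is not valid R. The following is a minimal sketch of the unquoted-name approach using pull(). The helper name get_mean_of and the result column mean_cyl are hypothetical, chosen here for illustration, and the column is passed through an explicit anonymous function to make the forwarding visible.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Hypothetical helper: `column` is an unquoted column name.
# {{ column }} forwards it to pull(), which extracts that column as a vector.
get_mean_of <- function(df, column) {
  df %>%
    pull({{ column }}) %>%
    mean()
}

# One row per value of `am`, with the remaining columns nested in `data`
nested <- mtcars %>%
  group_by(am) %>%
  nest() %>%
  ungroup()

# Apply the helper to each nested tibble, naming the column without quotes
result <- nested %>%
  mutate(mean_cyl = map_dbl(data, function(d) get_mean_of(d, cyl)))
```

The string-based [[ ]] version is usually simpler when the column name arrives in a variable; the {{ }} version fits better when callers are expected to type the column name directly, as they would in select() or mutate().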