
If I have a function defined using rlang, how can I use purrr::map to apply it to several variables?

Suppose I have a function defined as:

mean_by <- function(data, by, var) {
  data %>%
    group_by({{ by }}) %>%
    summarise(avg = mean({{ var }}, na.rm = TRUE))
}

This computes group means.

How could I apply this function to several "by" variables but a single "var" in a data frame, preferably with purrr::map?


2 Answers

3
votes

You need either group_by_at() or the !!! operator. In both cases, pass the grouping variables wrapped in vars().

library(tidyverse)


# `by` is a list of quosures built with vars(); group_by_at() accepts it directly
mean_by <- function(data, by, var) {
  data %>%
    group_by_at(by) %>%
    summarise(avg = {{ var }} %>% mean(na.rm = TRUE))
}


mtcars %>% 
  mean_by(by = vars(mpg,cyl),hp)
#> # A tibble: 27 x 3
#> # Groups:   mpg [25]
#>      mpg   cyl   avg
#>    <dbl> <dbl> <dbl>
#>  1  10.4     8   210
#>  2  13.3     8   245
#>  3  14.3     8   245
#>  4  14.7     8   230
#>  5  15       8   335
#>  6  15.2     8   165
#>  7  15.5     8   150
#>  8  15.8     8   264
#>  9  16.4     8   180
#> 10  17.3     8   180
#> # … with 17 more rows


# or


# `by` is again built with vars(); !!! splices its elements into group_by()
mean_by <- function(data, by, var) {
  data %>%
    group_by(!!!by) %>%
    summarise(avg = {{ var }} %>% mean(na.rm = TRUE))
}


mtcars %>% 
  mean_by(by = vars(cyl,disp),hp)
#> # A tibble: 27 x 3
#> # Groups:   cyl [3]
#>      cyl  disp   avg
#>    <dbl> <dbl> <dbl>
#>  1     4  71.1    65
#>  2     4  75.7    52
#>  3     4  78.7    66
#>  4     4  79      66
#>  5     4  95.1   113
#>  6     4 108      93
#>  7     4 120.     97
#>  8     4 120.     91
#>  9     4 121     109
#> 10     4 141.     95
#> # … with 17 more rows

Created on 2020-01-07 by the reprex package (v0.3.0)
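
If you would rather pass the grouping columns as plain strings, here is a minimal sketch of the same idea. It assumes dplyr 1.0.0 or later, where across() and the .groups argument are available and the scoped verbs such as group_by_at() are superseded. The name mean_by_chr is made up for illustration and is not part of the code above.

```r
library(dplyr)

# Hypothetical variant: `by` is a character vector of column names.
# across(all_of(by)) selects those columns as grouping variables.
mean_by_chr <- function(data, by, var) {
  data %>%
    group_by(across(all_of(by))) %>%
    summarise(avg = mean({{ var }}, na.rm = TRUE), .groups = "drop")
}

res <- mean_by_chr(mtcars, c("cyl", "disp"), hp)
```

Here all_of() raises an error if one of the names is not a column of the data, and .groups = "drop" returns an ungrouped tibble.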

0
votes

A good alternative is to "pass the dots".

After data, the first argument is the single variable you want to summarise; use ... to pass the grouping variables, if any.

This way you have a cleaner syntax for your function and you avoid including the vars function.

library(tidyverse)


# Grouping variables arrive through ... and are forwarded to group_by() as-is
mean_by <- function(data, var, ...) {
  data %>%
    group_by(...) %>%
    summarise(avg = {{ var }} %>% mean(na.rm = TRUE))
}


mtcars %>% 
  mean_by(hp, cyl, disp)
#> # A tibble: 27 x 3
#> # Groups:   cyl [3]
#>      cyl  disp   avg
#>    <dbl> <dbl> <dbl>
#>  1     4  71.1    65
#>  2     4  75.7    52
#>  3     4  78.7    66
#>  4     4  79      66
#>  5     4  95.1   113
#>  6     4 108      93
#>  7     4 120.     97
#>  8     4 120.     91
#>  9     4 121     109
#> 10     4 141.     95
#> # ... with 17 more rows


mtcars %>% 
  mean_by(hp)
#> # A tibble: 1 x 1
#>     avg
#>   <dbl>
#> 1  147.

Created on 2020-01-08 by the reprex package (v0.3.0)
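
To get back to the purrr::map part of the question, one possible sketch is to map over a character vector of column names and turn each name into a symbol with rlang::sym() before unquoting it with !!. The vector by_vars and the choice of columns are assumptions made for this example only. The same !!sym() call should also work with the original `{{ by }}` version of the function.

```r
library(dplyr)
library(purrr)
library(rlang)

# Same "pass the dots" function as above
mean_by <- function(data, var, ...) {
  data %>%
    group_by(...) %>%
    summarise(avg = mean({{ var }}, na.rm = TRUE))
}

# Hypothetical set of grouping columns, one summary per column
by_vars <- c("cyl", "gear", "am")

results <- by_vars %>%
  set_names() %>%                                   # name each element after itself
  map(function(nm) mean_by(mtcars, hp, !!sym(nm)))  # string -> symbol -> unquote
```

results is a named list with one tibble per grouping variable, e.g. results$cyl holds the mean hp for each value of cyl.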