I have been reading posts about passing arguments to dplyr functions inside a custom function, but I could not solve the following situation:

I have created the following function to get a subset of a dataframe.

library(Lahman)

top_leaders <- function(df, metric, n) {
    # metric is the name of the Batting column I would like to analyze
    # n is the number of top players to return for that metric

    stat_leader <- enquo(metric)

    df %>%
      dplyr::select(playerID, !!stat_leader) %>%
      dplyr::top_n(n)
}

This function works well, returning the top n players for that stat. For example:

> top_leaders(Lahman::Batting, "R", 5)
Selecting by R
   playerID   R
1 oneilti01 167
2 brownto01 177
3 hamilbi01 198
4  ruthba01 177
5 gehrilo01 167

However, I want the result to be ordered, so I added the arrange function to sort it by the stat.

top_leaders <- function(df, metric, n) {
    stat_leader <- enquo(metric)

    df %>%
      dplyr::select(playerID, !!stat_leader) %>%
      dplyr::top_n(n) %>%
      dplyr::arrange(desc(!!stat_leader))
}

But it gives the following error:

Selecting by R
 Error: incorrect size (1) at position 1, expecting : 5 

I then tried arrange_(desc(!!stat_leader)) instead, which gives a different error:

Selecting by R
 Error: Quosures can only be unquoted within a quasiquotation context.

  # Bad:
  list(!!myquosure)

  # Good:
  dplyr::mutate(data, !!myquosure)

So I have no idea how to solve this.

2
Does it work when you call it with bare column names, i.e. R instead of "R"? That's the convention in these dplyr-based functions. – camille

2 Answers

2
votes

Take advantage of rlang's new curly-curly ({{ }}) notation:

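library(dplyr)  # provides %>%, as_tibble(), desc() and the verbs used below

top_leaders <- function(df, playerID, metric, n) {
  # {{ }} captures each bare column name and unquotes it in one step
  df %>%
    dplyr::select({{ playerID }}, {{ metric }}) %>%
    dplyr::top_n(n) %>%
    dplyr::arrange(desc({{ metric }}))
}

top_leaders(as_tibble(Lahman::Batting), playerID, R, 5)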

#Selecting by R
## A tibble: 5 x 2
#  playerID      R
#  <chr>     <int>
#1 hamilbi01   198
#2 brownto01   177
#3 ruthba01    177
#4 oneilti01   167
#5 gehrilo01   167

You will need to pass playerID to the function, too, but that's a minor alteration.
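
If you would rather keep the original three-argument signature, one option is to leave playerID hard-coded and embrace only metric. This is just a sketch: the toy tibble and its values are made up for illustration (it is not Lahman data), and passing the weight column to top_n() explicitly is my own choice so that it does not depend on column order.

```r
library(dplyr)

# Sketch: playerID stays fixed inside the function, only `metric` is embraced.
top_leaders <- function(df, metric, n) {
  df %>%
    select(playerID, {{ metric }}) %>%
    top_n(n, {{ metric }}) %>%        # rank explicitly by the chosen column
    arrange(desc({{ metric }}))
}

# Hypothetical toy data, not taken from Lahman::Batting
toy <- tibble(
  playerID = c("a", "b", "c", "d"),
  R        = c(10, 40, 20, 30)
)

res <- top_leaders(toy, R, 2)
res  # players "b" (40) and "d" (30), in that order
```

Note that this version is meant to be called with a bare column name (R, not "R").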

1
votes

Because the column name is passed as a string, we need to convert it to a symbol with ensym() instead of capturing it with enquo().

top_leaders <- function(df, metric, n) {
    stat_leader <- ensym(metric)

    df %>%
      dplyr::select(playerID, !!stat_leader) %>%
      dplyr::top_n(n) %>%
      dplyr::arrange(desc(!!stat_leader))
}

top_leaders(Lahman::Batting, "R", 5)
#Selecting by R
#   playerID   R
#1 hamilbi01 198
#2 brownto01 177
#3  ruthba01 177
#4 oneilti01 167
#5 gehrilo01 167

It also works if we pass an unquoted column name:

top_leaders(Lahman::Batting, R, 5)
#Selecting by R
#   playerID   R
#1 hamilbi01 198
#2 brownto01 177
#3  ruthba01 177
#4 oneilti01 167
#5 gehrilo01 167

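The OP's function, by contrast, works only with an unquoted argument. When a string is passed, enquo() captures the string "R" itself rather than a reference to the column, so desc("R") is evaluated on a single constant of length 1 instead of on the 5-row column, which is what the size error in arrange() is complaining about.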
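
To see the difference between the two capture functions on a string argument, here is a small sketch that uses only rlang. The helper names capture_enquo and capture_ensym are hypothetical, made up for this illustration.

```r
library(rlang)

# Hypothetical helpers that just return what each function captures
capture_enquo <- function(metric) enquo(metric)
capture_ensym <- function(metric) ensym(metric)

q <- capture_enquo("R")
s <- capture_ensym("R")

is.character(quo_get_expr(q))   # TRUE: the quosure still wraps the string "R"
is.symbol(s)                    # TRUE: ensym() turned the string into the symbol R
identical(s, capture_ensym(R))  # TRUE: quoted and bare names give the same symbol
```

So with ensym() the unquoted result refers to the column R in both calling styles, whereas with enquo() a quoted argument stays a plain string.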