41
votes

I want to pass arrange() {dplyr} a vector of variable names to sort on. Usually I just type in the variables I want, but I'm trying to make a function where the sorting variables can be input as a function parameter.

df <- structure(list(var1 = c(1L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 4L
  ), var2 = structure(c(10L, 1L, 8L, 3L, 5L, 4L, 7L, 9L, 2L, 6L
  ), .Label = c("b", "c", "f", "h", "i", "o", "s", "t", "w", "x"
  ), class = "factor"), var3 = c(7L, 5L, 5L, 8L, 5L, 8L, 6L, 7L, 
  5L, 8L), var4 = structure(c(8L, 5L, 1L, 4L, 7L, 4L, 3L, 6L, 9L, 
  2L), .Label = c("b", "c", "d", "e", "f", "h", "i", "w", "y"), 
  class = "factor")), .Names = c("var1", "var2", "var3", "var4"), 
  row.names = c(NA, -10L), class = "data.frame")

# this is the normal way to arrange df with dplyr
df %>% arrange(var3, var4)

# but none of these (below) work for passing a vector of variables
vector_of_vars <- c("var3", "var4")
df %>% arrange(vector_of_vars)
df %>% arrange(get(vector_of_vars))
df %>% arrange(eval(parse(text = paste(vector_of_vars, collapse = ", "))))
6
Imo, use of %>% should be saved for chaining, as it's pretty ugly... (for single actions <- or = works just fine...Kevin

6 Answers

36
votes

Hadley hasn't made this obvious in the help file--only in his NSE vignette. The versions of the functions followed by underscores use standard evaluation, so you pass them vectors of strings and the like.

If I understand your problem correctly, you can just replace arrange() with arrange_() and it will work.

Specifically, pass the vector of strings as the .dots argument when you do it.

> df %>% arrange_(.dots=c("var1","var3"))
   var1 var2 var3 var4
1     1    i    5    i
2     1    x    7    w
3     1    h    8    e
4     2    b    5    f
5     2    t    5    b
6     2    w    7    h
7     3    s    6    d
8     3    f    8    e
9     4    c    5    y
10    4    o    8    c

========== Update March 2018 ==============

Using the standard evaluation versions in dplyr as I have shown here is now considered deprecated. You can read Hadley's programming vignette for the new way. Basically you will use !! to unquote one variable or !!! to unquote a vector of variables inside of arrange().

When you pass those columns, if they are bare, quote them using quo() for one variable or quos() for a vector. Don't use quotation marks. See the answer by Akrun.

If your columns are already strings, then make them names using rlang::sym() for a single column or rlang::syms() for a vector. See the answer by Christos. You can also use as.name() for a single column. Unfortunately as of this writing, the information on how to use rlang::sym() has not yet made it into the vignette I link to above (eventually it will be in the section on "variadic quasiquotation" according to his draft).

19
votes

In the new version (soon to be released 0.6.0 of dplyr) we can make use of the quosures

library(dplyr)
vector_of_vars <- quos(var1, var3)
df %>%
    arrange(!!! vector_of_vars)
#   var1 var2 var3 var4
#1     1    i    5    i
#2     1    x    7    w
#3     1    h    8    e
#4     2    b    5    f
#5     2    t    5    b
#6     2    w    7    h
#7     3    s    6    d
#8     3    f    8    e
#9     4    c    5    y
#10    4    o    8    c

When there are more than one variable, we use quos and for a single variable it is quo. The quos will return a list of quoted variables and inside arrange, we unquote the list using !!! for evaluation

18
votes

In the quosures spirit:

df %>% arrange(!!! rlang::syms(c("var1", "var3")))

For single variable, it would look like:

df %>% arrange(!! rlang::sym(c("var1")))
10
votes

I think now you can just use dplyr::arrange_at().

library(dplyr)

### original
head(iris)
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          5.1         3.5          1.4         0.2  setosa
# 2          4.9         3.0          1.4         0.2  setosa
# 3          4.7         3.2          1.3         0.2  setosa
# 4          4.6         3.1          1.5         0.2  setosa
# 5          5.0         3.6          1.4         0.2  setosa
# 6          5.4         3.9          1.7         0.4  setosa

### arranged
iris %>% 
  arrange_at(c("Sepal.Length", "Sepal.Width")) %>% 
  head()
#   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
# 1          4.3         3.0          1.1         0.1  setosa
# 2          4.4         2.9          1.4         0.2  setosa
# 3          4.4         3.0          1.3         0.2  setosa
# 4          4.4         3.2          1.3         0.2  setosa
# 5          4.5         2.3          1.3         0.3  setosa
# 6          4.6         3.1          1.5         0.2  setosa
3
votes

Try this:

df %>% do(do.call(arrange_, . %>% list(.dots = vector_of_vars)))

and actually this can be written more simply as:

df %>% arrange_(.dots = vector_of_vars)

although at this point I think its the same as farnsy's implied solution.

0
votes

It's a little dense, but I think the best approach now is to use across() along with a tidyselect function, e.g. all_of():

df <- structure(list(var1 = c(1L, 2L, 2L, 3L, 1L, 1L, 3L, 2L, 4L, 4L
  ), var2 = structure(c(10L, 1L, 8L, 3L, 5L, 4L, 7L, 9L, 2L, 6L
  ), .Label = c("b", "c", "f", "h", "i", "o", "s", "t", "w", "x"
  ), class = "factor"), var3 = c(7L, 5L, 5L, 8L, 5L, 8L, 6L, 7L, 
  5L, 8L), var4 = structure(c(8L, 5L, 1L, 4L, 7L, 4L, 3L, 6L, 9L, 
  2L), .Label = c("b", "c", "d", "e", "f", "h", "i", "w", "y"), 
  class = "factor")), .Names = c("var1", "var2", "var3", "var4"), 
  row.names = c(NA, -10L), class = "data.frame")

vector_of_vars <- c("var3", "var4")

df %>% arrange(across(all_of(vector_of_vars)))