4
votes

How can I use variables that hold column names as strings in place of the column names themselves in dplyr? As an example, say I want to add a column called sum to the iris dataset that is the sum of Sepal.Length and Sepal.Width. In short, I want a working version of the code below.

x = "Sepal.Length"
y = "Sepal.Width"
head(iris%>% mutate(sum = x+y))

Currently, running the code outputs "Evaluation error: non-numeric argument to binary operator" as R evaluates x and y as character vectors. How do I instead get R to evaluate x and y as column names of the dataframe? I know that the answer is to use some form of lazy evaluation, but I'm having trouble figuring out exactly how to configure it.

Note that the proposed duplicate, "dplyr - mutate: use dynamic variable names", does not address this issue. The duplicate answers this question:

Not my question: How do I do:

var = "sum"
head(iris %>% mutate(var = Sepal.Length + Sepal.Width))

So that you don't have to find your answer in another castle: to evaluate character vectors as column names in a dplyr function, use !!as.name(x). Here: head(iris %>% mutate(sum = !!as.name(x) + !!as.name(y))) – De Novo

Thanks Dan! When I run your code, I get the following error message: Error in !as.name(y) : invalid argument type. So it seems like R isn't correctly evaluating + !!as.name(y). – Ajjit Narayanan

If it's not running, you should check packageVersion("dplyr"). You need 0.7 or higher. – De Novo

My dplyr version is 0.7.3. It's weird, because if I run head(iris %>% mutate(sum = !!as.name(x))) everything works fine and a new column named sum is returned (with values equal to Sepal.Length). But when I add + !!as.name(y) to the command (i.e. run your command), I get the above error. So R seems to have a problem specifically with processing the second !!. Does it work locally on your computer? If so, it might just be a problem with my R session. – Ajjit Narayanan

Yes, with the following variables assigned in the global environment: x <- "Sepal.Length"; y <- "Sepal.Width". – De Novo

2 Answers

2
votes

I think the recommended way is to use sym():

iris %>% mutate(sum = !!sym(x) + !!sym(y)) %>% head
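
A slightly more defensive sketch of the same idea, assuming dplyr 0.7 or later with rlang installed alongside it. Each !! is wrapped in parentheses because, on older rlang releases, !!a + !!b could be parsed as !!(a + !!b); that may be what produced the "invalid argument type" error reported in the comments, though this diagnosis is an assumption and is not confirmed in the thread. The .data pronoun is shown as an alternative that needs no unquoting. The names with_sym and with_pronoun are only for illustration.

```r
library(dplyr)

x <- "Sepal.Length"
y <- "Sepal.Width"

# Turn each string into a symbol and unquote it; the parentheses keep
# each !! bound to its own symbol regardless of operator precedence.
with_sym <- iris %>%
  mutate(sum = (!!rlang::sym(x)) + (!!rlang::sym(y)))

# Alternative: index the .data pronoun by string, with no unquoting.
with_pronoun <- iris %>%
  mutate(sum = .data[[x]] + .data[[y]])

head(with_sym)

# Both approaches should give the same column.
identical(with_sym$sum, with_pronoun$sum)
```

Which form to prefer is largely a matter of taste; .data[[x]] tends to read more simply when the column name is already a string.
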
1
vote

It also works with get():

> rm(list = ls())
> data("iris")
> 
> library(dplyr)
> 
> x <- "Sepal.Length"
> y <- "Sepal.Width"
> 
> head(iris %>% mutate(sum = get(x) + get(y)))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species sum
1          5.1         3.5          1.4         0.2  setosa 8.6
2          4.9         3.0          1.4         0.2  setosa 7.9
3          4.7         3.2          1.3         0.2  setosa 7.9
4          4.6         3.1          1.5         0.2  setosa 7.7
5          5.0         3.6          1.4         0.2  setosa 8.6
6          5.4         3.9          1.7         0.4  setosa 9.3
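
One caveat with get(), sketched below: it searches the data frame's columns first, but if no column has that name it keeps looking in the enclosing environments, so a typo or a stale global object can be picked up silently. The object Sepal.Total and the variable z are hypothetical and exist only to demonstrate this; .data[[z]] from dplyr is used for contrast because it looks only at columns.

```r
library(dplyr)

# A global object that is NOT a column of iris (hypothetical, for illustration).
Sepal.Total <- 100
z <- "Sepal.Total"

# get() does not find a column with this name, so it falls back to the
# global object and the pipeline succeeds without complaint.
res <- iris %>% mutate(sum = get(z) + Sepal.Width)
head(res$sum)

# .data[[z]] only looks at columns, so the same mistake raises an error.
err <- tryCatch(
  iris %>% mutate(sum = .data[[z]] + Sepal.Width),
  error = function(e) e
)
inherits(err, "error")
```

If the column names come from user input or configuration, the stricter lookup is probably the safer default.
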