1
votes

I'm trying to create a function that takes two variables from a dataset and maps their distinct values side by side, then writes the output to a CSV file. I'll be using dplyr's distinct function to get the unique values.

map_table <- function(df, var1, var2){
  df_distinct <- df %>% distinct(var1, var2)
  write.csv(df_distinct, 'var1.csv')
}

map_table(iris, Species, Petal.Width)

1) map_table(iris, Species, Petal.Width) doesn't produce what I want. It should produce 27 rows of data, but I'm getting 150 rows instead.

2) How can I name the csv file after the input of var1? So if var1 = 'Sepal.Length', the name of the file should be 'Sepal.Length.csv'
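
For reference, the expected count of 27 rows can be checked in base R without dplyr. This is only a sanity check on the target output, not a fix for the function:

```r
# Base R check of the expected result:
# unique (Species, Petal.Width) pairs in the built-in iris data
expected <- unique(iris[, c("Species", "Petal.Width")])
nrow(expected)  # 27
```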

3
Non-standard evaluation (NSE) is one well-known hiccup when using dplyr. Here's one related question from back in 2014: "How can I tell select() in dplyr that the string it is seeing is a column name in a data frame?"; but the solution here is cleaner, so this should probably not be closed as a duplicate. - smci

3 Answers

2
votes

If you want to pass the column names without quotes, you need to use non-standard evaluation.

deparse(substitute(var1)) will get you the column name as a string, which you can use to build the output file name.
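
To see what that idiom does in isolation, here is a minimal base R sketch. The helper name_of() is hypothetical and exists only for illustration:

```r
# name_of() is a hypothetical helper, used only to illustrate the idiom
name_of <- function(x) {
  # substitute() returns the unevaluated expression supplied for x;
  # deparse() converts that expression to a character string
  deparse(substitute(x))
}

name_of(Species)                       # "Species"; Species itself is never evaluated
paste0(name_of(Sepal.Length), ".csv")  # "Sepal.Length.csv"
```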

library(dplyr)

map_table <- function(df, var1, var2){

  file_name <- paste0(deparse(substitute(var1)), ".csv") # file name

  var1 <- enquo(var1) # non-standard eval
  var2 <- enquo(var2) # enquo() captures the expression passed, i.e. Species

  df_distinct <- df %>% 
    distinct(!!var1, !!var2) # non-standard eval, !! tells dplyr to use Species

  write.csv(df_distinct, file = file_name)

}

map_table(iris, Species, Petal.Width)
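
As a possible alternative, the same idea can be sketched with the embrace operator {{ }}, which is shorthand for the enquo() plus !! pair. This assumes rlang 0.4.0 or later; the name map_table_embrace and the out_dir argument are hypothetical additions for illustration:

```r
library(dplyr)

# Sketch only: map_table_embrace() and out_dir are illustrative names,
# not part of the original answer. Assumes rlang >= 0.4.0 for {{ }}.
map_table_embrace <- function(df, var1, var2, out_dir = ".") {
  # Capture the column name as a string before var1 is used anywhere else
  file_name <- file.path(out_dir, paste0(deparse(substitute(var1)), ".csv"))

  # {{ var1 }} forwards the unquoted column name to distinct()
  df_distinct <- df %>% distinct({{ var1 }}, {{ var2 }})

  # row.names = FALSE avoids an extra column of row numbers in the file
  write.csv(df_distinct, file = file_name, row.names = FALSE)
  invisible(df_distinct)
}

res <- map_table_embrace(iris, Species, Petal.Width, out_dir = tempdir())
nrow(res)  # 27
```

Writing to tempdir() here just keeps the example from leaving files in the working directory; with the default out_dir the file lands next to your script as Species.csv.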
0
votes

You're trying to pass the columns as objects. Try passing their names instead and then use a select helper:

map_table <- function(df, var1, var2){
  df_distinct <- df %>%
    select(one_of(c(var1, var2))) %>%
    distinct()
  write.csv(df_distinct, 'var1.csv')
}

map_table(iris, 'Species', 'Petal.Width')
0
votes

1) OK, the answer is to use distinct_ instead of distinct, and the variable names need to be passed as quoted strings. 2) Use paste0 to build the file name from var1, and pass it to write.csv via file =.

map_table <- function(df, var1, var2){
  df_distinct <- df %>% distinct_(var1, var2)
  # paste0() joins without a separator; paste() would give 'Species .csv'
  write.csv(df_distinct, file = paste0(var1, '.csv'))
}

map_table(iris, 'Species', 'Petal.Width')
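
Note that distinct_ and the other underscored verbs are deprecated in current dplyr releases. A rough string-based equivalent, assuming dplyr 1.0.0 or later, could use across() with all_of(). The name map_table_chr and the out_dir argument are hypothetical:

```r
library(dplyr)

# Sketch only: map_table_chr() and out_dir are illustrative names.
# Assumes dplyr >= 1.0.0, where across() and all_of() are available.
map_table_chr <- function(df, var1, var2, out_dir = ".") {
  # all_of() selects columns by the strings stored in var1 and var2
  df_distinct <- df %>% distinct(across(all_of(c(var1, var2))))

  # Build the file name from the value of var1, e.g. "Species.csv"
  write.csv(df_distinct,
            file = file.path(out_dir, paste0(var1, ".csv")),
            row.names = FALSE)
  invisible(df_distinct)
}

res_chr <- map_table_chr(iris, "Species", "Petal.Width", out_dir = tempdir())
nrow(res_chr)  # 27
```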