I'm really struggling with using dplyr functions inside my own functions. I understand that the underscore suffix (function_) marks the standard-evaluation versions, but I'm still having problems and have seemingly tried every combination of eval, paste, and lazy.

I'm trying to divide multiple columns by the median of the control rows within each group. The example data adds a column named 'Control' to iris, so that each species has 40 'normal' rows and 10 'control' rows.

data(iris)
control <- rep(c(rep("normal", 40), rep("control", 10)), 3)
iris$Control <- control

Normal dplyr works fine:

out_df <- iris %>% 
    group_by(Species) %>% 
    mutate_each(funs(./median(.[Control == "control"])), 1:4)

Trying to wrap this up into a function:

norm_iris <- function(df, control_col, control_val, species, num_cols = 1:4){

    out <- df %>%
        group_by_(species) %>% 
        mutate_each_(funs(./median(.[control_col == control])), num_cols)
    return(out)
}

norm_iris(iris, control_col = "Control", control_val = "control", species = "Species")

I get the error:

Error in UseMethod("as.lazy_dots") : 
no applicable method for 'as.lazy_dots' applied to an object of class "c('integer', 'numeric')"

Using funs_ instead of funs, I get a different error: Error:...: need numeric data

1 Answer

If you haven't already, it might help to read the dplyr vignette on standard evaluation, although it sounds like some of this may be changing soon.

Your function is missing a call to interp from the lazyeval package in the mutate_each_ line. Because you are referring to a column by name (the Control variable) inside funs, you need funs_ together with interp, which substitutes the column name into the formula. Note that you don't need mutate_each_ here at all. You would only need it if you were selecting the columns to mutate by name rather than by number.

Here is what the line would look like in your function instead of what you have:

mutate_each(funs_(interp(~./median(.[x == control_val]), x = as.name(control_col))),
            num_cols)
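
To check the output independently of dplyr's standard-evaluation functions, the same per-group normalisation can be sketched in base R. This is only an illustration and not part of the fix above: the helper name norm_iris_base is made up, and it assumes the Control column is set up exactly as in the question.

```r
# Hypothetical helper (not from the answer above): base R only, no dplyr.
norm_iris_base <- function(df, control_col, control_val, species, num_cols = 1:4) {
  # Split the row indices by group so each species is handled separately.
  groups <- split(seq_len(nrow(df)), df[[species]])
  for (rows in groups) {
    # Rows of this group that belong to the control set.
    ctrl <- rows[df[[control_col]][rows] == control_val]
    for (j in num_cols) {
      # Divide every row of the group by the median of its control rows.
      df[rows, j] <- df[rows, j] / median(df[ctrl, j])
    }
  }
  df
}

# Same example data as in the question.
data(iris)
iris$Control <- rep(c(rep("normal", 40), rep("control", 10)), 3)

out <- norm_iris_base(iris, control_col = "Control", control_val = "control",
                      species = "Species")
```

After normalising, the median of the control rows within each species should be 1 (up to floating-point error) for every normalised column, which makes a quick sanity check for either version of the function.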