I'm wondering whether there is an easy way to use dplyr's variable selection functions in my own custom functions. By dplyr's variable selection functions, I mean these: https://github.com/hadley/dplyr/blob/master/R/select-utils.R
Or, if you're familiar with dplyr, things like "contains", "one_of", "starts_with", etc.
What I'd like to be able to do is write a function that only operates on certain variables:
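# note: sketch only; here vars is a character vector of column names
foo = function(df, vars){
  for (var in vars){
    # df$var would look up a column literally named "var", so index with [[ ]]
    df[[var]] = as.character(df[[var]])
  }
  df  # return the modified data.frame
}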
I'm aware of dplyr's "mutate_each" function, which allows me to do this, but I have to write a function that operates on a vector instead of writing a function that operates on a data.frame.
The purpose of my question is to be able to more cleanly add a custom function to a data processing pipeline. For example, I want to ultimately do this:
df %>%
  foo(starts_with("varname"))
Rather than:
df %>%
  mutate_each(funs(foo), starts_with("varname"))
I hope this makes sense. Thanks!
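For illustration, here is a minimal sketch of the kind of function described above. It is not taken from dplyr itself: it assumes a tidyselect version that provides eval_select() and an rlang that provides enquo(). The body of foo and the example data frame are hypothetical.

```r
library(dplyr)  # provides %>% and re-exports starts_with()

# Hypothetical foo(): converts the selected columns to character.
foo <- function(df, vars) {
  # Capture the selection expression unevaluated, then resolve it against df.
  # eval_select() returns a named integer vector of column positions.
  pos <- tidyselect::eval_select(rlang::enquo(vars), df)
  df[pos] <- lapply(df[pos], as.character)
  df
}

# Made-up example data
df <- data.frame(varname1 = 1:2, varname2 = c(3.5, 4.5), other = 5:6)

out <- df %>%
  foo(starts_with("varname"))
```

Because foo receives and returns a whole data.frame, it can sit directly in a pipeline, and any selection helper (contains, one_of, starts_with, etc.) can be passed as the vars argument.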
[...] sum, one for mean, one for.. etc. Is that what you want? – talat
[...] df %>% foo(starts_with("varname")) to be exactly identical to the result of df %>% mutate_each(funs(foo), starts_with("varname")) (meaning it would result in a data.frame)? – talat
[...] mutate_each approach is perfectly clear about what it's doing. Your desired replacement is not. – Matthew Plourde