I am hoping to use ddply within a function to summarise groups based on a user-determined summary statistic (e.g. the mean, median, min or max), by passing the name of the summary function as an argument in the function call. However, I'm not sure how to pass this on to ddply.
A simple example:
library(plyr)
test.df<-data.frame(group=c("a","a","b","b"),value=c(1,5,5,15))
ddply(test.df,.(group),summarise, mean=mean(value, na.rm=TRUE))
How could I set this up along the lines of the code below, with the relevant function passed to ddply? (This would itself sit inside a function, although that should be straightforward once the first problem is solved.) Note that each summary measure (mean etc.) will require na.rm=TRUE. I could do this by writing my own replacement function for each summary statistic, but that seems overly complex.
Desired:
#fn<-"mean"
#ddply(test.df,.(group),summarise, fn=fn(value, na.rm=TRUE))
Thanks for any help people can provide.
EDIT! Thanks all for these responses. I initially thought leaving out the quotes was working; however, neither that approach nor the use of getFunction or match.fun works once fn is specified as an argument in a function call. What I'm actually hoping to get working is something along the lines of the code below (which returns an error). Apologies for not providing a more thorough example in the first instance...
test.df<-data.frame(group=c("a","a","b","b"),value=c(1,5,5,15))
my.fun <- function(df, fn="mean") {
summary <- ddply(df,.(group),summarise, summary=match.fun(fn)(value, na.rm=TRUE))
return(summary)
}
my.fun(test.df, fn="mean")
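For reference, here is a minimal sketch of one way the wrapper might be made to work. It assumes the error comes from summarise evaluating its arguments outside my.fun, so that the local fn is not visible there. The sketch sidesteps this by resolving the function with match.fun first and passing ddply an anonymous function instead of summarise. The helper name f and the argument name d are illustrative only.

```r
library(plyr)

test.df <- data.frame(group = c("a", "a", "b", "b"), value = c(1, 5, 5, 15))

my.fun <- function(df, fn = "mean") {
  # Resolve the name (or function object) to a function inside my.fun,
  # where fn is definitely visible.
  f <- match.fun(fn)
  # Apply it to each group's data frame d; ddply adds the group column itself.
  ddply(df, .(group), function(d) {
    data.frame(summary = f(d$value, na.rm = TRUE))
  })
}

my.fun(test.df, fn = "mean")
my.fun(test.df, fn = "median")
```

Because match.fun accepts either a string or a function object, my.fun(test.df, fn = max) should behave the same as my.fun(test.df, fn = "max") under this approach.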
fn<-mean. – nograpes