4
votes

I am trying to use ddply and summarise together from the plyr package, but I am having difficulty passing in column names that keep changing. In my example I would like to pass in X1 programmatically rather than hard-coding X1 into the ddply call.

Setting up an example:

require(xts)
require(plyr)
require(reshape2)
require(lubridate)
t <- xts(matrix(rnorm(10000),ncol=10), Sys.Date()-1000:1)
t.df <- data.frame(coredata(t))
t.df <- cbind(day=wday(index(t), label=TRUE, abbr=TRUE), t.df)
t.df.l <- melt(t.df, id.vars=c("day",colnames(t.df)[2]), measure.vars=colnames(t.df)[3:ncol(t.df)])

This is the bit I am struggling with:

cor.vars <- ddply(t.df.l, c("day","variable"), summarise, cor(X1, value))

I do not want to hard-code the term X1 and would like to use something like:

cor.vars <- ddply(t.df.l, c("day","variable"), summarise, cor(colnames(t.df)[2], value))

but that fails with the error: Error in cor(colnames(t.df)[2], value) : 'x' must be numeric

I also tried various other ways of passing the column's values as the x argument to cor, but none of them seem to work.

Any ideas?

1
Can you please make your example reproducible? I still get errors after loading the plyr, reshape2 and xts packages. – flodel
Added the lubridate package as a required library. – h.l.m

1 Answer

5
votes

Although this is probably not the intended usage for summarize and there must be much better approaches to your problem, the direct answer to your question is to use get:

ddply(t.df.l, c("day","variable"), summarise, cor(get(colnames(t.df)[2]), value))
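To see why the string fails and why get helps, here is a minimal base R sketch. The data frame df and the variable col are made up for illustration, and with() is used only as a stand-in for the way summarise evaluates its expressions against the columns of the data; it is not the plyr call itself.

```r
# Hypothetical toy data, not the t.df.l from the question
df <- data.frame(X1 = c(1, 2, 3, 4), value = c(2, 4, 5, 9))
col <- colnames(df)[1]   # the character string "X1", not the column itself

# Passing the string straight to cor() raises an error, because a
# character vector is not numeric; the message is captured here
err <- tryCatch(cor(col, df$value), error = function(e) conditionMessage(e))

# get() looks the name up among the columns and returns the numeric vector
r.get <- with(df, cor(get(col), value))

# Same result with the column name written out by hand
r.direct <- cor(df$X1, df$value)
```

Here r.get and r.direct should agree, while err holds the error message rather than a correlation.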

Edit: here is one approach that is, in my opinion, better suited to your problem:

ddply(t.df.l, c("day", "variable"), function(x) cor(x["X1"], x["value"]))

Above, "X1" can also be replaced by 2 or by the name of a variable holding "X1", etc. It depends on how you want to access the column programmatically.
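As a rough sketch of the "variable holding the name" variant, the example below uses a small made-up data frame (toy) in place of t.df.l so that it runs without xts or lubridate. The names toy, ref.col and cor.by.group are illustrative assumptions. It uses [[ ]] rather than [ ] so that cor receives plain numeric vectors and returns a single number per group.

```r
library(plyr)

set.seed(1)  # illustrative seed, not part of the original example

# Small stand-in for t.df.l: grouping columns, a reference column, a value column
toy <- data.frame(
  day      = rep(c("Mon", "Tue"), each = 20),
  X1       = rnorm(40),
  variable = rep(rep(c("X2", "X3"), each = 10), times = 2),
  value    = rnorm(40)
)

ref.col <- colnames(toy)[2]  # "X1", looked up rather than hard-coded

# One correlation per (day, variable) group, returned as a named column
cor.by.group <- ddply(toy, c("day", "variable"), function(x) {
  data.frame(cor = cor(x[[ref.col]], x[["value"]]))
})
```

Returning a one-row data frame from the function gives the result column an explicit name (cor) instead of a default one.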