I have a data frame called dat_new that holds clinic visit data: hrn is a patient ID and dov is the date of visit (multiple visits per person). I also have a data frame called event with dated hospital admissions (multiple admissions per person).
For each clinic visit, I want to count the hospital admissions, by event code, that occurred on or before that visit. Simple.
This works with ddply from plyr; it takes a bit of time but works well.
temp <- ddply(dat_new, .(hrn, dov), summarise,
    dka2 = sum(event$event_code[which(event$hrn == hrn & event$doa <= dov)] == 2),
    dka3 = sum(event$event_code[which(event$hrn == hrn & event$doa <= dov)] == 3),
    dka8 = sum(event$event_code[which(event$hrn == hrn & event$doa <= dov)] == 8)
)
Now, trying to rewrite this in dplyr, I get an error:
Error: binding not found: 'event_code'
I have it coded like this:
temp2 <- group_by(dat_new, hrn, dov)
temp3 <- summarise(temp2,
    dka2 = sum(event$event_code[which(event$hrn == hrn & event$doa <= dov)] == 2))
Obviously event_code is not in the temp2 data frame. Is it the case that dplyr cannot work with 'other' data frames when summarising? If there is a far better way to do this lookup/sum, I'm all ears.
I tried this several times in a vanilla R session, loading the packages in different orders, to rule out any namespace issues.
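To make the intended result concrete, here is a rough sketch of the same lookup done as a join followed by a grouped summary, so that summarise only touches columns of the data frame it is given. The toy values below are hypothetical (only the column names come from my real data), and I use base merge for the join so the sketch does not depend on a particular dplyr version.

```r
library(dplyr)

# Hypothetical toy data; only the column names match the real data frames
dat_new <- data.frame(
  hrn = c(1, 1, 2, 3),
  dov = as.Date(c("2013-01-10", "2013-06-01", "2013-03-15", "2013-05-05"))
)
event <- data.frame(
  hrn        = c(1, 1, 1, 2),
  doa        = as.Date(c("2013-01-01", "2013-02-01", "2013-07-01", "2013-03-01")),
  event_code = c(2, 3, 2, 8)
)

# Pair every visit with every admission of the same patient;
# all.x = TRUE keeps visits for patients with no admissions (NA columns)
joined <- merge(dat_new, event, by = "hrn", all.x = TRUE)

# Count admissions on or before each visit, per event code;
# na.rm = TRUE turns the NA rows of admission-free patients into zero counts
temp <- summarise(
  group_by(joined, hrn, dov),
  dka2 = sum(event_code == 2 & doa <= dov, na.rm = TRUE),
  dka3 = sum(event_code == 3 & doa <= dov, na.rm = TRUE),
  dka8 = sum(event_code == 8 & doa <= dov, na.rm = TRUE)
)
```

The join creates one row per visit/admission pair, so it can get large when patients have many visits and many admissions; I have not checked how it compares to the ddply version on the full data.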
Thanks
EDIT - REPRODUCIBLE EXAMPLE
This is a quick and dirty example just to illustrate the issue. We make a 'lookup' data.frame that has two rows for each car, each with an mpg of around 500. We then go through the original data.frame, look up each car in the new data.frame and sum its two mpgs together. plyr gives the expected result, figures of around 1000. dplyr errors.
# add the model names as a column so they're easier to get at
mtcars$models <- row.names(mtcars)
# create a 'lookup' table
xtra <- data.frame(models = rep(row.names(mtcars), 2),
                   newmpg = rnorm(2 * nrow(mtcars), 500, 10)
)
xtra <- xtra[sample(row.names(xtra)), ]
library(plyr)
ddply(mtcars, .(models), summarise,
    revisedmpg = sum(xtra$newmpg[models == xtra$models]))
# great, one row per car, with both mpgs added together
library(dplyr)
temp2 <- group_by(mtcars, models)
temp3 <- summarise(temp2,
    revisedmpg = xtra$newmpg[models == xtra$models])
# error
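For comparison, a sketch of one way to get the plyr result without indexing into xtra from inside summarise: total the lookup table per model first, then join the totals on. This is only an illustration under a few assumptions of mine. The set.seed call, the cars copy of mtcars and the totals/result names are not part of the example above, and I have not checked whether this is the idiomatic dplyr approach.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only so the sketch is repeatable

# Work on a copy so mtcars itself is left untouched
cars <- mtcars
cars$models <- row.names(cars)

# Same kind of 'lookup' table as above: two rows per car, mpg around 500
xtra <- data.frame(models = rep(row.names(mtcars), 2),
                   newmpg = rnorm(2 * nrow(mtcars), 500, 10),
                   stringsAsFactors = FALSE)

# Summarise the lookup table on its own: one total per model
totals <- summarise(group_by(xtra, models), revisedmpg = sum(newmpg))

# Attach the totals to the cars; each model matches exactly one row of totals
result <- left_join(cars[, "models", drop = FALSE], totals, by = "models")
```

Here result should have one row per car with revisedmpg at around 1000, which is what the ddply call returns.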
[...] dplyr. Looking forward to seeing the answer to this question. Your question is really interesting, so please make some effort to make it reproducible if you want people to help you. Use the mtcars data set, for example. – dickoa
[...] dplyr guru will find a workaround now. – dickoa