
I have two highly correlated variables (the relationship is not perfectly linear, hence the generalized additive model) that were measured over the same time period. I can fit a reliable GAM between them, with a high proportion of deviance explained and good validation plots. For example, on some hypothetical data (ignoring the validation plots and deviance explained here):

date <- as.data.frame(seq(from = as.POSIXct("2007/9/01"), 
                      to = as.POSIXct("2008/3/01"), by = "day"))
a1 <- as.data.frame(matrix(sample(0:1000, 18.3*10, replace=TRUE), ncol=1))
b1 <- as.data.frame(matrix(sample(0:1000, 18.3*10, replace=TRUE), ncol=1))
df1 <- cbind(date,a1,b1)
colnames(df1) <- c("date","a1","b1")
library(mgcv)
gam <- gam(a1 ~ s(b1), data = df1)

In a separate data frame, I have one of these variables measured over a much longer time period. Is there a way to predict the second variable over this longer period? For example:

date2 <- as.data.frame(seq(from = as.POSIXct("2006/1/01"), 
                       to = as.POSIXct("2008/12/31"), by = "day"))
a2 <- as.data.frame(matrix(sample(0:1000, 109.6*10, replace=TRUE), ncol=1))
df2 <- cbind(date2,a2)
colnames(df2) <- c("date","a2")

I tried the following, which does not work:

    b2_predict <- predict.gam(gam,df2$a2)

I get this error message:

b2_predict <- predict.gam(gam,df2$a2)
Error in model.frame.default(ff, data = newdata, na.action = na.act) :
invalid type (list) for variable 'b1'

Any idea how to fix it?


1 Answer


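predict.gam() expects newdata to be a data frame whose column names match the predictors in the model formula. df2$a2 is a bare vector with no column named b1, so the lookup apparently falls back on the b1 object in your workspace, which is a data frame (a list), hence the error. Because the model was fitted as a1 ~ s(b1), supply the new values in a column named b1:

b2_predict <- predict.gam(gam,data.frame(b1=df2$a2))

Note that this model predicts a1 from b1, so the result is a predicted a1 for the values of df2$a2 treated as b1.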
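
If the aim is to predict b from the longer a series, one option is to fit the model with b as the response and pass the new a values under the predictor's name. The following is only a minimal sketch under stated assumptions: the data are simulated with an arbitrary nonlinear link, and the names fit, newdat, b2_predict and b2_se are hypothetical, not from the question.

```r
library(mgcv)

set.seed(1)  # assumption: fixed seed, only to make the sketch reproducible

# Short period: both variables measured (simulated data with a nonlinear link)
a1 <- runif(183, 0, 1000)
b1 <- 50 * sqrt(a1) + rnorm(183, sd = 20)
df1 <- data.frame(a1 = a1, b1 = b1)

# Long period: only 'a' was measured
df2 <- data.frame(a2 = runif(1096, 0, 1000))

# Fit with b as the response, since b is the quantity to be predicted from a
fit <- gam(b1 ~ s(a1), data = df1)

# newdata must be a data frame whose column name matches the predictor (a1)
newdat <- data.frame(a1 = df2$a2)
pred <- predict(fit, newdata = newdat, se.fit = TRUE)

# Store the predictions and their standard errors next to the long series
df2$b2_predict <- as.numeric(pred$fit)
df2$b2_se <- as.numeric(pred$se.fit)
head(df2)
```

Naming the fitted object fit rather than gam avoids masking the mgcv function name, and se.fit = TRUE gives a rough indication of how uncertain each prediction is. Predictions for a values outside the range seen in df1 are extrapolations of the smooth and should be treated with caution.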