
I have an annoying problem when pulling the finalModel out of a caret train object. I need the final logistic regression from a repeated cross-validation, but the model object has had backticks (the ` symbol) added around model terms that need to be evaluated in the formula (e.g., `I(x^2)`). This renders the object unusable for any other package.

I have tried: (1) stating the model formula directly in the function call; (2) using poly() instead of I() for the terms; (3) using lapply on the model object with gsub() to replace the character; and (4) using grepl() with lapply on the model object just to find the character, then going through manually with gsub(). But basically, lapply doesn't descend through all the nested levels of the list, and it doesn't return a model object.

#Here's the problem

library(caret)


dat=data.frame(y=as.factor(rbinom(1000, 1, prob=0.5)),
               x=rnorm(1000,10,1),
               w=rnorm(1000,100,1),
               z=rnorm(1000,1000,1))
levels(dat$y) <- c("A","P")

Train <- createDataPartition(dat$y, p=0.7, list=FALSE)
train<- dat[ Train, ]
test <- dat[ -Train, ]


lof=as.formula(y~ x+I(x^2)+w+I(w^2)+z+I(z^2))

m1<-glm(lof,family="binomial", data=train)
m1

pred1=predict(m1,newdata=test, type="response" )

ctrl <- trainControl(
  method = "repeatedcv", 
  repeats = 3,
  classProbs = TRUE, 
  summaryFunction = twoClassSummary
)

m2<-train(lof,family="binomial", data=train, 
                 method="glm",
                 trControl = ctrl,
                 metric = "ROC")

m3=m2$finalModel
m3
pred2=predict(m3,newdata=test, type="response" )

res1=lapply(m3, function (x) grepl('\\`I',x)) 
m3$terms[[3]]=gsub('\\`I',"",m3$terms[[3]])
m3$terms[[3]]=gsub(')\\`',"",m3$terms[[3]])
m3$terms

res=lapply(m3, function (x) grepl('\\`I',names(x)))
names(m3$effects)[res$effects==TRUE]=gsub('\\`I',"",names(m3$effects)[res$effects==TRUE])
names(m3$effects)[res$effects==TRUE]=gsub(')\\`',"",names(m3$effects)[res$effects==TRUE])

pred3=predict(m3,newdata=test, type="response" )

The attempted fix with lapply finds some instances, but not all, and cannot easily be used to modify the original object, so even a manual fix is stymied. Obviously the easiest solution would be to figure out how to stop caret from doing this in the first place, rather than trying to edit the object.
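To illustrate why lapply misses some instances, here is a minimal sketch on a toy nested list. The list `obj` and its element names are hypothetical stand-ins, not the actual structure of a glm object. lapply only visits the top-level elements, whereas base R's rapply descends to every leaf.

```r
# Toy nested list (hypothetical; a real glm object is far more complex)
obj <- list(
  coefficients = c("(Intercept)" = 0.1, "`I(x^2)`" = 0.2),
  inner = list(
    labels = c("x", "`I(x^2)`"),
    deeper = list(names = "`I(w^2)`")
  )
)

# Helper: does a vector, or its names, contain a backtick?
has_backtick <- function(v) {
  any(grepl("`", v, fixed = TRUE)) || any(grepl("`", names(v), fixed = TRUE))
}

# lapply looks only at the names of each top-level element,
# so anything nested inside `inner` is never inspected
top_hits <- lapply(obj, function(el) any(grepl("`", names(el), fixed = TRUE)))

# rapply walks the nested lists and applies the helper to every leaf
leaf_hits <- rapply(obj, has_backtick, how = "unlist")

print(top_hits)
print(leaf_hits)
```

This only locates the backticks in a list of plain vectors. A fitted model also contains calls, formulas, and functions, which would need separate handling, so this sketch is not a way to repair the finalModel.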

I should probably note that I'm not trying to extract the finalModel to use with predict; I am aware I can predict with caret on the entire m2 object. I want the finalModel for use with the dismo package.

1 Answer


No replies, but as a workaround I ended up creating new variables (x2 = x^2, etc.). That is fine as far as it goes, but for my use it means I also had to create new raster layers for x^2 etc., which is a waste of time and space.
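A minimal sketch of that workaround, using base R's glm on simulated data shaped like the example in the question. The column names x2, w2, z2, the seed, and the sample size are my own choices for illustration. The same formula would be the one passed to caret's train(); it is shown here with plain glm so that it runs without caret.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Simulated data in the same shape as the question's example
dat <- data.frame(
  y = factor(rbinom(200, 1, prob = 0.5), levels = c(0, 1), labels = c("A", "P")),
  x = rnorm(200, 10, 1),
  w = rnorm(200, 100, 1),
  z = rnorm(200, 1000, 1)
)

# Precompute the squared terms as ordinary columns
dat$x2 <- dat$x^2
dat$w2 <- dat$w^2
dat$z2 <- dat$z^2

# The formula now contains only plain column names, no I() calls
lof2 <- y ~ x + x2 + w + w2 + z + z2

m <- glm(lof2, family = binomial, data = dat)

# The term labels are plain names, so there is nothing to wrap in backticks
term_labels <- attr(terms(m), "term.labels")
print(term_labels)
```

The cost is the one already mentioned: any new data used for prediction, including raster layers, must carry matching x2, w2, and z2 columns.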