
I am new to R and am learning machine learning with caret. I was working on the UCI bank marketing response data, but I use the iris data here for reproducibility.

The issue is that I get an error when running vif from the car package on classification models.

library(tidyverse)
library(caret)
library(car)

iris

# to make it binary classification
iris_train <- iris %>% filter(Species %in% c("setosa","versicolor"))
iris_train$Species <- factor(iris_train$Species)

Creating Model


model_iris3 <- train(Species ~ .,
                     data = iris_train,
                     method = "gbm",
                     verbose = FALSE
                     # tuneLength = 5,
                     # metric = "Spec",
                     # trControl = fitCtrl
                     )

Error in vif

# vif
car::vif(model_iris3)

Error in UseMethod("vcov") : no applicable method for 'vcov' applied to an object of class "c('train', 'train.formula')"

I learned about passing finalModel to vif from this SO post: Variance inflation VIF for glm caret model in R

But I still get an error:

car::vif(model_iris3$finalModel)

Error in UseMethod("vcov") : no applicable method for 'vcov' applied to an object of class "gbm"

I get the same error with adaboost, earth, etc.

I would appreciate any help or suggestions to solve this issue.

(UPDATE)

Finally this worked (see the complete solution in Answers if you still get an error):

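car::vif has no method for models such as gbm, so as a workaround convert the dependent variable to numeric, fit a linear regression on the same predictors, and run vif on that. VIF measures collinearity among the predictors only, so the choice of response does not change the values.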


# verbose = FALSE is dropped here: lm() does not use it and warns about the extra argument
model_iris4 <- train(as.numeric(Species) ~ .,
                     data = iris_train,
                     method = "lm"
                     # tuneLength = 5,
                     # metric = "Spec",
                     # trControl = fitCtrl
                     )

car::vif(model_iris4$finalModel)

######## output ##########

Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
    4.803414     2.594389    36.246326    25.421395 

2 Answers

1
votes

car::vif is a generic function that needs a method for each type of model. It works in the linked question because car::vif has been implemented to cope with glm models. It does not support your chosen model type, gbm, which is why the call fails when it reaches vcov.
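As a rough illustration of this point, the sketch below checks whether an object has a usable vcov method before vif is attempted. The helper name has_vcov and the empty stand-in object of class "gbm" are hypothetical and exist only for this example; a real gbm fit from caret should fail the check in the same way, since the error in the question comes from vcov.

```r
# Hypothetical helper: TRUE if stats::vcov() can be computed for the object.
has_vcov <- function(fit) {
  !inherits(try(stats::vcov(fit), silent = TRUE), "try-error")
}

# An lm fit has a vcov method, so vif-style diagnostics are possible.
lm_fit <- lm(Sepal.Length ~ Sepal.Width, data = iris)

# Stand-in for a boosted model: an empty object that only carries the class "gbm".
fake_gbm <- structure(list(), class = "gbm")

has_vcov(lm_fit)
has_vcov(fake_gbm)
```

With a caret model, the same check could be applied to model$finalModel before calling car::vif on it.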

2
votes

Finally this worked:

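car::vif has no method for models such as gbm, so as a workaround convert the dependent variable to numeric, fit a linear regression on the same predictors, and run vif on that. VIF measures collinearity among the predictors only, so the choice of response does not change the values.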

# verbose = FALSE is dropped here: lm() does not use it and warns about the extra argument
model_iris4 <- train(as.numeric(Species) ~ .,
                     data = iris_train,
                     method = "lm"
                     # tuneLength = 5,
                     # metric = "Spec",
                     # trControl = fitCtrl
                     )

car::vif(model_iris4$finalModel)

######## output ##########

Sepal.Length  Sepal.Width Petal.Length  Petal.Width 
    4.803414     2.594389    36.246326    25.421395 
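Since the response plays no role in these numbers, a minimal base R sketch can compute them from the predictors alone, without caret or car. It assumes all predictors are numeric; with factor predictors car::vif reports generalized VIFs, which this shortcut does not cover. The object names iris_two, X and vif_manual are made up for the example.

```r
# Keep the two classes used in the question; no model on Species is fitted.
iris_two <- iris[iris$Species %in% c("setosa", "versicolor"), ]
X <- iris_two[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]

# For numeric predictors, VIF_j is the j-th diagonal element of the
# inverse of the predictor correlation matrix.
vif_manual <- diag(solve(cor(X)))
vif_manual

# Cross-check one entry against the definition 1 / (1 - R^2_j), where R^2_j
# comes from regressing predictor j on the remaining predictors.
r2 <- summary(lm(Petal.Length ~ Sepal.Length + Sepal.Width + Petal.Width,
                 data = X))$r.squared
1 / (1 - r2)
```

If the assumption holds, vif_manual should agree with the car::vif output above, and the same approach works whatever classifier is trained afterwards.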

If the model contains dummy variables, vif is likely to still throw an error.

For example, after following the steps above I got a new error on my original UCI banking dataset: Error in vif.default(model_vif_check$finalModel) : there are aliased coefficients in the model

To solve this error, try the steps below.

Run alias() on a model in which the predicted variable is numeric:

alias_res <- alias( 
  lm( as.numeric(y) ~ duration+nr.employed+euribor3m+pdays+emp.var.rate+poutcome.success+month.mar+cons.conf.idx+contact.telephone+contact.cellular+previous+age+cons.price.idx+month.jun+job.retired, data = train ) 
  )

alias_res
# names of the predictors that are exact linear combinations of the others
ld.vars <- attributes(alias_res$Complete)$dimnames[[1]]
ld.vars

This returns the aliased (perfectly collinear) predictor that was causing the error, so just remove that predictor and fit the model again (in my case it was "contact.cellular"):

model_vif_check_aliased <- train(as.numeric(pull(y)) ~ duration+nr.employed+euribor3m+pdays+emp.var.rate+poutcome.success+month.mar+cons.conf.idx+contact.telephone+previous+age+cons.price.idx+month.jun+job.retired, 
                      data = train, 
                      method = "lm"
                      )
model_vif_check_aliased

Now run vif

vif_values <- car::vif(model_vif_check_aliased$finalModel)
vif_values

         duration       nr.employed         euribor3m             pdays 
         1.016706         75.587546         80.930134         10.216410 
     emp.var.rate  poutcome.success         month.mar     cons.conf.idx 
        64.542469          9.190354          1.077018          3.972748 
contact.telephone          previous               age    cons.price.idx 
         2.091533          1.850089          1.185461         28.614339 
        month.jun       job.retired 
         3.936681          1.198350 
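The alias check and refit described in this answer can be tried on a small toy data set. The sketch below is not the UCI banking data: the data frame toy and its values are invented for illustration, with two complementary dummy columns that sum to 1 and therefore produce an aliased coefficient.

```r
set.seed(1)
n <- 50

# Toy data: contact.cellular is exactly 1 - contact.telephone,
# so the two dummies plus the intercept are linearly dependent.
toy <- data.frame(
  y = rnorm(n),
  age = rnorm(n, mean = 40, sd = 10),
  contact.telephone = rep(c(0, 1), length.out = n)
)
toy$contact.cellular <- 1 - toy$contact.telephone

fit <- lm(y ~ age + contact.telephone + contact.cellular, data = toy)

# Rows of the "Complete" matrix name the aliased coefficients.
aliased <- rownames(alias(fit)$Complete)
aliased

# Drop the aliased predictor(s) and refit; no coefficient is NA afterwards.
drop_terms <- paste(aliased, collapse = " - ")
fit_clean <- update(fit, as.formula(paste(". ~ . -", drop_terms)))
coef(fit_clean)
```

Once the refit has no NA coefficients, car::vif should no longer stop with the aliased coefficients error.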