
I find the caret package in R very helpful for seeing which variables are important in a model. However, all the variables in my dataset are categorical, and in that case the 'varImp' command returns an importance value for each level of each factor variable. I just want a list of the important variables themselves, not their individual levels.

library(caret)
logit <- glm(Life.Insurance.Owner~., data = train, family = 'binomial')
summary(logit)

varImp(logit,scale=FALSE)

1 Answer


You mention that all the variables in the dataset are categorical. Any chance I could get a look at the variables? How many levels does each variable have?

One possibility is to convert the categorical variables into dummy variables, so that your dataset is represented by numeric 0/1 indicator columns. [But again, this depends on the case, so keep your objective in mind.]

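Simple example of creating dummy variables:

# String values must be quoted; factor() marks them as categorical
x <- factor(c("Red", "Blue", "Green"))

y <- factor(c("Bus", "train", "Boat"))

# "- 1" drops the intercept, giving one 0/1 indicator column per level
x.dummy <- model.matrix(~ x - 1)

y.dummy <- model.matrix(~ y - 1)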
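
Note that dummy coding on its own still gives one coefficient per level. If the goal is one score per original variable, one option (my suggestion, not something from the question) is a likelihood-ratio test that drops each whole factor in turn, using drop1 from base R. This is a minimal sketch on made-up data: the data frame toy and the columns colour, transport and owner are hypothetical stand-ins for the real training set.

```r
# Hypothetical toy data standing in for the asker's `train` set
set.seed(1)
n <- 200
toy <- data.frame(
  colour    = factor(sample(c("Red", "Blue", "Green"), n, replace = TRUE)),
  transport = factor(sample(c("Bus", "Train", "Boat"), n, replace = TRUE))
)

# Assumed for illustration: the outcome depends on colour only
p <- ifelse(toy$colour == "Red", 0.8, 0.3)
toy$owner <- rbinom(n, 1, p)

# Same kind of model as in the question: logistic regression on all predictors
fit <- glm(owner ~ ., data = toy, family = binomial)

# drop1 removes each term (a whole factor, not a single level) and
# reports a likelihood-ratio chi-square test for that removal
per_var <- drop1(fit, test = "Chisq")
print(per_var)
```

The result has a "<none>" row for the full model and then one row per variable, so a factor with many levels is judged as a single unit. On the real data the call would presumably be drop1(logit, test = "Chisq").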