I am a newbie having trouble interpreting the output of my logistic regression. My response variable has two values, “multiplex” and “subterraneus”. When I call the factor() function on the Group column of the “microtus.train” data frame, the levels are listed as “multiplex” and “subterraneus”, in that order. After fitting the model and predicting the response, I have trouble understanding what the predicted probabilities mean. Are they the probabilities of an observation being “subterraneus”? When I ran “contrasts(microtus.train$Group)”, I got the table below.
> contrasts(microtus.train$Group)
             subterraneus
multiplex               0
subterraneus            1
Based on this table, I interpret that the model predicts the probability of “subterraneus” (not the probability of “multiplex”), because “subterraneus” is dummy coded as 1. Is my assumption correct?
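One way to check this interpretation is with a small simulated example. The sketch below assumes made-up data (the `toy` data frame and the predictor `x` are hypothetical, not the microtus measurements). It relies on the documented behaviour of binomial glm() with a factor response: the first level is treated as failure and all other levels as success, so the fitted probabilities should refer to the second level.

```r
# Hypothetical simulated data, not the microtus measurements
set.seed(1)
n <- 200
x <- rnorm(n)

# Make "subterraneus" more likely as x grows
p_sub <- plogis(2 * x)
Group <- factor(ifelse(runif(n) < p_sub, "subterraneus", "multiplex"),
                levels = c("multiplex", "subterraneus"))
toy <- data.frame(Group, x)

fit <- glm(Group ~ x, data = toy, family = binomial())
p_hat <- predict(fit, type = "response")

# First level = failure (0), second level = success (1),
# so p_hat estimates P(Group == levels(Group)[2])
levels(toy$Group)

# The slope on x is positive because larger x raises P(subterraneus)
coef(fit)["x"]

# Mean fitted probability within each observed group
tapply(p_hat, toy$Group, mean)

# With an intercept in the model, the mean fitted probability matches
# the observed share of the second level
all.equal(mean(p_hat), mean(toy$Group == "subterraneus"), tolerance = 1e-6)
```

If the fitted probabilities referred to “multiplex” instead, the slope would be negative and the group means would be reversed.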
My code is given below. Thank you in advance for your help.
library(Flury)
data(microtus, package = "Flury")
str(microtus)
summary(microtus)

# Create the training data frame: keep only the two species of interest
microtus.train <- subset(microtus,
                         Group %in% c("multiplex", "subterraneus"),
                         select = c("Group", "M1Left", "M2Left", "M3Left",
                                    "Foramen", "Pbone", "Length", "Height",
                                    "Rostrum"))

# Drop the unused third factor level
microtus.train$Group <- droplevels(microtus.train$Group)
factor(microtus.train$Group)  # prints the values and the level order

# Intercept-only model and model with all predictors
nullModel.GLM <- glm(Group ~ 1, data = microtus.train,
                     family = binomial())
fullModel.GLM <- glm(Group ~ ., data = microtus.train,
                     family = binomial())
summary(nullModel.GLM)
summary(fullModel.GLM)

# Forward selection by AIC (k = 2), starting from the null model
stepFwd.GLM <- step(nullModel.GLM, scope = list(upper = fullModel.GLM),
                    direction = "forward", k = 2)

# Fitted probabilities on the training data
stepFwd.GLM.fitResults <- predict(stepFwd.GLM, type = "response")
stepFwd.GLM.fitResults

# Dummy coding of the response levels
contrasts(microtus.train$Group)
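To turn fitted probabilities like these into predicted labels, the level order can be used directly. The following is only a sketch on simulated data (again, `toy`, `x`, `Group2` and the 0.5 cutoff are assumptions for illustration, not part of my analysis). It also refits the model after relevel() to show that swapping the reference level makes the probabilities refer to the other species.

```r
# Hypothetical simulated data, not the microtus measurements
set.seed(42)
n <- 150
x <- rnorm(n)
Group <- factor(ifelse(runif(n) < plogis(1.5 * x), "subterraneus", "multiplex"),
                levels = c("multiplex", "subterraneus"))
toy <- data.frame(Group, x)

fit <- glm(Group ~ x, data = toy, family = binomial())
p_hat <- predict(fit, type = "response")

# Map probabilities to labels: above the cutoff means the second level
lev <- levels(toy$Group)
pred <- factor(ifelse(p_hat > 0.5, lev[2], lev[1]), levels = lev)
table(observed = toy$Group, predicted = pred)

# Make "subterraneus" the reference level, so "multiplex" becomes the
# second level and the fitted values become P(multiplex)
toy$Group2 <- relevel(toy$Group, ref = "subterraneus")
fit2 <- glm(Group2 ~ x, data = toy, family = binomial())
p_hat2 <- predict(fit2, type = "response")

# The two sets of probabilities should be complements of each other
all.equal(unname(p_hat2), unname(1 - p_hat), tolerance = 1e-6)
```

The 0.5 cutoff is a common default, not something the model dictates; a different threshold may suit unbalanced groups better.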