
I would like to create confusion matrices for a multinomial logistic regression as well as a proportional odds model but I am stuck with the implementation in R. My attempt below does not seem to give the desired output.

This is my code so far:

library(nnet)  # multinom()
library(MASS)  # polr()
CH <- read.table("http://data.princeton.edu/wws509/datasets/copen.dat", header=TRUE)
CH$housing <- factor(CH$housing)
CH$influence <- factor(CH$influence)
CH$satisfaction <- factor(CH$satisfaction)
CH$contact <- factor(CH$contact)
CH$satisfaction <- factor(CH$satisfaction,levels=c("low","medium","high"))
CH$housing <- factor(CH$housing,levels=c("tower","apartments","atrium","terraced"))
CH$influence <- factor(CH$influence,levels=c("low","medium","high"))
CH$contact <- relevel(CH$contact,ref=2)
model <- multinom(satisfaction ~ housing + influence + contact, weights=n, data=CH)
summary(model)
preds <- predict(model)
table(preds,CH$satisfaction)

omodel <- polr(satisfaction ~ housing + influence + contact, weights=n, data=CH, Hess=TRUE)
preds2 <- predict(omodel)
table(preds2,CH$satisfaction)

I would really appreciate some advice on how to correctly produce confusion matrices for my 2 models!

table(preds,CH$satisfaction) gives you the confusion matrix. If you want more statistics on your predictions, you can use the confusionMatrix() function from the caret package. - AntoniosK
I believe table(preds,CH$satisfaction) unfortunately does not take the weights into account, so the total is just the number of rows rather than the total number of observations. Is there a way to incorporate the weights? - Joe
Then maybe, instead of keeping a weights column, you can reshape your dataset so that each row is repeated n times. In that case each row is a single observation and not a collection of observations. With dplyr and tidyr loaded, you can create that reshaped dataset like this: CH %>% rowwise() %>% mutate(id = list(seq_len(n))) %>% unnest(id) %>% select(-n) and then build the model on it. - AntoniosK
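
To make the weighting issue from the comments concrete, here is a minimal sketch on a made-up grouped data frame. The `toy` data, its column names, and its counts are hypothetical and are not the copen data; the only assumption carried over from the question is that a column `n` holds the number of observations each row stands for. `table()` counts rows, `xtabs()` sums `n` within each cell, and expanding the rows first should give the same weighted table.

```r
# Hypothetical grouped data: each row stands for n observations.
lev <- c("low", "medium", "high")
toy <- data.frame(
  actual    = factor(c("low", "low", "medium", "high", "high"), levels = lev),
  predicted = factor(c("low", "high", "low", "high", "low"), levels = lev),
  n         = c(21, 5, 14, 30, 8)
)

# Unweighted: every row counts once, so the total is the number of rows.
unweighted <- table(predicted = toy$predicted, actual = toy$actual)

# Weighted: each cell is the sum of n over the rows that fall into it.
weighted <- xtabs(n ~ predicted + actual, data = toy)

# Alternative: repeat each row n times, then tabulate the expanded rows.
expanded  <- toy[rep(seq_len(nrow(toy)), times = toy$n), c("actual", "predicted")]
from_rows <- table(predicted = expanded$predicted, actual = expanded$actual)

sum(unweighted)             # number of rows
sum(weighted)               # total number of observations
all(from_rows == weighted)  # both routes agree cell by cell
```

With the objects from the question, the analogous call would presumably be xtabs(n ~ preds + satisfaction, data = transform(CH, preds = preds)); treat that as an untested suggestion rather than a verified result.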

1 Answer


You can refer to the question "Predict() - Maybe I'm not understanding it".

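In predict(), supply the unseen data through the newdata argument. If newdata is omitted, as in your code, predict() returns predictions for the data the model was fitted on.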
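
As a rough illustration of the difference, here is a small sketch using the built-in iris data instead of the housing data. The train/test split, the seed, and the choice of predictors are arbitrary assumptions made only for this example.

```r
library(nnet)  # provides multinom()

set.seed(1)                           # reproducible split
test_rows <- sample(nrow(iris), 30)   # hold out 30 of the 150 rows
train_set <- iris[-test_rows, ]
test_set  <- iris[test_rows, ]

# Fit on the training rows only; trace = FALSE silences the iteration log.
fit <- multinom(Species ~ Sepal.Length + Sepal.Width,
                data = train_set, trace = FALSE)

# Without newdata: predicted classes for the rows the model was fitted on.
in_sample_pred <- predict(fit)

# With newdata: predicted classes for rows the model has not seen.
held_out_pred <- predict(fit, newdata = test_set)

# Confusion matrix on the held-out rows.
conf_mat <- table(predicted = held_out_pred, actual = test_set$Species)
conf_mat
```

A confusion matrix built from in_sample_pred describes fit on the training data, while conf_mat describes performance on data held back from fitting.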