1
votes

Good afternoon, I have a problem with the output I get when performing a logistic regression with NNET package. I want to predict Category with HS_TR (Return Period) and SLR (Sea Level Rise). The multinomial model, called fit, has been calculated with the information from the x.sub subset. There are 4 different categories possible 1,2,3 or 4.

x.sub:

   POINTID  HS_TR  SLR  Category
       4     10    0.0     3
       4     10    0.6     4
       4     50    0.0     3
       4     50    0.6     4
       4    100    0.0     4
       4    100    0.6     4

When I run the model> fit <- multinom(Category ~ HS_TR + SLR, x.sub, maxit=3000) I get the results :

Coefficients:
    (Intercept)       HS_TR         SLR 
    -30.5791517   0.4130478  62.0976951 

    Residual Deviance: 0.0001820405 
    AIC: 6.000182

Now that I have the multinomial, I want to know the predicted category for a specific scenario (d3) of SLR and HS_TR. I define d3 and apply the prediction and I get reasonable result:

d3<-data.frame("HS_TR"=c(10),"SLR"=c(0))
prediction <-(predict(fit,d3))

I get

> prediction
[[1]]
[1] 3 
Level: 3

However, when I calculate the probability of getting the prediction prediction <-(predict(fit,d3, type="probs")), I get the following:

> prediction
[[1]]
1 
0 

Which makes no sense since it says that there is probability 0. Since the model I run gives a prediction of the CATEGORY, I don't understand why then, the probability is 0. Does someone know why I get it?

If someone knows how I could work on the problem so that I can solve it. Thank you in advance.

1
Could you provide a sample dataset? - ekstroem
Yes, @ekstroem sure. Consider the x.sub to be the following table (I have also edited to post which might be easier to read): ` POINTID HS_TR SLR Category 19 4 10 0.0 3 20 4 10 0.6 4 21 4 50 0.0 3 22 4 50 0.6 4 23 4 100 0.0 4 24 4 100 0.6 4` - D_CodeO1
David, David, David. It's so sad to see a possibly useful question marred by failing to read the SO help pages and use edit to create a minimal reproducible example and instead posting updates in comments. I also see 147 hits for a search with "[r] nnet predict" and will be tempted to downvote if you don't improve the question further. - IRTFM
@42- Apart from commenting, I also edited the post. In any case, I have added more details to the question. - D_CodeO1

1 Answers

1
votes

You have a problem with separation / complete separation (Google the term to get more information. This page gives a nice introduction which contains this quote:

A complete separation happens when the outcome variable separates a predictor variable or a combination of predictor variables completely.

If you look at your data, for example using

> xtabs(~ Category + HS_TR + SLR, data=x.sub)
, , SLR = 0

        HS_TR
Category 10 50 100
       3  1  1   0
       4  0  0   1

, , SLR = 0.6

        HS_TR
Category 10 50 100
       3  0  0   0
       4  1  1   1

then you see that the combination of SLR and HS_TR fully determines the outcome for SLR=0.6. You need to specify a simpler model or get more data to provide a stable fit.

In your case your output has only two possible categories so you should be able to fit a log linear model or logistic regression model and get the same result. If you create a new variable Cat that is a factor of Category then you see a warning pointing you in the right direction.

> glm(Cat ~HS_TR + SLR, data=x.sub, family="binomial") 
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred 

I think multinom does not detect the problem in the data. However, if you look at the summary of your fit then you find that the standard error of two of the parameter estimates are extremely large. This also suggests that the estimates are not stable and that separation could be a problem.

> summary(fit)
Call:
multinom(formula = Category ~ HS_TR + SLR, data = x.sub, maxit = 3000)

Coefficients:
                 Values  Std. Err.
(Intercept) -30.5791517 356.932851
HS_TR         0.4130478   5.137396
SLR          62.0976951 634.584184

Residual Deviance: 0.0001820405 
AIC: 6.000182 

I think the convergence checking in multinom lacks a check of some sort.