
I'm trying to run a multinomial logistic regression using the mlogit package in R. I've uploaded the data here https://drive.google.com/file/d/0B_o3xTWAYdbuRGw0dzNFRzd2NEk/view?usp=sharing.

The data contains two different choice variables which I want to run the same model on. I run the first model like so:

lfsm1 <- mlogit.data(lfs.models, shape="wide", choice="PWK")
f1 <- mFormula(PWK~1 | MIGGRP+SEX+AGE+EDU)
m1 <- mlogit(f1, lfsm1, weights=PWT14)
summary(m1)

This model runs without issues. Then I run the same exact model on the other choice variable:

lfsm2 <- mlogit.data(lfs.models, shape="wide", choice="multi")
f2 <- mFormula(multi~1 | MIGGRP+SEX+AGE+EDU)
m2 <- mlogit(f1, lfsm2, weights=PWT14)

I get the following errors:

Error in if (is.null(initial.value) || lnl <= initial.value) break : 
missing value where TRUE/FALSE needed
In addition: There were 20 warnings (use warnings() to see them)
> warnings()
Warning messages:
1: In `[<-.factor`(`*tmp*`, is.na(x), value = FALSE) :
   invalid factor level, NA generated

And that warning message repeats 20x.

I'm not sure what either of these errors means in the context of my model. A previous post (mlogit: missing value where TRUE/FALSE needed) suggests that my first error occurs because my data are not in wide format, or because some individuals do not select any of the alternatives. In my case neither of these explanations can be right. What I've seen about the warning messages suggests that mlogit is reacting badly to variables being factors or numeric. But I don't quite understand why this would matter in a multinomial regression context, or how the problem could occur only twenty times in such a large dataset.

Any suggestions would be most appreciated!

Are you intentionally using f1 in the call defining m2? – coffeinjunky
Also, could you provide sample data here in the post? Few people want to download, save, load and explore a file from the web just to answer a question here. – coffeinjunky
@coffeinjunky You are right: with m2 <- mlogit(f2, lfsm2, weights=PWT14) no errors occur. – Marco Sandri

1 Answer


Try

m2 <- mlogit(f2, lfsm2, weights=PWT14)

Note the f2 in the call to mlogit.

In your second call to mlogit.data, you specified multi as the choice variable, and the data are prepared accordingly. Yet the formula you are using, f1, specifies PWK as the dependent variable, so mlogit expects a data frame with one row for each alternative as defined by PWK, not multi.
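A quick way to catch this kind of mismatch before fitting is to compare the left-hand side of the formula with the choice variable given to mlogit.data. The following is only a minimal sketch: it uses plain base-R formulas in place of the mFormula objects so that it runs without mlogit or the data, and response_name and choice_var are hypothetical names introduced for illustration, not part of mlogit.

```r
# Plain base-R formulas standing in for the mFormula objects in the question
# (assumption: used here only so the sketch runs without mlogit installed)
f1 <- PWK ~ 1 | MIGGRP + SEX + AGE + EDU
f2 <- multi ~ 1 | MIGGRP + SEX + AGE + EDU

# Hypothetical helper: name of the variable on the left-hand side of a formula
response_name <- function(f) all.vars(f[[2]])

# Choice variable that was passed to mlogit.data() when building lfsm2
choice_var <- "multi"

response_name(f1) == choice_var  # FALSE: f1 does not match lfsm2
response_name(f2) == choice_var  # TRUE: f2 is the formula to use

# Guard that could be placed just before the mlogit() call
stopifnot(response_name(f2) == choice_var)
```

With a guard like this in place, passing f1 together with lfsm2 would stop with a clear message at the stopifnot line instead of failing inside the optimizer.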