Multinomial Logit Fixed Effects: Stata and R

Question

I am trying to run a multinomial logit with year fixed effects in mlogit in Stata (panel data: year-country), but I do not get standard errors for some of the models. When I run the same model using multinom in R I get both coefficients and standard errors.

I do not use Stata frequently, so I may be missing something or I may be running different models in Stata and R and should not be comparing them in the first place. What may be happening?

So a few details about the simple version of the model of interest:

I created a data example to show what the problem is

Dependent variable (will call it DV1) with 3 categories of -1, 0, 1 (unordered and 0 as reference)
Independent variables: 2 continuous variables, 3 binary variables, interaction of 2 of the 3 binary variables
Years: 1995-2003
Number of observations in the model: 900

In R I run the code and get coefficients and standard errors as below.

R version of code creating data and running the model:

## Fabricate example data 
library(fabricatr)
data <- fabricate(
  N = 900,
  id = rep(1:900, 1),
  IV1 = draw_binary(0.5, N = N),
  IV2 = draw_binary(0.5, N = N),
  IV3 = draw_binary(0.5, N = N),
  IV4 = draw_normal_icc(mean = 3, N = N, clusters = id, ICC = 0.99),
  IV5 = draw_normal_icc(mean = 6, N = N, clusters = id, ICC = 0.99))

library(AlgDesign)
DV = gen.factorial(c(3), 1, center=TRUE, varNames=c("DV"))
year = gen.factorial(c(9), 1, center=TRUE, varNames=c("year"))

DV = do.call("rbind", replicate(300, DV, simplify = FALSE))
year = do.call("rbind", replicate(100, year, simplify = FALSE))

year[year==-4]= 1995
year[year==-3]= 1996
year[year==-2]= 1997
year[year==-1]= 1998
year[year==0]= 1999
year[year==1]= 2000
year[year==2]= 2001
year[year==3]= 2002
year[year==4]= 2003

data1=cbind(data, DV, year)
data1$DV1 = relevel(factor(data1$DV), ref = "0")

## Save data as csv file (to use in Stata)
library(foreign)
write.csv(data1, "datafile.csv", row.names=FALSE)

## Run multinom
library(nnet)
model1 <- multinom(DV1 ~ IV1 + IV2 + IV3 + IV4 + IV5 + IV1*IV2 + as.factor(year), data = data1)

Results from R

When I run the model using mlogit (without fixed effects) in Stata I get both coefficients and standard errors.

So I tried including year fixed effects in the model using Stata three different ways and none worked:

femlogit

factor-variable and time-series operators not allowed
depvar and indepvars may not contain factor variables or time-series operators

mlogit

fe option: fe not allowed
used i.year: omits certain variables and/or does not give me standard errors and only shows coefficients (example in code below)

* Read file
  import delimited using "datafile.csv", clear case(preserve)
  
* Run regression
    mlogit DV1 IV1 IV2 IV3 IV4 IV5 IV1##IV2 i.year, base(0) iterate(1000)

Stata results

xtmlogit

error - does not run
error message: total number of permutations is 2,389,461,218; this many permutations require a considerable amount of memory and can result in long run times; use option force to proceed anyway, or consider using option rsample()

These questions are difficult to ask and difficult to answer, really presuming here some reader with very detailed understanding of the functionality of both Stata and R in this territory (not me). Generic advice: Make your question self-contained with a minimal data example and code that is sufficient to see the problem. I doubt that the minimal example would need six predictors. Also, don't ask several questions at once: your last paragraph slips in something different. — Nick Cox
Thanks for the comment, I edited to include a data example to add some clarity in my question. — Emily
Your mlogit Stata results do not converge, possibly because there is no relation between $y$ and $X$ in your random data. With your observed data, multinom and femlogit may give different results because they estimate different models (and may have different convergence criteria). As far as I know, femlogit (and now xtmlogit) are the only fixed-effect multinomial logit commands/functions in public Stata, R, and Python packages. I think femlogit will converge (if convergence is possible) if you follow @Danferno's answer. — Richard Herron

Danferno Danferno · Accepted Answer · 2021-06-01T08:14:40

Fixed effects and non-linear models (such as logits) are an awkward combination. In a linear model you can simply add dummies/demean to get rid of a group-specific intercept, but in a non-linear model none of that works. I mean you could do it technically (which I think is what the R code is doing) but conceptually it is very unclear what that actually does.

Econometricians have spent a lot of time working on this, which has led to some work-arounds, usually referred to as conditional logit. IIRC this is what's implemented in femlogit. I think the mistake in your code is that you tried to include the fixed effects through a dummy specification (i.year). Instead, you should xtset your data and then run femlogit without the dummies.

xtset year
femlogit DV1 IV1 IV2 IV3 IV4 IV5 IV1##IV2

Note that these conditional logit models can be very slow. Personally, I'm more a fan of running two one-vs-all linear regressions (1=1 and 0/-1 set to zero, then -1=1 and 0/1 set to zero). However, opinions are divided (Wooldridge appears to be a fan too, many others very much not so).

Multinomial Logit Fixed Effects: Stata and R

1 Answers