I'm new to discrete choice modeling, so my apologies if I am misunderstanding a fundamental aspect of the analysis.

I would like to run a discrete choice analysis with an individual-specific variable and what I think are alternative-specific attribute variables. From the mlogit vignette I think the individual-specific variable is a "choice situation specific covariate" (in the new vignette) and the alternative-specific attribute variables are "alternative specific covariates with generic coefficients" (again, in the new vignette). The alternative-specific attribute variables should not have differing impacts for the different alternatives, so I believe a generic coefficient that applies to all alternatives is in order.

Let's use the Fishing dataset as an example.

library(mlogit)

data(Fishing)
Fish1 <- dfidx(Fishing, varying = 2:9, choice = "mode", idnames = c("chid", "alt"),
    drop.index = FALSE)
Fish1

... which gets us:

~~~~~~~
 first 10 observations out of 4728
~~~~~~~
    mode   income     alt   price  catch chid    idx
1  FALSE 7083.332   beach 157.930 0.0678    1 1:each
2  FALSE 7083.332    boat 157.930 0.2601    1 1:boat
3   TRUE 7083.332 charter 182.930 0.5391    1 1:rter
4  FALSE 7083.332    pier 157.930 0.0503    1 1:pier
5  FALSE 1250.000   beach  15.114 0.1049    2 2:each
6  FALSE 1250.000    boat  10.534 0.1574    2 2:boat
7   TRUE 1250.000 charter  34.534 0.4671    2 2:rter
8  FALSE 1250.000    pier  15.114 0.0451    2 2:pier
9  FALSE 3750.000   beach 161.874 0.5333    3 3:each
10  TRUE 3750.000    boat  24.334 0.2413    3 3:boat

And then we fit the model:

(fit1 <- mlogit(mode ~ price+catch | income | 1, data=Fish1))

... which gets us:

Call:
mlogit(formula = mode ~ price + catch | income | 1, data = Fish1,     method = "nr")

Coefficients:
   (Intercept):boat  (Intercept):charter     (Intercept):pier                price
        0.527278790          1.694365710          0.777959401         -0.025116570
              catch          income:boat       income:charter          income:pier
        0.357781958          0.000089440         -0.000033292         -0.000127577

So far so good.

Now let's recode the price and catch (alternative-specific attribute variables) values to be alternative varying but individual invariant:

Fishing2 <- Fishing

Fishing2$price.beach   <- 50
Fishing2$price.pier    <- 100
Fishing2$price.boat    <- 150
Fishing2$price.charter <- 200
Fishing2$catch.beach   <- .2
Fishing2$catch.pier    <- .5
Fishing2$catch.boat    <- .75
Fishing2$catch.charter <- .87

Fish2 <- dfidx(Fishing2, varying = 2:9, choice = "mode", idnames = c("chid", "alt"),
    drop.index = FALSE)

Fish2

... which gets us:

~~~~~~~
 first 10 observations out of 4728
~~~~~~~
    mode   income     alt price catch chid    idx
1  FALSE 7083.332   beach    50  0.20    1 1:each
2  FALSE 7083.332    boat   150  0.75    1 1:boat
3   TRUE 7083.332 charter   200  0.87    1 1:rter
4  FALSE 7083.332    pier   100  0.50    1 1:pier
5  FALSE 1250.000   beach    50  0.20    2 2:each
6  FALSE 1250.000    boat   150  0.75    2 2:boat
7   TRUE 1250.000 charter   200  0.87    2 2:rter
8  FALSE 1250.000    pier   100  0.50    2 2:pier
9  FALSE 3750.000   beach    50  0.20    3 3:each
10  TRUE 3750.000    boat   150  0.75    3 3:boat

It seems to me that this is like a one-choice product comparison: each alternative has a fixed set of attributes (alternative-specific attribute variables with generic coefficients) that may influence an individual's decision. The individual's income, the individual-specific (or choice situation-specific, in the new vignette's terms) variable, might affect the decision as well, although its coefficient must vary by alternative, as the vignette shows.

BUT, when I try to run the model for the Fish2 dataset, it fails:

fit2 <- mlogit(mode ~ price+catch | income | 1, data=Fish2)
Error in solve.default(H, g[!fixed]) :
  system is computationally singular: reciprocal condition number = 3.18998e-23

I'm guessing that the fact that the alternative-specific attribute variables do not vary across choice situations is the problem, but I do not understand why, or how to fix it. It SEEMS to me like I should be able to analyze this situation with mlogit.

If there is another analysis technique that would accommodate this kind of question better, I'm open to suggestions.

2 Answers

0
votes

The error message you get is often the result of insufficient variation in the data. With insufficient variation, the Hessian matrix (the negative of the information matrix) becomes singular and cannot be inverted, so the solve() call shown in the error fails and you cannot get your standard errors either. There are many answers about this particular error message, for example here.

In your second example, if I understand correctly, each alternative is the same for all individuals, which means that you only have four different observations, one for each fishing location. While you observe each many times, you still only have 4 unique observations, but you are trying to fit 8 parameters. This is in all likelihood why your model fails.
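To see how a redundant column produces this kind of failure, here is a minimal sketch in base R. The matrix `X` is a made-up toy design (not the Fishing data), and `crossprod(X)` only stands in for the Hessian; it is not what mlogit computes internally.

```r
# Hypothetical 3-column design: column "c" is exactly a + b, so it adds no information
X <- cbind(a = c(1, 0, 1, 0),
           b = c(0, 1, 1, 0),
           c = c(1, 1, 2, 0))

# Cross-product matrix, standing in here for the (negative) Hessian
H <- crossprod(X)

# Reciprocal condition number: values near 0 signal a singular matrix
rc <- rcond(H)

# solve() refuses to invert it; capture the error message instead of stopping
res <- tryCatch(solve(H, c(1, 1, 1)),
                error = function(e) conditionMessage(e))
```

Printing `rc` and `res` should show a reciprocal condition number at or near zero and a singularity error from `solve()`, which is the same family of error reported in the question.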

0
votes

So, it turns out that there is a multicollinearity problem if you include alternative-specific covariates with generic coefficients and also keep the alternative-specific intercepts in the model. From the mlogit vignette:

The treatment of alternative specific variables don’t differ much from the alternative and choice situation specific variables with a generic coefficient. However, if some of these variables are introduced, the parameter can only be estimated in a model without intercepts to avoid perfect multicolinearity.
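The collinearity can be checked by hand with a small base R sketch. It uses the price and catch values from the Fish2 recoding in the question, takes beach as the reference alternative, and works with differences relative to beach (an assumption made here because only utility differences matter in a logit). It is an illustration of the idea, not a reproduction of the model matrix that mlogit builds.

```r
# Attribute values copied from the Fish2 recoding in the question
alts  <- c("beach", "boat", "charter", "pier")
price <- c(beach = 50, boat = 150, charter = 200, pier = 100)
catch <- c(beach = 0.20, boat = 0.75, charter = 0.87, pier = 0.50)

# Intercept dummies for boat, charter and pier (beach is the reference)
dummies <- diag(4)[, -1]
colnames(dummies) <- alts[-1]

# One row per alternative, then differences relative to the beach row
D <- cbind(dummies, price = price, catch = catch)
D <- sweep(D, 2, D[1, ])[-1, ]

# Stack the same rows for 5 hypothetical individuals: nothing varies across them
D_all <- D[rep(seq_len(nrow(D)), times = 5), ]

# With intercepts: 5 columns but only rank 3, so price and catch are redundant
rank_with_intercepts <- qr(D_all)$rank

# Without intercepts: the 2 attribute columns have full rank 2
rank_without_intercepts <- qr(D_all[, c("price", "catch")])$rank
```

With four alternatives there are only three independent differences per attribute, and the three intercepts already absorb them, so price and catch carry no extra information until the intercepts are dropped. The income terms are not affected because income varies across individuals.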

If I remove the intercepts:

(fit2 <- mlogit(mode ~ price+catch - 1 | income - 1, data=Fish2))

everything runs fine:

Call:
mlogit(formula = mode ~ price + catch - 1 | income - 1, data = Fish2, method = "nr")

 Coefficients:
          price           catch     income:boat  income:charter     income:pier  
   0.0117786865   -0.9155791943    0.0001061285    0.0000037033   -0.0000411957