Setting up multinomial logit model with mlogit package

Question

We are attempting to estimate a travel mode choice model using the mlogit package. Ultimately, we intend to set up a nested model with more variables; however, we are first trying to set up a very simple non-nested multinomial model as a test. In particular, what we're trying to accomplish differs from the examples in the mlogit package in that we have some alternative-specific (e.g. bike vs. walk vs. drive) utility functions.

Our starting dataset (simplified) has this form:

"recid","mode","walk_mode_time","bike_mode_time","carsdivworkers"
254,"Bike",15.0666484832764,4.51999473571777,0.5
7,"SOV",17.9941387176514,5.39824199676514,2
40,"Walk",43,12.8999996185303,1

The utility functions that we want to specify for this test model are as follows:

Utility(SOV)= beta1* carsdivworkers

Utility(Walk)= Constant(Walk)+ beta6*(walk_mode_time) + beta7 *( carsdivworkers)

Utility(Bike)= Constant(Bike)+ beta8*(bike_mode_time) + beta9 *( carsdivworkers)

To make our data look more like the examples in the mlogit documentation, we THINK we need to structure our data with:

Each record (which lists a chosen alternative) replicated to also include the non-chosen alternatives for a given trip.
Alternative-specific values zeroed out on the rows for every other alternative (e.g. walk_mode_time is non-zero only on the Walk row)

This results in a data structure that looks like:

"recid","mode","choice","walk_mode_time","bike_mode_time","cardivwkr"
7,"Bike",FALSE,0,5.39824199676514,1
7,"DriveTransit",FALSE,0,0,1
7,"HOV2",FALSE,0,0,1
7,"HOV3",FALSE,0,0,1
7,"SOV",TRUE,0,0,1
7,"Walk",FALSE,17.9941387176514,0,1
7,"WalkTransit",FALSE,0,0,1
40,"Bike",FALSE,0,12.8999996185303,0.5
40,"DriveTransit",FALSE,0,0,0.5
40,"HOV2",FALSE,0,0,0.5
40,"HOV3",FALSE,0,0,0.5
40,"SOV",FALSE,0,0,0.5
40,"Walk",TRUE,43,0,0.5
40,"WalkTransit",FALSE,0,0,0.5
254,"Bike",TRUE,0,4.51999473571777,1
254,"DriveTransit",FALSE,0,0,1
254,"HOV2",FALSE,0,0,1
254,"HOV3",FALSE,0,0,1
254,"SOV",FALSE,0,0,1
254,"Walk",FALSE,15.0666484832764,0,1
254,"WalkTransit",FALSE,0,0,1
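
For illustration, here is a minimal base R sketch of the replication and zeroing described above. It assumes the choice set is the seven modes that appear in the long table, and it uses the column name carsdivworkers from the wide data; the names wide, alts, chosen and long are hypothetical and not part of the original code.

```r
# Wide data copied from the three sample rows in the question (one row per trip).
wide <- data.frame(
  recid = c(254, 7, 40),
  mode = c("Bike", "SOV", "Walk"),
  walk_mode_time = c(15.0666484832764, 17.9941387176514, 43),
  bike_mode_time = c(4.51999473571777, 5.39824199676514, 12.8999996185303),
  carsdivworkers = c(0.5, 2, 1),
  stringsAsFactors = FALSE
)

# Assumption: the full choice set is the seven modes shown in the long table.
alts <- c("Bike", "DriveTransit", "HOV2", "HOV3", "SOV", "Walk", "WalkTransit")

# Keep the chosen mode under a separate (hypothetical) name, then cross-join
# trips with alternatives: merge() with by = NULL gives the Cartesian product.
names(wide)[names(wide) == "mode"] <- "chosen"
long <- merge(wide, data.frame(mode = alts, stringsAsFactors = FALSE), by = NULL)

# Flag the chosen alternative within each trip.
long$choice <- long$mode == long$chosen

# Keep each travel time only on the row of the alternative it belongs to.
long$walk_mode_time <- ifelse(long$mode == "Walk", long$walk_mode_time, 0)
long$bike_mode_time <- ifelse(long$mode == "Bike", long$bike_mode_time, 0)

# Sort by trip and alternative, and keep the columns used for estimation.
long <- long[order(long$recid, long$mode),
             c("recid", "mode", "choice",
               "walk_mode_time", "bike_mode_time", "carsdivworkers")]
```

The resulting long data frame has one row per trip and alternative and could be passed to mlogit.data() with shape="long" in place of a hand-built table.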

We then turn this into an mlogit data structure as follows:

logit_data <- mlogit.data(data=joined_data,
                          choice="choice",
                          shape="long",
                          alt.var="mode",
                          chid.var="recid",
                          drop.index=TRUE,
                          reflevel= "SOV")

And our model specification:

mc <- mlogit(formula = choice ~ 1 | carsdivworkers | walk_mode_time + bike_mode_time,
             data = logit_data, reflevel = "SOV")

Unfortunately, we get the following error when we run this against our full dataset:

Error in solve.default(H, g[!fixed]) : Lapack routine dgesv: system is exactly singular

We think that this formula specifies the utility functions we want, but are not sure. Is this correct? Also, do we need to manually replicate our data records as we have done? Or is there a way of having mlogit.data() build a set of choice alternatives from our initial dataset?

otsaw · Accepted Answer · 2012-09-26T12:23:49

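Given the way you have prepared walk_mode_time and bike_mode_time, you should probably try walk_mode_time + bike_mode_time | 1 + carsdivworkers | 0 as the formula. The third part of the formula estimates a separate coefficient for each alternative, and a variable that is zero on every row except one alternative's leaves most of those coefficients unidentified, which is a likely source of the singularity error.

I usually find it convenient to produce partially zeroed variables and use only the first part of the formula, i.e. walk_mode_time + bike_mode_time + walk_mode_carsdivworkers + bike_mode_carsdivworkers + ... | 1 | 0, with the *_carsdivworkers variables defined for one fewer than the number of alternatives (the coefficient for the omitted alternative is thus fixed at zero, and the others are estimated relative to it).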
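
As a rough sketch of the partially zeroed approach, the interaction variables can be built in base R by multiplying carsdivworkers by an indicator for the alternative. The toy data frame long below is hypothetical: it keeps only three modes and two trips from the question's table to stay short, so only the Walk and Bike interactions are shown.

```r
# Hypothetical toy long-format data: two trips, three alternatives each.
long <- data.frame(
  recid = rep(c(7, 40), each = 3),
  mode = rep(c("Bike", "SOV", "Walk"), times = 2),
  choice = c(FALSE, TRUE, FALSE, FALSE, FALSE, TRUE),
  walk_mode_time = c(0, 0, 17.9941387176514, 0, 0, 43),
  bike_mode_time = c(5.39824199676514, 0, 0, 12.8999996185303, 0, 0),
  carsdivworkers = c(1, 1, 1, 0.5, 0.5, 0.5)
)

# Partially zeroed interactions: carsdivworkers appears only on the rows of
# the named alternative. SOV gets no such variable, so it is the base.
long$walk_mode_carsdivworkers <- long$carsdivworkers * (long$mode == "Walk")
long$bike_mode_carsdivworkers <- long$carsdivworkers * (long$mode == "Bike")

# Everything goes in the first part of the formula; the second part keeps
# the alternative-specific constants and the third part is empty.
f <- choice ~ walk_mode_time + bike_mode_time +
  walk_mode_carsdivworkers + bike_mode_carsdivworkers | 1 | 0

# Estimation is not run here (the toy data is far too small). With the real
# data it would look like the call in the question:
# mc <- mlogit(f, data = logit_data, reflevel = "SOV")
```

With the full set of seven modes, the remaining alternatives would need their own interaction variables in the same way, always leaving one alternative out as the base.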

It's also possible that something is wrong with your data, e.g. choice situations with zero or more than one alternative chosen, a variable that has the same value for all alternatives, etc. If the formula 0 | 1 | 0 fails, you probably have a data problem; if it works, you have a formula problem.
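
The data checks mentioned above can be sketched in base R before any model is fitted. The data frame long here is a hypothetical example with two deliberately broken choice situations; the names n_chosen, bad_ids and varies_within are illustrative only.

```r
# Hypothetical long-format data with two deliberately broken trips.
long <- data.frame(
  recid = rep(c(7, 40, 254), each = 3),
  mode = rep(c("Bike", "SOV", "Walk"), times = 3),
  choice = c(FALSE, TRUE, FALSE,    # trip 7: exactly one chosen (fine)
             FALSE, FALSE, FALSE,   # trip 40: nothing chosen (problem)
             TRUE, FALSE, TRUE),    # trip 254: two chosen (problem)
  walk_mode_time = c(0, 0, 17.99, 0, 0, 43, 0, 0, 15.07),
  carsdivworkers = c(1, 1, 1, 0.5, 0.5, 0.5, 1, 1, 1)
)

# Check 1: every choice situation should have exactly one chosen alternative.
n_chosen <- tapply(long$choice, long$recid, sum)
bad_ids <- names(n_chosen)[n_chosen != 1]

# Check 2: which variables vary across alternatives within a trip? A variable
# that never does cannot take a single generic coefficient in the first part
# of the formula.
varies_within <- sapply(c("walk_mode_time", "carsdivworkers"), function(v) {
  any(tapply(long[[v]], long$recid, function(x) length(unique(x)) > 1))
})

# Diagnostic model from the answer: constants only. If this also fails on the
# real data, the data rather than the formula is the likely culprit.
f0 <- choice ~ 0 | 1 | 0
# mc0 <- mlogit(f0, data = logit_data, reflevel = "SOV")
```

Here bad_ids lists the trips that need fixing, and varies_within shows that carsdivworkers is constant within each trip, so it belongs in the second part of the formula or in partially zeroed interaction variables.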
