Estimating Covariates with Unlabelled Alternatives

Posted: 07 Sep 2021, 18:39
by biancar
Hello,

Do you have an example of estimating the effects of covariates (e.g. gender, age group) on categorical attributes with unlabelled alternatives in an MMNL model? There are several examples on this website and in the forum of covariates being used in the analysis of labelled alternatives (e.g. shifts for females on mode choice). From what I can see (e.g. example Apollo file 3), the covariates there are added to the alternatives: "Create alternative specific constants and coefficients using interactions with socio-demographics". But I'm not interested in whether females were more likely to choose alternative 1 over alternative 2, so I tried adding my covariate (sex) to my utility equations instead. When I did this, I got the following error message: "Error in apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, : Parameter fare_est_a does not influence the log-likelihood of your model!"

Re: Estimating Covariates with Unlabelled Alternatives

Posted: 08 Sep 2021, 11:40
by dpalma
Hi,

In MNL models, only the differences between utilities matter. Consider, for example, an MNL model with two alternatives whose deterministic utilities are V1 and V2. Their probabilities can then be written as:
P(y=1) = exp(V1) / ( exp(V1) + exp(V2) ) = 1 / ( 1 + exp(V2 - V1) )
P(y=2) = exp(V2) / ( exp(V1) + exp(V2) ) = 1 / ( 1 + exp(V1 - V2) )
As the expressions above show, anything that is identical in V1 and V2 drops out of the probability expressions. It therefore influences neither the probabilities nor the likelihood, so it cannot be estimated.
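
To make this concrete, here is a minimal sketch in base R (not Apollo code). The helper `mnl_prob` and all parameter and attribute values are made up for illustration only. It adds the same covariate term to both utilities and checks that the choice probabilities do not move.

```r
# Multinomial logit probabilities for a vector of deterministic utilities
# (mnl_prob is a hypothetical helper, not part of Apollo)
mnl_prob <- function(V) {
  eV <- exp(V - max(V))  # subtracting the max improves numerical stability
  eV / sum(eV)
}

# Hypothetical values, chosen only for illustration
b          <- -0.5
bHigh      <-  1.2
x          <- c(alt1 = 2, alt2 = 3)
highIncome <- 1

V_without <- b * x                        # utilities without the covariate
V_with    <- b * x + bHigh * highIncome   # same covariate term in both utilities

p_without <- mnl_prob(V_without)
p_with    <- mnl_prob(V_with)

print(p_without)
print(p_with)

# The two probability vectors match, whatever value bHigh takes
print(isTRUE(all.equal(p_without, p_with)))
```

Because the probabilities are the same for any value of bHigh, the log-likelihood is flat in that parameter, which is the situation the error message above describes.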

I imagine your specification is something similar to:

Code:

V[['alt1']] = b*x1 + bHigh*highIncome
V[['alt2']] = b*x2 + bHigh*highIncome
...
Here, "b" and "bHigh" are parameters to be estimated. You can see that V2-V1 = b*(x2-x1), so bHigh*highIncome drops out of the expression. The same applies to MMNL models, as they are built on MNL models.

To estimate the effect of sociodemographics in unlabelled experiments, you basically have two approaches. The first is to give the sociodemographic a different coefficient in each alternative's utility and to leave it out of one alternative, which acts as the base. For example:

Code:

V[['alt1']] = b*x1 # this is the base alternative, so there are no sociodemographics in its utility
V[['alt2']] = b*x2 + bHigh2*highIncome
V[['alt3']] = b*x3 + bHigh3*highIncome
...
The second approach is to interact the sociodemographics with attributes that change from one alternative to the next (like "x" in the example). For example:

Code:

V[['alt1']] = (b + bHigh*highIncome)*x1
V[['alt2']] = (b + bHigh*highIncome)*x2
...
You can also mix both approaches, using one for some explanatory variables (some "x") and the other for the remaining explanatory variables.
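
As a rough numerical check of both approaches, here is a minimal sketch in base R (again not Apollo code). The helper functions, the three-alternative setup and all coefficient values are hypothetical. Under either specification, the choice probabilities differ between highIncome = 0 and highIncome = 1, so the sociodemographic coefficients do influence the likelihood.

```r
# Multinomial logit probabilities for a vector of deterministic utilities
# (mnl_prob is a hypothetical helper, not part of Apollo)
mnl_prob <- function(V) {
  eV <- exp(V - max(V))  # subtracting the max improves numerical stability
  eV / sum(eV)
}

# Hypothetical values, chosen only for illustration
b <- -0.5
x <- c(alt1 = 2, alt2 = 3, alt3 = 4)

# Approach 1: alternative-specific coefficients for the covariate,
# with alt1 as the base (its coefficient is fixed to zero)
bHigh_alt <- c(alt1 = 0, alt2 = 0.8, alt3 = -0.4)
prob_approach1 <- function(highIncome) {
  mnl_prob(b * x + bHigh_alt * highIncome)
}

# Approach 2: the covariate shifts the sensitivity to the attribute x
bHigh <- 0.3
prob_approach2 <- function(highIncome) {
  mnl_prob((b + bHigh * highIncome) * x)
}

p1_low  <- prob_approach1(0)
p1_high <- prob_approach1(1)
p2_low  <- prob_approach2(0)
p2_high <- prob_approach2(1)

print(rbind(p1_low, p1_high))
print(rbind(p2_low, p2_high))

# In both approaches the probabilities change with highIncome
print(!isTRUE(all.equal(p1_low, p1_high)))
print(!isTRUE(all.equal(p2_low, p2_high)))
```

In this sketch, both specifications collapse to b*x when highIncome = 0, so the low-income probabilities are the same under the two approaches and only the high-income rows differ.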

Hope this clarifies the issue.

Cheers
David