Goodness of fit MMNL and HCM models

Posted: 31 Jul 2023, 14:03
by Lin
Dear Prof. Hess,

I apologize if this is not the best place for these questions.

1) Can you tell me why the R2 score is not reported in the HCM output? It is reported in the MMNL model output and corresponds to McFadden's pseudo R2: 1 - (deviance of the fitted model / deviance of the null model), equivalently 1 - LL(fitted) / LL(null). The HCM output provides the log-likelihood values needed to calculate McFadden's R2, just as an MMNL output does. Would it therefore be wrong to calculate McFadden's R2 by hand, using the formula?
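To make the formula concrete, here is a minimal sketch of the by-hand calculation; the log-likelihood values are made up for illustration and do not come from any actual model run:

```python
def mcfadden_rho2(ll_fitted: float, ll_null: float) -> float:
    """McFadden's pseudo R^2: 1 - LL(fitted) / LL(null).

    Both log-likelihoods are negative; a better-fitting model has a
    less negative LL(fitted), so the ratio shrinks and rho^2 grows.
    """
    return 1.0 - ll_fitted / ll_null

# Illustrative (made-up) values:
print(mcfadden_rho2(-1200.5, -1800.0))  # roughly 0.333
```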

2) If I "add" a latent variable to what would otherwise be an MMNL model, the fit measure gets worse. I understand this as the result of new "constraints" in the model, implemented in the form of the indicator questions' scores, so the log-likelihood of such a model will be lower than that of the MMNL. However, I am not sure how to explain that theoretically. In technical terms, there are new independent variables, and as such they should only improve the log-likelihood. Even if I do not treat the latent variable itself as an independent variable, its indicators should, in technical terms, still be new independent variables in the model.

3) Considering the previous questions, what would be an appropriate goodness-of-fit measure for both MMNL and HCM? As far as I understand, log-likelihood, AIC, and BIC in a case like this only show that there is a certain trade-off (a worse HCM fit due to the added latent variables). What measure can indicate each model's fit to the data, independently per model? Until I saw that it is not available in the HCM output, I was counting on McFadden's pseudo R2.
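For completeness, AIC and BIC are also easy to compute by hand from the final log-likelihood; a minimal sketch with made-up values (LL, number of parameters, and number of observations are all hypothetical):

```python
import math

def aic(ll: float, k: int) -> float:
    """Akaike information criterion: 2k - 2*LL (lower is better)."""
    return 2 * k - 2 * ll

def bic(ll: float, k: int, n: int) -> float:
    """Bayesian information criterion: k*ln(n) - 2*LL (lower is better)."""
    return k * math.log(n) - 2 * ll

# Illustrative (made-up) values: LL = -1200.5, 10 parameters, 500 observations
print(aic(-1200.5, 10))       # 2421.0
print(bic(-1200.5, 10, 500))  # a bit larger, since ln(500) > 2
```

Note that both criteria, like the log-likelihood itself, are only meaningful for comparing models estimated on the same data.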

Thank you,
Lin

Re: Goodness of fit MMNL and HCM models

Posted: 10 Aug 2023, 09:44
by stephanehess
Lin

using your numbering

1) the LL measures reported for HB models are simulated by averaging across the post-burn-in iterations. Rho2 is also calculated and reported, but not for a hybrid choice model if that includes non-discrete-choice parts.

2) the LL for a hybrid choice model will always be more negative than for a simple choice model, as you have more dependent variables. And you should not compare the fit of the hybrid choice model on the choice data to that of a choice model alone, as a choice model with equivalent flexibility will explain the choices better. See the theoretical discussion in https://doi.org/10.1016/j.trb.2016.04.021 and an example illustration in https://doi.org/10.1016/j.socscimed.2014.05.058
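The point about extra dependent variables can be sketched numerically: the joint log-likelihood of a hybrid model adds the (negative) log-likelihood of the indicator measurement component to the choice component, so the joint LL is necessarily more negative than the choice LL alone. All values below are made up for illustration:

```python
# Illustrative (made-up) component log-likelihoods of a hybrid choice model
ll_choice = -1200.5      # choice component
ll_indicators = -950.2   # indicator measurement component (also negative)

# The joint LL sums the two components, so it is always more negative
# than the choice LL on its own:
ll_hybrid = ll_choice + ll_indicators
print(ll_hybrid)                # -2150.7
print(ll_hybrid < ll_choice)    # True
```

This is why comparing the hybrid model's joint LL against a pure choice model's LL is not a like-for-like comparison.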

3) goodness-of-fit measures in choice modelling are not like those in regression. You should not use them to evaluate a model's performance across different datasets, only for comparisons against other models estimated on the same data.

Stephane