Goodness of fit MMNL and HCM models
Posted: 31 Jul 2023, 14:03
Dear Prof. Hess,
I apologize if this is not the best place for these questions.
1) Can you tell me why the R2 score isn't available in an HCM output? It is available in MMNL model output and corresponds to McFadden's pseudo R2: 1 - (deviance of the fitted model / deviance of the null model). The HCM output provides the log-likelihood values needed to calculate McFadden's R2, just like an MMNL model output. So, is there anything wrong with calculating McFadden's R2 by hand using that formula?
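For concreteness, this is how I would compute it by hand from the log-likelihoods reported in the output (a minimal sketch; the log-likelihood values below are made up purely for illustration):

```python
def mcfadden_r2(ll_model: float, ll_null: float) -> float:
    """McFadden's pseudo R2 = 1 - LL(fitted model) / LL(null model).

    Equivalent to 1 - (deviance of fitted / deviance of null),
    since deviance = -2 * LL and the factor -2 cancels in the ratio.
    """
    return 1.0 - ll_model / ll_null

# Hypothetical log-likelihoods from an estimation output:
ll_null = -1500.0    # null (e.g. equal-shares) model
ll_fitted = -1100.0  # fitted model

print(round(mcfadden_r2(ll_fitted, ll_null), 4))  # 1 - 1100/1500 ≈ 0.2667
```

Is this calculation still meaningful when the log-likelihoods come from an HCM output?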
2) If I "add" a latent variable to what would otherwise be an MMNL model, the model fit gets worse. I understand this as a consequence of the new "constraints" in the model, implemented in the form of the indicator questions' scores; thus, the log-likelihood of such a model will be lower than that of the MMNL... However, I am not sure how to explain that theoretically. In technical terms, these are new independent variables, and as such they should only improve the log-likelihood. Even if I don't count the latent variable itself as an independent variable, its indicators should, technically speaking, act as new independent variables in the model.
3) Given the previous questions, what would be an appropriate goodness-of-fit measure for both MMNL and HCM? As far as I understand, the log-likelihood, AIC, and BIC in a case like this only show that there is a certain trade-off (worse HCM fit due to the added latent variables). But what measure can indicate each model's fit to the data, independently per model? Until I saw that it is not available in the HCM output, I was counting on McFadden's pseudo R2.
Thank you,
Lin