Important: Read this before posting to this forum

  1. This forum is for questions related to the use of Apollo. We will answer some general choice modelling questions too, where appropriate, and time permitting. We cannot answer questions about how to estimate choice models with other software packages.
  2. There is a very detailed manual for Apollo available at http://www.ApolloChoiceModelling.com/manual.html. This contains detailed descriptions of the various Apollo functions, and numerous examples are available at http://www.ApolloChoiceModelling.com/examples.html. In addition, help files are available for all functions, using e.g. ?apollo_mnl
  3. Before asking a question on the forum, users are kindly requested to follow these steps:
    1. Check that the same issue has not already been addressed in the forum - there is a search tool.
    2. Ensure that the correct syntax has been used. For any function, detailed instructions are available directly in Apollo, e.g. by using ?apollo_mnl for apollo_mnl
    3. Check the frequently asked questions section on the Apollo website, which discusses some common issues/failures. Please see http://www.apollochoicemodelling.com/faq.html
    4. Make sure that R is using the latest official release of Apollo.
  4. If the above steps do not resolve the issue, then users should follow these steps when posting a question:
    1. provide full details on the issue, including the entire code and output, including any error messages
    2. posts will not immediately appear on the forum, but will be checked by a moderator first. This may take a day or two at busy times. There is no need to submit the post multiple times.

NaNs for the standard errors - Are the results always unacceptable?

Ask questions about errors you encounter. Please make sure to include full details about your model specifications, and ideally your model file.
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

NaNs for the standard errors - Are the results always unacceptable?

Post by cybey »

Hello everyone,

I have a rather general question regarding model identification. Following the discussions in the forum, I learned that NaNs for the standard errors in the output point to problems in model specification/identification. However, most forum posts are devoted to models where NaNs appear in (almost) all parameter estimates. My question now is whether a (very) small number of NaNs in the standard errors (in relation to the total number of parameters to be estimated) is still acceptable, especially if...
  • The model is indeed very complex, e.g. an ICLV model in WTP space, an ICLV model with a large number of LVs, etc.
  • Two data sources are used, such as forced choice and free choice data, or stated preferences and revealed preferences data (even though a parameter for differences in scale is estimated).
I have two data sets where this problem occurs. It is interesting to note that the number of NaNs decreases as the number of (e.g. Sobol) draws increases. However, at a certain point I reach the limit of the computational capacity available to me (256 GB of RAM), so I cannot tell whether it is actually an identification problem.
In my cases, the robust standard errors can still be estimated for the parameters where NaNs appear. However, my Google search suggests that in the case of high robust standard errors the results should be treated with caution, since the standard errors may be unbiased, but not the parameter estimates themselves.

[...] the probit (Q-) maximum likelihood estimator is not consistent in the presence of any form of heteroscedasticity, unmeasured heterogeneity, omitted variables (even if they are orthogonal to the included ones), nonlinearity of the form of the index, or an error in the distributional assumption [with some narrow exceptions as described by Ruud (198)]. Thus, in almost any case, the sandwich estimator provides an appropriate asymptotic covariance matrix for an estimator that is biased in an unknown direction.

Source: Greene, W. H., 2012. Econometric Analysis. Prentice Hall, Upper Saddle River, NJ, pp. 692-693

Unfortunately, I could not find a satisfactory answer in the FAQs, the forum or Google group. Therefore, I would be very happy to hear your expert opinion.

Best
Nico
stephanehess
Site Admin
Posts: 1039
Joined: 24 Apr 2020, 16:29

Re: NaNs for the standard errors - Are the results always unacceptable?

Post by stephanehess »

Hi Nico

whether NaNs appear for all or only some parameters, they point to a problem with your model. For example, if you overspecify the constants in the model, only the constants might show NaNs for the s.e.

So to me, it would never be acceptable to have NaNs even for some parameters. One thing you could try here is to see what happens when you compute the standard errors via bootstrapping.
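The bootstrapping suggestion can be sketched roughly as follows. This is a minimal illustration, not a complete model file: it assumes the standard Apollo objects (apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs) have already been defined as in any Apollo script, and the settings shown are only the most basic ones; see ?apollo_bootstrap for the full list of options.

```r
### Hedged sketch: bootstrapped standard errors via apollo_bootstrap.
### Assumes apollo_beta, apollo_fixed, apollo_probabilities and
### apollo_inputs are already defined in the usual way.
bootstrap_output = apollo_bootstrap(
  apollo_beta, apollo_fixed,
  apollo_probabilities, apollo_inputs,
  bootstrap_settings = list(nRep = 100))  # number of bootstrap resamples
```

The bootstrap resamples individuals with replacement and re-estimates the model on each resample, so the resulting standard errors do not rely on inverting the Hessian (which is where the NaNs arise), but the repeated estimation can be computationally expensive.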

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Re: NaNs for the standard errors - Are the results always unacceptable?

Post by cybey »

Hi Stephane,

Thank you very much for your answer.

I suspect that the estimation problems may have something to do with the complexity of the models: HCM in WTP space and/or a high number of latent variables (three or four with several indicators each) and/or several model components (e.g. using LV to explain past choices). Estimation with HB would circumvent the identification problem, although the convergence problem remains. Is it okay to switch to HB if a model becomes "too complex" (whatever this means exactly) for classical estimation?

I have a second question that I would very much appreciate an answer to: Some studies (e.g. in the field of environmental psychology) use only one indicator per latent variable in their structural equation models. In such cases, strictly speaking, the LV is no longer latent, but observable for each respondent. SEM textbooks describe that with a single indicator you cannot separately estimate a measurement-error variance and a factor variance. What would the model specification in Apollo look like in such a case, e.g. with the HCM data set from the homepage?

Code: Select all

randcoeff[["LV"]] = gamma_LV_reg_user*regular_user + gamma_LV_university*university_educated + gamma_LV_age_50*over_50 + eta

Code: Select all

ol_settings1 = list(outcomeOrdered=attitude_quality,
                    V=zeta_quality*LV,
                    tau=c(tau_quality_1, tau_quality_2, tau_quality_3, tau_quality_4),
                    rows=(task==1))
ol_settings2 = list(outcomeOrdered=attitude_ingredients,
                    V=zeta_ingredient*LV,
                    tau=c(tau_ingredients_1, tau_ingredients_2, tau_ingredients_3, tau_ingredients_4),
                    rows=(task==1))
ol_settings3 = list(outcomeOrdered=attitude_patent,
                    V=zeta_patent*LV,
                    tau=c(tau_patent_1, tau_patent_2, tau_patent_3, tau_patent_4),
                    rows=(task==1))
ol_settings4 = list(outcomeOrdered=attitude_dominance,
                    V=zeta_dominance*LV,
                    tau=c(tau_dominance_1, tau_dominance_2, tau_dominance_3, tau_dominance_4),
                    rows=(task==1))
So, with a single indicator, the model would include only one such ordinal regression instead of four.
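For concreteness, a single-indicator version of the measurement model above might look like the following sketch. This is only an illustration under one common normalisation assumption: since the measurement-error variance and the factor variance cannot both be identified from a single indicator, the loading (zeta) is fixed to 1 rather than estimated; apollo_ol is the standard Apollo ordered-logit function.

```r
### Hedged sketch: single-indicator measurement model.
### Only one ordered-logit indicator is kept, and its loading is
### fixed to 1 for identification (a common SEM normalisation).
ol_settings1 = list(outcomeOrdered = attitude_quality,
                    V    = 1*LV,  # zeta_quality fixed to 1, not estimated
                    tau  = c(tau_quality_1, tau_quality_2,
                             tau_quality_3, tau_quality_4),
                    rows = (task==1))
P[["indic_quality"]] = apollo_ol(ol_settings1, functionality)
```

Whether this normalisation (fixing the loading vs. fixing the variance of eta in the structural equation) is the right one for a given model is a judgement call, and the two choices are equivalent up to scale.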

I look forward to your answers.
Nico
stephanehess
Site Admin
Posts: 1039
Joined: 24 Apr 2020, 16:29

Re: NaNs for the standard errors - Are the results always unacceptable?

Post by stephanehess »

Nico

on the first point, i.e. whether you should move to HB: it doesn't really circumvent the problem. It's not the case that your classical model fails to converge here, but rather that you can't compute standard errors. Complexity could be a reason for the calculations failing (though less so with the newer versions, which use analytical derivatives), but the other core reason could be an identification issue, which is of course important to know about. So an HB model might 'work', but unless you very carefully examine your posteriors, you might not spot that the model is not actually identified.

on the second point, my view has always been that in a hybrid choice model you have the additional dependent variable given by the choices, so even with a single indicator the latent variable is still calibrated on more than one data point per person. This doesn't mean it's a good idea to use just one indicator.

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Re: NaNs for the standard errors - Are the results always unacceptable?

Post by cybey »

Thank you very much.

I just realised that I didn't respond to your suggestion of using bootstrapping to estimate the standard errors. I have read the relevant section of the Apollo manual, but bootstrapping is probably too computationally burdensome for my models.

Best
Nico