In log(ll) : NaNs produced

bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Post by bokapatsila »

Hi! When estimating an ICLV model in Apollo, I'm getting a message:

There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In log(ll) : NaNs produced
...
...
50: In log(ll) : NaNs produced

I'm not sure if that's something that should be of concern, as I don't have any NaNs in the results. Can someone please explain this to me? Thanks in advance!
dpalma
Posts: 190
Joined: 24 Apr 2020, 17:54

Re: In log(ll) : NaNs produced

Post by dpalma »

Hi,

This is probably because you are using either ordered logit or linear models for the measurement equations in your ICLV. If that is the case, and you achieve convergence despite the warnings, then I would not worry about it.

The warnings are generated because the likelihood maximisation algorithm (BFGS, BHHH, or whichever you are using) sometimes tries combinations of parameters that are not valid. For example, in the case of a linear model it might try a negative s.d., or in the ordered logit it might try thresholds that are not increasing (e.g. tau_1 > tau_3). This is not a problem because, when it happens, the log-likelihood evaluates to NaN ("not a number"), so the algorithm realises that is not the way to the solution and tries different values instead. If the model converges despite these warnings, then the optimum lies in the region where the likelihood is well defined, and therefore there isn't a problem.
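To make the mechanism concrete, here is a minimal base R sketch (not Apollo's internal code) of an ordered logit probability. The function name ol_prob and all numeric values are hypothetical and chosen only for illustration. With non-increasing thresholds the "probability" becomes negative, and taking its log yields NaN together with the warning "NaNs produced".

```r
# Ordered logit probability of observing category y (1..K), given utility V
# and thresholds tau (length K-1). Plain base R, for illustration only.
ol_prob <- function(y, V, tau) {
  cuts <- c(-Inf, tau, Inf)
  plogis(cuts[y + 1] - V) - plogis(cuts[y] - V)
}

V <- 0.3  # hypothetical utility

# Valid (increasing) thresholds: the probability is positive and its log is finite.
p_ok  <- ol_prob(2, V, tau = c(-1, 0.5, 2))
ll_ok <- log(p_ok)

# Invalid thresholds (tau_1 > tau_2): the "probability" of category 2 is negative,
# so log() returns NaN. The warning is suppressed here only to keep the output clean.
p_bad  <- ol_prob(2, V, tau = c(0.5, -1, 2))
ll_bad <- suppressWarnings(log(p_bad))

print(c(p_ok = p_ok, ll_ok = ll_ok, p_bad = p_bad, ll_bad = ll_bad))
```

An optimiser that receives NaN for such a trial point simply rejects it and tries other values, which is why the warnings are harmless when the model converges.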

I would only worry if, besides getting these warnings, your model also struggles to converge. That could mean that the optimum is near or inside the area where your likelihood is not defined (e.g. negative s.d. for the linear model, or non-increasing thresholds).

If you want to avoid these warnings, you could define the s.d. in a linear model (apollo_normalDensity) as exp(s), where "s" is the parameter to estimate, so you make sure that the value is positive. In the case of the ordered logit, you could do tau_1 = t1, tau_2 = tau_1 + exp(t2), tau_3 = tau_2 + exp(t3), ... where t1, t2, t3, ... are the parameters to be estimated, making sure all taus are increasing. But I don't recommend doing either of these, as it will slow down estimation and make parameters more difficult to interpret, for no real benefit.
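For reference only, the two transformations can be sketched in base R as follows. The raw parameter values are hypothetical; in an actual model, s, t1, t2 and t3 would be the estimated parameters, and the transformed sigma and tau values would be the ones passed on to apollo_normalDensity or apollo_ol.

```r
# Hypothetical raw parameters that an optimiser can move freely over the real line.
s  <- -0.7   # raw parameter for the s.d.
t1 <- -1.2   # raw parameters for the thresholds
t2 <-  0.4
t3 <- -2.0

# s.d. of the linear measurement model: exp() guarantees a positive value.
sigma <- exp(s)

# Thresholds: the first is free, each later one adds a strictly positive increment.
tau_1 <- t1
tau_2 <- tau_1 + exp(t2)
tau_3 <- tau_2 + exp(t3)
tau   <- c(tau_1, tau_2, tau_3)

# Equivalent compact form using a cumulative sum.
tau_compact <- cumsum(c(t1, exp(c(t2, t3))))

print(sigma)
print(tau)
```

Whatever values the raw parameters take, sigma stays positive and the thresholds stay increasing, so the log-likelihood never becomes NaN for these reasons. The cost is that s, t2 and t3 are no longer directly interpretable as an s.d. or as thresholds.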

In summary, I wouldn't worry about it if you are using apollo_normalDensity or apollo_ol, and the estimation converges without issues.

Cheers
David
bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Re: In log(ll) : NaNs produced

Post by bokapatsila »

Thank you, David. You're correct, I'm using the ordered logit, and I had the same assumption but wanted to verify this.

Another question I have regarding this ICLV model is its interpretation. In the Advanced Choice Modelling Course, Stephane provides an example of interpretation for a positive lambda. In particular, a positive lambda indicates that a respondent with a more positive latent variable is more likely to choose an option, while the zetas suggest that they are more likely to agree with a statement in the same case.

How should I interpret a negative lambda? I'm confused because the zetas remain positive. Does a negative lambda indicate that a respondent with a more negative latent variable is more likely to choose an option, while the zetas still suggest that they are more likely to agree with a statement in the same case?
dpalma
Posts: 190
Joined: 24 Apr 2020, 17:54

Re: In log(ll) : NaNs produced

Post by dpalma »

Just to standardise notation and avoid confusion:
  • The structural equation is LV = a1*z1 + a2*z2 + eta, where LV is the latent variable, a1 and a2 are parameters to be estimated, z1 and z2 are explanatory variables (maybe socio-demographics), and eta is a standard normal error term.
  • The measurement equation is an ordered logit with utility V_ol = zeta * LV, where zeta is a parameter to be estimated.
  • The choice model has an alternative 1 with utility V_1 = b1*x1 + b2*x2 + lambda*LV, where b1, b2 and lambda are parameters to be estimated, and x1 and x2 are explanatory variables (maybe attributes of the alternative).
The parameter zeta has to do with the interpretation of the LV. In this setting, if zeta>0 and significant, it means that the indicator and the LV are positively correlated. For example, if the indicator is the level of agreement with the phrase “I feel very satisfied” from 1 to 5, and zeta>0, then your LV would be measuring satisfaction. If zeta<0, then the LV would be measuring dissatisfaction, because the higher the LV, the lower the indicator.

The parameter lambda relates to the influence of the LV on the probability of choosing an alternative. In the setting described above, a significant lambda>0 means that the higher the LV, the higher the chances of selecting alternative 1. In our example, zeta>0 and lambda>0 mean that satisfied people are more likely to choose alternative 1. If lambda<0 and significant, it means that the higher the LV, the lower the chances of selecting alternative 1. So, in our example, zeta>0 and lambda<0 mean that satisfied people are less likely to choose alternative 1.

There is no theoretical requirement for both zeta and lambda to be positive. It is just a matter of interpretation.
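The sign logic can be checked numerically with a small base R sketch. Everything here is an assumption made for illustration: the values of zeta, lambda and tau are made up, the indicator has 5 levels, the choice is binary against an alternative with utility 0, and the other terms of V_1 are held at zero.

```r
# Illustrative values only (not estimates from any model).
zeta   <- 1.0                # measurement equation: V_ol = zeta * LV
lambda <- -0.8               # choice model: V_1 = ... + lambda * LV
tau    <- c(-2, -1, 1, 2)    # thresholds for a 5-level indicator

# Probability of the highest level of agreement (5) in the ordered logit.
p_agree5 <- function(LV) 1 - plogis(tau[4] - zeta * LV)

# Binary logit probability of alternative 1 against an alternative with utility 0.
p_alt1 <- function(LV) plogis(lambda * LV)

LV_low  <- -1
LV_high <-  1

# zeta > 0: a higher LV raises the probability of strong agreement.
print(c(low = p_agree5(LV_low), high = p_agree5(LV_high)))

# lambda < 0: a higher LV lowers the probability of choosing alternative 1.
print(c(low = p_alt1(LV_low), high = p_alt1(LV_high)))
```

With zeta>0 the LV keeps its meaning (here, satisfaction), and the negative lambda only says that people with a higher LV are less likely to choose alternative 1.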

Hope this explanation helps.

Cheers
David
bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Re: In log(ll) : NaNs produced

Post by bokapatsila »

That makes total sense, thank you so much for this detailed explanation!
bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Re: In log(ll) : NaNs produced

Post by bokapatsila »

Dear David,

What would be a proper interpretation of the a1 parameter (e.g. it's a dummy for gender, with women coded as 1)?
dpalma
Posts: 190
Joined: 24 Apr 2020, 17:54

Re: In log(ll) : NaNs produced

Post by dpalma »

Using the notation from my earlier reply, a1 and a2 measure the influence of explanatory variables on the LV.

For example, let's imagine LV represents an individual's satisfaction. Its structural equation is LV = a1*female + a2*income + eta, where "female" is a dummy taking the value 1 if the individual is female and 0 otherwise; "income" is a continuous variable indicating the monthly income of the individual; "eta" is a standard normal error term; and "a1" and "a2" are parameters to be estimated. Then a1>0 means that, on average and for a given income, females have a higher level of satisfaction than other people in your sample. On the other hand, if a1<0, it means that females have lower satisfaction than the rest of the people. If a2>0 (a2<0), then people with higher income are, on average, more (less) satisfied than people with lower income.
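As a small numerical illustration in base R, the expected value of the LV can be compared across groups. The values of a1 and a2 and the income levels are hypothetical; since eta has mean zero, it drops out of the expectation.

```r
# Hypothetical parameter values, for illustration only.
a1 <- 0.5     # effect of the female dummy
a2 <- 0.002   # effect of one unit of monthly income

# Expected LV: eta has mean zero, so it does not appear here.
expected_LV <- function(female, income) a1 * female + a2 * income

# Gap between a female and a non-female respondent with the same income.
gender_gap <- expected_LV(1, 2000) - expected_LV(0, 2000)

# Gap between two non-female respondents whose incomes differ by 1000.
income_gap <- expected_LV(0, 3000) - expected_LV(0, 2000)

print(c(gender_gap = gender_gap, income_gap = income_gap))
```

The gender gap equals a1 and the income gap equals a2 times the income difference (up to floating-point error), which is the sense in which a1 and a2 measure the influence of each variable on the LV.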

Cheers
David
bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Re: In log(ll) : NaNs produced

Post by bokapatsila »

Fantastic, thank you!