Dear Stephane/David
I am working with supermarket scanner data and am trying to analyze it using the Apollo eMDC implementation (mainly for “practicing” the use of these models). I have purchasing data for around 700 households over the course of a year, and have picked out 15-20 specific products that I am including as my “inside goods”. I have both a dataset where household purchases are aggregated on a weekly basis and one where they are aggregated on a monthly basis.
With around 15 products/alternatives, one of the following messages quite often appears (I do not include my code, as I refer to several different model specifications):
- Log-likelihood calculation fails at values close to the starting values,
- WARNING: Estimation failed. No covariance matrix to compute,
- Error in if (any(testL == 0)) cat("\nSome observations have zero probability at starting value for eMDCEV model component.").
In my case, this happens particularly when including a budget or when including explanatory variables (e.g. income) in the utility of the outside good, i.e. estimation runs smoothly when these factors are excluded but runs into trouble when they are included. One thing that seems to aid estimation is reducing the number of observations per individual (to reduce the chance that the product of many individual observations comes too close to zero). Using workInLogs=TRUE in apollo_control or starting values from simpler models generally doesn’t help. This leads me to three questions:
- Is it a general issue that models with a budget / outside good covariates are considerably more difficult to estimate?
- Is there a general “rule-of-thumb” as to the number of alternatives (and/or parameters) that it is feasible to include in an eMDC model? Or is the answer to this simply too case-dependent?
- Is it generally advisable to reduce the number of observations per individual (and “compensate” by increasing the number of individuals), in order to aid estimation of eMDC models with many alternatives/parameters?
I hope such "generic" questions are OK, and highly appreciate your answer!
Best regards
Tobias H. Rønn
General specification of eMDC (to avoid errors)
Re: General specification of eMDC (to avoid errors)
Hi Tobias,
Sorry for the very slow reply.
In my experience, this is a common issue with MDC models. The likelihood can become very small quite easily as the number of alternatives increases. The issue is further exacerbated when the base utilities depend on explanatory variables with large values, which can make the base utilities large even for small values of the parameters. On top of this, if you have multiple observations per individual, the product of these small probabilities shrinks towards zero very quickly.
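The effect of covariate scale can be seen in a small numerical sketch. It is written in Python purely as a language-agnostic illustration and does not use Apollo. The linear base utility, the income figure, the parameter value and the rescaling factor are all hypothetical assumptions, not values from the data discussed here.

```python
import math

def base_utility(beta_income, income):
    # Hypothetical linear base utility of the outside good: beta * income.
    return beta_income * income

def safe_exp(v):
    # math.exp raises OverflowError for very large arguments;
    # report that case as infinity so it can be inspected.
    try:
        return math.exp(v)
    except OverflowError:
        return math.inf

beta = 0.01                                # a "small" parameter value (assumed)
income_raw = 450_000.0                     # income in raw currency units (assumed)
income_scaled = income_raw / 100_000.0     # same income in units of 100,000

v_raw = base_utility(beta, income_raw)         # about 4500
v_scaled = base_utility(beta, income_scaled)   # about 0.045

print("exp(V), raw income:     ", safe_exp(v_raw))
print("exp(V), rescaled income:", safe_exp(v_scaled))
```

With the raw covariate, the exponentiated utility is no longer representable as a double even though the parameter is small, whereas the rescaled covariate keeps it close to 1.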
I do not have a clear answer for this, but below are some recommendations to find starting values:
- For the first few models, I recommend setting apollo_control$panelData = FALSE and removing the call to apollo_panelProd inside apollo_probabilities. This way you avoid the multiplication of small probabilities, and it becomes easier to estimate the model. Once you have found good starting values, you can undo these changes and consider the panel again.
- Start by estimating models with only ASCs and no explanatory variables. You can even try larger starting values for the ASCs of more popular alternatives.
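As a rough illustration of why dropping the panel product helps, the following Python sketch (not Apollo code) compares the product of per-observation likelihood contributions for one household with the sum of their logs. The contribution of 1e-8 per observation and the counts of 52 weekly and 12 monthly observations are assumptions chosen for illustration only.

```python
import math

def panel_likelihood(per_obs_likelihoods):
    # Product over all observations of one individual, which is
    # conceptually what a panel product does before taking logs.
    prod = 1.0
    for p in per_obs_likelihoods:
        prod *= p
    return prod

def pooled_log_likelihood(per_obs_likelihoods):
    # Treat each observation separately and sum the logs instead.
    return sum(math.log(p) for p in per_obs_likelihoods)

# Hypothetical contribution of a single observation with many alternatives.
p_obs = 1e-8

weekly = [p_obs] * 52    # one household, weekly aggregation (assumed)
monthly = [p_obs] * 12   # one household, monthly aggregation (assumed)

print("weekly product: ", panel_likelihood(weekly))
print("monthly product:", panel_likelihood(monthly))
print("weekly sum of logs:", pooled_log_likelihood(weekly))
```

Under these assumptions the weekly product underflows to exactly zero in double precision, while the monthly product and the sum of logs remain finite. This is consistent with the observation that fewer observations per individual make estimation easier.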
Best wishes
David