Hi Stephane and David!
I would like to confirm whether the following output variables from the MDCEV model refer to:
cont_sd: Standard deviation of predicted continuous consumption (i.e., time allocated to an activity),
disc_sd: Standard deviation of the discrete 0/1 choice decision (whether an activity is chosen or not), and
expe_sd: Standard deviation of expected utility across alternatives or choice occasions.
Additionally, I seek clarification on the interpretation of these standard deviation measures. Specifically:
Do these standard deviations primarily represent modeling uncertainty (i.e., the error or unexplained variation arising because the model cannot fully capture all aspects of decision-making behavior)? Or, in the case of panel data, do they instead reflect intra-individual variability in behavior—that is, the genuine variation in an individual’s activity choices or time use across different days?
If the latter interpretation is valid (i.e., these standard deviations capture actual behavioral variability across repeated observations), can they then be used in further statistical analysis, for example, as dependent variables in models of instability in activity engagement patterns?
Prediction probabilities from MDCEV model
Re: Prediction probabilities from MDCEV model
Hi,
You are correct about cont_sd and disc_sd, but not expe_sd.
Before defining them, it is important to understand how the MDCEV forecast is calculated. The forecast is based on simulation. For each row in your data, a random draw is generated for each good or alternative, and then the consumer's utility maximisation problem is solved. The process is repeated multiple times for each individual (by default 100 times, but you can change it with the setting nRep), and the final forecast is the average across all those repetitions.
The reported measures are then defined as follows:
- cont_sd: The standard deviation of the continuous consumption for each row in the data, across all repetitions. Repetitions where this alternative was not consumed are included in the calculation.
- disc_sd: Standard deviation of a binary variable (0/1) indicating if an alternative was (1) or not (0) consumed during each repetition. The s.d. is calculated across all repetitions.
- expe_sd: Standard deviation of the expenditure (continuous_consumption*price) for each row of your data, across all repetitions.
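To make the three definitions concrete, here is a minimal Python sketch of the bookkeeping for a single row of data. It is not Apollo code and not Apollo's algorithm: toy_allocation is a hypothetical stand-in for the utility maximisation step, and the utilities, prices, budget and the use of the sample (n-1) standard deviation are all assumptions made for illustration. The only part that mirrors the description above is the summary step: repeat the simulation n_rep times, then take the mean and standard deviation across repetitions, counting non-consumption as zero.

```python
import math
import random
import statistics

def gumbel(rng):
    # Standard Gumbel draw; the uniform is clamped to avoid log(0).
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))

def toy_allocation(base_utils, budget, rng):
    # Hypothetical stand-in for the optimisation step (NOT Apollo's solver).
    # Alternative 0 acts as an outside good and is always consumed; the others
    # are consumed only if their drawn utility is positive.
    utils = [v + gumbel(rng) for v in base_utils]
    weights = [math.exp(u) if (j == 0 or u > 0) else 0.0
               for j, u in enumerate(utils)]
    total = sum(weights)
    # Expenditure per alternative; always sums to the budget.
    return [budget * w / total for w in weights]

def summarise(base_utils, prices, budget, n_rep=100, seed=1):
    rng = random.Random(seed)
    reps = [toy_allocation(base_utils, budget, rng) for _ in range(n_rep)]
    out = []
    for j, price in enumerate(prices):
        expe = [r[j] for r in reps]                    # expenditure per repetition
        cont = [e / price for e in expe]               # continuous consumption
        disc = [1.0 if c > 0 else 0.0 for c in cont]   # consumed (1) or not (0)
        out.append({
            "cont_mean": statistics.mean(cont),
            "cont_sd": statistics.stdev(cont),   # zeros are included
            "disc_mean": statistics.mean(disc),
            "disc_sd": statistics.stdev(disc),
            "expe_mean": statistics.mean(expe),
            "expe_sd": statistics.stdev(expe),
        })
    return out

if __name__ == "__main__":
    # Three alternatives, a 24-hour budget, made-up utilities and prices.
    for row in summarise([0.5, 0.0, -1.0], [1.0, 1.0, 2.0], budget=24.0):
        print({k: round(v, 3) for k, v in row.items()})
```

In this sketch all the variation comes from the random draws in the simulation, for a single row of data.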
Can these levels of variability be used in further statistical analysis? Depends on what you want to do. If you wanted to simulate decisions, you could use them to define probability distributions for each outcome (e.g. activity engagement in a transport activity-based model). But if that was the case, it would be better to simulate the utilities (including drawing from their epsilons), and solve the optimisation problem, as that would ensure your forecast respected the budget.
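As a small illustration of the budget point, the Python sketch below draws each alternative's consumption independently from a normal distribution defined by a mean and a standard deviation. The means, standard deviations, the 24-hour budget and the choice of a normal distribution are all made-up assumptions, not Apollo output. The means add up to the budget, but a set of independent draws generally does not.

```python
import random

def independent_draws(means, sds, rng):
    # One independent normal draw per alternative (illustrative assumption).
    return [rng.gauss(m, s) for m, s in zip(means, sds)]

budget = 24.0
means = [14.0, 6.0, 4.0]   # hypothetical mean forecasts, summing to the budget
sds = [3.0, 2.5, 2.0]      # hypothetical standard deviations

rng = random.Random(42)
draws = independent_draws(means, sds, rng)

if __name__ == "__main__":
    print("sum of means:", sum(means))
    print("sum of draws:", round(sum(draws), 3))
```

Independent draws can also come out negative, which is another reason to simulate the utilities and solve the optimisation problem instead.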
Best wishes,
David