MDCEV problems

ChrisDjie
Posts: 12
Joined: 07 Jul 2023, 10:20

Post by ChrisDjie »

Dear Professor,

I am running into difficulties trying to estimate an MDCEV of energy, housing, and other expenditures.
It concerns privacy-sensitive data of (a random subsample of 10k out of) 7 million households.
The focus is the amount of money spent, as virtually all households spend some money on energy and housing.
We are thus interested in the gamma parameters.
The outside "other" category is defined as the budget minus the other expenditures.
Households tend to spend roughly 0.1-50% of their budget on energy versus roughly 1-95% on housing (not real data, only the order of magnitude, due to strict privacy regulations).
Households with less budget naturally tend to spend a higher fraction on energy and housing.
All exogenous variables are sociodemographic (currently: whether people are retired, their local address density, and their building's age).
Sigma is fixed to 1.
Alpha values are fixed to zero for now.
The Apollo version is 0.3.5.

However, I keep running into the following issues:
1. I cannot properly estimate the beta parameters due to the lack of corner cases. I tried creating artificial zeros by specifying a minimum consumption level, but that (naturally) results in very low likelihoods for those entries. I also tried fixing beta_energy and beta_housing to 10 or 100, but this does not seem to work properly either.
2. The gamma values estimated are extremely low: in the E-4 to E-8 range.
3. I frequently get zero likelihoods at starting values. When carefully modifying these starting values, I instead get "False Convergence"/"Unconstrained Optimization".

I tried apollo_searchStart, but this does not result in any good options as the likelihoods tend to be zero at starting values.
I also computed the probabilities using apollo_probabilities with functionality = "output", but this has not yielded sufficient insight (other than showing that likelihoods for entries with imposed zeros are extremely low and that households with huge budgets cannot be modeled properly either).
Data cleaning is limited to removing households with very high/low budgets or energy/housing expenditures.
The reason is that the datasets have been prepared by third parties (Statistics Netherlands), with pseudonymized household identifiers and little background information.
Sharing or copying the R-file is impossible due to the remote environment (they want to ensure that we do not accidentally give away data on households in our code before we can download said code).

I realize that I am not exactly making this easy, but I was hoping you might have some general thoughts or recommendations?
Maybe MDCEV is not suitable for this after all, since no discrete choices of zero consumption are really being made?

Finally, I keep getting confused about what the actual output of the model would be if it were working.
I thought this would be in input units (i.e. euros), but my gamma parameters do not seem to depend on the scaling of these inputs (i.e. they are similarly small whether I model euros or thousands of euros)?

Thank you so much,

Yours sincerely,

Chris ten Dam
dpalma
Posts: 217
Joined: 24 Apr 2020, 17:54

Re: MDCEV problems

Post by dpalma »

Dear Chris,

As a first step, I would recommend not fixing the alpha values to zero. They usually tend towards zero, but fixing them at exactly zero might cause problems. So I would recommend fixing them to something very close to zero (from the right), but not quite zero, say 1e-9.
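To illustrate why exactly zero can be awkward while a tiny positive value is harmless, here is a minimal Python sketch. It assumes the standard Bhat (2008) form of the MDCEV utility term, (gamma/alpha) * psi * ((x/gamma + 1)^alpha - 1); it is not Apollo's internal code, and the function names and numbers are hypothetical.

```python
import math

def mdcev_term(x, gamma, alpha, psi=1.0):
    """Utility term (gamma/alpha) * psi * ((x/gamma + 1)**alpha - 1)."""
    return (gamma / alpha) * psi * ((x / gamma + 1.0) ** alpha - 1.0)

def mdcev_term_log_limit(x, gamma, psi=1.0):
    """Limit of the term above as alpha -> 0: gamma * psi * ln(x/gamma + 1)."""
    return gamma * psi * math.log(x / gamma + 1.0)

# Hypothetical consumption and satiation values, for illustration only.
x, gamma = 50.0, 2.0

# A tiny positive alpha reproduces the logarithmic limit very closely.
near_zero = mdcev_term(x, gamma, alpha=1e-9)
limit = mdcev_term_log_limit(x, gamma)

# Plugging alpha = 0 into the general formula divides by zero.
try:
    mdcev_term(x, gamma, alpha=0.0)
    alpha_zero_ok = True
except ZeroDivisionError:
    alpha_zero_ok = False

print(near_zero, limit, alpha_zero_ok)
```

In this sketch the general formula is simply undefined at alpha = 0, whereas alpha = 1e-9 is practically indistinguishable from the log form.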

Concerning the interpretation of the MDCEV parameters, they do not have any meaningful unit, so beyond their sign and relative magnitude, they are very difficult to interpret. For the betas you are basically looking at the sign, and you have to keep the magnitude of the corresponding covariate in mind before saying anything about their size. The gammas should always be positive, and a larger gamma means more consumption of that alternative. In general, changing the unit of the dependent variable does change the magnitude of the gammas, so I am surprised that the gammas do not change when you work in euros or thousands of euros.
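The expected dependence of the gammas on the unit can be checked with a small numerical sketch in Python. It assumes the Bhat (2008) formulation with all alphas at (or near) zero, where the deterministic part for an inside good is beta'z - ln(x/gamma + 1) - ln(price) and for the outside good is -ln(x_outside). The function names and the euro amounts are hypothetical, and this is not Apollo code.

```python
import math

def v_inside(x, gamma, beta_z=0.0, price=1.0):
    """Deterministic part for an inside good when alpha -> 0."""
    return beta_z - math.log(x / gamma + 1.0) - math.log(price)

def v_outside(x_outside):
    """Deterministic part for the outside good when alpha -> 0."""
    return -math.log(x_outside)

# Hypothetical household: expenditures in euros, and a hypothetical gamma.
energy_eur, other_eur, gamma_eur = 1800.0, 30000.0, 250.0
scale = 1000.0  # switch from euros to thousands of euros

# Inside good: unchanged if gamma is divided by the same factor as x.
v_energy_eur = v_inside(energy_eur, gamma_eur)
v_energy_keur = v_inside(energy_eur / scale, gamma_eur / scale)

# Outside good: shifts by the constant ln(scale), the same for every household.
shift = v_outside(other_eur / scale) - v_outside(other_eur)

print(v_energy_eur, v_energy_keur, shift)
```

Under these assumptions, a model in thousands of euros matches the model in euros when each gamma is 1000 times smaller, with the remaining constant ln(1000) being something an estimated alternative-specific constant could absorb. Gammas that stay the same across units would therefore be worth double-checking.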

More generally, and just as you mention in your post, I am not sure MDCEV is the right tool for the job. MDCEV is mainly an allocation model, where the budget is allocated to different goods or alternatives, allowing for zero allocation (corner solutions) for some alternatives or goods. Yet you mention that almost every household spends on housing and energy, so you have very few zeros, making MDCEV's main characteristic (allowing for zero allocation) irrelevant. I am not familiar with the energy literature, but you may want to explore other modelling approaches. Maybe you can use fractional logit to model the % of the budget allocated to housing, energy, and other expenditure. Then you can use the household income as an explanatory variable. There are more details about the fractional logit model in section 5.3 of Apollo's manual.
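As a rough illustration of the fractional logit idea, the Python sketch below turns expenditures into budget shares and evaluates the usual quasi log-likelihood, sum_j share_j * ln(P_j), with P_j given by a multinomial logit. All function names, utilities and amounts are hypothetical; this is not Apollo code, and in a real model the utilities would depend on covariates such as household income.

```python
import math

def budget_shares(energy, housing, budget):
    """Shares of the budget spent on energy, housing and everything else."""
    other = budget - energy - housing
    return [energy / budget, housing / budget, other / budget]

def mnl_probs(utilities):
    """Multinomial logit probabilities (max subtracted for stability)."""
    m = max(utilities)
    exps = [math.exp(v - m) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def fractional_logit_ll(shares, utilities):
    """Quasi log-likelihood of one household: sum_j share_j * ln(P_j)."""
    probs = mnl_probs(utilities)
    return sum(s * math.log(p) for s, p in zip(shares, probs) if s > 0)

# Hypothetical household, amounts in euros per year.
shares = budget_shares(energy=1800.0, housing=9000.0, budget=30000.0)

# Hypothetical utilities for energy, housing and other ("other" normalised to 0).
utilities = [-2.8, -1.2, 0.0]
ll = fractional_logit_ll(shares, utilities)

# The contribution is largest when predicted shares equal observed shares.
ll_best = fractional_logit_ll(shares, [math.log(s) for s in shares])

print(shares, ll, ll_best)
```

Because the dependent variable is a share between 0 and 1, no zero-consumption observations are needed for this kind of model to be identified.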

I know this is probably not the answer you were expecting, but I hope it is somewhat useful.

Best wishes,
David