Integration of covariates into HB estimation

cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Integration of covariates into HB estimation

Post by cybey »

Hello, everyone,

I have a question about using covariates, following up on an old topic in the Google group. I would like to estimate a MIXL model with covariates using HB. With HB, I define the transformation in apollo_probabilities()…

Code:

b_Price_value = b_Price +
    eta_Income_Price * Income

… and obtain the conditionals for every coefficient in model$estimate. If I include the covariate coefficient as a fixed parameter, model$estimate contains only one value for it, since it is the same for the whole population. With a normal distribution, by contrast, I get one value per respondent.

1) In my example, I would like to obtain and plot the conditionals of b_Price_value, not of b_Price, for every respondent. Is there an easy way to do this in Apollo? When estimating with MSL, I can define this transformation directly in apollo_randCoeff(), which is unfortunately not possible with HB.
The only approach I can think of is to do it "by hand". If the covariate coefficient is normally distributed, I could simply add up the respondent-level conditionals from model$estimate to obtain b_Price_value. If it is fixed (“F”), I would instead have to multiply the single estimated coefficient by each respondent's characteristic (e.g. income) and add the result to that respondent's conditional for b_Price. Is that right?
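
A minimal sketch of the by-hand calculation described above, using made-up numbers. The vectors `post_mean_b_Price`, `post_mean_eta` and `Income`, and the value of `eta_Income_Price`, are hypothetical placeholders. How the respondent-level posterior means are extracted from the estimated model object is not shown here and should be checked against the Apollo manual. Adding posterior means in this way is only straightforward when the random terms are Normal.

```r
# Hypothetical respondent-level posterior means of b_Price (four respondents).
post_mean_b_Price <- c(-1.20, -0.85, -1.05, -0.60)

# Hypothetical respondent-level covariate.
Income <- c(1, 3, 2, 4)

# Case 1: covariate coefficient is fixed ("F"), one value for the whole sample.
eta_Income_Price <- 0.10
b_Price_value_fixed <- post_mean_b_Price + eta_Income_Price * Income

# Case 2: covariate coefficient is Normal ("N"), one posterior mean per respondent.
post_mean_eta <- c(0.12, 0.08, 0.10, 0.05)
b_Price_value_random <- post_mean_b_Price + post_mean_eta * Income

# Collect the results per respondent for inspection or plotting.
result <- data.frame(
  respondent = seq_along(Income),
  fixed_case = b_Price_value_fixed,
  random_case = b_Price_value_random
)
print(result)
```

In both cases the covariate still has to be multiplied by the respondent's own value of Income. The only difference is whether the coefficient is shared across the sample or respondent-specific.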

The problem with my data is that the estimated part-worth utilities (here: b_Price) are so strongly influenced by the covariates that they can hardly be interpreted meaningfully without taking the covariates into account.

2) Furthermore, how are the covariates handled in apollo_prediction() when their coefficients are fixed? Do they only enter the conditionals indirectly via the "upper model" estimates?

I look forward to your answers.
stephanehess
Site Admin
Posts: 1042
Joined: 24 Apr 2020, 16:29

Re: Integration of covariates into HB estimation

Post by stephanehess »

Hi

Apollo relies on RSGHB for Bayesian estimation, and the posteriors are for each individual coefficient rather than an addition or other transformation involving multiple coefficients. With Normals, you could add up the posterior means, but would you really want to use a Normal for price anyway? With fixed coefficients, you could add the values.

You should also consider whether the conditionals are actually what you want to use in the post-estimation work and whether you should instead work with the upper level model.

In relation to what Apollo uses in prediction with models estimated with HB, these are the posterior means, as discussed in the manual, and with the caveat that users should be careful about using posteriors for this purpose.

Best wishes

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
cybey

Re: Integration of covariates into HB estimation

Post by cybey »

Maybe I should describe in more detail what I want to do. I would like to estimate a MIXL model in WTP space with covariates. Here are the distributional assumptions in hbDist:

Code:

apollo_HB = list(
  hbDist         = c(wtp_Anbieter2 = "N",
                     wtp_Anbieter3 = "N",
                     wtp_Strommix2 = "N",
                     wtp_Strommix3 = "N",
                     wtp_Strommix4 = "N",
                     wtp_Regioanteil2 = "N",
                     wtp_Regioanteil3 = "N",
                     b_Preis = "CN-",
                     
                     
                     # ----------------------------------------------------------------- #
                     #---- Anbieter2
                     # ----------------------------------------------------------------- #
                     
                     ## Sociodemographics
                     
                     ## Current
                     wtp_CurrentSupplierKVU_Anbieter2 = "F",
                     
                     
                     # ----------------------------------------------------------------- #
                     #---- Anbieter3
                     # ----------------------------------------------------------------- #
                     
                     ## Sociodemographics

                     ## Current
                     wtp_CurrentSupplierBEG_Anbieter3 = "F",
                     
                     
                     # ----------------------------------------------------------------- #
                     #---- Strommix2
                     # ----------------------------------------------------------------- #
                     
                     ## Sociodemographics
                     
                     ## Current
                     wtp_CurrentMix_Strommix2 = "F",
                     
                     
                     # ----------------------------------------------------------------- #
                     #---- Strommix3
                     # ----------------------------------------------------------------- #

                     ## Sociodemographics
                     
                     ## Current
                     wtp_CurrentMix_Strommix3 = "F",
                     
                     
                     # ----------------------------------------------------------------- #
                     #---- Strommix4
                     # ----------------------------------------------------------------- #

                     ## Sociodemographics
                     
                     ## Current
                     wtp_CurrentMix_Strommix4 = "F",
                     
                     
                     # ----------------------------------------------------------------- #
                     #---- Regioanteil2
                     # ----------------------------------------------------------------- #
                     
                     ## Sociodemographics
                     wtp_Gender_Regioanteil2 = "F",
                     wtp_Age_Regioanteil2 = "F",
                     wtp_Education_Regioanteil2 = "F",
                     wtp_Residence_Regioanteil2 = "F",
                     wtp_FederalState.Wind_Regioanteil2 = "F",
                     wtp_FederalState.PV_Regioanteil2 = "F",

                     ## Current
                     wtp_CurrentMix_Regioanteil2 = "F",
                     
                     
                     # ----------------------------------------------------------------- #
                     #---- Regioanteil3
                     # ----------------------------------------------------------------- #
                     
                     ## Sociodemographics
                     wtp_Gender_Regioanteil3 = "F",
                     wtp_Age_Regioanteil3 = "F",
                     wtp_Education_Regioanteil3 = "F",
                     wtp_Residence_Regioanteil3 = "F",
                     wtp_FederalState.Wind_Regioanteil3 = "F",
                     wtp_FederalState.PV_Regioanteil3 = "F",
                     
                     ## Current
                     wtp_CurrentMix_Regioanteil3 = "F",
                     
                     
                     # ----------------------------------------------------------------- #
                     #---- Preis
                     # ----------------------------------------------------------------- #
                     
                     ## Sociodemographics
                     b_Gender_Preis = "F",
                     b_Age_Preis = "F",
                     b_Income_Preis = "F",
                     b_Residence_Preis = "F",
                     
                     ## Current
                     b_PriceMonthly_centered_Preis = "F"
  )  # end of hbDist
)    # end of apollo_HB (other HB settings omitted here)

Because of the covariates, I implement transformations of the parameters in apollo_probabilities(). For example:

Code:

  # ----------------------------------------------------------------- #
  #---- Regioanteil2
  # ----------------------------------------------------------------- #
  
  wtp_Regioanteil2_value = wtp_Regioanteil2 +

    ## Sociodemographics
    wtp_Gender_Regioanteil2 * Gender +
    wtp_Age_Regioanteil2 * Age +
    wtp_Education_Regioanteil2 * Education +
    wtp_Residence_Regioanteil2 * Residence +
    wtp_FederalState.Wind_Regioanteil2 * FederalState.Wind +
    wtp_FederalState.PV_Regioanteil2 * FederalState.PV +

    ## Current
    wtp_CurrentMix_Regioanteil2 * CurrentMix


  # ----------------------------------------------------------------- #
  #---- Preis
  # ----------------------------------------------------------------- #
  
  b_Preis_value = b_Preis +
    
    ## Sociodemographics
    b_Gender_Preis * Gender +
    b_Age_Preis * Age +
    b_Income_Preis * Income +
    b_Residence_Preis * Residence +
    
    ## Current
    b_PriceMonthly_centered_Preis * PriceMonthly_centered

The utilities of the alternatives are then:

Code:

  V = list()
  V[['alt1']] = b_Preis_value * ( wtp_Anbieter2_value * Anbieter2.1 + wtp_Anbieter3_value * Anbieter3.1 +
                                    wtp_Strommix2_value * Strommix2.1 + wtp_Strommix3_value * Strommix3.1 + wtp_Strommix4_value * Strommix4.1 +
                                    wtp_Regioanteil2_value * Regioanteil2.1 + wtp_Regioanteil3_value * Regioanteil3.1 +
                                    Preis.1)
  
  V[['alt2']] = b_Preis_value * ( wtp_Anbieter2_value * Anbieter2.2 + wtp_Anbieter3_value * Anbieter3.2 +
                                    wtp_Strommix2_value * Strommix2.2 + wtp_Strommix3_value * Strommix3.2 + wtp_Strommix4_value * Strommix4.2 +
                                    wtp_Regioanteil2_value * Regioanteil2.2 + wtp_Regioanteil3_value * Regioanteil3.2 +
                                    Preis.2)
  
  V[['alt3']] = b_Preis_value * ( wtp_Anbieter2_value * Anbieter2.3 + wtp_Anbieter3_value * Anbieter3.3 + 
                                    wtp_Strommix2_value * Strommix2.3 + wtp_Strommix3_value * Strommix3.3 + wtp_Strommix4_value * Strommix4.3 +
                                    wtp_Regioanteil2_value * Regioanteil2.3 + wtp_Regioanteil3_value * Regioanteil3.3 +
                                    Preis.3)

I would like to check whether, on average, the covariates have a significant influence on the WTP and/or the price coefficient. For this reason, I would like to include the covariate coefficients as fixed parameters (“F”). Using a normal distribution for them instead (“N” instead of “F” in the code above) leads to a considerable improvement in model fit, but the fit for the holdouts is poor. Could this indicate overfitting?

The problem with my data is that the estimated parameters, especially wtp_Regioanteil2 and wtp_Regioanteil3, are so strongly influenced by the covariates that in this case they can hardly be interpreted meaningfully alone. For example, without covariates wtp_Regioanteil3 = -0.8, indicating a positive willingness-to-pay, but with covariates wtp_Regioanteil3 = 0.2. This means that not only the absolute value of the parameters changes, but sometimes even the sign. In my understanding, it therefore makes no sense to interpret wtp_Regioanteil3, but only the transformation wtp_Regioanteil3_value? If I have understood your answer correctly, then apollo_prediction() does just that, since it also uses the conditionals per respondent?

So can I simply add the fixed covariate effects to the respective parameter for each respondent? And does the same apply not only to the conditionals per respondent, but also to the upper-level model estimates?
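
A small illustration, with invented numbers, of why the base parameter on its own can change sign once covariates are added: in a linear specification such as wtp_Regioanteil3_value, the base parameter is the value for a respondent whose covariates are all zero, not the sample average. The coefficient `wtp_Age` and the `Age` values below are hypothetical and were chosen only so that the numbers echo the 0.2 versus -0.8 example. Mean-centring the covariate is shown as one possible reparameterisation, not as something the original model does.

```r
# Hypothetical estimates: base parameter and one covariate coefficient.
wtp_base <- 0.2
wtp_Age  <- -0.025

# Hypothetical covariate values for four respondents.
Age <- c(25, 40, 60, 35)

# Transformed parameter per respondent, as in apollo_probabilities().
wtp_value <- wtp_base + wtp_Age * Age

# The sample average of the transformed parameter differs from wtp_base.
avg_wtp_value <- mean(wtp_value)

# Equivalent specification with a mean-centred covariate: the base
# parameter now equals the value at the average Age.
Age_c       <- Age - mean(Age)
wtp_base_c  <- wtp_base + wtp_Age * mean(Age)
wtp_value_c <- wtp_base_c + wtp_Age * Age_c

print(c(base = wtp_base, centred_base = wtp_base_c, average = avg_wtp_value))
print(all.equal(wtp_value, wtp_value_c))
```

Both specifications give the same wtp_value for every respondent. Only the meaning of the base parameter changes, which is why the transformed value, rather than the base parameter alone, is the quantity to interpret.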
stephanehess

Re: Integration of covariates into HB estimation

Post by stephanehess »

Hi

apollo_prediction uses only the posterior means, so I would be very careful in using it in the situation where some of your marginal utility parameters are actually given by sums of multiple model parameters. Even if that wasn't the case, you should be very careful with using posterior means as these have error measures around them too.
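
To illustrate why point values such as posterior means need care in prediction, here is a minimal sketch with a binary logit and simulated numbers. The draws in `beta_draws` are hypothetical stand-ins for the posterior draws of one respondent's price coefficient, and `price_diff` is an invented price difference. Nothing here reproduces what apollo_prediction does internally.

```r
set.seed(42)

# Hypothetical posterior draws of a price coefficient for one respondent.
beta_draws <- rnorm(2000, mean = -1, sd = 0.8)

# Invented scenario: alternative 1 costs 2 units more than alternative 2,
# and price is the only attribute that differs.
price_diff <- 2

# Probability of choosing alternative 1, evaluated at the posterior mean.
p_at_mean <- plogis(mean(beta_draws) * price_diff)

# Probability of choosing alternative 1, averaged over the draws.
p_over_draws <- mean(plogis(beta_draws * price_diff))

print(c(at_posterior_mean = p_at_mean, averaged_over_draws = p_over_draws))
```

Because the logit probability is a nonlinear function of the coefficient, the two numbers differ. Plugging in the posterior mean ignores the uncertainty around it.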

I would encourage you to use the upper level model instead, i.e. as in classical estimation, where you can simply add the components up, while recognising the full distribution where appropriate.
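
A rough sketch of what working with the upper level model could look like for one WTP parameter, assuming a Normal random term plus fixed covariate effects. All numbers (`mu_wtp`, `sd_wtp`, the covariate coefficients and the small `covars` data frame) are hypothetical. In practice they would be taken from the estimated upper-level mean and standard deviation and from the data.

```r
set.seed(1)

# Hypothetical upper-level estimates for the Normal base parameter.
mu_wtp <- 0.2
sd_wtp <- 0.5

# Hypothetical fixed covariate coefficients.
wtp_Gender <- -0.6
wtp_Age    <- -0.01

# Hypothetical covariates for four respondents.
covars <- data.frame(Gender = c(0, 1, 1, 0), Age = c(25, 40, 60, 33))

# Deterministic shift implied by the covariates, one value per respondent.
shift <- wtp_Gender * covars$Gender + wtp_Age * covars$Age

# Draws from the upper-level Normal: rows are draws, columns are respondents.
n_draws <- 1000
base_draws <- matrix(rnorm(n_draws * nrow(covars), mean = mu_wtp, sd = sd_wtp),
                     nrow = n_draws)

# Add each respondent's shift to that respondent's column of draws.
wtp_value <- sweep(base_draws, 2, shift, "+")

# Summaries of the full distribution, by covariate profile.
print(colMeans(wtp_value))
print(apply(wtp_value, 2, quantile, probs = c(0.05, 0.5, 0.95)))
print(colMeans(wtp_value < 0))
```

The covariates shift the location of the distribution for each profile, while the spread still comes from the random term. Reporting quantiles or the share of negative values therefore keeps the full distribution in view instead of a single point value.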

Best wishes

Stephane