Maybe I should describe in more detail what I want to do. I want to estimate a mixed logit (MIXL) model in WTP space with covariates. Here are the distributional assumptions I set in hbDist (inside apollo_HB):
Code:
apollo_HB = list(
  hbDist = c(
    wtp_Anbieter2    = "N",
    wtp_Anbieter3    = "N",
    wtp_Strommix2    = "N",
    wtp_Strommix3    = "N",
    wtp_Strommix4    = "N",
    wtp_Regioanteil2 = "N",
    wtp_Regioanteil3 = "N",
    b_Preis          = "CN-",
    # ----------------------------------------------------------------- #
    #---- Anbieter2
    # ----------------------------------------------------------------- #
    ## Sociodemographics
    ## Current
    wtp_CurrentSupplierKVU_Anbieter2 = "F",
    # ----------------------------------------------------------------- #
    #---- Anbieter3
    # ----------------------------------------------------------------- #
    ## Sociodemographics
    ## Current
    wtp_CurrentSupplierBEG_Anbieter3 = "F",
    # ----------------------------------------------------------------- #
    #---- Strommix2
    # ----------------------------------------------------------------- #
    ## Sociodemographics
    ## Current
    wtp_CurrentMix_Strommix2 = "F",
    # ----------------------------------------------------------------- #
    #---- Strommix3
    # ----------------------------------------------------------------- #
    ## Sociodemographics
    ## Current
    wtp_CurrentMix_Strommix3 = "F",
    # ----------------------------------------------------------------- #
    #---- Strommix4
    # ----------------------------------------------------------------- #
    ## Sociodemographics
    ## Current
    wtp_CurrentMix_Strommix4 = "F",
    # ----------------------------------------------------------------- #
    #---- Regioanteil2
    # ----------------------------------------------------------------- #
    ## Sociodemographics
    wtp_Gender_Regioanteil2            = "F",
    wtp_Age_Regioanteil2               = "F",
    wtp_Education_Regioanteil2         = "F",
    wtp_Residence_Regioanteil2         = "F",
    wtp_FederalState.Wind_Regioanteil2 = "F",
    wtp_FederalState.PV_Regioanteil2   = "F",
    ## Current
    wtp_CurrentMix_Regioanteil2 = "F",
    # ----------------------------------------------------------------- #
    #---- Regioanteil3
    # ----------------------------------------------------------------- #
    ## Sociodemographics
    wtp_Gender_Regioanteil3            = "F",
    wtp_Age_Regioanteil3               = "F",
    wtp_Education_Regioanteil3         = "F",
    wtp_Residence_Regioanteil3         = "F",
    wtp_FederalState.Wind_Regioanteil3 = "F",
    wtp_FederalState.PV_Regioanteil3   = "F",
    ## Current
    wtp_CurrentMix_Regioanteil3 = "F",
    # ----------------------------------------------------------------- #
    #---- Preis
    # ----------------------------------------------------------------- #
    ## Sociodemographics
    b_Gender_Preis    = "F",
    b_Age_Preis       = "F",
    b_Income_Preis    = "F",
    b_Residence_Preis = "F",
    ## Current
    b_PriceMonthly_centered_Preis = "F"
  )
)  # closes list(); only hbDist is shown in this excerpt
To include the covariates, I transform the parameters inside apollo_probabilities(). For example:
Code:
# ----------------------------------------------------------------- #
#---- Regioanteil2
# ----------------------------------------------------------------- #
wtp_Regioanteil2_value = wtp_Regioanteil2 +
  ## Sociodemographics
  wtp_Gender_Regioanteil2            * Gender +
  wtp_Age_Regioanteil2               * Age +
  wtp_Education_Regioanteil2         * Education +
  wtp_Residence_Regioanteil2         * Residence +
  wtp_FederalState.Wind_Regioanteil2 * FederalState.Wind +
  wtp_FederalState.PV_Regioanteil2   * FederalState.PV +
  ## Current
  wtp_CurrentMix_Regioanteil2        * CurrentMix
# ----------------------------------------------------------------- #
#---- Preis
# ----------------------------------------------------------------- #
b_Preis_value = b_Preis +
  ## Sociodemographics
  b_Gender_Preis    * Gender +
  b_Age_Preis       * Age +
  b_Income_Preis    * Income +
  b_Residence_Preis * Residence +
  ## Current
  b_PriceMonthly_centered_Preis * PriceMonthly_centered
The utilities of the alternatives are then:
Code:
V = list()
V[['alt1']] = b_Preis_value * ( wtp_Anbieter2_value * Anbieter2.1 + wtp_Anbieter3_value * Anbieter3.1 +
                                wtp_Strommix2_value * Strommix2.1 + wtp_Strommix3_value * Strommix3.1 + wtp_Strommix4_value * Strommix4.1 +
                                wtp_Regioanteil2_value * Regioanteil2.1 + wtp_Regioanteil3_value * Regioanteil3.1 +
                                Preis.1 )
V[['alt2']] = b_Preis_value * ( wtp_Anbieter2_value * Anbieter2.2 + wtp_Anbieter3_value * Anbieter3.2 +
                                wtp_Strommix2_value * Strommix2.2 + wtp_Strommix3_value * Strommix3.2 + wtp_Strommix4_value * Strommix4.2 +
                                wtp_Regioanteil2_value * Regioanteil2.2 + wtp_Regioanteil3_value * Regioanteil3.2 +
                                Preis.2 )
V[['alt3']] = b_Preis_value * ( wtp_Anbieter2_value * Anbieter2.3 + wtp_Anbieter3_value * Anbieter3.3 +
                                wtp_Strommix2_value * Strommix2.3 + wtp_Strommix3_value * Strommix3.3 + wtp_Strommix4_value * Strommix4.3 +
                                wtp_Regioanteil2_value * Regioanteil2.3 + wtp_Regioanteil3_value * Regioanteil3.3 +
                                Preis.3 )
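Written as an equation, the utilities above have the following form. The notation is my own shorthand and does not appear in the code: x_{jk} stands for the attribute dummies of alternative j, p_j for Preis, w_k for the transformed WTP parameters (the *_value terms) and \beta_p for b_Preis_value.

```latex
V_j = \beta_p \Big( \sum_k w_k \, x_{jk} + p_j \Big),
\qquad
\frac{\partial V_j}{\partial x_{jk}} = \beta_p \, w_k ,
\qquad
\frac{\partial V_j}{\partial p_j} = \beta_p ,
\qquad
\mathrm{WTP}_k = -\frac{\partial V_j / \partial x_{jk}}{\partial V_j / \partial p_j} = -\,w_k
```

Because Preis enters the bracket with a coefficient of +1 and \beta_p is negative, a negative w_k raises utility and so corresponds to a positive willingness to pay of -w_k in price units.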
I would like to check whether, on average, the covariates have a significant influence on the WTP and/or the price coefficient. For this reason, I would like to include the covariate coefficients as fixed parameters ("F"). Using a normal distribution ("N" instead of "F" in the code above) for the covariate coefficients improves the model fit considerably, but the fit for the holdout observations is poor. Could this indicate overfitting?
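One way to make the holdout comparison concrete is the average log-likelihood per observation, computed from the predicted probability of the alternative that was actually chosen. The sketch below is base R with invented probabilities; p_est, p_holdout and avg_ll are hypothetical names, and nothing here is output from my models.

```r
# Toy illustration only: the probabilities below are invented.
# Each value is the predicted probability of the chosen alternative
# for one observation.
p_est     <- c(0.70, 0.55, 0.80, 0.60)  # estimation sample
p_holdout <- c(0.40, 0.35, 0.50, 0.30)  # holdout sample

# Average log-likelihood per observation (closer to 0 is better).
avg_ll <- function(p) mean(log(p))

fit <- c(estimation = avg_ll(p_est), holdout = avg_ll(p_holdout))
print(fit)
```

If the gap between the two values is clearly larger for the "N" specification than for the "F" specification, that would be consistent with overfitting, although it is not proof of it.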
The problem with my data is that the estimated base parameters, especially wtp_Regioanteil2 and wtp_Regioanteil3, are so strongly influenced by the covariates that they can hardly be interpreted meaningfully on their own. For example, without covariates wtp_Regioanteil3 = -0.8, which indicates a positive willingness to pay (b_Preis is constrained to be negative and Preis enters the bracket with a coefficient of +1, so a negative WTP parameter raises utility), but with covariates wtp_Regioanteil3 = 0.2. So not only the magnitude of the parameters changes, but sometimes even the sign. As I understand it, it therefore makes no sense to interpret wtp_Regioanteil3 by itself, but only the transformed wtp_Regioanteil3_value. Is that right? If I have understood your answer correctly, apollo_prediction() does exactly that, since it also uses the conditionals per respondent?
So can I simply add the fixed covariate effects (estimated coefficient times the respondent's covariate value) to the respective parameter for each respondent? And does the same apply not only to the conditionals per respondent, but also to the upper-level model estimates?
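To show what I mean by "adding" the covariate effects, here is a minimal sketch in base R. All numbers are invented and chosen only for illustration. The objects cond, covars, fixed and mu are hypothetical stand-ins for the per-respondent conditionals, the covariate data, the fixed coefficient estimates and the upper-level mean; they are not Apollo output objects. Only two covariates are used to keep it short.

```r
# Toy illustration only: every number below is invented.
# Hypothetical conditional means of the random part, one row per respondent.
cond <- data.frame(
  ID               = 1:4,
  wtp_Regioanteil3 = c(0.35, 0.10, 0.25, 0.05)
)
# Hypothetical respondent-level covariates (0/1 dummies for simplicity).
covars <- data.frame(
  ID         = 1:4,
  Gender     = c(0, 1, 1, 0),
  CurrentMix = c(1, 1, 0, 1)
)
# Hypothetical fixed ("F") covariate coefficients.
fixed <- c(wtp_Gender_Regioanteil3 = -0.5, wtp_CurrentMix_Regioanteil3 = -1.0)

# Per respondent: conditional + coefficient * covariate value.
d <- merge(cond, covars, by = "ID")
d$wtp_Regioanteil3_value <- d$wtp_Regioanteil3 +
  fixed[["wtp_Gender_Regioanteil3"]]     * d$Gender +
  fixed[["wtp_CurrentMix_Regioanteil3"]] * d$CurrentMix

# Upper level: hypothetical mean of the normal, evaluated at the sample
# means of the covariates.
mu <- 0.2
upper_mean <- mu +
  fixed[["wtp_Gender_Regioanteil3"]]     * mean(covars$Gender) +
  fixed[["wtp_CurrentMix_Regioanteil3"]] * mean(covars$CurrentMix)

print(d[, c("ID", "wtp_Regioanteil3", "wtp_Regioanteil3_value")])
print(upper_mean)
```

In this toy case the base parameter is positive for every respondent, while the transformed value is negative for every respondent, which is the kind of sign change I see in my estimates.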