MNL Model Estimation with covariate in WTP Space - Difficulty including covariate and to check correctness

Post by **NicoS** » 19 Feb 2024, 23:14

Hi Stephane, David or other Forum users,

I am currently unsure whether my WTP-space estimation results are correct, or whether I made a mistake in the utility functions of my R script. My difficulty is that I am not familiar with including a covariate (Age) for all parameters in the utility functions. I want to estimate in WTP space (rather than using the delta method) because I do so for all my models, which keeps the method consistent and makes the results easy to compare.

The WTP estimates are not far off, but I am not sure whether they are calculated correctly or whether there is an error in my specification.

To give you the details, I provide the relevant code from my R script and the output below.
R script:

### Load the Apollo package and initialise code
library(apollo)
apollo_initialise()

### Set core controls
apollo_control = list(
  modelName = "MNL_Model3_WTPSpace",
  modelDescr = "MNL model 3 with sociodemographics Age in WTP Space",
  indivID = "ResponseId", 
  outputDirectory = "output"
)

### Load data (final_data must already exist in the workspace)
database = as.data.frame(final_data)

### Define settings for MNL model component before model estimation
choiceAnalysis_settings = list(
  alternatives = c(oatdrink=1, soydrink =2, almonddrink=3, barleydrink=4, optout=5),
  avail = list(oatdrink=1, soydrink =1, almonddrink=1, barleydrink=1, optout=1),
  choiceVar = database$Choice,
  explanators  = database[,c("Age")]
)

### Run function to analyse choice data
apollo_choiceAnalysis(choiceAnalysis_settings, apollo_control, database)

### Coefficients
apollo_beta=c(wtp_asc_oatdrink = 0,
              wtp_asc_oatdrink_age = 0,
              wtp_asc_soydrink = 0,
              wtp_asc_soydrink_age = 0,
              wtp_asc_almonddrink = 0,
              wtp_asc_almonddrink_age = 0,
              wtp_asc_barleydrink = 0,
              wtp_asc_barleydrink_age = 0,
              asc_optout = 0,
              wtp_organic = 0,
              wtp_organic_age = 0,
              wtp_regional_Country = 0, #two coefficients, because it is a dummy variable
              wtp_regional_Country_age = 0,
              wtp_regional_FedState = 0,
              wtp_regional_FedState_age = 0,
              wtp_protein = 0,
              wtp_protein_age = 0,
              b_price = 0,
              b_price_age = 0
)

### One ASC needs to be fixed to 0 for the MNL model to be identified
apollo_fixed = c("asc_optout")

apollo_inputs = apollo_validateInputs()

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### Create alternative-specific constants and coefficients using interactions with age for each attribute
  asc_oatdrink_value = wtp_asc_oatdrink + wtp_asc_oatdrink_age * Age
  asc_soydrink_value = wtp_asc_soydrink + wtp_asc_soydrink_age * Age
  asc_almonddrink_value = wtp_asc_almonddrink + wtp_asc_almonddrink_age * Age
  asc_barleydrink_value = wtp_asc_barleydrink + wtp_asc_barleydrink_age * Age
  
  ### Coefficients for attributes with interactions with age
  b_organic_value  = wtp_organic + wtp_organic_age* Age
  b_regional_Country_value = wtp_regional_Country + wtp_regional_Country_age*Age
  b_regional_FedState_value = wtp_regional_FedState + wtp_regional_FedState_age*Age
  b_protein_value = wtp_protein + wtp_protein_age*Age
  b_price_value = b_price + b_price_age*Age
  
  
  ### List of utilities:
  V = list()
  V[["oatdrink"]] = b_price_value * (asc_oatdrink_value + b_organic_value * oatdrink.organic + b_regional_Country_value * (oatdrink.regional == 1) + b_regional_FedState_value * (oatdrink.regional == 0) + b_protein_value * oatdrink.proteinlow + oatdrink.price)
  V[["soydrink"]] = b_price_value * (asc_soydrink_value + b_organic_value * soydrink.organic + b_regional_Country_value * (soydrink.regional == 1) + b_regional_FedState_value * (soydrink.regional == 0) + b_protein_value * soydrink.proteinhigh + soydrink.price)
  V[["almonddrink"]] = b_price_value * (asc_almonddrink_value + b_organic_value * almonddrink.organic + b_protein_value * almonddrink.proteinlow + almonddrink.price)
  V[["barleydrink"]] = b_price_value * (asc_barleydrink_value + b_organic_value * barleydrink.organic + b_regional_Country_value * (barleydrink.regional == 1) + b_regional_FedState_value * (barleydrink.regional == 0) + b_protein_value * barleydrink.protein + barleydrink.price)
  V[["optout"]] = asc_optout
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives = c(oatdrink=1, soydrink =2, almonddrink=3, barleydrink=4, optout=5),
    avail = list(oatdrink=1, soydrink =1, almonddrink=1, barleydrink=1, optout=1),
    choiceVar = Choice,
    utilities = V
  )
  
  ### Compute probabilities using MNL model
  P[["model"]] = apollo_mnl(mnl_settings, functionality)
  
  ### Take product across observation for same individual
  P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

model = apollo_estimate(apollo_beta,
                        apollo_fixed,
                        apollo_probabilities,
                        apollo_inputs)

###Default Output
apollo_modelOutput(model)

Output:

Model name                                  : MNL_Model3_WTPSpace
Model description                           : MNL model 3 with sociodemographics Age in WTP Space
Model run at                                : 2024-02-18 22:55:50.277622
Estimation method                           : bgw
Model diagnosis                             : Relative function convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : -2.570127
     reciprocal of condition number         : 4.49029e-07
Number of individuals                       : 275
Number of rows in database                  : 4125
Number of modelled outcomes                 : 4125

Number of cores used                        :  1 
Model without mixing

LL(start)                                   : -6638.93
LL at equal shares, LL(0)                   : -6638.93
LL at observed shares, LL(C)                : -6092.08
LL(final)                                   : -4657.3
Rho-squared vs equal shares                  :  0.2985 
Adj.Rho-squared vs equal shares              :  0.2958 
Rho-squared vs observed shares               :  0.2355 
Adj.Rho-squared vs observed shares           :  0.2332 
AIC                                         :  9350.6 
BIC                                         :  9464.45 

Estimated parameters                        : 18
Time taken (hh:mm:ss)                       :  00:01:5.64 
     pre-estimation                         :  00:00:35.76 
     estimation                             :  00:00:6.3 
          initial estimation                :  00:00:4.9 
          estimation after rescaling        :  00:00:1.41 
     post-estimation                        :  00:00:23.58 
Iterations                                  :  11  
     initial estimation                     :  10 
     estimation after rescaling             :  1 

Unconstrained optimisation.

Estimates:
                             Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
wtp_asc_oatdrink            -4.631705    0.294722    -15.7155    0.523573       -8.8463
wtp_asc_oatdrink_age         0.080941    0.009335      8.6705    0.016929        4.7812
wtp_asc_soydrink            -3.544120    0.375351     -9.4421    0.563221       -6.2926
wtp_asc_soydrink_age         0.077557    0.012028      6.4481    0.017948        4.3212
wtp_asc_almonddrink         -2.760941    0.294614     -9.3714    0.617303       -4.4726
wtp_asc_almonddrink_age      0.051136    0.009146      5.5913    0.020992        2.4360
wtp_asc_barleydrink         -2.772683    0.339844     -8.1587    0.538654       -5.1474
wtp_asc_barleydrink_age      0.063928    0.010823      5.9067    0.017879        3.5756
asc_optout                   0.000000          NA          NA          NA            NA
wtp_organic                 -0.386947    0.157769     -2.4526    0.276184       -1.4010
wtp_organic_age             -0.014698    0.005132     -2.8638    0.009181       -1.6008
wtp_regional_Country        -0.196627    0.198908     -0.9885    0.264015       -0.7448
wtp_regional_Country_age    -0.017065    0.006559     -2.6018    0.008611       -1.9817
wtp_regional_FedState        0.242446    0.208383      1.1635    0.313081        0.7744
wtp_regional_FedState_age   -0.033820    0.006884     -4.9128    0.010599       -3.1910
wtp_protein                 -0.051604    0.069761     -0.7397    0.070191       -0.7352
wtp_protein_age             -0.002489    0.002226     -1.1182    0.002262       -1.1007
b_price                     -1.311163    0.074166    -17.6787    0.127061      -10.3192
b_price_age                  0.009934    0.002033      4.8870    0.003558        2.7922

R script for the preference-space model, for comparison:

### Load the Apollo package and initialise code
library(apollo)
apollo_initialise()

### Set core controls
apollo_control = list(
  modelName = "MNL_Model3",
  modelDescr = "MNL model 3 with sociodemographics Age",
  indivID = "ResponseId", 
  outputDirectory = "output"
)

### Load data (final_data must already exist in the workspace)
database = as.data.frame(final_data)

### Define settings for MNL model component before model estimation
choiceAnalysis_settings = list(
  alternatives = c(oatdrink=1, soydrink =2, almonddrink=3, barleydrink=4, optout=5),
  avail = list(oatdrink=1, soydrink =1, almonddrink=1, barleydrink=1, optout=1),
  choiceVar = database$Choice,
  explanators  = database[,c("Age")]
)

### Run function to analyse choice data
apollo_choiceAnalysis(choiceAnalysis_settings, apollo_control, database)

### Coefficients
apollo_beta=c(asc_oatdrink = 0,
              asc_oatdrink_age = 0,
              asc_soydrink = 0,
              asc_soydrink_age = 0,
              asc_almonddrink = 0,
              asc_almonddrink_age = 0,
              asc_barleydrink = 0,
              asc_barleydrink_age = 0,
              asc_optout = 0,
              b_organic = 0,
              b_organic_age = 0,
              b_regional_Country = 0, #two coefficients, because it is a dummy variable
              b_regional_Country_age = 0,
              b_regional_FedState = 0,
              b_regional_FedState_age = 0,
              b_protein = 0,
              b_protein_age = 0,
              b_price = 0,
              b_price_age = 0
              )

### One ASC needs to be fixed to 0 for the MNL model to be identified
apollo_fixed = c("asc_optout")

apollo_inputs = apollo_validateInputs()

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### Create alternative-specific constants and coefficients using interactions with age for each attribute
  asc_oatdrink_value = asc_oatdrink + asc_oatdrink_age * Age
  asc_soydrink_value = asc_soydrink + asc_soydrink_age * Age
  asc_almonddrink_value = asc_almonddrink + asc_almonddrink_age * Age
  asc_barleydrink_value = asc_barleydrink + asc_barleydrink_age * Age
  
  ### Coefficients for attributes with interactions with age
  b_organic_value  = b_organic + b_organic_age* Age
  b_regional_Country_value = b_regional_Country + b_regional_Country_age*Age
  b_regional_FedState_value = b_regional_FedState + b_regional_FedState_age*Age
  b_protein_value = b_protein + b_protein_age*Age
  b_price_value = b_price + b_price_age*Age
  
  
  ### List of utilities:
  V = list()
  V[["oatdrink"]] = asc_oatdrink_value + b_organic_value * oatdrink.organic + b_regional_Country_value * (oatdrink.regional == 1) + b_regional_FedState_value * (oatdrink.regional == 0) + b_protein_value * oatdrink.proteinlow + b_price_value * oatdrink.price
  V[["soydrink"]] = asc_soydrink_value + b_organic_value * soydrink.organic + b_regional_Country_value * (soydrink.regional == 1) + b_regional_FedState_value * (soydrink.regional == 0) + b_protein_value * soydrink.proteinhigh + b_price_value * soydrink.price
  V[["almonddrink"]] = asc_almonddrink_value + b_organic_value * almonddrink.organic + b_protein_value * almonddrink.proteinlow + b_price_value * almonddrink.price
  V[["barleydrink"]] = asc_barleydrink_value + b_organic_value * barleydrink.organic + b_regional_Country_value * (barleydrink.regional == 1) + b_regional_FedState_value * (barleydrink.regional == 0) + b_protein_value * barleydrink.protein + b_price_value * barleydrink.price
  V[["optout"]] = asc_optout
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives = c(oatdrink=1, soydrink =2, almonddrink=3, barleydrink=4, optout=5),
    avail = list(oatdrink=1, soydrink =1, almonddrink=1, barleydrink=1, optout=1),
    choiceVar = Choice,
    utilities = V
  )
  
  ### Compute probabilities using MNL model
  P[["model"]] = apollo_mnl(mnl_settings, functionality)
  
  ### Take product across observation for same individual
  P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

model = apollo_estimate(apollo_beta,
                        apollo_fixed,
                        apollo_probabilities,
                        apollo_inputs)

###Default Output
apollo_modelOutput(model)

Output from the preference-space model, for comparison:

Model name                                  : MNL_Model3
Model description                           : MNL model 3 with sociodemographics Age
Model run at                                : 2024-02-18 19:07:54.325985
Estimation method                           : bgw
Model diagnosis                             : Relative function convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : -2.973251
     reciprocal of condition number         : 4.06615e-07
Number of individuals                       : 275
Number of rows in database                  : 4125
Number of modelled outcomes                 : 4125

Number of cores used                        :  1 
Model without mixing

LL(start)                                   : -6638.93
LL at equal shares, LL(0)                   : -6638.93
LL at observed shares, LL(C)                : -6092.08
LL(final)                                   : -4660.81
Rho-squared vs equal shares                  :  0.298 
Adj.Rho-squared vs equal shares              :  0.2952 
Rho-squared vs observed shares               :  0.2349 
Adj.Rho-squared vs observed shares           :  0.2326 
AIC                                         :  9357.61 
BIC                                         :  9471.46 

Estimated parameters                        : 18
Time taken (hh:mm:ss)                       :  00:00:44.33 
     pre-estimation                         :  00:00:21.89 
     estimation                             :  00:00:4.15 
          initial estimation                :  00:00:3.39 
          estimation after rescaling        :  00:00:0.75 
     post-estimation                        :  00:00:18.29 
Iterations                                  :  10  
     initial estimation                     :  9 
     estimation after rescaling             :  1 

Unconstrained optimisation.

Estimates:
                           Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
asc_oatdrink               4.820786    0.289155     16.6720    0.491465        9.8090
asc_oatdrink_age          -0.084413    0.008205    -10.2883    0.014328       -5.8913
asc_soydrink               3.484251    0.345010     10.0990    0.509897        6.8332
asc_soydrink_age          -0.073216    0.009915     -7.3843    0.014099       -5.1929
asc_almonddrink            2.802598    0.279704     10.0199    0.563693        4.9719
asc_almonddrink_age       -0.050962    0.007761     -6.5663    0.017305       -2.9449
asc_barleydrink            2.745257    0.314256      8.7357    0.484154        5.6702
asc_barleydrink_age       -0.061041    0.009002     -6.7810    0.014435       -4.2286
asc_optout                 0.000000          NA          NA          NA            NA
b_organic                  0.713536    0.123677      5.7694    0.198213        3.5999
b_organic_age              0.003716    0.003686      1.0083    0.006095        0.6098
b_regional_Country         0.440345    0.173684      2.5353    0.220703        1.9952
b_regional_Country_age     0.008665    0.005294      1.6368    0.006394        1.3552
b_regional_FedState        0.121564    0.176510      0.6887    0.238249        0.5102
b_regional_FedState_age    0.021009    0.005338      3.9353    0.007085        2.9652
b_protein                  0.102096    0.062089      1.6444    0.059519        1.7153
b_protein_age            8.0493e-04    0.001827      0.4405    0.001756        0.4585
b_price                   -1.274035    0.078668    -16.1950    0.145215       -8.7734
b_price_age                0.008827    0.002205      4.0030    0.004251        2.0761

I would greatly appreciate your opinion; it would help me a lot!
Best, Nico
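
As a rough sanity check on the two outputs above, the age-specific values implied by each model can be compared directly. The sketch below uses only the reported organic and price estimates; the ages are hypothetical and chosen purely for illustration. Because the WTP-space utilities are written as b_price_value * (... + price), the wtp_ parameters correspond to the attribute coefficient divided by the price coefficient, so they carry the opposite sign to a conventional WTP when the price coefficient is negative.

```r
# Estimates copied from the two outputs above (organic and price only)
wtp_space  <- c(wtp_organic = -0.386947, wtp_organic_age = -0.014698)
pref_space <- c(b_organic = 0.713536, b_organic_age = 0.003716,
                b_price   = -1.274035, b_price_age  = 0.008827)

# Hypothetical ages at which to compare the two models
age <- c(25, 40, 55)

# WTP-space model: the ratio is estimated directly as a linear function of age
ratio_wtp_space <- wtp_space[["wtp_organic"]] + wtp_space[["wtp_organic_age"]] * age

# Preference-space model: ratio of the age-specific attribute and price coefficients
ratio_pref_space <- (pref_space[["b_organic"]] + pref_space[["b_organic_age"]] * age) /
                    (pref_space[["b_price"]]   + pref_space[["b_price_age"]]   * age)

comparison <- data.frame(age, ratio_wtp_space, ratio_pref_space)
print(round(comparison, 3))
```

Values that are close but not identical are what one would expect here, since the two final log-likelihoods differ (-4657.3 versus -4660.81), which suggests the two specifications are not exact reparameterisations of each other.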

Post by **stephanehess** » 23 Feb 2024, 11:22

Nico

If you look at the partial derivatives of your utility functions, you will see that they are now different from those in the preference-space model for the non-price attributes, given that you have placed the age interaction with cost on the parameter b_price_value, which multiplies the whole utility.

So you are now introducing scale heterogeneity as a function of age.
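
The point about partial derivatives can be illustrated numerically. The following is a minimal sketch with hypothetical parameter values (not the estimates from the thread): in the preference-space script the marginal utility of a non-price attribute is linear in Age, whereas in the WTP-space script it is the product of two linear functions of Age and therefore quadratic.

```r
# Equally spaced hypothetical ages
age <- c(20, 40, 60)

# Hypothetical parameter values, for illustration only
b_x     <- 0.7;  b_x_age     <- 0.004   # preference space
b_price <- -1.3; b_price_age <- 0.01    # multiplier in the WTP-space script
wtp_x   <- -0.4; wtp_x_age   <- -0.015  # WTP space

# Marginal utility of a non-price attribute x under each specification
mu_pref <- b_x + b_x_age * age                                        # linear in age
mu_wtp  <- (b_price + b_price_age * age) * (wtp_x + wtp_x_age * age)  # quadratic in age

# Second differences are zero for a linear function and non-zero for a quadratic one
d2_pref <- diff(mu_pref, differences = 2)
d2_wtp  <- diff(mu_wtp,  differences = 2)
print(c(pref = d2_pref, wtp = d2_wtp))
```

The non-zero second difference in the WTP-space case comes from the b_price_age * wtp_x_age * Age^2 term, which has no counterpart in the preference-space utilities.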

One alternative would be to have the age interaction with the price attribute, rather than with the price coefficient.

Stephane
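
One possible reading of the alternative suggested in the reply is to keep b_price as a constant multiplier and move the age effect inside the bracket, onto the price attribute. The sketch below assumes that reading; the parameter name g_price_age and all numeric values are hypothetical. Under that assumption, the utilities coincide with those of the preference-space specification.

```r
# Hypothetical preference-space values, for illustration only
b_x     <- 0.7;  b_x_age     <- 0.004
b_price <- -1.3; b_price_age <- 0.01

# Implied parameters of the reparameterised model
wtp_x       <- b_x / b_price
wtp_x_age   <- b_x_age / b_price
g_price_age <- b_price_age / b_price  # hypothetical name: age interaction on the price attribute

# Hypothetical data: three respondents, one attribute x and a price
age   <- c(25, 40, 55)
x     <- c(1, 0, 1)
price <- c(1.5, 2.0, 2.5)

# Preference-space utility, as in the second script
V_pref <- (b_x + b_x_age * age) * x + (b_price + b_price_age * age) * price

# Constant multiplier b_price, age interacted with the price attribute
V_alt <- b_price * ((wtp_x + wtp_x_age * age) * x + (1 + g_price_age * age) * price)

print(all.equal(V_pref, V_alt))
```

In this form the multiplier no longer varies with age, so the marginal utility of x stays linear in Age. The ratio of the x coefficient to the price coefficient is then (wtp_x + wtp_x_age * Age) / (1 + g_price_age * Age).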
