I'm estimating preferences for ecosystem services using an MNL model. The choice experiment consists of four categorical attributes with two levels each (plus a "no change" option) and a monetary individual cost attribute with five levels. I've followed the MNL_SP example for building my model. I'm using the latest version of Apollo (2.8)
Code: Select all
### Load Apollo library
library(apollo)
rm(list = ls())
### Initialise code
apollo_initialise()
### Set core controls
apollo_control = list(
modelName = "MNL_fee",
modelDescr = "Simple MNL model on NBS fees",
indivID = "ID",
outputDirectory = "output"
)
### Loading data from package
setwd(file.path("C:", "Users", "mna", "NIVA", "190070-DESCIPHER - General", "Valuation survey", "Dataset"))
database = read.csv("dataset_fee_v2.csv", header = TRUE)
head(database)
### Vector of parameters, including any that are kept fixed in estimation
apollo_beta <- c(
asc_a = 0,
asc_b = 0,
b_water_25 = 0,
b_water_50 = 0,
b_temp_2 = 0,
b_temp_4 = 0,
b_veg_knee = 0,
b_veg_hip = 0,
b_partly_nat = 0,
b_nearly_nat = 0,
b_fee = 0
)
### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta
apollo_fixed = c()
### Group and validate inputs
apollo_inputs = apollo_validateInputs()
### Probability function
apollo_probabilities <- function(apollo_beta, apollo_inputs, functionality = "estimate") {
### Attach inputs and detach after function exit
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
### Create list of probabilities P
P <- list()
V <- list()
V[["alt1"]] <- apollo_beta[["asc_a"]] +
apollo_beta[["b_water_25"]] * water_25_1 +
apollo_beta[["b_temp_2"]] * temp_2_1 +
apollo_beta[["b_veg_knee"]] * veg_knee_1 +
apollo_beta[["b_partly_nat"]] * partly_nat_1 +
apollo_beta[["b_fee"]] * fee_1
V[["alt2"]] <- apollo_beta[["asc_b"]] +
apollo_beta[["b_water_50"]] * water_50_2 +
apollo_beta[["b_temp_4"]] * temp_4_2 +
apollo_beta[["b_veg_hip"]] * veg_hip_2 +
apollo_beta[["b_nearly_nat"]] * nearly_nat_2 +
apollo_beta[["b_fee"]] * fee_2
V[["alt3"]] <- 0
### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
### Define settings for MNL model component
mnl_settings <- list(
alternatives = c(alt1 = 1,alt2 = 2, alt3 = 0),
avail = list(alt1 = 1, alt2 = 1, alt3 = 1),
choiceVar = choice,
V = V
)
### Compute probabilities using MNL model
P[["model"]] <- apollo_mnl(mnl_settings, functionality)
### Take product across observation for same individual
P <- apollo_panelProd(P, apollo_inputs, functionality)
### Prepare and return outputs of function
P <- apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
### Model estimation
model <- apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
### Model outputs
apollo_modelOutput(model, list(printPVal = TRUE))
apollo_saveOutput(model)
- The scaling of the model led to differences in the results, and was thus not used. This is unexpected. You may want to contact the developers.
- Parameters could not be scaled for the Hessian calculation.
- Hessian calculation failed using numerical methods (maxLik).
- ERROR: Hessian could not be calculated.
Here are the outputs:
Code: Select all
Preparing user-defined functions.
Testing likelihood function...
Overview of choices for MNL model component :
alt1 alt2 alt3
Times available 4626.00 4626.00 4626.00
Times chosen 1879.00 2173.00 574.00
Percentage chosen overall 40.62 46.97 12.41
Percentage chosen when available 40.62 46.97 12.41
Pre-processing likelihood function...
Analytical gradient is different to numerical one. Numerical gradients will be used.
Testing influence of parameters...........
Starting main estimation
Initial function value: -5082.18
Initial gradient value:
asc_a asc_b b_water_25 b_water_50 b_temp_2 b_temp_4 b_veg_knee b_veg_hip b_partly_nat b_nearly_nat b_fee
337.000000 631.000001 108.000000 685.666667 -9.999999 528.666667 266.000000 63.666667 61.666668 362.666668 52903.333321
initial value 5082.180447
iter 2 value 4958.495744
iter 3 value 4848.945846
iter 4 value 4550.495268
iter 5 value 4485.415058
iter 6 value 4415.217617
iter 7 value 4390.223079
iter 8 value 4379.445850
iter 9 value 4374.283925
iter 10 value 4364.579324
iter 11 value 4344.860928
iter 12 value 4188.210236
iter 13 value 4102.243056
iter 14 value 4077.049302
iter 15 value 4073.983863
iter 16 value 4071.202714
iter 17 value 4071.166928
iter 18 value 4071.164119
iter 19 value 4071.164023
iter 19 value 4071.164009
iter 19 value 4071.164009
final value 4071.164009
converged
Additional convergence test using scaled estimation. Parameters will be scaled by their current estimates and additional iterations will be performed.
#########################################################################################################################################################################################
The scaling of the model led to differences in the results, and was thus not used. This is unexpected. You may want to contact the developers.
... current process will resume in 5 seconds unless interrupted by the user
#########################################################################################################################################################################################
Estimated parameters:
Estimate
asc_a 0.729641
asc_b 0.579150
b_water_25 0.774136
b_water_50 1.677109
b_temp_2 0.253043
b_temp_4 0.833418
b_veg_knee 0.248553
b_veg_hip -1.175075
b_partly_nat 0.299125
b_nearly_nat 0.407710
b_fee -0.001444
Final LL: -4071.164
Calculating log-likelihood at equal shares (LL(0)) for applicable models...
Calculating log-likelihood at observed shares from estimation data (LL(c)) for applicable models...
Calculating LL of each model component...
Calculating other model fit measures
Parameters could not be scaled for the Hessian calculation.
Computing covariance matrix using numerical methods (numDeriv).
0%....25%....50%....75%....100% (121 NA values)
Computing covariance matrix using numerical methods (maxLik). This may take a while, no progress bar displayed.
Hessian calculation failed using numerical methods (maxLik).
ERROR: Hessian could not be calculated.
Computing score matrix...
Model run by MNA using Apollo 0.2.8 on R 4.2.0 for Windows.
www.ApolloChoiceModelling.com
Model name : MNL_fee
Model description : Simple MNL model on NBS fees
Model run at : 2023-05-30 12:57:24
Estimation method : bfgs
Model diagnosis : successful convergence
Number of individuals : 513
Number of rows in database : 4626
Number of modelled outcomes : 4626
Number of cores used : 1
Model without mixing
LL(start) : -5082.18
LL at equal shares, LL(0) : -5082.18
LL at observed shares, LL(C) : -4532.61
LL(final) : -4071.16
Rho-squared vs equal shares : 0.1989
Adj.Rho-squared vs equal shares : 0.1968
Rho-squared vs observed shares : 0.1018
Adj.Rho-squared vs observed shares : 0.0994
AIC : 8164.33
BIC : 8235.16
Estimated parameters : 11
Time taken (hh:mm:ss) : 00:00:8.22
pre-estimation : 00:00:0.33
estimation : 00:00:6.5
post-estimation : 00:00:1.38
Iterations : 22
Unconstrained optimisation.
These outputs have had the scaling used in estimation applied to them.
Estimates:
Estimate s.e. t.rat.(0) p(1-sided) Rob.s.e. Rob.t.rat.(0) p(1-sided)
asc_a 0.729641 NA NA NA NA NA NA
asc_b 0.579150 NA NA NA NA NA NA
b_water_25 0.774136 NA NA NA NA NA NA
b_water_50 1.677109 NA NA NA NA NA NA
b_temp_2 0.253043 NA NA NA NA NA NA
b_temp_4 0.833418 NA NA NA NA NA NA
b_veg_knee 0.248553 NA NA NA NA NA NA
b_veg_hip -1.175075 NA NA NA NA NA NA
b_partly_nat 0.299125 NA NA NA NA NA NA
b_nearly_nat 0.407710 NA NA NA NA NA NA
b_fee -0.001444 NA NA NA NA NA NA
When I add the following code, I get the estimates that align with my hypotheses.
Code: Select all
# estimate settings
estimate_settings <- list(hessianRoutine = 'numDeriv', scaleAfterConvergence = TRUE, scaleHessian = FALSE)
Code: Select all
Preparing user-defined functions.
Testing likelihood function...
Overview of choices for MNL model component :
alt1 alt2 alt3
Times available 4626.00 4626.00 4626.00
Times chosen 1879.00 2173.00 574.00
Percentage chosen overall 40.62 46.97 12.41
Percentage chosen when available 40.62 46.97 12.41
Pre-processing likelihood function...
Analytical gradient is different to numerical one. Numerical gradients will be used.
Testing influence of parameters...........
Starting main estimation
Initial function value: -5082.18
Initial gradient value:
asc_a asc_b b_water_25 b_water_50 b_temp_2 b_temp_4 b_veg_knee b_veg_hip b_partly_nat b_nearly_nat b_fee
337.000000 631.000001 108.000000 685.666667 -9.999999 528.666667 266.000000 63.666667 61.666668 362.666668 52903.333321
initial value 5082.180447
iter 2 value 4958.495744
iter 3 value 4848.945846
iter 4 value 4550.495268
iter 5 value 4485.415058
iter 6 value 4415.217617
iter 7 value 4390.223079
iter 8 value 4379.445850
iter 9 value 4374.283925
iter 10 value 4364.579324
iter 11 value 4344.860928
iter 12 value 4188.210236
iter 13 value 4102.243056
iter 14 value 4077.049302
iter 15 value 4073.983863
iter 16 value 4071.202714
iter 17 value 4071.166928
iter 18 value 4071.164119
iter 19 value 4071.164023
iter 19 value 4071.164009
iter 19 value 4071.164009
final value 4071.164009
converged
Additional convergence test using scaled estimation. Parameters will be scaled by their current estimates and additional iterations will be performed.
#########################################################################################################################################################################################
The scaling of the model led to differences in the results, and was thus not used. This is unexpected. You may want to contact the developers.
... current process will resume in 5 seconds unless interrupted by the user
#########################################################################################################################################################################################
Estimated parameters:
Estimate
asc_a 0.729641
asc_b 0.579150
b_water_25 0.774136
b_water_50 1.677109
b_temp_2 0.253043
b_temp_4 0.833418
b_veg_knee 0.248553
b_veg_hip -1.175075
b_partly_nat 0.299125
b_nearly_nat 0.407710
b_fee -0.001444
Final LL: -4071.164
Calculating log-likelihood at equal shares (LL(0)) for applicable models...
Calculating log-likelihood at observed shares from estimation data (LL(c)) for applicable models...
Calculating LL of each model component...
Calculating other model fit measures
Computing covariance matrix using numerical methods (numDeriv).
0%....25%....50%....75%....100%
Negative definite Hessian with maximum eigenvalue: -42.993009
Computing score matrix...
Model run by MNA using Apollo 0.2.8 on R 4.2.0 for Windows.
www.ApolloChoiceModelling.com
Model name : MNL_fee
Model description : Simple MNL model on NBS fees
Model run at : 2023-05-30 13:03:36
Estimation method : bfgs
Model diagnosis : successful convergence
Number of individuals : 513
Number of rows in database : 4626
Number of modelled outcomes : 4626
Number of cores used : 1
Model without mixing
LL(start) : -5082.18
LL at equal shares, LL(0) : -5082.18
LL at observed shares, LL(C) : -4532.61
LL(final) : -4071.16
Rho-squared vs equal shares : 0.1989
Adj.Rho-squared vs equal shares : 0.1968
Rho-squared vs observed shares : 0.1018
Adj.Rho-squared vs observed shares : 0.0994
AIC : 8164.33
BIC : 8235.16
Estimated parameters : 11
Time taken (hh:mm:ss) : 00:00:8.13
pre-estimation : 00:00:0.32
estimation : 00:00:6.42
post-estimation : 00:00:1.38
Iterations : 22
Min abs eigenvalue of Hessian : 42.99301
Unconstrained optimisation.
These outputs have had the scaling used in estimation applied to them.
Estimates:
Estimate s.e. t.rat.(0) p(1-sided) Rob.s.e. Rob.t.rat.(0) p(1-sided)
asc_a 0.729641 0.07059 10.336 0.000000 0.13926 5.239 8.052e-08
asc_b 0.579150 0.07508 7.714 6.106e-15 0.14207 4.077 2.285e-05
b_water_25 0.774136 0.09849 7.860 1.887e-15 0.05649 13.704 0.000000
b_water_50 1.677109 0.09611 17.450 0.000000 0.08484 19.768 0.000000
b_temp_2 0.253043 0.09457 2.676 0.003730 0.07810 3.240 5.9758e-04
b_temp_4 0.833418 0.08270 10.078 0.000000 0.07813 10.667 0.000000
b_veg_knee 0.248553 0.07375 3.370 3.7566e-04 0.06755 3.680 1.1678e-04
b_veg_hip -1.175075 0.09342 -12.578 0.000000 0.09483 -12.392 0.000000
b_partly_nat 0.299125 0.08829 3.388 3.5205e-04 0.06697 4.467 3.973e-06
b_nearly_nat 0.407710 0.09349 4.361 6.473e-06 0.08558 4.764 9.493e-07
b_fee -0.001444 4.0294e-04 -3.585 1.6885e-04 5.4880e-04 -2.632 0.004246
Do you know what causes the issue and how I may solve it (not through setting scaleHessian = FALSE)? I've tried building up the model gradually with initially just including a single parameter, but the issue remains the same.
Thanks a lot for your help!
Greetings to Leeds - I've got fond memories from my PhD at Earth and Environment
Best wishes,
Max