hybrid choice model with large positive opt-out ASC

jackhan1208
Posts: 4
Joined: 28 Apr 2025, 21:12

Post by jackhan1208 »

Hi Professor Hess,
Thank you so much in advance :D
Q1: The large positive ASC only appears when I add demographics, even when only one demographic variable is added, so it should not be multicollinearity, right? Would it then be model misspecification or a scaling issue?
Q2: Can you recommend any articles that provide good procedures to follow for continuous LV indicators?
Q3: Does this result indicate a non-linear relationship between utility and the LV? I suspect this because, even with only one demographic variable loaded, the results still differ from those of the model with a purely random LV.


This time I encountered another issue, which I do not know how to start diagnosing.
I'm running a hybrid choice model with one latent variable, which is associated with several demographic variables: education, age, income and gender.
The setting is 2000 Sobol draws. I also tried MLHS draws, which gave different results for the demographics but still a large positive opt-out ASC.
Here is the code:

Code:

# ---------------------------------------------------- #
#### DEFINE MODEL PARAMETERS ####
# ---------------------------------------------------- #
apollo_beta = c(
  b_asc_opt = -3.246779012,
  mu_price = -0.766091022,
  b_SD_price = 0.1,
  mu_nho = -0.89856159,
  b_SD_nho = 0.1,
  mu_gee = -0.552042064,
  b_SD_gee = 0.1,
  gamma_education =0.2, # base =less than high school
  gamma_age =0.1, # base =age group 1, continuous
  gamma_income =0,# base = income group 1 , continuous
  gamma_female = 0.1,# base =female, dummy
  lambda_tiss = 1,
  sigma_trust = 1,
  zeta_trust = 1
)

apollo_fixed = c()  # No fixed parameters

# ---------------------------------------------------- #
#### DEFINE RANDOM DRAWS ####
# ---------------------------------------------------- #
apollo_draws = list(
  interDrawsType = "sobol", 
  interNDraws = 2000,          
  interUnifDraws = c(),      
  interNormDraws = c("xi_price", "xi_nho", "xi_gee", "eta")
  
)


# ---------------------------------------------------- #
#### DEFINE RANDOM COEFFICIENTS ####
# ---------------------------------------------------- #
apollo_randCoeff = function(apollo_beta, apollo_inputs) {
  randcoeff = list()
  randcoeff[["b_price"]] = mu_price + exp(b_SD_price) * xi_price
  randcoeff[["b_nho"]]   = mu_nho   + exp(b_SD_nho)   * xi_nho
  randcoeff[["b_gee"]]   = mu_gee   + exp(b_SD_gee)   * xi_gee
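  # Only education is shown here; the full model reported first also includes the
  # gamma_age, gamma_income and gamma_female terms (their data columns are not shown in the post)
  randcoeff[["LV_tiss"]] = gamma_education*education + eta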
  return(randcoeff)
}

# ---------------------------------------------------- #
#### VALIDATE INPUTS ####
# ---------------------------------------------------- #
apollo_inputs = apollo_validateInputs()

# ---------------------------------------------------- #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ---------------------------------------------------- #
apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  P = list()
  
  ### INDICATOR MODEL
  normalDensity_settings1 = list(
    outcomeNormal = trustinscience,
    xNormal       = zeta_trust * LV_tiss,
    mu            = 0,
    sigma         = sigma_trust,
    rows          = (choice.set == 1)
  )
  P[["indic_trust"]] = apollo_normalDensity(normalDensity_settings1, functionality)
  P[["indic_trust"]] = apollo_panelProd(P[["indic_trust"]], apollo_inputs, functionality)
  
  ### CHOICE MODEL
  V = list()
  V[['alt1']] = b_price * price_1  + b_nho * nho_1 + b_gee * gee_1 + lambda_tiss * LV_tiss
  V[['alt2']] = b_price * price_2  + b_nho * nho_2 + b_gee * gee_2 + lambda_tiss * LV_tiss
  V[['alt3']] = b_asc_opt 
  
  mnl_settings = list(
    alternatives = c(alt1 = 1, alt2 = 2, alt3 = 3),
    avail = list(alt1 = 1, alt2 = 1, alt3 = 1),
    choiceVar = choice,
    V = V
  )
  ### Compute probabilities for MNL model component
  P[["choice"]] = apollo_mnl(mnl_settings, functionality)
  P[["choice"]] = apollo_panelProd(P[["choice"]]  , apollo_inputs, functionality)
  
  ### Likelihood of the whole model
  P = apollo_combineModels(P, apollo_inputs, functionality)
  
  ### Average across inter-individual draws
  P = apollo_avgInterDraws(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ---------------------------------------------------- #
#### MODEL ESTIMATION ####
# ---------------------------------------------------- #
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
This is the result of the full model, with all four demographic variables:

Code:

Model name                                  : HCM_Treatment1_MNL_2000draws
Model description                           : HCM model estimated in preference space Treatment 1 2000 draws using mlhs distribution
Model run at                                : 2025-06-06 03:58:28.178614
Estimation method                           : bgw
Model diagnosis                             : Relative function convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : -0.178741
     reciprocal of condition number         : 7.72943e-06
Number of individuals                       : 208
Number of rows in database                  : 2496
Number of modelled outcomes                 : 2704
                     indic_trust : 208
                          choice : 2496

Number of cores used                        :  20 
Number of inter-individual draws            : 2000 (sobol)

LL(start)                                   : -2355.1
LL (whole model) at equal shares, LL(0)     : NA
LL (whole model) at observed shares, LL(C)  : NA
LL(final, whole model)                      : -1752.42
Rho-squared vs equal shares                  :  Not applicable 
Adj.Rho-squared vs equal shares              :  Not applicable 
Rho-squared vs observed shares               :  Not applicable 
Adj.Rho-squared vs observed shares           :  Not applicable 
AIC                                         :  3532.83 
BIC                                         :  3579.56 

LL(0,indic_trust)                : Not applicable
LL(final,indic_trust)            : -263.32
LL(0,choice)                     : -2742.14
LL(final,choice)                 : -1490.18

Estimated parameters                        : 14
Time taken (hh:mm:ss)                       :  00:02:48.45 
     pre-estimation                         :  00:00:45.39 
     estimation                             :  00:00:36.95 
     post-estimation                        :  00:01:26.11 
Iterations                                  :  32  

Unconstrained optimisation.

Estimates:
                   Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
b_asc_opt          11.28392     2.32500       4.853     3.40070         3.318
mu_price           -2.64323     0.18138     -14.573     0.27898        -9.475
b_SD_price          0.75458     0.10400       7.255     0.16746         4.506
mu_nho             -2.15757     0.30177      -7.150     0.37525        -5.750
b_SD_nho            1.24080     0.10693      11.604     0.16064         7.724
mu_gee             -5.31314     0.54375      -9.771     0.71307        -7.451
b_SD_gee            1.66154     0.10727      15.489     0.12156        13.668
gamma_education     0.44642     0.05265       8.480     0.05703         7.827
gamma_age           0.22133     0.04297       5.151     0.04398         5.033
gamma_income        0.06990     0.02606       2.682     0.02545         2.746
gamma_female        0.83595     0.12783       6.539     0.12358         6.765
lambda_tiss         4.64404     0.51534       9.012     0.81086         5.727
sigma_trust         0.73317     0.05125      14.306     0.05037        14.555
zeta_trust          0.67336     0.04279      15.735     0.04506        14.944
Here is the result with a single demographic variable (education) loading onto the LV:

Code:

Model name                                  : HCM_Treatment1_MNL_2000draws
Model description                           : HCM model estimated in preference space Treatment 1 2000 draws using mlhs distribution
Model run at                                : 2025-06-06 04:11:22.396455
Estimation method                           : bgw
Model diagnosis                             : Relative function convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : -0.293495
     reciprocal of condition number         : 4.12937e-05
Number of individuals                       : 208
Number of rows in database                  : 2496
Number of modelled outcomes                 : 2704
                     indic_trust : 208
                          choice : 2496

Number of cores used                        :  20 
Number of inter-individual draws            : 2000 (sobol)

LL(start)                                   : -2532.21
LL (whole model) at equal shares, LL(0)     : NA
LL (whole model) at observed shares, LL(C)  : NA
LL(final, whole model)                      : -1806.76
Rho-squared vs equal shares                  :  Not applicable 
Adj.Rho-squared vs equal shares              :  Not applicable 
Rho-squared vs observed shares               :  Not applicable 
Adj.Rho-squared vs observed shares           :  Not applicable 
AIC                                         :  3635.52 
BIC                                         :  3672.23 

LL(0,indic_trust)                : Not applicable
LL(final,indic_trust)            : -325.59
LL(0,choice)                     : -2742.14
LL(final,choice)                 : -1498.47

Estimated parameters                        : 11
Time taken (hh:mm:ss)                       :  00:02:44.23 
     pre-estimation                         :  00:01:2.84 
     estimation                             :  00:00:36.86 
     post-estimation                        :  00:01:4.54 
Iterations                                  :  35  

Unconstrained optimisation.

Estimates:
                   Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
b_asc_opt            7.4112     1.72364       4.300     2.31625         3.200
mu_price            -2.5499     0.17822     -14.307     0.28585        -8.920
b_SD_price           0.7049     0.10132       6.957     0.15391         4.580
mu_nho              -2.3786     0.27586      -8.623     0.33407        -7.120
b_SD_nho             1.1707     0.10353      11.308     0.16408         7.135
mu_gee              -5.2572     0.53627      -9.803     0.85324        -6.162
b_SD_gee             1.6507     0.11061      14.923     0.12685        13.013
gamma_education      0.6433     0.04095      15.709     0.07660         8.398
lambda_tiss          6.1140     0.77682       7.871     1.28269         4.767
sigma_trust          0.8864     0.06713      13.205     0.05932        14.944
zeta_trust           1.0469     0.06612      15.833     0.12055         8.684
I also ran the same setup without any demographic variables:

Code:

apollo_beta = c(
  b_asc_opt = -3.246779012,
  mu_price = -0.766091022,
  b_SD_price = 0,
  mu_nho = -0.89856159,
  b_SD_nho = 0,
  mu_gee = -0.552042064,
  b_SD_gee = 0,
  lambda_tiss = 0,
  sigma_trust = 1,
  zeta_trust = 1
)

apollo_fixed = c()  # No fixed parameters

# ---------------------------------------------------- #
#### DEFINE RANDOM DRAWS ####
# ---------------------------------------------------- #
apollo_draws = list(
  interDrawsType = "sobol", 
  interNDraws = 2000,          
  interUnifDraws = c(),      
  interNormDraws = c("xi_price", "xi_nho", "xi_gee", "eta")
  
)


# ---------------------------------------------------- #
#### DEFINE RANDOM COEFFICIENTS ####
# ---------------------------------------------------- #
apollo_randCoeff = function(apollo_beta, apollo_inputs) {
  randcoeff = list()
  randcoeff[["b_price"]] = mu_price + exp(b_SD_price) * xi_price

  randcoeff[["b_nho"]]   = mu_nho   + exp(b_SD_nho)   * xi_nho
  randcoeff[["b_gee"]]   = mu_gee   + exp(b_SD_gee)   * xi_gee
  randcoeff[["LV_tiss"]] = eta
  return(randcoeff)
}

# ---------------------------------------------------- #
#### VALIDATE INPUTS ####
# ---------------------------------------------------- #
apollo_inputs = apollo_validateInputs()

# ---------------------------------------------------- #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ---------------------------------------------------- #
apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  P = list()
  
  ### INDICATOR MODEL
  normalDensity_settings1 = list(
    outcomeNormal = trustinscience,
    xNormal       = zeta_trust * LV_tiss,
    mu            = 0,
    sigma         = sigma_trust,
    rows          = (choice.set == 1)
  )
  P[["indic_trust"]] = apollo_normalDensity(normalDensity_settings1, functionality)
  P[["indic_trust"]] = apollo_panelProd(P[["indic_trust"]], apollo_inputs, functionality)
  
  ### CHOICE MODEL
  V = list()
  V[['alt1']] = b_price * price_1  + b_nho * nho_1 + b_gee * gee_1 + lambda_tiss * LV_tiss
  V[['alt2']] = b_price * price_2  + b_nho * nho_2 + b_gee * gee_2 + lambda_tiss * LV_tiss
  V[['alt3']] = b_asc_opt
  
  mnl_settings = list(
    alternatives = c(alt1 = 1, alt2 = 2, alt3 = 3),
    avail = list(alt1 = 1, alt2 = 1, alt3 = 1),
    choiceVar = choice,
    V = V
  )
  ### Compute probabilities for MNL model component
  P[["choice"]] = apollo_mnl(mnl_settings, functionality)
  P[["choice"]] = apollo_panelProd(P[["choice"]]  , apollo_inputs, functionality)
  
  ### Likelihood of the whole model
  P = apollo_combineModels(P, apollo_inputs, functionality)
  
  ### Average across inter-individual draws
  P = apollo_avgInterDraws(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ---------------------------------------------------- #
#### MODEL ESTIMATION ####
# ---------------------------------------------------- #
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)

# ---------------------------------------------------- #
#### MODEL OUTPUTS ####
# ---------------------------------------------------- #
apollo_modelOutput(model)
This is what I get:

Code:

Model name                                  : HCM_Treatment1_MNL_2000draws
Model description                           : HCM model estimated in preference space Treatment 1 2000 draws using mlhs distribution
Model run at                                : 2025-06-06 04:04:41.439208
Estimation method                           : bgw
Model diagnosis                             : Relative function convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : -1.152679
     reciprocal of condition number         : 0.00140236
Number of individuals                       : 208
Number of rows in database                  : 2496
Number of modelled outcomes                 : 2704
                     indic_trust : 208
                          choice : 2496

Number of cores used                        :  20 
Number of inter-individual draws            : 2000 (sobol)

LL(start)                                   : -2705.08
LL (whole model) at equal shares, LL(0)     : NA
LL (whole model) at observed shares, LL(C)  : NA
LL(final, whole model)                      : -1979.19
Rho-squared vs equal shares                  :  Not applicable 
Adj.Rho-squared vs equal shares              :  Not applicable 
Rho-squared vs observed shares               :  Not applicable 
Adj.Rho-squared vs observed shares           :  Not applicable 
AIC                                         :  3978.37 
BIC                                         :  4011.75 

LL(0,indic_trust)                : Not applicable
LL(final,indic_trust)            : -545.61
LL(0,choice)                     : -2742.14
LL(final,choice)                 : -1445.13

Estimated parameters                        : 10
Time taken (hh:mm:ss)                       :  00:02:1.09 
     pre-estimation                         :  00:00:46.23 
     estimation                             :  00:00:26.2 
     post-estimation                        :  00:00:48.65 
Iterations                                  :  29  

Unconstrained optimisation.

Estimates:
               Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
b_asc_opt       -9.0799     0.73729     -12.315      1.0007        -9.074
mu_price        -2.7380     0.19892     -13.765      0.2748        -9.963
b_SD_price       0.6838     0.09073       7.536      0.1193         5.733
mu_nho          -2.4211     0.30978      -7.816      0.3899        -6.210
b_SD_nho         1.3490     0.10098      13.360      0.1537         8.778
mu_gee          -5.8997     0.57324     -10.292      0.6909        -8.539
b_SD_gee         1.7092     0.10093      16.935      0.1176        14.531
lambda_tiss      7.1680     0.62870      11.401      0.8003         8.957
sigma_trust      3.0837     0.17797      17.327      0.1338        23.056
zeta_trust       1.2612     0.30592       4.123      0.3069         4.110
Regards
Jack
stephanehess
Site Admin
Posts: 1338
Joined: 24 Apr 2020, 16:29

Re: hybrid choice model with large positive opt-out ASC

Post by stephanehess »

Hi

When you add socio-demographics, the mean of your LV is no longer zero. Because the LV enters the utilities of the first two alternatives, the means of those utilities become more positive, and the model compensates by increasing the ASC for the opt-out.
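The mechanism can be written out as a short derivation. This is only a sketch based on the code posted above: $z_n$ (the demographics of person $n$) and $x_{jnt}$ (the attributes of alternative $j$ in task $t$) are shorthand introduced here, while $\gamma$, $\lambda$ and $\eta_n$ correspond to the `gamma_*`, `lambda_tiss` and `eta` terms in the code.

```latex
\begin{align*}
LV_n &= \gamma^\top z_n + \eta_n, \qquad \eta_n \sim N(0,1) \\
V_{jnt} &= \beta_n^\top x_{jnt} + \lambda\, LV_n, \qquad j \in \{1,2\} \\
V_{3nt} &= \mathrm{ASC}_{opt} \\
V_{jnt} - V_{3nt} &= \beta_n^\top x_{jnt} + \lambda\,\eta_n
  - \left(\mathrm{ASC}_{opt} - \lambda\,\gamma^\top z_n\right)
\end{align*}
```

Only the bracketed term is pinned down by the choice data. If $\lambda\,\gamma^\top z_n$ is positive on average, $\mathrm{ASC}_{opt}$ has to rise by roughly that average to reproduce the same utility differences. The match is approximate, because the other parameters are re-estimated as well.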

But none of this is an issue. Only differences in utility matter, so you cannot interpret the ASC on its own anyway.
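That only differences in utility matter can be checked numerically. The following is a minimal sketch in base R, not part of the original model: the helper `mnl_prob` and all utility values are hypothetical and chosen purely for illustration.

```r
# MNL choice probabilities from a named vector of utilities
mnl_prob <- function(V) {
  eV <- exp(V - max(V))  # subtracting the maximum improves numerical stability
  eV / sum(eV)
}

# Hypothetical utilities for one choice task; alt3 is the opt-out (ASC only)
V_base <- c(alt1 = -1.2, alt2 = -0.4, alt3 = -9.0)

# A non-zero LV mean adds a constant to alt1 and alt2;
# the model can offset it by raising the opt-out ASC by the same amount
shift     <- 16
V_shifted <- V_base + shift

p_base    <- mnl_prob(V_base)
p_shifted <- mnl_prob(V_shifted)

print(round(p_base, 4))
print(all.equal(p_base, p_shifted))  # probabilities are unchanged by the shift
```

The sign and size of the ASC therefore depend on where the other utilities are centred, which is why it cannot be read in isolation.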

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
jackhan1208
Posts: 4
Joined: 28 Apr 2025, 21:12

Re: hybrid choice model with large positive opt-out ASC

Post by jackhan1208 »

Thank you, Professor Stephane!
I understand the issue now.

I have another question that concerns me. Looking at the literature on hybrid choice models, the majority of studies use scale items as indicators with an ordered logit model. In my case, however, the scale, called "trust in science and scientists", contains 21 items (1-5 Likert scale). I tried to estimate this using ordered logit, but it takes ages even to reach the first iteration, so I realised it is not really feasible to do it this way.

I then thought I could use the mean of all scale items and treat it as a single continuous indicator (I have justified its structure through factor analysis and a CFA). However, I don't know whether I handled identification properly. I first used the raw data, which take values between 1 and 5, but the model was extremely hard to converge, so I tried the following two approaches.

My first approach is to centre each individual's score on the mean: the individual's mean trust minus the population mean of trust.
This way, the individual with average trust is the base. The results are interpretable, and the indicator has a valid sign (significant and positive, as I expected), but I don't know whether I should also standardise, which would change the interpretation, or whether this approach is appropriate at all.

The second approach is a bit tricky, and I don't even know if it is allowed. I shift the base to 1, i.e. I subtract 1 from each individual mean, so that an individual at the lowest scale value becomes the base (0), instead of the original zero point, which does not exist in data on a 1-5 Likert scale. I don't know whether this is a correct way to do it.

There are few studies using continuous indicators, and I have rarely seen anyone centre or normalise them.
Would you mind commenting on whether my approach is justified, if possible?

Kind regards
Jack
stephanehess
Site Admin
Posts: 1338
Joined: 24 Apr 2020, 16:29

Re: hybrid choice model with large positive opt-out ASC

Post by stephanehess »

Jack

You have a number of options:

- Use a continuous measurement model instead of an ordinal one, even though your indicators are ordinal, i.e. replace apollo_ol with apollo_normalDensity. You can then also mean-centre each indicator.
- Extract the factor score for each person, mean-centre it on zero, and use a continuous measurement model.

In both cases, you would estimate zeta (the impact of the LV on the indicator) and sigma (the standard deviation of the indicator).
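The mean-centring step could look roughly like the sketch below. It uses simulated data in base R. The item names `trust_1` to `trust_3`, the sample size and the responses are all hypothetical stand-ins for the 21 real items, while `ID` and `choice.set` mirror the panel layout used in the code earlier in the thread.

```r
set.seed(42)
n_id    <- 50   # hypothetical number of respondents
n_tasks <- 12   # choice tasks per respondent
item_names <- c("trust_1", "trust_2", "trust_3")  # hypothetical item names

# Long-format database: one row per respondent and choice task
database <- data.frame(
  ID         = rep(seq_len(n_id), each = n_tasks),
  choice.set = rep(seq_len(n_tasks), times = n_id)
)

# Simulated 1-5 Likert responses, constant within each respondent
for (v in item_names) {
  person_resp   <- sample(1:5, n_id, replace = TRUE)
  database[[v]] <- person_resp[database$ID]
}

# Compute means over one row per person, so that each person counts once
first <- database$choice.set == 1

# Option 1: mean-centre each item separately
for (v in item_names) {
  database[[paste0(v, "_c")]] <- database[[v]] - mean(database[[v]][first])
}

# Single continuous indicator: mean score across items, then centred
database$trust_mean   <- rowMeans(database[, item_names])
database$trust_mean_c <- database$trust_mean - mean(database$trust_mean[first])

print(colMeans(database[first, c("trust_1_c", "trust_mean_c")]))

# Each centred indicator would then get its own apollo_normalDensity component,
# structured like the one in the thread, for example:
#   list(outcomeNormal = trust_1_c, xNormal = zeta_1 * LV_tiss,
#        mu = 0, sigma = sigma_1, rows = (choice.set == 1))
# zeta_1 and sigma_1 are hypothetical parameter names.
```

Centring changes only the location of each indicator, not its spread, so zeta and sigma keep their interpretation. Standardising as well would rescale both.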

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk