post-estimation scenario analysis

Ask questions about post-estimation functions (e.g. prediction, conditionals, etc) or other processing of results.
Bob123
Posts: 9
Joined: 20 Jun 2023, 13:39

post-estimation scenario analysis

Post by Bob123 »

Dear Stephane & David,

I have some questions on producing hypothetical scenarios post-estimation as an extension of predictions.

I have looked through the manual, examples and forum - the only thing I could find was a similar previous question on the forum which was never answered (here: viewtopic.php?p=2150&hilit=scenario#p2150). Apologies in advance if I have missed anything simple.

Background: I have an unlabelled experiment with 2 choices: Drug A and Drug B. Each has attributes (including WAIT TIME, a numerical variable) and there are also important scenario attributes (e.g. QoL (categorical) and Life Expectancy (numerical)) which are added as interactions (because they are constant across alternatives). I have used apollo_prediction to calculate the impact on choice share (A/B) of increasing WAIT time with no issue.

I would now like to create some hypothetical (policy-relevant) scenarios and look at the impact on choice share of changing certain attributes - similar to a sub-group analysis, but instead of using the actual data, using the model to predict pre- and post-change when different attributes are varied. As an example, in Scenario A individuals might have QoL = "X" and Life Expectancy = "X", and the Drug A attributes might also be set to certain levels; I would then like to show the impact of increasing wait time, versus the same change when the drug attributes are at different levels, etc.

The purpose of doing this would be to create "policy-relevant" scenarios that would be easier for people to interpret. It might also serve as a validity check by looking at the model base prediction to see if it is as we would expect when certain attributes are fixed.

Is there a way to specify multiple attributes for predictions_base and then change multiple attributes for predictions_new? For example, when changing the database to show an increase in DrugA_WAIT of 1 year, could I also specify that QoL = 1 and Life = 2?

E.g. what I currently have:
database$DrugA_WAIT=1+database$DrugA_WAIT

I would then add something like database$QOL==1 to select only the cases where QOL =1 etc.?
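
To make this concrete, something like the following minimal sketch is what I have in mind (purely illustrative, re-using the names from my description; predictions_base would come from an earlier apollo_prediction call on the unchanged data):

Code: Select all

### Scenario: increase WAIT for Drug A by 1 year
database$DrugA_WAIT = 1 + database$DrugA_WAIT
apollo_inputs   = apollo_validateInputs()
predictions_new = apollo_prediction(model, apollo_probabilities, apollo_inputs)

### Relative change in predicted probabilities
change = (predictions_new - predictions_base)/predictions_base

### Mean change only for the cases with QOL = 1 and LIFE = 2
colMeans(subset(change, database$QOL==1 & database$LIFE==2), na.rm=TRUE)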

Thanks in advance for your advice,

Rob
Bob123
Posts: 9
Joined: 20 Jun 2023, 13:39

Re: post-estimation scenario analysis

Post by Bob123 »

As an extension of my question above, I have the following code which might better illustrate what I am trying to do and an example scenario.

Where I have added "[Q1]" to the code, I am trying to set up a scenario for a model prediction where one of my categorical variables is fixed at level 3. I then double the WAIT time in the way you suggest. Is this method valid/agreeable to you, or should I be combining the changes rather than applying one after the other?

Where I have added "[Q2]" to the code, I am trying to show the mean impact on choice share for the above scenario when LIFE = 3 and QOL = 3, using logical subsetting (with "&").

The purpose of doing this is to use the model to specify particular cases and see whether it responds as expected to changes (e.g. a change in WAIT), i.e. more as a validity check than for forecasting purposes.

Code: Select all

      # ----------------------------------------------------------------- #
      #---- PREDICTIONS [3]                                               ----
      # ----------------------------------------------------------------- #
      
      ### Use the estimated model to make predictions
      predictions_base = apollo_prediction(model, apollo_probabilities, apollo_inputs, prediction_settings=list(runs=30))
      
      ### Now imagine a scenario where WAIT for Drug A doubles, AND DrugA_Benefit (a categorical variable) = 3
[Q1] 
      database$DrugA_Benefit <- 3
      database$DrugA_WAIT = 2*database$DrugA_WAIT
      
      ### Rerun predictions with the new data
      apollo_inputs = apollo_validateInputs()
      predictions_new = apollo_prediction(model, apollo_probabilities, apollo_inputs)
      
      ### Return to original data
      
      ### Not sure how I would return DrugA_Benefit back to its original values?
      database$DrugA_WAIT = 1/2*database$DrugA_WAIT
      apollo_inputs = apollo_validateInputs()
      
      ### work with predictions at estimates
      predictions_base=predictions_base[["at_estimates"]]
      
      ### Compute change in probabilities
      change=(predictions_new-predictions_base)/predictions_base
      
      ### Not interested in chosen alternative now, so drop last column
      change=change[,-ncol(change)]
      
      ### First two columns (change in ID and task) also not needed
      change=change[,-c(1,2)]
      
      ### And person 9, who has all 4 modes available
      # change[database$ID==9,]
      
      ### Summary of changes (possible presence of NAs for unavailable alternatives)
      summary(change)
      
      ### Look at mean changes for subsets of the data, ignoring NAs
      colMeans(change,na.rm=TRUE)
      
      ### Look at mean changes for an increase in WAIT when LIFE = 1, ignoring NAs
      colMeans(subset(change,database$LIFE==1),na.rm=TRUE)
      
      ### Look at mean changes for an increase in WAIT when LIFE = 2, ignoring NAs
      colMeans(subset(change,database$LIFE==2),na.rm=TRUE)
      
      ### Look at mean changes for an increase in WAIT when LIFE = 3 AND QOL = 3, ignoring NAs
[Q2]  
      colMeans(subset(change,database$LIFE==3 & database$QOL==3 ),na.rm=TRUE)
Finally, in a previous post (viewtopic.php?t=456) you suggested the following code: database$DumVar1_Alt1==2 = database$DumVar1_Alt1==1. In my [Q1] above I think I am trying to achieve something similar, but changing all to 1 rather than just 2.

Thanks in advance for any advice which is much appreciated!

Rob
stephanehess
Site Admin
Posts: 1059
Joined: 24 Apr 2020, 16:29

Re: post-estimation scenario analysis

Post by stephanehess »

Hi

first, you mention:
in a previous post (viewtopic.php?t=456) you suggested the following code database$DumVar1_Alt1==2 = database$DumVar1_Alt1==1. In my [Q1] above I think I am trying to achieve similar but changing all to 1 rather than just 2.
Have another look at that post. I don't actually suggest that code but say that it would be wrong.

What you are doing there would seem to work, but when you go back to the original data, you indeed also need to return the benefit variable to its original values. The easiest way would be to keep a backup of the original data before making any changes, e.g. database_backup=database, and then return to that later.
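
A minimal sketch of that backup-and-restore workflow, re-using the variable names from the code above (so purely illustrative), would be:

Code: Select all

### Keep a copy of the original data before any scenario changes
database_backup = database

### Apply the scenario changes
database$DrugA_Benefit = 3
database$DrugA_WAIT    = 2*database$DrugA_WAIT
apollo_inputs   = apollo_validateInputs()
predictions_new = apollo_prediction(model, apollo_probabilities, apollo_inputs)

### Return to the original data
database      = database_backup
apollo_inputs = apollo_validateInputs()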

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
Bob123
Posts: 9
Joined: 20 Jun 2023, 13:39

Re: post-estimation scenario analysis

Post by Bob123 »

Dear Stephane,

Thank you for your reply, the database backup works, however I am having some issues with working with predictions at estimates.

In the Apollo manual (Fig. 9.9), when running predictions, the mean change calculated after you change the cost of rail by 1% corresponds to the relative change in the average probability before/after. However, this is not the case in my code.

In the code below, the probability of choosing Drug A before any changes to the database is 0.53, and the average probability after I increase the variable WAIT by 1 additional year is 0.43. However, when I calculate the difference using the code given in the Apollo manual, it gives a mean of -0.22, which does not reflect this relative difference before/after ((0.4312 - 0.5384)/0.5384 = -0.199). I cannot then accurately look at mean changes at estimates and cannot work out where -0.22 is coming from.

Similarly, the probability of choosing Drug B goes from 0.4625 to 0.5688 (a relative change of 0.229), but in the output the change is given as 0.29?

Even if I change the code so it is the absolute change rather than the relative change, i.e. I use change=(predictions_new-predictions_base) instead of change=(predictions_new-predictions_base)/predictions_base, it still doesn't give the difference I expect.

Any advice would be much appreciated, as I've run it several times and cannot get round it, but I need to work with the predictions at estimates.

Thank you in advance,

Rob

Code: Select all

 # ----------------------------------------------------------------- #
> #---- PREDICTIONS [3]                                               ----
> # ----------------------------------------------------------------- #
> 
> ### Use the estimated model to make predictions
> predictions_base = apollo_prediction(model, apollo_probabilities, apollo_inputs, prediction_settings=list(runs=30))
Running predictions from model using parameter estimates...
Running predictions across draws from the asymptotic distribution for maximum likelihood estimates.
Predicting for set of draws 1/30...
Predicting for set of draws 2/30...
Predicting for set of draws 3/30...
Predicting for set of draws 4/30...
Predicting for set of draws 5/30...
Predicting for set of draws 6/30...
Predicting for set of draws 7/30...
Predicting for set of draws 8/30...
Predicting for set of draws 9/30...
Predicting for set of draws 10/30...
Predicting for set of draws 11/30...
Predicting for set of draws 12/30...
Predicting for set of draws 13/30...
Predicting for set of draws 14/30...
Predicting for set of draws 15/30...
Predicting for set of draws 16/30...
Predicting for set of draws 17/30...
Predicting for set of draws 18/30...
Predicting for set of draws 19/30...
Predicting for set of draws 20/30...
Predicting for set of draws 21/30...
Predicting for set of draws 22/30...
Predicting for set of draws 23/30...
Predicting for set of draws 24/30...
Predicting for set of draws 25/30...
Predicting for set of draws 26/30...
Predicting for set of draws 27/30...
Predicting for set of draws 28/30...
Predicting for set of draws 29/30...
Predicting for set of draws 30/30...

Aggregated prediction
      at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
DrugA   5621         5611            59.01           5527           5722
DrugB   4819         4829            59.01           4718           4913

Average prediction
      at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
DrugA 0.5384       0.5375         0.005652         0.5294         0.5481
DrugB 0.4616       0.4625         0.005652         0.4519         0.4706

The output from apollo_prediction is a list with two elements: a data.frame containing the predictions at the estimated values, and an array with predictions for
  different values of the parameters drawn from their asymptotic distribution.
> summary(predictions_base)
             Length Class      Mode   
at_estimates      5 data.frame list   
draws        939600 -none-     numeric
> 
> ###Create a database backup to return to original data once predictions are run
> database_backup=database
> ### Imagine adding 1 year to WAIT
> database$DrugA_WAIT = database$DrugA_WAIT + 1
> ### Rerun predictions with the new data
> apollo_inputs = apollo_validateInputs()
Several observations per individual detected based on the value of RID. Setting panelData in apollo_control set to TRUE.
All checks on apollo_control completed.
All checks on database completed.
> predictions_new = apollo_prediction(model, apollo_probabilities, apollo_inputs)
Running predictions from model using parameter estimates...
Prediction at model estimates
            DrugA   DrugB
Aggregate 4501.95 5938.05
Average      0.43    0.57

The output from apollo_prediction is a matrix containing the predictions at the estimated values.
> summary(predictions_new)
       ID           Observation        DrugA            DrugB            chosen      
 Min.   :      9   Min.   : 1.00   Min.   :0.1055   Min.   :0.1796   Min.   :0.1055  
 1st Qu.:   1016   1st Qu.: 3.75   1st Qu.:0.2703   1st Qu.:0.4345   1st Qu.:0.4245  
 Median :   1942   Median : 6.50   Median :0.4011   Median :0.5989   Median :0.5802  
 Mean   :  56227   Mean   : 6.50   Mean   :0.4312   Mean   :0.5688   Mean   :0.5686  
 3rd Qu.:   2688   3rd Qu.: 9.25   3rd Qu.:0.5655   3rd Qu.:0.7297   3rd Qu.:0.7518  
 Max.   :1000335   Max.   :12.00   Max.   :0.8204   Max.   :0.8945   Max.   :0.8945  
> ### Return to original database (undo all changes) using the database_backup
> database=database_backup
> ### validate inputs
> apollo_inputs = apollo_validateInputs()
Several observations per individual detected based on the value of RID. Setting panelData in apollo_control set to TRUE.
All checks on apollo_control completed.
All checks on database completed.
> 
> ### work with predictions at estimates
> predictions_base=predictions_base[["at_estimates"]]
> 
> ### Compute change in probabilities
> change=(predictions_new-predictions_base)/predictions_base
> #view without last bit:
> summary(change)
       ID     Observation     DrugA              DrugB             chosen          
 Min.   :0   Min.   :0    Min.   :-0.52091   Min.   :0.03785   Min.   :-0.5209052  
 1st Qu.:0   1st Qu.:0    1st Qu.:-0.29001   1st Qu.:0.16212   1st Qu.:-0.1974178  
 Median :0   Median :0    Median :-0.21251   Median :0.24980   Median :-0.0917861  
 Mean   :0   Mean   :0    Mean   :-0.22367   Mean   :0.30530   Mean   :-0.0006838  
 3rd Qu.:0   3rd Qu.:0    3rd Qu.:-0.15473   3rd Qu.:0.35399   3rd Qu.: 0.2091958  
 Max.   :0   Max.   :0    Max.   :-0.04592   Max.   :1.14231   Max.   : 1.1423098  
> ### Not interested in chosen alternative now, so drop last column
> change=change[,-ncol(change)]
> 
> ### First two columns (change in ID and task) also not needed
> change=change[,-c(1,2)]
> 
> ### And person 9, who has all 4 modes available
> # change[database$ID==9,]
> 
> ### Summary of changes (possible presence of NAs for unavailable alternatives)
> summary(change)
     DrugA              DrugB        
 Min.   :-0.52091   Min.   :0.03785  
 1st Qu.:-0.29001   1st Qu.:0.16212  
 Median :-0.21251   Median :0.24980  
 Mean   :-0.22367   Mean   :0.30530  
 3rd Qu.:-0.15473   3rd Qu.:0.35399  
 Max.   :-0.04592   Max.   :1.14231  
> 
> ### Look at mean changes for subsets of the data, ignoring NAs
> colMeans(change,na.rm=TRUE)
     DrugA      DrugB 
-0.2236692  0.3053020 
> 
> ### Look at mean changes if LIFE = 1, ignoring NAs
> colMeans(subset(change,database$QOL==4),na.rm=TRUE)
     DrugA      DrugB 
-0.2371106  0.2999949 
> 
stephanehess
Site Admin
Posts: 1059
Joined: 24 Apr 2020, 16:29

Re: post-estimation scenario analysis

Post by stephanehess »

Hi

the answer to your question is that you are comparing a mean of ratios with a ratio of means. The change in your code is calculated at the observation level, and averaging that across observations is not going to be the same as the change in the average probabilities.
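
As a toy numeric illustration of the difference (made-up probabilities for two observations, not taken from the model above):

Code: Select all

base = c(0.9, 0.2)                    # predicted prob. of Drug A for two observations
new  = c(0.8, 0.1)                    # same probabilities after the scenario change

mean((new - base)/base)               # mean of ratios: (-0.111 - 0.500)/2 = -0.306
(mean(new) - mean(base))/mean(base)   # ratio of means: (0.45 - 0.55)/0.55  = -0.182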

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
alvinej
Posts: 2
Joined: 18 Apr 2024, 09:20

Re: post-estimation scenario analysis

Post by alvinej »

Hi Stephane,

I have a question regarding post-estimation scenario analysis: how can I calculate WTP for different scenarios after model estimation?
Alvin Joshua
stephanehess
Site Admin
Posts: 1059
Joined: 24 Apr 2020, 16:29

Re: post-estimation scenario analysis

Post by stephanehess »

This depends entirely on your model specification and on what exactly you mean by "different scenarios". The WTP will only be different if the parameters are different for these different scenarios.
--------------------------------
Stephane Hess
www.stephanehess.me.uk
alvinej
Posts: 2
Joined: 18 Apr 2024, 09:20

Re: post-estimation scenario analysis

Post by alvinej »

Hi Stephane,

Sorry for the confusion. I have some questions regarding latent class modelling, willingness to pay and post-estimation analysis.

I have a model with 2 classes.

Questions:
1. Regarding willingness to pay: in my model, purchase price is a non-random parameter in both classes, while for class 2, operating cost (oc) follows a negative lognormal distribution, charging time (ct) follows a negative log-uniform distribution, and charging facility (cf) follows a triangular distribution. How can I calculate willingness to pay for these parameters?

2.
Summary of class allocation for model component :
Mean prob.
Class_1 0.5306
Class_2 0.4694

Does this mean that, out of my data, 53% belongs to class 1 and 47% belongs to class 2?
3.
Please see the code below.

Code: Select all

parallel::detectCores()

#### LOAD LIBRARY AND DEFINE CORE SETTINGS                       ####
# ################################################################# #

### Clear memory
rm(list = ls())

### Load Apollo library
library(apollo)
library(foreach)
library(iterators)
library(doParallel)

### Initialise code
apollo_initialise()
# LN-OC,CT,R_b,CF_b&T-EM BFGS_1500
### Set core controls
apollo_control = list(
  modelName       = "MODEL_X_2000",
  modelDescr      = "MODEL_X_2000",
  indivID         = "Person_ID",
  nCores          =  10, 
  outputDirectory = "output"
)

# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
# ################################################################# #

### Loading data from package
### if data is to be loaded from a file (e.g. called data.csv), 
### the code would be: database = read.csv("data.csv",header=TRUE)
database = read.csv("D:\\IST\\Project\\Data\\fourWheelerFaceToFace_V3(cleaning).csv", header = TRUE)
database1= read.csv("D:\\IST\\Project\\Data\\fourWheelerFaceToFace_V3(cleaning).csv", header = TRUE)
### for data dictionary, use ?apollo_swissRouteChoiceData

# ################################################################# #
#### DEFINE MODEL PARAMETERS                                     ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c(asc_1_a           = 0.329,
                asc_1_b           = -0.682,
                
                asc_2_a           = 0,
                asc_2_b           = 0,
                
                beta_pp_a       =  -0.005 ,
                beta_pp_b       =  -0.015 ,
                
                # beta_oc_a       = 0,
                # beta_oc_b       = 0,
                #
                 betaa_oc_a       = -0.054,
                # log_oc_a_mu     = 0,
                #
                 # log_oc_a_sig    =  0,
                log_oc_b_mu     =  -5.566,
                 log_oc_b_sig    =  3.756,
                #betaa_oc_b       = 0,
                
                # Jn_oc_a_mu     = 0,
                #  Jn_oc_b_mu     =  0,
                # Jn_oc_a_sig    =  0,
                #  Jn_oc_b_sig    =  0,
                
                
                # tr_oc_p_a = 0,
                #tr_oc_p_b = 0,
                # tr_oc_q_a = 0,
                # tr_oc_q_b = 0,
                
                # beta_ct_a       = 0,
                # beta_ct_b       = 0,
                betaa_ct_a       =  -0.020,
                # log_ct_a_mu   =  0,
                #
                # log_ct_a_sig    =   0,
                log_ct_b_mu    = 5.488,
                log_ct_b_sig    =-16.790,
                 # betaa_ct_b       =  0,
                
                # tr_ct_p_a = 0,
                #tr_ct_p_b = 0,
                # tr_ct_q_a = 0,
                # tr_ct_q_b = 0,
                
                 # betaa_r_a       = 0,
                # beta_r_b       =   0,
                # log_r_a_mu    = 0,
                # log_r_a_sig    =  0,
                # log_r_b_mu    = 0,
                # 
                # log_r_b_sig    = 0,
                
               #  tr_r_p_a = 0,
               # tr_r_p_b = 0,
                tr_r_q_a = 0.129,
                tr_r_q_b = 0.134,
               # betaa_r_b = 0,

                 beta_cf_a       =  0.112,
                 beta_cf_b       =  0.337,
               #  log_cf_a_mu   =  0,
               # log_cf_a_sig    =  0,
                # log_cf_b_mu    =  0,
                # 
                # log_cf_b_sig    =   0,
                
                # tr_cf_p_a = 0,
                # tr_cf_p_b = 0,
               # betaa_cf_a = 0,
                # tr_cf_q_a = 0,
                # tr_cf_q_b = 0,
                # betaa_cf_b = 0,
               
                # beta_em_a       =  0,
                #  beta_em_b       =  0,
               #  log_em_a_mu    =  -1.9987,
               # log_em_a_sig    =  -0.08649,
               #  log_em_b_mu    =  -10.6245,
               # 
               #  log_em_b_sig    =   -0.1149,

                # tr_em_p_a = 0,
                #tr_em_p_b = 0,
                  # tr_em_q_a = 0,
               # betaa_em_a       =  0,
                # tr_em_q_b = 0,
               #   
                beta_priorityLanes_a = 0.215,
                beta_priorityLanes_b =  -0.099,

                beta_freeParking_a =    0.077,
                beta_freeParking_b =  0.240,

                beta_tollExemption_a = 0,
                beta_tollExemption_b = 0,
                # 
                
                
                
                delta_a        = -0.841,
                # gamma_knowledgeLow_a = 0,
                # gamma_knowledgeMid_a = 0,
                #  gamma_KnowledgeHigh_a = 0,
                gamma_voCoded_a =  0.347,
                gamma_dailyDistanceLow_a =0 ,
                gamma_dailyDistanceMid_a = -0.669,
                gamma_dailyDistanceHigh_a = -1.397,
                gamma_longDistanceLow_a = 0,
                gamma_longDistanceMid_a =  1.013,
                gamma_longDistanceHigh_a = 1.732,
                # gamma_techEnth_a =  0,
                # gamma_enviEnth_a =  0,
              # gamma_socIma_a = 0,
                # gamma_monBen_a = 0,
                 gamma_perFee_a = -0.339,
                gamma_perRisk_a =  -0.608,
                # gamma_insUti_a = 0,
                
                delta_b         = 0,
                # gamma_knowledgeLow_b = 0,
                # gamma_knowledgeMid_b = 0,
                #  gamma_KnowledgeHigh_b = 0,
                gamma_voCoded_b = 0,
                gamma_dailyDistanceLow_b = 0,
                gamma_dailyDistanceMid_b = 0,
                gamma_dailyDistanceHigh_b = 0,
                gamma_longDistanceLow_b = 0,
                gamma_longDistanceMid_b = 0,
                gamma_longDistanceHigh_b = 0,
                # gamma_techEnth_b = 0,
                # gamma_enviEnth_b = 0,
               # gamma_socIma_b = 0,
                # gamma_monBen_b = 0,
               gamma_perFee_b = 0,
               gamma_perRisk_b = 0
                # gamma_insUti_b = 0
               
)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_beta_fixed = c() if none
apollo_fixed = c("asc_2_a","asc_2_b", "delta_b",
                 "gamma_dailyDistanceLow_a",
                 "gamma_longDistanceLow_a",
                 # "gamma_knowledgeLow_a",
                 # "gamma_knowledgeLow_b",
                 # "gamma_knowledgeMid_b",
                 #  "gamma_KnowledgeHigh_b",
                 "gamma_voCoded_b" ,
                 "gamma_dailyDistanceLow_b" ,
                 "gamma_dailyDistanceMid_b",
                 "gamma_dailyDistanceHigh_b",
                 "gamma_longDistanceLow_b",
                 "gamma_longDistanceMid_b" ,
                 "gamma_longDistanceHigh_b",
                 "beta_tollExemption_a",
                 "beta_tollExemption_b",
                 # # "gamma_techEnth_b",
                 # # "gamma_enviEnth_b",
                 # # "gamma_socIma_b",
                 # # "gamma_monBen_b",
                  "gamma_perFee_b",
                  "gamma_perRisk_b"
                 # "gamma_insUti_b"
                 )

# ################################################################# #
#### DEFINE RANDOM COMPONENTS                                    ####
# ################################################################# #

### Set parameters for generating draws
apollo_draws = list(
  interDrawsType="mlhs",           
  interNDraws= 2000,               
  interUnifDraws=c("draws_ct_a","draws_r_a", "draws_r_b"),                
  interNormDraws=c("draws_oc_a"),    
# "draws_oc_a", "draws_oc_b", "draws_ct_a", "draws_ct_b", "draws_r_a", "draws_r_b", "draws_cf_a", "draws_cf_b" , "draws_em_a", "draws_em_b"
# "draws_oc_a", "draws_ct_a", "draws_r_a", "draws_cf_a", "draws_em_a"
  
  intraDrawsType="mlhs",
  intraNDraws=0,
  intraUnifDraws=c(),
  intraNormDraws=c()
)

### Create random parameters
apollo_randCoeff = function(apollo_beta, apollo_inputs){
  randcoeff = list()
  
  # one class lognormal

  randcoeff[["beta_oc_a"]] = betaa_oc_a + 0*(draws_oc_a)
  randcoeff[["beta_oc_b"]] = -exp(log_oc_b_mu + log_oc_b_sig*draws_oc_a)
  randcoeff[["beta_ct_a"]] = betaa_ct_a + 0*(draws_ct_a)
  randcoeff[["beta_ct_b"]] = -exp(log_ct_b_mu + log_ct_b_sig*draws_ct_a)
  # randcoeff[["beta_r_a"]] = (betaa_r_a + 0*draws_r_a)
  # randcoeff[["beta_r_b"]] = exp(log_r_b_mu + log_r_b_sig*draws_r_a)
  # randcoeff[["beta_cf_a"]] = (betaa_cf_a + 0*draws_cf_a)
  # randcoeff[["beta_cf_b"]] = exp(log_cf_b_mu + log_cf_b_sig*draws_cf_a)
  # randcoeff[["beta_em_a"]] = (log_em_a_mu + 0*draws_em_a)
  # randcoeff[["beta_em_b"]] = -exp(log_em_b_mu + log_em_b_sig*draws_em_a)
  
  # lognormal distribution
  
  #  randcoeff[["beta_oc_a"]] = -exp(log_oc_a_mu + log_oc_a_sig*draws_oc_a)
  #  randcoeff[["beta_oc_b"]] = -exp(log_oc_b_mu + log_oc_b_sig*draws_oc_a)
  # randcoeff[["beta_ct_a"]] = -exp(log_ct_a_mu + log_ct_a_sig*draws_ct_a)
  # randcoeff[["beta_ct_b"]] = -exp(log_ct_b_mu + log_ct_b_sig*draws_ct_a)
  # randcoeff[["beta_r_a"]] = exp(log_r_a_mu + log_r_a_sig*draws_r_a)
  # randcoeff[["beta_r_b"]] = exp(log_r_b_mu + log_r_b_sig*draws_r_a)
  # randcoeff[["beta_cf_a"]] = exp(log_cf_a_mu + log_cf_a_sig*draws_cf_a)
  # randcoeff[["beta_cf_b"]] = exp(log_cf_b_mu + log_cf_b_sig*draws_cf_a)
  # randcoeff[["beta_em_a"]] = -exp(log_em_a_mu + log_em_a_sig*draws_em_a)
  # randcoeff[["beta_em_b"]] = -exp(log_em_b_mu + log_em_b_sig*draws_em_a)

  #triangular distribution
  
  # randcoeff[["beta_oc_a"]] = tr_oc_q_a * (draws_oc_a + draws_oc_b)/2
  # randcoeff[["beta_oc_b"]] = tr_oc_q_b *(draws_oc_a + draws_oc_b)/2
  # randcoeff[["beta_ct_a"]] = tr_ct_q_a * (draws_ct_a + draws_ct_b)/2
  # randcoeff[["beta_ct_b"]] = tr_ct_q_b *(draws_ct_a + draws_ct_b)/2
  randcoeff[["beta_r_a"]] =  tr_r_q_a * (draws_r_a + draws_r_b)/2
  randcoeff[["beta_r_b"]] =  tr_r_q_b*(draws_r_a + draws_r_b)/2
  # randcoeff[["beta_cf_a"]] = tr_cf_q_a *(draws_cf_a + draws_cf_b)/2
  # randcoeff[["beta_cf_b"]] = tr_cf_q_b * (draws_cf_a + draws_cf_b)/2
  # randcoeff[["beta_em_a"]] = tr_em_q_a * (draws_em_a + draws_em_b)/2
  # randcoeff[["beta_em_a"]] = tr_em_q_a* (draws_em_a + draws_em_b)/2
  # randcoeff[["beta_em_b"]] = tr_em_q_b * (draws_em_a + draws_em_b)/2
  
  #Johnson's distribution
  # randcoeff[['beta_oc_a']] = exp(Jn_oc_a_mu + Jn_oc_a_sig*draws_oc_a)/(1 + exp(Jn_oc_a_mu + Jn_oc_a_sig*draws_oc_a))
  # randcoeff[['beta_oc_b']] = exp(Jn_oc_b_mu + Jn_oc_b_sig*draws_oc_a)/(1 + exp(Jn_oc_b_mu + Jn_oc_b_sig*draws_oc_a))
  
  return(randcoeff)
}

# ################################################################# #
#### DEFINE LATENT CLASS COMPONENTS                              ####
# ################################################################# #

apollo_lcPars = function(apollo_beta, apollo_inputs){
  lcpars = list()
  lcpars[["asc_1"]] = list(asc_1_a, asc_1_b)
  lcpars[["asc_2"]] = list(asc_2_a, asc_2_b)
  lcpars[["beta_pp"]] = list(beta_pp_a, beta_pp_b)
  lcpars[["beta_oc"]] = list(beta_oc_a, beta_oc_b)
  lcpars[["beta_ct"]] = list(beta_ct_a, beta_ct_b)
  lcpars[["beta_r"]]  = list(beta_r_a, beta_r_b)
  lcpars[["beta_cf"]] = list(beta_cf_a, beta_cf_b)
  # lcpars[["beta_em"]] = list(beta_em_a, beta_em_b)
  lcpars[["beta_priorityLanes"]] = list(beta_priorityLanes_a, beta_priorityLanes_b)
  lcpars[["beta_freeParking"]]   = list(beta_freeParking_a, beta_freeParking_b)
  lcpars[["beta_tollExemption"]] = list(beta_tollExemption_a, beta_tollExemption_b)
  
  V=list()
  V[["class_a"]] = delta_a + gamma_voCoded_a*evoCoded + gamma_dailyDistanceLow_a*dailyDistanceLow + gamma_dailyDistanceMid_a*dailyDistanceMiddle+ gamma_dailyDistanceHigh_a*dailyDistanceHigh  + gamma_longDistanceLow_a*longDistanceLow + gamma_longDistanceMid_a*longDistanceMiddle+ gamma_longDistanceHigh_a*longDistanceHigh + gamma_perFee_a*perFeeML + gamma_perRisk_a*perRiskML
  #+ gamma_perFee_a*perFeeML + gamma_perRisk_a*perRiskML
  #+ +gamma_voCoded_a*evoCoded
  #+ gamma_knowledgeLow_a*knowledgeLow + gamma_knowledgeMid_a*knowledgeMiddle + gamma_KnowledgeHigh_a*knowledgeHigh 
  #+ gamma_monBen_a*monBenML + gamma_perFee_a*perFeeML + gamma_perRisk_a*perRiskML + gamma_insUti_a*insUtiML  
  V[["class_b"]] = delta_b +gamma_voCoded_b*evoCoded  + gamma_dailyDistanceLow_b*dailyDistanceLow + gamma_dailyDistanceMid_b*dailyDistanceMiddle+ gamma_dailyDistanceHigh_b*dailyDistanceHigh + gamma_longDistanceLow_b*longDistanceLow + gamma_longDistanceMid_b*longDistanceMiddle+ gamma_longDistanceHigh_b*longDistanceHigh +  gamma_perFee_b*perFeeML + gamma_perRisk_b*perRiskML 
  #+ gamma_perFee_b*perFeeML + gamma_perRisk_b*perRiskML 
  #+ + gamma_voCoded_b*evoCoded 
  #+ gamma_monBen_b*monBenML + gamma_perFee_b*perFeeML + gamma_perRisk_b*perRiskML + gamma_insUti_b*insUtiML
  #+ gamma_knowledgeLow_b*knowledgeLow + gamma_knowledgeMid_b*knowledgeMiddle + gamma_KnowledgeHigh_b*knowledgeHigh  + gamma_voCoded_b*evoCoded
  
  classAlloc_settings = list(
    classes      = c(class_a=1, class_b=2), 
    utilities    = V
  )
  
  lcpars[["pi_values"]] = apollo_classAlloc(classAlloc_settings)
  
  return(lcpars)
}

# ################################################################# #
#### GROUP AND VALIDATE INPUTS                                   ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### Define settings for MNL model component that are generic across classes
  mnl_settings = list(
    alternatives = c(alt1=1, alt2=2),
    avail        = list(alt1=1, alt2=1),
    choiceVar    = choiceVehicle
  )
  
  ### Loop over classes
  for(s in 1:2){
    ### Compute class-specific utilities
    V=list()
    V[["alt1"]]  = asc_1[[s]] + beta_pp[[s]]*ppEVI +  beta_oc[[s]]*ocEV+ beta_ct[[s]]*ctEV/60 + beta_r[[s]]*rEV/100 + beta_cf[[s]]*cfEV + beta_priorityLanes[[s]]*priorityLanes + beta_freeParking[[s]]*freeParking + beta_tollExemption[[s]]*tollExemption
    #+ beta_ct[[s]]*ctEV/60 + beta_r[[s]]*rEV/100 + beta_cf[[s]]*cfEV 
    #+ beta_cf[[s]]*cfEV + beta_em[[s]]*eEV
    V[["alt2"]]  = asc_2[[s]]  + beta_pp[[s]]*ppCVI +  beta_oc[[s]]*ocCV                       + beta_r[[s]]*rCV/100        
    
    mnl_settings$utilities = V
    mnl_settings$componentName = paste0("Class_",s)
    
    ### Compute within-class choice probabilities using MNL model
    P[[paste0("Class_",s)]] = apollo_mnl(mnl_settings, functionality)
    
    ### Take product across observation for same individual
    P[[paste0("Class_",s)]] = apollo_panelProd(P[[paste0("Class_",s)]], apollo_inputs ,functionality)
    
    ### Average across inter-individual draws within classes
    P[[paste0("Class_",s)]] = apollo_avgInterDraws(P[[paste0("Class_",s)]], apollo_inputs, functionality)
  }
  
  ### Compute latent class model probabilities
  lc_settings  = list(inClassProb = P, classProb=pi_values)
  P[["model"]] = apollo_lc(lc_settings, apollo_inputs, functionality)
  
  ### Average across inter-individual draws in class allocation probabilities
  P[["model"]] = apollo_avgInterDraws(P[["model"]], apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}


# ################################################################# #
#### MODEL ESTIMATION AND OUTPUT                                 ####
# ################################################################# #

### Estimate model
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs, estimate_settings = list(estimationRoutine="BFGS",maxIterations= 5000))

### Show output in screen
apollo_modelOutput(model)

### Save output to file(s)
apollo_saveOutput(model)



#scenario
predictions_base = apollo_prediction(model, apollo_probabilities, apollo_inputs,prediction_settings=list(runs=30))
database$ppCVI = 1.10*database$ppCVI
apollo_inputs = apollo_validateInputs()
predictions_new = apollo_prediction(model, apollo_probabilities, apollo_inputs)


And the results:

Code: Select all

Model name                                  : MODEL_X_2000
Model description                           : MODEL_X_2000
Model run at                                : 2024-05-20 13:42:00.992985
Estimation method                           : bfgs
Model diagnosis                             : successful convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : -0.218446
     reciprocal of condition number         : 7.27028e-07
Number of individuals                       : 1389
Number of rows in database                  : 5207
Number of modelled outcomes                 : 5207

Number of cores used                        :  10 
Number of inter-individual draws            : 2000 (mlhs)

LL(start)                                   : -3210.34
LL (whole model) at equal shares, LL(0)     : -3609.22
LL (whole model) at observed shares, LL(C)  : -3567.91
LL(final, whole model)                      : -3208.72
Rho-squared vs equal shares                  :  0.111 
Adj.Rho-squared vs equal shares              :  0.1038 
Rho-squared vs observed shares               :  0.1007 
Adj.Rho-squared vs observed shares           :  0.0939 
AIC                                         :  6469.45 
BIC                                         :  6639.95 

LL(0,Class_1)                    : -3609.22
LL(final,Class_1)                : -3888.63
LL(0,Class_2)                    : -3609.22
LL(final,Class_2)                : -3649.1

Estimated parameters                        : 26
Time taken (hh:mm:ss)                       :  00:40:32.87 
     pre-estimation                         :  00:07:39.8 
     estimation                             :  00:08:51.22 
          initial estimation                :  00:08:14.53 
          estimation after rescaling        :  00:00:36.69 
     post-estimation                        :  00:24:1.85 
Iterations                                  :  38  
     initial estimation                     :  37 
     estimation after rescaling             :  1 

Unconstrained optimisation.

Estimates:
                             Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
asc_1_a                      0.331412    0.268261      1.2354    0.269248        1.2309
asc_1_b                     -0.668859    0.368526     -1.8150    0.396546       -1.6867
asc_2_a                      0.000000          NA          NA          NA            NA
asc_2_b                      0.000000          NA          NA          NA            NA
beta_pp_a                   -0.005657    0.003442     -1.6435    0.003687       -1.5342
beta_pp_b                   -0.014932    0.007046     -2.1193    0.007945       -1.8795
betaa_oc_a                  -0.055251    0.032593     -1.6952    0.034491       -1.6019
log_oc_b_mu                 -5.737140    0.806014     -7.1179    0.708061       -8.1026
log_oc_b_sig                 3.791235    0.413215      9.1750    0.106119       35.7264
betaa_ct_a                  -0.021405    0.015280     -1.4009    0.015831       -1.3521
log_ct_b_mu                  5.564738    0.874453      6.3637    0.667799        8.3330
log_ct_b_sig               -16.981455    1.956827     -8.6781    0.627229      -27.0738
tr_r_q_a                     0.129624    0.073960      1.7526    0.072394        1.7905
tr_r_q_b                     0.271609    0.141913      1.9139    0.135484        2.0047
beta_cf_a                    0.114268    0.120334      0.9496    0.116319        0.9824
beta_cf_b                    0.329494    0.230016      1.4325    0.223245        1.4759
beta_priorityLanes_a         0.215107    0.134927      1.5942    0.132680        1.6212
beta_priorityLanes_b        -0.090734    0.260987     -0.3477    0.257714       -0.3521
beta_freeParking_a           0.080637    0.122307      0.6593    0.120110        0.6714
beta_freeParking_b           0.239490    0.236397      1.0131    0.237267        1.0094
beta_tollExemption_a         0.000000          NA          NA          NA            NA
beta_tollExemption_b         0.000000          NA          NA          NA            NA
delta_a                     -0.829807    0.454567     -1.8255    0.442077       -1.8771
gamma_voCoded_a              0.336912    0.309079      1.0901    0.347647        0.9691
gamma_dailyDistanceLow_a     0.000000          NA          NA          NA            NA
gamma_dailyDistanceMid_a    -0.669468    0.239092     -2.8000    0.250148       -2.6763
gamma_dailyDistanceHigh_a   -1.387831    0.570590     -2.4323    0.733335       -1.8925
gamma_longDistanceLow_a      0.000000          NA          NA          NA            NA
gamma_longDistanceMid_a      1.016427    0.306283      3.3186    0.293675        3.4611
gamma_longDistanceHigh_a     1.745900    0.310466      5.6235    0.303920        5.7446
gamma_perFee_a              -0.333892    0.073352     -4.5519    0.074397       -4.4880
gamma_perRisk_a             -0.607276    0.126737     -4.7916    0.138340       -4.3897
delta_b                      0.000000          NA          NA          NA            NA
gamma_voCoded_b              0.000000          NA          NA          NA            NA
gamma_dailyDistanceLow_b     0.000000          NA          NA          NA            NA
gamma_dailyDistanceMid_b     0.000000          NA          NA          NA            NA
gamma_dailyDistanceHigh_b    0.000000          NA          NA          NA            NA
gamma_longDistanceLow_b      0.000000          NA          NA          NA            NA
gamma_longDistanceMid_b      0.000000          NA          NA          NA            NA
gamma_longDistanceHigh_b     0.000000          NA          NA          NA            NA
gamma_perFee_b               0.000000          NA          NA          NA            NA
gamma_perRisk_b              0.000000          NA          NA          NA            NA


Summary of class allocation for model component :
         Mean prob.
Class_1      0.5306
Class_2      0.4694

> 
> ### Save output to file(s)
> apollo_saveOutput(model)
Model output saved to output/MODEL_X_2000_output.txt 
Estimates saved to output/MODEL_X_2000_estimates.csv 
Model object saved to output/MODEL_X_2000.rds 
> 
> 
> 
> #scenario
> predictions_base = apollo_prediction(model, apollo_probabilities, apollo_inputs,prediction_settings=list(runs=30))
INFORMATION: Your model contains continuous random parameters. apollo_prediction will perform averaging across draws for these.
  For predictions at the level of individual draws, please call the apollo_probabilities function using
  model$estimate as the parameters, and with functionality="raw". 
Running predictions from model using parameter estimates...
Running predictions across draws from the asymptotic distribution for maximum likelihood estimates.
Predicting for set of draws 1/30...
Predicting for set of draws 2/30...
Predicting for set of draws 3/30...
Predicting for set of draws 4/30...
Predicting for set of draws 5/30...
Predicting for set of draws 6/30...
Predicting for set of draws 7/30...
Predicting for set of draws 8/30...
Predicting for set of draws 9/30...
Predicting for set of draws 10/30...
Predicting for set of draws 11/30...
Predicting for set of draws 12/30...
Predicting for set of draws 13/30...
Predicting for set of draws 14/30...
Predicting for set of draws 15/30...
Predicting for set of draws 16/30...
Predicting for set of draws 17/30...
Predicting for set of draws 18/30...
Predicting for set of draws 19/30...
Predicting for set of draws 20/30...
Predicting for set of draws 21/30...
Predicting for set of draws 22/30...
Predicting for set of draws 23/30...
Predicting for set of draws 24/30...
Predicting for set of draws 25/30...
Predicting for set of draws 26/30...
Predicting for set of draws 27/30...
Predicting for set of draws 28/30...
Predicting for set of draws 29/30...
Predicting for set of draws 30/30...

Aggregated prediction for model component: Class_1
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1   3218         3192            118.9           2967           3390
alt2   1989         2015            118.9           1817           2240

Average prediction for model component: Class_1
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1  0.618       0.6129          0.02283         0.5698         0.6511
alt2  0.382       0.3871          0.02283         0.3489         0.4302

Aggregated prediction for model component: Class_2
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1   1231         1236            170.6          974.4           1563
alt2   3976         3971            170.6         3643.8           4233

Average prediction for model component: Class_2
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1 0.2364       0.2373          0.03277         0.1871         0.3002
alt2 0.7636       0.7627          0.03277         0.6998         0.8129

Aggregated prediction for model component: model
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1   2286         2299            57.12           2204           2428
alt2   2921         2908            57.12           2779           3003

Average prediction for model component: model
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1  0.439       0.4415          0.01097         0.4233         0.4663
alt2  0.561       0.5585          0.01097         0.5337         0.5767

The output from apollo_prediction is a list, with one element per model component. If the user asks for confidence
  intervals, then, for each model component, a list with two elements is returned: a data.frame containing the
  predictions at the estimated values, and an array with predictions for different values of the parameters drawn
  from their asymptotic distribution.
> database$ppCVI = 1.10*database$ppCVI
> apollo_inputs = apollo_validateInputs()
apollo_draws and apollo_randCoeff were found, so apollo_control$mixing was set to TRUE
Several observations per individual detected based on the value of Person_ID. Setting panelData in apollo_control
  set to TRUE.
All checks on apollo_control completed.
WARNING: Your database contains some entries that are NA. This may well be intentional, but be advised that if these entries
  are used in your model, the behaviour may be unexpected. 
All checks on database completed.
Generating inter-individual draws .... Done
> predictions_new = apollo_prediction(model, apollo_probabilities, apollo_inputs)
INFORMATION: Your model contains continuous random parameters. apollo_prediction will perform averaging across draws for these.
  For predictions at the level of individual draws, please call the apollo_probabilities function using
  model$estimate as the parameters, and with functionality="raw". 
Running predictions from model using parameter estimates...
Prediction at user provided parameters for model component: Class_1
             alt1    alt2
Aggregate 3264.89 1942.11
Average      0.63    0.37

Prediction at user provided parameters for model component: Class_2
             alt1    alt2
Aggregate 1290.52 3916.48
Average      0.25    0.75

Prediction at user provided parameters for model component: model
             alt1    alt2
Aggregate 2338.55 2868.45
Average      0.45    0.55

The output from apollo_prediction is a list, with one element per model component. For each model component, the
  list element is given by a matrix containing the predictions at the estimated values.

Please look at the results below:

Code: Select all

Aggregated prediction for model component: Class_1
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1   3218         3192            118.9           2967           3390
alt2   1989         2015            118.9           1817           2240

Average prediction for model component: Class_1
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1  0.618       0.6129          0.02283         0.5698         0.6511
alt2  0.382       0.3871          0.02283         0.3489         0.4302

Aggregated prediction for model component: Class_2
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1   1231         1236            170.6          974.4           1563
alt2   3976         3971            170.6         3643.8           4233

Average prediction for model component: Class_2
     at MLE Sampled mean Sampled std.dev. Quantile 0.025 Quantile 0.975
alt1 0.2364       0.2373          0.03277         0.1871         0.3002
alt2 0.7636       0.7627          0.03277         0.6998         0.8129

Even after segmentation, the total data is included in "Aggregated prediction for model component: Class_1" (3218 + 1989 = 5207), and the total data is also included in "Aggregated prediction for model component: Class_2" (1231 + 3976 = 5207). How is this possible? Shouldn't I be getting a segmented result?

Code: Select all

Prediction at user provided parameters for model component: Class_1
             alt1    alt2
Aggregate 3264.89 1942.11
Average      0.63    0.37

Prediction at user provided parameters for model component: Class_2
             alt1    alt2
Aggregate 1290.52 3916.48
Average      0.25    0.75

Prediction at user provided parameters for model component: model
             alt1    alt2
Aggregate 2338.55 2868.45
Average      0.45    0.55

The above are the results after applying a 10% increase in the purchase price of the conventional vehicle (ppCVI). How can I interpret this result? And why is the total data included in both classes?

Thank you
Alvin Joshua
stephanehess
Site Admin
Posts: 1059
Joined: 24 Apr 2020, 16:29

Re: post-estimation scenario analysis

Post by stephanehess »

Hi

1. Not sure how you want to calculate WTP if you have different cost variables. With a random cost coefficient, you can use the ratio of the relevant elements returned by apollo_unconditionals (see the sketch after this list).

2. It means that, averaged across the sample, these are the weights for the classes. Latent class does NOT put each person deterministically into one class, but does so probabilistically.

3. Each class uses every observation (see above), so these results are correct.

4. Same point as above
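
For point 1, a hedged sketch of what this could look like with the model above (assuming apollo_unconditionals returns the coefficients defined in apollo_randCoeff for this latent class specification; apollo_lcConditionals is shown as one way to inspect the individual-level class allocation mentioned in point 2):

Code: Select all

### Sketch only: draws of the random coefficients, unconditional on observed choices
unconditionals = apollo_unconditionals(model, apollo_probabilities, apollo_inputs)

### WTP for operating cost in class 2: ratio of the random oc coefficient to the
### fixed purchase-price coefficient for that class, summarised over the draws
wtp_oc_b = unconditionals[["beta_oc_b"]] / model$estimate["beta_pp_b"]
summary(as.vector(wtp_oc_b))

### Individual-level (posterior) class allocation probabilities
lc_conditionals = apollo_lcConditionals(model, apollo_probabilities, apollo_inputs)
head(lc_conditionals)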

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk