Errors in Hybrid Choice Model with Bayesian Estimator

Posted: 26 Sep 2021, 09:45
by janak12_jp
Hello,

I have tried several specifications of a hybrid choice model with a Bayesian estimator for my choice problem. My data contains several mode-specific latent variables (LVs) in addition to other observable attributes such as trip characteristics. I used the first 100k iterations as burn-in and the next 100k for estimation. However, after completing the estimation, I come across the same error each time:

Code: Select all

WARNING: RSGHB has censored the probabilities. Please note that in at least some iterations RSGHB has avoided numerical
issues by left censoring the probabilities. This has the side effect of zero or negative probabilities not leading to
failures!
Warning messages:
1: In log(test2_LL) : NaNs produced
2: In log(test1_LL) : NaNs produced
The problem, as I see it in my results, is most probably related to the estimation of the covariance matrix (though I am not 100% sure).
It shows the initial log-likelihood value as -Inf, although it was somewhere around -56000 when I started the estimation.

It produces correlation matrices but NaNs in the covariance matrices. What is the reason for that? I read somewhere that the correlation matrix is a standardized version of the covariance matrix. Can you please advise on this as well?

I am attaching the code as well as my results.

Re: Errors in Hybrid Choice Model with Bayesian Estimator

Posted: 30 Sep 2021, 09:48
by dpalma
Hi,

It's difficult to diagnose the problem without looking at your data. Could you share your database? If you don't want to share it in the forum, you can email it to D.Palma [at] leeds.ac.uk.

Cheers
David

Re: Errors in Hybrid Choice Model with Bayesian Estimator

Posted: 30 Sep 2021, 10:00
by janak12_jp
Dear Dr Palma,

I have mailed the data file to the stated email. Thank you.

Regards,
Janak

Re: Errors in Hybrid Choice Model with Bayesian Estimator

Posted: 11 Oct 2021, 15:00
by dpalma
Hi Janak,

Sorry for the slow response.

First of all, you are not getting errors, only warnings. The first warning (related to the censoring of probabilities) means that during the estimation process (i.e. for some values in your chain) the likelihood values you obtained were so small that they were indistinguishable from zero. To avoid numerical issues, Apollo replaced those values with a small (but bigger than zero) value. If the chain looks like it converged correctly, then this was probably due to the chain going through these problematic values only for a short time and then moving on to better values.
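
To illustrate the censoring described above, here is a minimal base-R sketch. The vector `p` and the floor `tiny` are made up for this example; this is not RSGHB's internal code, and the threshold it actually uses may differ.

```r
# Hypothetical per-observation likelihoods; the third has underflowed to 0
p <- c(0.25, 1e-12, 0, 0.6)

# Without censoring, a single zero makes the log-likelihood -Inf
ll_raw <- sum(log(p))

# Left-censoring: replace anything below a small positive floor by the floor
tiny <- 1e-300            # assumed floor, chosen only for this example
p_cens <- pmax(p, tiny)
ll_cens <- sum(log(p_cens))

print(ll_raw)   # -Inf
print(ll_cens)  # finite, but very negative
```

The censored value keeps the chain running, but a very negative log-likelihood is still a sign that the parameter values at that point fit the data poorly.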

The second warning (NaN in log(test2_LL)) has to do with your starting values leading to a very small likelihood, i.e. too close to zero. Considering the previous warning, my guess is that your starting values are not very good. I would recommend using your estimated values as starting values and running the estimation again. You will probably run into fewer warnings then.
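
One way to reuse earlier estimates as starting values is to overwrite the entries of apollo_beta by name. The sketch below uses only base R; the vector `previous_estimates` and all numbers in it are hypothetical and would need to be replaced by the values from your own output.

```r
# Starting values as defined before estimation (small subset, for illustration)
apollo_beta <- c(asc_car = 0, asc_bus = 0, asc_air = 0)

# Hypothetical estimates from a previous run, stored as a named vector
previous_estimates <- c(asc_bus = -1.05, asc_air = 0.40)

# Overwrite only the parameters present in both vectors, matching by name
common <- intersect(names(apollo_beta), names(previous_estimates))
apollo_beta[common] <- previous_estimates[common]

print(apollo_beta)
```

Matching by name rather than by position means parameters that were fixed or absent in the earlier run simply keep their original starting values.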

Concerning the NaNs in the "Covariances of random parameters", this was to be expected as you set the option gFULLCV = FALSE inside apollo_HB. This forces the chains of the random parameters to be uncorrelated, which is why the covariance matrix only has values on the diagonal (e.g. asc_bus_asc_bus, asc_air_asc_air) but not for the off-diagonal elements (e.g. asc_air_asc_bus).
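
The difference between a diagonal and a full covariance matrix, and how each relates to a correlation matrix, can be sketched in base R as follows. The variances and the covariance of 0.5 are illustrative assumptions, not values from the model in this thread.

```r
# Assumed variances for two random parameters (illustrative numbers only)
vars <- c(asc_bus = 0.8, asc_air = 1.3)

# With uncorrelated chains, only the diagonal of the covariance matrix is estimated
Sigma_diag <- diag(vars)
dimnames(Sigma_diag) <- list(names(vars), names(vars))

# With a full covariance matrix, the off-diagonal terms are estimated as well
Sigma_full <- Sigma_diag
Sigma_full["asc_air", "asc_bus"] <- Sigma_full["asc_bus", "asc_air"] <- 0.5

# A correlation matrix is the covariance matrix standardised by the standard deviations
print(cov2cor(Sigma_diag))  # identity matrix: no correlation between parameters
print(cov2cor(Sigma_full))
```

If correlations between the random parameters are of interest, the setting to look at is presumably gFULLCV = TRUE, at the cost of many more covariance terms to estimate.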

Finally, note that the number of draws to keep in the chain is defined in the setting "gNEREP", not "GNEREP". R is case sensitive, so using upper or lower case does matter.
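
A short base-R sketch of why the case matters. The list names `hb_settings` and `hb_settings_ok` and the value 100000 are only for illustration.

```r
# Settings list with the name typed in the wrong case
hb_settings <- list(GNEREP = 100000)

# Exact-name lookup is case sensitive, so the intended setting is not found
print(is.null(hb_settings[["gNEREP"]]))  # TRUE

# Correctly cased name
hb_settings_ok <- list(gNEREP = 100000)
print(is.null(hb_settings_ok[["gNEREP"]]))  # FALSE
```

A setting that is not found under its expected name would typically be replaced by its default value, so the run may complete without using the number of iterations you intended.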

Cheers
David

Re: Errors in Hybrid Choice Model with Bayesian Estimator

Posted: 14 Sep 2024, 09:29
by janak12_jp
Hi,

I am facing a similar issue with another set of data. I am starting with a simpler MNL model on SP data. In my data, I have two-level choice sets: when an individual chooses to travel by public transport in the upper-level SCEs, he/she gets another set of SCEs for access/egress to public transport. Given this, I have fixed two ASCs: one for the upper-level choice and another for the lower-level (access/egress) choice.

The warning I am getting is:

Code: Select all

WARNING: RSGHB has censored the probabilities. Please note that in at least some iterations RSGHB has avoided numerical issues by left censoring the probabilities. This has the side effect of zero or negative probabilities not leading to failures! 
In the results, NaNs have been produced for model fit statistics which are shown below.

Also, when I estimate the same model using maximum simulated likelihood instead of HB, I get the message "Negative definite Hessian with maximum eigenvalue: -0.579". Can you please advise on this as well?

My code is:

Code: Select all

### Clear memory
rm(list = ls())

### Load Apollo library
library(apollo)

### Initialise code
apollo_initialise()

### Set core controls
apollo_control = list(
  modelName       = "MNL_SP",
  modelDescr      = "Pilot MNL model on mode choice SP data with HB",
  indivID         = "ID",
  outputDirectory = "output",
  HB              = TRUE
)

# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
# ################################################################# #

### Load data from a CSV file in the working directory
database = read.csv("Analysis.csv", header = TRUE)

# ################################################################# #
#### DEFINE MODEL PARAMETERS                                     ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta=c(asc_car = 0,
              asc_pt = -0.28,
              asc_ecar = -0.4,
              asc_ebike = -0.38,
              asc_escoot = 4.38,
              asc_aebike = -1.7,
              asc_aescoot = -4.30,
              asc_awalk = 0,
              b_car_ivtt = -0.1,
              b_car_egt = -0.05,
              b_car_tc = -0.21,
              b_car_pc = -0.25,
              b_pt_ivtt = -0.11,
              b_pt_act = -0.03,
              b_pt_egt = -0.1,
              b_pt_wt = -0.1,
              b_pt_tc = 0,
              b_ecar_ivtt = -0.11,
              b_ecar_act = -0.1,
              b_ecar_egt = -0.14,
              b_ecar_tc = -0.23,
              b_ecar_av = 0,
              b_ecar_wt = -0.02,
              b_ebike_ivtt = -0.05,
              b_ebike_act = 0,
              b_ebike_egt = -0.06,
              b_ebike_tc = -0.52,
              b_ebike_av = -0.01,
              b_ebike_wt = -0.38,
              b_escoot_ivtt = -0.15,
              b_escoot_act = -0.21,
              b_escoot_egt = -0.27,
              b_escoot_tc = -0.13,
              b_escoot_av = -0.04,
              b_escoot_wt = -0.59,
              b_aebike_ivtt = -0.02,
              b_aebike_wk = -0.01,
              b_aebike_tc = -0.43,
              b_aebike_av = 0,
              b_aebike_wt = -0.15,
              b_aescoot_ivtt = -0.24,
              b_aescoot_wk = 0,
              b_aescoot_tc = -0.3,
              b_aescoot_av = 0,
              b_aescoot_wt = 0,
              b_awalk = -0.06)
# b_age = 0,
# b_female = 0,
# b_hhinc = 0)

apollo_HB=list(hbDist = c(asc_car = "NR",
              asc_pt = "N",
              asc_ecar = "N",
              asc_ebike = "N",
              asc_escoot = "N",
              asc_aebike = "N",
              asc_aescoot = "N",
              asc_awalk = "NR",
              b_car_ivtt = "LN-",
              b_car_egt = "N",
              b_car_tc = "LN-",
              b_car_pc = "LN-",
              b_pt_ivtt = "LN-",
              b_pt_act = "LN-",
              b_pt_egt = "LN-",
              b_pt_wt = "LN-",
              b_pt_tc = "LN-",
              b_ecar_ivtt = "LN-",
              b_ecar_act = "LN-",
              b_ecar_egt = "LN-",
              b_ecar_tc = "LN-",
              b_ecar_av = "N",
              b_ecar_wt = "LN-",
              b_ebike_ivtt = "LN-",
              b_ebike_act = "LN-",
              b_ebike_egt = "LN-",
              b_ebike_tc = "LN-",
              b_ebike_av = "N",
              b_ebike_wt = "LN-",
              b_escoot_ivtt = "LN-",
              b_escoot_act = "LN-",
              b_escoot_egt = "LN-",
              b_escoot_tc = "LN-",
              b_escoot_av = "N",
              b_escoot_wt = "LN-",
              b_aebike_ivtt = "LN-",
              b_aebike_wk = "LN-",
              b_aebike_tc = "LN-",
              b_aebike_av = "N",
              b_aebike_wt = "LN-",
              b_aescoot_ivtt = "LN-",
              b_aescoot_wk = "LN-",
              b_aescoot_tc = "LN-",
              b_aescoot_av = "N",
              b_aescoot_wt = "LN-",
              b_awalk = "LN-"),
              gNCREP          = 50000, # burn-in iterations
              gNEREP          = 20000, # post burn-in iterations
              gINFOSKIP       = 500
              )

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c("asc_car", "asc_awalk")

# ################################################################# #
#### GROUP AND VALIDATE INPUTS                                   ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
  V = list()
  V[["car"]] = asc_car + b_car_ivtt*car_ivtt + b_car_egt*car_egt + b_car_tc*car_tc + b_car_pc*car_pc #+ b_age*age + b_female*female + b_hhinc*hhinc
  V[["pt"]] = asc_pt + b_pt_ivtt*pt_ivtt + b_pt_act*pt_act + b_pt_egt*pt_egt + b_pt_wt*pt_wt + b_pt_tc*pt_tc #+ b_age*age + b_female*female + b_hhinc*hhinc
  V[["ecar"]] = asc_ecar + b_ecar_ivtt*ecar_ivtt + b_ecar_act*ecar_act + b_ecar_egt*ecar_egt + b_ecar_tc*ecar_tc + b_ecar_av*ecar_av + b_ecar_wt*ecar_wt #+ b_age*age + b_female*female + b_hhinc*hhinc  
  V[["ebike"]] = asc_ebike + b_ebike_ivtt*ebike_ivtt + b_ebike_act*ebike_act + b_ebike_egt*ebike_egt + b_ebike_tc*ebike_tc + b_ebike_av*ebike_av + b_ebike_wt*ebike_wt #+ b_age*age + b_female*female + b_hhinc*hhinc
  V[["escoot"]] = asc_escoot + b_escoot_ivtt*escoot_ivtt + b_escoot_act*escoot_act + b_escoot_egt*escoot_egt + b_escoot_tc*escoot_tc + b_escoot_av*escoot_av + b_escoot_wt*escoot_wt #+ b_age*age + b_female*female + b_hhinc*hhinc
  V[["aebike"]] = asc_aebike + b_aebike_ivtt*aebike_ivtt + b_aebike_wk*aebike_wk + b_aebike_tc*aebike_tc + b_aebike_av*aebike_av + b_aebike_wt*aebike_wt #+ b_age*age + b_female*female + b_hhinc*hhinc
  V[["aescoot"]] = asc_aescoot + b_aescoot_ivtt*aescoot_ivtt + b_aescoot_wk*aescoot_wk + b_aescoot_tc*aescoot_tc + b_aescoot_av*aescoot_av + b_aescoot_wt*aescoot_wt #+ b_age*age + b_female*female + b_hhinc*hhinc
  V[["awalk"]] = asc_awalk + b_awalk*awalk #+ b_age*age + b_female*female + b_hhinc*hhinc
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives  = c(car=1, pt=2, ecar=3, ebike=4, escoot=5, aebike=6, aescoot=7, awalk=8), 
    avail         = list(car=av_car, pt=av_pt, ecar=av_ecar, ebike=av_ebike, escoot=av_escoot, aebike=av_aebike, aescoot=av_aescoot, awalk=av_awalk), 
    choiceVar     = choice,
    utilities     = V
  )
  
  ### Compute probabilities using MNL model
  P[["model"]] = apollo_mnl(mnl_settings, functionality)
  
  ### Take product across observation for same individual
  #P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ################################################################# #
#### MODEL ESTIMATION                                            ####
# ################################################################# #

modelB = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs, estimate_settings = list(estimationRoutine="BFGS"))

# ################################################################# #
#### MODEL OUTPUTS                                               ####
# ################################################################# #

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN)                               ----
# ----------------------------------------------------------------- #

apollo_modelOutput(modelB)

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name)               ----
# ----------------------------------------------------------------- #

apollo_saveOutput(modelB)
The results are:

Code: Select all

Model name                                  : MNL_SP
Model description                           : Pilot MNL model on mode choice SP data with HB
Model run at                                : 2024-09-13 22:57:34.44561
Estimation method                           : Hierarchical Bayes
Number of individuals                       : 726
Number of rows in database                  : 7101
Number of modelled outcomes                 : 7101

Number of cores used                        :  1 

Estimation carried out using RSGHB
Burn-in iterations                          : 50000
Post burn-in iterations                     : 20000

Classical model fit statistics were calculated at parameter values obtained using averaging across the post burn-in
  iterations.
LL(start)                                   : -32122.57
LL at equal shares, LL(0)                   : -9488.5
LL at observed shares, LL(C)                : -8044.23
LL(final)                                   : NaN
Rho-squared vs equal shares                  :  NaN 
Adj.Rho-squared vs equal shares              :  NaN 
Rho-squared vs observed shares               :  NaN 
Adj.Rho-squared vs observed shares           :  NaN 
AIC                                         :  NaN 
BIC                                         :  NaN 

Equiv. estimated parameters                 :  1034
 (means of random parameters                :  44)
 (covariance matrix terms                   :  990)

Time taken (hh:mm:ss)                       :  01:00:13.09 
     pre-estimation                         :  00:00:0.91 
     estimation                             :  00:59:25.09 
     post-estimation                        :  00:00:47.09 


Summary of parameter chains

Non-random coefficients 
          Mean SD
asc_car      0 NA
asc_awalk    0 NA

Results for posterior means for random coefficients 
                  Mean     SD
asc_pt         -1.0472 0.7376
asc_ecar        0.4351 0.4538
asc_ebike      -0.0703 0.4662
asc_escoot      4.8581 1.0556
asc_aebike     -1.8167 0.6715
asc_aescoot    -3.7317 1.0784
b_car_ivtt     -1.0584 1.1678
b_car_egt      -0.1606 0.1169
b_car_tc       -1.1081 0.3329
b_car_pc       -5.2449 6.2265
b_pt_ivtt      -1.1254 1.0565
b_pt_act       -1.0518 1.0266
b_pt_egt       -1.4521 1.5207
b_pt_wt        -1.1535 1.8829
b_pt_tc        -0.9030 0.4929
b_ecar_ivtt    -1.5415 1.5379
b_ecar_act     -1.4452 2.1103
b_ecar_egt     -2.6467 5.8017
b_ecar_tc      -1.1002 0.6490
b_ecar_av      -0.0527 0.0767
b_ecar_wt      -0.3873 0.2447
b_ebike_ivtt   -1.3902 1.0425
b_ebike_act    -1.9084 2.2286
b_ebike_egt    -1.0841 0.7588
b_ebike_tc     -2.5252 3.1672
b_ebike_av     -0.1257 0.1219
b_ebike_wt     -1.2109 1.3047
b_escoot_ivtt  -1.4285 1.0987
b_escoot_act   -1.0613 0.6492
b_escoot_egt   -2.8530 3.2570
b_escoot_tc    -0.9622 0.4404
b_escoot_av    -0.1744 0.0803
b_escoot_wt    -0.9376 0.5198
b_aebike_ivtt  -0.6085 0.1891
b_aebike_wk    -0.3683 0.1452
b_aebike_tc    -3.5890 3.2095
b_aebike_av     0.0616 0.0851
b_aebike_wt    -1.3384 1.0136
b_aescoot_ivtt -2.3479 3.4822
b_aescoot_wk   -0.5830 0.2333
b_aescoot_tc   -0.5605 0.3892
b_aescoot_av    0.0939 0.1330
b_aescoot_wt   -1.6251 1.7322
b_awalk        -0.4549 0.1197

Summary of distributions of random coeffients (after distributional transforms) 
                  Mean      SD
asc_pt         -1.0316  1.1864
asc_ecar        0.4352  0.7631
asc_ebike      -0.0635  0.7761
asc_escoot      4.8517  1.5765
asc_aebike     -1.8086  1.1005
asc_aescoot    -3.7524  1.5630
b_car_ivtt     -1.2127  3.5594
b_car_egt      -0.1623  0.2603
b_car_tc       -1.1089  0.6841
b_car_pc       -5.7248 16.9518
b_pt_ivtt      -1.3133  5.0289
b_pt_act       -1.1737  3.3195
b_pt_egt       -1.6831  5.5773
b_pt_wt        -1.3340  6.9558
b_pt_tc        -0.9127  0.9704
b_ecar_ivtt    -1.8840  7.2014
b_ecar_act     -1.6116  6.3814
b_ecar_egt     -3.1589 23.6125
b_ecar_tc      -1.0921  1.4635
b_ecar_av      -0.0508  0.1409
b_ecar_wt      -0.3850  0.4691
b_ebike_ivtt   -1.4396  2.5358
b_ebike_act    -2.0824  8.7448
b_ebike_egt    -1.0916  1.8745
b_ebike_tc     -2.6726  8.2982
b_ebike_av     -0.1251  0.2137
b_ebike_wt     -1.3562  4.2061
b_escoot_ivtt  -1.5280  3.1523
b_escoot_act   -1.0777  1.3789
b_escoot_egt   -3.3517 12.6202
b_escoot_tc    -0.9559  0.8978
b_escoot_av    -0.1705  0.1866
b_escoot_wt    -0.9204  1.1084
b_aebike_ivtt  -0.5999  0.3913
b_aebike_wk    -0.3691  0.2610
b_aebike_tc    -3.6606 10.9210
b_aebike_av     0.0580  0.1729
b_aebike_wt    -1.3128  1.7149
b_aescoot_ivtt -2.6093 10.8662
b_aescoot_wk   -0.5889  0.5182
b_aescoot_tc   -0.5658  0.7885
b_aescoot_av    0.0951  0.2319
b_aescoot_wt   -1.5555  2.6745
b_awalk        -0.4546  0.2391

Upper level model results for mean parameters for underlying Normals 
                  Mean     SD
asc_pt         -1.0474 0.0836
asc_ecar        0.4353 0.0739
asc_ebike      -0.0702 0.0877
asc_escoot      4.8579 0.1402
asc_aebike     -1.8168 0.2451
asc_aescoot    -3.7319 0.0973
b_car_ivtt     -1.1547 0.1013
b_car_egt      -0.1606 0.0266
b_car_tc       -0.0557 0.0440
b_car_pc        0.2340 0.0895
b_pt_ivtt      -0.9634 0.1021
b_pt_act       -1.2549 0.1188
b_pt_egt       -1.0469 0.0967
b_pt_wt        -1.7198 0.1455
b_pt_tc        -0.4881 0.1596
b_ecar_ivtt    -1.1034 0.1161
b_ecar_act     -1.1388 0.1119
b_ecar_egt     -1.3788 0.1478
b_ecar_tc      -0.4078 0.1678
b_ecar_av      -0.0527 0.0108
b_ecar_wt      -1.4294 0.0877
b_ebike_ivtt   -0.4421 0.1323
b_ebike_act    -0.9773 0.1923
b_ebike_egt    -0.6754 0.1764
b_ebike_tc     -0.2676 0.2091
b_ebike_av     -0.1256 0.0262
b_ebike_wt     -1.0463 0.0971
b_escoot_ivtt  -0.4589 0.1007
b_escoot_act   -0.4194 0.0814
b_escoot_egt   -0.5779 0.1281
b_escoot_tc    -0.3541 0.0927
b_escoot_av    -0.1743 0.0190
b_escoot_wt    -0.5147 0.0964
b_aebike_ivtt  -0.6764 0.0956
b_aebike_wk    -1.1921 0.0856
b_aebike_tc     0.1377 0.1199
b_aebike_av     0.0615 0.0123
b_aebike_wt    -0.1988 0.0575
b_aescoot_ivtt -0.7798 0.1293
b_aescoot_wk   -0.8381 0.1542
b_aescoot_tc   -1.1202 0.0890
b_aescoot_av    0.0939 0.0193
b_aescoot_wt   -0.2269 0.1193
b_awalk        -0.9065 0.0709

Re: Errors in Hybrid Choice Model with Bayesian Estimator

Posted: 20 Sep 2024, 08:38
by stephanehess
Hi

Again, the first message is a warning, not a failure.

The NaN for the equivalent fit statistics is likely due to some extreme values in the betas leading to probabilities of 0 in some cases.
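
As a rough illustration of how extreme utilities break the fit statistics, here is a naive MNL probability written in base R. The function `mnl_prob` and the utility values are assumptions made for this sketch; Apollo's own implementation may be more numerically robust than this.

```r
# Plain MNL probability of the chosen alternative (no numerical safeguards)
mnl_prob <- function(V, chosen) exp(V[chosen]) / sum(exp(V))

# Moderate utilities: a well-defined probability
p_ok <- mnl_prob(c(car = -1.2, pt = -0.8), "pt")

# Extreme positive utility: exp() overflows to Inf, and Inf/Inf is NaN
p_nan <- mnl_prob(c(car = -1.2, pt = 800), "pt")

# Extreme negative utility: exp() underflows to 0, so log(P) is -Inf
p_zero <- mnl_prob(c(car = -1.2, pt = -800), "pt")

print(c(p_ok, p_nan, p_zero))
print(log(p_zero))
```

A single observation with a probability of 0 or NaN is enough to make the summed log-likelihood, and every statistic derived from it, non-finite.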

Stephane

Re: Errors in Hybrid Choice Model with Bayesian Estimator

Posted: 20 Sep 2024, 09:19
by janak12_jp
Hi Stephane,

Thank you for the reply. Yes, I understand that it's a warning.

I have tried the HB model several times and I get NaN every time. Can you suggest how to avoid getting extreme values?

Re: Errors in Hybrid Choice Model with Bayesian Estimator

Posted: 20 Sep 2024, 09:29
by stephanehess
You could try other distributions, as the problems with extreme values might be less severe with them.
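
To give a feel for why the choice of distribution matters, the sketch below compares the tail of a lognormal with that of a normal built on the same underlying normal. The mean is taken from the b_car_pc row of the upper-level results posted earlier in the thread; the standard deviation is an assumption, as the covariance terms were not shown.

```r
# Underlying normal x ~ N(mu, sigma^2)
mu    <- 0.234   # upper-level mean reported for b_car_pc
sigma <- 1.5     # assumed for illustration only

# 99.9th percentile of exp(x) (lognormal) versus x itself (normal)
tail_lognormal <- qlnorm(0.999, meanlog = mu, sdlog = sigma)
tail_normal    <- qnorm(0.999, mean = mu, sd = sigma)

print(c(lognormal = tail_lognormal, normal = tail_normal))
```

Under these assumptions the lognormal tail is far larger in magnitude, so a few individuals can end up with coefficients that push utilities to the extremes where probabilities become 0.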