Hybrid Choice Model (HCM) using Bayesian estimation

Ask questions about model specifications. Ideally include a mathematical explanation of your proposed model.
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Hybrid Choice Model (HCM) using Bayesian estimation

Post by cybey »

Hello everybody,

I recently read the article “Hess et al. (2018) - Analysis of mode choice for intercity travel: Application of a hybrid choice model to two distinct US corridors”, in which the hybrid choice model was estimated using the Bayesian approach. Now I wanted to try estimating an HCM using HB myself and used Apollo example 25. However, I have found that the model results from Maximum Simulated Likelihood (MSL) and Bayesian estimation differ quite a lot.

MSL:

Code:

Estimates:
                    Estimate Std.err. t.ratio(0) Rob.std.err. Rob.t.ratio(0)
b_brand_Artemis       0.0000       NA         NA           NA             NA
b_brand_Novum        -0.2767   0.0308      -8.98       0.0319          -8.66
b_brand_BestValue    -0.5811   0.0661      -8.78       0.0643          -9.04
b_brand_Supermarket  -0.2670   0.0673      -3.97       0.0662          -4.03
b_brand_PainAway     -1.2493   0.0675     -18.51       0.0655         -19.08
b_country_CH          0.6704   0.0399      16.82       0.0388          17.28
b_country_DK          0.3376   0.0382       8.83       0.0375           8.99
b_country_USA         0.0000       NA         NA           NA             NA
b_country_IND        -0.2967   0.0573      -5.18       0.0581          -5.11
b_country_RUS        -0.8937   0.0617     -14.48       0.0611         -14.62
b_country_BRA        -0.6558   0.0602     -10.89       0.0617         -10.62
b_char_standard       0.0000       NA         NA           NA             NA
b_char_fast           0.7699   0.0292      26.35       0.0289          26.65
b_char_double         1.2125   0.0378      32.10       0.0366          33.12
b_risk               -0.0016   0.0001     -26.99       0.0001         -26.51
b_price              -0.7243   0.0181     -39.97       0.0173         -41.90
lambda                0.6586   0.0325      20.26       0.0316          20.84
gamma_reg_user       -0.7451   0.0798      -9.34       0.0794          -9.38
gamma_university     -0.3797   0.0738      -5.14       0.0739          -5.14
gamma_age_50          0.6735   0.0763       8.82       0.0747           9.02
zeta_quality          0.5342   0.0393      13.59       0.0387          13.82
zeta_ingredient      -0.5028   0.0402     -12.51       0.0396         -12.69
zeta_patent           0.6069   0.0410      14.81       0.0384          15.80
zeta_dominance       -0.3960   0.0367     -10.80       0.0351         -11.30
sigma_qual            1.0616   0.0275      38.62       0.0270          39.37
sigma_ingr            1.1088   0.0279      39.79       0.0263          42.16
sigma_pate            1.0840   0.0291      37.21       0.0275          39.48
sigma_domi            1.0445   0.0253      41.26       0.0231          45.17

Summary statistics for dependent variable for model component "NormD":
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -1.741  -0.741   0.259   0.000   0.259   2.259 

Summary statistics for dependent variable for model component "NormD":
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -2.232  -0.232  -0.232   0.000   0.768   1.768 

Summary statistics for dependent variable for model component "NormD":
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -1.806  -0.806   0.194   0.000   1.194   2.194 

Summary statistics for dependent variable for model component "NormD":
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -2.164  -0.164  -0.164   0.000   0.836   1.836 

HB:

Code:

Summary of parameter chains

Non-random coefficients 
                       Mean     SD
b_brand_Artemis      0.0000     NA
b_brand_Novum       -0.1006 0.0093
b_brand_BestValue   -0.1751 0.0065
b_brand_Supermarket -0.0318 0.0109
b_brand_PainAway    -0.5641 0.0477
b_country_CH         0.4855 0.0210
b_country_DK         0.2851 0.0266
b_country_USA        0.0000     NA
b_country_IND       -0.0796 0.0069
b_country_RUS       -0.4323 0.0253
b_country_BRA       -0.2872 0.0111
b_char_standard      0.0000     NA
b_char_fast          0.4366 0.0235
b_char_double        0.8598 0.0402
b_risk              -0.0016 0.0001
b_price             -0.4972 0.0161
lambda               0.4800 0.0404
gamma_reg_user       0.2984 0.0201
gamma_university     0.1369 0.0044
gamma_age_50         0.0414 0.0123
zeta_quality         0.0297 0.0393
zeta_ingredient     -0.1351 0.0240
zeta_patent          0.0908 0.0267
zeta_dominance      -0.1890 0.0075
sigma_qual           1.2087 0.0146
sigma_ingr           1.1820 0.0071
sigma_pate           1.2876 0.0134
sigma_domi           1.1064 0.0110

Upper level model results for mean parameters for underlying Normals 
    Mean SD
eta    0  0

Upper level model results for covariance matrix for underlying Normals (means across iterations) 
    eta
eta   1

Upper level model results for covariance matrix for underlying Normals (SD across iterations) 
    eta
eta   0

Summary of distributions of random coeffients (after distributional transforms) 
       Mean     SD
[1,] 0.0049 0.9896

Results for posterior means for random coefficients 
      [,1]   [,2]
eta 0.1335 0.7056

To avoid errors resulting from inaccuracies in the estimation process, I increased the number of Halton draws for MSL to 500. For the Bayesian estimation, I use 50,000 iterations in the burn-in phase and another 50,000 for the estimation phase. I therefore think something is wrong with my model specification. I suspect it might have something to do with the distributional assumptions (normally distributed and fixed parameters and their restrictions).
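
For reference, this is roughly how I set the number of draws and iterations (just a sketch of the relevant settings, not the full script; the draw name "eta" is a placeholder for the latent-variable draws, and the omitted parts of apollo_HB are as in the attached files):

Code:

### MSL: 500 Halton draws for the latent-variable disturbance
apollo_draws = list(
  interDrawsType = "halton",
  interNDraws    = 500,
  interNormDraws = c("eta")
)

### HB: iteration settings passed through to RSGHB via apollo_HB
apollo_HB = list(
  # hbDist omitted here: one entry per parameter ("F" = fixed, "N" = normal),
  # exactly as in the attached file
  gNCREP    = 50000,   # burn-in iterations
  gNEREP    = 50000,   # iterations kept after burn-in
  gINFOSKIP = 500      # report progress every 500 iterations
)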

I would be happy if you could take a look at the R file and give me a hint. Thanks a lot in advance!

Best wishes
Nico
Attachments
Apollo_example_25_bayesian.zip
(4.72 KiB) Downloaded 451 times
Apollo_example_25.zip
(10.28 KiB) Downloaded 440 times
dpalma
Posts: 190
Joined: 24 Apr 2020, 17:54

Re: Hybrid Choice Model (HCM) using Bayesian estimation

Post by dpalma »

Hi Nico,

It looks like the HB estimation has not converged (i.e. the chains are not stable yet). You probably need more post-burn-in iterations. You should look at the Geweke test to determine convergence (the statistics behave like t-ratios; check the manual).
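
As a rough illustration (this is not Apollo output, just a sketch using the coda package; in practice you would apply it to the post-burn-in chains of your parameters, e.g. the non-random coefficients reported by RSGHB):

Code:

library(coda)

# Illustration with simulated chains; replace 'chains' by a matrix of
# post-burn-in draws, one column per parameter
set.seed(1)
chains <- mcmc(cbind(drifting = cumsum(rnorm(50000)) / sqrt(1:50000),
                     stable   = rnorm(50000)))

# Geweke diagnostic: z-scores comparing the mean of the first 10% and
# the last 50% of each chain; |z| > 1.96 suggests non-convergence
geweke.diag(chains, frac1 = 0.1, frac2 = 0.5)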

Cheers
David
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Re: Hybrid Choice Model (HCM) using Bayesian estimation

Post by cybey »

Hi David,

Thanks for your answer. So the specification itself is correct?

I've tried running the model with 500,000 iterations for the burn-in phase and another 500,000 for estimation. Unfortunately, after ~400k iterations an error occurred:

Code:

Error: cannot allocate vector of size 78 KB
Error during wrapup: cannot allocate vector of size 0 KB
Error: no more error handlers available (recursive errors?); invoking 'abort' restart

It might have something to do with my machine. However, I have also noticed on other machines (e.g. servers with lots of RAM and many cores) that using a large number of iterations takes a very long time; the time required does not seem to increase linearly with the number of iterations. Furthermore, the estimation process also stops on these machines when I estimate models with HB and use a huge number of iterations, for example 2 million.

To find out whether the model results can be compared, I did not use 0 as the starting values for the parameters, but instead used the estimation results of the Maximum Simulated Likelihood (MSL) model. Then I set the number of iterations for the burn-in phase to 100,000 and used another 100,000 for the estimation phase. Et voilà: the results look quite similar (see files attached). What stands out, however, is that the standard errors of the parameter estimates are much smaller in the HB model than in the MSL model. Is this a result of the relatively low number of 500 Halton draws in the MSL model?
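
In case it is useful, this is roughly how I fed the MSL results in as starting values (a sketch; it assumes the MSL model was saved under the name "Apollo_example_25", so that apollo_readBeta can find its estimates):

Code:

### Define apollo_beta and apollo_fixed as usual, then overwrite the
### starting values with the saved MSL estimates before calling apollo_estimate
apollo_beta = apollo_readBeta(apollo_beta, apollo_fixed,
                              inputModelName = "Apollo_example_25",
                              overwriteFixed = FALSE)

### Alternative if the MSL model object is still in memory:
# apollo_beta = model_msl$estimate[names(apollo_beta)]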

My last question is about the slow convergence of the Markov chain, which is why I used the MSL results as starting values for the HB model. Usually, it is the other way around. In the literature one generally reads that HB models find the "optimal" solution - or, more precisely, reasonable local optima - faster than classical estimation. But here this does not seem to be the case. Is it because of the complex nature of hybrid choice models?

Best wishes
Nico
Attachments
Apollo_example_25_bayesian_2.zip
(4.93 KiB) Downloaded 423 times
Apollo_example_25.zip
(10.28 KiB) Downloaded 424 times
stephanehess
Site Admin
Posts: 1042
Joined: 24 Apr 2020, 16:29

Re: Hybrid Choice Model (HCM) using Bayesian estimation

Post by stephanehess »

Nico

Regarding memory, this is an issue in RSGHB, so it is somewhat out of our hands.

I've discussed the other point with Thijs Dekker. Here are his thoughts: "RSGHB is optimised for MMNL, not hybrid models. As a consequence, it will make use of inefficient Gibbs samplers that overly rely on Metropolis-Hastings algorithms. Especially in hybrid models, there will be a lot of parameters going into this. Augmented latent variables would make the sampler more efficient, as seen in the work by Daziano and myself (doi.org/10.1016/j.reseneeco.2015.11.002).

With a limited number of tuning parameters for the MH algorithm, the Gibbs sampler will be (very) slow to converge despite starting at the MSL values. The initial draws of the Gibbs sampler can, by chance, take you far away from that starting point. Hence, the low standard errors are possibly a consequence of small steps taken by the MH algorithm, but that is impossible to judge without examining the MCMC chains in more detail. I wouldn't take these low standard errors as the truth, since RSGHB does not rely on informative priors and the results should in theory be identical to the MSL ones."

Hope this helps

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Re: Hybrid Choice Model (HCM) using Bayesian estimation

Post by cybey »

This helps a lot, thank you very much! So it is not really an option to use RSGHB for HCMs, at least not with the default settings and without in-depth knowledge of Bayesian estimation. :cry:

Nico
stephanehess
Site Admin
Posts: 1042
Joined: 24 Apr 2020, 16:29

Re: Hybrid Choice Model (HCM) using Bayesian estimation

Post by stephanehess »

You can, but the caveats above apply in terms of slow convergence.
--------------------------------
Stephane Hess
www.stephanehess.me.uk