Autocorrelation in Markov Chain Monte Carlo Simulation

cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Autocorrelation in Markov Chain Monte Carlo Simulation

Post by cybey »

Hello everyone,

I have a question about estimating models using HB, either with Apollo or other software. The Markov Chain Monte Carlo draws are supposed to be independent, i.e. autocorrelation is undesirable.

For example, Train and Weeks (2005) wrote in their paper: "[...] 10,000 iterations were used as “burn-in” after which every tenth draw was retained from 10,000 additional iterations, providing a total 1,000 draws from the posterior distribution of the parameters. Previous analysis of these data by Train and Sonnier, as well as our own analysis, indicates that the MCMC sequences converged within the burn-in period."

Train, Kenneth; Weeks, Melvyn (2005): Discrete Choice Models in Preference Space and Willing-to-pay Space. In: Riccardo Scarpa and Anna Alberini (eds.): Applications of Simulation Methods in Environmental and Resource Economics. Dordrecht: Springer (The economics of non-market goods and resources, 6), pp. 1–16.

I have found that setting gNSKIP to 10, for example, is a suitable strategy for MIXL models. However, if I try to estimate more complex models using HB, such as LC-MIXL or ICLV, the R package 'RSGHB' has problems with a very large number of draws. Anything over 1 million draws becomes really difficult, despite having a good machine. Hence, I have to trade off between a low number of retained draws with low autocorrelation (gNSKIP >> 1) and a high number of retained draws that are more correlated (gNSKIP close to 1). Do you have any experience of which strategy is better?

I look forward to your responses.

Best
Nico
dpalma
Posts: 190
Joined: 24 Apr 2020, 17:54

Re: Autocorrelation in Markov Chain Monte Carlo Simulation

Post by dpalma »

Hi Nico,

Skipping draws in an MCMC chain (or "thinning the chain") is no longer a recommended practice. See, for example, https://doi.org/10.1111/j.2041-210X.2011.00131.x. So I would advise keeping all draws in the chain.
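To make the thinning point concrete, here is a minimal R sketch on a simulated AR(1) series rather than real HB output. The AR(1) coefficient of 0.8, the chain length and the helper name ess() are assumptions made for illustration only; ess() is a crude effective-sample-size estimate that sums the sample autocorrelations up to the first non-positive lag.

```r
set.seed(42)

# Simulated autocorrelated "chain": AR(1) with coefficient 0.8 (illustrative, not HB output)
n_draws <- 100000
chain   <- as.numeric(stats::filter(rnorm(n_draws), filter = 0.8, method = "recursive"))

# Crude effective sample size: n / (1 + 2 * sum of positive-lag autocorrelations),
# truncating the sum at the first non-positive sample autocorrelation
ess <- function(x, lag.max = 200) {
  rho <- as.numeric(acf(x, lag.max = lag.max, plot = FALSE)$acf)[-1]  # drop lag 0
  first_nonpos <- which(rho <= 0)[1]
  if (!is.na(first_nonpos)) rho <- rho[seq_len(first_nonpos - 1)]
  length(x) / (1 + 2 * sum(rho))
}

# Keep every 10th draw, analogous to thinning with a skip of 10
thinned <- chain[seq(10, n_draws, by = 10)]

ess_full <- ess(chain)
ess_thin <- ess(thinned)
cat("draws kept: ", length(chain), "vs", length(thinned), "\n")
cat("approx. ESS:", round(ess_full), "vs", round(ess_thin), "\n")
```

In this toy setting the thinned chain is less autocorrelated per draw, yet the full chain carries at least as much information about the mean. Thinning mainly saves storage, which is the trade-off discussed in this thread.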

Concerning memory usage, this is a known limitation of RSGHB (the package Apollo relies on for Bayesian estimation). I am afraid that, other than running your model on a machine with more RAM, there isn't much to do. We might improve this in the future, but as it is not part of the core Apollo code, it might take a while.
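As a rough back-of-the-envelope check on why long chains become heavy, the sketch below estimates the size of a dense numeric matrix of retained draws. It assumes 8 bytes per stored value (R doubles) and hypothetical counts of parameters and individuals; it says nothing about how RSGHB actually lays out its objects internally.

```r
# Approximate size in GB of a dense double matrix of retained draws
# (8 bytes per value; ignores R's small per-object overhead)
draws_gb <- function(n_draws, n_values_per_draw) {
  n_draws * n_values_per_draw * 8 / 1024^3
}

n_params      <- 20    # hypothetical number of parameters
n_individuals <- 500   # hypothetical number of respondents

# Population-level parameters only
gb_population <- draws_gb(1e6, n_params)
# If individual-level coefficients were kept for every draw as well
gb_individual <- draws_gb(1e6, n_params * n_individuals)
print(c(population = gb_population, individual = gb_individual))

# Sanity check of the 8-bytes-per-value assumption on a small matrix
bytes_per_value <- as.numeric(object.size(matrix(0, nrow = 1000, ncol = n_params))) /
  (1000 * n_params)
print(bytes_per_value)
```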

Best
David
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Re: Autocorrelation in Markov Chain Monte Carlo Simulation

Post by cybey »

Hi David,

thank you for your answer - this paper helps a lot.

One more question about using large chains in Apollo (or RSGHB): is it appropriate to divide the chain into several sub-chains? In classical estimation using Maximum Simulated Likelihood, there are some papers that recommend estimating a model first with Bayes and then using these estimates as starting values for MSL. I thought of something similar for large chains. For example, in a first step, you use 300,000 draws as burn-in and another 100,000 draws for estimation. You then use the estimates resulting from the 100k draws as starting values for a second chain with, for example, another 100,000 draws for burn-in and 700,000 draws for estimation.

In this example, 400k draws are used for burn-in, 100k for estimation in step 1, and another 700k for estimation in step 2. In total, 1,200k draws are used, which is usually not possible in just one estimation step.
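The segmenting idea can be sketched on a toy problem. The code below is not RSGHB or Apollo: it is a hand-written random-walk Metropolis sampler for a standard normal target, with the draw counts from the example scaled down by a factor of 100. The function name run_chain(), the step size and the choice to restart the second segment from the last state of the first (rather than from posterior means) are all assumptions made for illustration.

```r
set.seed(1)

# Toy target: standard normal log-density (stands in for a posterior)
log_target <- function(x) dnorm(x, log = TRUE)

# Random-walk Metropolis; returns the retained draws and the final state
run_chain <- function(start, n_burn, n_keep, step_sd = 1) {
  n_total <- n_burn + n_keep
  draws   <- numeric(n_total)
  current <- start
  for (i in seq_len(n_total)) {
    proposal <- current + rnorm(1, sd = step_sd)
    if (log(runif(1)) < log_target(proposal) - log_target(current)) current <- proposal
    draws[i] <- current
  }
  list(kept = draws[(n_burn + 1):n_total], last = current)
}

# Segment 1: 3,000 burn-in + 1,000 retained, started far from the target
seg1 <- run_chain(start = 10, n_burn = 3000, n_keep = 1000)
# Segment 2: restarted from where segment 1 ended, 1,000 burn-in + 7,000 retained
seg2 <- run_chain(start = seg1$last, n_burn = 1000, n_keep = 7000)

pooled <- c(seg1$kept, seg2$kept)
cat("retained draws:", length(pooled), " mean:", round(mean(pooled), 3),
    " sd:", round(sd(pooled), 3), "\n")
```

In a real HB sampler the state also includes the covariance matrix and any adaptive step sizes, so restarting from parameter means alone is not the same as continuing the chain.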

Best
Nico
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Autocorrelation in Markov Chain Monte Carlo Simulation

Post by stephanehess »

Nico

this is probably a question that you want to put to a Bayesian modeller rather than an Apollo question per se. The issue would be that you would need to provide the full covariance matrix as starting values again after the initial run

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Re: Autocorrelation in Markov Chain Monte Carlo Simulation

Post by cybey »

Hello Stephane,

thanks for the answer. If I understand it correctly, then the output from the first step can simply be integrated in the second step. Example:

After step 1:

Code:

### Save priors for step 2
saveRDS(model$F_convergence, file = "./priors_step2/FC_step1.rds") # A vector of starting values for the fixed parameters
saveRDS(model$A_convergence, file = "./priors_step2/svN_step1.rds") # A vector of starting values for the means of the underlying normals for the random parameters
saveRDS(model$random_coeff_covar, file = "./priors_step2/pvMatrix_step1.rds") # A custom prior variance-covariance matrix to be used in estimation
saveRDS(c(model$F_convergence, model$A_convergence), file = "./priors_step2/apollo_beta_step1.rds") # A vector of starting values for the fixed and random parameters

In step 2:

Code:

### Priors from step 1 (column means of the draws, first column dropped)
priors.step2 = list()
priors.step2[["FC"]] = apply(subset(model$F, select=-c(1)), 2, mean)
priors.step2[["svN"]] = apply(subset(model$A, select=-c(1)), 2, mean)
priors.step2[["pvMatrix"]] = model$random_coeff_covar

saveRDS(priors.step2[["FC"]], file = "./priors_step2/FC_step1.rds") # A vector of starting values for the fixed parameters
saveRDS(priors.step2[["svN"]], file = "./priors_step2/svN_step1.rds") # A vector of starting values for the means of the underlying normals for the random parameters
saveRDS(priors.step2[["pvMatrix"]], file = "./priors_step2/pvMatrix_step1.rds") # A custom prior variance-covariance matrix to be used in estimation

Code:

### HB settings
FC = priors.step2$FC,
svN = priors.step2$svN,
pvMatrix = priors.step2$pvMatrix,

Or did I miss something?


I have a second question about model estimation with HB. So far, in classical estimation with MSL, I have rescaled the data so that the coefficients are of comparable orders of magnitude, e.g. expressing price in 50-euro increments (coded as 1, 2, 3) instead of 1-euro increments. Does rescaling play the same role when estimating with HB? I have only found papers that discuss rescaling in the context of MSL, and I did not find anything in the Apollo manual either, except that the scaling argument can be used in estimate_settings.

Nico
Last edited by cybey on 19 Mar 2021, 06:51, edited 1 time in total.
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Autocorrelation in Markov Chain Monte Carlo Simulation

Post by stephanehess »

Nico

I have never done what you're suggesting here in terms of two step approaches, and I think it's a question you would need to put to the developers of RSGHB, which is the package that's used inside Apollo.

In relation to your second question, yes, I believe that scaling is also important in a HB context.
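As a reminder of what rescaling does and does not change, here is a small self-contained R sketch. It uses a maximum likelihood binary logit fitted with glm() on simulated data, not HB and not Apollo, purely because that runs without extra packages; the data, the true coefficients and the 50-euro unit are assumptions taken from the example in the question.

```r
set.seed(123)

# Simulated binary choice data with price in euros (illustrative)
n     <- 5000
price <- sample(seq(50, 250, by = 50), n, replace = TRUE)
util  <- 1.5 - 0.01 * price                      # true effect per 1 euro: -0.01
y     <- rbinom(n, size = 1, prob = plogis(util))

fit_raw    <- glm(y ~ price, family = binomial())          # price in 1-euro units
fit_scaled <- glm(y ~ I(price / 50), family = binomial())  # price in 50-euro units

# Per-50-euro effect implied by the raw model vs. coefficient on the rescaled variable
print(c(raw_times_50 = unname(coef(fit_raw)[2]) * 50,
        scaled       = unname(coef(fit_scaled)[2])))
# The fit itself is unchanged by the change of units
print(c(logLik_raw = as.numeric(logLik(fit_raw)),
        logLik_scaled = as.numeric(logLik(fit_scaled))))
```

The model is the same under both codings; only the magnitude of the coefficient changes. In an HB setting that magnitude may interact with the prior variance and with the sampler's step sizes, which is one reason scaling can still matter there.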

Stephane
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Re: Autocorrelation in Markov Chain Monte Carlo Simulation

Post by cybey »

Hi Stephane,

ok, I'm on it and will write to the developers of the package 'RSGHB' and keep you posted.

The problem I encountered with this two-step approach in Apollo is the following:
"Settings modelname, gVarNamesFixed, gVarNamesNormal, gDIST, svN and FC should not be included in apollo_HB, as these are automatically set by Apollo."
Source: Apollo manual, page 97

Hence, the parameter estimates imported from step 1 are simply overwritten. If the user specifies values for FC, svN and pvMatrix, it would be great if Apollo used these as starting values for estimation. At the moment, I don't see an elegant way (other than copy and paste) to use parameter estimates from an HB model as starting values for another HB model, because no 'modelname_estimates.csv' file is generated that could be imported using apollo_readBeta. Furthermore, as you mentioned, it would be helpful to be able to import the covariance matrix as well. That would at least be a cool feature for future versions of Apollo, but it is just an idea. ;)

Best
Nico
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Autocorrelation in Markov Chain Monte Carlo Simulation

Post by stephanehess »

Nico

svN is simply generated on the basis of apollo_beta, so that's easy for you to address, I think

Stephane
cybey
Posts: 60
Joined: 26 Apr 2020, 19:38

Re: Autocorrelation in Markov Chain Monte Carlo Simulation

Post by cybey »

Hello, everyone,

I just wanted to let you know that splitting the HB estimation process to avoid RStudio crashing works. You can do something like this:

After, let's say, the first 1,000,000 draws (step 1):

Code:

### Save priors for step 2
priors.step2 = list()
priors.step2[["FC"]] = apply(subset(model$F, select=-c(1)), 2, mean)
priors.step2[["svN"]] = apply(subset(model$A, select=-c(1)), 2, mean)
priors.step2[["pvMatrix"]] = model$random_coeff_covar

saveRDS(priors.step2[["FC"]], file = "./priors_step2/FC_step1.rds")
saveRDS(priors.step2[["svN"]], file = "./priors_step2/svN_step1.rds")
saveRDS(priors.step2[["pvMatrix"]], file = "./priors_step2/pvMatrix_step1.rds")

Afterwards, in step 2, when loading the data ...

Code:

priors.step2 = list()
priors.step2[["FC"]] = readRDS(file = "./priors_step2/FC_step1.rds") # A vector of starting values for the fixed parameters
priors.step2[["svN"]] = readRDS(file = "./priors_step2/svN_step1.rds") # A vector of starting values for the means of the underlying normals for the random parameters
priors.step2[["pvMatrix"]] = readRDS(file = "./priors_step2/pvMatrix_step1.rds") # A custom prior variance-covariance matrix to be used in estimation

... and after apollo_beta:

Code:

apollo_beta_new = replace(apollo_beta, names(priors.step2$FC), priors.step2$FC)
apollo_beta_new = replace(apollo_beta_new, names(priors.step2$svN), priors.step2$svN)
apollo_beta = apollo_beta_new[names(apollo_beta)]
The only downside is that you cannot use the prior variance from step 1 in step 2, as the values from apollo_HB are used and I don't know how to modify these.
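For anyone who wants to try the name-matching step without running an HB model first, here is a self-contained sketch. The model object is a mock with hypothetical parameter names and values that only imitates the structure used above (draws in model$F and model$A, with the first column dropped); the single .rds file in tempdir() is likewise an assumption for the example.

```r
# Mock stand-in for the step 1 model object (hypothetical names and values)
model <- list(
  F = data.frame(iteration = 1:4, asc_1 = c(0.9, 1.1, 1.0, 1.0)),
  A = data.frame(iteration = 1:4,
                 b_price = c(-0.4, -0.6, -0.5, -0.5),
                 b_time  = c(-0.1, -0.3, -0.2, -0.2)),
  random_coeff_covar = diag(2)
)

# Step 1: means of the retained draws (first column dropped), saved to disk
priors.step2 <- list(
  FC       = colMeans(model$F[, -1, drop = FALSE]),
  svN      = colMeans(model$A[, -1, drop = FALSE]),
  pvMatrix = model$random_coeff_covar
)
prior_file <- file.path(tempdir(), "priors_step2.rds")
saveRDS(priors.step2, prior_file)

# Step 2: reload and overwrite only the entries of apollo_beta that match by name
priors.step2 <- readRDS(prior_file)
apollo_beta  <- c(asc_1 = 0, b_price = 0, b_time = 0, b_cost = 0)
new_values   <- c(priors.step2$FC, priors.step2$svN)
matched      <- intersect(names(new_values), names(apollo_beta))
apollo_beta[matched] <- new_values[matched]
print(apollo_beta)
```

Using intersect() means a name that is missing from apollo_beta is skipped rather than appended, so the final re-ordering step by names(apollo_beta) is not needed in this variant.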

Best
Nico