Hi authors,
Thank you very much for the package and the great forum. It is super helpful.
I am hoping to try the EMDC model that considers complementarity, based on Palma and Hess (2022). How can we simulate budget allocation data? For example, I would like to simulate some budget allocation data first, and then use the package to estimate the model on the simulated data to recover the parameters. I understand how to do the estimation, but I do not know how to do the data simulation.
Thank you.
Best,
TW
Any example to simulate data for EMDC model?
Re: Any example to simulate data for EMDC model?
Hi
you can use apollo_prediction to simulate some choices, but as this will be averaged across the draws for the error terms, it won't have any corner solutions.
If you want predictions at the draw level, you could include rawPrediction=TRUE in emdc_settings
Stephane & David
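The point about averaging can be seen in a toy example. The sketch below is not the eMDC model and does not use Apollo; it assumes a simple censored demand, x = max(0, mu + e), with hypothetical values for `mu`, `nObs` and `nDraws`. It only illustrates why a prediction averaged across error draws contains no exact zeros, while a single draw does.

```r
# Toy illustration (NOT the eMDC model): demand is censored at zero,
# so individual error draws can produce corner solutions (exact zeros).
set.seed(42)
nObs   <- 1000   # hypothetical number of observations
nDraws <- 100    # hypothetical number of error draws per observation
mu     <- 0.5    # hypothetical deterministic part of demand

# Draw-level outcomes: one row per observation, one column per draw
e     <- matrix(rnorm(nObs * nDraws), nrow = nObs, ncol = nDraws)
xDraw <- pmax(mu + e, 0)

# A single draw keeps the corner solutions
xSingle         <- xDraw[, 1]
shareZeroSingle <- mean(xSingle == 0)

# The average across draws is zero only if every draw is zero
xMean         <- rowMeans(xDraw)
shareZeroMean <- mean(xMean == 0)

cat("Share of zeros, single draw:     ", shareZeroSingle, "\n")
cat("Share of zeros, mean over draws: ", shareZeroMean, "\n")
```

Data simulated for a recovery exercise should look like `xSingle` (one draw per observation), not like `xMean`.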
Re: Any example to simulate data for EMDC model?
Hi Stephane and David,
Thank you very much for your quick response. This makes a lot of sense. I tried to use this function to simulate/predict. I noticed that the prediction takes much longer than the estimation (the estimation takes about 1 minute, while the prediction takes about 30 minutes using the data in the sample code). I believe this is because it requires a lot of draws. What is the minimum number of draws required for prediction, and how can I change it?
Best,
TW
Re: Any example to simulate data for EMDC model?
I also tried to use the simulated data to estimate the model again to see whether the true parameters can be recovered. Specifically, what I did was:
Given the sample code, I obtained the true parameters (I also fixed all the complementarity/substitution deltas to 0 to make it simpler). Then I used these parameters to do the simulation (prediction). After that, I used the simulated choices to estimate the model again to see whether I can get the same set of estimates (I also used the true parameters as the starting values in this step). However, I cannot seem to get the same estimates. Did I do it correctly?
Here is my code. Thank you!
Code:
# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS ####
# ################################################################# #
### Clear memory and initialise
rm(list = ls())
library(apollo)
apollo_initialise()
### Set core controls
apollo_control = list(
modelName ="eMDC_with_budget",
modelDescr ="Extended MDC with complementarity and substitution, with observed budget and socio-demographics",
indivID ="indivID",
outputDirectory="output"
)
# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS ####
# ################################################################# #
### Load data from within the Apollo package
database = apollo_timeUseData
### Create consumption variables for combined activities
# outside good: time spent at home and travelling
database$t_outside = rowSums(database[,c("t_a01", "t_a06", "t_a10", "t_a11", "t_a12")])
database$t_leisure = rowSums(database[,c("t_a07", "t_a08", "t_a09")])
# ### Randomly split dataset into estimation (70%) and validation (30%)
# set.seed(1)
# database$validation <- runif(nrow(database))>0.7
# dbVal <- database[ database$validation,] # validation sample
# database <- database[!database$validation,] # estimation sample
# ################################################################# #
#### DEFINE MODEL PARAMETERS ####
# ################################################################# #
### Parameters starting values c(name1=value1, name2=value2, ...)
apollo_beta = c( sigma = 0.71, ###Here I remove aFemale
# Satiation
gWork = 10.425,
gSchool = 9.955,
gShopping= 2.065,
gPrivate = 4.419,
gLeisure = 6.927,
# Base utility
bWork =-3.372, bSchool =-5.250,
bShopping =-3.663, bPrivate =-3.973,
bLeisure =-3.300, bWork_FT = 0.708,
bWork_wknd=-1.600, bSchool_young= 0.948,
bLeisure_wknd= 0.155,
# Compl/subst
# dWorkScho=-0.008, dWorkShop= 0.000,
# dWorkPriv= 0.000, dWorkLeis= 0.000,
# dSchoShop= 0.000, dSchoPriv= 0.000,
# dSchoLeis= 0.000, dShopPriv= 0.010,
# dShopLeis= 0.012, dPrivLeis= 0.012
dWorkScho=-0.000, dWorkShop= 0.000,
dWorkPriv= 0.000, dWorkLeis= 0.000,
dSchoShop= 0.000, dSchoPriv= 0.000,
dSchoLeis= 0.000, dShopPriv= 0.000,
dShopLeis= 0.000, dPrivLeis= 0.000)
### Names of fixed parameters
apollo_fixed = c('dWorkShop', 'dWorkPriv', 'dSchoShop',
'dSchoPriv', 'dSchoLeis', 'dWorkLeis' ,
"dWorkScho" , "dShopPriv" , "dShopLeis" , "dPrivLeis") #############Here I also fix sigma to the true value 0.71
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities=function(apollo_beta, apollo_inputs,
functionality="estimate"){
### Initialise
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P = list()
### Prepare Inputs
alts = c("work", "school", "shopping", "private", "leisure")
nAlt = length(alts)
ones = setNames(as.list(rep(1, nAlt)), alts)
continuousChoice = list(work = t_a02/60,
school = t_a03/60,
shopping = t_a04/60,
private = t_a05/60,
leisure = t_leisure/60)
utilities = list(
work = bWork + bWork_FT*occ_full_time + bWork_wknd*weekend,
school = bSchool + bSchool_young*(age<=30),
shopping = bShopping,
private = bPrivate,
leisure = bLeisure + bLeisure_wknd*weekend
)
gamma = list(work = gWork,
school = gSchool,
shopping = gShopping,
private = gPrivate,
leisure = gLeisure)
delta <- c(0, 0, 0, 0, 0,
dWorkScho, 0, 0, 0, 0,
dWorkShop, dSchoShop, 0, 0, 0,
dWorkPriv, dSchoPriv, dShopPriv, 0, 0,
dWorkLeis, dSchoLeis, dShopLeis, dPrivLeis, 0)
delta <- matrix(delta, nrow=nAlt, ncol=nAlt, byrow=TRUE)
emdc_settings <- list(continuousChoice = continuousChoice,
avail = ones,
utilityOutside = 0,
utilities = utilities,
budget = 24,
sigma = sigma,
gamma = gamma,
delta = delta,
cost = ones)
P[["model"]] = apollo_emdc(emdc_settings, functionality)
### Comment out as necessary
P = apollo_panelProd(P, apollo_inputs, functionality)
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
# ################################################################# #
#### MODEL ESTIMATION & OUTPUT ####
# ################################################################# #
model = apollo_estimate(apollo_beta, apollo_fixed,
apollo_probabilities, apollo_inputs)
apollo_modelOutput(model)
apollo_saveOutput(model)
# ################################################################# #
#### PREDICTION ####
# ################################################################# #
#############################Use the model estimates to predict and see whether it is close to what is observed
model <- apollo_loadModel(apollo_control$modelName)
apollo_inputs <- apollo_validateInputs(database=database)
apollo_inputs$apollo_control$nCores <- 4
pred <- apollo_prediction(model, apollo_probabilities, apollo_inputs)
XObs <- cbind(work = database$t_a02/60,
school = database$t_a03/60,
shopping = database$t_a04/60,
private = database$t_a05/60,
leisure = database$t_leisure/60)
XPre <- pred[,4:8]
round(sqrt(colMeans((XObs - XPre)^2)),2) # RMSE per product: 3.59 0.81 1.96 1.86 3.74
round(sqrt(mean((colSums(XObs) - colSums(XPre))^2)),2) # 372.38
###############Now use predicted choice to estimate again
database$t_a02 = XPre$work*60
database$t_a03 = XPre$school*60
database$t_a04 = XPre$shopping*60
database$t_a05 = XPre$private*60
database$t_leisure = XPre$leisure*60
### Parameters starting values c(name1=value1, name2=value2, ...)
apollo_beta = c( sigma = 1.944, ###Here I remove aFemale
# Satiation
gWork = 3.2425,
gSchool = 3.7222,
gShopping= 0.3727,
gPrivate = 0.6274,
gLeisure = 1.5038,
# Base utility
bWork =-3.6871, bSchool =-7.4054,
bShopping =-4.0058, bPrivate =-4.5592,
bLeisure =-3.5019, bWork_FT = 1.2540,
bWork_wknd=-2.9429, bSchool_young= 1.8492,
bLeisure_wknd= 0.3962,
# Compl/subst
# dWorkScho=-0.008, dWorkShop= 0.000,
# dWorkPriv= 0.000, dWorkLeis= 0.000,
# dSchoShop= 0.000, dSchoPriv= 0.000,
# dSchoLeis= 0.000, dShopPriv= 0.010,
# dShopLeis= 0.012, dPrivLeis= 0.012
dWorkScho=-0.000, dWorkShop= 0.000,
dWorkPriv= 0.000, dWorkLeis= 0.000,
dSchoShop= 0.000, dSchoPriv= 0.000,
dSchoLeis= 0.000, dShopPriv= 0.000,
dShopLeis= 0.000, dPrivLeis= 0.000)
### Names of fixed parameters
apollo_fixed = c('dWorkShop', 'dWorkPriv', 'dSchoShop',
'dSchoPriv', 'dSchoLeis', 'dWorkLeis' ,
"dWorkScho" , "dShopPriv" , "dShopLeis" , "dPrivLeis") #############Here I also fix sigma to the true value 0.71
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities=function(apollo_beta, apollo_inputs,
functionality="estimate"){
### Initialise
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P = list()
### Prepare Inputs
alts = c("work", "school", "shopping", "private", "leisure")
nAlt = length(alts)
ones = setNames(as.list(rep(1, nAlt)), alts)
continuousChoice = list(work = t_a02/60,
school = t_a03/60,
shopping = t_a04/60,
private = t_a05/60,
leisure = t_leisure/60)
utilities = list(
work = bWork + bWork_FT*occ_full_time + bWork_wknd*weekend,
school = bSchool + bSchool_young*(age<=30),
shopping = bShopping,
private = bPrivate,
leisure = bLeisure + bLeisure_wknd*weekend
)
gamma = list(work = gWork,
school = gSchool,
shopping = gShopping,
private = gPrivate,
leisure = gLeisure)
delta <- c(0, 0, 0, 0, 0,
dWorkScho, 0, 0, 0, 0,
dWorkShop, dSchoShop, 0, 0, 0,
dWorkPriv, dSchoPriv, dShopPriv, 0, 0,
dWorkLeis, dSchoLeis, dShopLeis, dPrivLeis, 0)
delta <- matrix(delta, nrow=nAlt, ncol=nAlt, byrow=TRUE)
emdc_settings <- list(continuousChoice = continuousChoice,
avail = ones,
utilityOutside = 0,
utilities = utilities,
budget = 24,
sigma = sigma,
gamma = gamma,
delta = delta,
cost = ones)
P[["model"]] = apollo_emdc(emdc_settings, functionality)
### Comment out as necessary
P = apollo_panelProd(P, apollo_inputs, functionality)
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
# ################################################################# #
#### MODEL ESTIMATION & OUTPUT ####
# ################################################################# #
model = apollo_estimate(apollo_beta, apollo_fixed,
apollo_probabilities, apollo_inputs)
apollo_modelOutput(model)
Re: Any example to simulate data for EMDC model?
Hi,
Sorry for the long delay in our reply.
TWayne wrote: 07 Dec 2023, 07:21
I tried to use this function to simulate/predict. I did realize that the prediction takes much longer time (the estimation takes about 1 minute, while the prediction takes about 30 minutes using the data in the sample code). I believe it is because it requires a lot of draws. Thus, I am wondering what is the minimum number of draws required for prediction and how I can change that?
When you simulate, you only need one draw. However, the forecasting code might not work if you only use one draw (it assumes that it must average over draws, and averaging over a single element may lead to issues). So if you want to simulate data, I would recommend you use a very small number of draws (e.g. 2). Sadly, we have not implemented a rawPrediction option in emdc yet (only the mdcev model has it). But it is actually a good idea, and we will put it on our "to do" list.
Best wishes
David
Re: Any example to simulate data for EMDC model?
Hi,
So about the following post:
TWayne wrote: 11 Dec 2023, 21:15
I also tried to use the simulated data to estimate the model again to see whether the true parameters can be recovered. Specifically, what I did was:
Given the sample code, I obtained the true parameters (I also fixed all the complementarity/substitution deltas to be 0 to make it simpler). Then I used these parameters to do the simulation (prediction). After that, I used the simulated choices to estimate the model again to see whether I can get the same set of estimates (I also used the true parameters as the starting value in this step). However, I cannot seem to get the same estimates. Did I do it correctly?
Here is my code. Thank you!
(...)
If I understand correctly, what you want to do is use the parameters defined in lines 41 - 65 to simulate data, and then estimate on that data and recover the parameters. If so, then you should not estimate the model in line 140, because this will lead to a change in the parameters. You should instead estimate it with a limit of zero iterations. You can do that as follows:
Code:
estimate_settings=list(maxIterations=0, estimationRoutine="bfgs")
model = apollo_estimate(apollo_beta, apollo_fixed,
apollo_probabilities, apollo_inputs, estimate_settings)
I believe that should help with the recovery of the parameters.
Best wishes
David
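The workflow described above (keep the parameters at their chosen "true" values, simulate, then re-estimate) can be sketched outside Apollo. The example below is only an analogy under stated assumptions: it uses a toy censored (Tobit-style) model rather than eMDC, base R's `optim` rather than `apollo_estimate`, and hypothetical names and values (`trueBeta`, `negLL`, `n`). It relies on `optim` with `method = "BFGS"` and `maxit = 0` returning the starting values unchanged, which mirrors the idea behind `maxIterations=0`.

```r
# Toy parameter-recovery sketch (NOT eMDC, NOT Apollo); all names are hypothetical.
set.seed(1)
n        <- 5000
x        <- rnorm(n)
trueBeta <- c(b0 = 0.5, b1 = 1.0, logSigma = log(0.7))

# Step 1: simulate ONE error draw per observation, keeping corner solutions
yStar <- trueBeta[["b0"]] + trueBeta[["b1"]] * x +
  exp(trueBeta[["logSigma"]]) * rnorm(n)
y <- pmax(yStar, 0)

# Average negative log-likelihood of the censored model
negLL <- function(beta, y, x) {
  mu    <- beta[1] + beta[2] * x
  sigma <- exp(beta[3])
  ll <- ifelse(y > 0,
               dnorm(y, mean = mu, sd = sigma, log = TRUE),   # interior solution
               pnorm(0, mean = mu, sd = sigma, log.p = TRUE)) # corner solution
  -mean(ll)
}

# Step 2: "estimation" with zero iterations leaves the parameters at their start values
fit0 <- optim(trueBeta, negLL, y = y, x = x, method = "BFGS",
              control = list(maxit = 0))

# Step 3: full estimation on the simulated data, starting away from the truth
fit <- optim(c(b0 = 0, b1 = 0, logSigma = 0), negLL, y = y, x = x,
             method = "BFGS")

print(rbind(true = trueBeta, zeroIter = fit0$par, estimated = fit$par))
```

In this sketch the simulated `y` contains exact zeros because a single error draw is used per observation. Re-estimating on predictions averaged over many draws would not be a like-for-like recovery test, since such data contain no corner solutions.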