Multiple Imputation and ICLV Model Estimation

hossain
Posts: 11
Joined: 09 Apr 2021, 05:22

Multiple Imputation and ICLV Model Estimation

Post by hossain »

Dear All,

I hope this message finds you well. We are working on a dataset which has some missing values. We want to perform multiple imputation to estimate the missing values and produce several datasets. Is it possible to use those multiply imputed datasets to estimate an ICLV model in Apollo, where the model coefficients are pooled estimates (from the multiple datasets)? Thank you!
dpalma
Posts: 190
Joined: 24 Apr 2020, 17:54

Re: Multiple Imputation and ICLV Model Estimation

Post by dpalma »

Hi,

I believe there are two ways to approach this problem: sequential and simultaneous.

Sequential approach
1) Estimate an auxiliary model to impute the missing data
2) Predict the value of the missing data using the auxiliary model. Save this new dataset (without any missing data) as, for example, db1.csv.
3) Repeat step (2) either with a different auxiliary model, or with the same model but adding noise to it. An example of adding noise: let's imagine your auxiliary model is a linear regression z = b1*x + e, where z is the missing data, x is another explanatory variable, and e is the error term. You could generate multiple predictions by simulating the value of e, drawing from its random distribution. After repeating this step M times you will end up with multiple datasets: db1, db2, db3, ..., dbM (M is the number of datasets you generated).
4) Stack all your datasets together (one on top of the other). You'll end up with a new big dataset with M*N rows (N is the number of observations in the original dataset), and the same number of variables (columns) as the original dataset.
5) Estimate your model on the big dataset.
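
As a rough illustration of steps (1) to (4), here is a minimal sketch in base R using the linear regression example. The data are simulated and every name in it (db, x, z, M, imputed, stacked, imputation) is hypothetical. It only draws the error term e, as described in step (3); a proper multiple imputation would usually also draw the regression parameters, which this sketch does not do.

```r
# Minimal sketch of the sequential approach with a linear auxiliary model.
# All data and variable names below are hypothetical.
set.seed(1)

# Toy dataset: z is partly missing, x is fully observed
N  <- 200
db <- data.frame(id = 1:N, x = rnorm(N))
db$z <- 0.8 * db$x + rnorm(N, sd = 0.5)
db$z[sample(N, 40)] <- NA

# Step 1: auxiliary model, estimated on the rows where z is observed
aux   <- lm(z ~ x, data = db)
sigma <- summary(aux)$sigma          # residual standard deviation of e
miss  <- is.na(db$z)

# Steps 2-3: M sets of predictions, each with a fresh draw of e
M <- 5
imputed <- lapply(seq_len(M), function(m) {
  d <- db
  d$z[miss] <- predict(aux, newdata = d[miss, ]) + rnorm(sum(miss), sd = sigma)
  d$imputation <- m                  # extra column recording the source dataset
  d
})

# Step 4: stack the M datasets into one with M*N rows
stacked <- do.call(rbind, imputed)
```

Each element of imputed could also be written to its own file (db1.csv, db2.csv, ...) with write.csv. The imputation column is an addition of this sketch and can be dropped before estimation.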

Simultaneous approach
You could estimate the auxiliary and main models together using simulated full information maximum likelihood. While preferable from a statistical point of view, this may be more difficult to implement and estimate: the particular implementation depends on what kind of models you are using, and the estimation will be more computationally intensive and more prone to getting stuck in local optima.

Best
David
hossain
Posts: 11
Joined: 09 Apr 2021, 05:22

Re: Multiple Imputation and ICLV Model Estimation

Post by hossain »

Hi Dr. Palma,

Thank you for your kind and informative reply. I really appreciate it.

Regards

Hossain
hossain
Posts: 11
Joined: 09 Apr 2021, 05:22

Re: Multiple Imputation and ICLV Model Estimation

Post by hossain »

Dear Dr. Palma,

Thank you for your previous reply. I have a further query regarding the sequential approach you mentioned. Suppose I have a dataset of 500 observations with a lot of missing values. I have done multiple imputation using the MICE package and, as an example, have produced 20 datasets. If I stack the data together, I get a dataset of 10,000 observations. I am concerned about running the ICLV model on the inflated dataset, as its statistical properties may not be the same as those of the original one. So, after producing the 20 datasets, I want to run the ICLV model 20 times, once per dataset.

Is it possible to run a loop, or is there a function that can run the ICLV model 20 times and then pool the estimates from the 20 ICLV models? Is it possible to pool the estimates in Apollo? I appreciate your time on this. Thank you!

Regards
Hossain
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Multiple Imputation and ICLV Model Estimation

Post by stephanehess »

Hossain

you could easily set this up as a loop. Are your 20 datasets in separate data.frames?

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
hossain
Posts: 11
Joined: 09 Apr 2021, 05:22

Re: Multiple Imputation and ICLV Model Estimation

Post by hossain »

Dear Dr. Hess,

The MICE package gives the data in a list format; however, I can convert them into separate data frames. I do not know how to set up the code in Apollo so that it gives me pooled estimates from the 20 datasets. As an example, here is a portion of the code:

database = read.csv("Use.csv", header=TRUE)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_beta_fixed = c() if none
apollo_fixed = c()

### Set parameters for generating draws
apollo_draws = list(
  interDrawsType = "halton",
  interNDraws    = 100,
  interUnifDraws = c(),
  interNormDraws = c("nu_n1","nu_n2","nu_n3"),

  intraDrawsType = '',
  intraNDraws    = 0,
  intraUnifDraws = c(),
  intraNormDraws = c()
)

### Create random parameters
apollo_randCoeff = function(apollo_beta, apollo_inputs){
  randcoeff = list()

  randcoeff[["LV1"]] = nu_n1 + a_woman1 * Woman
  randcoeff[["LV2"]] = nu_n2 + a_income_L2 * income_low
  randcoeff[["LV3"]] = nu_n3 + a_education3 * education_college

  return(randcoeff)
}

###-----------------
  V = list()
  V[['no']]  = 0
  V[['yes']] = asc_yes + b_student*student + b_age*age + b_Woman*Woman + b_education*education_college +
               b_income_L*income_low + b_income_M*income_middle + b_employed*employ_fulltime +
               gamma1*LV1 + gamma2*LV2 + gamma3*LV3

  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives = c(no=0, yes=1),
    avail        = 1,
    choiceVar    = Use,
    V            = V
  )

  ### Compute probabilities for MNL model component
  P[["JUMP_use"]] = apollo_mnl(mnl_settings, functionality)

  ### Likelihood of the whole model
  P = apollo_combineModels(P, apollo_inputs, functionality)

  ### Take product across observation for same individual
  #P = apollo_panelProd(P, apollo_inputs, functionality)

  ### Average across inter-individual draws
  P = apollo_avgInterDraws(P, apollo_inputs, functionality)

  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}


I appreciate your time on this. Thank you!

Regards
Hossain
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Multiple Imputation and ICLV Model Estimation

Post by stephanehess »

Hi Hossain

let's assume you have a list called data, which contains the different versions of the dataset. Then you could do something like this:

The first part comes before the loop, with all the details you want in apollo_control for your model:

rm(list = ls())
library(apollo)
apollo_initialise()
apollo_control = list(
  ...
)

Then we load your list:

data = readRDS("overall_list.rds")

Then initialise a new list:

models = list()

Then loop over your datasets:

for(s in 1:length(data)){

  database = data[[s]]

  ### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_beta_fixed = c() if none
  apollo_fixed = c()

  ### Set parameters for generating draws
  apollo_draws = list(

  ...

  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
  }

  models[[s]] = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
}

You now have a new list called models, where each element contains the outputs for one of your datasets.

Stephane
hossain
Posts: 11
Joined: 09 Apr 2021, 05:22

Re: Multiple Imputation and ICLV Model Estimation

Post by hossain »

Hi Dr. Hess,

Thank you for your kind and detailed reply. I appreciate it.

I just need one more clarification. Given that I have m datasets, I will get m sets of outputs (for example, m estimates of the coefficient of a given predictor).

I was curious whether it is possible to pool the estimates, so that for m datasets I get a single set of estimates. One of the features of the MICE package is that, after imputing multiple datasets, I can get one regression output table. The documentation says this is done using Rubin's rule ("The pool() function combines the estimates from m repeated complete data analyses"). However, I cannot run the ICLV model in MICE.

Is it possible to do this pooling, or a similar type of operation, in Apollo for an ICLV model with multiple input datasets? I appreciate your time on this. Thank you!

Regards

Hossain
dpalma
Posts: 190
Joined: 24 Apr 2020, 17:54

Re: Multiple Imputation and ICLV Model Estimation

Post by dpalma »

Hi Hossain,

Sadly, and as far as I know, the models estimated using Apollo cannot be used as inputs to the pool function you mentioned. Therefore, you will have to apply Rubin's rule manually. The method was proposed in:
  • Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.
But is also described in section 2 of :
Sorry for not providing more detailed instructions, but I am not familiar with the method.

Cheers
David
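
For readers who want to try the manual route, below is a minimal sketch of Rubin's rules in base R, not an Apollo feature. The pooled estimate is the mean of the m estimates, and the total variance is T = W + (1 + 1/m)*B, where W is the mean of the squared standard errors (within-imputation variance) and B is the variance of the estimates across datasets (between-imputation variance). The function name pool_rubin and the toy numbers are hypothetical. The comment on how to fill the inputs from the models list assumes that each Apollo model object stores its estimates and robust standard errors as named vectors called estimate and robse, which should be checked against your Apollo version.

```r
# Sketch of Rubin's rules for m sets of estimates (hypothetical helper).
# est and se are m x k matrices: rows = imputed datasets, columns = parameters.
pool_rubin <- function(est, se) {
  m     <- nrow(est)
  qbar  <- colMeans(est)                       # pooled point estimates
  W     <- colMeans(se^2)                      # within-imputation variance
  B     <- apply(est, 2, var)                  # between-imputation variance
  Tvar  <- W + (1 + 1 / m) * B                 # total variance
  df    <- (m - 1) * (1 + W / ((1 + 1 / m) * B))^2
  tstat <- qbar / sqrt(Tvar)
  data.frame(estimate = qbar, se = sqrt(Tvar), t = tstat, df = df,
             p = 2 * pt(-abs(tstat), df))
}

# Toy inputs standing in for the results of m = 3 estimations.
# With Apollo, these might be built as (ASSUMPTION, check the field names):
#   est <- do.call(rbind, lapply(models, function(mod) mod$estimate))
#   se  <- do.call(rbind, lapply(models, function(mod) mod$robse))
est <- rbind(c(asc_yes = 1.0, gamma1 = -0.5),
             c(asc_yes = 1.2, gamma1 = -0.4),
             c(asc_yes = 0.8, gamma1 = -0.6))
se  <- rbind(c(asc_yes = 0.1, gamma1 = 0.2),
             c(asc_yes = 0.1, gamma1 = 0.2),
             c(asc_yes = 0.1, gamma1 = 0.2))

pooled <- pool_rubin(est, se)
print(pooled)
```

Parameters that were kept fixed have no standard error and would need to be dropped from both matrices before pooling.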