Structuring a dataset with forced choice followed by unforced/opt-out
-
- Posts: 7
- Joined: 05 May 2021, 00:05
Structuring a dataset with forced choice followed by unforced/opt-out
Hello there,
I would like to know how to code a dataset in which the participant first faces a forced choice (Program A or Program B) and then gets an opportunity to opt out (Program A or not participate at all; Program B or not participate at all). The "DCE_forced and opt-out" image shows what the participant might see in the survey; only one of the opt-out questions is shown, depending on whether Program A or Program B was chosen in the forced question. My full dataset is available at the OSF: https://osf.io/jfz5h, and I've also included a screenshot of the dataset to give you a sense of how I've currently structured it.
For example, Program A is represented by goal1, form1, mag1, dir1, and Program B by goal2, form2, mag2, dir2. At present, opting out is represented by goal3, form3, mag3, dir3 (all with the value "SQ"). Column D (choice_forced) records whether the participant selected Program A (=1) or Program B (=2); Column E (choice_tx) records whether the participant chose to participate in the selected Program (=1) or opted out (=0); and Column F (choice_best) records the participant's first choice overall (Program A=1, Program B=2, opt-out=0). I would appreciate any guidance on how to code these opt-out data in the dataset.
Thanks!
My-Linh
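The column definitions in the post above imply that choice_best should equal choice_forced when the respondent participates and 0 when they opt out. A minimal base-R sketch of that consistency check follows; the four rows are invented for illustration and are not taken from the OSF dataset.

```r
# Toy data using the column names from the post; the values are invented.
toy <- data.frame(
  ID            = c(1, 1, 2, 2),
  task          = c(1, 2, 1, 2),
  choice_forced = c(1, 2, 2, 1),  # 1 = Program A, 2 = Program B
  choice_tx     = c(1, 0, 1, 0)   # 1 = participate, 0 = opt out
)

# choice_best as described: the forced pick if participating, 0 if opting out
toy$choice_best <- toy$choice_forced * toy$choice_tx

# Consistency check that could be run on the real data before modelling
consistent <- with(toy, all(choice_best == choice_forced * choice_tx))
```

If the same check returns FALSE on the real data, the three choice columns disagree somewhere and are worth inspecting before any model is estimated.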
-
- Site Admin
- Posts: 1050
- Joined: 24 Apr 2020, 16:29
Re: Structuring a dataset with forced choice followed by unforced/opt-out
Hi
this depends entirely on what you want to do in your model. Do you want to jointly model the choice of the preferred programme and then, for each programme, the decision on whether to participate or not? This is not difficult to do, but you would need to make assumptions about whether the preferences that drive the choice between the programmes are the same as those that determine acceptance or not of either programme. If you're happy with that, you could model this with three dependent variables: the choice between the programmes, and then the participation in each, with all three being binary models. If that is what you want, I can help you set it up.
Stephane
-
- Posts: 7
- Joined: 05 May 2021, 00:05
Re: Structuring a dataset with forced choice followed by unforced/opt-out
Hi Stephane!
Thanks for your response; yes, the approach you suggested is the direction we would like to go in. I greatly appreciate your assistance.
My-Linh
-
- Site Admin
- Posts: 1050
- Joined: 24 Apr 2020, 16:29
Re: Structuring a dataset with forced choice followed by unforced/opt-out
Hi
the easiest way to prepare the data would be one row per choice card, with the choice between the programmes and the participation questions all in the same row. Then you would just have a model with three components, probably all MNL to start with.
Stephane
-
- Posts: 7
- Joined: 05 May 2021, 00:05
Re: Structuring a dataset with forced choice followed by unforced/opt-out
Hi Stephane,
I am not sure I have understood your comments. As the data are currently structured, I do have one row per choice task (Column C: task), with the choice between the two programs (Column D: choice_forced) and the choice between participating in the selected program and opting out (Column E: choice_tx). At present the opt-out is coded as "SQ" across the attribute columns (Columns O:R), which I presume is not appropriate [please see the attached screenshot]. I have been able to run an MNL model for the forced choice between the programs, but it wasn't clear to me how to run the MNL model for the unforced (participation) choice, since I don't believe I've coded the data correctly to reflect the opt-out.
Should the data be structured differently, with the forced choice and the unforced choice in two separate datasets or in two different rows?
Thank you for your assistance!
My-Linh
- Attachment: DCE_data structure.png
-
- Site Admin
- Posts: 1050
- Joined: 24 Apr 2020, 16:29
Re: Structuring a dataset with forced choice followed by unforced/opt-out
Hi
please also show us your code
Thanks
-
- Posts: 7
- Joined: 05 May 2021, 00:05
Re: Structuring a dataset with forced choice followed by unforced/opt-out
Hi Stephane,
This is the code I used for an MNL with the forced choice model, using these data https://osf.io/jfz5h
Code:
### Load libraries
library(here)
library(readr)
library(apollo)
library(dplyr)
### Initialise code
apollo_initialise()
### Set core controls
apollo_control<-list(
modelName= "dce_model1",
modelDescr="MNL model on SP data",
indivID= "ID"
)
# ################################################################# #
#### 2. Data loading and apply any transformations ####
# ################################################################# #
database<-read_rds(here("01_data","02_processed", "00_data_processed.rds"))
# ####################################################### #
#### 3. Parameter definition ####
# ####################################################### #
### Vector of parameters, including any that are kept fixed
### during estimation
apollo_beta = c(
b_goal_30=0,
b_goal_60=0,
b_goal_90=0,
b_form_cash=0,
b_form_voucher=0,
b_form_donate=0,
b_mag_160=0,
b_mag_300=0,
b_mag_500=0,
b_dir_pos=0,
b_dir_neg=0
)
### Vector with names (in quotes) of parameters to be
### kept fixed at their starting value in apollo_beta.
### Use apollo_fixed = c() for no fixed parameters.
apollo_fixed<-c("b_goal_30","b_form_cash", "b_mag_160", "b_dir_pos")
#apollo_fixed <- c()
# ####################################################### #
#### 4. Input validation ####
# ####################################################### #
apollo_inputs = apollo_validateInputs()
# Several observations per individual detected based on the value of ID. Setting panelData in apollo_control set
# to TRUE.
# All checks on apollo_control completed.
# All checks on database completed.
# ####################################################### #
#### 5. Define Model and Likelihood definition ####
# ####################################################### #
apollo_probabilities=function(apollo_beta, apollo_inputs,
functionality="estimate"){
### Attach inputs and detach after function exit
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
### Create list of probabilities P
P = list()
### List of utilities: these must use the same names as
### in mnl_settings, order is irrelevant.
V = list()
V[['alt1']] = b_goal_30*(goal1==30) + b_goal_60*(goal1==60)+ b_goal_90*(goal1==90)+
b_form_cash*(form1=="cash")+b_form_donate*(form1=="donate")+b_form_voucher*(form1=="voucher")+
b_mag_160*(mag1==160)+b_mag_300*(mag1==300)+b_mag_500*(mag1==500)+
b_dir_pos*(dir1=="pos")+b_dir_neg*(dir1=="neg")
V[['alt2']] = b_goal_30*(goal2==30) + b_goal_60*(goal2==60)+ b_goal_90*(goal2==90)+
b_form_cash*(form2=="cash")+b_form_donate*(form2=="donate")+b_form_voucher*(form2=="voucher")+
b_mag_160*(mag2==160)+b_mag_300*(mag2==300)+b_mag_500*(mag2==500)+
b_dir_pos*(dir2=="pos")+b_dir_neg*(dir2=="neg")
# asc_1 +
# asc_2 +
### Define settings for MNL model component
mnl_settings = list(
alternatives = c(alt1=1, alt2=2),
avail = 1,
choiceVar = choice_forced,
# explanators = database[,c("risk_score","loss_score","intrinsic","extrinsic", "pain_NRS", "function_NRS",
# "IPAQ_cat",
# "gender", "age", "BMI_calc", "income")],
V=V)
### Compute probabilities using MNL model
P[['model']] = apollo_mnl(mnl_settings, functionality)
### Take product across observation for same individual
P = apollo_panelProd(P, apollo_inputs, functionality)
### Prepare and return outputs of function
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
# ####################################################### #
#### 6. Model estimation and reporting ####
# ####################################################### #
model = apollo_estimate(apollo_beta, apollo_fixed,
apollo_probabilities,
apollo_inputs)
# Testing likelihood function...
# WARNING: Availability not provided (or some elements are NA). Full availability assumed.
#
# Overview of choices for MNL model component :
# alt1 alt2
# Times available 2294.00 2294.00
# Times chosen 1067.00 1227.00
# Percentage chosen overall 46.51 53.49
# Percentage chosen when available 46.51 53.49
#
# Pre-processing likelihood function...
# Preparing pre-processing report
#
# Testing influence of parameters.......
# Starting main estimation
# Initial function value: -1590.08
# Initial gradient value:
# b_goal_60 b_goal_90 b_form_voucher b_form_donate b_mag_300 b_mag_500 b_dir_neg
# 30.5 -186.5 138.0 -485.0 22.5 152.5 8.0
# initial value 1590.079632
# iter 2 value 1268.441533
# iter 3 value 1202.052010
# iter 4 value 1200.157902
# iter 5 value 1195.722906
# iter 6 value 1191.905540
# iter 7 value 1157.219204
# iter 8 value 1156.673704
# iter 9 value 1139.146684
# iter 10 value 1132.782311
# iter 11 value 1099.779611
# iter 12 value 1099.463659
# iter 13 value 1099.453357
# iter 14 value 1099.451959
# iter 15 value 1099.451820
# iter 15 value 1099.451818
# iter 15 value 1099.451818
# final value 1099.451818
# converged
# Estimated parameters:
# Estimate
# b_goal_30 0.00000
# b_goal_60 -0.19134
# b_goal_90 -0.86738
# b_form_cash 0.00000
# b_form_voucher -0.51883
# b_form_donate -1.96009
# b_mag_160 0.00000
# b_mag_300 0.63871
# b_mag_500 0.84113
# b_dir_pos 0.00000
# b_dir_neg -0.06212
#
# Computing covariance matrix using analytical gradient.
# 0%....25%....50%....75%....100%
# Negative definite Hessian with maximum eigenvalue: -77.850397
# Computing score matrix...
# Calculating LL(0) for applicable models...
# Calculating LL of each model component...
apollo_modelOutput(model, list (printPVal=TRUE))
# Model run using Apollo for R, version 0.2.5 on Windows by My-Linh
# www.ApolloChoiceModelling.com
#
# Model name : dce_model1
# Model description : MNL model on SP data
# Model run at : 2021-06-11 16:53:46
# Estimation method : bfgs
# Model diagnosis : successful convergence
# Number of individuals : 288
# Number of rows in database : 2294
# Number of modelled outcomes : 2294
#
# Number of cores used : 1
# Model without mixing
#
# LL(start) : -1590.08
# LL(0) : -1590.08
# LL(final) : -1099.452
# Rho-square (0) : 0.3086
# Adj.Rho-square (0) : 0.3042
# AIC : 2212.9
# BIC : 2253.07
#
#
# Estimated parameters : 7
# Time taken (hh:mm:ss) : 00:00:2.07
# pre-estimation : 00:00:0.8
# estimation : 00:00:0.71
# post-estimation : 00:00:0.56
# Iterations : 17
# Min abs eigenvalue of Hessian : 77.8504
#
# Estimates:
# Estimate s.e. t.rat.(0) p(1-sided) Rob.s.e. Rob.t.rat.(0) p(1-sided)
# b_goal_30 0.00000 NA NA NA NA NA NA
# b_goal_60 -0.19134 0.07720 -2.478 0.006599 0.08397 -2.279 0.01134
# b_goal_90 -0.86738 0.07617 -11.388 0.000000 0.09436 -9.192 0.00000
# b_form_cash 0.00000 NA NA NA NA NA NA
# b_form_voucher -0.51883 0.07106 -7.301 1.428e-13 0.07595 -6.831 4.206e-12
# b_form_donate -1.96009 0.08633 -22.705 0.000000 0.11333 -17.295 0.00000
# b_mag_160 0.00000 NA NA NA NA NA NA
# b_mag_300 0.63871 0.08125 7.861 1.887e-15 0.07983 8.001 6.661e-16
# b_mag_500 0.84113 0.07464 11.269 0.000000 0.08854 9.500 0.00000
# b_dir_pos 0.00000 NA NA NA NA NA NA
# b_dir_neg -0.06212 0.05372 -1.157 0.123727 0.05476 -1.135 0.12828
-
- Site Admin
- Posts: 1050
- Joined: 24 Apr 2020, 16:29
Re: Structuring a dataset with forced choice followed by unforced/opt-out
Thanks. Two more questions
1. Do you want to model both follow-up questions, i.e. for each alternative? If so, what are the columns?
2. Do you want to allow for a scale difference in the utility for the follow-up question compared to the forced choice?
Stephane
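To make the second question concrete, here is one way the two stages could be written as binary logit probabilities. This is a sketch under assumptions, not the model specified in the thread: V_A and V_B are the programme utilities, V_SQ is the status quo utility (normalised to zero here as an assumption), j is the programme picked in the forced question, and \mu is a hypothetical scale parameter applied to the follow-up utilities.

```latex
P(\text{forced} = A) = \frac{e^{V_A}}{e^{V_A} + e^{V_B}},
\qquad
P(\text{participate} \mid j) = \frac{e^{\mu V_j}}{e^{\mu V_j} + e^{\mu V_{SQ}}},
\quad j \in \{A, B\}, \; V_{SQ} = 0
```

With \mu = 1 the follow-up question uses the same scale as the forced choice; estimating \mu allows the error variance to differ between the two questions.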
-
- Posts: 7
- Joined: 05 May 2021, 00:05
Re: Structuring a dataset with forced choice followed by unforced/opt-out
Hi Stephane,
(1) I don't think it makes sense to model both follow-up questions, since participants only saw one of them, depending on which Program they selected (i.e. based on the question logic, they would only see Q5.11 if they selected Program A in the previous question, and only Q5.12 if they selected Program B). Options A, B and opt-out were intentionally never presented together, to avoid a participant opting out of every question and providing no preference data, so there aren't enough data for a ranking (e.g. best-worst DCE). Having the preferred treatment (Program A or Program B) and the status quo available in one column might make more sense to model than treatment vs. no treatment. Does it make more sense to use the "choice_best" column to model the unforced choice? In "choice_best", 0=status quo, 1=Program A, 2=Program B, whereas in "choice_tx", 0=status quo, 1=treatment (Program A or Program B).
(2) I think it makes sense to allow for a scale difference for the follow-up question.
Thanks for helping me think this through carefully!
My-Linh
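One possible reading of the columns described above is that the follow-up question is a choice between the programme picked in the forced question and the status quo. It could then be coded on three alternatives with only two available per row. A hedged base-R sketch follows; choice_followup, avail_A and avail_B are hypothetical column names introduced here, and the toy values are invented.

```r
# Toy rows using the post's coding: choice_forced 1/2, choice_tx 1/0
toy <- data.frame(
  choice_forced = c(1, 2, 2, 1),
  choice_tx     = c(1, 0, 1, 0)
)

# Follow-up outcome on three alternatives: 1 = A, 2 = B, 3 = status quo
toy$choice_followup <- ifelse(toy$choice_tx == 1, toy$choice_forced, 3)

# Only the programme picked in the forced question is offered again
toy$avail_A <- as.numeric(toy$choice_forced == 1)
toy$avail_B <- as.numeric(toy$choice_forced == 2)

# Sanity check: a chosen programme must have been available in that row
chosen_ok <- with(toy, all(
  (choice_followup != 1 | avail_A == 1) &
  (choice_followup != 2 | avail_B == 1)
))
```

Coding the status quo as 3 rather than 0 keeps the follow-up variable on the same alternative numbering as the programmes, which is the form a three-alternative MNL component would expect.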
-
- Site Admin
- Posts: 1050
- Joined: 24 Apr 2020, 16:29
Re: Structuring a dataset with forced choice followed by unforced/opt-out
sorry, not sure I follow, but it seems straightforward to me to just model what is in the survey.
So you would have a utility for A and for B, and then you would model the preferred option out of A and B, followed by the choice between the chosen one and the status quo.
Something a bit like this
Code:
### Create list of probabilities P
P = list()
### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
V = list()
V[['A']] = ...
V[['B']] = ...
V[['SQ']] = 0
### Compute probabilities for preferred out of A and B
mnl_settings = list(
alternatives = c(A=1, B=2, SQ=3),
avail = list(A=1, B=1, SQ=0),
choiceVar = choice_best,
V = V
)
P[['choice_best']] = apollo_mnl(mnl_settings, functionality)
### Compute probabilities for follow-up choice between the chosen option and SQ
### (mu_forced is a scale parameter and needs to be included in apollo_beta)
mnl_settings$avail = list(A=(choice_best==1), B=(choice_best==2), SQ=1)
mnl_settings$choiceVar = (choice_forced==1)*choice_best+(choice_forced==2)*3
mnl_settings$V = lapply(V,"*",mu_forced)
P[['forced']] = apollo_mnl(mnl_settings, functionality)
### Combined model
P = apollo_combineModels(P, apollo_inputs, functionality)
### Take product across observation for same individual
P = apollo_panelProd(P, apollo_inputs, functionality)