Structuring a dataset with forced choice followed by unforced/opt-out

theycallmemylinh
Posts: 7
Joined: 05 May 2021, 00:05

Structuring a dataset with forced choice followed by unforced/opt-out

Post by theycallmemylinh »

Hello there,

I would like to know how to code a dataset in which the participant first makes a forced choice (Program A or Program B) and is then given an opportunity to opt out (Program A or not participate at all; Program B or not participate at all). The DCE_forced and opt-out image shows what the participant might see in the survey (only one of the opt-out questions is shown, depending on whether the participant chose Program A or Program B in the forced choice).
DCE_forced and opt-out.png
My full dataset is available at the OSF: https://osf.io/jfz5h, and I've also included a screenshot of the dataset to give you a sense of how I've currently structured the dataset.

For example, Program A is represented by goal1, form1, mag1, dir1, and Program B by goal2, form2, mag2, dir2. At present, if the participant opted out of participation, I have that represented by goal3, form3, mag3, dir3 (all with the value "SQ"). Column D (choice_forced) records whether the participant selected Program A (=1) or Program B (=2); Column E (choice_tx) records whether the participant chose to participate in either Program (=1) or to opt out (=0); and Column F (choice_best) records the participant's first choice (Program A=1, Program B=2, opt-out=0).
DCE_dataset_screenshot.png
I appreciate any guidance you can provide on how to code these opt-out data in the dataset.

Thanks!
My-Linh
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Structuring a dataset with forced choice followed by unforced/opt-out

Post by stephanehess »

Hi

this depends entirely on what you want to do in your model. Do you want to jointly model the choice of the preferred programme and then, for each programme, the decision on whether to participate or not? This is not difficult to do, but you would need to make assumptions about whether the preferences that drive the choice between the programmes are the same as those that determine acceptance or not of either programme. If you're happy with that, you could model this as having three dependent variables: the choice between the programmes, and then the participation in each. All three would be binary models. If that is what you want, I can help you set it up.

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk

Post by theycallmemylinh »

Hi Stephane!

Thanks for your response; yes, the suggestion you made is the direction we would like to go in. I greatly appreciate your assistance.

My-Linh

Post by stephanehess »

Hi

the easiest way to prepare the data would be one row per choice card, with the choice between the programmes and the participation questions all in the same row. Then you would just have a model with three components, probably all MNL to start with.

Stephane

Post by theycallmemylinh »

Hi Stephane,

I am not sure I have understood your comments. As the data are currently structured, I do have one row per choice task (Column C: task), with the choice between the two programs (Column D: choice_forced) and the choice between participating in the selected program and opting out (Column E: choice_tx). At present the unforced choice/opt-out is coded in the data as "SQ" across the different attribute levels (Columns O:R), which I presume is not appropriate [please see the attached screenshot]. I have been able to run an MNL model for the forced choice between the two programs, but I wasn't clear on how to run the MNL model for the unforced choice (participation in the selected program), since I don't believe I've coded the data correctly to reflect the opt-out choice.

Should the data be structured in a different way such that the forced choice and unforced choice are in two separate datasets or two different rows?

Thank you for your assistance!
My-Linh
Attachments
DCE_data structure.png

Post by stephanehess »

Hi

please also show us your code

Thanks

Post by theycallmemylinh »

Hi Stephane,

This is the code I used for an MNL with the forced choice model, using these data https://osf.io/jfz5h

Code:

### Load libraries
library(here)
library(readr)
library(apollo)
library(dplyr)
### Initialise code
apollo_initialise()

### Set core controls
apollo_control<-list(
  modelName= "dce_model1",
  modelDescr="MNL model on SP data",
  indivID= "ID"
)

# ################################################################# #
#### 2. Data loading and apply any transformations               ####
# ################################################################# #
database<-read_rds(here("01_data","02_processed", "00_data_processed.rds")) 
 

# ####################################################### #
#### 3. Parameter definition                           ####
# ####################################################### #

### Vector of parameters, including any that are kept fixed 
### during estimation

apollo_beta = c(
  b_goal_30=0,
  b_goal_60=0,
  b_goal_90=0,
  b_form_cash=0,
  b_form_voucher=0,
  b_form_donate=0,
  b_mag_160=0,
  b_mag_300=0,
  b_mag_500=0,
  b_dir_pos=0,
  b_dir_neg=0
  )

### Vector with names (in quotes) of parameters to be
###  kept fixed at their starting value in apollo_beta.
### Use apollo_beta_fixed = c() for no fixed parameters.
apollo_fixed<-c("b_goal_30","b_form_cash", "b_mag_160", "b_dir_pos")

#apollo_fixed <- c()
# ####################################################### #
#### 4. Input validation                               ####
# ####################################################### #

apollo_inputs = apollo_validateInputs()
# Several observations per individual detected based on the value of ID. Setting panelData in apollo_control set
# to TRUE.
# All checks on apollo_control completed.
# All checks on database completed.

# ####################################################### #
#### 5. Define Model and Likelihood definition                          ####
# ####################################################### #

apollo_probabilities=function(apollo_beta, apollo_inputs, 
                              functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### List of utilities: these must use the same names as
  ### in mnl_settings, order is irrelevant.
  V = list()
  V[['alt1']] = b_goal_30*(goal1==30) + b_goal_60*(goal1==60)+ b_goal_90*(goal1==90)+
    b_form_cash*(form1=="cash")+b_form_donate*(form1=="donate")+b_form_voucher*(form1=="voucher")+
    b_mag_160*(mag1==160)+b_mag_300*(mag1==300)+b_mag_500*(mag1==500)+
    b_dir_pos*(dir1=="pos")+b_dir_neg*(dir1=="neg")
  V[['alt2']] = b_goal_30*(goal2==30) + b_goal_60*(goal2==60)+ b_goal_90*(goal2==90)+
    b_form_cash*(form2=="cash")+b_form_donate*(form2=="donate")+b_form_voucher*(form2=="voucher")+
    b_mag_160*(mag2==160)+b_mag_300*(mag2==300)+b_mag_500*(mag2==500)+
    b_dir_pos*(dir2=="pos")+b_dir_neg*(dir2=="neg")
  
  # asc_1 + 
  # asc_2 +
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives  = c(alt1=1, alt2=2), 
    avail         = 1, 
    choiceVar     = choice_forced,
    # explanators  = database[,c("risk_score","loss_score","intrinsic","extrinsic", "pain_NRS", "function_NRS",
    #                            "IPAQ_cat", 
    #                            "gender", "age", "BMI_calc", "income")],
  V=V)
  
    ### Compute probabilities using MNL model
  P[['model']] = apollo_mnl(mnl_settings, functionality)
  
  ### Take product across observation for same individual
  P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}


# ####################################################### #
#### 6. Model estimation and reporting                 ####
# ####################################################### #

model = apollo_estimate(apollo_beta, apollo_fixed, 
                        apollo_probabilities, 
                        apollo_inputs)

# Testing likelihood function...
# WARNING: Availability not provided (or some elements are NA). Full availability assumed.
# 
# Overview of choices for MNL model component :
#                                    alt1    alt2
# Times available                  2294.00 2294.00
# Times chosen                     1067.00 1227.00
# Percentage chosen overall          46.51   53.49
# Percentage chosen when available   46.51   53.49
# 
# Pre-processing likelihood function...
# Preparing pre-processing report
# 
# Testing influence of parameters.......
# Starting main estimation
# Initial function value: -1590.08 
# Initial gradient value:
#   b_goal_60      b_goal_90 b_form_voucher  b_form_donate      b_mag_300      b_mag_500      b_dir_neg 
#     30.5         -186.5          138.0         -485.0           22.5          152.5            8.0 
# initial  value 1590.079632 
# iter   2 value 1268.441533
# iter   3 value 1202.052010
# iter   4 value 1200.157902
# iter   5 value 1195.722906
# iter   6 value 1191.905540
# iter   7 value 1157.219204
# iter   8 value 1156.673704
# iter   9 value 1139.146684
# iter  10 value 1132.782311
# iter  11 value 1099.779611
# iter  12 value 1099.463659
# iter  13 value 1099.453357
# iter  14 value 1099.451959
# iter  15 value 1099.451820
# iter  15 value 1099.451818
# iter  15 value 1099.451818
# final  value 1099.451818 
# converged
# Estimated parameters:
#   Estimate
# b_goal_30          0.00000
# b_goal_60         -0.19134
# b_goal_90         -0.86738
# b_form_cash        0.00000
# b_form_voucher    -0.51883
# b_form_donate     -1.96009
# b_mag_160          0.00000
# b_mag_300          0.63871
# b_mag_500          0.84113
# b_dir_pos          0.00000
# b_dir_neg         -0.06212
# 
# Computing covariance matrix using analytical gradient.
# 0%....25%....50%....75%....100%
# Negative definite Hessian with maximum eigenvalue: -77.850397
# Computing score matrix...
# Calculating LL(0) for applicable models...
# Calculating LL of each model component...

apollo_modelOutput(model, list (printPVal=TRUE))
# Model run using Apollo for R, version 0.2.5 on Windows by My-Linh 
# www.ApolloChoiceModelling.com
# 
# Model name                       : dce_model1
# Model description                : MNL model on SP data
# Model run at                     : 2021-06-11 16:53:46
# Estimation method                : bfgs
# Model diagnosis                  : successful convergence 
# Number of individuals            : 288
# Number of rows in database       : 2294
# Number of modelled outcomes      : 2294
# 
# Number of cores used             :  1 
# Model without mixing
# 
# LL(start)                        : -1590.08
# LL(0)                            : -1590.08
# LL(final)                        : -1099.452
# Rho-square (0)                   :  0.3086 
# Adj.Rho-square (0)               :  0.3042 
# AIC                              :  2212.9 
# BIC                              :  2253.07 
# 
# 
# Estimated parameters             :  7
# Time taken (hh:mm:ss)            :  00:00:2.07 
# pre-estimation              :  00:00:0.8 
# estimation                  :  00:00:0.71 
# post-estimation             :  00:00:0.56 
# Iterations                       :  17  
# Min abs eigenvalue of Hessian    :  77.8504 
# 
# Estimates:
#                   Estimate        s.e.   t.rat.(0)  p(1-sided)    Rob.s.e. Rob.t.rat.(0)  p(1-sided)
# b_goal_30          0.00000          NA          NA          NA          NA            NA          NA
# b_goal_60         -0.19134     0.07720      -2.478    0.006599     0.08397        -2.279     0.01134
# b_goal_90         -0.86738     0.07617     -11.388    0.000000     0.09436        -9.192     0.00000
# b_form_cash        0.00000          NA          NA          NA          NA            NA          NA
# b_form_voucher    -0.51883     0.07106      -7.301   1.428e-13     0.07595        -6.831   4.206e-12
# b_form_donate     -1.96009     0.08633     -22.705    0.000000     0.11333       -17.295     0.00000
# b_mag_160          0.00000          NA          NA          NA          NA            NA          NA
# b_mag_300          0.63871     0.08125       7.861   1.887e-15     0.07983         8.001   6.661e-16
# b_mag_500          0.84113     0.07464      11.269    0.000000     0.08854         9.500     0.00000
# b_dir_pos          0.00000          NA          NA          NA          NA            NA          NA
# b_dir_neg         -0.06212     0.05372      -1.157    0.123727     0.05476        -1.135     0.12828
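
To illustrate how the dummy-coded estimates above translate into utilities and a choice probability, here is a minimal base-R sketch. Only the coefficient values come from the output above; the two programme profiles are made up for illustration, and the helper `utility` is a hypothetical name, not an Apollo function. The fixed base levels (goal 30, cash, magnitude 160, positive direction) contribute zero.

```r
# Estimates copied from the output above; base levels are fixed at 0
b <- c(goal_30 = 0, goal_60 = -0.19134, goal_90 = -0.86738,
       form_cash = 0, form_voucher = -0.51883, form_donate = -1.96009,
       mag_160 = 0, mag_300 = 0.63871, mag_500 = 0.84113,
       dir_pos = 0, dir_neg = -0.06212)

# Utility of a profile: sum of the coefficients for its attribute levels
utility <- function(goal, form, mag, dir) {
  unname(b[paste0("goal_", goal)] + b[paste0("form_", form)] +
         b[paste0("mag_", mag)]   + b[paste0("dir_", dir)])
}

# Two hypothetical programmes (not taken from the dataset)
V_A <- utility(goal = 60, form = "voucher", mag = 300, dir = "neg")
V_B <- utility(goal = 30, form = "cash",    mag = 160, dir = "pos")

# Binary logit probability of choosing A over B
P_A <- exp(V_A) / (exp(V_A) + exp(V_B))
print(c(V_A = V_A, V_B = V_B, P_A = P_A))
```

Because programme B sits entirely at the base levels, its utility is zero, and the probability of choosing A depends only on the sum of A's four coefficients.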

Post by stephanehess »

Thanks. Two more questions

1. Do you want to model both follow-up questions, i.e. for each alternative? If so, what are the columns?
2. Do you want to allow for a scale difference in the utility for the follow-up question compared to the forced choice?
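
For readers unfamiliar with a scale difference: in a logit model it multiplies every utility in one model component, making those choices more (or less) deterministic without changing the relative preferences. A minimal base-R sketch with made-up utilities follows; `logit_prob` and `mu` are illustrative names, not Apollo functions.

```r
# Logit choice probabilities for a vector of utilities V, scaled by mu
logit_prob <- function(V, mu = 1) {
  eV <- exp(mu * (V - max(V)))  # subtract max(V) for numerical stability
  eV / sum(eV)
}

V <- c(programme = 0.4, SQ = 0)   # hypothetical utilities

print(logit_prob(V, mu = 1))  # baseline scale
print(logit_prob(V, mu = 2))  # higher scale: choices are more deterministic
print(logit_prob(V, mu = 0))  # zero scale: choices are purely random
```

In a joint model, one component's scale is normalised to 1 and the other's is estimated relative to it.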

Stephane

Post by theycallmemylinh »

Hi Stephane,

(1) I don't think it makes sense to model both follow-up questions, since participants only saw one of the questions, based on which Program they selected (i.e. based on the question logic, they would only see Q5.11 if in the previous question they selected Program A, and only see Q5.12 if they selected Program B in the previous question).
DCE_forced and opt-out.png
Since options A, B and opt-out were never presented together (intentionally, to avoid a participant opting out of every question and providing no preference data), there aren't enough data for ranking (e.g. a best-worst DCE). However, having the preferred treatment (Program A or Program B) and the status quo in a single column might make more sense to model than treatment vs. no treatment. Would it make more sense to use the "choice_best" column to model the unforced choice? In the "choice_best" column, 0 = status quo, 1 = Program A, 2 = Program B, whereas in the "choice_tx" column, 0 = status quo, 1 = treatment (Program A or Program B).
DCE_data structure.png
(2) I think it makes sense to allow for a scale difference for the follow-up question

Thanks for helping me think this through carefully!
My-Linh

Post by stephanehess »

sorry, not sure I follow, but it seems straightforward to me to just model what is in the survey.

So you would have a utility for A and for B, and then you would model the preferred option out of A and B, followed by the choice between the chosen one and the status quo.

Something a bit like this

Code:

  ### Create list of probabilities P
  P = list()
  
  ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
  V = list()
  V[['A']] = ...
  V[['B']] = ...
  V[['SQ']] = 0

  ### Compute probabilities for preferred out of A and B
  mnl_settings = list(
    alternatives = c(A=1, B=2, SQ=3),
    avail        = list(A=1, B=1, SQ=0),
    choiceVar    = choice_best,
    V            = V
  )
  P[['choice_best']] = apollo_mnl(mnl_settings, functionality)
  
  ### Compute probabilities for the follow-up choice (chosen option vs SQ) using MNL model
  mnl_settings$avail        = list(A=(choice_best==1), B=(choice_best==2), SQ=1)
  mnl_settings$choiceVar    = (choice_forced==1)*choice_best+(choice_forced==2)*3
  mnl_settings$V            = lapply(V,"*",mu_forced)
  
  P[['forced']] = apollo_mnl(mnl_settings, functionality)

  ### Combined model
  P = apollo_combineModels(P, apollo_inputs, functionality)

  ### Take product across observation for same individual
  P = apollo_panelProd(P, apollo_inputs, functionality)
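
The snippet above uses choice_best for the option preferred out of A and B, and choice_forced for the follow-up. That appears to differ from the column coding described earlier in the thread (choice_forced = 1/2 for Program A/B, choice_tx = 1/0 for participate/opt out). One possible reading, which the thread does not confirm, is that the first component would use the poster's choice_forced and the second a recoded follow-up variable. Here is a minimal base-R sketch of that recode on made-up rows; `choice_followup`, `avail_A` and `avail_B` are hypothetical column names.

```r
# Made-up rows using the coding described in the thread:
# choice_forced: 1 = Program A, 2 = Program B
# choice_tx:     1 = participate in the chosen programme, 0 = opt out
database <- data.frame(
  ID            = c(1, 1, 2, 2),
  task          = c(1, 2, 1, 2),
  choice_forced = c(1, 2, 1, 2),
  choice_tx     = c(1, 1, 0, 0)
)

# Follow-up choice on the scale 1 = A, 2 = B, 3 = SQ (hypothetical column)
database$choice_followup <- ifelse(database$choice_tx == 1,
                                   database$choice_forced, 3)

# In the follow-up, only the programme chosen in the forced task
# and the status quo are on offer
database$avail_A <- as.numeric(database$choice_forced == 1)
database$avail_B <- as.numeric(database$choice_forced == 2)

print(database)
```

Under this reading, the first component would take choiceVar = choice_forced with SQ unavailable, and the second would take choiceVar = choice_followup with avail = list(A=avail_A, B=avail_B, SQ=1).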