Important: Read this before posting to this forum

  1. This forum is for questions related to the use of Apollo. We will answer some general choice modelling questions too, where appropriate, and time permitting. We cannot answer questions about how to estimate choice models with other software packages.
  2. There is a very detailed manual for Apollo available at http://www.ApolloChoiceModelling.com/manual.html. This contains detailed descriptions of the various Apollo functions, and numerous examples are available at http://www.ApolloChoiceModelling.com/examples.html. In addition, help files are available for all functions, using e.g. ?apollo_mnl
  3. Before asking a question on the forum, users are kindly requested to follow these steps:
    1. Check that the same issue has not already been addressed in the forum - there is a search tool.
    2. Ensure that the correct syntax has been used. For any function, detailed instructions are available directly in Apollo, e.g. by using ?apollo_mnl for apollo_mnl
    3. Check the frequently asked questions section on the Apollo website, which discusses some common issues/failures. Please see http://www.apollochoicemodelling.com/faq.html
    4. Make sure that R is using the latest official release of Apollo.
  4. If the above steps do not resolve the issue, then users should follow these steps when posting a question:
    1. provide full details on the issue, including the entire code and output, including any error messages
    2. posts will not immediately appear on the forum, but will be checked by a moderator first. This may take a day or two at busy times. There is no need to submit the post multiple times.

Position-specific constants for best vs worst choices

Ask questions about model specifications. Ideally include a mathematical explanation of your proposed model.
jayb
Posts: 3
Joined: 31 Jan 2023, 12:24

Position-specific constants for best vs worst choices

Post by jayb »

Hi!

I have specified a sequential BW LC model (no covariates). It is a Case 1 BW with 16 items in total, 8 choice sets and 4 items per choice set.

In the results, there is a non-significant trend towards lower estimates of the position-specific constants across alternatives (i.e. a hint of position bias!).

I'd like to explore this further by separating out the ASC for B & W choices, to see if there is a difference in the position effect between best and worst choices, but I can't figure out how to specify it.

Any help appreciated,

Best wishes,

Jay



Code:

# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS                       ####
# ################################################################# #

### Clear memory
rm(list = ls())

### Load libraries (the tidyverse is needed for the data preparation below)
library(apollo)
library(tidyverse)

### Initialise code
apollo_initialise()

### Set core controls
apollo_control = list(
  modelName       = "BW_2_LC_no_covariates",
  modelDescr      = "Simple LC model on BW choice data, no covariates in class allocation model",
  indivID         = "uuid",
  nCores          = 2,
  outputDirectory = "output"
)

# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
# ################################################################# #

### Load data from file and reshape it so that the best and worst
### choices for each task appear as separate columns
database <- read_csv("apollo/BW_LC_no_covariates.csv") %>%
  select(-c(cset, setno, alt1_avail:alt4_avail, csn)) %>%
  mutate(across(T1_1:T16_4,  ~ if_else(bw == 2, . * -1, .)),
         bw = if_else(bw == 1, "best", "worst")) %>%
  pivot_wider(names_from = bw, values_from = choice, names_prefix = "choice_") %>%
  relocate(c("choice_best", "choice_worst"), .before = "T1_1")


# ################################################################# #
#### DEFINE MODEL PARAMETERS                                     ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c(set_names(rep(0, 38),
                        c(paste0("asc_alt", 1:4),
                          paste0("beta_T", rep(1:16, each = 2), "_", letters[1:2]),
                          paste0("delta_", letters[1:2]))),
                mu_worst = 1)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c("asc_alt1", "beta_T2_a", "beta_T2_b", "delta_b")

# ################################################################# #
#### DEFINE LATENT CLASS COMPONENTS                              ####
# ################################################################# #

apollo_lcPars=function(apollo_beta, apollo_inputs){
  lcpars = list()

  lcpars[["beta_T1"]]  = list(beta_T1_a, beta_T1_b)
  lcpars[["beta_T2"]]  = list(beta_T2_a, beta_T2_b)
  lcpars[["beta_T3"]]  = list(beta_T3_a, beta_T3_b)
  lcpars[["beta_T4"]]  = list(beta_T4_a, beta_T4_b)
  lcpars[["beta_T5"]]  = list(beta_T5_a, beta_T5_b)
  lcpars[["beta_T6"]]  = list(beta_T6_a, beta_T6_b)
  lcpars[["beta_T7"]]  = list(beta_T7_a, beta_T7_b)
  lcpars[["beta_T8"]]  = list(beta_T8_a, beta_T8_b)
  lcpars[["beta_T9"]]  = list(beta_T9_a, beta_T9_b)
  lcpars[["beta_T10"]] = list(beta_T10_a, beta_T10_b)
  lcpars[["beta_T11"]] = list(beta_T11_a, beta_T11_b)
  lcpars[["beta_T12"]] = list(beta_T12_a, beta_T12_b)
  lcpars[["beta_T13"]] = list(beta_T13_a, beta_T13_b)
  lcpars[["beta_T14"]] = list(beta_T14_a, beta_T14_b)
  lcpars[["beta_T15"]] = list(beta_T15_a, beta_T15_b)
  lcpars[["beta_T16"]] = list(beta_T16_a, beta_T16_b)

  V=list()
  V[["class_a"]] = delta_a
  V[["class_b"]] = delta_b

  classAlloc_settings = list(
    classes      = c(class_a = 1, class_b = 2),
    utilities    = V
  )

  lcpars[["pi_values"]] = apollo_classAlloc(classAlloc_settings)

  return(lcpars)
}

# ################################################################# #
#### GROUP AND VALIDATE INPUTS                                   ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality = "estimate"){

  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))

  ### Create list of probabilities P
  P = list()
  P_bw = list()

  ### Define settings for MNL model component that are generic across classes
  mnl_settings = list(
    alternatives = c(alt1 = 1, alt2 = 2, alt3 = 3, alt4 = 4)
  )

  ### Loop over classes
  for(s in 1:2){

    ### Compute class-specific utilities
    V=list()
    V[["alt1"]] = asc_alt1 + beta_T1[[s]]*T1_1 + beta_T2[[s]]*T2_1 + beta_T3[[s]]*T3_1 + beta_T4[[s]]*T4_1 +
      beta_T5[[s]]*T5_1 + beta_T6[[s]]*T6_1 + beta_T7[[s]]*T7_1 +
      beta_T8[[s]]*T8_1 + beta_T9[[s]]*T9_1 + beta_T10[[s]]*T10_1 +
      beta_T11[[s]]*T11_1 + beta_T12[[s]]*T12_1 + beta_T13[[s]]*T13_1 +
      beta_T14[[s]]*T14_1 + beta_T15[[s]]*T15_1 + beta_T16[[s]]*T16_1
    V[["alt2"]] = asc_alt2 + beta_T1[[s]]*T1_2 + beta_T2[[s]]*T2_2 + beta_T3[[s]]*T3_2 + beta_T4[[s]]*T4_2 +
      beta_T5[[s]]*T5_2 + beta_T6[[s]]*T6_2 + beta_T7[[s]]*T7_2 +
      beta_T8[[s]]*T8_2 + beta_T9[[s]]*T9_2 + beta_T10[[s]]*T10_2 +
      beta_T11[[s]]*T11_2 + beta_T12[[s]]*T12_2 + beta_T13[[s]]*T13_2 +
      beta_T14[[s]]*T14_2 + beta_T15[[s]]*T15_2 + beta_T16[[s]]*T16_2
    V[["alt3"]] = asc_alt3 + beta_T1[[s]]*T1_3 + beta_T2[[s]]*T2_3 + beta_T3[[s]]*T3_3 + beta_T4[[s]]*T4_3 +
      beta_T5[[s]]*T5_3 + beta_T6[[s]]*T6_3 + beta_T7[[s]]*T7_3 +
      beta_T8[[s]]*T8_3 + beta_T9[[s]]*T9_3 + beta_T10[[s]]*T10_3 +
      beta_T11[[s]]*T11_3 + beta_T12[[s]]*T12_3 + beta_T13[[s]]*T13_3 +
      beta_T14[[s]]*T14_3 + beta_T15[[s]]*T15_3 + beta_T16[[s]]*T16_3
    V[["alt4"]] = asc_alt4 + beta_T1[[s]]*T1_4 + beta_T2[[s]]*T2_4 + beta_T3[[s]]*T3_4 + beta_T4[[s]]*T4_4 +
      beta_T5[[s]]*T5_4 + beta_T6[[s]]*T6_4 + beta_T7[[s]]*T7_4 +
      beta_T8[[s]]*T8_4 + beta_T9[[s]]*T9_4 + beta_T10[[s]]*T10_4 +
      beta_T11[[s]]*T11_4 + beta_T12[[s]]*T12_4 + beta_T13[[s]]*T13_4 +
      beta_T14[[s]]*T14_4 + beta_T15[[s]]*T15_4 + beta_T16[[s]]*T16_4

    ### Compute probabilities for "best" choice using MNL model
    mnl_settings$avail = list(alt1 = 1, alt2 = 1, alt3 = 1, alt4 = 1)
    mnl_settings$choiceVar = choice_best
    mnl_settings$V = V
    mnl_settings$componentName = paste0("Best_Class_", s)

    ### Compute within-class choice probabilities using MNL model
    P_bw[["best"]] = apollo_mnl(mnl_settings, functionality)

    ### Take product across observations for same individual
    P_bw[["best"]] = apollo_panelProd(P_bw[["best"]], apollo_inputs, functionality)

    ### Compute probabilities for "worst" choice using MNL model
    mnl_settings$avail = list(alt1 = (choice_best != 1), alt2 = (choice_best != 2), alt3 = (choice_best != 3), alt4 = (choice_best != 4))
    mnl_settings$choiceVar = choice_worst
    mnl_settings$V = lapply(V, "*", -mu_worst)
    mnl_settings$componentName = paste0("Worst_Class_", s)

    ### Compute within-class choice probabilities using MNL model
    P_bw[["worst"]] = apollo_mnl(mnl_settings, functionality)

    ### Take product across observations for same individual
    P_bw[["worst"]] = apollo_panelProd(P_bw[["worst"]], apollo_inputs, functionality)

    P[[paste0("Class_", s)]] = apollo_combineModels(P_bw, apollo_inputs, functionality)$model

  }

  ### Compute latent class model probabilities
  lc_settings   = list(inClassProb = P, classProb = pi_values)
  P[["model"]] = apollo_lc(lc_settings, apollo_inputs, functionality)

  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ################################################################# #
#### MODEL ESTIMATION                                            ####
# ################################################################# #

### Optional starting values search
# apollo_beta=apollo_searchStart(apollo_beta, apollo_fixed,apollo_probabilities, apollo_inputs)

model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)

# ################################################################# #
#### MODEL OUTPUTS                                               ####
# ################################################################# #

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN)                               ----
# ----------------------------------------------------------------- #

apollo_modelOutput(model)

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name)               ----
# ----------------------------------------------------------------- #

apollo_saveOutput(model)
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Position-specific constants for best vs worst choices

Post by stephanehess »

Hi

so the easiest thing would be to write the utility functions separately for best and worst, with the only difference being the ASCs and obviously the multiplication by -mu_worst
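
In code, that could look like the following fragment (a sketch only, intended to sit inside apollo_probabilities; the asc_alt2_B and asc_alt2_W names are illustrative, and only one attribute term is shown):

```r
### Illustrative fragment only: both stages share the item coefficients,
### but each stage carries its own ASC, and the worst-stage utility is
### multiplied by -mu_worst
V_best[["alt2"]]  = asc_alt2_B + beta_T1[[s]]*T1_2
V_worst[["alt2"]] = -mu_worst * (asc_alt2_W + beta_T1[[s]]*T1_2)
```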

Best wishes

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
jayb
Posts: 3
Joined: 31 Jan 2023, 12:24

Re: Position-specific constants for best vs worst choices

Post by jayb »

**EDIT:** I've solved this by creating a new set of dummy variables for each attribute and position, and recoding the worst choice variable. I then used these new variables to define class-specific utilities for the worst choices, and set the `alternatives` argument of mnl_settings to have only 3 alternatives for the worst choice.


Hi Stephane

Thanks for the reply.

That worked, but I'm wondering if there's a different way. I can't make sense of why there'd be four ASCs for the worst choices (reminder: I'm assuming a sequential model). Would it make (more) sense to have three ASCs for worst, where ASC1 is the highest-positioned alternative remaining after the best choice has been made?

For example:

[BEST] = best choice
[ASCi_W] = ASC for the remaining alternatives

CS1
ALT_1 [BEST]
ALT_2 [ASC1_W]
ALT_3 [ASC2_W]
ALT_4 [ASC3_W]

CS2
ALT_1 [ASC1_W]
ALT_2 [BEST]
ALT_3 [ASC2_W]
ALT_4 [ASC3_W]

CS3
ALT_1 [ASC1_W]
ALT_2 [ASC2_W]
ALT_3 [BEST]
ALT_4 [ASC3_W]

CS4
ALT_1 [ASC1_W]
ALT_2 [ASC2_W]
ALT_3 [ASC3_W]
ALT_4 [BEST]


Does this make sense? If so, is there a way I can specify it?

Thanks!

Jay
Last edited by jayb on 14 Apr 2023, 10:31, edited 2 times in total.
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Position-specific constants for best vs worst choices

Post by stephanehess »

Hi

sorry, this is not completely clear. Can you explain your data a bit better again. You have four alternatives, and then do you do BWB, or BWW?

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
jayb
Posts: 3
Joined: 31 Jan 2023, 12:24

Re: Position-specific constants for best vs worst choices

Post by jayb »

Yes, there are four alternatives in each choice set (16 items in total and 8 choice sets). If I assume a sequential model (respondents choose the best item first, then the worst item) and want ASCs for best and worst choices separately, then there should be 4 position-specific constants for the best choice and 3 position-specific constants for the worst choice. The question was about specifying the model to handle that.

I got around it by reformatting the dataset and creating a new set of variables to describe the worst choices, recoded according to their new positions. For example, in CS2 above, if alternative 2 is chosen as best, then, for the worst choice, alternative 3 becomes alternative 2, because the real alternative 2 isn't available once it has been chosen as best. I then used the original variables for the best class-specific utilities and the modified variables for the worst class-specific utilities.
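
The recoding described here can be sketched in a few lines of base R (the function and argument names below are hypothetical, not taken from the actual script):

```r
### Hypothetical sketch of the position recoding: once the best-chosen
### alternative is removed, every alternative ranked below it moves up
### one position for the worst stage
recode_worst_position <- function(choice_best, choice_worst) {
  ifelse(choice_worst > choice_best, choice_worst - 1, choice_worst)
}

### e.g. in CS2 above: best = 2, so real alternative 3 becomes position 2
recode_worst_position(2, 3)  # returns 2
```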
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Position-specific constants for best vs worst choices

Post by stephanehess »

You can just specify different utilities for the different stages, see http://apollochoicemodelling.com/files/ ... f_params.r
--------------------------------
Stephane Hess
www.stephanehess.me.uk
jayb
Posts: 3
Joined: 31 Jan 2023, 12:24

Re: Position-specific constants for best vs worst choices

Post by jayb »

Thanks Stephane, I specified it as below - could you check if that's ok please?

Another question: I recorded the choice order between best and worst choices. Is it possible to specify the model to handle the observed choice order, rather than making an assumption (i.e. EL with best then worst, or MaxDiff with joint best-worst)?

Thanks,

Jay

Code:

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality = "estimate"){

  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))

  ### Create list of probabilities P
  P = list()
  P_bw = list()

  ### Define settings for MNL model component that are generic across classes
  mnl_settings = list(
    alternatives = c(alt1B = 1, alt2B = 2, alt3B = 3, alt4B = 4,
                     alt1W = 1, alt2W = 2, alt3W = 3)
  )

  ### Loop over classes
  for(s in 1:2){

    ### Compute class-specific utilities
    V=list()
    V[["alt1B"]] = asc_alt1_B + beta_T1[[s]]*T1_1 + beta_T2[[s]]*T2_1 + beta_T3[[s]]*T3_1 + beta_T4[[s]]*T4_1 +
      beta_T5[[s]]*T5_1 + beta_T6[[s]]*T6_1 + beta_T7[[s]]*T7_1 +
      beta_T8[[s]]*T8_1 + beta_T9[[s]]*T9_1 + beta_T10[[s]]*T10_1 +
      beta_T11[[s]]*T11_1 + beta_T12[[s]]*T12_1 + beta_T13[[s]]*T13_1 +
      beta_T14[[s]]*T14_1 + beta_T15[[s]]*T15_1 + beta_T16[[s]]*T16_1
    V[["alt2B"]] = asc_alt2_B + beta_T1[[s]]*T1_2 + beta_T2[[s]]*T2_2 + beta_T3[[s]]*T3_2 + beta_T4[[s]]*T4_2 +
      beta_T5[[s]]*T5_2 + beta_T6[[s]]*T6_2 + beta_T7[[s]]*T7_2 +
      beta_T8[[s]]*T8_2 + beta_T9[[s]]*T9_2 + beta_T10[[s]]*T10_2 +
      beta_T11[[s]]*T11_2 + beta_T12[[s]]*T12_2 + beta_T13[[s]]*T13_2 +
      beta_T14[[s]]*T14_2 + beta_T15[[s]]*T15_2 + beta_T16[[s]]*T16_2
    V[["alt3B"]] = asc_alt3_B + beta_T1[[s]]*T1_3 + beta_T2[[s]]*T2_3 + beta_T3[[s]]*T3_3 + beta_T4[[s]]*T4_3 +
      beta_T5[[s]]*T5_3 + beta_T6[[s]]*T6_3 + beta_T7[[s]]*T7_3 +
      beta_T8[[s]]*T8_3 + beta_T9[[s]]*T9_3 + beta_T10[[s]]*T10_3 +
      beta_T11[[s]]*T11_3 + beta_T12[[s]]*T12_3 + beta_T13[[s]]*T13_3 +
      beta_T14[[s]]*T14_3 + beta_T15[[s]]*T15_3 + beta_T16[[s]]*T16_3
    V[["alt4B"]] = asc_alt4_B + beta_T1[[s]]*T1_4 + beta_T2[[s]]*T2_4 + beta_T3[[s]]*T3_4 + beta_T4[[s]]*T4_4 +
      beta_T5[[s]]*T5_4 + beta_T6[[s]]*T6_4 + beta_T7[[s]]*T7_4 +
      beta_T8[[s]]*T8_4 + beta_T9[[s]]*T9_4 + beta_T10[[s]]*T10_4 +
      beta_T11[[s]]*T11_4 + beta_T12[[s]]*T12_4 + beta_T13[[s]]*T13_4 +
      beta_T14[[s]]*T14_4 + beta_T15[[s]]*T15_4 + beta_T16[[s]]*T16_4

    V[["alt1W"]] = -mu_worst * (asc_alt1_W + beta_T1[[s]]*modified_T1_1 + beta_T2[[s]]*modified_T2_1 + beta_T3[[s]]*modified_T3_1 + beta_T4[[s]]*modified_T4_1 +
                                  beta_T5[[s]]*modified_T5_1 + beta_T6[[s]]*modified_T6_1 + beta_T7[[s]]*modified_T7_1 +
                                  beta_T8[[s]]*modified_T8_1 + beta_T9[[s]]*modified_T9_1 + beta_T10[[s]]*modified_T10_1 +
                                  beta_T11[[s]]*modified_T11_1 + beta_T12[[s]]*modified_T12_1 + beta_T13[[s]]*modified_T13_1 +
                                  beta_T14[[s]]*modified_T14_1 + beta_T15[[s]]*modified_T15_1 + beta_T16[[s]]*modified_T16_1)
    V[["alt2W"]] = -mu_worst * (asc_alt2_W + beta_T1[[s]]*modified_T1_2 + beta_T2[[s]]*modified_T2_2 + beta_T3[[s]]*modified_T3_2 + beta_T4[[s]]*modified_T4_2 +
                                  beta_T5[[s]]*modified_T5_2 + beta_T6[[s]]*modified_T6_2 + beta_T7[[s]]*modified_T7_2 +
                                  beta_T8[[s]]*modified_T8_2 + beta_T9[[s]]*modified_T9_2 + beta_T10[[s]]*modified_T10_2 +
                                  beta_T11[[s]]*modified_T11_2 + beta_T12[[s]]*modified_T12_2 + beta_T13[[s]]*modified_T13_2 +
                                  beta_T14[[s]]*modified_T14_2 + beta_T15[[s]]*modified_T15_2 + beta_T16[[s]]*modified_T16_2)
    V[["alt3W"]] = -mu_worst * (asc_alt3_W + beta_T1[[s]]*modified_T1_3 + beta_T2[[s]]*modified_T2_3 + beta_T3[[s]]*modified_T3_3 + beta_T4[[s]]*modified_T4_3 +
                                  beta_T5[[s]]*modified_T5_3 + beta_T6[[s]]*modified_T6_3 + beta_T7[[s]]*modified_T7_3 +
                                  beta_T8[[s]]*modified_T8_3 + beta_T9[[s]]*modified_T9_3 + beta_T10[[s]]*modified_T10_3 +
                                  beta_T11[[s]]*modified_T11_3 + beta_T12[[s]]*modified_T12_3 + beta_T13[[s]]*modified_T13_3 +
                                  beta_T14[[s]]*modified_T14_3 + beta_T15[[s]]*modified_T15_3 + beta_T16[[s]]*modified_T16_3)

    ### Compute probabilities for "best" choice using MNL model
    mnl_settings$avail = list(alt1B = 1, alt2B = 1, alt3B = 1, alt4B = 1,
                              alt1W = 0, alt2W = 0, alt3W = 0)
    mnl_settings$choiceVar = choice_best
    mnl_settings$V = V
    mnl_settings$componentName = paste0("Best_Class_", s)

    ### Compute within-class choice probabilities using MNL model
    P_bw[["best"]] = apollo_mnl(mnl_settings, functionality)

    ### Take product across observations for same individual
    P_bw[["best"]] = apollo_panelProd(P_bw[["best"]], apollo_inputs, functionality)

    ### Compute probabilities for "worst" choice using MNL model
    mnl_settings$avail = list(alt1B = 0, alt2B = 0, alt3B = 0, alt4B = 0,
                              alt1W = 1, alt2W = 1, alt3W = 1)
    mnl_settings$choiceVar = choice_worst
    mnl_settings$componentName = paste0("Worst_Class_", s)

    ### Compute within-class choice probabilities using MNL model
    P_bw[["worst"]] = apollo_mnl(mnl_settings, functionality)

    ### Take product across observations for same individual
    P_bw[["worst"]] = apollo_panelProd(P_bw[["worst"]], apollo_inputs, functionality)

    P[[paste0("Class_", s)]] = apollo_combineModels(P_bw, apollo_inputs, functionality)$model

  }

  ### Compute latent class model probabilities
  lc_settings   = list(inClassProb = P, classProb = pi_values)
  P[["model"]] = apollo_lc(lc_settings, apollo_inputs, functionality)

  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}
stephanehess
Site Admin
Posts: 974
Joined: 24 Apr 2020, 16:29

Re: Position-specific constants for best vs worst choices

Post by stephanehess »

Hi

you could just use the best-only alternatives for the utilities in the best model, and the worst-only ones for worst, rather than working with availabilities

Also, I would suggest using apollo_panelProd only after apollo_combineModels

regarding your other question, what you could do is specify the different models (EL, MaxDiff) and use the `rows` setting to ensure that only one of the models is used for a given observation, based on the observed selection process. Does that make sense?
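
Put together, those two suggestions could look like the fragment below (a sketch only, for inside the class loop of apollo_probabilities; the stage-specific settings objects and the best_first column are assumptions, not from the thread):

```r
### Illustrative fragment: combine the best and worst components first,
### then take the product over an individual's observations
P_bw[["best"]]  = apollo_mnl(mnl_settings_best,  functionality)
P_bw[["worst"]] = apollo_mnl(mnl_settings_worst, functionality)
P[[paste0("Class_", s)]] = apollo_combineModels(P_bw, apollo_inputs, functionality)$model
P[[paste0("Class_", s)]] = apollo_panelProd(P[[paste0("Class_", s)]],
                                            apollo_inputs, functionality)

### For the observed-order question: a hypothetical 0/1 column best_first
### could restrict each model component to the rows where it applies
mnl_settings_best$rows = (best_first == 1)
```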

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk