Important: Read this before posting to this forum

  1. This forum is for questions related to the use of Apollo. We will answer some general choice modelling questions too, where appropriate, and time permitting. We cannot answer questions about how to estimate choice models with other software packages.
  2. There is a very detailed manual for Apollo available at http://www.ApolloChoiceModelling.com/manual.html. This contains detailed descriptions of the various Apollo functions, and numerous examples are available at http://www.ApolloChoiceModelling.com/examples.html. In addition, help files are available for all functions, using e.g. ?apollo_mnl
  3. Before asking a question on the forum, users are kindly requested to follow these steps:
    1. Check that the same issue has not already been addressed in the forum - there is a search tool.
    2. Ensure that the correct syntax has been used. For any function, detailed instructions are available directly in Apollo, e.g. by using ?apollo_mnl for apollo_mnl
    3. Check the frequently asked questions section on the Apollo website, which discusses some common issues/failures. Please see http://www.apollochoicemodelling.com/faq.html
    4. Make sure that R is using the latest official release of Apollo.
  4. If the above steps do not resolve the issue, then users should follow these steps when posting a question:
    1. provide full details on the issue, including the entire code and output, including any error messages
    2. posts will not immediately appear on the forum, but will be checked by a moderator first. We check the forum at least twice a week. It may thus take a couple of days for your post to appear and before we reply. There is no need to submit the post multiple times.

Hybrid Choice Model - Latent Variable Affecting Only One Attribute

Ask questions about model specifications. Ideally include a mathematical explanation of your proposed model.
Post Reply
marta bonioli
Posts: 1
Joined: 04 Feb 2025, 09:03

Hybrid Choice Model - Latent Variable Affecting Only One Attribute

Post by marta bonioli »

Dear Apollo community,

I am working on a Hybrid Choice Model (HCM) applied to a choice experiment with an unlabeled design. My experiment involves a choice between two alternative products and a no-buy option. A set of attributes describes each alternative, and my goal is to investigate how a latent variable (LV) representing consumer beliefs influences the choices.
Currently, I am estimating this LV using a structural equation that depends on socio-demographic characteristics. I am measuring it through multiple indicators in a measurement model using normal density distributions.

I have collected attitudinal responses from participants to measure their beliefs about the product. These responses indicate the perceived probability (0-100%) that consuming the product:
1. Contributes to reducing food waste.
2. Supports a circular economy.
3. Is safe for health.
4. Brings environmental benefits.
To integrate these beliefs into my choice model, I normalized the responses (0-1 scale) and used them to estimate a latent variable (LV) via a continuous measurement model.

I would like to investigate whether this LV influences only one specific attribute in my choice model—"upcycled".

My current utility specification (excluding LV) is as follows:

Code: Select all

V[['alt1']] <- b_price * alt1.price + 
               b_olivo * alt1.olivo + 
               b_upcycled * alt1.upcycled + 
               b_waste * alt1.waste + 
               b_bio * alt1.bio + 
               b_int1 * alt1.upcycled * alt1.waste + 
               b_int3 * alt1.upcycled * alt1.bio  

V[['alt2']] <- b_price * alt2.price + 
               b_olivo * alt2.olivo + 
               b_upcycled * alt2.upcycled + 
               b_waste * alt2.waste + 
               b_bio * alt2.bio + 
               b_int1 * alt2.upcycled * alt2.waste + 
               b_int3 * alt2.upcycled * alt2.bio  

V[['no_buy']] <- nobuy
My question is: Does it make sense, from a theoretical and practical standpoint, to include the latent variable directly in the utility function affecting only the "upcycled" attribute?

For example, could I modify my model as follows?

Code: Select all

V[['alt1']] <- b_price * alt1.price + 
               b_olivo * alt1.olivo + 
               (b_upcycled + lambda_LV * LV) * alt1.upcycled + 
               b_waste * alt1.waste + 
               b_bio * alt1.bio + 
               b_int1 * alt1.upcycled * alt1.waste + 
               b_int3 * alt1.upcycled * alt1.bio 
               
         V[['alt2']] <- b_price * alt2.price + 
    		b_olivo * alt2.olivo + 
    		(b_upcycled + + lambda*LV) * alt2.upcycled + 
   		 b_waste * alt2.waste + 
   		 b_bio * alt2.bio + 
   		 b_int1 * alt2.upcycled * alt2.waste + 
    		 b_int3 * alt2.upcycled * alt2.bio  
  
  V[['no_buy']] <- nobuy
Would this approach be correct in representing the effect of the latent variable on only the "upcycled" attribute?
Or is there a more appropriate way to model this relationship within Apollo?

Any feedback or suggestions would be greatly appreciated.

Thank you in advance for your help!








##################################
############ CODE ################
##################################

Code: Select all

### Clear memory
rm(list = ls())

library(apollo)
library(readr)
library(tidyr)
library(dplyr)
library(readxl)
library(writexl)
library(ggplot2)
library(reshape2)

# DATA
#file_path <- "....."
data <- read_csv(file_path)

data$food_waste <- data$belief_1
data$circular_ec <- data$belief_2
data$health_safe <- data$belief_4
data$env_benefit <- data$belief_10
data$age <- data$eta_scr

database <- data[, c("y", "block", "id", "cset", "pid", 
                     "alt1.price", "alt1.upcycled", "alt1.waste","alt1.olivo", "alt1.bio", "alt2.price", "alt2.ingredients", "alt2.waste", "alt2.upcycled", "alt2.olivo" ,"alt2.bio", "food_waste", "circular_ec", "health_safe", "env_benefit", "age","gen", "reddito","istr","reddito", "neoph_total", "nepscale_total", "health_consc")]

# Normalize belief variables
database$food_waste <- database$food_waste / 100
database$circular_ec <- database$circular_ec / 100
database$health_safe <- database$health_safe / 100
database$env_benefit <- database$env_benefit / 100

# Create the 'female' variable (1 if gen = 2, 0 otherwise)
database$female <- ifelse(database$gen == 2, 1, 0)

# Create the 'high_education' variable (1 if istr is among 4, 5, 6, 0 otherwise)
database$high_istr <- ifelse(database$istr %in% c(4, 5, 6), 1, 0)

# Create the 'high_income' variable
database$high_income <- ifelse(database$reddito %in% c(6, 7, 8,9,10), 1, 0)

# Create the 'under35' variable (1 if age is among 2,3, 0 otherwise)
database$under35 <- ifelse(database$age  %in% c(2,3), 1, 0)


################### HYBRID MMNL ####################################

# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS                       ####
# ################################################################# #

### Initialize code
apollo_initialise()

### Set core controls
apollo_control = list(
  modelName       = "Hybrid_MMNL_CREMA_gen_age_istr_income_beliefs_up1",
  modelDescr      = "Hybrid choice model using continuous measurement model for indicators",
  indivID         = "id",
  nCores          = 4, 
  outputDirectory = "output"
)

# ################################################################# #
#### DEFINE MODEL PARAMETERS                                     ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c( b_price = -0.21183826,  # Price coefficient
                 nobuy   = -3.36746925,  # "no buy" alternative
                 
                 # Coefficients for random distributions
                 b_olivo_mu = 0.68344044,    # Estimated mean for b_olivo
                 b_olivo_sig = 0.1,        # Initial estimate for standard deviation
                 b_upcycled_mu = 0.14579275, # Estimated mean for b_upcycled
                 b_upcycled_sig = 0.1,     # Initial estimate for standard deviation
                 b_waste_mu = 0.63220937,    # Estimated mean for b_waste
                 b_waste_sig = 0.1,        # Initial estimate for standard deviation
                 b_bio_mu = 0.39202152,      # Estimated mean for b_bio
                 b_bio_sig = 0.1,          # Initial estimate for standard deviation
                 
                 b_int1 = 0.50219241	,
                 b_int3 = 0.02165156,
                 
                 lambda  = 0,   # Indicates how much the latent variable (LV) influences choices
                 gamma_female  = 1.201601935, 
                 gamma_high_istr  = 0.875568049,
                 gamma_under35  = 0.301690559, 
                 gamma_high_income  = 0.581571635,
                 zeta_food_waste	=	1,
                 zeta_circular_ec	=	1,
                 zeta_health_safe	=	1,
                 zeta_env_benefit	=	1,
                 sigma_food_waste	=	1,
                 sigma_circular_ec	=	1,
                 sigma_health_safe	=	1,
                 sigma_env_benefit	=	1)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_beta_fixed = c() if none
apollo_fixed = c()

# ################################################################# #
#### DEFINE RANDOM COMPONENTS                                    ####
# ################################################################# #

### Set parameters for generating draws
apollo_draws = list(
  interDrawsType="halton", 
  interNDraws=1000,
  interUnifDraws=c(),      
  interNormDraws=c("eta","draws_olivo","draws_upcycled","draws_waste", "draws_bio"), 
  
  intraDrawsType="",
  intraNDraws=0,          
  intraUnifDraws=c(),     
  intraNormDraws=c()     
)

### Create random parameters
apollo_randCoeff=function(apollo_beta, apollo_inputs){
  randcoeff = list()

  randcoeff[["LV"]] = gamma_high_istr * high_istr + gamma_female * female + gamma_under35*under35  + gamma_high_income * high_income + eta
  
  randcoeff[["b_olivo"]] = b_olivo_mu + b_olivo_sig * draws_olivo 
  randcoeff[["b_upcycled"]] = b_upcycled_mu + b_upcycled_sig * draws_upcycled
  randcoeff[["b_waste"]] = b_waste_mu + b_waste_sig * draws_waste
  randcoeff[["b_bio"]] = b_bio_mu + b_bio_sig * draws_bio
  
  return(randcoeff)
}

# ################################################################# #
#### GROUP AND VALIDATE INPUTS                                   ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P

  P = list()
  
  ### Likelihood of indicators
  normalDensity_settings1 = list(outcomeNormal = food_waste, 
                                 xNormal       = zeta_food_waste*LV, 
                                 mu            = 0, 
                                 sigma         = sigma_food_waste, 
                                 rows          = (pid==1), 
                                 # Since each belief question is answered once per respondent
                                 # but repeated in each row of the data for that respondent, we include rows=(pid==1) which
                                 # ensures that the measurement model is only used once for each belief statement and for each
                                 # individual, rather than contributing to the overall model likelihood in each row for that person.
                                 componentName = "Food_Waste")
  
  normalDensity_settings2 = list(outcomeNormal = circular_ec, 
                                 xNormal       = zeta_circular_ec*LV, 
                                 mu            = 0, 
                                 sigma         = sigma_circular_ec, 
                                 rows          = (pid==1),
                                 componentName = "Circular_economy")
  
  normalDensity_settings4 = list(outcomeNormal = health_safe, 
                                 xNormal       = zeta_health_safe*LV, 
                                 mu            = 0, 
                                 sigma         = sigma_health_safe, 
                                 rows          = (pid==1),
                                 componentName = "Health_safety")
  
  normalDensity_settings5 = list(outcomeNormal = env_benefit, 
                                 xNormal       = zeta_env_benefit*LV, 
                                 mu            = 0, 
                                 sigma         = sigma_env_benefit, 
                                 rows          = (pid==1),
                                 componentName = "Env_benefit")

  
  # 4 INDICATORS of the model
 
  P[["Food_Waste"]]     = apollo_normalDensity(normalDensity_settings1, functionality)
  P[["Circular_economy"]] = apollo_normalDensity(normalDensity_settings2, functionality)
  P[["Health_safety"]]   = apollo_normalDensity(normalDensity_settings4, functionality)
  P[["Env_benefit"]]      = apollo_normalDensity(normalDensity_settings5, functionality)
  
  ### Likelihood of choices

  V <- list()
  V[['alt1']] <- b_price * alt1.price + 
    b_olivo * alt1.olivo + 
    (b_upcycled + lambda*LV) * alt1.upcycled + 
    b_waste * alt1.waste + 
    b_bio * alt1.bio + 
    b_int1 * alt1.upcycled * alt1.waste + 
    b_int3 * alt1.upcycled * alt1.bio  
  
  V[['alt2']] <- b_price * alt2.price + 
    b_olivo * alt2.olivo + 
    (b_upcycled + + lambda*LV) * alt2.upcycled + 
    b_waste * alt2.waste + 
    b_bio * alt2.bio + 
    b_int1 * alt2.upcycled * alt2.waste + 
    b_int3 * alt2.upcycled * alt2.bio  
  
  V[['no_buy']] <- nobuy
  
  
  ### Define settings for MNL model component
  mnl_settings <- list(
    alternatives = c(alt1 = 1, alt2 = 2, no_buy = 3),
    avail = list(alt1 = 1, alt2 = 1, no_buy = 1),
    choiceVar = y,
    utilities = V
  )
  
  
  ### Compute probabilities for MNL model component
  P[["choice"]] = apollo_mnl(mnl_settings, functionality)
  
  ### Likelihood of the whole model
  #This function takes the list P which contains several individual model components and produces
  #a combined model
  P = apollo_combineModels(P, apollo_inputs, functionality)
  
  ### Take product across observation for same individual
  #apollo_panelProd which multiplies the probabilities across individual choice observations for the same individual,
  P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Average across inter-individual draws
  P = apollo_avgInterDraws(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ################################################################# #
#### MODEL ESTIMATION                                            ####
# ################################################################# #
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)

# ################################################################# #
#### MODEL OUTPUTS                                               ####
# ################################################################# #

apollo_modelOutput(model)

apollo_saveOutput(model)

apollo_sink()
stephanehess
Site Admin
Posts: 1338
Joined: 24 Apr 2020, 16:29

Re: Hybrid Choice Model - Latent Variable Affecting Only One Attribute

Post by stephanehess »

Hi

the way you enter the LV is correct. However, you should be mindful that if you are only allowing for heterogeneity via the LV, you might be misattributing the source of heterogeneity. See e.g. https://doi.org/10.1016/j.jocm.2020.100252

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
Post Reply