Handling the dichotomous latent variable
Posted: 09 Apr 2025, 10:57
by Peter_C
Dear Apollo team!
I would like to ask you about a specific case of the hybrid choice model.
What should I do in case my latent variable is a categorical dichotomous variable?
I understand that the choice model part can easily be handled through an interaction, e.g.:
b_price_new=b_price + lambda*LV
It is also clear that the structural equation part works the same as in other cases, e.g.:
LV=b_female * Gender_female + b_higher_education * Education_higher_education + eta
However, it is not clear to me how to handle the measurement equation part in this case (when my latent variable is a variable with yes/no options).
Can you help me with this please?
Thanks,
Peter
Re: Handling the dichotomous latent variable
Posted: 10 Apr 2025, 20:05
by dpalma
Hi,
I am not sure I fully understand your question. According to your equations, your latent variable is continuous, because you define it as:
LV = b_female * Gender_female + b_higher_education * Education_higher_education + eta
Do you mean that the indicators of your latent variable are dichotomous? So let’s imagine your LV is measuring how “green” a person is, in other words how much they do to take care of the environment. And you have three indicators: (i) whether they recycle (yes/no), (ii) what their diet is (omnivore / vegetarian / vegan), and (iii) what their monthly carbon footprint is (continuous value).
For the first indicator (recycle), which is dichotomous, you can use a binary logit for the measurement equation, where:
U_recycle = lambda0 + lambda1*LV + epsilon1
The dependent variable is recycle = 1 if U_recycle > 0, and recycle = 0 otherwise.
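As a rough illustration of this binary logit measurement equation, the base R sketch below computes the probability of the observed outcome for a given value of LV. It assumes epsilon1 follows a standard logistic distribution; the function name `p_recycle` and the numeric values are hypothetical and only serve the example.

```r
# Hypothetical helper: probability of the observed recycle outcome (0/1)
# under a binary logit measurement equation, conditional on LV.
# Assumes epsilon1 is standard logistic, so P(U_recycle > 0) = plogis(V).
p_recycle <- function(recycle, LV, lambda0, lambda1) {
  p_yes <- plogis(lambda0 + lambda1 * LV)
  ifelse(recycle == 1, p_yes, 1 - p_yes)
}

# Illustrative values only: probabilities of recycle = 1 and recycle = 0
p_recycle(recycle = c(1, 0), LV = 0.5, lambda0 = -0.2, lambda1 = 1.0)
```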
For the second indicator (diet), you can use an ordered logit for the measurement equation, where:
U_diet = lambda2*LV + epsilon2
You also estimate thresholds tau1 and tau2 (with tau1 < tau2), where diet = omnivore if U_diet < tau1, diet = vegetarian if tau1 < U_diet < tau2, and diet = vegan if tau2 < U_diet.
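The ordered logit probabilities implied by these thresholds can be sketched in base R as follows. This assumes epsilon2 is standard logistic and tau1 < tau2; the function name `p_diet` and the parameter values are hypothetical.

```r
# Hypothetical helper: probabilities of the three diet categories under an
# ordered logit measurement equation, conditional on a single value of LV.
# Assumes epsilon2 is standard logistic and tau1 < tau2.
p_diet <- function(LV, lambda2, tau1, tau2) {
  V <- lambda2 * LV
  c(omnivore   = plogis(tau1 - V),                     # P(U_diet < tau1)
    vegetarian = plogis(tau2 - V) - plogis(tau1 - V),  # P(tau1 < U_diet < tau2)
    vegan      = 1 - plogis(tau2 - V))                 # P(U_diet > tau2)
}

# Illustrative values only
p_diet(LV = 0.5, lambda2 = 1.2, tau1 = -0.5, tau2 = 1.0)
```

The three probabilities sum to one by construction, and a higher LV shifts mass towards the vegan category whenever lambda2 is positive.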
Finally, for the third indicator (co2) you can use a linear measurement equation:
co2 = lambda3 + lambda4*LV + epsilon3
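For the continuous indicator, the likelihood contribution is a density rather than a probability. A minimal sketch in base R, assuming epsilon3 is normally distributed with mean zero and standard deviation `sigma` (the function name `l_co2`, `sigma` and the values used are assumptions for illustration):

```r
# Hypothetical helper: density of the observed co2 value under a linear
# measurement equation, conditional on LV.
# Assumes epsilon3 ~ N(0, sigma^2); sigma would be estimated with the model.
l_co2 <- function(co2, LV, lambda3, lambda4, sigma) {
  dnorm(co2, mean = lambda3 + lambda4 * LV, sd = sigma)
}

# Illustrative values only
l_co2(co2 = 2.1, LV = 0.5, lambda3 = 1.5, lambda4 = 0.8, sigma = 1.0)
```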
If instead you want LV itself to be dichotomous, things may be more complicated. For example, imagine you don’t know whether respondents have high income or not, so your LV is dichotomous, where LV = 0 means low income and LV = 1 means high income. The way we usually work with dichotomous variables is to assume there is an underlying continuous latent variable, which we’ll call LV* (you can interpret this as some kind of normalised income), where LV = 1 if LV* > 0 and LV = 0 if LV* <= 0. The structural equation for LV* would then be LV* = f(socios) + eta, while LV is completely determined by LV*, as discussed before. The question, then, is why you would want to use LV in the utility of your choice model when you can simply use LV*.
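To make the link between LV* and LV concrete, here is a small simulation sketch in base R. It assumes eta is standard normal, in which case P(LV = 1) = pnorm(f(socios)); the value used for f(socios) and the variable names are hypothetical.

```r
# Simulation sketch: a dichotomous LV derived from a continuous LV*.
# Assumes eta ~ N(0, 1); f_socios is a hypothetical value of f(socios)
# for one respondent.
set.seed(1)
f_socios <- 0.3
eta      <- rnorm(100000)
LV_star  <- f_socios + eta          # structural equation for LV*
LV       <- as.integer(LV_star > 0) # LV = 1 if LV* > 0, else 0

share_sim    <- mean(LV)            # simulated share of LV = 1
share_theory <- pnorm(f_socios)     # analytical P(LV = 1) under normal eta
c(simulated = share_sim, analytical = share_theory)
```

Since LV only records the sign of LV*, it carries less information than LV* itself, which is consistent with the suggestion to use LV* directly in the utility.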
I hope this helps, otherwise please provide more details on what it is you are modelling, and I may be able to provide more detailed help.
Best wishes,
David
Re: Handling the dichotomous latent variable
Posted: 23 Apr 2025, 07:31
by Xin-Yu Zuo
Hello, I have a related question regarding this issue.
As you mentioned, I need to use a binary logit model for the measurement equation, since the indicators are 0-1 binary variables.
For the structural equation, I’m expressing it as:
LV = f(socio-economic variables) + η*
The issue I’m facing is:
How can I implement a binary logit model for the measurement equation in Apollo, given that the indicators are binary (0/1) variables? Is code like the following correct?
# ----------------------------------------------------------------- #
#---- Likelihood of binary logit indicators for LV_metrohabit (apollo_mnl with two alternatives)
# ----------------------------------------------------------------- #
mnl_settings1 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = MaxFrequenceModeMetro,
  V             = list(
    chosen     = int_MaxFrequenceModeMetro + zeta_MaxFrequenceModeMetro * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_MaxFrequenceModeMetro"
)
mnl_settings2 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = MetroFrequence1_2,
  V             = list(
    chosen     = int_MetroFrequence1_2 + zeta_MetroFrequence1_2 * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_MetroFrequence1_2"
)
mnl_settings3 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = MetroFrequence7_10,
  V             = list(
    chosen     = int_MetroFrequence7_10 + zeta_MetroFrequence7_10 * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_MetroFrequence7_10"
)
mnl_settings4 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = TravelPurposeMetroWork,
  V             = list(
    chosen     = int_TravelPurposeMetroWork + zeta_TravelPurposeMetroWork * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_TravelPurposeMetroWork"
)
mnl_settings5 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = TravelPurposeMetroLeisure,
  V             = list(
    chosen     = int_TravelPurposeMetroLeisure + zeta_TravelPurposeMetroLeisure * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_TravelPurposeMetroLeisure"
)
P[["indic_MaxFrequenceModeMetro"]] = apollo_mnl(mnl_settings1, functionality)
P[["indic_MetroFrequence1_2"]] = apollo_mnl(mnl_settings2, functionality)
P[["indic_MetroFrequence7_10"]] = apollo_mnl(mnl_settings3, functionality)
P[["indic_TravelPurposeMetroWork"]] = apollo_mnl(mnl_settings4, functionality)
P[["indic_TravelPurposeMetroLeisure"]] = apollo_mnl(mnl_settings5, functionality)
### Combined model
P = apollo_combineModels(P, apollo_inputs, functionality)
### Take product across observations for same individual
P = apollo_panelProd(P, apollo_inputs, functionality)
Any guidance or example code would be greatly appreciated. Thank you!
Re: Handling the dichotomous latent variable
Posted: 19 May 2025, 19:07
by stephanehess
Hi
Sorry about the slow reply.
Yes, that code should work.
Stephane
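For readers wondering why a two-alternative MNL serves as a binary logit here: with the utility of not_chosen fixed to 0, the MNL probability of chosen reduces to the logistic function of its utility. The base R check below illustrates this equivalence without calling Apollo; the variable names and values are hypothetical.

```r
# Check that a two-alternative MNL with one utility fixed to 0 reproduces
# the binary logit probability. Values are illustrative only.
V_chosen     <- c(-1.5, 0, 0.7, 2.0)  # e.g. int + zeta * LV for four cases
V_not_chosen <- 0

p_mnl   <- exp(V_chosen) / (exp(V_chosen) + exp(V_not_chosen))
p_logit <- plogis(V_chosen)

all.equal(p_mnl, p_logit)
```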