Dear Apollo team!
I would like to ask you about a specific case of the hybrid choice model.
What should I do if my latent variable is a dichotomous (categorical) variable?
I understand that the choice model part can be handled easily through an interaction, e.g.:
b_price_new=b_price + lambda*LV
It is also clear that the structural equation works the same way as in other cases, e.g.:
LV=b_female * Gender_female + b_higher_education * Education_higher_education + eta
However, it is not clear to me how to handle the measurement equation in this case (when my latent variable is a variable with yes/no options).
Can you help me with this, please?
Thanks,
Peter
Handling the dichotomous latent variable
Re: Handling the dichotomous latent variable
Hi,
I am not sure I fully understand your question. According to your equations, your latent variable is continuous, because you define it as:
LV = b_female * Gender_female + b_higher_education * Education_higher_education + eta
Do you mean that the indicators of your latent variable are dichotomous? So let’s imagine your LV is measuring how “green” a person is, in other words how much they do to take care of the environment. And you have three indicators: (i) whether they recycle (yes/no), (ii) what their diet is (omnivore / vegetarian / vegan), and (iii) what their monthly carbon footprint is (continuous value).
For the first indicator (recycle), which is dichotomous, you can use a binary logit for the measurement equation, where:
U_recycle = lambda0 + lambda1*LV + epsilon1
And the dependent variable is recycle = 1 if U_recycle > 0, and recycle = 0 otherwise.
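As a rough illustration of the binary logit measurement equation above, the following base R sketch computes the probability of a "yes" answer for a given value of LV, assuming epsilon1 follows a standard logistic distribution. The parameter values, LV values and observed answers are made up for illustration; they are not estimates from any model.

```r
# Binary logit measurement equation for a yes/no indicator.
# lambda0, lambda1, LV and recycle are hypothetical illustration values.
lambda0 <- -0.5
lambda1 <- 1.2
LV      <- c(-1, 0, 1)

# P(recycle = 1 | LV) = 1 / (1 + exp(-(lambda0 + lambda1 * LV)))
p_yes <- plogis(lambda0 + lambda1 * LV)
p_no  <- 1 - p_yes

# Likelihood contribution of each observed answer, conditional on LV
recycle <- c(0, 1, 1)
lik <- ifelse(recycle == 1, p_yes, p_no)
print(round(cbind(LV, p_yes, lik), 3))
```

In a hybrid choice model this conditional likelihood would then be integrated over the distribution of eta together with the choice probabilities.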
For the second indicator (diet), you can use an ordered logit for the measurement equation, where:
U_diet = lambda2*LV + epsilon2
You also estimate thresholds tau1 and tau2, where diet = omnivore if U_diet < tau1, diet = vegetarian if tau1 < U_diet < tau2, and diet = vegan if tau2 < U_diet.
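The ordered logit probabilities implied by these thresholds can be sketched in base R as follows, again assuming a standard logistic epsilon2. The values of lambda2, the thresholds and LV are hypothetical and only serve to show the mechanics.

```r
# Ordered logit measurement equation for a three-level indicator.
# lambda2, tau and LV are hypothetical illustration values.
lambda2 <- 0.8
tau     <- c(0.5, 2.0)   # thresholds, must satisfy tau1 < tau2
LV      <- c(-1, 0, 1)

V <- lambda2 * LV        # deterministic part of U_diet

# P(U_diet < tau1), P(tau1 < U_diet < tau2), P(U_diet > tau2)
p_omnivore   <- plogis(tau[1] - V)
p_vegetarian <- plogis(tau[2] - V) - plogis(tau[1] - V)
p_vegan      <- 1 - plogis(tau[2] - V)

probs <- cbind(p_omnivore, p_vegetarian, p_vegan)
print(round(probs, 3))
print(rowSums(probs))    # each row should sum to 1
```

Note that there is no constant in U_diet: with two free thresholds, an additional intercept would not be separately identified.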
Finally, for the third indicator (co2) you can use a linear measurement equation:
co2 = lambda3 + lambda4*LV + epsilon3
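For the continuous indicator, the likelihood contribution is a density rather than a probability. The sketch below assumes epsilon3 is normally distributed with mean zero and standard deviation sigma3, which is a common choice but is not stated above; all numbers are hypothetical.

```r
# Linear measurement equation for a continuous indicator.
# lambda3, lambda4, sigma3, LV and co2 are hypothetical illustration values.
lambda3 <- 2.0
lambda4 <- -0.6
sigma3  <- 0.9           # assumed std. dev. of epsilon3 (normal, mean 0)
LV      <- c(-1, 0, 1)
co2     <- c(3.1, 1.8, 1.2)

# Density of the observed co2 given LV
lik <- dnorm(co2, mean = lambda3 + lambda4 * LV, sd = sigma3)
print(round(lik, 3))
```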
If instead you want LV itself to be dichotomous, things may be more complicated. For example, imagine you don’t know whether respondents have high income or not, so your LV is dichotomous, where LV=0 means low income and LV=1 means high income. The way we usually work with dichotomous variables is to assume there is an underlying continuous latent variable, which we’ll call LV* (you can interpret this as some kind of normalised income), where LV=1 if LV*>0 and LV=0 if LV*<=0. Then the structural equation for LV* would be LV* = f(socios) + eta, while LV is completely determined by LV*, as discussed before. The question, then, is why you would want to use LV in the utility of your choice model when you can simply use LV*.
I hope this helps; otherwise, please provide more details on what it is you are modelling, and I may be able to provide more detailed help.
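To make the link between LV and LV* concrete, here is the implied probability under the additional assumption (not stated above) that eta follows a standard normal distribution with CDF \Phi:

```latex
\begin{align}
P(LV = 1 \mid \text{socios})
  &= P(LV^{*} > 0)
   = P\bigl(\eta > -f(\text{socios})\bigr)
   = \Phi\bigl(f(\text{socios})\bigr), \\
P(LV = 0 \mid \text{socios})
  &= 1 - \Phi\bigl(f(\text{socios})\bigr).
\end{align}
```

With a logistic eta, \Phi would be replaced by the logistic CDF. Either way, LV only keeps the sign of LV*, so it carries less information than LV* itself.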
Best wishes,
David
Re: Handling the dichotomous latent variable
Hello, I have a related question regarding this issue.
As you mentioned, I need to use a binary logit model for the measurement equation, since the indicators are 0-1 binary variables.
For the structural equation, I’m expressing it as:
LV = f(socio-economic variables) + η*
The issue I’m facing is:
How can I implement a binary logit model for the measurement equation in Apollo, given that the indicators are binary (0/1) variables? Is the following code correct?
# ----------------------------------------------------------------- #
#---- Likelihood of binary logit indicators for LV_metrohabit
# ----------------------------------------------------------------- #
# Each 0/1 indicator is modelled as a two-alternative MNL (i.e. a binary logit),
# with the utility of "not_chosen" normalised to zero.
mnl_settings1 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = MaxFrequenceModeMetro,
  V             = list(
    chosen     = int_MaxFrequenceModeMetro + zeta_MaxFrequenceModeMetro * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_MaxFrequenceModeMetro"
)
mnl_settings2 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = MetroFrequence1_2,
  V             = list(
    chosen     = int_MetroFrequence1_2 + zeta_MetroFrequence1_2 * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_MetroFrequence1_2"
)
mnl_settings3 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = MetroFrequence7_10,
  V             = list(
    chosen     = int_MetroFrequence7_10 + zeta_MetroFrequence7_10 * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_MetroFrequence7_10"
)
mnl_settings4 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = TravelPurposeMetroWork,
  V             = list(
    chosen     = int_TravelPurposeMetroWork + zeta_TravelPurposeMetroWork * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_TravelPurposeMetroWork"
)
mnl_settings5 = list(
  alternatives  = c(chosen=1, not_chosen=0),
  avail         = list(chosen=1, not_chosen=1),
  choiceVar     = TravelPurposeMetroLeisure,
  V             = list(
    chosen     = int_TravelPurposeMetroLeisure + zeta_TravelPurposeMetroLeisure * LV_metrohabit,
    not_chosen = 0
  ),
  componentName = "indic_TravelPurposeMetroLeisure"
)
P[["indic_MaxFrequenceModeMetro"]]     = apollo_mnl(mnl_settings1, functionality)
P[["indic_MetroFrequence1_2"]]         = apollo_mnl(mnl_settings2, functionality)
P[["indic_MetroFrequence7_10"]]        = apollo_mnl(mnl_settings3, functionality)
P[["indic_TravelPurposeMetroWork"]]    = apollo_mnl(mnl_settings4, functionality)
P[["indic_TravelPurposeMetroLeisure"]] = apollo_mnl(mnl_settings5, functionality)
### Combined model
P = apollo_combineModels(P, apollo_inputs, functionality)
### Take product across observations for the same individual
P = apollo_panelProd(P, apollo_inputs, functionality)
Any guidance or example code would be greatly appreciated. Thank you!
Re: Handling the dichotomous latent variable
Hi,
Sorry about the slow reply.
Yes, that code should work.
Stephane
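As a small sanity check on why a two-alternative MNL can stand in for a binary logit, the base R sketch below compares the MNL probability of the "chosen" alternative (with the other utility fixed at zero) against the logistic function. The utility values are hypothetical and the code does not use Apollo itself.

```r
# Two-alternative MNL with V_not_chosen = 0 versus the binary logit formula.
# V_chosen holds hypothetical values of int + zeta * LV.
V_chosen     <- c(-2, 0, 1.5)
V_not_chosen <- 0

p_mnl   <- exp(V_chosen) / (exp(V_chosen) + exp(V_not_chosen))
p_logit <- plogis(V_chosen)

print(all.equal(p_mnl, p_logit))
```

The two expressions are algebraically identical, so the choice between them is one of convenience.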