Hi Stephane and David,
I hope this post finds you well. I have a question about the log-likelihood (LL) of an ICLV model compared with the same model without any latent variables. The code for both models is given below. After adding a latent variable to the simple model and estimating it as an ICLV model, I do not see an improvement in the LL. Instead, the LL becomes worse for the ICLV model, which surprises me. I saw the same thing with an ICLV model in another project: the LL did not improve, even though the latent variable was significant. Am I doing something wrong in the code or elsewhere? I appreciate your time and look forward to your reply. Thank you.
Output of the Models:
LL(0) for the simple model: -275.1794
LL(final) for the simple model: -170.4397
LL(0) for the commute_mode component of the ICLV model: -275.1794
LL(final) for the commute_mode component of the ICLV model: -179.5322
Dependent variable: choice of bicycle for commuting trips (binary logit model)
Simple Model Code
# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS ####
# ################################################################# #
### Clear memory
rm(list = ls())
### Load Apollo library
library(apollo)
library(tidyverse)
### Initialise code
apollo_initialise()
### Set core controls
apollo_control = list(
modelName = "Modeling Bicycle Mode Choice for Commuting Trip without LVs",
modelDescr = "Binary logit model without latent variables",
indivID = "X",
mixing = FALSE,
nCores = 3
)
# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS ####
# ################################################################# #
database = read.csv("data_factor3.csv",header=TRUE)
View(database)
database <- na.omit(database)
View(database)
# ##################################################################
#### DEFINE MODEL PARAMETERS ####
# ################################################################# #
### Vector of parameters, including any that are kept fixed in estimation
apollo_beta=c(asc_yes = 0,
b_age =0,
b_gender =0,
b_income = 0,
b_student =0,
b_distance =0,
b_household =0,
b_bike = 0,
b_landuse =0,
b_walk =0,
b_cycle =0,
b_vehicle =0)
### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c()
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
### Attach inputs and detach after function exit
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
### Create list of probabilities P
P = list()
### Likelihood of choices
### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
V = list()
V[['No']] = 0
V[['Yes']] = asc_yes + b_age * Age+b_gender * Gender + b_income * Household.Income + b_landuse * Land.Use.Tye+b_student * student+b_bike *number.of.bi.cycle+
b_distance * commute_distance + b_household *number.of.household.people+ b_walk * walktime + b_cycle * cycletime + b_vehicle * motorvehicle
### Define settings for MNL model component
mnl_settings = list(
alternatives = c(No=0,Yes=1),
avail = 1,
choiceVar = commute_mode,
V = V
)
### Compute probabilities for MNL model component
P[["model"]] = apollo_mnl(mnl_settings, functionality)
### Prepare and return outputs of function
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
# ################################################################# #
#### MODEL ESTIMATION ####
# ################################################################# #
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
# ################################################################# #
#### MODEL OUTPUTS ####
# ################################################################# #
# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN) ----
# ----------------------------------------------------------------- #
apollo_modelOutput(model)
# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name) ----
# ----------------------------------------------------------------- #
apollo_saveOutput(model)
Code for the ICLV Model
# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS ####
# ################################################################# #
### Clear memory
rm(list = ls())
### Load Apollo library
library(apollo)
library(tidyverse)
### Initialise code
apollo_initialise()
### Set core controls
apollo_control = list(
modelName = "Modeling Bicycle Mode Choice for Commuting Trip_v4",
modelDescr = "ICLV model",
indivID = "X",
mixing = TRUE,
nCores = 3
)
# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS ####
# ################################################################# #
database = read.csv("data_factor3.csv",header=TRUE)
View(database)
database <- na.omit(database)
View(database)
### Subtract mean of indicator variables to centre them on zero
database$Rating.of.walking.accessibility.to.go.to.work.business.school..university=database$Rating.of.walking.accessibility.to.go.to.work.business.school..university-mean(database$Rating.of.walking.accessibility.to.go.to.work.business.school..university)
database$Rating.of.walking.accessibility.to.go.to.grocery.stores=database$Rating.of.walking.accessibility.to.go.to.grocery.stores-mean(database$Rating.of.walking.accessibility.to.go.to.grocery.stores)
database$Rating.of.walking.accessibility.to.go.to.social.place=database$Rating.of.walking.accessibility.to.go.to.social.place-mean(database$Rating.of.walking.accessibility.to.go.to.social.place)
database$Rating.of.walking.accessibility.to.go.to.Recreation.place=database$Rating.of.walking.accessibility.to.go.to.Recreation.place-mean(database$Rating.of.walking.accessibility.to.go.to.Recreation.place)
database$safety.to.walk.in.the.main.roads= database$safety.to.walk.in.the.main.roads-mean(database$safety.to.walk.in.the.main.roads)
database$safety.to.ride.a.bi.cycle.in.the.main.roads= database$safety.to.ride.a.bi.cycle.in.the.main.roads-mean(database$safety.to.ride.a.bi.cycle.in.the.main.roads)
# ##################################################################
#### DEFINE MODEL PARAMETERS ####
# ################################################################# #
### Vector of parameters, including any that are kept fixed in estimation
apollo_beta=c(asc_yes = 0,
b_age =0,
b_gender =0,
b_income = 0,
b_student =0,
b_distance =0,
b_household =0,
b_bike = 0,
b_landuse =0,
b_walk =0,
b_cycle =0,
b_vehicle =0,
a1_gender =0,
gamma1 = 0,
d_Rating.of.walking.accessibility.to.go.to.work.business.school..university = 1,
d_Rating.of.walking.accessibility.to.go.to.grocery.stores = 1,
d_Rating.of.walking.accessibility.to.go.to.social.place = 1,
d_Rating.of.walking.accessibility.to.go.to.Recreation.place = 1,
d_safety.to.walk.in.the.main.roads =1,
d_safety.to.ride.a.bi.cycle.in.the.main.roads =1,
sigma_Rating.of.walking.accessibility.to.go.to.work.business.school..university = 1,
sigma_Rating.of.walking.accessibility.to.go.to.grocery.stores = 1,
sigma_Rating.of.walking.accessibility.to.go.to.social.place = 1,
sigma_Rating.of.walking.accessibility.to.go.to.Recreation.place = 1,
sigma_safety.to.walk.in.the.main.roads =1,
sigma_safety.to.ride.a.bi.cycle.in.the.main.roads =1)
### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c()
# ################################################################# #
#### DEFINE RANDOM COMPONENTS ####
# ################################################################# #
### Set parameters for generating draws
apollo_draws = list(
interDrawsType="halton",
interNDraws=100,
interUnifDraws=c(),
interNormDraws=c("nu_n1"),
intraDrawsType='',
intraNDraws=0,
intraUnifDraws=c(),
intraNormDraws=c()
)
### Create random parameters
apollo_randCoeff=function(apollo_beta, apollo_inputs){
randcoeff = list()
randcoeff[["LV1"]] = nu_n1+ a1_gender * Gender
return(randcoeff)
}
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
### Attach inputs and detach after function exit
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
### Create list of probabilities P
P = list()
### Likelihood of indicators
normalDensity_settings1 = list(outcomeNormal=Rating.of.walking.accessibility.to.go.to.work.business.school..university,
xNormal=d_Rating.of.walking.accessibility.to.go.to.work.business.school..university*LV1,
mu=0,
sigma=sigma_Rating.of.walking.accessibility.to.go.to.work.business.school..university)
normalDensity_settings2 = list(outcomeNormal=Rating.of.walking.accessibility.to.go.to.grocery.stores,
xNormal=d_Rating.of.walking.accessibility.to.go.to.grocery.stores*LV1,
mu=0,
sigma=sigma_Rating.of.walking.accessibility.to.go.to.grocery.stores)
normalDensity_settings3 = list(outcomeNormal=Rating.of.walking.accessibility.to.go.to.social.place,
xNormal=d_Rating.of.walking.accessibility.to.go.to.social.place*LV1,
mu=0,
sigma=sigma_Rating.of.walking.accessibility.to.go.to.social.place)
normalDensity_settings4 = list(outcomeNormal=Rating.of.walking.accessibility.to.go.to.Recreation.place,
xNormal=d_Rating.of.walking.accessibility.to.go.to.Recreation.place*LV1,
mu=0,
sigma=sigma_Rating.of.walking.accessibility.to.go.to.Recreation.place)
normalDensity_settings5 = list(outcomeNormal=safety.to.walk.in.the.main.roads,
xNormal=d_safety.to.walk.in.the.main.roads*LV1,
mu=0,
sigma=sigma_safety.to.walk.in.the.main.roads)
normalDensity_settings6 = list(outcomeNormal=safety.to.ride.a.bi.cycle.in.the.main.roads,
xNormal=d_safety.to.ride.a.bi.cycle.in.the.main.roads*LV1,
mu=0,
sigma=sigma_safety.to.ride.a.bi.cycle.in.the.main.roads)
P[["indic_Rating.of.walking.accessibility.to.go.to.work.business.school..university"]] = apollo_normalDensity(normalDensity_settings1, functionality)
P[["indic_Rating.of.walking.accessibility.to.go.to.grocery.stores"]] = apollo_normalDensity(normalDensity_settings2, functionality)
P[["indic_Rating.of.walking.accessibility.to.go.to.social.place"]] = apollo_normalDensity(normalDensity_settings3, functionality)
P[["indic_Rating.of.walking.accessibility.to.go.to.Recreation.place"]] = apollo_normalDensity(normalDensity_settings4, functionality)
P[["indic_safety.to.walk.in.the.main.roads"]] = apollo_normalDensity(normalDensity_settings5, functionality)
P[["indic_safety.to.ride.a.bi.cycle.in.the.main.roads"]] = apollo_normalDensity(normalDensity_settings6, functionality)
### Likelihood of choices
### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
V = list()
V[['No']] = 0
V[['Yes']] = asc_yes + b_age * Age+b_gender * Gender + b_income * Household.Income + b_landuse * Land.Use.Tye+b_student * student+b_bike *number.of.bi.cycle+
b_distance * commute_distance + b_household *number.of.household.people+ b_walk * walktime + b_cycle * cycletime + b_vehicle * motorvehicle+
gamma1 * LV1
### Define settings for MNL model component
mnl_settings = list(
alternatives = c(No=0,Yes=1),
avail = 1,
choiceVar = commute_mode,
V = V
)
### Compute probabilities for MNL model component
P[["commute_mode"]] = apollo_mnl(mnl_settings, functionality)
### Likelihood of the whole model
P = apollo_combineModels(P, apollo_inputs, functionality)
### Average across inter-individual draws
P = apollo_avgInterDraws(P, apollo_inputs, functionality)
### Prepare and return outputs of function
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
# ################################################################# #
#### MODEL ESTIMATION ####
# ################################################################# #
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
# ################################################################# #
#### MODEL OUTPUTS ####
# ################################################################# #
# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN) ----
# ----------------------------------------------------------------- #
apollo_modelOutput(model)
# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name) ----
# ----------------------------------------------------------------- #
apollo_saveOutput(model)
LL value for models with and without latent variables
Re: LL value for models with and without latent variables
Hi
The hybrid model includes additional model components, so the LL is going to be more negative by definition. It's the same thing as when you include more data.
In addition, you should not really consider comparing the LL, see the discussions in Vij & Walker
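To make the point about additional components concrete, here is a sketch of the two log-likelihoods, written from the code posted above. The symbols are notation introduced here for illustration and are not Apollo output: $\alpha_n$ stands for LV1, $\eta_n$ for the draw nu_n1, $\zeta_k$ and $\sigma_k$ for the d_ and sigma_ parameters, and $I_{nk}$ for the six centred indicators.

```latex
% Simple binary logit: only the choice y_n is explained
LL_{\text{simple}} = \sum_n \log P(y_n \mid x_n, \beta)

% ICLV structural equation, as coded in apollo_randCoeff
\alpha_n = a_1 \, \text{Gender}_n + \eta_n, \qquad \eta_n \sim N(0, 1)

% ICLV joint log-likelihood: the choice and the six indicators are
% explained together, integrating over the random part of the latent variable
LL_{\text{ICLV}} = \sum_n \log \int
    P(y_n \mid x_n, \alpha_n, \beta, \gamma_1)
    \prod_{k=1}^{6} \frac{1}{\sigma_k}
    \phi\!\left( \frac{I_{nk} - \zeta_k \alpha_n}{\sigma_k} \right)
    \phi(\eta_n) \, d\eta_n
```

The second expression covers seven outcomes per person (the choice plus six indicators) rather than one, so its value is not on the same scale as the first.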
Stephane
Re: LL value for models with and without latent variables
Hi Stephane,
Thank you for your response. I appreciate it.
I have an additional question. If I extend the simple model with an additional factor score variable (extracted from the exploratory factor analysis using the same statements as is done in the example) like the following,
V[['Yes']] = asc_yes + b_age * Age+b_gender * Gender + b_income * Household.Income + b_landuse * Land.Use.Tye+b_student * student+b_bike *number.of.bi.cycle+ b_distance * commute_distance + b_household *number.of.household.people+ b_walk * walktime + b_cycle * cycletime + b_vehicle * motorvehicle + b_factor1 * Factor1_score
Can I compare the LL of that model with the choice portion LL of the ICLV model in the example (excluding the measurement models which apollo reports separately)?
Thank you!
Re: LL value for models with and without latent variables
This has a number of issues.
Firstly, you would be treating factor scores as explanatory, error-free variables, which is exactly why we use hybrid choice models instead, and why those models treat the types of questions that you created your factors from as dependent variables.
Second, any comparison of the LL for the choice model component of a hybrid model (i.e. without the measurement model) is not useful. As shown in many papers, the LL for a reduced form model (i.e. not hybrid) with the same degree of flexibility as the hybrid model cannot have a model fit that is worse than that of the choice model component of a hybrid model.
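As a rough sketch of why this holds for the model posted above, substitute the structural equation for the latent variable into the utility. The symbols $\tilde{\beta}$ and $\tau$ are hypothetical names introduced here for the reduced form parameters, and $\hat{\theta}_{\text{joint}}$ denotes the estimates from the full hybrid model.

```latex
% Utility of "Yes" in the hybrid model, with LV1 = a_1 Gender_n + eta_n
V_n = x_n'\beta + \gamma_1 \alpha_n
    = x_n'\beta + \gamma_1 a_1 \, \text{Gender}_n + \gamma_1 \eta_n

% Reduced form: the same structure with freely estimated parameters,
% where the gender coefficient absorbs b_gender + gamma_1 a_1
V_n^{\text{red}} = x_n'\tilde{\beta} + \tau \, \eta_n, \qquad \eta_n \sim N(0, 1)

% The hybrid choice component is a restricted case of the reduced form
% (tau = gamma_1), evaluated at estimates that also have to fit the indicators
\max_{\tilde{\beta}, \tau} \; LL_{\text{red}}(\tilde{\beta}, \tau)
    \;\ge\; LL_{\text{choice}}(\hat{\theta}_{\text{joint}})
```

The reduced form is fitted to the choices alone, whereas the hybrid estimates trade off fit to the choices against fit to the indicators. The simple logit is the special case $\tau = 0$ of the reduced form, which is consistent with the values in the first post (-170.4397 for the simple model against -179.5322 for the commute_mode component).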
Stephane
Re: LL value for models with and without latent variables
Hi Stephane,
Thank you for your prompt response. This information is helpful. I appreciate it.