I am currently learning choice modelling. As I will do a quite similar project, I have tried to expand the example on mode choice (with socio-demographics) to a Mixed-Logit model. I would greatly appreciate if you could answer a few conceptual questions whether I am on the right track. Maybe this will also help others to get started with choice modelling.
1) My motivation to extend the model would be the following:
- Allow for a potential nest structure (especially between train and bus, as both a public transport)
- Capture random taste for specific options to allow for a better prediction
What kind of model would be suited best to achieve this? Is this MMNL_preference_space sufficient?
2) Does it make sense that the ACSs (except for car which is fixed) and the attribute coefficients will be made random, while the socio-demographics are not?
3) I tried to modify the code to estimate the model as described. However, I am experiencing the following issues:
- WARNING: Your model is using Halton draws in 9 dimensions. You should consider
a different type of draws to avoid issues with collinearity!
- The estimation of the scaled model failed, and the unscaled version
will be returned instead.
- WARNING: Singular Hessian, cannot calculate s.e.
Could not write hessian to a file.
WARNING: Some eigenvalues of the Hessian are complex, indicating that the
Hessian is not symmetrical.
How should I change this example to get a meaningful result that fulfills the motivation of using random coefficients?
Code: Select all
# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS ####
# ################################################################# #
### Clear memory
rm(list = ls())
### Load Apollo library
library(apollo)
### Initialise code
apollo_initialise()
### Set core controls
apollo_control = list(
modelName = "MNL_SP_covariates",
modelDescr = "MNL model with socio-demographics on mode choice SP data",
indivID = "ID",
nCores = 4,
outputDirectory = "output"
)
# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS ####
# ################################################################# #
### Loading data from package
### if data is to be loaded from a file (e.g. called data.csv),
### the code would be: database = read.csv("data.csv",header=TRUE)
database = apollo_modeChoiceData
### for data dictionary, use ?apollo_modeChoiceData
### Use only SP data
database = subset(database,database$SP==1)
### Create new variable with average income
database$mean_income = mean(database$income)
# ################################################################# #
#### ANALYSIS OF CHOICES ####
# ################################################################# #
### Define settings for analysis of choice data to be conducted prior to model estimation
choiceAnalysis_settings <- list(
alternatives = c(car=1, bus=2, air=3, rail=4),
avail = list(car=database$av_car, bus=database$av_bus, air=database$av_air, rail=database$av_rail),
choiceVar = database$choice,
explanators = database[,c("female","business","income")],
rows = (database$SP==1)
)
### Run function to analyse choice data
apollo_choiceAnalysis(choiceAnalysis_settings, apollo_control, database)
# ################################################################# #
#### DEFINE MODEL PARAMETERS ####
# ################################################################# #
### Vector of parameters, including any that are kept fixed in estimation
apollo_beta=c(asc_car = 0,
mu_log_asc_bus = -3,
sigma_log_asc_bus = -0.01,
mu_log_asc_air = -3,
sigma_log_asc_air = -0.01,
mu_log_asc_rail = -3,
sigma_log_asc_rail = -0.01,
asc_bus_shift_female = 0,
asc_air_shift_female = 0,
asc_rail_shift_female = 0,
mu_log_b_tt_car = -3,
sigma_log_b_tt_car = -0.01,
mu_log_b_tt_bus = -3,
sigma_log_b_tt_bus = -0.01,
mu_log_b_tt_air = -3,
sigma_log_b_tt_air = -0.01,
mu_log_b_tt_rail = -3,
sigma_log_b_tt_rail = -0.01,
b_tt_shift_business = 0,
mu_log_b_access = -3,
sigma_log_b_access = -0.01,
mu_log_b_cost = -3,
sigma_log_b_cost = -0.01,
b_cost_shift_business = 0,
cost_income_elast = 0)
### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_beta_fixed = c() if none
apollo_fixed = c("asc_car")
# ################################################################# #
#### DEFINE RANDOM COMPONENTS ####
# ################################################################# #
### Set parameters for generating draws
apollo_draws = list(
interDrawsType = "halton",
interNDraws = 500,
interUnifDraws = c(),
interNormDraws = c("draws_asc_bus","draws_asc_rail","draws_asc_air","draws_b_tt_car", "draws_b_tt_bus", "draws_b_tt_air", "draws_b_tt_rail", "draws_b_access", "draws_b_cost"),
intraDrawsType = "halton",
intraNDraws = 0,
intraUnifDraws = c(),
intraNormDraws = c()
)
### Create random parameters
apollo_randCoeff = function(apollo_beta, apollo_inputs){
randcoeff = list()
randcoeff[["asc_bus"]] = -exp( mu_log_asc_bus + sigma_log_asc_bus * draws_asc_bus)
randcoeff[["asc_rail"]] = -exp( mu_log_asc_rail + sigma_log_asc_rail * draws_asc_rail )
randcoeff[["asc_air"]] = -exp( mu_log_asc_air + sigma_log_asc_air * draws_asc_air )
randcoeff[["b_tt_car"]] = -exp( mu_log_b_tt_car + sigma_log_b_tt_car * draws_b_tt_car )
randcoeff[["b_tt_bus"]] = -exp( mu_log_b_tt_bus + sigma_log_b_tt_bus * draws_b_tt_bus )
randcoeff[["b_tt_air"]] = -exp( mu_log_b_tt_air + sigma_log_b_tt_air * draws_b_tt_air )
randcoeff[["b_tt_rail"]] = -exp( mu_log_b_tt_rail + sigma_log_b_tt_rail * draws_b_tt_rail )
randcoeff[["b_access"]] = -exp( mu_log_b_access + sigma_log_b_access * draws_b_access)
randcoeff[["b_cost"]] = -exp( mu_log_b_cost + sigma_log_b_cost * draws_b_cost)
return(randcoeff)
}
### Read in starting values for at least some parameters from existing model output file
### apollo_beta = apollo_readBeta(apollo_beta, apollo_fixed, "MNL_SP", overwriteFixed=FALSE)
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
### Attach inputs and detach after function exit
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
### Create list of probabilities P
P = list()
### Create alternative specific constants and coefficients using interactions with socio-demographics
asc_bus_value = asc_bus + asc_bus_shift_female * female
asc_air_value = asc_air + asc_air_shift_female * female
asc_rail_value = asc_rail + asc_rail_shift_female * female
b_tt_car_value = b_tt_car + b_tt_shift_business * business
b_tt_bus_value = b_tt_bus + b_tt_shift_business * business
b_tt_air_value = b_tt_air + b_tt_shift_business * business
b_tt_rail_value = b_tt_rail + b_tt_shift_business * business
b_cost_value = ( b_cost + b_cost_shift_business * business ) * ( income / mean_income ) ^ cost_income_elast
### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
V = list()
V[["car"]] = asc_car + b_tt_car_value * time_car + b_cost_value * cost_car
V[["bus"]] = asc_bus_value + b_tt_bus_value * time_bus + b_access * access_bus + b_cost_value * cost_bus
V[["air"]] = asc_air_value + b_tt_air_value * time_air + b_access * access_air + b_cost_value * cost_air
V[["rail"]] = asc_rail_value + b_tt_rail_value * time_rail + b_access * access_rail + b_cost_value * cost_rail
### Define settings for MNL model component
mnl_settings = list(
alternatives = c(car=1, bus=2, air=3, rail=4),
avail = list(car=av_car, bus=av_bus, air=av_air, rail=av_rail),
choiceVar = choice,
utilities = V
)
### Compute probabilities using MNL model
P[["model"]] = apollo_mnl(mnl_settings, functionality)
### Take product across observation for same individual
P = apollo_panelProd(P, apollo_inputs, functionality)
### Average across inter-individual draws
P = apollo_avgInterDraws(P, apollo_inputs, functionality)
### Prepare and return outputs of function
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
# ################################################################# #
#### MODEL ESTIMATION ####
# ################################################################# #
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
# ################################################################# #
#### MODEL OUTPUTS ####
# ################################################################# #
# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN) ----
# ----------------------------------------------------------------- #
apollo_modelOutput(model)
# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name) ----
# ----------------------------------------------------------------- #
apollo_saveOutput(model)
# ################################################################# #
##### POST-PROCESSING ####
# ################################################################# #
### Print outputs of additional diagnostics to new output file (remember to close file writing when complete)
apollo_sink()
Code: Select all
Model name : MMNL_SP_covariates
Model description : MMNL model with socio-demographics on mode choice SP data
Model run at : 2024-03-07 18:07:50.798708
Estimation method : bgw
Model diagnosis : Relative function convergence
Optimisation diagnosis : Non-symmetrical hessian
hessian properties : Complex eigenvalues
maximum eigenvalue : NA
reciprocal of condition number : not calculated (Hessian is not negative definite)
Number of individuals : 500
Number of rows in database : 7000
Number of modelled outcomes : 7000
Number of cores used : 4
Number of inter-individual draws : 500 (halton)
LL(start) : -18984.68
LL at equal shares, LL(0) : -8196.02
LL at observed shares, LL(C) : -6706.94
LL(final) : -5009.14
Rho-squared vs equal shares : 0.3888
Adj.Rho-squared vs equal shares : 0.3859
Rho-squared vs observed shares : 0.2531
Adj.Rho-squared vs observed shares : 0.25
AIC : 10066.28
BIC : 10230.77
Estimated parameters : 24
Time taken (hh:mm:ss) : 01:07:2.41
pre-estimation : 00:11:52.07
estimation : 00:29:14.94
post-estimation : 00:25:55.4
Iterations : 91
Unconstrained optimisation.
Estimates:
Estimate s.e. t.rat.(0) Rob.s.e.
asc_car 0.000000 NA NA NA
mu_log_asc_bus -38.842749 NA NA NA
sigma_log_asc_bus -5.059705 NA NA NA
mu_log_asc_air -48.856391 NA NA NA
sigma_log_asc_air -2.968060 NA NA NA
mu_log_asc_rail 0.491826 NA NA NA
sigma_log_asc_rail 1.9378e-04 NA NA NA
asc_bus_shift_female 0.368284 NA NA NA
asc_air_shift_female 0.254866 NA NA NA
asc_rail_shift_female 0.193771 NA NA NA
mu_log_b_tt_car -4.472520 NA NA NA
sigma_log_b_tt_car -0.072715 NA NA NA
mu_log_b_tt_bus -3.960583 NA NA NA
sigma_log_b_tt_bus -0.055931 NA NA NA
mu_log_b_tt_air -4.091477 NA NA NA
sigma_log_b_tt_air -0.001867 NA NA NA
mu_log_b_tt_rail -5.600530 NA NA NA
sigma_log_b_tt_rail 5.5566e-04 NA NA NA
b_tt_shift_business -0.005791 NA NA NA
mu_log_b_access -3.870151 NA NA NA
sigma_log_b_access -0.011702 NA NA NA
mu_log_b_cost -2.627106 NA NA NA
sigma_log_b_cost 0.015930 NA NA NA
b_cost_shift_business 0.034406 NA NA NA
cost_income_elast -0.647156 NA NA NA
Rob.t.rat.(0)
asc_car NA
mu_log_asc_bus NA
sigma_log_asc_bus NA
mu_log_asc_air NA
sigma_log_asc_air NA
mu_log_asc_rail NA
sigma_log_asc_rail NA
asc_bus_shift_female NA
asc_air_shift_female NA
asc_rail_shift_female NA
mu_log_b_tt_car NA
sigma_log_b_tt_car NA
mu_log_b_tt_bus NA
sigma_log_b_tt_bus NA
mu_log_b_tt_air NA
sigma_log_b_tt_air NA
mu_log_b_tt_rail NA
sigma_log_b_tt_rail NA
b_tt_shift_business NA
mu_log_b_access NA
sigma_log_b_access NA
mu_log_b_cost NA
sigma_log_b_cost NA
b_cost_shift_business NA
cost_income_elast NA
Overview of choices for MNL model component :
car bus air rail
Times available 5446.00 6314.00 5264.00 6118.00
Times chosen 1946.00 358.00 1522.00 3174.00
Percentage chosen overall 27.80 5.11 21.74 45.34
Percentage chosen when available 35.73 5.67 28.91 51.88