Hi,
I am working on a MNL model for residential choices. I have 1000 alternatives and each represent a municipality. This work is based on a survey I have been conducting, and all the respondent declared their municipality of residence, and also other information about this choice.
I started work on my model by using the MNL_iterative_coding example. It allowed me to code the utilities with the different attributes of the municipalities. I also want to add a fixed spatial effect and some socio-demo variables (that are not related to the municipalities but to individuals). Yet, when I do it, I obtain NA for the corresponding estimates.
Is there a specific way to do it ? Is there another example I should explore ?
Thank you for your help
Lola
Important: Read this before posting to this forum
- This forum is for questions related to the use of Apollo. We will answer some general choice modelling questions too, where appropriate, and time permitting. We cannot answer questions about how to estimate choice models with other software packages.
- There is a very detailed manual for Apollo available at http://www.ApolloChoiceModelling.com/manual.html. This contains detailed descriptions of the various Apollo functions, and numerous examples are available at http://www.ApolloChoiceModelling.com/examples.html. In addition, help files are available for all functions, using e.g. ?apollo_mnl
- Before asking a question on the forum, users are kindly requested to follow these steps:
- Check that the same issue has not already been addressed in the forum - there is a search tool.
- Ensure that the correct syntax has been used. For any function, detailed instructions are available directly in Apollo, e.g. by using ?apollo_mnl for apollo_mnl
- Check the frequently asked questions section on the Apollo website, which discusses some common issues/failures. Please see http://www.apollochoicemodelling.com/faq.html
- Make sure that R is using the latest official release of Apollo.
- Users can check which version they are running by entering packageVersion("apollo").
- Then check what is the latest full release (not development version) at http://www.ApolloChoiceModelling.com/code.html.
- To update to the latest official version, just enter install.packages("apollo"). To update to a development version, download the appropriate binary file from http://www.ApolloChoiceModelling.com/code.html, and install the package from file
- If the above steps do not resolve the issue, then users should follow these steps when posting a question:
- provide full details on the issue, including the entire code and output, including any error messages
- posts will not immediately appear on the forum, but will be checked by a moderator first. This may take a day or two at busy times. There is no need to submit the post multiple times.
MNL for residential choices
-
- Site Admin
- Posts: 998
- Joined: 24 Apr 2020, 16:29
Re: MNL for residential choices
Hi
we really need to see your code and output in order to be able to help
Stephane
we really need to see your code and output in order to be able to help
Stephane
Re: MNL for residential choices
Hi,
Thanks for your answer
Here is my code :
And the output :
If I use different socio-demo variables, sometimes there are no NA...
Thank you again
Lola
Thanks for your answer
Here is my code :
Code: Select all
###################################################################
#### LOAD LIBRARY AND DEFINE CORE SETTINGS ##
###################################################################
#### Library and data
library(apollo)
### Clear memory
#rm(list = ls())
#Initialise code
apollo_initialise()
#Set core control
apollo_control = list(
modelName = "MNL_RP",
modelDescr = "Simple MNL model on mode choice RP data",
indivID = "Obs", #Identifiant ménage
outputDirectory = "output")
###################################################################
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS ##
###################################################################
N = nrow(data)
J = 1316 #nb alternatives
K = 8 #nb attributes
database = read_rds("output/data_final_4.rds")#data_final
###################################################################
#### DEFINE MODEL PARAMETERS ##
###################################################################
### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c(alpha = 1,
#beta_ch = 0,
beta_ruc = 0,
beta_work = 0,
beta_mont = 0,
beta_mer = 0,
#beta_poll = 0,
#beta_icu = 0,
#beta_bruit = 0,
setNames(rep(0,K),paste0("beta_",1:K)))
### Vector with names (in quotes) of parameters to be kept fixed at their
# starting value in apollo_beta, use apollo_beta_fixed = c() if none
apollo_fixed = c()
###################################################################
#### GROUP AND VALIDATE INPUTS ##
###################################################################
apollo_inputs = apollo_validateInputs()
apollo_inputs$J = J # need to retain J (number of alternatives) for use
# inside apollo_probabilities
apollo_inputs$K = K # need to retain K (number of attributes) for use
# inside apollo_probabilities
###################################################################
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ##
###################################################################
apollo_probabilities=function(apollo_beta, apollo_inputs,
functionality="estimate"){
### Attach inputs and detach after function exit
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
### Create list of probabilities P
P = list()
### List of utilities: these must use the same names as in mnl_settings,
#order is irrelevant
V = list()
# for(j in 1:apollo_inputs$J){
# V[[paste0("alt_",j)]] = 0
# for(k in 1:apollo_inputs$K) V[[paste0("alt_",j)]] = V[[paste0("alt_",j)]] +
# get(paste0("beta_",k))*get(paste0("x_",j,"_",k))
# }
for(j in 1:apollo_inputs$J){
V[[paste0("alt_",j)]] =
alpha +
#beta_ch * nb_chambres + #taille_log
beta_ruc * log_ruc +
beta_work * log_dwork +
beta_mont * vue_montagne +
beta_mer * vue_mer +
#beta_poll * pollution +
#beta_icu * icu_1 +
#beta_bruit * bruit +
get(paste0("beta_",1))*get(paste0("x_",j,"_",1)) + #prix m2
get(paste0("beta_",2))*get(paste0("x_",j,"_",2)) + #log(pop)
get(paste0("beta_",3))*get(paste0("x_",j,"_",3)) + #densite
get(paste0("beta_",4))*get(paste0("x_",j,"_",4)) + #jobs
get(paste0("beta_",5))*get(paste0("x_",j,"_",5)) + #taille men
get(paste0("beta_",6))*get(paste0("x_",j,"_",6)) + #parc logements
get(paste0("beta_",7))*get(paste0("x_",j,"_",7)) + #ecoles
get(paste0("beta_",8))*get(paste0("x_",j,"_",8)) #distance cbd
}
### Define settings for MNL model component
mnl_settings = list(
alternatives = setNames(1:apollo_inputs$J, names(V)),
avail = setNames(apollo_inputs$database[,paste0("av_",1:apollo_inputs$J)], names(V)),
choiceVar = Choice, #Alternative choisie
utilities = V
)
### Compute probabilities using MNL model
P[["model"]] = apollo_mnl(mnl_settings, functionality)
### Prepare and return outputs of function
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
###################################################################
#### MODEL ESTIMATION ##
###################################################################
model = apollo_estimate(apollo_beta, apollo_fixed,
apollo_probabilities, apollo_inputs)
Code: Select all
Calculating log-likelihood at equal shares (LL(0)) for applicable
models...
Calculating log-likelihood at observed shares from estimation data
(LL(c)) for applicable models...
Calculating LL of each model component...
Calculating other model fit measures
Computing covariance matrix using analytical gradient.
0%....25%....50%....75%....100%
WARNING: Some eigenvalues of the Hessian are positive, indicating
convergence to a saddle point!
Computing score matrix...
Warning message:
In sqrt(diag(varcov)) : NaNs produced
Model run by blandin using Apollo 0.2.8 on R 4.1.0 for Windows.
www.ApolloChoiceModelling.com
Model name : MNL_RP
Model description : Simple MNL model on mode choice RP data
Model run at : 2023-05-09 11:40:23
Estimation method : bfgs
Model diagnosis : successful convergence
Number of individuals : 1871
Number of rows in database : 1871
Number of modelled outcomes : 1871
Number of cores used : 1
Model without mixing
LL(start) : -7319.4
LL at equal shares, LL(0) : -7319.4
LL at observed shares, LL(C) : -12959.74
LL(final) : -7119.83
Rho-squared vs equal shares : 0.0273
Adj.Rho-squared vs equal shares : 0.0256
Rho-squared vs observed shares : 0.4506
Adj.Rho-squared vs observed shares : 0.4497
AIC : 14263.66
BIC : 14330.07
Estimated parameters : 12
Time taken (hh:mm:ss) : 00:05:9.09
pre-estimation : 00:02:42
estimation : 00:00:51.72
post-estimation : 00:01:35.38
Iterations : 16
Min abs eigenvalue of Hessian : 627.5277
Some eigenvalues of Hessian are positive, indicating potential
problems!
Unconstrained optimisation.
These outputs have had the scaling used in estimation applied to them.
Estimates:
Estimate s.e. t.rat.(0) Rob.s.e. Rob.t.rat.(0)
beta_ruc -7.069e-13 0.003174 -2.228e-10 0.069879 -1.012e-11
beta_work -3.324e-14 NaN NaN 1.261392 -2.635e-14
beta_mont -3.640e-14 NaN NaN 5.128068 -7.098e-15
beta_mer -3.212e-14 0.242989 -1.322e-13 10.101880 -3.180e-15
beta_1 -0.046473 0.021337 -2.1780 0.018858 -2.4644
beta_2 0.827930 0.099239 8.3428 0.113729 7.2799
beta_3 -0.070419 0.012101 -5.8190 0.011746 -5.9952
beta_4 -0.001739 3.3729e-04 -5.1558 3.3722e-04 -5.1570
beta_5 -1.637894 0.155710 -10.5189 0.154312 -10.6142
beta_6 -0.739536 0.108651 -6.8066 0.122292 -6.0473
beta_7 0.013343 0.001704 7.8289 0.001796 7.4290
beta_8 -0.002197 0.003162 -0.6949 0.003004 -0.7313
Thank you again
Lola
Re: MNL for residential choices
Hi Lola,
From looking at your code, I reckon that your deterministic utility functions are as follows:
V_j = a + b_z*z + b_x*x_j
Where j is the index of the alternative, and z are sociodemographic characteristics of the respondent.
The problem with this formulation is that z variables are the same across all alternatives, so when you do (for example) V_1 - V_2, b_z cancels out. As you are probably aware, only differences in utility matter in MNL models, so your b_z parameters (beta_ruc, beta_wor, beta_mont, and beta_mer) would not be identified. And these are precisely the parameters that are causing problems in your results. For the same reason you do not need the alpha parameter either.
The most traditional way to introduce sociodemographics in a model like yours would be to interact them with the b_x coefficients. So for example, people currently with view to the sea might have a different sensibility to price than other people. Yo would then do something like:
Section 2.5 from Train's book [Train (2009) Discrete choice methods with simulation, Cambridge University Press] contains more details on model identification.
Best wishes
David
From looking at your code, I reckon that your deterministic utility functions are as follows:
V_j = a + b_z*z + b_x*x_j
Where j is the index of the alternative, and z are sociodemographic characteristics of the respondent.
The problem with this formulation is that z variables are the same across all alternatives, so when you do (for example) V_1 - V_2, b_z cancels out. As you are probably aware, only differences in utility matter in MNL models, so your b_z parameters (beta_ruc, beta_wor, beta_mont, and beta_mer) would not be identified. And these are precisely the parameters that are causing problems in your results. For the same reason you do not need the alpha parameter either.
The most traditional way to introduce sociodemographics in a model like yours would be to interact them with the b_x coefficients. So for example, people currently with view to the sea might have a different sensibility to price than other people. Yo would then do something like:
Code: Select all
V[[paste0("alt_",j)]] =
(get(paste0("beta_",1)) + beta_mer * vue_mer)*get(paste0("x_",j,"_",1)) + #prix m2
get(paste0("beta_",2))*get(paste0("x_",j,"_",2)) + #log(pop)
get(paste0("beta_",3))*get(paste0("x_",j,"_",3)) + #densite
get(paste0("beta_",4))*get(paste0("x_",j,"_",4)) + #jobs
get(paste0("beta_",5))*get(paste0("x_",j,"_",5)) + #taille men
get(paste0("beta_",6))*get(paste0("x_",j,"_",6)) + #parc logements
get(paste0("beta_",7))*get(paste0("x_",j,"_",7)) + #ecoles
get(paste0("beta_",8))*get(paste0("x_",j,"_",8)) #distance cbd
}
Best wishes
David
Re: MNL for residential choices
Hi,
Thank you so much for your help David!
Best wishes
Lola
Thank you so much for your help David!
Best wishes
Lola