Important: Read this before posting to this forum

  1. This forum is for questions related to the use of Apollo. We will answer some general choice modelling questions too, where appropriate, and time permitting. We cannot answer questions about how to estimate choice models with other software packages.
  2. There is a very detailed manual for Apollo available at http://www.ApolloChoiceModelling.com/manual.html. This contains detailed descriptions of the various Apollo functions, and numerous examples are available at http://www.ApolloChoiceModelling.com/examples.html. In addition, help files are available for all functions, using e.g. ?apollo_mnl
  3. Before asking a question on the forum, users are kindly requested to follow these steps:
    1. Check that the same issue has not already been addressed in the forum - there is a search tool.
    2. Ensure that the correct syntax has been used. For any function, detailed instructions are available directly in Apollo, e.g. by using ?apollo_mnl for apollo_mnl
    3. Check the frequently asked questions section on the Apollo website, which discusses some common issues/failures. Please see http://www.apollochoicemodelling.com/faq.html
    4. Make sure that R is using the latest official release of Apollo.
  4. If the above steps do not resolve the issue, then users should follow these steps when posting a question:
    1. provide full details on the issue, including the entire code and output, including any error messages
    2. posts will not immediately appear on the forum, but will be checked by a moderator first. This may take a day or two at busy times. There is no need to submit the post multiple times.

MNL for residential choices

Ask questions about model specifications. Ideally include a mathematical explanation of your proposed model.
Post Reply
lolabdn
Posts: 4
Joined: 05 Apr 2023, 09:45
Location: France

MNL for residential choices

Post by lolabdn »

Hi,

I am working on a MNL model for residential choices. I have 1000 alternatives and each represent a municipality. This work is based on a survey I have been conducting, and all the respondent declared their municipality of residence, and also other information about this choice.

I started work on my model by using the MNL_iterative_coding example. It allowed me to code the utilities with the different attributes of the municipalities. I also want to add a fixed spatial effect and some socio-demo variables (that are not related to the municipalities but to individuals). Yet, when I do it, I obtain NA for the corresponding estimates.

Is there a specific way to do it ? Is there another example I should explore ?

Thank you for your help
Lola
stephanehess
Site Admin
Posts: 998
Joined: 24 Apr 2020, 16:29

Re: MNL for residential choices

Post by stephanehess »

Hi

we really need to see your code and output in order to be able to help

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
lolabdn
Posts: 4
Joined: 05 Apr 2023, 09:45
Location: France

Re: MNL for residential choices

Post by lolabdn »

Hi,

Thanks for your answer
Here is my code :

Code: Select all

###################################################################
#### LOAD LIBRARY AND DEFINE CORE SETTINGS                       ##
###################################################################
#### Library and data 
library(apollo) 

### Clear memory
#rm(list = ls())

#Initialise code
apollo_initialise()

#Set core control
apollo_control = list(
  modelName       = "MNL_RP",
  modelDescr      = "Simple MNL model on mode choice RP data",
  indivID         = "Obs", #Identifiant ménage
  outputDirectory = "output")

###################################################################
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ##
###################################################################

N = nrow(data)
J = 1316 #nb alternatives
K = 8 #nb attributes

database = read_rds("output/data_final_4.rds")#data_final

###################################################################
#### DEFINE MODEL PARAMETERS                                     ##
###################################################################

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c(alpha = 1,
                #beta_ch = 0,
                beta_ruc = 0,
                beta_work = 0,
                beta_mont = 0,
                beta_mer = 0,
                #beta_poll = 0,
                #beta_icu = 0,
                #beta_bruit = 0,
                setNames(rep(0,K),paste0("beta_",1:K)))

### Vector with names (in quotes) of parameters to be kept fixed at their 
# starting value in apollo_beta, use apollo_beta_fixed = c() if none
apollo_fixed = c()


###################################################################
#### GROUP AND VALIDATE INPUTS                                   ##
###################################################################

apollo_inputs = apollo_validateInputs()
apollo_inputs$J = J # need to retain J (number of alternatives) for use
                    # inside apollo_probabilities
apollo_inputs$K = K # need to retain K (number of attributes) for use 
                    # inside apollo_probabilities


###################################################################
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ##
###################################################################

apollo_probabilities=function(apollo_beta, apollo_inputs, 
                              functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### List of utilities: these must use the same names as in mnl_settings, 
  #order is irrelevant
  V = list()
  # for(j in 1:apollo_inputs$J){
  #   V[[paste0("alt_",j)]] = 0
  #   for(k in 1:apollo_inputs$K) V[[paste0("alt_",j)]] = V[[paste0("alt_",j)]] + 
  #       get(paste0("beta_",k))*get(paste0("x_",j,"_",k))
  # }
  
  for(j in 1:apollo_inputs$J){
    V[[paste0("alt_",j)]] = 
       alpha +
       #beta_ch * nb_chambres + #taille_log 
       beta_ruc * log_ruc +
       beta_work * log_dwork +
       beta_mont * vue_montagne +
       beta_mer * vue_mer +
       #beta_poll * pollution +
       #beta_icu * icu_1 +
       #beta_bruit * bruit +
      get(paste0("beta_",1))*get(paste0("x_",j,"_",1)) + #prix m2
      get(paste0("beta_",2))*get(paste0("x_",j,"_",2)) + #log(pop)
      get(paste0("beta_",3))*get(paste0("x_",j,"_",3)) + #densite
      get(paste0("beta_",4))*get(paste0("x_",j,"_",4)) + #jobs
      get(paste0("beta_",5))*get(paste0("x_",j,"_",5)) + #taille men
      get(paste0("beta_",6))*get(paste0("x_",j,"_",6)) + #parc logements
      get(paste0("beta_",7))*get(paste0("x_",j,"_",7)) + #ecoles
      get(paste0("beta_",8))*get(paste0("x_",j,"_",8)) #distance cbd
  }
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives  = setNames(1:apollo_inputs$J, names(V)), 
    avail         = setNames(apollo_inputs$database[,paste0("av_",1:apollo_inputs$J)], names(V)),
    choiceVar     = Choice, #Alternative choisie 
    utilities     = V
  )
  
  ### Compute probabilities using MNL model
  P[["model"]] = apollo_mnl(mnl_settings, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

###################################################################
#### MODEL ESTIMATION                                            ##
###################################################################

model = apollo_estimate(apollo_beta, apollo_fixed, 
                        apollo_probabilities, apollo_inputs)
And the output :

Code: Select all

Calculating log-likelihood at equal shares (LL(0)) for applicable
  models...
Calculating log-likelihood at observed shares from estimation data
  (LL(c)) for applicable models...
Calculating LL of each model component...
Calculating other model fit measures
Computing covariance matrix using analytical gradient.
 0%....25%....50%....75%....100%
WARNING: Some eigenvalues of the Hessian are positive, indicating
  convergence to a saddle point!
Computing score matrix...
Warning message:
In sqrt(diag(varcov)) : NaNs produced

Model run by blandin using Apollo 0.2.8 on R 4.1.0 for Windows.
www.ApolloChoiceModelling.com

Model name                                  : MNL_RP
Model description                           : Simple MNL model on mode choice RP data
Model run at                                : 2023-05-09 11:40:23
Estimation method                           : bfgs
Model diagnosis                             : successful convergence 
Number of individuals                       : 1871
Number of rows in database                  : 1871
Number of modelled outcomes                 : 1871

Number of cores used                        :  1 
Model without mixing

LL(start)                                   : -7319.4
LL at equal shares, LL(0)                   : -7319.4
LL at observed shares, LL(C)                : -12959.74
LL(final)                                   : -7119.83
Rho-squared vs equal shares                  :  0.0273 
Adj.Rho-squared vs equal shares              :  0.0256 
Rho-squared vs observed shares               :  0.4506 
Adj.Rho-squared vs observed shares           :  0.4497 
AIC                                         :  14263.66 
BIC                                         :  14330.07 

Estimated parameters                        :  12
Time taken (hh:mm:ss)                       :  00:05:9.09 
     pre-estimation                         :  00:02:42 
     estimation                             :  00:00:51.72 
     post-estimation                        :  00:01:35.38 
Iterations                                  :  16  
Min abs eigenvalue of Hessian               :  627.5277 
Some eigenvalues of Hessian are positive, indicating potential
  problems!

Unconstrained optimisation.

These outputs have had the scaling used in estimation applied to them.
Estimates:
             Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
beta_ruc   -7.069e-13    0.003174  -2.228e-10    0.069879    -1.012e-11
beta_work  -3.324e-14         NaN         NaN    1.261392    -2.635e-14
beta_mont  -3.640e-14         NaN         NaN    5.128068    -7.098e-15
beta_mer   -3.212e-14    0.242989  -1.322e-13   10.101880    -3.180e-15
beta_1      -0.046473    0.021337     -2.1780    0.018858       -2.4644
beta_2       0.827930    0.099239      8.3428    0.113729        7.2799
beta_3      -0.070419    0.012101     -5.8190    0.011746       -5.9952
beta_4      -0.001739  3.3729e-04     -5.1558  3.3722e-04       -5.1570
beta_5      -1.637894    0.155710    -10.5189    0.154312      -10.6142
beta_6      -0.739536    0.108651     -6.8066    0.122292       -6.0473
beta_7       0.013343    0.001704      7.8289    0.001796        7.4290
beta_8      -0.002197    0.003162     -0.6949    0.003004       -0.7313
If I use different socio-demo variables, sometimes there are no NA...

Thank you again

Lola
dpalma
Posts: 190
Joined: 24 Apr 2020, 17:54

Re: MNL for residential choices

Post by dpalma »

Hi Lola,

From looking at your code, I reckon that your deterministic utility functions are as follows:

V_j = a + b_z*z + b_x*x_j

Where j is the index of the alternative, and z are sociodemographic characteristics of the respondent.

The problem with this formulation is that z variables are the same across all alternatives, so when you do (for example) V_1 - V_2, b_z cancels out. As you are probably aware, only differences in utility matter in MNL models, so your b_z parameters (beta_ruc, beta_wor, beta_mont, and beta_mer) would not be identified. And these are precisely the parameters that are causing problems in your results. For the same reason you do not need the alpha parameter either.

The most traditional way to introduce sociodemographics in a model like yours would be to interact them with the b_x coefficients. So for example, people currently with view to the sea might have a different sensibility to price than other people. Yo would then do something like:

Code: Select all

V[[paste0("alt_",j)]] = 
      (get(paste0("beta_",1)) + beta_mer * vue_mer)*get(paste0("x_",j,"_",1)) + #prix m2
      get(paste0("beta_",2))*get(paste0("x_",j,"_",2)) + #log(pop)
      get(paste0("beta_",3))*get(paste0("x_",j,"_",3)) + #densite
      get(paste0("beta_",4))*get(paste0("x_",j,"_",4)) + #jobs
      get(paste0("beta_",5))*get(paste0("x_",j,"_",5)) + #taille men
      get(paste0("beta_",6))*get(paste0("x_",j,"_",6)) + #parc logements
      get(paste0("beta_",7))*get(paste0("x_",j,"_",7)) + #ecoles
      get(paste0("beta_",8))*get(paste0("x_",j,"_",8)) #distance cbd
  }
Section 2.5 from Train's book [Train (2009) Discrete choice methods with simulation, Cambridge University Press] contains more details on model identification.

Best wishes
David
lolabdn
Posts: 4
Joined: 05 Apr 2023, 09:45
Location: France

Re: MNL for residential choices

Post by lolabdn »

Hi,

Thank you so much for your help David!

Best wishes
Lola
Post Reply