Important: Read this before posting to this forum

  1. This forum is for questions related to the use of Apollo. We will answer some general choice modelling questions too, where appropriate, and time permitting. We cannot answer questions about how to estimate choice models with other software packages.
  2. There is a very detailed manual for Apollo available at http://www.ApolloChoiceModelling.com/manual.html. This contains detailed descriptions of the various Apollo functions, and numerous examples are available at http://www.ApolloChoiceModelling.com/examples.html. In addition, help files are available for all functions, using e.g. ?apollo_mnl
  3. Before asking a question on the forum, users are kindly requested to follow these steps:
    1. Check that the same issue has not already been addressed in the forum - there is a search tool.
    2. Ensure that the correct syntax has been used. For any function, detailed instructions are available directly in Apollo, e.g. by using ?apollo_mnl for apollo_mnl
    3. Check the frequently asked questions section on the Apollo website, which discusses some common issues/failures. Please see http://www.apollochoicemodelling.com/faq.html
    4. Make sure that R is using the latest official release of Apollo.
  4. If the above steps do not resolve the issue, then users should follow these steps when posting a question:
    1. provide full details on the issue, including the entire code and output, including any error messages
    2. posts will not immediately appear on the forum, but will be checked by a moderator first. This may take a day or two at busy times. There is no need to submit the post multiple times.

(Very) Long Pre-estimation Time for MNL

Ask questions about how to estimate models and how to change your settings for estimation.
Post Reply
felix
Posts: 4
Joined: 28 Nov 2024, 16:52

(Very) Long Pre-estimation Time for MNL

Post by felix »

Hi everyone,
First of all, thanks for providing this very useful package and especially the detailed documentation and examples. I found the "step-by-step" approach extremely useful and the warning and error messages helped me to eliminate bugs (in my data) easily.

Overall goal
I am looking for help regarding the following problem. My final goal is to estimate a mixed logit model with around 10-15 parameters on a dataset with 6861 rows (individuals). The data is cross-sectional, i.e. I don't observe repeated choices per individual. Originally, me and my team had the goal to model each individual's choice set consisting of around 1,000 alternatives (in reality, these are granular locations). We are also hoping to estimate this model many times, in order to iteratively vary a fixed parameter of the model (not yet included).

Current Status and Problem
To test the package, I decided to start with 50 alternatives per individual. As a start, I wanted to estimate a MNL model to retrieve estimates I can later use as starting values. However, the (pre-)estimation of the MNL appears to be very demanding, and I am not exactly sure why and how it could be improved. It appears to me that the "only" non-standard aspect of my setting is the fact that each individual has a larger number of alternatives (here 50, later hopefully more) and that these choice sets of alternatives are mostly not overlapping (for the 6861 individuals with 50 choices each I arrive at 30521 distinct alternatives).

So far, the MNL estimation takes
on 1 core:
Time taken (hh:mm:ss) : 21:24:44.54
pre-estimation : 20:26:0.9
estimation : 00:27:11.54
post-estimation : 00:31:32.1

and on 2 cores:
Time taken (hh:mm:ss) : 135:01:0.02
pre-estimation : 109:39:17.24
estimation : 00:21:44.85
post-estimation : 24:59:57.93

So as you can see, the vast majority of time is pre and post estimation. So questions I have are:
- what can I do to reduce this pre and post estimation time?
- are there any recommended options, especially when proceed towards more alternatives and the mixed logit estimation, that I could benefit from?
I am aware of the speed test function of the package, which in principle is great but does not really help me to understand what the underlying issue is.

Any help would be greatly appreciated.
Best wishes,
Felix

Please find the code below. I add the output after it while omitting some parts of the "Overview of choices for MNL model component". I can send the full .txt file in a message at any time. My code is:

Code: Select all



### Clear memory
rm(list = ls())

### Set working directory


### Packages
library(tidyverse)
library(apollo)
library(data.table)
library(purrr)


# Load Data
load("data_output/choicedata_apollo_cutoff50.RDS")

# Model options -----
NCores = 1
name = paste0("MNL_realdata_3fits_cutoff50_cores",NCores)
description = "MNL model only 3 fit variables on real data, cutoff=50"


# Specify either search = 1 (in that case initial_betas are irrelevant) 
#             or search = 0 and initial_betas based on some stored values or guesses.
initial_betas = c(mu_partisan_fit = -1, 
                  mu_income_fit = 1,
                  mu_race_fit = 1)


# Create directory for output
output_directory = paste0("output/", name)
dir.create(output_directory)






# Estimate ----
# Apollo Choice Modeling: MNL Model, Test Data


### Initialise code ----
apollo_initialise()

# Get number of cores of computer
# nCores = parallel::detectCores()


### Set core controls
apollo_control = list(
  modelName       = name,
  modelDescr      = description,
  indivID         = "caseid", 
  outputDirectory = output_directory,
  calculateLLC=FALSE,
  workInLogs=TRUE,
  nCores          = NCores #use few cores for the simple estimations, but use multithreading to catch potential errors.
)



# Define Model & Parameters ----

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = initial_betas

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_beta_fixed = c() if none
apollo_fixed = c()

# Group and validaty inputs ---- 
apollo_inputs = apollo_validateInputs(database=database_wide)
apollo_inputs = c(apollo_inputs, J=J)
  # note: J is the number of distinct alternatives in the dataset. It was created via the following code (on a dataset in long format)
  #   unique_alts <- unique(database_long$alt_number)
  #   J <- length(unique_alts)


# Define Model & Likelihood Function ----
apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### List of utilities: will be included in the ml_settings below. 
  V = list()
  for(j in 1:apollo_inputs$J) V[[paste0("alt",j)]] = 0
  for(j in 1:apollo_inputs$J) V[[paste0("alt",j)]] = V[[paste0("alt",j)]] + mu_partisan_fit * get(paste0("partisan_fit_",j)) + mu_income_fit * get(paste0("income_fit_",j)) + mu_race_fit * get(paste0("race_fit_",j))
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives = setNames(1:apollo_inputs$J, names(V)),
    avail        = setNames(apollo_inputs$database[,paste0("av",1:apollo_inputs$J)], names(V)),
    choiceVar    = choice_alt_number,
    utilities    = V
  )
  
  ### Compute probabilities for MNL model component
  P[["model"]] = apollo_mnl(mnl_settings, functionality)
  
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}


## Estimate model
assign(paste0("model_",name), apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs))


# Model Outputs ----
# FORMATTED OUTPUT (TO FILE, using model name)
apollo_saveOutput(get(paste0("model_",name)))

# FORMATTED OUTPUT (TO SCREEN)
apollo_modelOutput(get(paste0("model_",name)))


And the output is:
Model run by user using Apollo 0.3.4 on R 4.4.1 for Linux.
Please acknowledge the use of Apollo by citing Hess & Palma (2019)
DOI 10.1016/j.jocm.2019.100170
www.ApolloChoiceModelling.com

Model name : MNL_realdata_3fits_cutoff50_cores2
Model description : MNL model only 3 fit variables on real data, cutoff=50
Model run at : 2024-11-30 17:37:15.901963
Estimation method : bgw
Model diagnosis : Relative function convergence
Optimisation diagnosis : Maximum found
hessian properties : Negative definite
maximum eigenvalue : -167.544817
reciprocal of condition number : 0.0879228
Number of individuals : 6861
Number of rows in database : 6861
Number of modelled outcomes : 6861

Number of cores used : 2
Model without mixing

LL(start) : -33616.88
LL at equal shares, LL(0) : -26976.26
LL at observed shares, LL(C) : Not applicable
LL(final) : -25637.73
Rho-squared vs equal shares : 0.0496
Adj.Rho-squared vs equal shares : 0.0495
Rho-squared vs observed shares : Not applicable
Adj.Rho-squared vs observed shares : Not applicable
AIC : 51281.47
BIC : 51301.97

Estimated parameters : 3
Time taken (hh:mm:ss) : 135:01:0.02
pre-estimation : 109:39:17.24
estimation : 00:21:44.85
post-estimation : 24:59:57.93
Iterations : 15

Unconstrained optimisation.

Estimates:
Estimate s.e. t.rat.(0) Rob.s.e. Rob.t.rat.(0)
mu_partisan_fit -1.8452 0.07681 -24.02 0.08432 -21.88
mu_income_fit -0.9386 0.02312 -40.59 0.02590 -36.23
mu_race_fit -0.7618 0.05482 -13.90 0.06120 -12.45

WARNING: Some alternatives are never chosen in your data!
WARNING: Some alternatives are always chosen when available!

Overview of choices for MNL model component :
alt1 alt2 alt3 alt4 alt5 alt6 alt7 alt8 alt9
Times available 13 9 16 9 20 15 18 15.00 13
Times chosen 0 0 0 0 0 0 0 1.00 0
Percentage chosen overall 0 0 0 0 0 0 0 0.01 0
alt10 alt11 alt12 alt13 alt14 alt15 alt16
Times available 8 16 12 10 15 10 10
Times chosen 0 0 0 0 0 0 0
Percentage chosen overall 0 0 0 0 0 0 0
alt17 alt18 alt19 alt20 alt21 alt22 alt23
Times available 7.00 8 22 18 18.00 20 16
Times chosen 1.00 0 0 0 5.00 0 0
Percentage chosen overall 0.01 0 0 0 0.07 0 0
alt24 alt25 alt26 alt27 alt28 alt29 alt30
Times available 14 11 16 12 17 11 17
Times chosen 0 0 0 0 0 0 0
Percentage chosen overall 0 0 0 0 0 0 0

...

alt30511 alt30512 alt30513 alt30514 alt30515
Times available 2 1 5 7 1
Times chosen 0 0 0 0 0
Percentage chosen overall 0 0 0 0 0
alt30516 alt30517 alt30518 alt30519 alt30520
Times available 1 1 1 1e+00 1
Times chosen 0 0 0 1e+00 0
Percentage chosen overall 0 0 0 1e-02 0
alt30521
Times available 1
Times chosen 0
Percentage chosen overall 0
[ reached getOption("max.print") -- omitted 1 row ]



Classical covariance matrix:
mu_partisan_fit mu_income_fit mu_race_fit
mu_partisan_fit 0.005900 3.977e-05 -4.4966e-04
mu_income_fit 3.977e-05 5.3470e-04 1.5011e-04
mu_race_fit -4.4966e-04 1.5011e-04 0.003005

Robust covariance matrix:
mu_partisan_fit mu_income_fit mu_race_fit
mu_partisan_fit 0.007109 -2.019e-05 -0.001722
mu_income_fit -2.019e-05 6.7107e-04 3.5045e-04
mu_race_fit -0.001722 3.5045e-04 0.003746

Classical correlation matrix:
mu_partisan_fit mu_income_fit mu_race_fit
mu_partisan_fit 1.00000 0.02239 -0.1068
mu_income_fit 0.02239 1.00000 0.1184
mu_race_fit -0.10678 0.11842 1.0000

Robust correlation matrix:
mu_partisan_fit mu_income_fit mu_race_fit
mu_partisan_fit 1.000000 -0.009243 -0.3337
mu_income_fit -0.009243 1.000000 0.2210
mu_race_fit -0.333712 0.221034 1.0000

20 most extreme outliers in terms of lowest average per choice prediction:
row Avg prob per choice
23516 1.454843e-05
1708 6.548501e-05
21493 8.255964e-05
38269 9.717677e-05
16676 1.944270e-04
45681 3.019932e-04
44312 3.308913e-04
32562 4.128866e-04
32759 6.054975e-04
28952 6.447211e-04
4592 6.638346e-04
33270 7.613318e-04
23741 8.128866e-04
38883 9.383360e-04
41965 1.042226e-03
3827 1.080671e-03
21300 1.210480e-03
34844 1.434141e-03
36240 1.565656e-03
21989 1.577433e-03

Changes in parameter estimates from starting values:
Initial Estimate Difference
mu_partisan_fit -1.000 -1.8452 -0.8452
mu_income_fit 1.000 -0.9386 -1.9386
mu_race_fit 1.000 -0.7618 -1.7618

Settings and functions used in model definition:

apollo_control
--------------
Value
modelName "MNL_realdata_3fits_cutoff50_cores2"
modelDescr "MNL model only 3 fit variables on real data, cutoff=50"
indivID "caseid"
outputDirectory "output/MNL_realdata_3fits_cutoff50_cores2/"
calculateLLC "FALSE"
workInLogs "FALSE"
nCores "2"
debug "FALSE"
seed "13"
mixing "FALSE"
HB "FALSE"
noValidation "FALSE"
noDiagnostics "FALSE"
analyticHessian "FALSE"
memorySaver "FALSE"
panelData "FALSE"
analyticGrad "TRUE"
analyticGrad_manualSet "FALSE"
overridePanel "FALSE"
preventOverridePanel "FALSE"
noModification "FALSE"

Hessian routines attempted
--------------------------
numerical second derivative of LL (using numDeriv)

Scaling used in computing Hessian
---------------------------------
Value
mu_partisan_fit 1.8452117
mu_income_fit 0.9386108
mu_race_fit 0.7617979


apollo_probabilities
----------------------
function (apollo_beta, apollo_inputs, functionality = "estimate")
{
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P = list()
V = list()
for (j in 1:apollo_inputs$J) V[[paste0("alt", j)]] = 0
for (j in 1:apollo_inputs$J) V[[paste0("alt", j)]] = V[[paste0("alt",
j)]] + mu_partisan_fit * get(paste0("partisan_fit_",
j)) + mu_income_fit * get(paste0("income_fit_", j)) +
mu_race_fit * get(paste0("race_fit_", j))
mnl_settings = list(alternatives = setNames(1:apollo_inputs$J,
names(V)), avail = setNames(apollo_inputs$database[,
paste0("av", 1:apollo_inputs$J)], names(V)), choiceVar = choice_alt_number,
utilities = V)
P[["model"]] = apollo_mnl(mnl_settings, functionality)
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
stephanehess
Site Admin
Posts: 1232
Joined: 24 Apr 2020, 16:29

Re: (Very) Long Pre-estimation Time for MNL

Post by stephanehess »

Felix

this is an interesting test case. If you could share your data and code with me outside the forum, even for this smaller example, then I could investigate time saving options for you

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
felix
Posts: 4
Joined: 28 Nov 2024, 16:52

Re: (Very) Long Pre-estimation Time for MNL

Post by felix »

Thanks, I sent you a message. Please let me know in case you did not receive it (it does not appear in my "sent messages" but in my "outbox")
stephanehess
Site Admin
Posts: 1232
Joined: 24 Apr 2020, 16:29

Re: (Very) Long Pre-estimation Time for MNL

Post by stephanehess »

Hi

I had an initial look at your data and code.

Your total number of alternatives J is much larger than what is available for any given individual, as you note. The way Apollo works is that utilities and probabilities are calculated for all alternatives, including in your case unavailable ones and this is leading a lot of unnecessary calculations in your case. As your settings is (I believe) unlabelled, you could make things much easier by simply pre-processing your data so that the number of alternatives in your model and data is simply the max of available ones for any given individual. So in the version you sent me, instead of having 30521 alternatives, you would only have 50

Also, best if you could avoid the use of NA in your data

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
felix
Posts: 4
Joined: 28 Nov 2024, 16:52

Re: (Very) Long Pre-estimation Time for MNL

Post by felix »

Hi Stephane,

Thanks for having a look at this. I will implement it in the way you suggested, which would then also enable me to use the provided function to assess computing time properly.

Best,
Felix
felix
Posts: 4
Joined: 28 Nov 2024, 16:52

Re: (Very) Long Pre-estimation Time for MNL

Post by felix »

Hi Stephane,
Thanks again for your help, I implemented your suggestions (I now have unlabelled alternatives, and no missing values) and it works great now. I have one more related follow-up question. I am now able to estimate a mixed logit model, and using my data and a slightly larger model I get the following estimation times:

Estimated parameters : 27
Time taken (hh:mm:ss) : 02:39:1.82
pre-estimation : 00:34:32.19
estimation : 00:18:43.39
post-estimation : 01:45:46.23
Iterations : 25

As you can see, the estimation takes only a fraction of the overall time. In my setting, I need to run this model iteratively, and for all but the last estimation I don't care about the covariance matrix or anything but the estimated coefficients. I suspect (and I can see during the estimation) that a large part of the time it takes is not coming from the maximization. Is there an option to omit all or most pre or/and post-estimation tasks?
I saw that the "functionality" option takes several inputs, but I am not entirely sure what they mean, and I couldn't find an option in the accompanying guide.

Best,
Felix
stephanehess
Site Admin
Posts: 1232
Joined: 24 Apr 2020, 16:29

Re: (Very) Long Pre-estimation Time for MNL

Post by stephanehess »

Felix

pre-estimation takes a while as this is when the analytical derivatives are prepared. The time taken there is made up for by fast estimation time.

You can turn off the covariance matrix calculations by using hessianRoutine=FALSE in estimate_settings

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
Post Reply