Multinomial logit estimation gets stuck

roussanoff · Post by **roussanoff** » 17 Sep 2021, 23:45

I am estimating a multinomial logit model. I use the simplest specification similar to Example 3, just to see if I specify everything correctly. My data has many individuals choosing many locations, and so I define the utilities and the availability objects (V and A) using a lot of get() statements, because there are hundreds of choices. Also, my data is "sparse", meaning that there are some alternatives that are never chosen.

The estimation takes about 30 minutes on a High Performance Cluster. The estimation converges without raising any problems After that, it is stuck at the following step (the last line):

Code: Select all

Computing covariance matrix using numerical methods (numDeriv).
 0%....25%....50%....75%....100%
Negative definite Hessian with maximum eigenvalue: -27.102437
Computing score matrix...
Calculating LL(0) for applicable models...
Calculating LL(c) for applicable models...

After 15 hours of waiting, I killed the process because that did not seem reasonable (given that estimation itself took only 30 minutes).

My questions are:
- what does the LL(c) calculation do, exactly? is there a way to disable it in the seetings
- do you have a suggestion for why it might take up so much time/cause issues?

My code is below
### Load Apollo library
library(apollo)
library(data.table)

### Initialise code
apollo_initialise()
parallel::detectCores() #check how many cores the system has

### Set core controls
apollo_control = list(
modelName = "Logit_sch_location",
modelDescr = "Plain logit model",
indivID = "ind",
nCores = 12, #set the # of cores, should be number of cpus 1
outputDirectory = paste0(output_dir, "res_apollo/") #set the output directory
)

# # ################################################################# #
# #### LOAD DATA AND APPLY ANY TRANSFORMATIONS ####
# # ################################################################# #

database = fread(paste0(data_dir, "mydata.csv"))

# ################################################################# #
#### DEFINE MODEL PARAMETERS ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta=c(bprice = 1,
# family chars
bage = 1,
bagesq = -1,
bsex = -1,
bl1prealproppc = 1,
bschool = 1,
# location choice
bldistance = -1,
bcorn = -1,
bwheat = -1)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_beta_fixed = c() if none
apollo_fixed = c()

# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){

### Attach inputs and detach after function exit
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))

# lists of alternatives (created directly from database)
list_alt <- gsub( "PRICE_", "", unique(grep("^PRICE_[^_]+$", colnames(apollo_inputs$database), perl = F, value=T)))
alternatives_set <- list_alt
names(alternatives_set) <- as.character(list_alt)
list_alt50 <- grep("^50.+$", alternatives_set, perl = F, value=T)
list_alt60 <- grep("^60.+$", alternatives_set, perl = F, value=T)
list_alt70 <- grep("^70.+$", alternatives_set, perl = F, value=T)

list_loc <- unique(substr(list_alt, 1, nchar(list_alt)-1)) # each alternative exists twice: school/non-sch, hence unique()
list_loc50 <- unique(substr(list_alt50, 1, nchar(list_alt)-1))
list_loc60 <- unique(substr(list_alt60, 1, nchar(list_alt)-1))
list_loc70 <- unique(substr(list_alt70, 1, nchar(list_alt)-1))

bpl_dummies <- grep("dBPL_MOM", names(apollo_inputs$database), perl = F, fixed = T, value=T)
betas_bpl_dummies <- gsub("dBPL_MOM", "bbpl", bpl_dummies)
expression_dummies <- paste0(paste(bpl_dummies, betas_bpl_dummies, sep= "*"), collapse = " + ")

### Create list of probabilities P
P = list() #probabilities of choice, will be filled automatically
V = list() #utility (except for the type-1 component), need to populate
A = list() # adjust availability based on year

### Create alternative specific constants and coefficients using interactions with socio-demographics
#inspired by this thread on Apollo forum: http://www.apollochoicemodelling.com/fo ... ?f=11&t=27
for (ylocsch_iter in list_alt){
sch_iter <- as.integer(substr(ylocsch_iter, nchar(ylocsch_iter), nchar(ylocsch_iter))) #last digit
loc_iter <- substr(ylocsch_iter, 1, nchar(ylocsch_iter)-1) #everything but the last digit
A[[ylocsch_iter]] = get( paste0("av_", ylocsch_iter) )
if (sch_iter == 0) {
V[[ylocsch_iter ]] = bldistance*get(paste0("LOG_DISTANCE_", ylocsch_iter)) + bcorn*get(paste0("distCorntoPot_", ylocsch_iter)) +
bwheat*get(paste0("distWheatoPot_", ylocsch_iter))
}
else if (sch_iter == 1) {
# no bsch because the origin FE saturate the model
V[[ylocsch_iter]] = bprice*get(paste0("PRICE_", ylocsch_iter)) +
bschool +
bage*AGE + bagesq*AGESQ + bsex*SEX +
bl1prealproppc*l1prealprop_pc +
bldistance*get(paste0("LOG_DISTANCE_", ylocsch_iter)) + bcorn*get(paste0("distCorntoPot_", ylocsch_iter)) +
bwheat*get(paste0("distWheatoPot_", ylocsch_iter))
}
}

### Define settings for MNL model component

mnl_settings = list(
alternatives = alternatives_set,
avail = A,
choiceVar = ylocsch_choice,
V = V
)

### Compute probabilities using NL model
P[["model"]] = apollo_mnl(mnl_settings, functionality)

### Take product across observations for same individual
# P = apollo_panelProd(P, apollo_inputs, functionality)

### Prepare and return outputs of function
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}

## In case you need to check the utility values, for some options, use this bit
# cols_50s009s1 <- grep("50s009s1", names(database), perl = F, fixed = T, value=T)
# cols_small <- c("ind", "BPL_MOM", "AGE", "l1prealprop_pc", "SEX", cols_50s009s1)
# cens_small <- database[,..cols_small]

# ################################################################# #
#### MODEL ESTIMATION ####
# ################################################################# #

model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs,
estimate_settings = list(maxIterations=500))

# ################################################################# #
#### MODEL OUTPUTS ####
# ################################################################# #

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN) ----
# ----------------------------------------------------------------- #

apollo_modelOutput(model)

Post by **stephanehess** » 20 Sep 2021, 13:22

Hi

what's the maximum number of alternatives in your model?

Stephane

roussanoff · Post by **roussanoff** » 21 Sep 2021, 16:54

Hi Stephane,

There are three groups of alternatives. Each individual can only choose from one group of alternatives (this is specified through A). The total number of alternatives is 408 + 410 + 450 (three groups) = 1268. There are about 12000 individuals in each group, so 36000 total. I use 200Gb of RAM with 12 cores.

I removed the never-chosen alternatives (roughly 40 in each group, so 120 more). I still have the estimation getting stuck at this step for hours, after converging relatively quickly.

Code: Select all

Calculating LL(c) for applicable models...

I tried halving the sample size (to 18000 individuals), that did not help

Could you explain what LL(c) stands for?

And a new question: I noticed that the Hessian is calculated using numerical (numDeriv), not analytical methods, despite the fact that I am running a multinomial logit, where analytic expressions are specified (and are used in the Example 3). How does Apollo decide when to use numerical and when to use analytical gradient&Hessian calculation?

Vasily

roussanoff · Post by **roussanoff** » 22 Sep 2021, 01:45

I investigated the issue a bit further. The problem stems from the restriction on choices.

Any of the following allows for very quick estimation without the computation getting stuck:
- restricting the sample to only one third of individuals who are allowed to choose from 408 alternatives (or the second third with 410 alternatives or the third third with 450 alternatives)
- using the full sample but setting avail = 1 (meaning all individuals are allowed to choose from all alternatives). This is wrong, because I know people could not choose from some of the alternatives, and I would like to model that. Note that I still create the A object (through a for-loop, which is not very fast but I like how verbose it is). The computation is not slowed down by the way A is created, it's using it mnl_settings() that makes things very slow.

The problem with the code getting stuck at the LL(c) calculation for hours only arises when I restrict people to choosing from a subset of 408+410+450 alternatives.

Post by **stephanehess** » 22 Sep 2021, 07:05

Vasily

LL(c) is the log-likelihood at constants only, i.e. a model using a full set of alternative specific constants.

When all alternatives are available for all people in the data, then the calculation of LL(c) is trivial and is done analytically. However, when the availabilities vary across individuals, then Apollo will after estimation also estimate a model with constants only. In your case, the number of alternatives is so large that this step would take a very substantial amount of time.

I have just added an option to Apollo for the user to disable this calculation by adding

Code: Select all

calculateLLC=FALSE

in apollo_control.

I have uploaded that version on the apollo website as a development version (http://apollochoicemodelling.com/code.html). However, for now, it will be just the mac version as I don't have access to a windows machine today.

In relation to your question about analytical gradients, this is caused in your model by the specific iterative way in which you've coded your likelihood function. We can look into this for you, but it will have to wait until next week

Stephane

roussanoff · Post by **roussanoff** » 22 Sep 2021, 16:47

Thank you Stephane, I certainly appreciate that you added this setting for me.

I ran the estimation, and everything worked out quickly, with the estimation successfully skipping LL(c) calculation part. This was a huge relief, thank you!

I have a small comment that is also related to having some alternatives not being available. Currently, at the pre-estimation stage, Apollo calculates the utilities from all alternatives to check that there are, for example, no infinities. It ignores the A object at this step. This means that if some values in apollo$database are NA or if they are Inf, then Apollo will report an error. For example, if children are only allowed to take a bus or a train and adults are allowed to take a bus, a train, or a car, I still have to define car travel time and other characteristics for children, even though it has no meaning. Otherwise, the pre-estimation will fail.

I circumvent this by setting those values to 0. I do this because this value, of course, does not matter for the estimation. This is, however, 1) very confusing (what does it mean for children to have no travel time by car?) and 2) potentially gives way to errors (what if I mistakingly do not set my A correctly and allow children to choose to drive?). Perhaps it would be nice to have Apollo only use the choices available in A for verification.

Post by **stephanehess** » 23 Sep 2021, 09:19

Vasily

availabilities should be taken into account in that check for NA/INF. Would you mind sharing your code and data with me offline and I'll see why it's not working in your case

Stephane

ApolloChoiceModelling forum

Multinomial logit estimation gets stuck

Multinomial logit estimation gets stuck

Re: Multinomial logit estimation gets stuck

Re: Multinomial logit estimation gets stuck

Re: Multinomial logit estimation gets stuck

Re: Multinomial logit estimation gets stuck

Re: Multinomial logit estimation gets stuck

Re: Multinomial logit estimation gets stuck