
Iterative coding of utilities fails (only when nCores > 1)


Post by roussanoff

I am trying to estimate a nested logit model with 900 alternatives. 3,000 individuals, each with a single child, choose where to live (upper nest) and whether the child goes to school (lower nest). I understand that nested logit is not ideal for modeling sequential choices, but my question is about specifying the model in Apollo.

`ylocsch` is a numeric variable that describes the choice. It is coded in a convenient way. It has 7 digits, where the first 6 digits code the location and the last digit is whether school is chosen or not. For example, 0000021 is "location number 2, child goes to school", 0000030 is "location number 3, child does not go to school". There are 450 locations and 2 choices of school/no school, hence 450*2 = 900 alternatives.
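
As a rough illustration of this coding scheme (not part of the original script), the location number and the school indicator can be recovered from a code with integer division and the remainder. The helper name `decode_ylocsch` is hypothetical, and the sketch assumes the codes are stored as plain numbers, so that 0000021 is simply 21.

```r
# Hypothetical helper: split a ylocsch code into its two parts.
# Assumes the last digit is the school indicator (0 or 1) and the
# leading digits are the location number, as described above.
decode_ylocsch <- function(ylocsch) {
  data.frame(
    ylocsch = ylocsch,
    loc     = ylocsch %/% 10,  # drop the last digit: location number
    school  = ylocsch %% 10    # keep the last digit: 1 if the child goes to school
  )
}

codes <- c(21, 30)  # 0000021 and 0000030 stored as numbers
decode_ylocsch(codes)
```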

V_ind = bsch + bprice*Price_loc if school == 1
V_ind = 0 otherwise


Because of the large number of alternatives, I follow Section 11.5 of Apollo's (excellent) manual and code the utilities iteratively. Every individual can choose any location. The non-stochastic part of utility, V, is very simple: there is a constant for going to school and a price that has to be paid for going to school. Locations differ by price. The utility from school == 1 has the same specification for every location, and the utility from school == 0 is normalized to 0.

Below is the code that I am using. It works on a single core (nCores = 1), albeit very slowly. It fails as soon as I use more than one core.



Code:

### Load Apollo library
library(apollo)
library(data.table)

### Initialise code
apollo_initialise()
parallel::detectCores() #check how many cores the system has

### Set core controls
apollo_control = list(
  modelName  = "Nested_logit_loc_sch",
  modelDescr = "Two-level NL model",
  indivID    = "ind",
  nCores     = 3 # if set to 1, the code works
)

# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
# ################################################################# #

# data_dir (path to the data folder) is assumed to have been defined beforehand
database = fread(paste0(data_dir, "choice_nested_logit.csv"))
list_alt <- unique(database[, ylocsch]) # this is where I define the list of alternatives, I am sure each is chosen at least once.
alternatives_set <- list_alt
names(alternatives_set) <- as.character(list_alt)
list_loc <- unique(database[, yloc]) # this defines the list of locations, useful for defining the nesting structure.

## sort data by id
setorder(database, ind)

# ################################################################# #
#### DEFINE MODEL PARAMETERS                                     ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta=c(c(bprice = -0.5,
              bsch = 1), 
              setNames(rep(0.5, length(list_loc)), paste0("lambda_", list_loc)))


### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c()

### Read in starting values for at least some parameters from existing model output file
# apollo_beta = apollo_readBeta(apollo_beta, apollo_fixed, "Apollo_example_1", overwriteFixed=FALSE)

# ################################################################# #
#### GROUP AND VALIDATE INPUTS                                   ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){

  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))

  ### Create list of probabilities P
  P = list()
  V = list()
  # A = list() adjust availability based on year
  
  ### Define utilities: the last digit of each alternative code is the school indicator
  for (ylocsch_iter in as.numeric(list_alt)){
    sch_iter <- as.integer(ylocsch_iter %% 10)
    alt_name <- as.character(ylocsch_iter)
    if (sch_iter == 0) {
      # no school: utility normalised to zero
      V[[alt_name]] = 0
    } else if (sch_iter == 1) {
      # school: constant plus price effect for this location
      V[[alt_name]] = bsch + bprice*get(paste0("PRICE_", ylocsch_iter))
    }
  }
  
  ### Specify lambdas for all nests for NL model
  nlNests = list(root=1)
  for (loc in list_loc) {
    nlNests[[as.character(loc)]] <- get(paste0("lambda_", loc))
  }
  
  ### Specify tree structure for NL model
  nlStructure= list()
  nlStructure[["root"]]   = as.character(list_loc)
  for (loc in list_loc) {
    nlStructure[[as.character(loc)]] = c(paste0(loc, "0"), paste0(loc, "1"))
  }

  ### Define settings for NL model component

  nl_settings = list(
    alternatives = alternatives_set,
    avail        = 1,
    choiceVar    = ylocsch_choice,
    V            = V,
    nlNests      = nlNests,
    nlStructure  = nlStructure
  )
  
  ### Compute probabilities using NL model
  P[["model"]] = apollo_nl(nl_settings, functionality)
  
  ### Take product across observations for same individual
  # P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ################################################################# #
#### MODEL ESTIMATION                                            ####
# ################################################################# #

model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)

# ################################################################# #
#### MODEL OUTPUTS                                               ####
# ################################################################# #

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN)                               ----
# ----------------------------------------------------------------- #

apollo_modelOutput(model)

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name)               ----
# ----------------------------------------------------------------- #

apollo_saveOutput(model)
Thank you for creating such an amazing package with great documentation,
Vasily

Re: Iterative coding of utilities fails (only when nCores > 1)

Post by dpalma

Hi Vasily,

Thanks for your kind words and for reading the manual!

It's hard to say why the model fails with multiple cores without knowing the exact error message you get. Please let us know what the exact error message is, and we may be able to give you more detailed recommendations.

A couple of tricks that can help are setting noValidation=TRUE and analyticGrad=FALSE inside apollo_control. These are not permanent fixes and should only be used for testing, but they sometimes help.
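
For testing only, the two settings mentioned above could be added to the apollo_control list along these lines. This is only a sketch: the other entries are copied from the original post, and the comments give my reading of what each setting does.

```r
### Set core controls (diagnostic settings, for testing only)
apollo_control = list(
  modelName    = "Nested_logit_loc_sch",
  modelDescr   = "Two-level NL model",
  indivID      = "ind",
  nCores       = 3,
  noValidation = TRUE,   # skip the validation checks run before estimation
  analyticGrad = FALSE   # use numerical instead of analytical gradients
)
```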

Speed-wise, it is good practice in R to avoid loops that grow or modify large objects. For example, I would replace

Code:

### Specify tree structure for NL model
  nlStructure= list()
  nlStructure[["root"]]   = as.character(list_loc)
  for (loc in list_loc) {
    nlStructure[[as.character(loc)]] = c(paste0(loc, "0"), paste0(loc, "1"))
  }
with something like:

Code:

### Specify tree structure for NL model
nlStructure <- matrix( paste0(list_loc, rep(0:1, each=length(list_loc))), ncol=2 )
nlStructure <- setNames(split(nlStructure, 1:length(list_loc)), list_loc)
nlStructure[['root']] <- as.character(list_loc)
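
A quick way to check that the two versions agree is to run both on a small made-up vector of locations. The values of `list_loc` and the names `nl_loop`, `nl_vec` and `same` below are invented for illustration; the comparison is by nest name, since the two versions place "root" at different positions in the list.

```r
list_loc <- c(2, 3, 7)  # toy location numbers, for illustration only

# Loop version
nl_loop <- list()
nl_loop[["root"]] <- as.character(list_loc)
for (loc in list_loc) {
  nl_loop[[as.character(loc)]] <- c(paste0(loc, "0"), paste0(loc, "1"))
}

# Vectorised version
nl_vec <- matrix(paste0(list_loc, rep(0:1, each = length(list_loc))), ncol = 2)
nl_vec <- setNames(split(nl_vec, 1:length(list_loc)), list_loc)
nl_vec[["root"]] <- as.character(list_loc)

# Compare nest by nest, looking each nest up by name
same <- all(sapply(names(nl_loop), function(n) identical(nl_loop[[n]], nl_vec[[n]])))
print(same)
```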
Best wishes
David

Re: Iterative coding of utilities fails (only when nCores > 1)

Post by roussanoff

Hi David,

Many thanks for your answer. I was using 12 nodes. The error I was getting was:

Code:

Error in checkForRemoteErrors(lapply(cl, recvResult)) :
  12 nodes produced errors; first error: object 'list_alt' not found
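The solution is simple: objects created in the main R session, such as list_alt, are not available on the additional nodes, so list_alt should be defined inside apollo_probabilities(), using apollo_inputs$database. That way, the additional nodes no longer give errors.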
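
A minimal sketch of what that change might look like is given below. It is not the exact code that was run. It assumes the database is reachable as `apollo_inputs$database` inside the function, that `list_loc` and `alternatives_set` need the same treatment as `list_alt`, and that the copy of the data seen on each node contains every alternative and location; if a node only sees part of the data, the lists may need to be built from something else, such as the column names. The name `db` is introduced only for illustration.

```r
apollo_probabilities = function(apollo_beta, apollo_inputs, functionality = "estimate") {

  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))

  ### Build the lists here rather than in the global environment,
  ### so that they exist on every node (assumption: see note above)
  db               <- apollo_inputs$database
  list_alt         <- unique(db$ylocsch)
  list_loc         <- unique(db$yloc)
  alternatives_set <- setNames(list_alt, as.character(list_alt))

  ### ... the rest of the function (V, nlNests, nlStructure, nl_settings,
  ### apollo_nl, apollo_prepareProb) stays as in the original post ...
}
```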