
Error when weight column is added in an error-component logit model

Posted: 01 Mar 2023, 11:08
by Han84
Dear Apollo Team,

I have a problem with adding weights to my estimation. I am estimating an error-components logit mode choice model on a panel data set. Without weights, estimation runs without problems and produces results. When I add weights to the model, I get this error:

Code: Select all

Testing influence of parameters
During testing, Apollo added disturbances smaller than 0.001 to all starting values.
  This led to a log-likelihood calculation failure!
Affected individuals:
Error in apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities,  : 
  Log-likelihood calculation fails at values close to the starting values!
The weights assigned to some observations are very small, and I thought this might have caused the issue, but inflating the weights did not solve it. For your information, my code is below.

Best regards,
Mehdi

Code: Select all

### Clear memory
rm(list = ls())

### Load relevant libraries
library(apollo)
library(tidyverse)

### Initialize code
apollo_initialise()

# Write the model description which will also be the folder name of outputs
modelDescription <- "ECL1_WEIGHTED"

# Address
address <- paste("C:/Users/rekendat2/Documents/Hanif/Models/", 
                 modelDescription,
                 sep="")

### Set core controls
# Some other useful variables: weights, workInLogs
apollo_control <- list(
  modelName       = modelDescription,
  modelDescr      = modelDescription,
  indivID         = "tracker_id",
  outputDirectory = address,
  mixing          = TRUE,
  panelData       = TRUE,
  nCores          = 11,
  weights         = "adjWeight"
  )
  
 ### Load the data

# Load the database, which does not have a weight column
database <- read.csv(
  "C:/Users/rekendat2/Documents/Hanif/input_work_panel.csv",
  header = TRUE,
  sep = ","
  )
  
# Load the weights file: each user is mapped to a weight multiplier
weightsFile <- readxl::read_excel(
   "C:/Users/rekendat2/Documents/Hanif/weight_per_tracker_id.xlsx" )


database <- database %>% 
  left_join(weightsFile, by="tracker_id") %>% 
  mutate(adjWeight = weight * (162232 / sum(weight)))
  
 
 ### Vector of parameters, including any that are kept fixed in estimation
apollo_beta <- c(ASC_b = 0,
                 ASC_w = 0,
                 ASC_c = 0,
                 ASC_pt = 0,
                 ASC_ps = 0,

                 MED_DIST_b = 0,
                 MED_DIST_w = 0,
                 MED_DIST_c = 0,
                 MED_DIST_pt = 0,
                 MED_DIST_ps = 0,

                 LNG_DIST_b = 0,
                 LNG_DIST_w = 0,
                 LNG_DIST_c = 0,
                 LNG_DIST_pt = 0,
                 LNG_DIST_ps = 0,

                 RAIN_b = 0,
                 RAIN_w = 0,
                 RAIN_c = 0,
                 RAIN_pt = 0,
                 RAIN_ps = 0,

                 MALE_b = 0,
                 MALE_w = 0,
                 MALE_c = 0,
                 MALE_pt = 0,
                 MALE_ps = 0,

                 AGE30to44_b = 0,
                 AGE30to44_w = 0,
                 AGE30to44_c = 0,
                 AGE30to44_pt = 0,
                 AGE30to44_ps = 0,

                 AGE45to64_b = 0,
                 AGE45to64_w = 0,
                 AGE45to64_c = 0,
                 AGE45to64_pt = 0,
                 AGE45to64_ps = 0,

                 AGE65_b = 0,
                 AGE65_w = 0,
                 AGE65_c = 0,
                 AGE65_pt = 0,
                 AGE65_ps = 0,

                 LARGE_HHLD_b = 0,
                 LARGE_HHLD_w = 0,
                 LARGE_HHLD_c = 0,
                 LARGE_HHLD_pt = 0,
                 LARGE_HHLD_ps= 0,

                 WITH_CHLD_b = 0,
                 WITH_CHLD_w = 0,
                 WITH_CHLD_c = 0,
                 WITH_CHLD_pt = 0,
                 WITH_CHLD_ps = 0,

                 WITHOUT_CHLD_b = 0,
                 WITHOUT_CHLD_w = 0,
                 WITHOUT_CHLD_c = 0,
                 WITHOUT_CHLD_pt = 0,
                 WITHOUT_CHLD_ps = 0,

                 MED_EDU_b = 0,
                 MED_EDU_w = 0,
                 MED_EDU_c = 0,
                 MED_EDU_pt = 0,
                 MED_EDU_ps = 0,

                 HIGH_EDU_b = 0,
                 HIGH_EDU_w = 0,
                 HIGH_EDU_c = 0,
                 HIGH_EDU_pt = 0,
                 HIGH_EDU_ps = 0,

                 MED_DNST_b = 0,
                 MED_DNST_w = 0,
                 MED_DNST_c = 0,
                 MED_DNST_pt = 0,
                 MED_DNST_ps = 0,

                 HI_DNST_b = 0,
                 HI_DNST_w = 0,
                 HI_DNST_c = 0,
                 HI_DNST_pt = 0,
                 HI_DNST_ps = 0,

                 sigma_b = 1,
                 sigma_w = 1,
                 sigma_c = 1,
                 sigma_pt = 1,
                 sigma_ps = 1
                 )

### Vector with names (in quotes) of parameters to be kept fixed at
### their starting value in apollo_beta
apollo_fixed = c("ASC_b",
                 "MED_DIST_b",
                 "LNG_DIST_b",
                 "RAIN_b",
                 "AGE30to44_b",
                 "AGE45to64_b",
                 "AGE65_b",
                 "MED_EDU_b",
                 "HIGH_EDU_b",
                 "MALE_b",
                 "LARGE_HHLD_b",
                 "WITH_CHLD_b",
                 "WITHOUT_CHLD_b",
                 "MED_DNST_b",
                 "HI_DNST_b"
)

### Set parameters for generating draws
apollo_draws = list(
  interDrawsType = "mlhs",
  interNDraws    = 500,
  interNormDraws = c("draws_b",
                     "draws_w",
                     "draws_c",
                     "draws_ps",
                     "draws_pt")
  )

### Create random parameters
apollo_randCoeff = function(apollo_beta, apollo_inputs){
  
  randcoeff = list()
  
  randcoeff[["ec_b"]] = draws_b * sigma_b
  
  randcoeff[["ec_w"]] = draws_w * sigma_w
    
  randcoeff[["ec_c"]] = draws_c * sigma_c
  
  randcoeff[["ec_pt"]] = draws_pt * sigma_pt
  
  randcoeff[["ec_ps"]] = draws_ps * sigma_ps
  
  return(randcoeff)
}


apollo_inputs=apollo_validateInputs()

apollo_probabilities = function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, 
                apollo_inputs)
  
  on.exit(apollo_detach(apollo_beta,
                        apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### Create alternative specific constants and coefficients using interactions
  ### with socio-demographics
  
  
  ### List of utilities: these must use the same names as in mnl_settings,
  ### order is irrelevant
  V = list()
  
  V[["b"]] = ASC_b + MED_DIST_b * MED_DIST + LNG_DIST_b * LNG_DIST + RAIN_b * RAIN + MALE_b * MALE + AGE30to44_b * AGE30to44 + AGE45to64_b * AGE45to64 + AGE65_b * AGE65 + LARGE_HHLD_b * LARGE_HHLD + MED_EDU_b * MED_EDU + HIGH_EDU_b * HIGH_EDU + WITH_CHLD_b * WITH_CHLD + WITHOUT_CHLD_b * WITHOUT_CHLD + MED_DNST_b * MED_DNST + HI_DNST_b * HI_DNST + ec_b
  
  V[["w"]] = ASC_w + MED_DIST_w * MED_DIST + LNG_DIST_w * LNG_DIST + RAIN_w * RAIN + MALE_w * MALE + AGE30to44_w * AGE30to44 + AGE45to64_w * AGE45to64 + AGE65_w * AGE65 + LARGE_HHLD_w * LARGE_HHLD + MED_EDU_w * MED_EDU + HIGH_EDU_w * HIGH_EDU + WITH_CHLD_w * WITH_CHLD + WITHOUT_CHLD_w * WITHOUT_CHLD + MED_DNST_w * MED_DNST + HI_DNST_w * HI_DNST + ec_w
  
  
  V[["c"]] = ASC_c + MED_DIST_c * MED_DIST + LNG_DIST_c * LNG_DIST + RAIN_c * RAIN + MALE_c * MALE + AGE30to44_c * AGE30to44 + AGE45to64_c * AGE45to64 + AGE65_c * AGE65 + LARGE_HHLD_c * LARGE_HHLD + MED_EDU_c * MED_EDU + HIGH_EDU_c * HIGH_EDU + WITH_CHLD_c * WITH_CHLD + WITHOUT_CHLD_c * WITHOUT_CHLD + MED_DNST_c * MED_DNST + HI_DNST_c * HI_DNST + ec_c
 
  
  V[["pt"]] = ASC_pt + MED_DIST_pt * MED_DIST + LNG_DIST_pt * LNG_DIST + RAIN_pt * RAIN + MALE_pt * MALE + AGE30to44_pt * AGE30to44 + AGE45to64_pt * AGE45to64 + AGE65_pt * AGE65 + LARGE_HHLD_pt * LARGE_HHLD + MED_EDU_pt * MED_EDU + HIGH_EDU_pt * HIGH_EDU + WITH_CHLD_pt * WITH_CHLD + WITHOUT_CHLD_pt * WITHOUT_CHLD + MED_DNST_pt * MED_DNST + HI_DNST_pt * HI_DNST + ec_pt
  
  
  V[["ps"]] = ASC_ps + MED_DIST_ps * MED_DIST + LNG_DIST_ps * LNG_DIST + RAIN_ps * RAIN + MALE_ps * MALE + AGE30to44_ps * AGE30to44 + AGE45to64_ps * AGE45to64 + AGE65_ps * AGE65 + LARGE_HHLD_ps * LARGE_HHLD + MED_EDU_ps * MED_EDU + HIGH_EDU_ps * HIGH_EDU + WITH_CHLD_ps * WITH_CHLD + WITHOUT_CHLD_ps * WITHOUT_CHLD + MED_DNST_ps * MED_DNST + HI_DNST_ps * HI_DNST + ec_ps
 
  
  ### Define Settings for MNL model component
  mnl_settings = list(
    alternatives = c(b=1, w=2, c=3, pt=4, ps=5),
    avail = list(b=av_b, w=av_w, c=av_c, pt=av_pt, ps=av_ps),
    choiceVar = choice,
    utilities = V
  )
  
  ### Compute probabilities using MNL model
  P[["model"]] = apollo_mnl(mnl_settings, 
                            functionality)
  
  ### Take product across observation for same individual
  P = apollo_panelProd(P, 
                       apollo_inputs, 
                       functionality)
  
  ### Average across inter-individual draws (what is it??)
  P = apollo_avgInterDraws(P, 
                           apollo_inputs, 
                           functionality)
  
  ### This part is added due to the introduction of weights into estimation
  P = apollo_weighting(P,
                       apollo_inputs,
                       functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, 
                         apollo_inputs, 
                         functionality)
  
  return(P)
}

### Settings for the estimation
# other useful settings might be: estimationRoutine, hessianRoutine
estimate_settings = list(
  maxIterations = 400
)

# Estimation
model = apollo_estimate(apollo_beta, 
                        apollo_fixed, 
                        apollo_probabilities, 
                        apollo_inputs,
                        estimate_settings
                        )
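
One thing that may be worth ruling out in the code above is the join and rescaling step. If any `tracker_id` in the database has no match in the weights file, the left join leaves `weight` as NA for those rows, and `sum(weight)` then becomes NA, which makes every `adjWeight` NA. The sketch below illustrates this with toy data. All values are hypothetical, base R `merge` stands in for `left_join`, and `nrow(merged)` stands in for the constant 162232. It is only a possible explanation, not a confirmed diagnosis.

```r
# Toy stand-ins for the real files (all values are hypothetical)
database    <- data.frame(tracker_id = c(1, 1, 2, 3), choice = c(1, 2, 1, 3))
weightsFile <- data.frame(tracker_id = c(1, 2), weight = c(0.5, 1.5))

# Base-R equivalent of a left join on tracker_id
merged <- merge(database, weightsFile, by = "tracker_id", all.x = TRUE)

# IDs present in the database but absent from the weights file
missing_ids <- unique(merged$tracker_id[is.na(merged$weight)])

# Same rescaling pattern as in the post: a single NA weight makes the
# sum NA, so every rescaled weight becomes NA
adjWeight <- merged$weight * (nrow(merged) / sum(merged$weight))

print(missing_ids)
print(summary(adjWeight))
```

On the real data, `summary(database$adjWeight)` and `anyNA(database$adjWeight)` run just before `apollo_validateInputs()` would show whether this applies. A mismatch in the type of `tracker_id` between the CSV and the Excel file is another thing that could be checked at the same point.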


Re: Error when weight column is added in an error-component logit model

Posted: 06 Mar 2023, 09:07
by stephanehess
Hi

to help us with debugging this, what happens if all weights are equal to 1?

Stephane

Re: Error when weight column is added in an error-component logit model

Posted: 06 Mar 2023, 12:03
by Han84
Hi Stephane

Even with all weights set to one, there is an error. (My estimation routine is "BHHH" in this re-estimation, not "BFGS".)

Code: Select all

Error in maxNRCompute(fn = function (theta, fnOrig, gradOrig = NULL, hessOrig = NULL,  : 
  NA in gradient
Mehdi

Re: Error when weight column is added in an error-component logit model

Posted: 06 Mar 2023, 12:16
by stephanehess
Could you share your code and data with me via e-mail and I'll try to have a look

Re: Error when weight column is added in an error-component logit model

Posted: 06 Mar 2023, 12:52
by Han84
Unfortunately, I am not allowed to share the data set since it belongs to a company.

Re: Error when weight column is added in an error-component logit model

Posted: 08 Mar 2023, 10:25
by stephanehess
Hi

Unfortunately there could be many reasons why this goes wrong, and we cannot identify the problem without trying it ourselves.

One more thing you could try for us is the following. Run the code to just before calling apollo_estimate, and then run the following line

sum(log(apollo_probabilities(apollo_beta,apollo_inputs,functionality="estimate")))

can you do that in both versions, and also when all weights are 1?

Stephane
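
As a rough illustration of what the check above can reveal, the sketch below uses a hypothetical named vector `L` in place of the output of `apollo_probabilities(...)`. The names and values are made up, and the actual return value may not carry names. The point is only that a single NA or zero likelihood makes the summed log-likelihood non-finite, and that the offending entries can be located directly.

```r
# Hypothetical per-individual likelihoods, standing in for the output of
# apollo_probabilities(apollo_beta, apollo_inputs, functionality = "estimate")
L <- c("101" = 0.12, "102" = NA, "103" = 0.40, "104" = 0)

# Log-likelihood contribution per individual: log(NA) is NA, log(0) is -Inf
ll <- log(L)

# Total log-likelihood: NA as soon as one contribution is NA
total <- sum(ll)

# Entries whose contribution is NA, NaN or infinite
bad <- names(L)[!is.finite(ll)]

print(total)
print(bad)
```

If the real vector shows non-finite entries only in the weighted run, comparing the weights of those individuals with the rest of the sample may help narrow things down.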