Counterintuitive signs for cost and frequency attributes

Post by **nayeem** » 04 Sep 2025, 17:06

Hello,

I recently ran a stated choice experiment (SCE) to collect information about freight users’ willingness to pay for improved travel time reliability. It serves as a pilot survey for my project and should also provide informative priors for the experimental design of the main SCE. The alternatives are truck, train and express train. The attributes are travel time, cost, travel time reliability (standard deviation of travel time; generic across truck and train, separate for express train), frequency (applies only to train and express train; not generic) and risk (applies only to truck). Risk is the only categorical attribute, with two levels. I am also estimating two categorical position variables to understand how the position of an alternative affects its utility (positions were varied across respondents during survey administration).

Since this is a pilot study, I have collected 12 choice tasks from each of 9 individuals (108 observations). The problem is that two estimated coefficients have counterintuitive signs: frequency for express train and cost for train. Some coefficients with the expected signs are also not significant. I can’t work out what is wrong with my model.

Code:


rm(list = ls())

### Load Apollo library
library(apollo)

### Initialise code
apollo_initialise()

### Set core controls
apollo_control = list(
  modelName       = "MNL_Pilot",
  modelDescr      = "Simple MNL model on mode choice SP data for freight transport in Bangladesh",
  indivID         = "RID",
  outputDirectory = "output"
)

# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
# ################################################################# #

### Load data from an Excel file
### (for a CSV file, e.g. data.csv, use: database = read.csv("data.csv",header=TRUE))
setwd('G:/SP Survey/R Codes')

### Install readxl only if it is not already available
if (!requireNamespace("readxl", quietly = TRUE)) install.packages("readxl")
library(readxl)

database <- read_excel("data.xlsx", sheet = "Sheet2")

# ################################################################# #
#### DEFINE MODEL PARAMETERS                                     ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta=c(asc_truck      = 0,
              asc_train      = 0,
              asc_etrain     = 0,
              b_tt_truck     = 0,
              b_tt_train     = 0,
              b_tt_etrain    = 0,
              b_cost_truck   = 0,
              b_cost_train   = 0,
              b_cost_etrain  = 0,
              b_rel          = 0,
              b_rel_etrain   = 0,
              b_freq_train   = 0,
              b_freq_etrain  = 0,
              b_risk_low     = 0,
              b_risk_medium  = 0,
              gamma_1        = 0,
              gamma_2        = 0,
              gamma_3        = 0)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c("asc_etrain","b_risk_low","gamma_1")

# ################################################################# #
#### GROUP AND VALIDATE INPUTS                                   ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
    
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))

  ### Create list of probabilities P
  P = list()
  
  ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
  V = list()
  V[["truck"]]    = asc_truck   + b_tt_truck  * time_truck   + b_cost_truck  * cost_truck  + b_rel *rel_truck          + b_risk_low*(risk==1)      +b_risk_medium*(risk==2)   + gamma_1*(pos_truck==1)  + gamma_2*(pos_truck==2)  + gamma_3*(pos_truck==3)
  V[["train"]]    = asc_train   + b_tt_train  * time_train   + b_cost_train  * cost_train  + b_rel *rel_train          + b_freq_train*freq_train                              + gamma_1*(pos_train==1)  + gamma_2*(pos_train==2)  + gamma_3*(pos_train==3)
  V[["etrain"]]   = asc_etrain  + b_tt_etrain * time_etrain  + b_cost_etrain * cost_etrain + b_rel_etrain *rel_etrain  + b_freq_etrain*freq_etrain                            + gamma_1*(pos_etrain==1) + gamma_2*(pos_etrain==2) + gamma_3*(pos_etrain==3)
  
  ### Define settings for MNL model component
  mnl_settings = list(
    alternatives  = c(truck=1, train=2, etrain=3), 
    choiceVar     = choice,
    utilities     = V
  )
  
  ### Compute probabilities using MNL model
  P[["model"]] = apollo_mnl(mnl_settings, functionality)
  
  ### Take product across observation for same individual
  P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ################################################################# #
#### MODEL ESTIMATION                                            ####
# ################################################################# #

model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)

# ################################################################# #
#### MODEL OUTPUTS                                               ####
# ################################################################# #

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN)                               ----
# ----------------------------------------------------------------- #

apollo_modelOutput(model)

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name)               ----
# ----------------------------------------------------------------- #

apollo_saveOutput(model)

Results:

Code:

Overview of choices for MNL model component :
                                 truck  train etrain
Times available                    108 108.00 108.00
Times chosen                        27  53.00  28.00
Percentage chosen overall           25  49.07  25.93
Percentage chosen when available    25  49.07  25.93





Model name                                  : MNL_Pilot
Model description                           : Simple MNL model on mode choice SP data for freight transport in Bangladesh
Model run at                                : 2025-09-03 14:33:15.312228
Estimation method                           : bgw
Model diagnosis                             : Relative function convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : -0.00848
     reciprocal of condition number         : 4.99761e-09
Number of individuals                       : 9
Number of rows in database                  : 108
Number of modelled outcomes                 : 108

Number of cores used                        :  1 
Model without mixing

LL(start)                                   : -118.65
LL at equal shares, LL(0)                   : -118.65
LL at observed shares, LL(C)                : -112.96
LL(final)                                   : -93.93
Rho-squared vs equal shares                  :  0.2084 
Adj.Rho-squared vs equal shares              :  0.0819 
Rho-squared vs observed shares               :  0.1684 
Adj.Rho-squared vs observed shares           :  0.0534 
AIC                                         :  217.86 
BIC                                         :  258.09 

Estimated parameters                        : 15
Time taken (hh:mm:ss)                       :  00:00:2.2 
     pre-estimation                         :  00:00:0.8 
     estimation                             :  00:00:0.87 
     post-estimation                        :  00:00:0.53 
Iterations                                  :  15  

Unconstrained optimisation.

Estimates:
                 Estimate        s.e.   t.rat.(0)    Rob.s.e. Rob.t.rat.(0)
asc_truck       -12.37110    7.043714    -1.75633    6.938595      -1.78294
asc_train       -19.68333    9.116215    -2.15916    9.821921      -2.00402
asc_etrain        0.00000          NA          NA          NA            NA
b_tt_truck       -0.12603    0.191591    -0.65780    0.121034      -1.04126
b_tt_train       -0.24593    0.206514    -1.19086    0.112946      -2.17740
b_tt_etrain      -0.46014    0.334071    -1.37738    0.332425      -1.38420
b_cost_truck     -0.01041    0.008449    -1.23230    0.006440      -1.61669
b_cost_train      0.02614    0.018448     1.41690    0.018157       1.43963
b_cost_etrain    -0.04233    0.019884    -2.12865    0.016941      -2.49848
b_rel            -0.16109    0.310264    -0.51920    0.235734      -0.68335
b_rel_etrain     -0.21271    0.792719    -0.26833    0.808797      -0.26299
b_freq_train      0.29283    0.263072     1.11312    0.270384       1.08302
b_freq_etrain    -0.11975    0.231975    -0.51623    0.095477      -1.25425
b_risk_low        0.00000          NA          NA          NA            NA
b_risk_medium    -0.01891    0.488895    -0.03868    0.321322      -0.05886
gamma_1           0.00000          NA          NA          NA            NA
gamma_2          -1.51620    0.323966    -4.68013    0.725615      -2.08954
gamma_3          -0.62447    0.237952    -2.62436    0.657548      -0.94970

Choice Analysis

Code:

Availabilities not provided for 'apollo_choiceAnalysis', so full availability is assumed.
                                                                          truck   train etrain
Explanator 1 (time_truck), mean when alt is chosen:                     14.1111 14.2075 14.000
Explanator 1 (time_truck), mean when alt is not chosen:                 14.1358 14.0545 14.175
Explanator 1 (time_truck), t-test (mean if chosen - mean if not chosen) -0.0800  0.5600 -0.540

                                                                          truck   train  etrain
Explanator 2 (time_train), mean when alt is chosen:                     11.5556 11.3962 11.6786
Explanator 2 (time_train), mean when alt is not chosen:                 11.4938 11.6182 11.4500
Explanator 2 (time_train), t-test (mean if chosen - mean if not chosen)  0.2500 -1.0700  0.9800

                                                                           truck  train  etrain
Explanator 3 (time_etrain), mean when alt is chosen:                      8.0000 8.1509  7.9643
Explanator 3 (time_etrain), mean when alt is not chosen:                  8.0864 7.9818  8.1000
Explanator 3 (time_etrain), t-test (mean if chosen - mean if not chosen) -0.4800 1.1500 -0.7900

                                                                           truck    train   etrain
Explanator 4 (cost_truck), mean when alt is chosen:                     193.3333 200.7547 203.5714
Explanator 4 (cost_truck), mean when alt is not chosen:                 201.7284 198.5455 198.2500
Explanator 4 (cost_truck), t-test (mean if chosen - mean if not chosen)  -1.2400   0.3900   0.8700

                                                                           truck    train   etrain
Explanator 5 (cost_train), mean when alt is chosen:                     198.5185 201.5094 201.5714
Explanator 5 (cost_train), mean when alt is not chosen:                 201.5309 200.0727 200.5000
Explanator 5 (cost_train), t-test (mean if chosen - mean if not chosen)  -0.9700   0.5200   0.3100

                                                                            truck    train   etrain
Explanator 6 (cost_etrain), mean when alt is chosen:                     280.5926 280.9811 276.2857
Explanator 6 (cost_etrain), mean when alt is not chosen:                 279.3580 278.4000 280.8500
Explanator 6 (cost_etrain), t-test (mean if chosen - mean if not chosen)   0.3700   0.9100  -1.4700

                                                                        truck  train  etrain
Explanator 7 (rel_truck), mean when alt is chosen:                     1.7207 1.6951  1.6475
Explanator 7 (rel_truck), mean when alt is not chosen:                 1.6786 1.6835  1.7037
Explanator 7 (rel_truck), t-test (mean if chosen - mean if not chosen) 0.3200 0.1000 -0.4500

                                                                        truck   train etrain
Explanator 8 (rel_train), mean when alt is chosen:                     1.6567  1.5568 1.7221
Explanator 8 (rel_train), mean when alt is not chosen:                 1.6140  1.6900 1.5905
Explanator 8 (rel_train), t-test (mean if chosen - mean if not chosen) 0.3900 -1.4100 1.3100

                                                                          truck  train  etrain
Explanator 9 (rel_etrain), mean when alt is chosen:                      1.0978 1.1666  1.1111
Explanator 9 (rel_etrain), mean when alt is not chosen:                  1.1474 1.1045  1.1434
Explanator 9 (rel_etrain), t-test (mean if chosen - mean if not chosen) -0.6600 0.9600 -0.4300

                                                                           truck  train etrain
Explanator 10 (freq_train), mean when alt is chosen:                      5.7407 6.0377 6.0357
Explanator 10 (freq_train), mean when alt is not chosen:                  6.0370 5.8909 5.9375
Explanator 10 (freq_train), t-test (mean if chosen - mean if not chosen) -1.4800 0.8500 0.4900

                                                                           truck   train  etrain
Explanator 11 (freq_etrain), mean when alt is chosen:                     3.4074  3.2642  3.2143
Explanator 11 (freq_etrain), mean when alt is not chosen:                 3.2469  3.3091  3.3125
Explanator 11 (freq_etrain), t-test (mean if chosen - mean if not chosen) 0.6000 -0.2000 -0.3900

Ouputs of apollo_choiceAnalysis saved to output/MNL_Pilot_choiceAnalysis.csv

The Data Dictionary
https://drive.google.com/file/d/14koAWi ... sp=sharing

Thanks for your help.

Nayeem

Post by **stephanehess** » 05 Sep 2025, 17:51

Hi

With such a small sample, you should not expect everything to work yet. You need quite a bit more data.
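To see how little the pilot can say about the two wrong-signed coefficients, one can compute approximate 95% confidence intervals from the estimates and robust standard errors reported in the output above. This is only a minimal sketch in base R; the normal approximation is an assumption and is itself shaky with only 9 respondents.

```r
# Estimates and robust standard errors copied from the posted Apollo output
est    <- c(b_cost_train = 0.02614,  b_freq_etrain = -0.11975)
rob_se <- c(b_cost_train = 0.018157, b_freq_etrain = 0.095477)

# Approximate 95% confidence intervals (normal approximation, an assumption)
z  <- qnorm(0.975)
ci <- data.frame(estimate = est,
                 lower    = est - z * rob_se,
                 upper    = est + z * rob_se)

# Flag intervals that contain zero, i.e. the sign is not statistically established
ci$spans_zero <- ci$lower < 0 & ci$upper > 0
print(ci)
```

Both intervals include zero, so the pilot data do not establish either sign.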

Also, I would not make the cost coefficient mode-specific: a dollar is a dollar, no matter what it is spent on.
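A generic cost coefficient means that one parameter, here called `b_cost` (a hypothetical name), multiplies cost in every utility function. In the posted script this would presumably amount to replacing `b_cost_truck`, `b_cost_train` and `b_cost_etrain` by `b_cost` in both `apollo_beta` and `V`. The sketch below shows the idea in base R without Apollo, using only ASCs and cost; the parameter values and the single choice task are made-up illustrations, not estimates.

```r
# Illustrative parameter values (assumptions, not estimates from the pilot)
beta <- c(asc_truck = 0, asc_train = 0, asc_etrain = 0, b_cost = -0.01)

# One hypothetical choice task with costs in USD
task <- data.frame(cost_truck = 200, cost_train = 190, cost_etrain = 280)

# Utilities: the same b_cost enters all three alternatives
generic_cost_utilities <- function(b, d) {
  c(truck  = b[["asc_truck"]]  + b[["b_cost"]] * d$cost_truck,
    train  = b[["asc_train"]]  + b[["b_cost"]] * d$cost_train,
    etrain = b[["asc_etrain"]] + b[["b_cost"]] * d$cost_etrain)
}

# MNL choice probabilities: exp(V_j) / sum_k exp(V_k)
mnl_probabilities <- function(V) {
  eV <- exp(V - max(V))  # subtract the maximum for numerical stability
  eV / sum(eV)
}

V <- generic_cost_utilities(beta, task)
P <- mnl_probabilities(V)
print(round(P, 3))
```

With a negative `b_cost` and equal ASCs, the cheapest alternative in the task receives the highest probability.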

Stephane

Post by **nayeem** » 09 Sep 2025, 06:33

Can I use a generic cost coefficient when the attribute levels differ across alternatives? For example, in my SCE, cost(etrain) > cost(truck) > cost(train). For the time attribute the ordering is different: time(truck) > time(train) > time(etrain).

In my SCE the range of the cost attribute also differs across alternatives. For example, cost(truck) varied from 160 USD to 240 USD (over 5 levels), while cost(train) varied from 184 USD to 220 USD (over 3 levels). Could this create a problem in model estimation once I have collected sufficient data?

Nayeem

Post by **stephanehess** » 09 Sep 2025, 14:22

Sure, that is often the case
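Once the main survey data are in, whether a generic cost coefficient is acceptable can also be checked empirically with a likelihood ratio test of the generic model against the mode-specific one. A minimal sketch in base R follows; `lr_test` is a hypothetical helper, and the log-likelihood values in the usage lines are placeholders, not results from this thread.

```r
# Likelihood ratio test of a restricted model nested in a more general one
lr_test <- function(ll_restricted, ll_general, df) {
  stopifnot(df > 0, ll_general >= ll_restricted)
  stat <- 2 * (ll_general - ll_restricted)
  p    <- pchisq(stat, df = df, lower.tail = FALSE)
  c(statistic = stat, df = df, p_value = p)
}

# Placeholder log-likelihoods (made up for illustration only).
# Going from three cost coefficients to one imposes 2 restrictions.
res <- lr_test(ll_restricted = -1005.0, ll_general = -1002.0, df = 2)
print(round(res, 4))
```

A small p-value would suggest that the mode-specific specification fits significantly better; otherwise the generic coefficient is the more parsimonious choice. Apollo's `apollo_lrTest` function is intended for the same comparison on two estimated model objects.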

ApolloChoiceModelling forum
