
Joint Estimation for Dual Response Survey with Two Segments

bye1830
Posts: 5
Joined: 24 Apr 2024, 06:40

Joint Estimation for Dual Response Survey with Two Segments

Post by bye1830 »

Hi Apollo Team,

Thank you for your support in organizing this forum.

I have conducted a choice experiment-based survey using a dual response design. After consulting the Apollo manual, I concluded that I should use joint estimation. I have a couple of questions and would greatly appreciate your guidance.

For context, my experiment focuses on zero-emission truck choices, including Battery Electric Trucks (BETs) and Hydrogen Fuel Cell Electric Trucks (HFCETs). The dual response format that I used consists of a forced choice followed by an unforced choice. The survey respondents are segmented into two groups: fleet operators who exclusively use diesel trucks ("diesel fleets") and those who also operate natural gas trucks ("NG fleets"). For diesel fleets, the reference alternative is a diesel truck, while for NG fleets, the reference alternatives include both diesel and natural gas trucks.

In each choice task, I asked two questions: the first requires choosing between BET and HFCET, and the second asks if they would still choose the option selected in the first question if the reference alternative(s) were available. Thus, in the second question, diesel fleets choose between 1) a diesel truck and 2) the BET/HFCET selected previously, while NG fleets choose from 1) a diesel truck, 2) a natural gas truck, and 3) the BET/HFCET selected previously.

I have a total of 54 respondents for this choice experiment section, with 12 from NG fleets and 42 from diesel fleets. Each respondent received 6 choice tasks, resulting in a total of 324 choice tasks in my survey data. Each task consists of a forced choice followed by an unforced choice.

Regarding the joint estimation, which of the following approaches would you recommend? I have also included R scripts below.
  • CASE 1: Joint estimation of two datasets – forced choice data (324 observations) and unforced choice data (324 observations)
  • CASE 2: Joint estimation of three datasets – forced choice data (324 observations), unforced choice data for diesel fleets (252 observations), and unforced choice data for NG fleets (72 observations)
CASE 1 - Joint estimation of two datasets

Code:

###############################################
### LOAD LIBRARY AND DEFINE CORE SETTINGS   ###
###############################################

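rm(list=ls())
# Install Apollo only if it is missing, rather than reinstalling on every run
if (!requireNamespace("apollo", quietly = TRUE)) install.packages("apollo")
library(apollo)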

#Initialize code
apollo_initialise()

#Set core controls
apollo_control = list(
  modelName = "Main_JOINT-INTRTN1-TWO-DATASETS",
  modelDescr = "Forced-unforced data joint model",
  indivID = "ID",
  outputDirectory = "output"
)

###############################################
### LOAD DATA AND APPLY ANY TRANSFORMATIONS ###
###############################################

#consider both forced and unforced choice data
database = read.csv("E:/Survey/Estimation/main_survey.csv", header=TRUE)

###############################################
### DEFINE MODEL PARAMETERS                 ###
###############################################

#Vector of parameters
apollo_beta = c(asc_bev = 0, asc_hfcev = 0, asc_ngv = 0, asc_dsl = 0,
                b_pcost = 0, b_ocost = 0, b_range = 0, b_offsite = 0, b_onsite_bev = 0, b_onsite_hfcev = 0,
                asc_bev_shift_adopter = 0, asc_hfcev_shift_adopter = 0,
                asc_ngv_shift_small_org = 0,
                b_ocost_shift_small_fleet = 0,
                mu_unforced = 1)

# Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta
apollo_fixed = c("asc_dsl")

###############################################
### GROUP AND VALIDATE INPUTS               ###
###############################################

apollo_inputs = apollo_validateInputs()

###############################################
### DEFINE MODEL AND LIKELIHOOD FUNCTION    ###
###############################################

apollo_probabilities = function (apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### Create coefficients using interactions with fleet characteristics
  asc_bev_value = asc_bev + asc_bev_shift_adopter * bev_adopter
  asc_hfcev_value = asc_hfcev + asc_hfcev_shift_adopter * hfcev_adopter
  asc_ngv_value = asc_ngv + asc_ngv_shift_small_org * small_org3
  b_pcost_value = b_pcost / relative_annual_revenue
  b_ocost_value = b_ocost + b_ocost_shift_small_fleet * small_fleet3
  b_offsite_value = b_offsite * small_fleet3
  
  ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
  V = list()
  V[["bev"]] = asc_bev_value + b_pcost_value * bev_pcost + b_ocost_value * bev_ocost + b_range * bev_range + b_offsite_value * bev_offsite_binary + b_onsite_bev * bev_onsite
  V[["hfcev"]] = asc_hfcev_value + b_pcost_value * hfcev_pcost + b_ocost_value * hfcev_ocost + b_range * hfcev_range + b_offsite_value * hfcev_offsite_binary + b_onsite_hfcev * hfcev_onsite
  V[["dsl"]] = asc_dsl + b_pcost_value * dsl_pcost + b_ocost_value * dsl_ocost + b_range * dsl_range + b_offsite_value * dsl_offsite_binary
  V[["ngv"]] = asc_ngv_value + b_pcost_value * ngv_pcost + b_ocost_value * ngv_ocost + b_range * ngv_range + b_offsite_value * ngv_offsite_binary
  
  
  ### Compute probabilities for "forced" choice using MNL model
  mnl_settings_forced = list(
    alternatives = c(bev=1, hfcev=2),
    avail = list(bev=alt_electric, hfcev=alt_hydrogen),
    choiceVar = choice, 
    utilities = list(bev = V[["bev"]],
                     hfcev = V[["hfcev"]]),
    rows = (forced==1)
  )
  
  P[["choice_forced"]] = apollo_mnl(mnl_settings_forced, functionality)
  
  
  ### Compute probabilities for "unforced" choice using MNL model
  mnl_settings_unforced = list(
    alternatives = c(bev=1, hfcev=2, dsl=3, ngv=4),
    avail = list(bev=alt_electric, hfcev=alt_hydrogen, dsl=alt_diesel, ngv=alt_cng),
    choiceVar = choice, 
    utilities = list(bev = mu_unforced*V[["bev"]],
                     hfcev = mu_unforced*V[["hfcev"]],
                     dsl = mu_unforced*V[["dsl"]],
                     ngv = mu_unforced*V[["ngv"]]),
    rows = (forced==2)
  )
  
  P[["choice_unforced"]] = apollo_mnl(mnl_settings_unforced, functionality)
  
  
  ### Combined model
  P = apollo_combineModels(P, apollo_inputs, functionality)
  
  ### Take product across observation for same individual
  P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

#Model estimation
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)

#Model outputs
# ------------------------------------------------------------------------------- #
# -------------------------- FORMATTED OUTPUT (TO SCREEN) ------------------------
# ------------------------------------------------------------------------------- #
apollo_modelOutput(model)
# ------------------------------------------------------------------------------- #
# ------------------ FORMATTED OUTPUT (TO FILE, using model name) ----------------
# ------------------------------------------------------------------------------- #
apollo_saveOutput(model)
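
To make the role of mu_unforced in the script above more concrete, here is a minimal base-R sketch (outside Apollo) of how a scale parameter changes MNL probabilities. The function name mnl_prob and the utility values are hypothetical and purely illustrative; they are not taken from the survey data.

```r
# Hypothetical helper: MNL probabilities for a vector of utilities V,
# with all utilities multiplied by a scale parameter mu.
mnl_prob <- function(V, mu = 1) {
  ev <- exp(mu * (V - max(V)))  # subtract max(V) for numerical stability
  ev / sum(ev)
}

# Made-up utilities for one unforced choice task (illustration only)
V <- c(bev = 0.5, hfcev = 0.0, dsl = 1.0)

p_base  <- mnl_prob(V, mu = 1)    # same scale as the forced choices
p_noisy <- mnl_prob(V, mu = 0.5)  # lower scale, i.e. more noise

print(round(rbind(p_base, p_noisy), 3))
```

With mu below 1 the probabilities move towards equal shares, which corresponds to more noise in the unforced choices; with mu above 1 they become more deterministic. The ranking of the alternatives is unchanged, because every utility is multiplied by the same positive constant.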

CASE 2 - Joint estimation of three datasets

Code:

###############################################
### LOAD LIBRARY AND DEFINE CORE SETTINGS   ###
###############################################

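rm(list=ls())
# Install Apollo only if it is missing, rather than reinstalling on every run
if (!requireNamespace("apollo", quietly = TRUE)) install.packages("apollo")
library(apollo)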

#Initialize code
apollo_initialise()

#Set core controls
apollo_control = list(
  modelName = "Main_JOINT-INTRTN2-THREE-DATASETS",
  modelDescr = "Joint estimation of forced, unforced diesel, and unforced NG datasets",
  indivID = "ID",
  outputDirectory = "output"
)

###############################################
### LOAD DATA AND APPLY ANY TRANSFORMATIONS ###
###############################################

#consider both forced and unforced choice data
database = read.csv("E:/Survey/Estimation/main_survey.csv", header=TRUE)


###############################################
### DEFINE MODEL PARAMETERS                 ###
###############################################

#Vector of parameters
apollo_beta = c(asc_bev = 0, asc_hfcev = 0, asc_ngv = 0, asc_dsl = 0,
                b_pcost = 0, b_ocost = 0, b_range = 0, b_offsite = 0, b_onsite_bev = 0, b_onsite_hfcev = 0,
                asc_bev_shift_adopter = 0, asc_hfcev_shift_adopter = 0,
                asc_ngv_shift_small_org = 0,
                b_ocost_shift_small_fleet = 0,
                mu_unforced_dsl = 1, mu_unforced_ng = 1)

# Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta
apollo_fixed = c("asc_dsl")


###############################################
### GROUP AND VALIDATE INPUTS               ###
###############################################

apollo_inputs = apollo_validateInputs()


###############################################
### DEFINE MODEL AND LIKELIHOOD FUNCTION    ###
###############################################

apollo_probabilities = function (apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### Create coefficients using interactions with fleet characteristics
  asc_bev_value = asc_bev + asc_bev_shift_adopter * bev_adopter
  asc_hfcev_value = asc_hfcev + asc_hfcev_shift_adopter * hfcev_adopter
  asc_ngv_value = asc_ngv + asc_ngv_shift_small_org * small_org3
  b_pcost_value = b_pcost / relative_annual_revenue
  b_ocost_value = b_ocost + b_ocost_shift_small_fleet * small_fleet3
  b_offsite_value = b_offsite * small_fleet3
  
  ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
  V = list()
  V[["bev"]] = asc_bev_value + b_pcost_value * bev_pcost + b_ocost_value * bev_ocost + b_range * bev_range + b_offsite_value * bev_offsite_binary + b_onsite_bev * bev_onsite
  V[["hfcev"]] = asc_hfcev_value + b_pcost_value * hfcev_pcost + b_ocost_value * hfcev_ocost + b_range * hfcev_range + b_offsite_value * hfcev_offsite_binary + b_onsite_hfcev * hfcev_onsite
  V[["dsl"]] = asc_dsl + b_pcost_value * dsl_pcost + b_ocost_value * dsl_ocost + b_range * dsl_range + b_offsite_value * dsl_offsite_binary
  V[["ngv"]] = asc_ngv_value + b_pcost_value * ngv_pcost + b_ocost_value * ngv_ocost + b_range * ngv_range + b_offsite_value * ngv_offsite_binary
  
  
  ### Compute probabilities for "forced" choice using MNL model
  mnl_settings_forced = list(
    alternatives = c(bev=1, hfcev=2),
    avail = list(bev=alt_electric, hfcev=alt_hydrogen),
    choiceVar = choice, 
    utilities = list(bev = V[["bev"]],
                     hfcev = V[["hfcev"]]),
    rows = (choice_set==2)
  )
  
  P[["choice_forced"]] = apollo_mnl(mnl_settings_forced, functionality)
  
  
  ### Compute probabilities for "unforced" choice for "diesel fleets" using MNL model
  mnl_settings_unforced_dsl = list(
    alternatives = c(bev=1, hfcev=2, dsl=3),
    avail = list(bev=alt_electric, hfcev=alt_hydrogen, dsl=alt_diesel),
    choiceVar = choice, 
    utilities = list(bev = mu_unforced_dsl*V[["bev"]],
                     hfcev = mu_unforced_dsl*V[["hfcev"]],
                     dsl = mu_unforced_dsl*V[["dsl"]]),
    rows = (choice_set==3)
  )
  
  P[["choice_unforced_dsl"]] = apollo_mnl(mnl_settings_unforced_dsl, functionality)
  
  
  ### Compute probabilities for "unforced" choice for "NG fleets" using MNL model
  mnl_settings_unforced_ng = list(
    alternatives = c(bev=1, hfcev=2, dsl=3, ngv=4),
    avail = list(bev=alt_electric, hfcev=alt_hydrogen, dsl=alt_diesel, ngv=alt_cng),
    choiceVar = choice, 
    utilities = list(bev = mu_unforced_ng*V[["bev"]],
                     hfcev = mu_unforced_ng*V[["hfcev"]],
                     dsl = mu_unforced_ng*V[["dsl"]],
                     ngv = mu_unforced_ng*V[["ngv"]]),
    rows = (choice_set==4)
  )
  
  P[["choice_unforced_ng"]] = apollo_mnl(mnl_settings_unforced_ng, functionality)
  
  
  ### Combined model
  P = apollo_combineModels(P, apollo_inputs, functionality)
  
  ### Take product across observation for same individual
  P = apollo_panelProd(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

#Model estimation
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)

#Model outputs
# ------------------------------------------------------------------------------- #
# -------------------------- FORMATTED OUTPUT (TO SCREEN) ------------------------
# ------------------------------------------------------------------------------- #
apollo_modelOutput(model)
# ------------------------------------------------------------------------------- #
# ------------------ FORMATTED OUTPUT (TO FILE, using model name) ----------------
# ------------------------------------------------------------------------------- #
apollo_saveOutput(model)

In addition, could you please let me know if there are any erroneous parts in my R scripts above? I'd appreciate any suggestions for improvements.

Thank you very much!

Best regards,

YB
stephanehess
Site Admin
Posts: 1064
Joined: 24 Apr 2020, 16:29

Re: Joint Estimation for Dual Response Survey with Two Segments

Post by stephanehess »

Hi

It looks correct to me. I would prefer specification 2 (CASE 2), as it allows for further scale differences. In addition, however, did you test for differences between the samples in how they react to the individual attributes?

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
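
One common way to test whether two samples react differently to an individual attribute is to add a segment-specific shift to that attribute's coefficient and compare the model against the generic one. In Apollo this might look like b_range + b_range_shift_ng * ng_fleet, where both b_range_shift_ng and ng_fleet are hypothetical names. The sketch below illustrates the idea in base R on simulated data, using a binary logit (the two-alternative MNL) on attribute differences. All variable names and "true" values are assumptions, and the panel structure of the real data is ignored.

```r
# Simulated (hypothetical) data: one row per forced choice between BET and HFCET
set.seed(2024)
n        <- 324
ng_fleet <- rbinom(n, 1, 72 / 324)  # 1 = NG fleet, 0 = diesel fleet (simulated)
d_range  <- rnorm(n)                # range difference, BET minus HFCET (simulated)
d_ocost  <- rnorm(n)                # operating cost difference (simulated)

# Assumed utility difference, with a weaker range effect for NG fleets
v_diff     <- 0.25 * d_range - 0.90 * d_ocost - 0.10 * ng_fleet * d_range
choose_bet <- rbinom(n, 1, plogis(v_diff))

# Generic model: same range sensitivity in both segments
generic <- glm(choose_bet ~ d_range + d_ocost, family = binomial)

# Shifted model: extra range coefficient for NG fleets only
shifted <- glm(choose_bet ~ d_range + d_ocost + d_range:ng_fleet,
               family = binomial)

# Likelihood ratio test with one additional parameter
lr_stat <- 2 * (as.numeric(logLik(shifted)) - as.numeric(logLik(generic)))
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

print(coef(summary(shifted)))
print(c(LR = lr_stat, p = p_value))
```

The t-ratio on the shift term and the likelihood ratio test address the same question for a single attribute. Shifts on several attributes can be tested jointly by increasing the degrees of freedom accordingly.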
bye1830
Posts: 5
Joined: 24 Apr 2024, 06:40

Re: Joint Estimation for Dual Response Survey with Two Segments

Post by bye1830 »

Hi Stephane,

Thank you for reviewing my script and providing suggestions. Regarding the test for differences between the two segments, could you advise me on the types of approaches I could use?

I've attempted one potential method. For diesel fleets and NG fleets, the forced choice questions are the same, while the unforced choice questions were presented with different options. Thus, I attempted to compare estimation results using the forced-choice data between the two segments and also conducted joint estimation combining these two segments.

Here are the results. Just FYI, I've modified the utility functions from my previous post after reflecting on your reply to my other post (i.e., treating 'annual_revenue' as a categorical variable).

Estimation results for diesel fleets (forced choice, 252 observations)

Code:

                            Estimate  Std.err.  t-ratio(0)  Rob.std.err.  Rob.t-ratio(0)
asc_bev                        1.147     1.145       1.002         1.371           0.836
asc_hfcev                      0.000        NA          NA            NA              NA
b_pcost                        0.127     0.773       0.164         0.636           0.199
b_ocost                       -0.840     0.582      -1.443         0.669          -1.254
b_range                        0.261     0.055       4.784         0.067           3.877
b_offsite                      0.259     0.194       1.337         0.206           1.262
b_onsite_bev                  -1.053     0.997      -1.056         1.197          -0.880
b_onsite_hfcev                 0.429     1.004       0.427         1.266           0.339
asc_bev_shift_adopter          1.530     0.590       2.596         0.552           2.771
b_pcost_AR_less_than_10M      -0.396     0.808      -0.489         0.669          -0.591
b_pcost_AR_between_10M_15M    -0.206     0.984      -0.209         0.697          -0.295
b_pcost_AR_between_15M_30M    -2.098     1.312      -1.598         0.907          -2.314
b_pcost_AR_NA                 -0.130     0.906      -0.143         0.709          -0.183

*The 'asc_hfcev_shift_adopter' parameter was excluded from the utility function as none of the diesel fleets operate HFCEVs in my sample.

--> The estimates for 'b_range', 'asc_bev_shift_adopter', and 'b_pcost_AR_between_15M_30M' are significant at the 1% or 5% level.

Estimation results for NG fleets (forced choice, 72 observations)

Code:

                            Estimate  Std.err.  t-ratio(0)  Rob.std.err.  Rob.t-ratio(0)
asc_bev                       -0.092     2.136      -0.043         3.474          -0.026
asc_hfcev                      0.000        NA          NA            NA              NA
b_pcost                       -0.498     0.672      -0.740         0.592          -0.841
b_ocost                       -1.249     1.112      -1.123         0.887          -1.408
b_range                        0.186     0.107       1.730         0.114           1.624
b_offsite                      0.862     0.551       1.564         0.631           1.367
b_onsite_bev                  -0.623     1.846      -0.338         3.035          -0.205
b_onsite_hfcev                 0.005     1.859       0.002         2.983           0.002
asc_bev_shift_adopter          2.003     0.886       2.261         0.482           4.159
asc_hfcev_shift_adopter        1.958     0.969       2.019         0.856           2.288
b_pcost_AR_less_than_10M      -0.142     0.910      -0.156         0.873          -0.162
b_pcost_AR_between_10M_15M     0.438     1.198       0.366         0.813           0.539
b_pcost_AR_between_15M_30M     0.010     1.360       0.008         0.643           0.016

*The 'b_pcost_AR_NA' parameter was excluded from the utility functions because none of the NG fleets chose the 'Decline to state' option in the annual revenue question.

--> The estimates for 'b_range', 'asc_bev_shift_adopter', and 'asc_hfcev_shift_adopter' are significant at the 1%, 5%, or 10% level. For the estimates that are significant for both diesel and NG fleets, the magnitudes differ (0.261 vs 0.186 for 'b_range', and 1.530 vs 2.003 for 'asc_bev_shift_adopter').


Joint estimation results for both fleets (forced choice, 324 observations)

Code:

                            Estimate  Std.err.  t-ratio(0)  Rob.std.err.  Rob.t-ratio(0)
asc_bev                        1.085     1.055       1.028         1.284           0.845
asc_hfcev                      0.000        NA          NA            NA              NA
b_pcost                       -0.181     0.558      -0.324         0.481          -0.375
b_ocost                       -0.921     0.534      -1.724         0.601          -1.533
b_range                        0.253     0.053       4.801         0.065           3.902
b_offsite                      0.320     0.186       1.724         0.201           1.592
b_onsite_bev                  -1.107     0.915      -1.210         1.116          -0.991
b_onsite_hfcev                 0.451     0.919       0.490         1.171           0.385
asc_bev_shift_adopter          1.784     0.539       3.309         0.484           3.689
asc_hfcev_shift_adopter        2.216     0.774       2.861         1.031           2.149
b_pcost_AR_less_than_10M      -0.134     0.602      -0.222         0.523          -0.256
b_pcost_AR_between_10M_15M     0.151     0.778       0.194         0.551           0.273
b_pcost_AR_between_15M_30M    -1.339     1.033      -1.296         0.835          -1.603
b_pcost_AR_NA                  0.182     0.734       0.248         0.577           0.316
mu_ngv_fleets                  0.785     0.326       2.409         0.305           2.576

--> The estimates for 'b_ocost', 'b_range', 'b_offsite', 'asc_bev_shift_adopter', and 'asc_hfcev_shift_adopter' are significant at the 1%, 5%, or 10% level. In particular, the scale parameter 'mu_ngv_fleets' is significant at the 5% level.
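
For a scale parameter, the t-ratio against 0 shown in the table is usually less informative than a test against 1, since a value of 1 means no scale difference between the segments. A minimal sketch of that calculation, using the estimate and standard errors reported for 'mu_ngv_fleets' above; the two-sided normal approximation for the p-values is an assumption:

```r
# Values copied from the joint estimation table for mu_ngv_fleets
est    <- 0.785
se     <- 0.326
rob_se <- 0.305

# t-ratios against the null hypothesis mu = 1 (no scale difference)
t1     <- (est - 1) / se      # about -0.66
t1_rob <- (est - 1) / rob_se  # about -0.70

# Two-sided p-values under a normal approximation (assumption)
p1     <- 2 * pnorm(-abs(t1))
p1_rob <- 2 * pnorm(-abs(t1_rob))

print(round(c(t1 = t1, t1_rob = t1_rob, p1 = p1, p1_rob = p1_rob), 3))
```

On these numbers, both t-ratios against 1 are well below the usual critical values in absolute terms, so the estimate would not by itself indicate a scale difference between the two fleet types.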

Then, would it be reasonable to say these two segments have differences in responding to individual attributes? Are there any approaches to test the differences?

In addition, I wonder whether I should rely on the classical or the robust t-ratios when determining the significance of each estimate. Do you have any recommendations, especially given that my sample is relatively small (324 observations for each of the forced and unforced choices)?

I'd greatly appreciate any wisdom and insights you could share. Thank you very much!

Best regards,

YB
stephanehess
Site Admin
Posts: 1064
Joined: 24 Apr 2020, 16:29

Re: Joint Estimation for Dual Response Survey with Two Segments

Post by stephanehess »

Hi

The two separate models are equivalent to a fully segmented model. So you can use a likelihood ratio (LR) test to compare the sum of the log-likelihoods of the two separate models against the log-likelihood of your generic model without differences. The degrees of freedom would be the number of additional parameters needed for the two separate models.

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
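
The LR test described above can be sketched in base R as follows. The function name lr_test_segmented is hypothetical, and the log-likelihood values are placeholders, because the thread does not report them. The parameter counts are read from the tables above (12 estimated parameters in each segment model and 14 in the joint model, excluding the fixed asc_hfcev).

```r
# Hypothetical helper: LR test of a generic (pooled) model against a fully
# segmented specification made up of separately estimated models.
lr_test_segmented <- function(ll_generic, ll_segments, k_generic, k_segments) {
  ll_full <- sum(ll_segments)            # LL of the fully segmented model
  df      <- sum(k_segments) - k_generic # additional parameters
  stat    <- 2 * (ll_full - ll_generic)  # LR statistic
  p_value <- pchisq(stat, df = df, lower.tail = FALSE)
  list(statistic = stat, df = df, p_value = p_value)
}

# Placeholder log-likelihoods (NOT from the thread); replace them with the
# final log-likelihoods of the three estimated models.
res <- lr_test_segmented(
  ll_generic  = -180,          # joint forced-choice model (placeholder)
  ll_segments = c(-135, -38),  # diesel and NG models (placeholders)
  k_generic   = 14,
  k_segments  = c(12, 12)
)

print(res)
```

A small p-value would reject the generic model in favour of separate models for the two segments. With only 72 observations in the NG segment, the test may have limited power.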