Thank you, Stephane. Your comments and answers have finally guided me to what I'm after.
To zoom out, I'm trying to classify the respondents in my dataset into distinct behavioural groups based on the estimates of their latent variables. To do that, I want to estimate the class allocation probabilities as a function of the latent variables; in other words (or, as I think of it in simple terms), I want to estimate only the indicators and the class allocation probabilities of a Latent Variable Latent Class model. I then want to use apollo_lcUnconditionals to pull out the allocation probabilities for each individual and create a dichotomous variable that equals 1 for individuals who fall into that class (allocation probability above a certain threshold) and 0 for those who don't (probability below the threshold). I then want to use that variable as a predictor in a series of choice models. If you're curious, the reason I want the estimation to be sequential is that I want the class allocation to be the same across all the dependent variables that I will use in the different choice models.
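To make the thresholding step concrete, here is a minimal sketch of what I have in mind. It assumes I end up with one class-A allocation probability per respondent; the vector piA, the cut-off of 0.7 and the name classA are all made up for illustration and are not taken from my data or from Apollo's output.

```r
# Hypothetical allocation probabilities for class A, one per respondent
# (illustrative values only, not estimates from my data)
piA <- c(0.91, 0.12, 0.55, 0.78, 0.30)

# Hypothetical cut-off; the actual threshold is still to be decided
threshold <- 0.7

# Dichotomous variable: 1 if the probability of class A exceeds the
# threshold, 0 otherwise
classA <- as.integer(piA > threshold)

print(classA)
```

The resulting 0/1 variable is what I would then merge back into the database and use as a predictor in the later choice models.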
While browsing this forum for a relevant example, I stumbled upon this answer -
and used it as guidance for my case. In the code below I tried to adapt that example to my needs, with the intent of using one latent variable to predict class allocation. Unsurprisingly, it didn't work: inClassProb=P has a different length from classProb=pi_values (P holds the six indicator likelihoods, while pi_values holds two class probabilities), and without a choice component I don't know how and where to state that I'm interested in 2 classes only. My hunch is that for my purposes I don't even need inClassProb, but without it lc_settings doesn't work (which you probably know well).
# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS ####
# ################################################################# #
### Clear memory
rm(list = ls())
### Load libraries
library(apollo)
### Initialise code
apollo_initialise()
### Set core controls
apollo_control = list(
  modelName  = "Translink_Time_Covid_OL_1LV_Est",
  modelDescr = "ICLV for Translink with 1 LV Estimation",
  indivID    = "UNID",
  panelData  = FALSE,
  mixing     = TRUE,
  nCores     = 3
)
# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS ####
# ################################################################# #
setwd("D:/Research/2020_TransLink_Overcrowding/Data/Time_Covid")
database = read.csv("translink_time_covid.csv",header=TRUE)
# ################################################################# #
#### DEFINE MODEL PARAMETERS ####
# ################################################################# #
### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c(zeta_acon           = 1,
                zeta_scon           = 1,
                zeta_both           = 1,
                zeta_seat           = 1,
                zeta_offpeak        = 1,
                zeta_alt            = 1,
                tau_acon_1          = -2,
                tau_acon_2          = -1,
                tau_acon_3          = 1,
                tau_acon_4          = 2,
                tau_scon_1          = -2,
                tau_scon_2          = -1,
                tau_scon_3          = 1,
                tau_scon_4          = 2,
                tau_both_1          = -2,
                tau_both_2          = -1,
                tau_both_3          = 1,
                tau_both_4          = 2,
                tau_seat_1          = -2,
                tau_seat_2          = -1,
                tau_seat_3          = 1,
                tau_seat_4          = 2,
                tau_offpeak_1       = -2,
                tau_offpeak_2       = -1,
                tau_offpeak_3       = 1,
                tau_offpeak_4       = 2,
                tau_alt_1           = -2,
                tau_alt_2           = -1,
                tau_alt_3           = 1,
                tau_alt_4           = 2,
                gamma_LV1_female    = 0,
                gamma_LV1_wave2     = 0,
                gamma_LV1_ampeak    = 0,
                gamma_LV1_fulltime  = 0,
                gamma_LV1_incomelow = 0,
                gamma_LV1_age65O    = 0,
                gamma_LV1_edbach    = 0,
                gamma_LV1_ptno      = 0,
                gamma_LV1_kids      = 0,
                gamma_LV1_car       = 0,
                piCons              = 1,
                piLV1               = 1)
### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c()
# ################################################################# #
#### DEFINE RANDOM COMPONENTS ####
# ################################################################# #
### Set parameters for generating draws
apollo_draws = list(
  interDrawsType = "halton",
  interNDraws    = 100,
  interUnifDraws = c(),
  interNormDraws = c("eta1"),
  intraDrawsType = '',
  intraNDraws    = 0,
  intraUnifDraws = c(),
  intraNormDraws = c()
)
### Create random parameters
apollo_randCoeff = function(apollo_beta, apollo_inputs){
  randcoeff = list()
  ### Structural equation of the latent variable
  randcoeff[["LV1"]] = gamma_LV1_female * FEMALE + gamma_LV1_wave2 * WAVE2 + gamma_LV1_ampeak * TIME_BC_AM +
    gamma_LV1_fulltime * EMPLOY_FULL + gamma_LV1_incomelow * INCOME_LOW + gamma_LV1_age65O * AGE_65O +
    gamma_LV1_edbach * EDUCATION_BACH + gamma_LV1_kids * HOUSEHOLD_CHILD_CLEAN_N + gamma_LV1_car * CAR_B +
    gamma_LV1_ptno * PT_C_NO +
    eta1
  return(randcoeff)
}
# ################################################################# #
#### DEFINE LATENT CLASS COMPONENTS ####
# ################################################################# #
apollo_lcPars = function(apollo_beta, apollo_inputs){
  lcpars = list()
  ### Class allocation probabilities
  ### These are the probabilities of a binary logit model
  ### apollo_mnl could be used too (with functionality="raw"
  ### and choice=NA), but explicitly writing the probability
  ### is easier.
  VA  = piCons + piLV1*LV1
  VB  = 0
  piA = exp(VA)/(exp(VA) + exp(VB))
  piB = 1 - piA
  lcpars[["pi_values"]] = apollo_firstRow(list(piA, piB), apollo_inputs)
  return(lcpars)
}
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
# ################################################################# #
apollo_probabilities = function(apollo_beta, apollo_inputs, functionality="estimate"){
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  ### Create list of probabilities P
  P = list()
  ### Likelihood of indicators
  ol_settings1 = list(outcomeOrdered = AGREE_BOTH_CROWD_BC_R,
                      V              = zeta_both*LV1,
                      tau            = list(tau_both_1, tau_both_2, tau_both_3, tau_both_4))
  ol_settings2 = list(outcomeOrdered = AGREE_CONCERN_BC_R,
                      V              = zeta_acon*LV1,
                      tau            = list(tau_acon_1, tau_acon_2, tau_acon_3, tau_acon_4))
  ol_settings3 = list(outcomeOrdered = AGREE_SEAT_BC_R,
                      V              = zeta_seat*LV1,
                      tau            = list(tau_seat_1, tau_seat_2, tau_seat_3, tau_seat_4))
  ol_settings4 = list(outcomeOrdered = STATE_CONCERNED_R,
                      V              = zeta_scon*LV1,
                      tau            = list(tau_scon_1, tau_scon_2, tau_scon_3, tau_scon_4))
  ol_settings5 = list(outcomeOrdered = AGREE_OFFPEAK_BC_R,
                      V              = zeta_offpeak*LV1,
                      tau            = list(tau_offpeak_1, tau_offpeak_2, tau_offpeak_3, tau_offpeak_4))
  ol_settings6 = list(outcomeOrdered = AGREE_ALT_BC_R,
                      V              = zeta_alt*LV1,
                      tau            = list(tau_alt_1, tau_alt_2, tau_alt_3, tau_alt_4))
  P[["indic_both"]]    = apollo_ol(ol_settings1, functionality)
  P[["indic_acon"]]    = apollo_ol(ol_settings2, functionality)
  P[["indic_seat"]]    = apollo_ol(ol_settings3, functionality)
  P[["indic_scon"]]    = apollo_ol(ol_settings4, functionality)
  P[["indic_offpeak"]] = apollo_ol(ol_settings5, functionality)
  P[["indic_alt"]]     = apollo_ol(ol_settings6, functionality)
  ### Compute latent class model probabilities
  lc_settings = list(inClassProb=P, classProb=pi_values)
  P[["model"]] = apollo_lc(lc_settings, apollo_inputs, functionality)
  ### Likelihood of the whole model
  P = apollo_combineModels(P, apollo_inputs, functionality)
  ### Take product across observation for same individual
  #P = apollo_panelProd(P, apollo_inputs, functionality)
  ### Average across inter-individual draws
  P = apollo_avgInterDraws(P, apollo_inputs, functionality)
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}
# ################################################################# #
#### ESTIMATE SETTINGS ####
# ################################################################# #
estimate_settings = list(maxIterations = 250)
# ################################################################# #
#### MODEL ESTIMATION ####
# ################################################################# #
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs, estimate_settings)
# ################################################################# #
#### MODEL OUTPUTS ####
# ################################################################# #
# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN) ----
# ----------------------------------------------------------------- #
apollo_modelOutput(model)
# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name) ----
# ----------------------------------------------------------------- #
apollo_saveOutput(model)
Can you please advise me on how to modify the code above to achieve what I'm after? Also, is it possible to streamline the process and assign each respondent to a latent class using allocation thresholds right away?