Latent variable estimation

bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Latent variable estimation

Post by bokapatsila »

Hi Stephane and David,

Is it possible to use Apollo for the estimation of latent variables only?

I know how to code 4 latent variables that are based on 16 Likert-scale questions and some demographics, using the structural equation part of a hybrid choice model. However, I am not interested in the measurement part of the model, but only need estimates for those 4 latent variables. Can this be estimated using Apollo? If yes, what changes have to be made to the ICLV example (#24) available for Apollo? Thanks!
stephanehess
Site Admin
Posts: 998
Joined: 24 Apr 2020, 16:29

Re: Latent variable estimation

Post by stephanehess »

Hi

it's not quite clear what you mean here.

The estimation of a model requires a dependent variable. In a hybrid choice model, you have choices and Likert-scale questions as the dependent variables. Which of these would you keep?

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Re: Latent variable estimation

Post by bokapatsila »

Hi Stephane,

I'm sorry I wasn't clear enough. I don't want to estimate a hybrid choice model at this point. I want to estimate the 4 latent variables only and use these estimates to experiment with their influence on different dependent variables, to see which of them works best before estimating everything simultaneously as a hybrid choice model. Is this possible in Apollo? Thanks!
stephanehess
Site Admin
Posts: 998
Joined: 24 Apr 2020, 16:29

Re: Latent variable estimation

Post by stephanehess »

Hi

I understand that what you are after here is the parameters of the structural equation for the latent variable. However, my point is that for estimation, you need a dependent variable, so you can't drop both the choices and the indicators. Maybe what you want to do is keep only the indicators as the dependent variables, like in SEM.
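
One way to write down such an indicators-only model, as a sketch rather than a definitive specification: the symbols below are assumptions for illustration (z_n are respondent covariates, \Lambda is the logistic CDF, \phi is the standard normal density, and the variance of \eta_n is normalised to 1, with \tau_{k,0} = -\infty and \tau_{k,J} = +\infty).

```latex
\begin{aligned}
LV_n &= \gamma' z_n + \eta_n, \qquad \eta_n \sim N(0,1), \\
P(I_{nk} = j \mid LV_n) &= \Lambda(\tau_{k,j} - \zeta_k LV_n) - \Lambda(\tau_{k,j-1} - \zeta_k LV_n), \\
L_n &= \int \prod_{k} P(I_{nk} \mid LV_n)\, \phi(\eta_n)\, d\eta_n .
\end{aligned}
```

Here the Likert-scale indicators I_{nk} are the only dependent variables, and the structural parameters \gamma are identified through them.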

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Re: Latent variable estimation

Post by bokapatsila »

That's actually what I'm after, thanks for this suggestion. In that case, can I simultaneously estimate 4 ordered logit models with normally distributed error terms in Apollo? Or would the only way to do it in Apollo be to estimate each of them separately? Thank you for the suggestions!
stephanehess
Site Admin
Posts: 998
Joined: 24 Apr 2020, 16:29

Re: Latent variable estimation

Post by stephanehess »

An ordered logit model does not have a normally distributed error term. But I guess what you mean is to include the latent variable in the utility function of the ordered logit models. However, you'll need to use the LV in at least two OLs, so there can't be 4 OLs in your case.
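
To make the distinction concrete, here is a minimal base R sketch (not Apollo code) of the probabilities for one 5-level indicator, with the latent variable entering the utility. The function name ol_probs and all numeric values are made up for illustration. The error term is logistic (plogis); swapping in pnorm would give an ordered probit instead.

```r
# Ordered logit probabilities for a 5-level indicator (illustrative only).
# V = zeta * LV is the utility; tau holds the 4 increasing thresholds.
ol_probs <- function(V, tau, cdf = plogis) {
  # Cumulative probabilities P(y <= j), padded with 0 and 1 at the ends
  cum <- c(0, cdf(tau - V), 1)
  diff(cum)  # P(y = j) for j = 1..length(tau) + 1
}

zeta <- 1                 # hypothetical loading of the LV on the indicator
LV   <- 0.5               # hypothetical value of the latent variable
tau  <- c(-2, -1, 1, 2)   # hypothetical thresholds

p_logit  <- ol_probs(zeta * LV, tau)               # logistic error: ordered logit
p_probit <- ol_probs(zeta * LV, tau, cdf = pnorm)  # normal error: ordered probit
print(round(p_logit, 3))
print(round(p_probit, 3))
```

In both cases the random term of the latent variable itself can still be normal; that is separate from the error term of the ordered model.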
--------------------------------
Stephane Hess
www.stephanehess.me.uk
bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Re: Latent variable estimation

Post by bokapatsila »

stephanehess wrote: 07 Oct 2021, 17:09 Hi

I understand that what you are after here is the parameters of the structural equation for the latent variable. However, my point is that for estimation, you need a dependent variable, so you can't drop both the choices and the indicators. Maybe what you want to do is keep only the indicators as the dependent variables, like in SEM.

Stephane

Thank you, Stephane. In this case, when I define latent variable 1 (LV1) as:

randcoeff[["LV1"]] = gamma_LV1_female * FEMALE + gamma_LV1_wave2 * WAVE2 + gamma_LV1_ampeak * TIME_BC_AM +
gamma_LV1_fulltime * EMPLOY_FULL + gamma_LV1_incomelow * INCOME_LOW + gamma_LV1_age65O * AGE_65O +
gamma_LV1_edbach * EDUCATION_BACH + gamma_LV1_kids * HOUSEHOLD_CHILD_CLEAN_N + gamma_LV1_car * CAR_B +
gamma_LV1_ptno * PT_C_NO +
eta1

which is then used in the indicators:

ol_settings9 = list(outcomeOrdered=STATE_APP_R,
V=zeta_app*LV1,
tau=list(tau_app_1, tau_app_2, tau_app_3, tau_app_4))
ol_settings10 = list(outcomeOrdered=STATE_MPAY_R,
V=zeta_mpay*LV1,
tau=list(tau_mpay_1, tau_mpay_2, tau_mpay_3, tau_mpay_4))

P[["indic_app"]] = apollo_ol(ol_settings9, functionality)
P[["indic_mpay"]] = apollo_ol(ol_settings10, functionality)

Can I obtain an estimate for LV1 for each individual in my dataset? If yes, how would I do that?
stephanehess
Site Admin
Posts: 998
Joined: 24 Apr 2020, 16:29

Re: Latent variable estimation

Post by stephanehess »

Hi

there is no such thing as an estimate for each person unless you estimate person-specific models, for which you would need very large amounts of data per person. I assume what you are referring to is the Bayesian idea of posteriors from the sample-level distribution. For this, you can use apollo_conditionals. There is a discussion in the manual.
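
As a rough illustration of what such a posterior (conditional) mean is, the base R sketch below simulates it for one hypothetical respondent: draws of the latent variable are weighted by the likelihood of that respondent's observed indicators. It is not Apollo code, and every number in it (the structural part, loadings, thresholds, responses) is invented. In Apollo itself, the corresponding call would be along the lines of apollo_conditionals(model, apollo_probabilities, apollo_inputs).

```r
set.seed(1)

# Ordered logit probability of observing level y, for a vector of utilities V
ol_prob <- function(y, V, tau) {
  # Rows: cumulative probabilities (0, F(tau_1 - V), ..., F(tau_4 - V), 1); columns: draws
  cum <- rbind(0, plogis(outer(tau, V, "-")), 1)
  cum[y + 1, ] - cum[y, ]
}

R     <- 5000
mu    <- 0.3               # hypothetical deterministic part (gamma * covariates)
LV    <- mu + rnorm(R)     # draws of the latent variable: mu + eta
tau   <- c(-2, -1, 1, 2)   # hypothetical thresholds, shared by both indicators
zeta  <- c(1.0, 0.8)       # hypothetical loadings for two indicators
y_obs <- c(5, 4)           # hypothetical responses (high agreement)

# Likelihood of the observed indicators at each draw
L <- ol_prob(y_obs[1], zeta[1] * LV, tau) * ol_prob(y_obs[2], zeta[2] * LV, tau)

post_mean   <- sum(LV * L) / sum(L)   # conditional (posterior) mean of the LV
uncond_mean <- mean(LV)               # unconditional mean, ignores the responses
print(c(unconditional = uncond_mean, conditional = post_mean))
```

The conditional mean is a summary of a distribution for that respondent, not a person-specific estimate; the spread around it remains.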

Best wishes

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
bokapatsila
Posts: 21
Joined: 28 Jul 2021, 02:41

Re: Latent variable estimation

Post by bokapatsila »

Thank you, Stephane. Your comments and answers have finally guided me to what I'm after.

To zoom out, I'm trying to classify the respondents in my dataset into distinct behavioural groups based on the estimates for their latent variables. To do that, I want to estimate the class allocation probabilities based on the values of the latent variables; in other words (or, as I think of it in simple terms), estimate only the indicators and class allocation probabilities of a Latent Variable Latent Class model. I then want to use apollo_lcUnconditionals to pull out the allocation probabilities for each individual and create a dichotomous variable that is 1 for individuals who fall into that class (probability above a certain threshold) and 0 for those who don't (probability below that threshold). I then want to use that variable as a predictor in a series of choice models. If you're curious, the reason I want this estimation to be sequential is that I want the class allocation to be the same for all the different dependent variables that I will use in different choice models.

While browsing this forum for a relevant example, I stumbled upon this answer - http://www.apollochoicemodelling.com/fo ... +class#p74 and used it as guidance for my case. In the code below I tried to adapt that example to my needs, with the intent of using one latent variable to predict class allocation. Obviously, it didn't work, since inClassProb=P has a different length from classProb=pi_values, and without the choice component I don't know how and where to state that I'm interested in 2 classes only. My hunch is that for my purposes I don't even need inClassProb, but without it lc_settings doesn't work (which you probably know well).

Code: Select all

# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS                       ####
# ################################################################# #

### Clear memory
rm(list = ls())

### Load libraries
library(apollo)

### Initialise code
apollo_initialise()

### Set core controls
apollo_control = list(
  modelName  ="Translink_Time_Covid_OL_1LV_Est",
  modelDescr ="ICLV for Translink with 1 LV Estimation",
  indivID    ="UNID",
  panelData = FALSE,
  mixing     = TRUE,
  nCores     = 3)

# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
# ################################################################# #

setwd("D:/Research/2020_TransLink_Overcrowding/Data/Time_Covid")

database = read.csv("translink_time_covid.csv",header=TRUE)

# ################################################################# #
#### DEFINE MODEL PARAMETERS                                     ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta=c(zeta_acon = 1, 
              zeta_scon = 1,
              zeta_both = 1,
              zeta_seat = 1,
              zeta_offpeak = 1, 
              zeta_alt = 1, 
              tau_acon_1 =-2, 
              tau_acon_2 =-1, 
              tau_acon_3 = 1, 
              tau_acon_4 = 2,
              tau_scon_1 =-2, 
              tau_scon_2 =-1, 
              tau_scon_3 = 1, 
              tau_scon_4 = 2,
              tau_both_1 =-2, 
              tau_both_2 =-1, 
              tau_both_3 = 1, 
              tau_both_4 = 2,
              tau_seat_1 =-2, 
              tau_seat_2 =-1, 
              tau_seat_3 = 1, 
              tau_seat_4 = 2,
              tau_offpeak_1 =-2, 
              tau_offpeak_2 =-1,
              tau_offpeak_3 = 1, 
              tau_offpeak_4 = 2,
              tau_alt_1 =-2, 
              tau_alt_2 =-1,
              tau_alt_3 = 1, 
              tau_alt_4 = 2,
              gamma_LV1_female  = 0, 
              gamma_LV1_wave2  = 0,
              gamma_LV1_ampeak  = 0, 
              gamma_LV1_fulltime  = 0,
              gamma_LV1_incomelow  = 0,
              gamma_LV1_age65O  = 0,
              gamma_LV1_edbach = 0,
              gamma_LV1_ptno = 0,
              gamma_LV1_kids = 0,
              gamma_LV1_car = 0,
              
              piCons  = 1, 
              piLV1   = 1)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c()

# ################################################################# #
#### DEFINE RANDOM COMPONENTS                                    ####
# ################################################################# #

### Set parameters for generating draws
apollo_draws = list(
  interDrawsType="halton", 
  interNDraws=100,          
  interUnifDraws=c(),      
  interNormDraws=c("eta1"), 
  
  intraDrawsType='',
  intraNDraws=0,          
  intraUnifDraws=c(),     
  intraNormDraws=c()      
)

### Create random parameters
apollo_randCoeff=function(apollo_beta, apollo_inputs){
  randcoeff = list()
  
  randcoeff[["LV1"]] = gamma_LV1_female * FEMALE + gamma_LV1_wave2 * WAVE2 + gamma_LV1_ampeak * TIME_BC_AM +
    gamma_LV1_fulltime * EMPLOY_FULL + gamma_LV1_incomelow * INCOME_LOW + gamma_LV1_age65O * AGE_65O + 
    gamma_LV1_edbach * EDUCATION_BACH + gamma_LV1_kids * HOUSEHOLD_CHILD_CLEAN_N + gamma_LV1_car * CAR_B +
    gamma_LV1_ptno * PT_C_NO + 
    eta1

  return(randcoeff)
}


# ################################################################# #
#### DEFINE LATENT CLASS COMPONENTS                              ####
# ################################################################# #

apollo_lcPars=function(apollo_beta, apollo_inputs){
  lcpars = list()

  ### Class allocation probabilities
  ### These are the probabilities of a binary logit model
  ### apollo_mnl could be used too (with functionality="raw" 
  ### and choice=NA), but explicitly writing the probability 
  ### is easier.
  VA  = piCons + piLV1*LV1
  VB  = 0
  piA = exp(VA)/(exp(VA) + exp(VB))
  piB = 1 - piA
  lcpars[["pi_values"]] = apollo_firstRow(list(piA, piB), apollo_inputs)
  
  return(lcpars)
}


# ################################################################# #
#### GROUP AND VALIDATE INPUTS                                   ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){

  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))

  ### Create list of probabilities P
  P = list()

  ### Likelihood of indicators
  ol_settings1 = list(outcomeOrdered=AGREE_BOTH_CROWD_BC_R, 
                      V=zeta_both*LV1, 
                      tau=list(tau_both_1, tau_both_2, tau_both_3, tau_both_4))
  ol_settings2 = list(outcomeOrdered=AGREE_CONCERN_BC_R, 
                      V=zeta_acon*LV1, 
                      tau=list(tau_acon_1, tau_acon_2, tau_acon_3, tau_acon_4))
  ol_settings3 = list(outcomeOrdered=AGREE_SEAT_BC_R, 
                      V=zeta_seat*LV1, 
                      tau=list(tau_seat_1, tau_seat_2, tau_seat_3, tau_seat_4))
  ol_settings4 = list(outcomeOrdered=STATE_CONCERNED_R, 
                      V=zeta_scon*LV1, 
                      tau=list(tau_scon_1, tau_scon_2, tau_scon_3, tau_scon_4))
  ol_settings5 = list(outcomeOrdered=AGREE_OFFPEAK_BC_R, 
                      V=zeta_offpeak*LV1, 
                      tau=list(tau_offpeak_1, tau_offpeak_2, tau_offpeak_3, tau_offpeak_4))
  ol_settings6 = list(outcomeOrdered=AGREE_ALT_BC_R, 
                      V=zeta_alt*LV1, 
                      tau=list(tau_alt_1, tau_alt_2, tau_alt_3, tau_alt_4))

  P[["indic_both"]]     = apollo_ol(ol_settings1, functionality)
  P[["indic_acon"]]     = apollo_ol(ol_settings2, functionality)
  P[["indic_seat"]]      = apollo_ol(ol_settings3, functionality)
  P[["indic_scon"]]      = apollo_ol(ol_settings4, functionality)
  P[["indic_offpeak"]]     = apollo_ol(ol_settings5, functionality)
  P[["indic_alt"]]      = apollo_ol(ol_settings6, functionality)
  
  ### Compute latent class model probabilities
  lc_settings   = list(inClassProb=P, classProb=pi_values)
  P[["model"]] = apollo_lc(lc_settings, apollo_inputs, functionality)

  ### Likelihood of the whole model
  P = apollo_combineModels(P, apollo_inputs, functionality)

  ### Take product across observation for same individual
  #P = apollo_panelProd(P, apollo_inputs, functionality)

  ### Average across inter-individual draws
  P = apollo_avgInterDraws(P, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ################################################################# #
#### ESTIMATE SETTINGS                                           ####
# ################################################################# #

estimate_settings = list(maxIterations  = 250)

# ################################################################# #
#### MODEL ESTIMATION                                            ####
# ################################################################# #

model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs, estimate_settings)

# ################################################################# #
#### MODEL OUTPUTS                                               ####
# ################################################################# #

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO SCREEN)                               ----
# ----------------------------------------------------------------- #

apollo_modelOutput(model)

# ----------------------------------------------------------------- #
#---- FORMATTED OUTPUT (TO FILE, using model name)               ----
# ----------------------------------------------------------------- #

apollo_saveOutput(model)

Can you please advise me on how to modify the code above to achieve what I'm after? Also, is it possible to streamline the process and assign each of the respondents to a respective latent class using allocation thresholds right away?
stephanehess
Site Admin
Posts: 998
Joined: 24 Apr 2020, 16:29

Re: Latent variable estimation

Post by stephanehess »

Hi

while you can of course do this, to me it is the wrong thing to try to do. The latent variable is not deterministic, so using it to deterministically classify people means that you are ignoring the random part of the LV.
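
A small base R sketch of this point, using the binary logit class allocation from the code above (VA = piCons + piLV1*LV1) with invented values for every parameter: because LV1 contains the random term eta1, the allocation probability for a single respondent is itself a distribution over draws, which a hard 0/1 assignment would discard.

```r
set.seed(42)

piCons <- 1      # hypothetical values, equal to the starting values in the code above
piLV1  <- 1
mu     <- -0.5   # hypothetical deterministic part of LV1 for one respondent
eta1   <- rnorm(1000)   # random part of the latent variable
LV1    <- mu + eta1

VA  <- piCons + piLV1 * LV1
piA <- plogis(VA)       # same as exp(VA) / (exp(VA) + exp(0))

# Allocation probability at the deterministic part only vs. across draws
piA_det <- plogis(piCons + piLV1 * mu)
print(round(c(deterministic = piA_det, mean_over_draws = mean(piA)), 3))
print(round(quantile(piA, c(0.05, 0.95)), 3))

# Share of draws that fall on the other side of a 0.5 cut-off
share_flipped <- mean((piA > 0.5) != (piA_det > 0.5))
print(share_flipped)
```

With these made-up numbers, a sizeable share of the draws would put the respondent in the other class than the one implied by the deterministic part alone.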

In terms of the other part of your question, if you have a latent class model, you need something that varies across classes, which you don't seem to have here, as you only have a single value for each parameter.
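
The sketch below (plain R, made-up numbers, hypothetical function name lc_lik) shows why: a latent class likelihood is the class-probability-weighted sum of the within-class likelihoods, and if those are identical across classes, the weights drop out, so the class allocation parameters cannot be identified.

```r
# Latent class likelihood for one respondent with two classes:
# pi_A * L_A + (1 - pi_A) * L_B
lc_lik <- function(pi_A, L_A, L_B) pi_A * L_A + (1 - pi_A) * L_B

L_same  <- 0.12              # hypothetical within-class likelihood, same in both classes
pi_grid <- c(0.1, 0.5, 0.9)  # different class allocation probabilities

# Identical classes: the likelihood does not depend on pi_A at all
lik_identical <- sapply(pi_grid, lc_lik, L_A = L_same, L_B = L_same)

# Classes that differ (for example through class-specific parameters): pi_A now matters
lik_different <- sapply(pi_grid, lc_lik, L_A = 0.20, L_B = 0.05)

print(lik_identical)
print(lik_different)
```

In the first case any value of pi_A fits equally well, which is the situation when every parameter has a single value shared by all classes.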

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk