Latent class model (LCM) with multinomial logit
Posted: 20 May 2025, 18:52
Hi everyone,
I am working on a latent class model (LCM) in R using the Apollo package, with the objective of analyzing substitution between tobacco-related products in different countries and population subgroups. Each class follows a multinomial logit specification with the following characteristics:
- Alternatives: cigarettes, refillable e-cigarettes and disposable e-cigarettes (with alternative-specific constants).
- Attributes per alternative: price (continuous), perceived harm, concealability, taste.
- Class segmentation: the model includes a latent class function (lcpars) with sociodemographic variables such as age, income, type of activity, educational level and gender.
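With two classes, a class-allocation function of this kind reduces to a binary logit on the sociodemographic variables. The following is only a minimal sketch in base R: the covariate values and the coefficients in `gamma` are hypothetical, and nothing here calls Apollo. It also shows what happens if the logit transform is applied a second time to a value that is already a probability.

```r
# Hypothetical respondent-level covariates (illustrative values, not real data)
covariates <- data.frame(
  edad_grupo   = c(1, 2, 3),
  ingreso_cat3 = c(0, 1, 0)
)

# Hypothetical class-allocation coefficients (assumed for illustration)
gamma <- c(intercept = 0.2, edad = -0.3, ingreso_cat3 = 0.8)

# Allocation utility for class 1; class 2 is the base with utility 0
V <- gamma[["intercept"]] +
  gamma[["edad"]] * covariates$edad_grupo +
  gamma[["ingreso_cat3"]] * covariates$ingreso_cat3

# Binary logit, applied once
pi_class1 <- exp(V) / (1 + exp(V))
pi_class2 <- 1 - pi_class1

# Logit transform applied again to a probability: the result can only
# lie between 0.5 and exp(1) / (1 + exp(1)), whatever the value of V
pi_twice <- exp(pi_class1) / (exp(pi_class1) + 1)

print(cbind(pi_class1, pi_class2, pi_twice))
```

Because a probability lies between 0 and 1, the second transform squeezes every value into a narrow band above 0.5, so the allocation parameters would have much less leverage on the likelihood than intended.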
The estimation runs without errors, and the log-likelihood is updated normally, but the final output does not include standard errors, robust standard errors or t-ratios. Everything appears as NA, even for parameters that are clearly estimated, such as class attribute coefficients and class intercepts. Here is a fragment of the output:
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
WARNING: Element apollo_lcPars in the global environment differs from that inside apollo_inputs. The latter will
be used. If you wish to use the former, stop this function by pressing the "Escape" key, and rerun
apollo_validateInputs before calling this function.
WARNING: Object apollo_lcPars from the global environment should not be used inside apollo_probabilities. If
needed, save it inside apollo_inputs and call it as apollo_inputs$apollo_lcPars.
Preparing user-defined functions.
Testing likelihood function...
Apollo found a model component of type MNL without a componentName. The name was set to "class1" by
default.
Overview of choices for MNL model component class1:
ecigdisp ecigrech cig optout
Times available 4760.00 4760.00 4760.00 4760.00
Times chosen 1324.00 1499.00 840.00 1097.00
Percentage chosen overall 27.82 31.49 17.65 23.05
Percentage chosen when available 27.82 31.49 17.65 23.05
Apollo found a model component of type MNL without a componentName. The name was set to "class2" by
default.
Overview of choices for MNL model component class2:
ecigdisp ecigrech cig optout
Times available 4760.00 4760.00 4760.00 4760.00
Times chosen 1324.00 1499.00 840.00 1097.00
Percentage chosen overall 27.82 31.49 17.65 23.05
Percentage chosen when available 27.82 31.49 17.65 23.05
Pre-processing likelihood function...
Creating cluster...
Preparing workers for multithreading...
INFORMATION: Apollo was not able to compute analytical gradients for your model. This could be because you are using
model components for which analytical gradients are not yet implemented, or because you coded your
own model functions. If however you only used apollo_mnl, apollo_fmnl, apollo_normalDensity,
apollo_ol or apollo_op, then there could be another issue. You might want to ask for help in the
Apollo forum (http://www.apollochoicemodelling.com/forum) on how to solve this issue. If you do,
please post your code and data (if not confidential).
Analytical gradients could not be calculated for all components, numerical gradients will be used.
Testing influence of parameters.......................Error in apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, :
SPECIFICATION ISSUE - Parameter bclass1_ingreso_cat2 does not influence the log-likelihood of your model!
The estimation seems to converge well, but the Hessian cannot be inverted to compute the parameter variances. I suspect a collinearity or redundancy problem in the data, but I have not been able to identify it clearly. Has anyone faced this problem? Is there a way to force the calculation of the Hessian, or to diagnose the collinearity directly in Apollo?
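A generic way to look for redundancy, independent of Apollo, is to inspect the eigenvalues of the Hessian: an eigenvalue close to zero marks a direction in which the log-likelihood is flat, and the parameters with large entries in the matching eigenvector are the ones involved. The sketch below is base R only. The matrix `H` and the parameter names are made up for illustration and stand in for whatever Hessian the estimation produces.

```r
# Toy Hessian of the log-likelihood at the optimum (illustrative numbers only).
# "b_unused" does not enter the likelihood, so its row and column are zero.
par_names <- c("b_price", "b_harm", "b_unused")
H <- matrix(c(-4,  1, 0,
               1, -3, 0,
               0,  0, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(par_names, par_names))

# Eigen-decomposition of the symmetric Hessian
eig <- eigen(H, symmetric = TRUE)

# Flag eigenvalues that are tiny relative to the largest one
tol  <- 1e-8 * max(abs(eig$values))
flat <- which(abs(eig$values) < tol)

# Collect the parameters that load on each flat direction
problem_params <- unique(unlist(lapply(flat, function(k) {
  rownames(H)[abs(eig$vectors[, k]) > 0.1]
})))

cat("Parameters on flat directions:",
    paste(problem_params, collapse = ", "), "\n")
```

A single parameter on a flat direction suggests it never enters the likelihood, while several parameters sharing one flat direction point to collinearity among them. The thresholds `1e-8` and `0.1` are arbitrary choices for this sketch.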
The full code of my model is:
# #################################################################
############ FINAL RESULTS: Substitution among tobacco-related products ####
################## April 10, 2025 ##########################
# This code processes the final results of a Latent Class Model (LCM) using class-specific multinomial logit specifications.
# The goal is to uncover heterogeneous choice patterns and estimate class-specific parameters.
# Based on these estimates, informative distributions over attribute sensitivities are obtained, which can be used to redesign
# the experiment using a Bayesian D-efficient design tailored to the latent class structure.
# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS ####
# ################################################################# #
rm(list=ls())
library(apollo)
library(readr)
library(purrr)
library(writexl)
# Initialise Apollo
apollo_initialise()
# Model control settings
apollo_control = list(
modelName = paste("LCM_alt_", CASO, sep=""),
modelDescr = "Alternative LCM compatible with older Apollo",
indivID = "unique_id",
panelData = TRUE,
mixing = FALSE,
nCores = 2,
outputDirectory = "output",
nClasses = 2
)
apollo_beta = c(
# --------- Utility parameters for class 1 ---------
asc_disp_class1 = 0.473805,
asc_rech_class1 = 0.473924,
asc_cig_class1 = 4.582244,
bprice_class1 = -0.014242,
bext1_class1 = -0.674032,
bext2_class1 = -1.384394,
bext3_class1 = 0,
bharm_class1 = -0.699164,
bhide_class1 = 0.148043,
bflavour1_class1 = 0.083724,
bflavour2_class1 = 0.626005,
# --------- Utility parameters for class 2 ---------
asc_disp_class2 = -0.774436,
asc_rech_class2 = 0.026897,
asc_cig_class2 = 0.538305,
bprice_class2 = 0.006231,
bext1_class2 = 0.442261,
bext2_class2 = 0.831525,
bext3_class2 = -1.077648,
bharm_class2 = 0.141028,
bhide_class2 = -0.372712,
bflavour1_class2 = 0.213053,
bflavour2_class2 = -0.652160,
# --------- Parameters for the class membership probability ---------
bclass1_intercept = 0,
# bclass1_genero = 0,
bclass1_ingreso_cat2 = 0,
bclass1_ingreso_cat3 = 0,
bclass1_actividad_cat2 = 0,
bclass1_actividad_cat3 = 0,
bclass1_edad = 0,
bclass1_edunivel_continua = 0
)
database$anosnivel <- as.numeric(as.character(database$anosnivel))
## To keep parameters fixed at their starting values during estimation (e.g. the ASCs), include their names in the character vector apollo_fixed.
## The vector is left empty (apollo_fixed = c()) if all parameters are to be estimated. Parameters listed in apollo_fixed stay at the value given in apollo_beta, which need not be zero.
apollo_fixed = c()
apollo_inputs = apollo_validateInputs()
# ################################################################# #
#### GROUP AND VALIDATE INPUTS ####
# ################################################################# #
# (apollo_lcPars) This function computes the probability of belonging to each latent class from socioeconomic variables;
# those probabilities are then used to define class membership for each observation.
apollo_lcPars = function(apollo_beta, apollo_inputs) {
database = apollo_inputs$database
# Create dummy variables for the categories of my variables
database$ingreso_cat2 = ifelse(database$ingreso_cat == 2, 1, 0)
database$ingreso_cat3 = ifelse(database$ingreso_cat == 3, 1, 0)
database$actividad_cat2 = ifelse(database$actividad_cat == 2, 1, 0)
database$actividad_cat3 = ifelse(database$actividad_cat == 3, 1, 0)
# Utility function for class 1
V = apollo_beta["bclass1_intercept"] + apollo_beta["bclass1_edad"]* database$edad_grupo +
# apollo_beta["bclass1_genero"] * database$genero +
# apollo_beta["bclass1_ingreso_cat2"] * database$ingreso_cat2 +
apollo_beta["bclass1_ingreso_cat3"] * database$ingreso_cat3 +
apollo_beta["bclass1_actividad_cat2"] * database$actividad_cat2 +
apollo_beta["bclass1_actividad_cat3"] * database$actividad_cat3+
apollo_beta["bclass1_edunivel_continua"] * database$cat_niveledu
Pclass1 = exp(V) / (1 + exp(V))
Pclass2 = 1 - Pclass1
pi_values = list(class1 = Pclass1, class2 = Pclass2)
return(pi_values)
}
apollo_probabilities = function(apollo_beta, apollo_inputs, functionality="estimate") {
### Initialisation
apollo_attach(apollo_beta, apollo_inputs)
on.exit(apollo_detach(apollo_beta, apollo_inputs))
P = list()
### ------------------------------
### Class 1
### ------------------------------
V_class1 = list()
V_class1[["ecigdisp"]] = asc_disp_class1 + bprice_class1*price1 +
bext1_class1*(ext1==1) + bext2_class1*(ext1==2) + bext3_class1*(ext1==3) +
bharm_class1*harm1 + bhide_class1*hide1 +
bflavour1_class1*(flavour1==1) + bflavour2_class1*(flavour1==2)
V_class1[["ecigrech"]] = asc_rech_class1 + bprice_class1*price2 +
bext1_class1*(ext2==1) + bext2_class1*(ext2==2) + bext3_class1*(ext2==3) +
bharm_class1*harm2 + bhide_class1*hide2 +
bflavour1_class1*(flavour2==1) + bflavour2_class1*(flavour2==2)
V_class1[["cig"]] = asc_cig_class1 + bprice_class1*price3 +
bext2_class1*ext3 + bharm_class1*harm3 + bhide_class1*hide3 +
bflavour1_class1*(flavour3==1) + bflavour2_class1*(flavour3==2)
V_class1[["optout"]] = 0
mnl_settings_class1 = list(
alternatives = c(ecigdisp=1, ecigrech=2, cig=3, optout=4),
avail = list(ecigdisp=1, ecigrech=1, cig=1, optout=1),
choiceVar = chosen_option,
V = V_class1
)
P[["class1"]] = apollo_mnl(mnl_settings_class1, functionality)
### ------------------------------
### Class 2
### ------------------------------
V_class2 = list()
V_class2[["ecigdisp"]] = asc_disp_class2 + bprice_class2*price1 +
bext1_class2*(ext1==1) + bext2_class2*(ext1==2) + bext3_class2*(ext1==3) +
bharm_class2*harm1 + bhide_class2*hide1 +
bflavour1_class2*(flavour1==1) + bflavour2_class2*(flavour1==2)
V_class2[["ecigrech"]] = asc_rech_class2 + bprice_class2*price2 +
bext1_class2*(ext2==1) + bext2_class2*(ext2==2) + bext3_class2*(ext2==3) +
bharm_class2*harm2 + bhide_class2*hide2 +
bflavour1_class2*(flavour2==1) + bflavour2_class2*(flavour2==2)
V_class2[["cig"]] = asc_cig_class2 + bprice_class2*price3 +
bext2_class2*ext3 + bharm_class2*harm3 + bhide_class2*hide3 +
bflavour1_class2*(flavour3==1) + bflavour2_class2*(flavour3==2)
V_class2[["optout"]] = 0
mnl_settings_class2 = list(
alternatives = c(ecigdisp=1, ecigrech=2, cig=3, optout=4),
avail = list(ecigdisp=1, ecigrech=1, cig=1, optout=1),
choiceVar = chosen_option,
V = V_class2
)
P[["class2"]] = apollo_mnl(mnl_settings_class2, functionality)
### ------------------------------
### Combine probabilities according to latent class
### ------------------------------
lc_settings = list(inClassProb = apollo_inputs$apollo_lcPars, classSpecificVals = P)
# Compute latent class probabilities
pi_values = apollo_lcPars(apollo_beta, apollo_inputs)
# Softmax transformation to obtain class probabilities
pi_class1 = exp(pi_values$class1) / (exp(pi_values$class1) + exp(0))
pi_class2 = 1 - pi_class1
P = list(
model = pi_class1 * P[["class1"]] + pi_class2 * P[["class2"]]
)
P = apollo_panelProd(P, apollo_inputs, functionality)
P = apollo_prepareProb(P, apollo_inputs, functionality)
return(P)
}
# ################################################################# #
#### MODEL ESTIMATION ####
# ################################################################# #
model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
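A message saying that a parameter does not influence the log-likelihood can arise when the parameter is declared in the starting values but never appears in the executed code, for instance because its term is commented out. The sketch below is one rough way to check for that in base R. It does not use Apollo, and `beta_start`, `alloc_utility` and `find_unused` are hypothetical names invented for this illustration.

```r
# Hypothetical starting values: b_income2 is declared here ...
beta_start <- c(b_intercept = 0, b_income2 = 0, b_income3 = 0)

# ... but its term is commented out in this hypothetical utility function
alloc_utility <- function(beta, data) {
  beta["b_intercept"] +
    # beta["b_income2"] * data$income2 +
    beta["b_income3"] * data$income3
}

# Return the names in `beta` that never appear in the bodies of the given
# functions. deparse(body(f)) rebuilds the code from the parsed expression,
# so commented-out lines are not part of the text that is searched.
find_unused <- function(beta, ...) {
  src <- paste(unlist(lapply(list(...), function(f) deparse(body(f)))),
               collapse = "\n")
  used <- vapply(names(beta),
                 function(nm) grepl(paste0("\\b", nm, "\\b"), src, perl = TRUE),
                 logical(1))
  names(beta)[!used]
}

unused <- find_unused(beta_start, alloc_utility)
print(unused)
```

This is a text search and not a proof of identification: a parameter that appears in the code can still have no effect, for example when it multiplies a variable that is always zero.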
I would be very grateful for any guidance or suggestions on how to interpret the results correctly or adjust the model.
Regards,