Errors convergence latent class model

Ask questions about errors you encounter. Please make sure to include full details about your model specifications, and ideally your model file.
LotteMuller
Posts: 4
Joined: 26 Sep 2024, 13:57

Errors convergence latent class model

Post by LotteMuller »

Hi Apollo forum,

I have a question about running a latent class model on choice data for two treatment groups (each with ~300 respondents). One treatment group converges successfully, but the other does not. The only difference between the groups is that one treatment group sees an additional attribute; otherwise, everything (cards, blocks, levels) is identical.

Before running a 3-class LCM, I estimated 2-class LCM, MNL and MMNL models separately for each treatment group. The results were quite similar across both treatment groups, suggesting little difference between them. However, when moving to the 3-class LCM, the treatment group with one additional attribute fails to converge, giving errors about saddle points, singular Hessians, and singular convergence (parameters tending to +/- infinity).
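
One way estimates drifting towards +/- infinity can arise in a latent class model is a class whose share shrinks towards zero. The sketch below is an illustration only, not a diagnosis of this model. It assumes class allocation depends on constants alone (as in the code further down) and uses a hypothetical helper, class_shares, to show how a very negative delta empties a class.

```r
# Hypothetical helper: class shares implied by class allocation constants
# (a softmax over the deltas, with the base class fixed at 0).
class_shares <- function(delta) {
  e <- exp(delta - max(delta))  # subtract the maximum for numerical stability
  e / sum(e)
}

# Shares implied by the starting values used in the code below
shares_start <- class_shares(c(class_a = 1, class_b = 0.4, class_c = 0))
print(round(shares_start, 3))

# Made-up example of a collapsing class: delta_b has drifted to -15
shares_collapsed <- class_shares(c(class_a = 1, class_b = -15, class_c = 0))
print(signif(shares_collapsed, 3))
```

If one estimated share is close to zero, the coefficients of that class are informed by very few respondents, which may be worth checking before looking elsewhere.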

So far I have tried different initial values, including those from the other treatment group, MMNL estimates, apollo_searchStart, and looping through 500 random initial values within a plausible range. I found one lucky set of initial values that led to convergence, but the results seem strange (the cost parameter is now significantly positive, and another parameter that was never significant before now is), so I struggle to trust them. I have not managed to get this LCM to converge with any other initial values. Besides this, I also checked for issues in the data (differences in choices, descriptives, and cleaning steps) but found no obvious problems.
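
A random-start loop of the kind described above could look roughly like the sketch below. It is not the code that was actually run: draw_starts, the spread of 0.5 and the three-parameter example vector are assumptions for illustration. Each row would then be passed as apollo_beta to apollo_estimate, keeping the run with the best final log-likelihood.

```r
# Hypothetical helper: draw n random starting vectors around a base vector,
# leaving fixed parameters at their base value.
draw_starts <- function(beta, fixed = character(0), n = 10, spread = 0.5, seed = 1) {
  set.seed(seed)
  free <- setdiff(names(beta), fixed)
  # One row per candidate, one column per parameter
  starts <- matrix(rep(beta, each = n), nrow = n,
                   dimnames = list(NULL, names(beta)))
  # Uniform perturbation of the free parameters only
  starts[, free] <- starts[, free] + runif(n * length(free), -spread, spread)
  starts
}

# Shortened example vector (a subset of the parameters in the model below)
beta   <- c(b_costI_a = -0.7, b_costI_b = -0.3, delta_c = 0)
starts <- draw_starts(beta, fixed = "delta_c", n = 5)
print(starts)
```

Restricting the range per parameter (for example, only negative draws for the cost coefficients) would be a natural refinement.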

Given that the 2-class LCM/MNL/MMNL results are so different from this 3-class LCM, I’m wondering:
Is my model over-specified? The difference would only be 3 additional parameters to estimate.
Are there specific tests you would recommend to find and fix the issue?
Have I overlooked a typo in the code?
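
For reference, the number of free parameters can be counted directly from the specification below: seven class-specific coefficients (including asc_sq) in each of three classes, plus two free class allocation constants. A minimal sketch, with n_free as a hypothetical helper and the respondent count taken as the approximate 300 mentioned above:

```r
# Hypothetical helper: free parameters in a latent class model where every
# coefficient varies by class and class allocation uses constants only.
n_free <- function(n_coef, n_classes) {
  n_coef * n_classes + (n_classes - 1)
}

k_with_daymean    <- n_free(7, 3)  # treatment group with the extra attribute
k_without_daymean <- n_free(6, 3)  # same specification without b_daymean

print(c(with = k_with_daymean, without = k_without_daymean,
        difference = k_with_daymean - k_without_daymean))

# Rough number of respondents per free parameter (about 300 per group)
print(300 / k_with_daymean)
```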

Thanks for taking the time to read this question!

Code: Select all

# ################################################################# #
#### LOAD LIBRARY AND DEFINE CORE SETTINGS                       ####
# ################################################################# #

### Clear memory
rm(list = ls())

### Load libraries
library(apollo)
library(dplyr)

# Set working directory and read data
# Function to set working directory based on operating system
set_working_directory <- function() {
  os_info <- Sys.info()[["sysname"]]
  
  if (os_info == "Windows") {
    setwd("C:/Users/lotte/surfdrive/WORK/PhD/3-GhanaCE/Ghana-CE-Analysis")
  } else if (os_info == "Darwin") {
    setwd("~/surfdrive/WORK/PhD/3-GhanaCE/Ghana-CE-Analysis") # macOS
  } else if (os_info == "Linux") {
    setwd("~/path/to/your/directory")
  } else {
    stop("Unsupported operating system")
  }
  cat("Working directory set to:", getwd(), "\n")
}


# Set the working directory based on the OS
set_working_directory()
### Initialise code
apollo_initialise()

### Set core controls
apollo_control = list(
  modelName       = "LC_T2-4_3_class_all",
  modelDescr      = "LC model on T2-4 ghana choice data, 3 classes, all coeffs vary",
  indivID         = "ID",
  nCores          =  6, 
  debug = TRUE,
  workInLogs=TRUE,
  outputDirectory = "output"
)

# ################################################################# #
#### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
# ################################################################# #

### Loading data:
database = read.csv("choice_data_2indicator_wide.csv", header=TRUE)

database <- database %>%
  rename(ID = id) %>%
  arrange(ID)



### Rescaling variables and testing for missing values:

varlist <- c("rainMean_1", "rainMean_2", "rainMean_3", "rainVariance_1", "rainVariance_2", "rainVariance_3")
# Use mutate with across to divide values by 10
database <- database %>%
  mutate(across(all_of(varlist), ~ . / 10))
# interpretation:
# cm rainfall instead of mm rainfall.


varlist <- c("costI_1", "costI_2", "costI_3")
# Use mutate with across to divide values by 100
database <- database %>%
  mutate(across(all_of(varlist), ~ . / 100))
# interpretation:
# costs in terms of 100 Ghanaian cedi.


# Define the variables to check
vars_to_check <- c(
  paste0("rainMean_", 1:3),
  paste0("rainVariance_", 1:3),
  paste0("dayMean_", 1:3),
  paste0("costI_", 1:3),
  paste0("fracI_", 1:3),
  paste0("payback_", 1:3)
)

# Ensure the variables exist in the dataframe
vars_to_check <- vars_to_check[vars_to_check %in% colnames(database)]

# Check for infinite values in the specified columns
if (length(vars_to_check) > 0) {
  infinite_values <- sapply(database[, vars_to_check, drop = FALSE], function(col) any(is.infinite(col)))
  
  # Show which variables contain infinite values
  cat("Variables with infinite values:\n")
  print(names(infinite_values[infinite_values]))
} else {
  cat("No matching variables found in the dataframe.\n")
}


# Rename columns
database <- database %>%
  rename(
    variance_1 = rainVariance_1,
    variance_2 = rainVariance_2,
    variance_3 = rainVariance_3
  )





# ################################################################# #
#### DEFINE MODEL PARAMETERS                                     ####
# ################################################################# #

### Vector of parameters, including any that are kept fixed in estimation
apollo_beta = c(    asc_sq_a         = -1,
                    asc_sq_b         = 0.8,
                    asc_sq_c         = 0,
                    
                    b_rainmean_a     = -0.006,
                    b_rainmean_b     = -0.02,
                    b_rainmean_c     = 0,
                    
                    b_variance_a     = -0.007,
                    b_variance_b     = 0,
                    b_variance_c     = 0,
                    
                    b_daymean_a      = 0.1,
                    b_daymean_b      = 0.05,
                    b_daymean_c      = 0,
                    
                    b_fracI_a        = 0.04,
                    b_fracI_b        = 0.006,
                    b_fracI_c        = 0,
                    
                    b_costI_a        = -0.7,
                    b_costI_b        = -0.3,
                    b_costI_c        = -0.1,
                    
                    b_payback_a      = 0.5,
                    b_payback_b      = -0.3,
                    b_payback_c      = 0,
                    
                    delta_a          = 1,
                    delta_b          = 0.4,
                    delta_c          = 0
)

### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
apollo_fixed = c("delta_c")

# ################################################################# #
#### DEFINE LATENT CLASS COMPONENTS                              ####
# ################################################################# #

apollo_lcPars=function(apollo_beta, apollo_inputs){
  lcpars = list()
  
  lcpars[["asc_sq"]]      = list(asc_sq_a, asc_sq_b, asc_sq_c)
  lcpars[["b_rainmean"]]  = list(b_rainmean_a, b_rainmean_b, b_rainmean_c)
  lcpars[["b_variance"]]  = list(b_variance_a, b_variance_b, b_variance_c)
  lcpars[["b_daymean"]]   = list(b_daymean_a, b_daymean_b, b_daymean_c)
  lcpars[["b_fracI"]]     = list(b_fracI_a, b_fracI_b, b_fracI_c)
  lcpars[["b_costI"]]     = list(b_costI_a, b_costI_b, b_costI_c)
  lcpars[["b_payback"]]   = list(b_payback_a, b_payback_b, b_payback_c)
  
  
  V=list()
  
  V[["class_a"]] = delta_a
  V[["class_b"]] = delta_b
  V[["class_c"]] = delta_c
  
  classAlloc_settings = list(
    alternatives = c(class_a=1, class_b=2, class_c=3), 
    V            = V
  )
  
  lcpars[["pi_values"]] = apollo_classAlloc(classAlloc_settings)
  
  return(lcpars)
}

# ################################################################# #
#### GROUP AND VALIDATE INPUTS                                   ####
# ################################################################# #

apollo_inputs = apollo_validateInputs()

# ################################################################# #
#### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
# ################################################################# #

apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
  
  ### Attach inputs and detach after function exit
  apollo_attach(apollo_beta, apollo_inputs)
  on.exit(apollo_detach(apollo_beta, apollo_inputs))
  
  ### Create list of probabilities P
  P = list()
  
  ### Define settings for MNL model component that are generic across classes
  mnl_settings = list(
    alternatives = c(option1=1, option2=2, sq=3),
    avail        = list(option1=1, option2=1, sq=1),
    choiceVar    = choice
  )
  
  ### Loop over classes
  for(s in 1:3){
    
    
    ### Compute class-specific utilities
    V = list()
    V[["option1"]]  = b_rainmean[[s]] * rainMean_1 + 
      b_variance[[s]] * variance_1 + 
      b_daymean[[s]] * dayMean_1 + 
      b_fracI[[s]] *fracI_1 + 
      b_costI[[s]] * costI_1 + 
      b_payback[[s]] * payback_1 
    
    
    V[["option2"]]  = b_rainmean[[s]] * rainMean_2 + 
      b_variance[[s]] * variance_2 + 
      b_daymean[[s]] * dayMean_2 + 
      b_fracI[[s]] *fracI_2 + 
      b_costI[[s]] * costI_2 + 
      b_payback[[s]] * payback_2 
    
    V[["sq"]]       = asc_sq[[s]]   
    
    
    mnl_settings$utilities = V
    mnl_settings$componentName = paste0("Class_",s)
    
    ### Compute within-class choice probabilities using MNL model
    P[[paste0("Class_",s)]] = apollo_mnl(mnl_settings, functionality)
    
    ### Take product across observation for same individual
    P[[paste0("Class_",s)]] = apollo_panelProd(P[[paste0("Class_",s)]], apollo_inputs ,functionality)
  }
  
  ### Compute latent class model probabilities
  lc_settings   = list(inClassProb = P, classProb=pi_values)
  P[["model"]] = apollo_lc(lc_settings, apollo_inputs, functionality)
  
  ### Prepare and return outputs of function
  P = apollo_prepareProb(P, apollo_inputs, functionality)
  return(P)
}

# ################################################################# #
#### MODEL ESTIMATION                                            ####
# ################################################################# #

model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)


### Optional starting values search
apollo_beta=apollo_searchStart(apollo_beta, apollo_fixed,apollo_probabilities, apollo_inputs)

stephanehess
Site Admin
Posts: 1351
Joined: 24 Apr 2020, 16:29

Re: Errors convergence latent class model

Post by stephanehess »

Hi

could you show us the full output, please?

Thanks
--------------------------------
Stephane Hess
www.stephanehess.me.uk
LotteMuller
Posts: 4
Joined: 26 Sep 2024, 13:57

Re: Errors convergence latent class model

Post by LotteMuller »

Hi Stephane,

Thanks for your help. The output is below.
To clarify, in treatment two we show respondents one additional piece of information (number of dry days) compared to treatment one, hence the additional attribute.

Treatment group one [without attribute daymean] LCM with class membership:
Here I did not struggle much to get the model to converge from the initial values. The coefficients are also within the ranges of the MMNL estimates and of the LCM without class membership covariates.
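
As a sanity check, the fit statistics reported at the end of the output below can be reproduced by hand. This is a minimal sketch using the values printed in that output. It assumes that BIC uses the number of modelled outcomes and that the parameter count is the 33 entries in apollo_beta minus the 5 in apollo_fixed.

```r
ll_final <- -1124.9504  # "Final LL" in the output
ll_0     <- -1924.77    # LL(0), whole model at equal shares
k        <- 33 - 5      # entries in apollo_beta minus entries in apollo_fixed
n_obs    <- 1752        # number of modelled outcomes

aic      <- -2 * ll_final + 2 * k
bic      <- -2 * ll_final + k * log(n_obs)
rho2     <- 1 - ll_final / ll_0
adj_rho2 <- 1 - (ll_final - k) / ll_0

print(round(c(AIC = aic, BIC = bic, rho2 = rho2, adj_rho2 = adj_rho2), 4))
```

Under these assumptions the results agree with the AIC, BIC and rho-squared values in the output, which confirms that 28 parameters were estimated.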

Code: Select all

> # ################################################################# #
> #### LOAD LIBRARY AND DEFINE CORE SETTINGS                       ####
> # ################################################################# #
> 
> ### Clear memory
> rm(list = ls())
> 
> ### Load libraries
> library(apollo)
> library(dplyr)
> library(corrplot)
> 
> # Set working directory and read data
> # Function to set working directory based on operating system
> set_working_directory <- function() {
+   os_info <- Sys.info()[["sysname"]]
+   
+   if (os_info == "Windows") {
+     setwd("C:/Users/lotte/surfdrive/WORK/PhD/3-GhanaCE/Ghana-CE-Analysis")
+   } else if (os_info == "Darwin") {
+     setwd("~/surfdrive/WORK/PhD/3-GhanaCE/Ghana-CE-Analysis") # macOS
+   } else if (os_info == "Linux") {
+     setwd("~/path/to/your/directory")
+   } else {
+     stop("Unsupported operating system")
+   }
+   cat("Working directory set to:", getwd(), "\n")
+ }
> 
> 
> # Set the working directory based on the OS
> set_working_directory()
Working directory set to: C:/Users/lotte/surfdrive/WORK/PhD/3-GhanaCE/Ghana-CE-Analysis 
> 
> ### Initialise code
> apollo_initialise()
Apollo ignition sequence completed
> 
> ### Set core controls
> apollo_control = list(
+   modelName       = "LC_T1-3_3_class_all_Demog-CCatt-ANA",
+   modelDescr      = "LC model on treatment Ghana choice data, 3 classes, all coeffs vary, including demog-CCatt-ANA attitudes",
+   indivID         = "ID",
+   nCores          =  5, 
+   outputDirectory = "output"
+ )
> 
> 
> # ################################################################# #
> #### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
> # ################################################################# #
> 
> ### Loading data:
> database = read.csv("choice_data_1indicator_wide.csv", header=TRUE)
> 
> database <- database %>%
+   rename(ID = id) %>%
+   arrange(ID)
> 
> 
> 
> ### Rescaling variables:
> varlist <- c("rainMean_1", "rainMean_2", "rainMean_3", "variance_1", "variance_2", "variance_3")
> # Use mutate with across to divide values by 10
> database <- database %>%
+   mutate(across(all_of(varlist), ~ . / 10))
> # interpretation:
> # cm rainfall instead of mm rainfall.
> 
> 
> varlist <- c("costI_1", "costI_2", "costI_3")
> # Use mutate with across to divide values by 100
> database <- database %>%
+   mutate(across(all_of(varlist), ~ . / 100))
> # interpretation:
> # costs in terms of 100 Ghanaian cedi.
> 
> 
> 
> # Define the variables to check
> vars_to_check <- c(
+   paste0("rainMean_", 1:3),
+   paste0("variance_", 1:3),
+   paste0("costsI_", 1:3),
+   paste0("fracI_", 1:3),
+   paste0("payback_", 1:3)
+ )
> 
> # Ensure the variables exist in the dataframe
> vars_to_check <- vars_to_check[vars_to_check %in% colnames(database)]
> 
> # Check for infinite values in the specified columns
> if (length(vars_to_check) > 0) {
+   infinite_values <- sapply(database[, vars_to_check, drop = FALSE], function(col) any(is.infinite(col)))
+   
+   # Show which variables contain infinite values
+   cat("Variables with infinite values:\n")
+   print(names(infinite_values[infinite_values]))
+ } else {
+   cat("No matching variables found in the dataframe.\n")
+ }
Variables with infinite values:
character(0)
> 
> 
> 
> # Find IDs with missing values in specified columns
> ids_with_na <- unique(database$ID[
+   is.na(database$causes_likert) | 
+     is.na(database$predict_likert) | 
+     is.na(database$reality_likert) |
+     is.na(database$valence_likert) | 
+     is.na(database$spat_dist_likert) | 
+     is.na(database$area_weather_likert) | 
+     is.na(database$temp_dist_likert)
+ ])
> 
> # Remove all rows with those IDs
> database <- database[!database$ID %in% ids_with_na, ]
> print(length(unique(database$ID)))
[1] 292
> 
> 
> database$attribute_1_rainfall <- ifelse(database$attribute_1 %in% c(1), 1, 0)
> database$attribute_1_drydays  <- ifelse(database$attribute_1 %in% c(2), 1, 0)
> database$attribute_1_fracI    <- ifelse(database$attribute_1 %in% c(3), 1, 0)
> database$attribute_1_costI    <- ifelse(database$attribute_1 %in% c(4), 1, 0)
> database$attribute_1_payback  <- ifelse(database$attribute_1 %in% c(5), 1, 0)
> database$attribute_1_other    <- ifelse(database$attribute_1 %in% c(88), 1, 0)
> database$attribute_1_none     <- ifelse(database$attribute_1 %in% c(99), 1, 0)
> table(database$attribute_ignore_reason.4)
  0   1 
378 138 
> 
> table(database$education)
  1   2   3   4   5   6   8   9 
270  42 294 912 186  18  24   6 
> database$education <- database$education - 1
> database$educ_below <- ifelse(database$education %in% c(0, 1, 2), 1, 0)
> database$educ_median <- ifelse(database$education %in% c(3), 1, 0)
> database$educ_above <- ifelse(database$education %in% c(4, 5, 6, 7, 8), 1, 0)
> 
> 
> # Remove variables with missing values:
> database <- database[, colSums(is.na(database)) == 0]
> 
> # ################################################################# #
> #### DEFINE MODEL PARAMETERS                                     ####
> # ################################################################# #
> 
> ### Vector of parameters, including any that are kept fixed in estimation
> apollo_beta = c(asc_sq_a         = 0,
+                 asc_sq_b         = 0,
+                 asc_sq_c         = 0,
+                 
+                 b_rainmean_a     = 0.007,
+                 b_rainmean_b     = -0.02,
+                 b_rainmean_c     = -0.07,
+                 
+                 b_variance_a     = -0.009,
+                 b_variance_b     = 0.01,
+                 b_variance_c     = 0.02,
+                 
+                 b_fracI_a        = 0.03,
+                 b_fracI_b        = 0.2,
+                 b_fracI_c        = 0.01,
+                 
+                 b_costI_a        = -0.8,
+                 b_costI_b        = -0.5,
+                 b_costI_c        = -0.7,
+                 
+                 b_payback_a      = 0.4,
+                 b_payback_b      = 0.3,
+                 b_payback_c      = 0.2,
+                 
+                 delta_a          = 0,
+                 delta_b          = 0,
+                 delta_c          = 0,
+                 
+                 gamma_educ_b_a   = 0,
+                 gamma_educ_b_b   = 0,
+                 gamma_educ_b_c   = 0,
+ 
+                 gamma_reality_a        = -0.9,
+                 gamma_reality_b        = 0,
+                 gamma_reality_c        = 0,
+ 
+                 gamma_predict_a        = 0.5,
+                 gamma_predict_b        = 0,
+                 gamma_predict_c        = 0,
+                 
+                 gamma_ANAcosts_a       = 0,
+                 gamma_ANAcosts_b       = 0,
+                 gamma_ANAcosts_c       = 0
+                 )
> 
> 
> ### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
> apollo_fixed = c("delta_c",
+                  "gamma_educ_b_c", 
+                  "gamma_reality_c",
+                  "gamma_predict_c",
+                  "gamma_ANAcosts_c"
+ )
> 
> 
> # ################################################################# #
> #### DEFINE LATENT CLASS COMPONENTS                              ####
> # ################################################################# #
> 
> apollo_lcPars=function(apollo_beta, apollo_inputs){
+   lcpars = list()
+   
+   lcpars[["b_rainmean"]]  = list(b_rainmean_a, b_rainmean_b, b_rainmean_c)
+   lcpars[["asc_sq"]]      = list(asc_sq_a, asc_sq_b, asc_sq_c)
+   lcpars[["b_variance"]]  = list(b_variance_a, b_variance_b, b_variance_c)
+   lcpars[["b_fracI"]]     = list(b_fracI_a, b_fracI_b, b_fracI_c)
+   lcpars[["b_costI"]]     = list(b_costI_a, b_costI_b, b_costI_c)
+   lcpars[["b_payback"]]   = list(b_payback_a, b_payback_b, b_payback_c)
+   
+   
+   V=list()
+   
+   V[["class_a"]] = delta_a +
+     gamma_educ_b_a   * educ_below +
+     gamma_reality_a*reality_likert +
+     gamma_predict_a*predict_likert +
+     gamma_ANAcosts_a*attribute_ignore.4
+ 
+   
+   V[["class_b"]] = delta_b +
+     gamma_educ_b_b   * educ_below +
+     gamma_reality_b * reality_likert +
+     gamma_predict_b * predict_likert +
+     gamma_ANAcosts_b*attribute_ignore.4
+ 
+   
+   V[["class_c"]] = delta_c + 
+     gamma_educ_b_c   * educ_below +
+     gamma_reality_c*reality_likert +
+     gamma_predict_c*predict_likert +
+     gamma_ANAcosts_c*attribute_ignore.4
+ 
+   
+   classAlloc_settings = list(
+     alternatives = c(class_a=1, class_b=2, class_c=3), 
+     V            = V
+   )
+   
+   lcpars[["pi_values"]] = apollo_classAlloc(classAlloc_settings)
+   
+   return(lcpars)
+ }
> 
> # ################################################################# #
> #### GROUP AND VALIDATE INPUTS                                   ####
> # ################################################################# #
> 
> apollo_inputs = apollo_validateInputs()
Several observations per individual detected based on the value of ID. Setting panelData in apollo_control set to TRUE.
All checks on apollo_control completed.
All checks on database completed.
> 
> # ################################################################# #
> #### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
> # ################################################################# #
> 
> 
> apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
+   
+   ### Attach inputs and detach after function exit
+   apollo_attach(apollo_beta, apollo_inputs)
+   on.exit(apollo_detach(apollo_beta, apollo_inputs))
+   
+   ### Create list of probabilities P
+   P = list()
+   
+   ### Define settings for MNL model component that are generic across classes
+   mnl_settings = list(
+     alternatives = c(option1=1, option2=2, sq=3),
+     avail        = list(option1=1, option2=1, sq=1),
+     choiceVar    = choice
+   )
+   
+   
+   ### Loop over classes
+   for(s in 1:3){
+     
+     ### Compute class-specific utilities
+     V = list()
+     V[["option1"]]  = b_rainmean[[s]] * rainMean_1 + 
+       b_variance[[s]] * variance_1 + 
+       b_fracI[[s]] *fracI_1 + 
+       b_costI[[s]] * costI_1 + 
+       b_payback[[s]] * payback_1 
+     
+     
+     V[["option2"]]  = b_rainmean[[s]] * rainMean_2 + 
+       b_variance[[s]] * variance_2 + 
+       b_fracI[[s]] *fracI_2 + 
+       b_costI[[s]] * costI_2 + 
+       b_payback[[s]] * payback_2 
+     
+     
+     V[["sq"]]       = asc_sq[[s]]     
+     
+     
+     mnl_settings$utilities = V
+     mnl_settings$componentName = paste0("Class_",s)
+     
+     ### Compute within-class choice probabilities using MNL model
+     P[[paste0("Class_",s)]] = apollo_mnl(mnl_settings, functionality)
+     
+     ### Take product across observation for same individual
+     P[[paste0("Class_",s)]] = apollo_panelProd(P[[paste0("Class_",s)]], apollo_inputs ,functionality)
+   }
+   
+   ### Compute latent class model probabilities
+   lc_settings   = list(inClassProb = P, classProb=pi_values)
+   P[["model"]] = apollo_lc(lc_settings, apollo_inputs, functionality)
+   
+   ### Prepare and return outputs of function
+   P = apollo_prepareProb(P, apollo_inputs, functionality)
+   return(P)
+ }
> 
> # ################################################################# #
> #### MODEL ESTIMATION                                            ####
> # ################################################################# #
> 
> model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
Preparing user-defined functions.

Testing likelihood function...
Apollo found a model component of type classAlloc without a componentName. The name was set to "classAlloc" by default.
INFORMATION: Setting "avail" is missing, so full availability is assumed. 

Overview of choices for MNL model component Class_1:
                                 option1 option2      sq
Times available                  1752.00 1752.00 1752.00
Times chosen                      707.00  822.00  223.00
Percentage chosen overall          40.35   46.92   12.73
Percentage chosen when available   40.35   46.92   12.73


Overview of choices for MNL model component Class_2:
                                 option1 option2      sq
Times available                  1752.00 1752.00 1752.00
Times chosen                      707.00  822.00  223.00
Percentage chosen overall          40.35   46.92   12.73
Percentage chosen when available   40.35   46.92   12.73


Overview of choices for MNL model component Class_3:
                                 option1 option2      sq
Times available                  1752.00 1752.00 1752.00
Times chosen                      707.00  822.00  223.00
Percentage chosen overall          40.35   46.92   12.73
Percentage chosen when available   40.35   46.92   12.73


Summary of class allocation for model component :
         Mean prob.
Class_1      0.3712
Class_2      0.3144
Class_3      0.3144
The class allocation probabilities for model component "model" are calculated at the observation level in 'apollo_lcPars', but are used in
  'apollo_probabilities' to multiply within class probabilities that are at the individual level. Apollo will average the class allocation
  probabilities across observations for the same individual level before using them to multiply the within-class probabilities. If your class
  allocation probabilities are constant across choice situations for the same individual, then this is of no concern. If your class allocation
  probabilities however vary across choice tasks, then you should change your model specification in 'apollo_probabilities' to only call
  'apollo_panelProd' after calling 'apollo_lc'.

Pre-processing likelihood function...
Creating cluster...
Preparing workers for multithreading...

Testing influence of parameters
Starting main estimation

BGW using analytic model derivatives supplied by caller...


Iterates will be written to: 
 output/LC_T1-3_3_class_all_Demog-CCatt-ANA_iterations.csv
    it    nf     F            RELDF    PRELDF    RELDX    MODEL stppar
     0     1 1.252453379e+03
     1     4 1.166509803e+03 6.862e-02 5.573e-02 1.42e-01   G   1.37e-01
     2     5 1.136445346e+03 2.577e-02 2.455e-02 1.28e-01   G   8.57e-02
     3     6 1.127176718e+03 8.156e-03 7.620e-03 1.22e-01   S   1.31e-02
     4     7 1.125511688e+03 1.477e-03 1.571e-03 2.31e-01   S   4.48e-03
     5     8 1.125218260e+03 2.607e-04 5.696e-04 2.92e-01   S   1.79e-03
     6    10 1.125017150e+03 1.787e-04 3.212e-04 9.71e-02   S   5.34e-03
     7    11 1.124976681e+03 3.597e-05 6.245e-05 5.18e-02   S   0.00e+00
     8    12 1.124959508e+03 1.526e-05 1.899e-05 3.32e-02   S   0.00e+00
     9    13 1.124954407e+03 4.535e-06 4.802e-06 8.15e-03   S   0.00e+00
    10    14 1.124952201e+03 1.960e-06 1.425e-06 8.86e-03   S   0.00e+00
    11    16 1.124950882e+03 1.173e-06 9.279e-07 1.43e-03  G-S  0.00e+00
    12    17 1.124950535e+03 3.085e-07 2.527e-07 7.60e-04   S   0.00e+00
    13    18 1.124950458e+03 6.839e-08 5.810e-08 2.01e-03   S   0.00e+00
    14    19 1.124950442e+03 1.391e-08 1.586e-08 7.61e-04   S   0.00e+00
    15    20 1.124950437e+03 4.991e-09 4.611e-09 2.18e-04   S   0.00e+00
    16    22 1.124950436e+03 7.718e-10 8.769e-10 1.83e-04  G-S  0.00e+00
    17    23 1.124950436e+03 1.428e-10 1.579e-10 1.60e-04   S   0.00e+00
    18    24 1.124950436e+03 4.093e-11 4.258e-11 3.32e-05   S   0.00e+00

***** Relative function convergence *****

Estimated parameters with approximate standard errors from BHHH matrix:
                    Estimate     BHHH se BHH t-ratio (0)
asc_sq_a           -0.828714    0.517915        -1.60010
asc_sq_b            1.799647    3.336250         0.53942
asc_sq_c            5.807879   11.915533         0.48742
b_rainmean_a        0.004210    0.005862         0.71822
b_rainmean_b      6.1869e-04    0.038504         0.01607
b_rainmean_c       -0.012996    0.145363        -0.08940
b_variance_a       -0.005039    0.010864        -0.46381
b_variance_b       -0.024367    0.039298        -0.62007
b_variance_c        0.047588    0.451244         0.10546
b_fracI_a           0.022335    0.002243         9.95615
b_fracI_b           0.158355    0.040001         3.95875
b_fracI_c           0.032552    0.078088         0.41686
b_costI_a          -0.774423    0.141074        -5.48946
b_costI_b          -2.499484    0.859111        -2.90939
b_costI_c          -1.121240    2.156101        -0.52003
b_payback_a         0.419895    0.035194        11.93076
b_payback_b         0.234768    0.137467         1.70781
b_payback_c         0.435231    0.365201         1.19176
delta_a             2.800874    0.780981         3.58635
delta_b             2.191453    1.176675         1.86241
delta_c             0.000000          NA              NA
gamma_educ_b_a     -1.253143    0.467050        -2.68310
gamma_educ_b_b     -1.278983    0.581957        -2.19773
gamma_educ_b_c      0.000000          NA              NA
gamma_reality_a    -0.964777    0.349006        -2.76436
gamma_reality_b    -2.473164    0.737257        -3.35455
gamma_reality_c     0.000000          NA              NA
gamma_predict_a     0.494142    0.223809         2.20787
gamma_predict_b     0.931369    0.283087         3.29004
gamma_predict_c     0.000000          NA              NA
gamma_ANAcosts_a   -1.281211    0.596300        -2.14860
gamma_ANAcosts_b   -0.408260    0.732043        -0.55770
gamma_ANAcosts_c    0.000000          NA              NA

Final LL: -1124.9504


Summary of class allocation for model component :
         Mean prob.
Class_1      0.6709
Class_2      0.2151
Class_3      0.1141

Calculating log-likelihood at equal shares (LL(0)) for applicable models...
Calculating log-likelihood at observed shares from estimation data (LL(c)) for applicable models...
Calculating LL of each model component...
Calculating other model fit measures
Computing covariance matrix using numerical jacobian of analytical gradient.
 0%....25%....50%....75%...100%
Negative definite Hessian with maximum eigenvalue: -0.061696
Computing score matrix...

Your model was estimated using the BGW algorithm. Please acknowledge this by citing Bunch et al. (1993) - DOI 10.1145/151271.151279
> 
> 
> # ################################################################# #
> #### MODEL OUTPUTS                                               ####
> # ################################################################# #
> 
> # ----------------------------------------------------------------- #
> #---- FORMATTED OUTPUT (TO SCREEN)                               ----
> # ----------------------------------------------------------------- #
> 
> # apollo_modelOutput(model)
> apollo_modelOutput(model, modelOutput_settings = list(printPVal=2)) # for two sided t-test
Model run by lotte using Apollo 0.3.4 on R 4.4.2 for Windows.
Please acknowledge the use of Apollo by citing Hess & Palma (2019)
  DOI 10.1016/j.jocm.2019.100170
  www.ApolloChoiceModelling.com

Model name                                  : LC_T1-3_3_class_all_Demog-CCatt-ANA
Model description                           : LC model on treatment Ghana choice data, 3 classes, all coeffs vary, including demog-CCatt-ANA attitudes
Model run at                                : 2025-02-25 15:33:05.815117
Estimation method                           : bgw
Model diagnosis                             : Relative function convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : -0.061696
     reciprocal of condition number         : 1.5165e-07
Number of individuals                       : 292
Number of rows in database                  : 1752
Number of modelled outcomes                 : 1752

Number of cores used                        :  5 
Model without mixing

LL(start)                                   : -1252.45
LL (whole model) at equal shares, LL(0)     : -1924.77
LL (whole model) at observed shares, LL(C)  : -1723.34
LL(final, whole model)                      : -1124.95
Rho-squared vs equal shares                  :  0.4155 
Adj.Rho-squared vs equal shares              :  0.401 
Rho-squared vs observed shares               :  0.3472 
Adj.Rho-squared vs observed shares           :  0.3345 
AIC                                         :  2305.9 
BIC                                         :  2459.02 

LL(0,Class_1)                    : -1924.77
LL(final,Class_1)                : -1788.35
LL(0,Class_2)                    : -1924.77
LL(final,Class_2)                : -4113
LL(0,Class_3)                    : -1924.77
LL(final,Class_3)                : -5627.14

Estimated parameters                        : 28
Time taken (hh:mm:ss)                       :  00:00:11.23 
     pre-estimation                         :  00:00:5.02 
     estimation                             :  00:00:0.73 
     post-estimation                        :  00:00:5.48 
Iterations                                  :  18  

Unconstrained optimisation.

Estimates:
                    Estimate        s.e.   t.rat.(0)  p(2-sided)    Rob.s.e. Rob.t.rat.(0)  p(2-sided)
asc_sq_a           -0.828714    0.554544    -1.49441    0.135069    0.628567      -1.31842    0.187364
asc_sq_b            1.799647    2.188124     0.82246    0.410814    1.671131       1.07690    0.281523
asc_sq_c            5.807879    4.022171     1.44397    0.148748    2.106988       2.75648    0.005843
b_rainmean_a        0.004210    0.005860     0.71854    0.472426    0.006067       0.69392    0.487731
b_rainmean_b      6.1869e-04    0.025759     0.02402    0.980838    0.021849       0.02832    0.977409
b_rainmean_c       -0.012996    0.045109    -0.28810    0.773268    0.026175      -0.49652    0.619529
b_variance_a       -0.005039    0.010805    -0.46635    0.640967    0.011148      -0.45197    0.651291
b_variance_b       -0.024367    0.041356    -0.58921    0.555719    0.049771      -0.48959    0.624422
b_variance_c        0.047588    0.085661     0.55553    0.578531    0.076031       0.62589    0.531384
b_fracI_a           0.022335    0.002366     9.43837    0.000000    0.002578       8.66223    0.000000
b_fracI_b           0.158355    0.043039     3.67934  2.3384e-04    0.055436       2.85652    0.004283
b_fracI_c           0.032552    0.016293     1.99789    0.045728    0.015593       2.08764    0.036831
b_costI_a          -0.774423    0.160411    -4.82773   1.381e-06    0.194125      -3.98930   6.627e-05
b_costI_b          -2.499484    0.917010    -2.72569    0.006417    1.487384      -1.68046    0.092868
b_costI_c          -1.121240    1.197073    -0.93665    0.348938    0.925733      -1.21119    0.225822
b_payback_a         0.419895    0.040242    10.43430    0.000000    0.050872       8.25396   2.220e-16
b_payback_b         0.234768    0.120153     1.95391    0.050712    0.157229       1.49316    0.135394
b_payback_c         0.435231    0.216615     2.00924    0.044512    0.251211       1.73254    0.083178
delta_a             2.800874    0.840833     3.33107  8.6513e-04    0.990213       2.82856    0.004676
delta_b             2.191453    1.159220     1.89046    0.058697    1.272222       1.72254    0.084972
delta_c             0.000000          NA          NA          NA          NA            NA          NA
gamma_educ_b_a     -1.253143    0.420443    -2.98053    0.002878    0.436585      -2.87033    0.004100
gamma_educ_b_b     -1.278983    0.545047    -2.34655    0.018948    0.580078      -2.20485    0.027465
gamma_educ_b_c      0.000000          NA          NA          NA          NA            NA          NA
gamma_reality_a    -0.964777    0.322882    -2.98802    0.002808    0.345880      -2.78934    0.005282
gamma_reality_b    -2.473164    0.689278    -3.58805  3.3316e-04    0.724297      -3.41457  6.3882e-04
gamma_reality_c     0.000000          NA          NA          NA          NA            NA          NA
gamma_predict_a     0.494142    0.266767     1.85234    0.063977    0.338253       1.46086    0.144053
gamma_predict_b     0.931369    0.324451     2.87060    0.004097    0.413273       2.25364    0.024219
gamma_predict_c     0.000000          NA          NA          NA          NA            NA          NA
gamma_ANAcosts_a   -1.281211    0.547175    -2.34150    0.019206    0.539301      -2.37569    0.017516
gamma_ANAcosts_b   -0.408260    0.727260    -0.56137    0.574547    0.867087      -0.47084    0.637754
gamma_ANAcosts_c    0.000000          NA          NA          NA          NA            NA          NA


Summary of class allocation for model component :
         Mean prob.
Class_1      0.6709
Class_2      0.2151
Class_3      0.1141
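
For reference, the mean class shares above come from a multinomial logit over the class-allocation utilities (delta plus the gamma terms), with class c as the base. The lines below are only a minimal base-R sketch of that calculation for one respondent, using the treatment-one estimates reported above. The covariate values are made up for illustration, and `class_probs` is a hypothetical helper, not an Apollo function.

```r
# Class-allocation estimates reported above for treatment one;
# class c is the base, so all of its parameters are fixed at 0.
delta  <- c(a =  2.800874, b =  2.191453, c = 0)
g_educ <- c(a = -1.253143, b = -1.278983, c = 0)
g_real <- c(a = -0.964777, b = -2.473164, c = 0)
g_pred <- c(a =  0.494142, b =  0.931369, c = 0)
g_ana  <- c(a = -1.281211, b = -0.408260, c = 0)

# Hypothetical helper: MNL (softmax) class probabilities for one respondent
class_probs <- function(educ_below, reality_likert, predict_likert, ana_costs) {
  V <- delta + g_educ * educ_below + g_real * reality_likert +
    g_pred * predict_likert + g_ana * ana_costs
  expV <- exp(V - max(V))  # subtract the maximum for numerical stability
  expV / sum(expV)
}

# Made-up covariate values, for illustration only
p <- class_probs(educ_below = 1, reality_likert = 1,
                 predict_likert = 2, ana_costs = 0)
print(round(p, 3))
```

Averaging such probabilities over all respondents in the data would give the "Mean prob." column. Large values of delta or gamma push individual probabilities towards 0 or 1.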


Treatment group two [with attribute daymean] LCM with class membership:
Without the class membership covariates, the cost parameter is also positive, although not significant. In previous MMNL models we did not find positive cost coefficients (although there is significant variance), and the results are similar to those found for treatment one.
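
One option I am aware of for a wrongly signed cost coefficient is to estimate an unconstrained parameter and pass it through a negative exponential, so the implied coefficient cannot become positive. The snippet below only illustrates that transform in base R. The name `log_b_cost` is hypothetical, and this is not something I have applied in the models shown in this post.

```r
# Hypothetical unconstrained values that an optimiser could visit
log_b_cost <- c(-5, -1, 0, 1, 3)

# Implied cost coefficient: strictly negative for any real input
b_cost <- -exp(log_b_cost)
print(cbind(log_b_cost, b_cost))

# To start from a given negative coefficient, map it back to the
# unconstrained scale, e.g. the starting value -0.77 used for class a
start_log <- log(0.77)
print(-exp(start_log))
```

Whether such a constraint is appropriate here is a separate question: it forces the sign rather than explaining why the unconstrained estimate is positive.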
As mentioned in my earlier post, I struggled to find starting values for which the models converged. For treatment group two, I used the estimates from the treatment group one LCM as starting values.
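
Since latent class likelihoods can have several local optima, a common check is to re-estimate from many perturbed starting values and keep the run with the best log-likelihood. The block below is a generic base-R sketch of that idea on a toy two-component mixture; the data, `negLL` and the perturbation scale are all assumptions for illustration, not my Apollo model. Apollo has its own `apollo_searchStart` function for this purpose, which is not used here.

```r
set.seed(13)

# Toy data: mixture of two normals
y <- c(rnorm(150, mean = -2, sd = 1), rnorm(50, mean = 3, sd = 1))

# Negative log-likelihood; par = (mu1, mu2, logit of the share of component 1).
# Computed on the log scale so it stays finite for extreme parameter values.
negLL <- function(par) {
  l1 <- plogis(par[3], log.p = TRUE)  + dnorm(y, par[1], 1, log = TRUE)
  l2 <- plogis(-par[3], log.p = TRUE) + dnorm(y, par[2], 1, log = TRUE)
  m  <- pmax(l1, l2)
  -sum(m + log(exp(l1 - m) + exp(l2 - m)))
}

# Perturb a base starting vector and re-estimate from each candidate
base_start <- c(0, 0, 0)
n_starts   <- 20
fits <- lapply(seq_len(n_starts), function(i) {
  start <- base_start + rnorm(length(base_start), sd = 2)
  optim(start, negLL, method = "BFGS")
})

# Compare final log-likelihoods and keep the best run
ll   <- -sapply(fits, function(f) f$value)
best <- fits[[which.max(ll)]]
print(sort(round(ll, 2), decreasing = TRUE))
print(round(best$par, 3))
```

If several starts end at the same best log-likelihood, that is some reassurance that the solution is not just a poor local optimum. If the best value is reached only once, more starts are advisable.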

Code:

> # ################################################################# #
> #### LOAD LIBRARY AND DEFINE CORE SETTINGS                       ####
> # ################################################################# #
> 
> ### Clear memory
> rm(list = ls())
> 
> ### Load libraries
> library(apollo)
> library(dplyr)
> library(corrplot)
> 
> # Set working directory and read data
> # Function to set working directory based on operating system
> set_working_directory <- function() {
+   os_info <- Sys.info()[["sysname"]]
+   
+   if (os_info == "Windows") {
+     setwd("C:/Users/lotte/surfdrive/WORK/PhD/3-GhanaCE/Ghana-CE-Analysis")
+   } else if (os_info == "Darwin") {
+     setwd("~/surfdrive/WORK/PhD/3-GhanaCE/Ghana-CE-Analysis") # macOS
+   } else if (os_info == "Linux") {
+     setwd("~/path/to/your/directory")
+   } else {
+     stop("Unsupported operating system")
+   }
+   cat("Working directory set to:", getwd(), "\n")
+ }
> 
> 
> # Set the working directory based on the OS
> set_working_directory()
Working directory set to: C:/Users/lotte/surfdrive/WORK/PhD/3-GhanaCE/Ghana-CE-Analysis 
> ### Initialise code
> apollo_initialise()
Apollo ignition sequence completed
> 
> ### Set core controls
> apollo_control = list(
+   modelName       = "LC_T2-4_3_class_all_demog_CCatt_ANA",
+   modelDescr      = "LC model on T2-4 treatment Ghana choice data, 3 classes, all coeffs vary, including demographic information",
+   indivID         = "ID",
+   nCores          =  5, 
+   # mixing = FALSE,
+   debug = TRUE,
+   # workInLogs=TRUE,
+   outputDirectory = "output"
+ )
> 
> # ################################################################# #
> #### LOAD DATA AND APPLY ANY TRANSFORMATIONS                     ####
> # ################################################################# #
> 
> ### Loading data:
> database = read.csv("choice_data_2indicator_wide.csv", header=TRUE)
> 
> database <- database %>%
+   rename(ID = id) %>%
+   arrange(ID)
> 
> 
> 
> ### Rescaling variables:
> varlist <- c("rainMean_1", "rainMean_2", "rainMean_3", "rainVariance_1", "rainVariance_2", "rainVariance_3")
> # Use mutate with across to divide values by 10
> database <- database %>%
+   mutate(across(all_of(varlist), ~ . / 10))
> # interpretation:
> # cm rainfall instead of mm rainfall.
> 
> 
> varlist <- c("costI_1", "costI_2", "costI_3")
> # Use mutate with across to divide values by 100
> database <- database %>%
+   mutate(across(all_of(varlist), ~ . / 100))
> # interpretation:
> # costs in terms of 100 Ghanaian cedi.
> 
> 
> # Define the variables to check
> vars_to_check <- c(
+   paste0("rainMean_", 1:3),
+   paste0("rainVariance_", 1:3),
+   paste0("dayMean_", 1:3),
+   paste0("costI_", 1:3),
+   paste0("fracI_", 1:3),
+   paste0("payback_", 1:3)
+ )
> 
> # Ensure the variables exist in the dataframe
> vars_to_check <- vars_to_check[vars_to_check %in% colnames(database)]
> 
> # Check for infinite values in the specified columns
> if (length(vars_to_check) > 0) {
+   infinite_values <- sapply(database[, vars_to_check, drop = FALSE], function(col) any(is.infinite(col)))
+   
+   # Show which variables contain infinite values
+   cat("Variables with infinite values:\n")
+   print(names(infinite_values[infinite_values]))
+ } else {
+   cat("No matching variables found in the dataframe.\n")
+ }
Variables with infinite values:
character(0)
> 
> 
> # Rename columns
> database <- database %>%
+   rename(
+     variance_1 = rainVariance_1,
+     variance_2 = rainVariance_2,
+     variance_3 = rainVariance_3
+   )
> 
> 
> 
> print(paste("nr unique individuals:", length(unique(database$ID))))
[1] "nr unique individuals: 321"
> 
> 
> table(database$education)
   1    2    3    4    5    6    7    8 
 300   42  306 1032  126   84   12   24 
> database$education <- database$education - 1
> database$educ_below <- ifelse(database$education %in% c(0, 1, 2), 1, 0)
> database$educ_median <- ifelse(database$education %in% c(3), 1, 0)
> database$educ_above <- ifelse(database$education %in% c(4, 5, 6, 7, 8), 1, 0)
> 
> 
> # Remove variables with missing values:
> database <- database[, colSums(is.na(database)) == 0]
> 
> 
> 
> # ################################################################# #
> #### DEFINE MODEL PARAMETERS                                     ####
> # ################################################################# #
> 
> ### Vector of parameters, including any that are kept fixed in estimation
> apollo_beta = c(    
+                     asc_sq_a         = -0.83,
+                     asc_sq_b         = 1.80,
+                     asc_sq_c         = 5.80,
+ 
+                     b_rainmean_a     = 0,
+                     b_rainmean_b     = 0,
+                     b_rainmean_c     = -0.01,
+ 
+                     b_variance_a     = 0,
+                     b_variance_b     = -0.02,
+                     b_variance_c     = 0.05,
+ 
+                     b_daymean_a      = 0.1,
+                     b_daymean_b      = 0.05,
+                     b_daymean_c      = 0,
+ 
+                     b_fracI_a        = 0.02,
+                     b_fracI_b        = 0.16,
+                     b_fracI_c        = 0.03,
+ 
+                     b_costI_a        = -0.77,
+                     b_costI_b        = -2.50,
+                     b_costI_c        = -1.12,
+ 
+                     b_payback_a      = 0.42,
+                     b_payback_b      = 0.23,
+                     b_payback_c      = 0.44,
+ 
+                     delta_a          = 2.8,
+                     delta_b          = 2.19,
+                     delta_c          = 0,
+ 
+                     gamma_educ_b_a   = -1.25,
+                     gamma_educ_b_b   = -1.28,
+                     gamma_educ_b_c   = 0,
+ 
+                     gamma_reality_a   = -0.96,
+                     gamma_reality_b   = -2.47,
+                     gamma_reality_c   = 0,
+ 
+                     gamma_predict_a   = 0.49,
+                     gamma_predict_b   = 0.93,
+                     gamma_predict_c   = 0,
+ 
+                     gamma_ANAcosts_a       = -1.28,
+                     gamma_ANAcosts_b       = -0.41,
+                     gamma_ANAcosts_c       = 0
+     
+ )
> 
> 
> ### Vector with names (in quotes) of parameters to be kept fixed at their starting value in apollo_beta, use apollo_fixed = c() if none
> apollo_fixed = c("delta_c", 
+                  "gamma_educ_b_c",
+                  "gamma_reality_c",
+                  "gamma_predict_c",
+                  "gamma_ANAcosts_c"
+ )
> 
> 
> # ################################################################# #
> #### DEFINE LATENT CLASS COMPONENTS                              ####
> # ################################################################# #
> 
> apollo_lcPars=function(apollo_beta, apollo_inputs){
+   lcpars = list()
+   
+   lcpars[["asc_sq"]]      = list(asc_sq_a, asc_sq_b, asc_sq_c)
+   lcpars[["b_rainmean"]]  = list(b_rainmean_a, b_rainmean_b, b_rainmean_c)
+   lcpars[["b_variance"]]  = list(b_variance_a, b_variance_b, b_variance_c)
+   lcpars[["b_daymean"]]   = list(b_daymean_a, b_daymean_b, b_daymean_c)
+   lcpars[["b_fracI"]]     = list(b_fracI_a, b_fracI_b, b_fracI_c)
+   lcpars[["b_costI"]]     = list(b_costI_a, b_costI_b, b_costI_c)
+   lcpars[["b_payback"]]   = list(b_payback_a, b_payback_b, b_payback_c)
+   
+   
+   V=list()
+   
+   V[["class_a"]] = delta_a +
+     gamma_educ_b_a * educ_below +
+     gamma_reality_a*reality_likert +
+     gamma_predict_a*predict_likert +
+     gamma_ANAcosts_a    * attribute_ignore.4
+   
+ 
+   
+   
+   V[["class_b"]] = delta_b +
+     gamma_educ_b_b * educ_below +
+     gamma_reality_b*reality_likert +
+     gamma_predict_b*predict_likert +
+     gamma_ANAcosts_b*attribute_ignore.4
+ 
+ 
+   
+   V[["class_c"]] = delta_c +
+     gamma_educ_b_c * educ_below +
+     gamma_reality_c*reality_likert +
+     gamma_predict_c*predict_likert +
+     gamma_ANAcosts_c*attribute_ignore.4
+   
+   
+   
+   classAlloc_settings = list(
+     alternatives = c(class_a=1, class_b=2, class_c=3), 
+     V            = V
+   )
+   
+   lcpars[["pi_values"]] = apollo_classAlloc(classAlloc_settings)
+   
+   return(lcpars)
+ }
> 
> # ################################################################# #
> #### GROUP AND VALIDATE INPUTS                                   ####
> # ################################################################# #
> 
> apollo_inputs = apollo_validateInputs()
Missing setting for workInLogs in apollo_control, set to default of FALSE
Missing setting for seed in apollo_control, set to default of 13
Missing setting for mixing in apollo_control, set to default of FALSE
Missing setting for HB in apollo_control, set to default of FALSE
Missing setting memorySaver in apollo_control, set to default of FALSE
Several observations per individual detected based on the value of ID. Setting panelData in apollo_control set to TRUE.
Missing setting for cpp in apollo_control, set to default of FALSE.
Missing setting for analyticGrad in apollo_control, set to default of TRUE
Missing setting for matrixMult in apollo_control, set to default of FALSE
Missing setting for overridePanel in apollo_control, set to default of FALSE.
Missing setting for preventOverridePanel in apollo_control, set to default of FALSE.
Missing setting for noModification in apollo_control, set to default of FALSE.
All checks on apollo_control completed.
All checks on database completed.
> 
> # ################################################################# #
> #### DEFINE MODEL AND LIKELIHOOD FUNCTION                        ####
> # ################################################################# #
> 
> apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
+   
+   ### Attach inputs and detach after function exit
+   apollo_attach(apollo_beta, apollo_inputs)
+   
+   ### Create list of probabilities P
+   P = list()
+   
+   ### Define settings for MNL model component that are generic across classes
+   mnl_settings = list(
+     alternatives = c(option1=1, option2=2, sq=3),
+     avail        = list(option1=1, option2=1, sq=1),
+     choiceVar    = choice
+   )
+   
+   ### Loop over classes
+   for(s in 1:3){
+     
+     
+     ### Compute class-specific utilities
+     V = list()
+     V[["option1"]]  = b_rainmean[[s]] * rainMean_1 + 
+       b_variance[[s]] * variance_1 + 
+       b_daymean[[s]] * dayMean_1 +
+       b_fracI[[s]] *fracI_1 + 
+       b_costI[[s]] * costI_1 + 
+       b_payback[[s]] * payback_1 
+     
+     
+     V[["option2"]]  = b_rainmean[[s]] * rainMean_2 + 
+       b_variance[[s]] * variance_2 + 
+       b_daymean[[s]] * dayMean_2 +
+       b_fracI[[s]] *fracI_2 + 
+       b_costI[[s]] * costI_2 + 
+       b_payback[[s]] * payback_2 
+     
+     V[["sq"]]       = asc_sq[[s]]   
+     
+     
+     mnl_settings$utilities = V
+     mnl_settings$componentName = paste0("Class_",s)
+     
+     ### Compute within-class choice probabilities using MNL model
+     P[[paste0("Class_",s)]] = apollo_mnl(mnl_settings, functionality)
+     
+     ### Take product across observation for same individual
+     P[[paste0("Class_",s)]] = apollo_panelProd(P[[paste0("Class_",s)]], apollo_inputs ,functionality)
+   }
+   
+   ### Compute latent class model probabilities
+   lc_settings   = list(inClassProb = P, classProb=pi_values)
+   P[["model"]] = apollo_lc(lc_settings, apollo_inputs, functionality)
+   
+   ### Prepare and return outputs of function
+   P = apollo_prepareProb(P, apollo_inputs, functionality)
+   return(P)
+ }
> 
> # ################################################################# #
> #### MODEL ESTIMATION                                            ####
> # ################################################################# #
> ### Optional estimation settings (currently commented out, so defaults are used)
> # estimate_settings = list(
> #   maxIterations = 200
> # )
> 
> 
> model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs
+                         # estimate_settings
+                         )
Preparing user-defined functions.
- Inserting component name in apollo_probabilities
- Replacing tau=c(...) by tau=list(...) in calls to apollo_ol.
- Expanding loops in apollo_lcPars.
- Expanding loops in apollo_probabilities.
- Inserting scaling in apollo_probabilities
- Inserting scaling in apollo_lcPars
- Inserting quotes in settings for apollo_rrm (if present)
- Inserting function() in user-defined functions

Testing likelihood function...
Apollo found a model component of type classAlloc without a componentName. The name was set to "classAlloc" by default.
INFORMATION: Setting "avail" is missing, so full availability is assumed. 

Overview of choices for MNL model component Class_1:
                                 option1 option2      sq
Times available                  1926.00 1926.00 1926.00
Times chosen                      802.00  894.00  230.00
Percentage chosen overall          41.64   46.42   11.94
Percentage chosen when available   41.64   46.42   11.94


Overview of choices for MNL model component Class_2:
                                 option1 option2      sq
Times available                  1926.00 1926.00 1926.00
Times chosen                      802.00  894.00  230.00
Percentage chosen overall          41.64   46.42   11.94
Percentage chosen when available   41.64   46.42   11.94


Overview of choices for MNL model component Class_3:
                                 option1 option2      sq
Times available                  1926.00 1926.00 1926.00
Times chosen                      802.00  894.00  230.00
Percentage chosen overall          41.64   46.42   11.94
Percentage chosen when available   41.64   46.42   11.94


Summary of class allocation for model component :
         Mean prob.
Class_1      0.6603
Class_2      0.2158
Class_3      0.1239
The class allocation probabilities for model component "model" are calculated at the observation level in 'apollo_lcPars', but are used in
  'apollo_probabilities' to multiply within class probabilities that are at the individual level. Apollo will average the class allocation
  probabilities across observations for the same individual level before using them to multiply the within-class probabilities. If your class
  allocation probabilities are constant across choice situations for the same individual, then this is of no concern. If your class allocation
  probabilities however vary across choice tasks, then you should change your model specification in 'apollo_probabilities' to only call
  'apollo_panelProd' after calling 'apollo_lc'.

Pre-processing likelihood function...
Creating cluster...
Attempting to split data into 5 pieces.
Obs. per worker (thread): 390, 390, 390, 390, 366
Writing pieces to disk.....
Writing completed. 220.3MB of RAM in use.
Preparing workers for multithreading...
Cleaning memory in main thread... Done. 220.3MB of RAM in use.
Loading libraries and likelihood function... Done. 905.3MB of RAM in use.
Loading data... Done. 907.3MB of RAM in use
Creating apollo_logLike...
Pre-processing model...
Preparing pre-processing report
ComponentName  Type   Gradient Optimisation
Class_1        MNL    analytic
Class_2        MNL    analytic
Class_3        MNL    analytic
model          LC     analytic
Whole model gradient function creation succeeded.
Counting number of observations... Done.

Testing influence of parameters
Starting main estimation

BGW using analytic model derivatives supplied by caller...


Iterates will be written to: 
 output/LC_T2-4_3_class_all_demog_CCatt_ANA_iterations.csv
    it    nf     F            RELDF    PRELDF    RELDX    MODEL stppar
     0     1 1.215831761e+03
     1     3 1.198194825e+03 1.451e-02 1.530e-02 3.25e-02   G   9.19e-01
     2     5 1.184226018e+03 1.166e-02 1.400e-02 1.03e-01   G   1.29e-01
     3     6 1.179350820e+03 4.117e-03 4.371e-03 9.45e-02   G   1.22e-02
     4     8 1.176429265e+03 2.477e-03 1.970e-03 1.41e-01  G-S  3.40e-03
     5     9 1.174521088e+03 1.622e-03 1.322e-03 1.22e-01   S   3.29e-03
     6    10 1.172950396e+03 1.337e-03 9.713e-04 1.55e-01   S   1.74e-03
     7    11 1.172478491e+03 4.023e-04 8.993e-04 1.65e-01   S   1.60e-03
     8    13 1.171119176e+03 1.159e-03 1.159e-03 4.04e-02   S   3.73e-02
     9    14 1.170300287e+03 6.992e-04 7.203e-04 4.52e-02   S   2.24e-02
    10    15 1.169687081e+03 5.240e-04 4.260e-04 4.09e-02   S   1.41e-02
    11    16 1.169017980e+03 5.720e-04 6.169e-04 8.43e-02   S   3.94e-03
    12    17 1.168716614e+03 2.578e-04 2.959e-04 9.54e-02   S   1.30e-03
    13    18 1.168212828e+03 4.311e-04 2.696e-04 8.89e-02   S   0.00e+00
    14    20 1.167220655e+03 8.493e-04 8.503e-04 1.60e-01  G-S  0.00e+00
    15    22 1.167047692e+03 1.482e-04 4.717e-04 4.65e-02   S   8.24e-03
    16    23 1.166790862e+03 2.201e-04 2.880e-04 3.37e-02   S   2.40e-03
    17    24 1.166743129e+03 4.091e-05 8.626e-05 5.80e-02   S   1.05e-03
    18    25 1.166711697e+03 2.694e-05 3.382e-05 5.24e-02   S   2.56e-04
    19    26 1.166704733e+03 5.969e-06 7.355e-06 9.73e-03   S   0.00e+00
    20    27 1.166701110e+03 3.106e-06 2.758e-06 1.19e-02   S   0.00e+00
    21    28 1.166698858e+03 1.930e-06 1.474e-06 2.71e-02   S   0.00e+00
    22    29 1.166697241e+03 1.387e-06 1.304e-06 4.09e-02   S   0.00e+00
    23    30 1.166696473e+03 6.579e-07 7.750e-07 3.29e-02   S   0.00e+00
    24    31 1.166696131e+03 2.934e-07 2.909e-07 1.81e-02   S   0.00e+00
    25    32 1.166695970e+03 1.380e-07 8.949e-08 1.06e-02   S   0.00e+00
    26    35 1.166695950e+03 1.685e-08 3.844e-08 3.91e-04   G   6.23e-03
    27    36 1.166695941e+03 7.745e-09 4.571e-08 4.98e-04   G   1.08e-02
    28    37 1.166695929e+03 1.016e-08 2.077e-08 3.76e-04   S   1.73e-03
    29    38 1.166695914e+03 1.277e-08 1.730e-08 4.04e-04   S   1.02e-03
    30    39 1.166695907e+03 6.263e-09 6.919e-09 4.77e-04   S   8.66e-04
    31    44 1.166695666e+03 2.065e-07 1.826e-07 5.10e-02   S   0.00e+00
    32    45 1.166695616e+03 4.292e-08 3.748e-08 2.21e-02   S   0.00e+00
    33    46 1.166695580e+03 3.043e-08 2.145e-08 2.14e-02   S   0.00e+00
    34    47 1.166695560e+03 1.751e-08 1.939e-08 3.32e-02   S   0.00e+00
    35    48 1.166695549e+03 9.181e-09 8.550e-09 1.76e-02   S   0.00e+00
    36    49 1.166695546e+03 2.999e-09 3.991e-09 1.42e-02   S   0.00e+00
    37    50 1.166695542e+03 2.864e-09 2.134e-09 8.57e-03   S   0.00e+00
    38    51 1.166695541e+03 1.656e-09 1.804e-09 1.79e-02   S   0.00e+00
    39    52 1.166695539e+03 1.215e-09 1.209e-09 1.43e-02   S   0.00e+00
    40    53 1.166695538e+03 6.355e-10 5.918e-10 1.11e-02   S   0.00e+00
    41    54 1.166695538e+03 4.659e-10 3.407e-10 1.01e-02   S   0.00e+00
    42    55 1.166695537e+03 4.380e-10 3.482e-10 1.90e-02   S   0.00e+00
    43    56 1.166695537e+03 2.642e-10 2.319e-10 2.22e-02   S   0.00e+00
    44    57 1.166695537e+03 1.187e-10 1.130e-10 1.87e-02   S   0.00e+00
    45    58 1.166695537e+03 5.721e-11 6.028e-11 1.37e-02   S   0.00e+00

***** Relative function convergence *****

Estimated parameters:
                    Estimate
asc_sq_a           -0.711682
asc_sq_b           28.143655
asc_sq_c            0.026173
b_rainmean_a       -0.003495
b_rainmean_b       -0.055204
b_rainmean_c       -0.051129
b_variance_a        0.001789
b_variance_b       -0.676776
b_variance_c        0.022421
b_daymean_a         0.085838
b_daymean_b         2.920336
b_daymean_c        -0.037498
b_fracI_a           0.031113
b_fracI_b           0.525198
b_fracI_c           0.028180
b_costI_a          -0.884805
b_costI_b          10.835196
b_costI_c          -1.478916
b_payback_a         0.479937
b_payback_b         1.024163
b_payback_c        -0.177559
delta_a             1.802144
delta_b            19.393231
delta_c             0.000000
gamma_educ_b_a     -0.564472
gamma_educ_b_b     -0.120271
gamma_educ_b_c      0.000000
gamma_reality_a    -0.489320
gamma_reality_b   -18.149345
gamma_reality_c     0.000000
gamma_predict_a     0.570672
gamma_predict_b    -0.025911
gamma_predict_c     0.000000
gamma_ANAcosts_a   -1.728785
gamma_ANAcosts_b   -0.827748
gamma_ANAcosts_c    0.000000

Final LL: -1166.6955


Summary of class allocation for model component :
         Mean prob.
Class_1      0.7585
Class_2      0.1295
Class_3      0.1120

Calculating log-likelihood at equal shares (LL(0)) for applicable models...
Calculating log-likelihood at observed shares from estimation data (LL(c)) for applicable models...
Calculating LL of each model component...
Calculating other model fit measures
Freeing memory on main thread...
Computing covariance matrix using numerical jacobian of analytical gradient.
 0%....25%....50%....75%....100%
Negative definite Hessian with maximum eigenvalue: 0
Computing score matrix...
Restoring data to main thread... Done. 222.1MB of RAM in use

Your model was estimated using the BGW algorithm. Please acknowledge this by citing Bunch et al. (1993) - DOI 10.1145/151271.151279
> 
> # ################################################################# #
> #### MODEL OUTPUTS                                               ####
> # ################################################################# #
> 
> # ----------------------------------------------------------------- #
> #---- FORMATTED OUTPUT (TO SCREEN)                               ----
> # ----------------------------------------------------------------- #
> 
> # apollo_modelOutput(model)
> 
> apollo_modelOutput(model, modelOutput_settings = list(printPVal=2)) # for two sided t-test
Model run by lotte using Apollo 0.3.4 on R 4.4.2 for Windows.
Please acknowledge the use of Apollo by citing Hess & Palma (2019)
  DOI 10.1016/j.jocm.2019.100170
  www.ApolloChoiceModelling.com

Model name                                  : LC_T2-4_3_class_all_demog_CCatt_ANA
Model description                           : LC model on T2-4 treatment Ghana choice data, 3 classes, all coeffs vary, including demographic information
Model run at                                : 2025-02-25 15:50:38.858248
Estimation method                           : bgw
Model diagnosis                             : Relative function convergence
Optimisation diagnosis                      : Maximum found
     hessian properties                     : Negative definite
     maximum eigenvalue                     : 0
     reciprocal of condition number         : 7.48394e-14
Number of individuals                       : 321
Number of rows in database                  : 1926
Number of modelled outcomes                 : 1926

Number of cores used                        :  5 
Model without mixing

LL(start)                                   : -1215.83
LL (whole model) at equal shares, LL(0)     : -2115.93
LL (whole model) at observed shares, LL(C)  : -1877.54
LL(final, whole model)                      : -1166.7
Rho-squared vs equal shares                  :  0.4486 
Adj.Rho-squared vs equal shares              :  0.434 
Rho-squared vs observed shares               :  0.3786 
Adj.Rho-squared vs observed shares           :  0.3653 
AIC                                         :  2395.39 
BIC                                         :  2567.85 

LL(0,Class_1)                    : -2115.93
LL(final,Class_1)                : -1985.17
LL(0,Class_2)                    : -2115.93
LL(final,Class_2)                : -13325.12
LL(0,Class_3)                    : -2115.93
LL(final,Class_3)                : -6661.93

Estimated parameters                        : 31
Time taken (hh:mm:ss)                       :  00:00:15.38 
     pre-estimation                         :  00:00:5.93 
     estimation                             :  00:00:2.2 
     post-estimation                        :  00:00:7.24 
Iterations                                  :  45  

Unconstrained optimisation.

Estimates:
                    Estimate        s.e.   t.rat.(0)  p(2-sided)    Rob.s.e. Rob.t.rat.(0)  p(2-sided)
asc_sq_a           -0.711682    0.557706   -1.276089    0.201924    0.581799     -1.223243    0.221238
asc_sq_b           28.143655   12.939135    2.175080    0.029624    8.544617      3.293729  9.8868e-04
asc_sq_c            0.026173    3.845318    0.006806    0.994569    5.601923      0.004672    0.996272
b_rainmean_a       -0.003495    0.005632   -0.620438    0.534969    0.005549     -0.629778    0.528840
b_rainmean_b       -0.055204    0.044172   -1.249746    0.211392    0.033731     -1.636616    0.101711
b_rainmean_c       -0.051129    0.048921   -1.045125    0.295965    0.056726     -0.901335    0.367410
b_variance_a        0.001789    0.009483    0.188626    0.850386    0.008270      0.216305    0.828750
b_variance_b       -0.676776    0.268494   -2.520635    0.011714    0.191324     -3.537331  4.0419e-04
b_variance_c        0.022421    0.085701    0.261620    0.793614    0.066549      0.336909    0.736186
b_daymean_a         0.085838    0.030905    2.777425    0.005479    0.029572      2.902657    0.003700
b_daymean_b         2.920336    1.180361    2.474105    0.013357    0.735646      3.969759   7.195e-05
b_daymean_c        -0.037498    0.236230   -0.158734    0.873879    0.165378     -0.226739    0.820627
b_fracI_a           0.031113    0.002348   13.252835    0.000000    0.002906     10.708146    0.000000
b_fracI_b           0.525198    0.187726    2.797682    0.005147    0.117338      4.475946   7.607e-06
b_fracI_c           0.028180    0.015200    1.853968    0.063744    0.018220      1.546660    0.121945
b_costI_a          -0.884805    0.144364   -6.129002   8.843e-10    0.155086     -5.705244   1.162e-08
b_costI_b          10.835196    4.349851    2.490935    0.012741    2.903791      3.731396  1.9042e-04
b_costI_c          -1.478916    1.271392   -1.163226    0.244738    1.683915     -0.878261    0.379802
b_payback_a         0.479937    0.034942   13.735203    0.000000    0.043992     10.909698    0.000000
b_payback_b         1.024163    0.415462    2.465116    0.013697    0.296811      3.450556  5.5943e-04
b_payback_c        -0.177559    0.215904   -0.822399    0.410850    0.098837     -1.796494    0.072416
delta_a             1.802144    0.790110    2.280876    0.022556    0.881523      2.044352    0.040919
delta_b            19.393231 3927.876594    0.004937    0.996061    3.550448      5.462192   4.703e-08
delta_c             0.000000          NA          NA          NA          NA            NA          NA
gamma_educ_b_a     -0.564472    0.399205   -1.413991    0.157365    0.383789     -1.470786    0.141349
gamma_educ_b_b     -0.120271    0.561852   -0.214062    0.830498    0.557788     -0.215622    0.829283
gamma_educ_b_c      0.000000          NA          NA          NA          NA            NA          NA
gamma_reality_a    -0.489320    0.267761   -1.827446    0.067633    0.283503     -1.725978    0.084351
gamma_reality_b   -18.149345 3927.876432   -0.004621    0.996313    2.887447     -6.285602   3.266e-10
gamma_reality_c     0.000000          NA          NA          NA          NA            NA          NA
gamma_predict_a     0.570672    0.253448    2.251633    0.024345    0.269067      2.120930    0.033928
gamma_predict_b    -0.025911    0.347603   -0.074542    0.940579    0.375445     -0.069014    0.944978
gamma_predict_c     0.000000          NA          NA          NA          NA            NA          NA
gamma_ANAcosts_a   -1.728785    0.429757   -4.022704   5.753e-05    0.435232     -3.972101   7.124e-05
gamma_ANAcosts_b   -0.827748    0.647684   -1.278011    0.201246    0.676088     -1.224319    0.220832
gamma_ANAcosts_c    0.000000          NA          NA          NA          NA            NA          NA


Summary of class allocation for model component :
         Mean prob.
Class_1      0.7585
Class_2      0.1295
Class_3      0.1120


I have checked for differences in the descriptive statistics between the treatment groups and did not find anything very different. The same holds for the exact choices made by respondents in each choice task (per comparable assigned block).

Do you have any suggestions on how I can try to understand why these results differ? Additionally, why do some coefficients have much larger values in absolute terms?

Thanks again for taking the time.
stephanehess
Site Admin
Posts: 1351
Joined: 24 Apr 2020, 16:29

Re: Errors convergence latent class model

Post by stephanehess »

Hi

How does the fit compare to that of the two-class models?

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk
LotteMuller
Posts: 4
Joined: 26 Sep 2024, 13:57

Re: Errors convergence latent class model

Post by LotteMuller »

Hi Stephane,

Below is a table comparing the fit of the 2-class model (without class membership), the 3-class model (without class membership) and the 3-class model with class membership, for each treatment group.
The fit improved significantly each time (tested using likelihood ratio tests), indicating that the 3-class model with class membership fits best. However, am I overfitting (looking at the BIC)? Could this explain why I get a significant positive cost coefficient in the 3-class model with class membership for treatment group 2? Does anything stand out, or is there anything worth testing that I have missed?
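
For readers following along, the comparison described here (likelihood ratio tests between nested models, plus BIC as a check on overfitting) can be sketched as below. This is only an illustration in Python, not Apollo code. The log-likelihoods and the number of observations are made-up placeholders, not the values from the attached table. The parameter counts assume seven utility parameters per class plus the class allocation terms, which gives the 31 parameters reported in the output above for the largest model. The sketch shows how the likelihood ratio test can favour the largest model while BIC prefers a smaller one.

```python
import math

def lr_statistic(ll_restricted, ll_general):
    """Likelihood ratio statistic: 2 * (LL_general - LL_restricted)."""
    return 2.0 * (ll_general - ll_restricted)

def bic(ll, n_params, n_obs):
    """Bayesian information criterion: -2*LL + k*ln(N). Lower is better."""
    return -2.0 * ll + n_params * math.log(n_obs)

# ASSUMPTION: hypothetical number of observations, for illustration only.
N_OBS = 2000

# 5% critical value of the chi-square distribution with 8 degrees of freedom.
CHI2_CRIT_8DF_5PCT = 15.507

# (label, final log-likelihood, number of estimated parameters)
# ASSUMPTION: the log-likelihoods are placeholders, not the thread's results.
models = [
    ("2 classes", -1500.0, 15),
    ("3 classes", -1450.0, 23),
    ("3 classes + membership", -1435.0, 31),
]

# BIC for each model, and the model BIC would select.
bic_values = {name: bic(ll, k, N_OBS) for name, ll, k in models}
best_by_bic = min(bic_values, key=bic_values.get)

# Likelihood ratio test of each model against the next larger one.
lr_results = []
for (name_r, ll_r, k_r), (name_g, ll_g, k_g) in zip(models, models[1:]):
    stat = lr_statistic(ll_r, ll_g)
    df = k_g - k_r  # 8 in both comparisons here
    lr_results.append((name_r, name_g, stat, df, stat > CHI2_CRIT_8DF_5PCT))

for name_r, name_g, stat, df, reject in lr_results:
    print(f"{name_r} vs {name_g}: LR = {stat:.2f}, df = {df}, reject at 5%: {reject}")
for name, value in bic_values.items():
    print(f"BIC {name}: {value:.2f}")
print("Preferred by BIC:", best_by_bic)
```

With these placeholder numbers, both likelihood ratio tests reject the smaller model, yet BIC selects the 3-class model without class membership, because the `k*ln(N)` penalty outweighs the gain in log-likelihood from the eight extra parameters.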

thanks again...

Lotte
Attachments
LCMs-fit.PNG
stephanehess
Site Admin
Posts: 1351
Joined: 24 Apr 2020, 16:29

Re: Errors convergence latent class model

Post by stephanehess »

Hi

It looks like you may have a group of people here who always choose the SQ option. Could you look at the posterior class allocation probabilities and see whether the people very likely to fall into class b are also always choosing the SQ option?
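
The check suggested above could be sketched roughly as follows. This is a stand-alone Python illustration of the logic, not Apollo code. In Apollo itself the posterior class probabilities would normally come from the package's own post-estimation tools (for example `apollo_lcConditionals`, see the manual). All respondent IDs, choices and class-specific likelihoods below are hypothetical. The prior class shares are the sample mean allocation probabilities from the output above, used as a simplification in place of each person's own covariate-dependent allocation probabilities.

```python
def posterior_class_probs(prior, class_likelihoods):
    """Bayes' rule: posterior_s = prior_s * L_s / sum_k(prior_k * L_k),
    where L_s is the likelihood of a person's observed choice sequence
    under the parameters of class s."""
    joint = [p * l for p, l in zip(prior, class_likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

def always_chooses(choices, alternative):
    """True if every observed choice equals the given alternative."""
    return all(c == alternative for c in choices)

# Sample mean class allocation probabilities (classes a, b, c) from the output.
# ASSUMPTION: used here as every person's prior, which is a simplification.
PRIOR = [0.7585, 0.1295, 0.1120]

# ASSUMPTION: hypothetical respondents, choices and per-class likelihoods.
respondents = {
    "id_1": {"choices": ["sq", "sq", "sq", "sq"], "lik": [0.001, 0.90, 0.02]},
    "id_2": {"choices": ["alt1", "sq", "alt2", "alt1"], "lik": [0.08, 0.0001, 0.03]},
    "id_3": {"choices": ["sq", "sq", "sq", "sq"], "lik": [0.002, 0.85, 0.04]},
}

THRESHOLD = 0.9  # "very likely" to belong to class b
CLASS_B = 1      # position of class b in the probability lists

posteriors = {
    rid: posterior_class_probs(PRIOR, r["lik"]) for rid, r in respondents.items()
}

# People very likely to be in class b, and whether they always choose SQ.
likely_b = [rid for rid, post in posteriors.items() if post[CLASS_B] > THRESHOLD]
always_sq = {rid: always_chooses(respondents[rid]["choices"], "sq") for rid in likely_b}

for rid in likely_b:
    print(rid, "posterior class b:", round(posteriors[rid][CLASS_B], 3),
          "always SQ:", always_sq[rid])
```

If most of the respondents flagged as very likely class b members also turn out to always choose SQ, that would be consistent with class b capturing non-trading behaviour.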

Stephane
--------------------------------
Stephane Hess
www.stephanehess.me.uk