Mixed Logit Joint Estimation Fails with Singular Convergence

androdri · Post by **androdri** » 07 May 2025, 14:08

Dear Apollo Forum Users,

We are currently estimating a mixed logit model from a discrete choice experiment conducted in three countries. The experiment design was consistent across countries: respondents chose among three product alternatives plus an opt-out, with four non-price attributes and price (in USD cents).

We have successfully estimated both multinomial logit (MNL) and mixed logit (MMNL) models separately for each country, as well as for the pooled sample (without distinguishing countries). However, we would prefer to use a joint estimation strategy that models all countries simultaneously while accounting for heterogeneity across them.

To this end, we implemented a joint MMNL specification, interacting each utility function with a country-specific scale parameter (μ), and allowing all other parameters to vary randomly (normal distributions for all but price, which is log-normal). The code structure follows the Apollo guidance on combining multiple MNL components.

Unfortunately, while the joint MNL model converges successfully, the joint MMNL model results in "singular convergence". We’ve tested various sets of starting values, including:

All coefficients initialized at 0.
- Coefficients taken from the pooled MMNL estimation (i.e., without country interactions), which converged without issues.
Still, the joint MMNL estimation fails. We are wondering if:
- The issue arises due to a lack of identification when combining random parameters with country-specific interactions?
  - There is a need to impose further restrictions (e.g., fixing some random parameter variances across countries)?
    - There are known issues with joint MMNL specifications in Apollo that we might be encountering?
      
      I have attached the relevant portion of the code below for reference. Any insights, suggestions, or diagnostic tips would be very welcome!
      
      Thank you in advance for your help.
      
      Best regards,
      
      ### Initialise code
      rm(list=ls())
      library(apollo)
      library(readr)
      library(purrr)
      library(writexl)
      library(tidyverse)
      library(gridExtra)
      
      apollo_initialise()
      
      ### Set core controls
      apollo_control = list(
      modelName = "MMNL_cont_sinInt_pool_1",
      modelDescr = "Mixed Logit (MMNL) model, based on final data",
      indivID = "unique_id", # Name of column in the database with each individual's ID
      mixing = TRUE,
      outputDirectory = "output"
      )
      
      # ################################################################# #
      #### DEFINE MODEL PARAMETERS ####
      # ################################################################# #
      
      ### Vector of parameters, including any that are kept fixed in estimation
      apollo_beta=c(asc_disp_mu = 2.66235627298614, asc_disp_sig = 2.04190609548596, asc_rech_mu = 2.88144773307611, asc_rech_sig = -2.36948135361606, asc_cig_mu = 2.63656072629063, asc_cig_sig = -2.81064812025337, bprice_mu = -17.3846372858944, bprice_sig = 21.2582770442041, bext1_mu = -0.518582512908777, bext1_sig = -0.094131702107882, bext2_mu = -1.04902237806684, bext2_sig = -1.02619572257211, bext3_mu = -0.586525057171061, bext3_sig = 0.299366413153592, bharm_mu = -0.507369735339598, bharm_sig = 0.717126371047121, bhide_mu = -0.123408428641911, bhide_sig = -0.467480418535583, bflavour1_mu = 0.175191161821176, bflavour1_sig = 0.0395994335525264, bflavour2_mu = 0.219332050759916, bflavour2_sig = -0.550427457565768,
      mu_col = 0,
      mu_arg = 0,
      mu_chi = 0
      )
      
      apollo_fixed = c("mu_col")
      
      # ################################################################# #
      #### DEFINE RANDOM COMPONENTS ####
      # ################################################################# #
      
      ### Set parameters for generating draws
      apollo_draws = list(
      interDrawsType = "mlhs",
      interNDraws = 1000,
      interNormDraws = c("draws_asc_disp","draws_asc_rech","draws_asc_cig",
      "draws_bprice",
      "draws_bext1","draws_bext2","draws_bext3",
      "draws_bharm","draws_bhide",
      "draws_bflavour1","draws_bflavour2")
      )
      
      ### Create random parameters
      apollo_randCoeff = function(apollo_beta, apollo_inputs){
      randcoeff = list()
      
      randcoeff[["asc_disp"]] = asc_disp_mu + asc_disp_sig * draws_asc_disp
      randcoeff[["asc_rech"]] = asc_rech_mu + asc_rech_sig * draws_asc_rech
      randcoeff[["asc_cig"]] = asc_cig_mu + asc_cig_sig * draws_asc_cig
      
      randcoeff[["bprice"]] = -exp(bprice_mu + bprice_sig * draws_bprice)
      randcoeff[["bext1"]] = bext1_mu + bext1_sig * draws_bext1
      randcoeff[["bext2"]] = bext2_mu + bext2_sig * draws_bext2
      randcoeff[["bext3"]] = bext3_mu + bext3_sig * draws_bext3
      randcoeff[["bharm"]] = bharm_mu + bharm_sig * draws_bharm
      randcoeff[["bhide"]] = bhide_mu + bhide_sig * draws_bhide
      randcoeff[["bflavour1"]] = bflavour1_mu + bflavour1_sig * draws_bflavour1
      randcoeff[["bflavour2"]] = bflavour2_mu + bflavour2_sig * draws_bflavour2
      
      return(randcoeff)
      }
      
      # ################################################################# #
      #### GROUP AND VALIDATE INPUTS ####
      # ################################################################# #
      
      apollo_inputs = apollo_validateInputs()
      
      # ################################################################# #
      #### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
      # ################################################################# #
      
      apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){
      
      ### Function initialisation: do not change the following three commands
      ### Attach inputs and detach after function exit
      apollo_attach(apollo_beta, apollo_inputs)
      on.exit(apollo_detach(apollo_beta, apollo_inputs))
      
      ### Create list of probabilities P
      P = list()
      
      ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
      V = list()
      V[["ecigdisp_col"]] = asc_disp + bprice*price1 + bext1*(ext1==1) + bext2*(ext1==2) + bext3*(ext1==3) + bharm*harm1 + bhide*hide1 + bflavour1*(flavour1==1) + bflavour2*(flavour1==2)
      V[["ecigrech_col"]] = asc_rech + bprice*price2 + bext1*(ext2==1) + bext2* (ext2==2) + bext3* (ext2==3) + bharm*harm2 + bhide*hide2 + bflavour1*(flavour2==1) + bflavour2*(flavour2==2)
      V[["cig_col"]] = asc_cig + bprice*price3 + bext2*ext3 + bharm*harm3 + bhide*hide3 + bflavour1*(flavour3==1) + bflavour2*(flavour3==2)
      V[["optout_col"]] = 0
      
      ### Compute probabilities for 'Colombia' of the data using MNL model
      mnl_settings_col = list(
      alternatives = c(ecigdisp_col=1, ecigrech_col=2, cig_col=3, optout_col=4),
      avail = list(ecigdisp_col=1, ecigrech_col=1, cig_col=1, optout_col=1),
      choiceVar = chosen_option,
      utilities = list(ecigdisp_col= mu_col*V[["ecigdisp_col"]],
      ecigrech_col= mu_col*V[["ecigrech_col"]],
      cig_col = mu_col*V[["cig_col"]],
      optout_col = mu_col*V[["optout_col"]]),
      rows = (col==1),
      componentName = "col"
      )
      
      ### Compute probabilities using MNL model
      P[["col"]] = apollo_mnl(mnl_settings_col, functionality)
      
      ### Compute probabilities for Argentina sub sample
      V = list()
      V[["ecigdisp_arg"]] = asc_disp + bprice*price1 + bext1*(ext1==1) + bext2*(ext1==2) + bext3*(ext1==3) + bharm*harm1 + bhide*hide1 + bflavour1*(flavour1==1) + bflavour2*(flavour1==2)
      V[["ecigrech_arg"]] = asc_rech + bprice*price2 + bext1*(ext2==1) + bext2* (ext2==2) + bext3* (ext2==3) + bharm*harm2 + bhide*hide2 + bflavour1*(flavour2==1) + bflavour2*(flavour2==2)
      V[["cig_arg"]] = asc_cig + bprice*price3 + bext2*ext3 + bharm*harm3 + bhide*hide3 + bflavour1*(flavour3==1) + bflavour2*(flavour3==2)
      V[["optout_arg"]] = 0
      
      ### Compute probabilities for 'Argentina' of the data using MNL model
      mnl_settings_arg = list(
      alternatives = c(ecigdisp_arg=1, ecigrech_arg=2, cig_arg=3, optout_arg=4),
      avail = list(ecigdisp_arg=1, ecigrech_arg=1, cig_arg=1, optout_arg=1),
      choiceVar = chosen_option,
      utilities = list(ecigdisp_arg= mu_arg*V[["ecigdisp_arg"]],
      ecigrech_arg= mu_arg*V[["ecigrech_arg"]],
      cig_arg = mu_arg*V[["cig_arg"]],
      optout_arg = mu_arg*V[["optout_arg"]]),
      rows = (arg==1),
      componentName = "arg"
      )
      
      ### Compute probabilities using MNL model
      P[["arg"]] = apollo_mnl(mnl_settings_arg, functionality)
      
      ### Compute probabilities for Chile sub sample
      V = list()
      V[["ecigdisp_chi"]] = asc_disp + bprice*price1 + bext1*(ext1==1) + bext2*(ext1==2) + bext3*(ext1==3) + bharm*harm1 + bhide*hide1 + bflavour1*(flavour1==1) + bflavour2*(flavour1==2)
      V[["ecigrech_chi"]] = asc_rech + bprice*price2 + bext1*(ext2==1) + bext2* (ext2==2) + bext3* (ext2==3) + bharm*harm2 + bhide*hide2 + bflavour1*(flavour2==1) + bflavour2*(flavour2==2)
      V[["cig_chi"]] = asc_cig + bprice*price3 + bext2*ext3 + bharm*harm3 + bhide*hide3 + bflavour1*(flavour3==1) + bflavour2*(flavour3==2)
      V[["optout_chi"]] = 0
      
      ### Compute probabilities for 'Chile' of the data using MNL model
      mnl_settings_chi = list(
      alternatives = c(ecigdisp_chi=1, ecigrech_chi=2, cig_chi=3, optout_chi=4),
      avail = list(ecigdisp_chi=1, ecigrech_chi=1, cig_chi=1, optout_chi=1),
      choiceVar = chosen_option,
      utilities = list(ecigdisp_chi= mu_chi*V[["ecigdisp_chi"]],
      ecigrech_chi= mu_chi*V[["ecigrech_chi"]],
      cig_chi = mu_chi*V[["cig_chi"]],
      optout_chi = mu_chi*V[["optout_chi"]]),
      rows = (chi==1),
      componentName = "chi"
      )
      
      ### Compute probabilities using MNL model
      P[["chi"]] = apollo_mnl(mnl_settings_chi, functionality)
      
      ## Combined model - Joint estimation
      P = apollo_combineModels(P,apollo_inputs, functionality)
      
      ### Take product across observation for same individual
      P = apollo_panelProd(P, apollo_inputs, functionality)
      
      ### Average across inter-individual draws
      P = apollo_avgInterDraws(P, apollo_inputs, functionality)
      
      ### Prepare and return outputs of function
      P = apollo_prepareProb(P, apollo_inputs, functionality)
      return(P)
      }
      
      > # ################################################################# #
      > #### MODEL ESTIMATION ####
      > # ################################################################# #
      >
      > model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
      INFORMATION: You can use multiple processor cores to speed up
      estimation. To do so, specify the desired number of
      cores using the setting nCores inside
      "apollo_control", for example "nCores=2". This
      computer has 20 available cores. We recommend using no
      more than 19 cores.
      Preparing user-defined functions.
      
      Testing likelihood function...
      
      Overview of choices for MNL model component col:
      ecigdisp_col ecigrech_col
      Times available 3880.00 3880.00
      Times chosen 991.00 1376.00
      Percentage chosen overall 25.54 35.46
      Percentage chosen when available 25.54 35.46
      cig_col optout_col
      Times available 3880.00 3880.00
      Times chosen 389.00 1124.00
      Percentage chosen overall 10.03 28.97
      Percentage chosen when available 10.03 28.97
      
      Overview of choices for MNL model component arg:
      ecigdisp_arg ecigrech_arg
      Times available 4760.00 4760.00
      Times chosen 1324.00 1499.00
      Percentage chosen overall 27.82 31.49
      Percentage chosen when available 27.82 31.49
      cig_arg optout_arg
      Times available 4760.00 4760.00
      Times chosen 840.00 1097.00
      Percentage chosen overall 17.65 23.05
      Percentage chosen when available 17.65 23.05
      
      Overview of choices for MNL model component chi:
      ecigdisp_chi ecigrech_chi
      Times available 4480.00 4480.00
      Times chosen 1228.00 1413.00
      Percentage chosen overall 27.41 31.54
      Percentage chosen when available 27.41 31.54
      cig_chi optout_chi
      Times available 4480.0 4480.00
      Times chosen 802.0 1037.00
      Percentage chosen overall 17.9 23.15
      Percentage chosen when available 17.9 23.15
      
      Pre-processing likelihood function...
      
      Testing influence of parameters
      Starting main estimation
      
      BGW using analytic model derivatives supplied by caller...
      
      Iterates will be written to:
      output/MMNL_cont_sinInt_pool_NOFUMA1825_iterations.csv
      it nf F RELDF PRELDF RELDX MODEL stppar
      0 1 1.818818202e+04
      1 9 1.818817948e+04 1.395e-07 4.638e-07 9.77e-05 G 2.82e+02
      2 12 1.818817936e+04 6.614e-09 8.489e-08 5.93e-05 G-S 4.48e+02
      3 13 1.818817903e+04 1.824e-08 5.293e-08 9.42e-04 S 2.66e+00
      4 15 1.818817896e+04 3.590e-09 1.267e-08 5.04e-04 S 6.21e+00
      5 17 1.818817895e+04 9.231e-10 1.824e-09 1.94e-04 S 1.05e+01
      6 19 1.818817895e+04 5.048e-12 1.945e-11 1.83e-05 S 1.25e-03
      7 20 1.805986495e+04 7.055e-03 3.276e-08 7.70e-02 S 5.60e-04
      8 22 1.805856299e+04 7.209e-05 2.942e-04 9.95e-02 S 6.39e-07
      9 23 1.805021614e+04 4.622e-04 2.408e-04 7.91e-03 S 9.48e-34
      10 24 1.804022132e+04 5.537e-04 2.407e-04 3.01e-02 S 7.20e-34
      11 25 1.803586974e+04 2.412e-04 2.661e-04 1.06e-01 S 3.80e-06
      12 27 1.802502526e+04 6.013e-04 3.367e-04 6.35e-02 S 4.06e-05
      13 29 1.800653344e+04 1.026e-03 4.524e-04 1.53e-02 S 3.47e-04
      14 30 1.798339469e+04 1.285e-03 6.068e-04 3.09e-02 S 7.18e-06
      15 31 1.796664806e+04 9.312e-04 6.901e-04 8.23e-02 S 7.46e-35
      16 32 1.794793142e+04 1.042e-03 7.676e-04 6.00e-02 S -7.36e-35
      17 34 1.791729655e+04 1.707e-03 8.692e-04 2.94e-02 S 6.75e-05
      18 35 1.789615002e+04 1.180e-03 9.376e-04 3.87e-02 S 3.15e-35
      19 36 1.786156411e+04 1.933e-03 1.038e-03 5.94e-02 S 2.07e-36
      20 38 1.782732713e+04 1.917e-03 1.214e-03 8.15e-02 S 1.81e-05
      21 39 1.778277460e+04 2.499e-03 1.500e-03 6.31e-02 S 5.32e-37
      22 40 1.771889114e+04 3.592e-03 1.733e-03 7.45e-02 S 1.26e-37
      23 41 1.764995944e+04 3.890e-03 1.900e-03 1.83e-01 S 3.11e-38
      24 42 1.758325798e+04 3.779e-03 2.044e-03 4.90e-01 S 7.76e-39
      25 43 1.750241203e+04 4.598e-03 2.301e-03 5.85e-01 S 1.94e-39
      26 44 1.740544473e+04 5.540e-03 3.904e-03 5.96e-01 S 2.25e-39
      27 45 1.733492158e+04 4.052e-03 3.826e-03 2.72e-01 S 5.64e-39
      28 46 1.724556716e+04 5.155e-03 3.623e-03 1.76e-01 S 1.81e-38
      29 47 1.708002725e+04 9.599e-03 3.674e-03 1.30e-01 S 1.35e-38
      30 57 1.706870580e+04 6.628e-04 5.234e-03 4.31e-04 S 4.49e-04
      31 60 1.702178985e+04 2.749e-03 2.486e-03 9.83e-06 S 3.16e+00
      32 61 1.694418191e+04 4.559e-03 3.487e-03 1.52e-05 S 9.02e-04
      33 63 1.690452684e+04 2.340e-03 2.118e-03 7.42e-06 S 4.96e+00
      34 64 1.684153531e+04 3.726e-03 3.029e-03 1.31e-05 S 6.50e-03
      35 65 1.677140908e+04 4.164e-03 3.075e-03 6.21e-05 S 1.01e-02
      36 66 1.673005056e+04 2.466e-03 2.029e-03 5.97e-05 S 1.01e-02
      37 67 1.671137188e+04 1.116e-03 1.065e-03 5.82e-05 S 3.54e-03
      38 69 1.670670192e+04 2.794e-04 2.669e-04 5.33e-06 S 1.02e+00
      39 70 1.670477523e+04 1.153e-04 1.254e-04 1.02e-05 S 1.09e-01
      40 71 1.670363107e+04 6.849e-05 6.585e-05 8.90e-05 S 1.46e-35
      41 72 1.670361441e+04 9.970e-07 9.872e-07 5.51e-05 S -1.46e-35
      42 73 1.670361441e+04 7.084e-11 7.051e-11 5.77e-05 S -1.46e-35
      
      ***** Singular convergence *****
      
      Estimated parameters:
      Estimate
      asc_disp_mu 6.954e+04
      asc_disp_sig 30.964
      asc_rech_mu 1.052e+05
      asc_rech_sig 101.607
      asc_cig_mu -2.898e+04
      asc_cig_sig 116.622
      bprice_mu 52.944
      bprice_sig 31.213
      bext1_mu 5.333e+04
      bext1_sig -4.149
      bext2_mu -4.399e+04
      bext2_sig 45.448
      bext3_mu 4.087e+04
      bext3_sig 5.883
      bharm_mu 1.134e+04
      bharm_sig 89.968
      bhide_mu 4.596e+04
      bhide_sig -84.626
      bflavour1_mu 4.336e+04
      bflavour1_sig -12.240
      bflavour2_mu 6.809e+04
      bflavour2_sig -26.809
      mu_col 0.000
      mu_arg 2.849e-37
      mu_chi 6.284e-38
      
      Final LL: -16703.6144
      
      WARNING: Your model did not converge properly, and some of your
      parameter values are tending to +/- infinity. This
      could point to an identification issue. If you want to
      retain these parameters in the model, you may wish to
      set their value(s) in apollo_beta to the estimated
      value(s), include the parameter name(s) in
      apollo_fixed, and re-estimate the model.
      WARNING: Estimation failed. No covariance matrix to compute.
      Calculating log-likelihood at equal shares (LL(0)) for
      applicable models...
      Calculating log-likelihood at observed shares from
      estimation data (LL(c)) for applicable models...
      Calculating LL of each model component...
      Calculating other model fit measures
      INFORMATION: Your model took more than 10 minutes to estimate, so it
      was saved to file
      output/MMNL_cont_sinInt_pool_NOFUMA1825_model.rds
      before calculating its covariance matrix. If
      calculation of the covariance matrix fails or is
      stopped before finishing, you can load the model up to
      this point using apollo_loadModel.
      
      Your model was estimated using the BGW algorithm. Please
      acknowledge this by citing Bunch et al. (1993) - DOI
      10.1145/151271.151279
      
      > apollo_modelOutput(model)
      Model run by paul.rodriguez using Apollo 0.3.3 on R 4.3.2 for Windows.
      Please acknowledge the use of Apollo by citing Hess & Palma (2019)
      DOI 10.1016/j.jocm.2019.100170
      www.ApolloChoiceModelling.com
      
      Model name : MMNL_cont_sinInt_pool_NOFUMA1825
      Model description : Mixed Logit (MMNL) model, based on final data
      Model run at : 2025-05-06 13:32:22.291758
      Estimation method : bgw
      Model diagnosis : Singular convergence
      Number of individuals : 1597
      Number of rows in database : 13120
      Number of modelled outcomes : 13120
      col : 3880
      arg : 4760
      chi : 4480
      
      Number of cores used : 1
      Number of inter-individual draws : 1000 (mlhs)
      
      LL(start) : -18188.18
      LL (whole model) at equal shares, LL(0) : -18188.18
      LL (whole model) at observed shares, LL(C) : -17676.48
      LL(final, whole model) : -16703.61
      Rho-squared vs equal shares : 0.0816
      Adj.Rho-squared vs equal shares : 0.0803
      Rho-squared vs observed shares : 0.055
      Adj.Rho-squared vs observed shares : 0.0542
      AIC : 33455.23
      BIC : 33605.55
      
      LL(0,col) : -5378.82
      LL(final,col) : -5378.82
      LL(0,arg) : -6598.76
      LL(final,arg) : -5766.41
      LL(0,chi) : -6210.6
      LL(final,chi) : -5558.38
      
      Estimated parameters : 24
      Time taken (hh:mm:ss) : 01:54:20.74
      pre-estimation : 00:05:31.43
      estimation : 01:46:59.91
      post-estimation : 00:01:49.39
      Iterations : 42 (Singular convergence)
      
      Unconstrained optimisation.
      
      Estimates:
      Estimate s.e. t.rat.(0) Rob.s.e. Rob.t.rat.(0)
      asc_disp_mu 6.954e+04 NA NA NA NA
      asc_disp_sig 30.964 NA NA NA NA
      asc_rech_mu 1.052e+05 NA NA NA NA
      asc_rech_sig 101.607 NA NA NA NA
      asc_cig_mu -2.898e+04 NA NA NA NA
      asc_cig_sig 116.622 NA NA NA NA
      bprice_mu 52.944 NA NA NA NA
      bprice_sig 31.213 NA NA NA NA
      bext1_mu 5.333e+04 NA NA NA NA
      bext1_sig -4.149 NA NA NA NA
      bext2_mu -4.399e+04 NA NA NA NA
      bext2_sig 45.448 NA NA NA NA
      bext3_mu 4.087e+04 NA NA NA NA
      bext3_sig 5.883 NA NA NA NA
      bharm_mu 1.134e+04 NA NA NA NA
      bharm_sig 89.968 NA NA NA NA
      bhide_mu 4.596e+04 NA NA NA NA
      bhide_sig -84.626 NA NA NA NA
      bflavour1_mu 4.336e+04 NA NA NA NA
      bflavour1_sig -12.240 NA NA NA NA
      bflavour2_mu 6.809e+04 NA NA NA NA
      bflavour2_sig -26.809 NA NA NA NA
      mu_col 0.000 NA NA NA NA
      mu_arg 2.849e-37 NA NA NA NA
      mu_chi 6.284e-38 NA NA NA NA

Post by **stephanehess** » 19 May 2025, 18:46

Hi

you have initialised the scale parameters to 0, instead of 1, and also fixed the one for Colombia to 0, which would mean that none of the other parameters can actually influence the likelihood.

Stephane

androdri · Post by **androdri** » 23 May 2025, 17:56

Many thanks! Good catch

ApolloChoiceModelling forum

Mixed Logit Joint Estimation Fails with Singular Convergence

Mixed Logit Joint Estimation Fails with Singular Convergence

Re: Mixed Logit Joint Estimation Fails with Singular Convergence

Re: Mixed Logit Joint Estimation Fails with Singular Convergence