Page 1 of 1

Mixed Logit Joint Estimation Fails with Singular Convergence

Posted: 07 May 2025, 14:08
by androdri
Dear Apollo Forum Users,

We are currently estimating a mixed logit model from a discrete choice experiment conducted in three countries. The experiment design was consistent across countries: respondents chose among three product alternatives plus an opt-out, with four non-price attributes and price (in USD cents).

We have successfully estimated both multinomial logit (MNL) and mixed logit (MMNL) models separately for each country, as well as for the pooled sample (without distinguishing countries). However, we would prefer to use a joint estimation strategy that models all countries simultaneously while accounting for heterogeneity across them.

To this end, we implemented a joint MMNL specification, interacting each utility function with a country-specific scale parameter (μ), and allowing all other parameters to vary randomly (normal distributions for all but price, which is log-normal). The code structure follows the Apollo guidance on combining multiple MNL components.

Unfortunately, while the joint MNL model converges successfully, the joint MMNL model results in "singular convergence". We’ve tested various sets of starting values, including:
  • All coefficients initialized at 0.
    • Coefficients taken from the pooled MMNL estimation (i.e., without country interactions), which converged without issues.
    Still, the joint MMNL estimation fails. We are wondering if:
    • The issue arises due to a lack of identification when combining random parameters with country-specific interactions?
      • There is a need to impose further restrictions (e.g., fixing some random parameter variances across countries)?
        • There are known issues with joint MMNL specifications in Apollo that we might be encountering?

          I have attached the relevant portion of the code below for reference. Any insights, suggestions, or diagnostic tips would be very welcome!

          Thank you in advance for your help.

          Best regards,

          ### Initialise code
          rm(list=ls())
          library(apollo)
          library(readr)
          library(purrr)
          library(writexl)
          library(tidyverse)
          library(gridExtra)

          apollo_initialise()

          ### Set core controls
          apollo_control = list(
          modelName = "MMNL_cont_sinInt_pool_1",
          modelDescr = "Mixed Logit (MMNL) model, based on final data",
          indivID = "unique_id", # Name of column in the database with each individual's ID
          mixing = TRUE,
          outputDirectory = "output"
          )


          # ################################################################# #
          #### DEFINE MODEL PARAMETERS ####
          # ################################################################# #

          ### Vector of parameters, including any that are kept fixed in estimation
          apollo_beta=c(asc_disp_mu = 2.66235627298614, asc_disp_sig = 2.04190609548596, asc_rech_mu = 2.88144773307611, asc_rech_sig = -2.36948135361606, asc_cig_mu = 2.63656072629063, asc_cig_sig = -2.81064812025337, bprice_mu = -17.3846372858944, bprice_sig = 21.2582770442041, bext1_mu = -0.518582512908777, bext1_sig = -0.094131702107882, bext2_mu = -1.04902237806684, bext2_sig = -1.02619572257211, bext3_mu = -0.586525057171061, bext3_sig = 0.299366413153592, bharm_mu = -0.507369735339598, bharm_sig = 0.717126371047121, bhide_mu = -0.123408428641911, bhide_sig = -0.467480418535583, bflavour1_mu = 0.175191161821176, bflavour1_sig = 0.0395994335525264, bflavour2_mu = 0.219332050759916, bflavour2_sig = -0.550427457565768,
          mu_col = 0,
          mu_arg = 0,
          mu_chi = 0
          )

          apollo_fixed = c("mu_col")


          # ################################################################# #
          #### DEFINE RANDOM COMPONENTS ####
          # ################################################################# #

          ### Set parameters for generating draws
          apollo_draws = list(
          interDrawsType = "mlhs",
          interNDraws = 1000,
          interNormDraws = c("draws_asc_disp","draws_asc_rech","draws_asc_cig",
          "draws_bprice",
          "draws_bext1","draws_bext2","draws_bext3",
          "draws_bharm","draws_bhide",
          "draws_bflavour1","draws_bflavour2")
          )

          ### Create random parameters
          apollo_randCoeff = function(apollo_beta, apollo_inputs){
          randcoeff = list()

          randcoeff[["asc_disp"]] = asc_disp_mu + asc_disp_sig * draws_asc_disp
          randcoeff[["asc_rech"]] = asc_rech_mu + asc_rech_sig * draws_asc_rech
          randcoeff[["asc_cig"]] = asc_cig_mu + asc_cig_sig * draws_asc_cig

          randcoeff[["bprice"]] = -exp(bprice_mu + bprice_sig * draws_bprice)
          randcoeff[["bext1"]] = bext1_mu + bext1_sig * draws_bext1
          randcoeff[["bext2"]] = bext2_mu + bext2_sig * draws_bext2
          randcoeff[["bext3"]] = bext3_mu + bext3_sig * draws_bext3
          randcoeff[["bharm"]] = bharm_mu + bharm_sig * draws_bharm
          randcoeff[["bhide"]] = bhide_mu + bhide_sig * draws_bhide
          randcoeff[["bflavour1"]] = bflavour1_mu + bflavour1_sig * draws_bflavour1
          randcoeff[["bflavour2"]] = bflavour2_mu + bflavour2_sig * draws_bflavour2

          return(randcoeff)
          }


          # ################################################################# #
          #### GROUP AND VALIDATE INPUTS ####
          # ################################################################# #

          apollo_inputs = apollo_validateInputs()

          # ################################################################# #
          #### DEFINE MODEL AND LIKELIHOOD FUNCTION ####
          # ################################################################# #

          apollo_probabilities=function(apollo_beta, apollo_inputs, functionality="estimate"){

          ### Function initialisation: do not change the following three commands
          ### Attach inputs and detach after function exit
          apollo_attach(apollo_beta, apollo_inputs)
          on.exit(apollo_detach(apollo_beta, apollo_inputs))

          ### Create list of probabilities P
          P = list()

          ### List of utilities: these must use the same names as in mnl_settings, order is irrelevant
          V = list()
          V[["ecigdisp_col"]] = asc_disp + bprice*price1 + bext1*(ext1==1) + bext2*(ext1==2) + bext3*(ext1==3) + bharm*harm1 + bhide*hide1 + bflavour1*(flavour1==1) + bflavour2*(flavour1==2)
          V[["ecigrech_col"]] = asc_rech + bprice*price2 + bext1*(ext2==1) + bext2* (ext2==2) + bext3* (ext2==3) + bharm*harm2 + bhide*hide2 + bflavour1*(flavour2==1) + bflavour2*(flavour2==2)
          V[["cig_col"]] = asc_cig + bprice*price3 + bext2*ext3 + bharm*harm3 + bhide*hide3 + bflavour1*(flavour3==1) + bflavour2*(flavour3==2)
          V[["optout_col"]] = 0

          ### Compute probabilities for 'Colombia' of the data using MNL model
          mnl_settings_col = list(
          alternatives = c(ecigdisp_col=1, ecigrech_col=2, cig_col=3, optout_col=4),
          avail = list(ecigdisp_col=1, ecigrech_col=1, cig_col=1, optout_col=1),
          choiceVar = chosen_option,
          utilities = list(ecigdisp_col= mu_col*V[["ecigdisp_col"]],
          ecigrech_col= mu_col*V[["ecigrech_col"]],
          cig_col = mu_col*V[["cig_col"]],
          optout_col = mu_col*V[["optout_col"]]),
          rows = (col==1),
          componentName = "col"
          )

          ### Compute probabilities using MNL model
          P[["col"]] = apollo_mnl(mnl_settings_col, functionality)


          ### Compute probabilities for Argentina sub sample
          V = list()
          V[["ecigdisp_arg"]] = asc_disp + bprice*price1 + bext1*(ext1==1) + bext2*(ext1==2) + bext3*(ext1==3) + bharm*harm1 + bhide*hide1 + bflavour1*(flavour1==1) + bflavour2*(flavour1==2)
          V[["ecigrech_arg"]] = asc_rech + bprice*price2 + bext1*(ext2==1) + bext2* (ext2==2) + bext3* (ext2==3) + bharm*harm2 + bhide*hide2 + bflavour1*(flavour2==1) + bflavour2*(flavour2==2)
          V[["cig_arg"]] = asc_cig + bprice*price3 + bext2*ext3 + bharm*harm3 + bhide*hide3 + bflavour1*(flavour3==1) + bflavour2*(flavour3==2)
          V[["optout_arg"]] = 0

          ### Compute probabilities for 'Argentina' of the data using MNL model
          mnl_settings_arg = list(
          alternatives = c(ecigdisp_arg=1, ecigrech_arg=2, cig_arg=3, optout_arg=4),
          avail = list(ecigdisp_arg=1, ecigrech_arg=1, cig_arg=1, optout_arg=1),
          choiceVar = chosen_option,
          utilities = list(ecigdisp_arg= mu_arg*V[["ecigdisp_arg"]],
          ecigrech_arg= mu_arg*V[["ecigrech_arg"]],
          cig_arg = mu_arg*V[["cig_arg"]],
          optout_arg = mu_arg*V[["optout_arg"]]),
          rows = (arg==1),
          componentName = "arg"
          )

          ### Compute probabilities using MNL model
          P[["arg"]] = apollo_mnl(mnl_settings_arg, functionality)

          ### Compute probabilities for Chile sub sample
          V = list()
          V[["ecigdisp_chi"]] = asc_disp + bprice*price1 + bext1*(ext1==1) + bext2*(ext1==2) + bext3*(ext1==3) + bharm*harm1 + bhide*hide1 + bflavour1*(flavour1==1) + bflavour2*(flavour1==2)
          V[["ecigrech_chi"]] = asc_rech + bprice*price2 + bext1*(ext2==1) + bext2* (ext2==2) + bext3* (ext2==3) + bharm*harm2 + bhide*hide2 + bflavour1*(flavour2==1) + bflavour2*(flavour2==2)
          V[["cig_chi"]] = asc_cig + bprice*price3 + bext2*ext3 + bharm*harm3 + bhide*hide3 + bflavour1*(flavour3==1) + bflavour2*(flavour3==2)
          V[["optout_chi"]] = 0

          ### Compute probabilities for 'Chile' of the data using MNL model
          mnl_settings_chi = list(
          alternatives = c(ecigdisp_chi=1, ecigrech_chi=2, cig_chi=3, optout_chi=4),
          avail = list(ecigdisp_chi=1, ecigrech_chi=1, cig_chi=1, optout_chi=1),
          choiceVar = chosen_option,
          utilities = list(ecigdisp_chi= mu_chi*V[["ecigdisp_chi"]],
          ecigrech_chi= mu_chi*V[["ecigrech_chi"]],
          cig_chi = mu_chi*V[["cig_chi"]],
          optout_chi = mu_chi*V[["optout_chi"]]),
          rows = (chi==1),
          componentName = "chi"
          )

          ### Compute probabilities using MNL model
          P[["chi"]] = apollo_mnl(mnl_settings_chi, functionality)


          ## Combined model - Joint estimation
          P = apollo_combineModels(P,apollo_inputs, functionality)

          ### Take product across observation for same individual
          P = apollo_panelProd(P, apollo_inputs, functionality)

          ### Average across inter-individual draws
          P = apollo_avgInterDraws(P, apollo_inputs, functionality)

          ### Prepare and return outputs of function
          P = apollo_prepareProb(P, apollo_inputs, functionality)
          return(P)
          }




          > # ################################################################# #
          > #### MODEL ESTIMATION ####
          > # ################################################################# #
          >
          > model = apollo_estimate(apollo_beta, apollo_fixed, apollo_probabilities, apollo_inputs)
          INFORMATION: You can use multiple processor cores to speed up
          estimation. To do so, specify the desired number of
          cores using the setting nCores inside
          "apollo_control", for example "nCores=2". This
          computer has 20 available cores. We recommend using no
          more than 19 cores.
          Preparing user-defined functions.

          Testing likelihood function...

          Overview of choices for MNL model component col:
          ecigdisp_col ecigrech_col
          Times available 3880.00 3880.00
          Times chosen 991.00 1376.00
          Percentage chosen overall 25.54 35.46
          Percentage chosen when available 25.54 35.46
          cig_col optout_col
          Times available 3880.00 3880.00
          Times chosen 389.00 1124.00
          Percentage chosen overall 10.03 28.97
          Percentage chosen when available 10.03 28.97


          Overview of choices for MNL model component arg:
          ecigdisp_arg ecigrech_arg
          Times available 4760.00 4760.00
          Times chosen 1324.00 1499.00
          Percentage chosen overall 27.82 31.49
          Percentage chosen when available 27.82 31.49
          cig_arg optout_arg
          Times available 4760.00 4760.00
          Times chosen 840.00 1097.00
          Percentage chosen overall 17.65 23.05
          Percentage chosen when available 17.65 23.05


          Overview of choices for MNL model component chi:
          ecigdisp_chi ecigrech_chi
          Times available 4480.00 4480.00
          Times chosen 1228.00 1413.00
          Percentage chosen overall 27.41 31.54
          Percentage chosen when available 27.41 31.54
          cig_chi optout_chi
          Times available 4480.0 4480.00
          Times chosen 802.0 1037.00
          Percentage chosen overall 17.9 23.15
          Percentage chosen when available 17.9 23.15


          Pre-processing likelihood function...

          Testing influence of parameters
          Starting main estimation

          BGW using analytic model derivatives supplied by caller...


          Iterates will be written to:
          output/MMNL_cont_sinInt_pool_NOFUMA1825_iterations.csv
          it nf F RELDF PRELDF RELDX MODEL stppar
          0 1 1.818818202e+04
          1 9 1.818817948e+04 1.395e-07 4.638e-07 9.77e-05 G 2.82e+02
          2 12 1.818817936e+04 6.614e-09 8.489e-08 5.93e-05 G-S 4.48e+02
          3 13 1.818817903e+04 1.824e-08 5.293e-08 9.42e-04 S 2.66e+00
          4 15 1.818817896e+04 3.590e-09 1.267e-08 5.04e-04 S 6.21e+00
          5 17 1.818817895e+04 9.231e-10 1.824e-09 1.94e-04 S 1.05e+01
          6 19 1.818817895e+04 5.048e-12 1.945e-11 1.83e-05 S 1.25e-03
          7 20 1.805986495e+04 7.055e-03 3.276e-08 7.70e-02 S 5.60e-04
          8 22 1.805856299e+04 7.209e-05 2.942e-04 9.95e-02 S 6.39e-07
          9 23 1.805021614e+04 4.622e-04 2.408e-04 7.91e-03 S 9.48e-34
          10 24 1.804022132e+04 5.537e-04 2.407e-04 3.01e-02 S 7.20e-34
          11 25 1.803586974e+04 2.412e-04 2.661e-04 1.06e-01 S 3.80e-06
          12 27 1.802502526e+04 6.013e-04 3.367e-04 6.35e-02 S 4.06e-05
          13 29 1.800653344e+04 1.026e-03 4.524e-04 1.53e-02 S 3.47e-04
          14 30 1.798339469e+04 1.285e-03 6.068e-04 3.09e-02 S 7.18e-06
          15 31 1.796664806e+04 9.312e-04 6.901e-04 8.23e-02 S 7.46e-35
          16 32 1.794793142e+04 1.042e-03 7.676e-04 6.00e-02 S -7.36e-35
          17 34 1.791729655e+04 1.707e-03 8.692e-04 2.94e-02 S 6.75e-05
          18 35 1.789615002e+04 1.180e-03 9.376e-04 3.87e-02 S 3.15e-35
          19 36 1.786156411e+04 1.933e-03 1.038e-03 5.94e-02 S 2.07e-36
          20 38 1.782732713e+04 1.917e-03 1.214e-03 8.15e-02 S 1.81e-05
          21 39 1.778277460e+04 2.499e-03 1.500e-03 6.31e-02 S 5.32e-37
          22 40 1.771889114e+04 3.592e-03 1.733e-03 7.45e-02 S 1.26e-37
          23 41 1.764995944e+04 3.890e-03 1.900e-03 1.83e-01 S 3.11e-38
          24 42 1.758325798e+04 3.779e-03 2.044e-03 4.90e-01 S 7.76e-39
          25 43 1.750241203e+04 4.598e-03 2.301e-03 5.85e-01 S 1.94e-39
          26 44 1.740544473e+04 5.540e-03 3.904e-03 5.96e-01 S 2.25e-39
          27 45 1.733492158e+04 4.052e-03 3.826e-03 2.72e-01 S 5.64e-39
          28 46 1.724556716e+04 5.155e-03 3.623e-03 1.76e-01 S 1.81e-38
          29 47 1.708002725e+04 9.599e-03 3.674e-03 1.30e-01 S 1.35e-38
          30 57 1.706870580e+04 6.628e-04 5.234e-03 4.31e-04 S 4.49e-04
          31 60 1.702178985e+04 2.749e-03 2.486e-03 9.83e-06 S 3.16e+00
          32 61 1.694418191e+04 4.559e-03 3.487e-03 1.52e-05 S 9.02e-04
          33 63 1.690452684e+04 2.340e-03 2.118e-03 7.42e-06 S 4.96e+00
          34 64 1.684153531e+04 3.726e-03 3.029e-03 1.31e-05 S 6.50e-03
          35 65 1.677140908e+04 4.164e-03 3.075e-03 6.21e-05 S 1.01e-02
          36 66 1.673005056e+04 2.466e-03 2.029e-03 5.97e-05 S 1.01e-02
          37 67 1.671137188e+04 1.116e-03 1.065e-03 5.82e-05 S 3.54e-03
          38 69 1.670670192e+04 2.794e-04 2.669e-04 5.33e-06 S 1.02e+00
          39 70 1.670477523e+04 1.153e-04 1.254e-04 1.02e-05 S 1.09e-01
          40 71 1.670363107e+04 6.849e-05 6.585e-05 8.90e-05 S 1.46e-35
          41 72 1.670361441e+04 9.970e-07 9.872e-07 5.51e-05 S -1.46e-35
          42 73 1.670361441e+04 7.084e-11 7.051e-11 5.77e-05 S -1.46e-35

          ***** Singular convergence *****

          Estimated parameters:
          Estimate
          asc_disp_mu 6.954e+04
          asc_disp_sig 30.964
          asc_rech_mu 1.052e+05
          asc_rech_sig 101.607
          asc_cig_mu -2.898e+04
          asc_cig_sig 116.622
          bprice_mu 52.944
          bprice_sig 31.213
          bext1_mu 5.333e+04
          bext1_sig -4.149
          bext2_mu -4.399e+04
          bext2_sig 45.448
          bext3_mu 4.087e+04
          bext3_sig 5.883
          bharm_mu 1.134e+04
          bharm_sig 89.968
          bhide_mu 4.596e+04
          bhide_sig -84.626
          bflavour1_mu 4.336e+04
          bflavour1_sig -12.240
          bflavour2_mu 6.809e+04
          bflavour2_sig -26.809
          mu_col 0.000
          mu_arg 2.849e-37
          mu_chi 6.284e-38

          Final LL: -16703.6144

          WARNING: Your model did not converge properly, and some of your
          parameter values are tending to +/- infinity. This
          could point to an identification issue. If you want to
          retain these parameters in the model, you may wish to
          set their value(s) in apollo_beta to the estimated
          value(s), include the parameter name(s) in
          apollo_fixed, and re-estimate the model.
          WARNING: Estimation failed. No covariance matrix to compute.
          Calculating log-likelihood at equal shares (LL(0)) for
          applicable models...
          Calculating log-likelihood at observed shares from
          estimation data (LL(c)) for applicable models...
          Calculating LL of each model component...
          Calculating other model fit measures
          INFORMATION: Your model took more than 10 minutes to estimate, so it
          was saved to file
          output/MMNL_cont_sinInt_pool_NOFUMA1825_model.rds
          before calculating its covariance matrix. If
          calculation of the covariance matrix fails or is
          stopped before finishing, you can load the model up to
          this point using apollo_loadModel.

          Your model was estimated using the BGW algorithm. Please
          acknowledge this by citing Bunch et al. (1993) - DOI
          10.1145/151271.151279



          > apollo_modelOutput(model)
          Model run by paul.rodriguez using Apollo 0.3.3 on R 4.3.2 for Windows.
          Please acknowledge the use of Apollo by citing Hess & Palma (2019)
          DOI 10.1016/j.jocm.2019.100170
          www.ApolloChoiceModelling.com

          Model name : MMNL_cont_sinInt_pool_NOFUMA1825
          Model description : Mixed Logit (MMNL) model, based on final data
          Model run at : 2025-05-06 13:32:22.291758
          Estimation method : bgw
          Model diagnosis : Singular convergence
          Number of individuals : 1597
          Number of rows in database : 13120
          Number of modelled outcomes : 13120
          col : 3880
          arg : 4760
          chi : 4480

          Number of cores used : 1
          Number of inter-individual draws : 1000 (mlhs)

          LL(start) : -18188.18
          LL (whole model) at equal shares, LL(0) : -18188.18
          LL (whole model) at observed shares, LL(C) : -17676.48
          LL(final, whole model) : -16703.61
          Rho-squared vs equal shares : 0.0816
          Adj.Rho-squared vs equal shares : 0.0803
          Rho-squared vs observed shares : 0.055
          Adj.Rho-squared vs observed shares : 0.0542
          AIC : 33455.23
          BIC : 33605.55

          LL(0,col) : -5378.82
          LL(final,col) : -5378.82
          LL(0,arg) : -6598.76
          LL(final,arg) : -5766.41
          LL(0,chi) : -6210.6
          LL(final,chi) : -5558.38

          Estimated parameters : 24
          Time taken (hh:mm:ss) : 01:54:20.74
          pre-estimation : 00:05:31.43
          estimation : 01:46:59.91
          post-estimation : 00:01:49.39
          Iterations : 42 (Singular convergence)

          Unconstrained optimisation.

          Estimates:
          Estimate s.e. t.rat.(0) Rob.s.e. Rob.t.rat.(0)
          asc_disp_mu 6.954e+04 NA NA NA NA
          asc_disp_sig 30.964 NA NA NA NA
          asc_rech_mu 1.052e+05 NA NA NA NA
          asc_rech_sig 101.607 NA NA NA NA
          asc_cig_mu -2.898e+04 NA NA NA NA
          asc_cig_sig 116.622 NA NA NA NA
          bprice_mu 52.944 NA NA NA NA
          bprice_sig 31.213 NA NA NA NA
          bext1_mu 5.333e+04 NA NA NA NA
          bext1_sig -4.149 NA NA NA NA
          bext2_mu -4.399e+04 NA NA NA NA
          bext2_sig 45.448 NA NA NA NA
          bext3_mu 4.087e+04 NA NA NA NA
          bext3_sig 5.883 NA NA NA NA
          bharm_mu 1.134e+04 NA NA NA NA
          bharm_sig 89.968 NA NA NA NA
          bhide_mu 4.596e+04 NA NA NA NA
          bhide_sig -84.626 NA NA NA NA
          bflavour1_mu 4.336e+04 NA NA NA NA
          bflavour1_sig -12.240 NA NA NA NA
          bflavour2_mu 6.809e+04 NA NA NA NA
          bflavour2_sig -26.809 NA NA NA NA
          mu_col 0.000 NA NA NA NA
          mu_arg 2.849e-37 NA NA NA NA
          mu_chi 6.284e-38 NA NA NA NA

Re: Mixed Logit Joint Estimation Fails with Singular Convergence

Posted: 19 May 2025, 18:46
by stephanehess
Hi

you have initialised the scale parameters to 0, instead of 1, and also fixed the one for Colombia to 0, which would mean that none of the other parameters can actually influence the likelihood.

Stephane

Re: Mixed Logit Joint Estimation Fails with Singular Convergence

Posted: 23 May 2025, 17:56
by androdri
Many thanks! Good catch