Introduction
The standard Gaussian random-effects model assumes that true study effects follow a single normal distribution. This assumption breaks down when the effect distribution is bimodal — for example, when two qualitatively distinct subpopulations of studies exist (e.g., patient subgroups, different operationalisations of the intervention) — or when a small number of studies produce effects so discrepant that they distort the pooled estimate.
bayesma provides two random-effects mixture models to handle these situations:
- Two-component Gaussian mixture: models the random-effects distribution as a mixture of two normal components with different means and variances.
- Robust outlier mixture: identifies individual outlier studies without requiring them to belong to a separate substantive subpopulation.
This vignette covers the two-component mixture. For the outlier variant see Robust Outlier Mixture (Cruz).
Two-component Gaussian mixture model
Model specification
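The true-effect distribution is a mixture of two Gaussians:
$$\theta_i \sim \pi\,\mathcal{N}(\mu_1, \tau_1^2) + (1-\pi)\,\mathcal{N}(\mu_2, \tau_2^2), \qquad y_i \mid \theta_i \sim \mathcal{N}(\theta_i, s_i^2),$$
where $\theta_i$ is the true effect in study $i$, $y_i$ is its observed effect, and $s_i$ is its standard error.
Marginalising over $\theta_i$ gives the integrated observation-level likelihood:
$$y_i \sim \pi\,\mathcal{N}(\mu_1, \tau_1^2 + s_i^2) + (1-\pi)\,\mathcal{N}(\mu_2, \tau_2^2 + s_i^2)$$
The two components have means $\mu_1 < \mu_2$ (an ordering constraint resolves label switching) and heterogeneities $\tau_1$, $\tau_2$.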
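As a minimal sketch of the integrated likelihood, assume each study contributes an observed effect `y` with known standard error `s`, and that integrating out the true effect adds the within-study variance to each component's variance. The function name `mixture_loglik`, its argument names, and the toy data are illustrative assumptions; this is not the code bayesma runs internally.

```r
# Log-likelihood of a two-component Gaussian mixture after integrating out
# the study-level true effects. All names here are illustrative.
mixture_loglik <- function(y, s, mu1, mu2, tau1, tau2, pi_mix) {
  stopifnot(length(y) == length(s), mu1 <= mu2, pi_mix >= 0, pi_mix <= 1)
  # Per-study log density under each component; the marginal SD combines
  # between-study (tau) and within-study (s) variation.
  lp1 <- log(pi_mix)    + dnorm(y, mu1, sqrt(tau1^2 + s^2), log = TRUE)
  lp2 <- log1p(-pi_mix) + dnorm(y, mu2, sqrt(tau2^2 + s^2), log = TRUE)
  # Numerically stable log(exp(lp1) + exp(lp2)) for each study.
  m <- pmax(lp1, lp2)
  sum(m + log(exp(lp1 - m) + exp(lp2 - m)))
}

# Toy data (made-up numbers): three studies near 0.1 and three near 0.8.
y <- c(0.05, 0.12, 0.10, 0.78, 0.85, 0.80)
s <- rep(0.10, 6)

# Mixture evaluated at hand-picked parameter values.
ll_mix <- mixture_loglik(y, s, mu1 = 0.1, mu2 = 0.8,
                         tau1 = 0.05, tau2 = 0.05, pi_mix = 0.5)

# Single Gaussian with an arbitrary heterogeneity SD of 0.35, for comparison.
ll_single <- sum(dnorm(y, mean(y), sqrt(0.35^2 + s^2), log = TRUE))

c(mixture = ll_mix, single = ll_single)
```

With clearly clustered effects like these, the mixture log-likelihood is the larger of the two. Setting `pi_mix = 1` reduces the function to a single-Gaussian random-effects likelihood.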
Priors
The Beta(2, 2) prior on the mixing proportion favours balanced mixtures slightly over degenerate ones (all mass in one component), while still allowing unequal mixtures.
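To see what Beta(2, 2) implies for the mixing proportion, the base R functions `dbeta()` and `pbeta()` give the prior density and prior mass directly. The cut-offs 0.05, 0.2 and 0.8 below are illustrative choices, not values used by the package.

```r
# Prior density of the mixing proportion under Beta(2, 2).
dens_centre <- dbeta(0.50, 2, 2)  # density at a balanced mixture
dens_edge   <- dbeta(0.05, 2, 2)  # density near a degenerate mixture

# Prior probability that the mixing proportion lies between 0.2 and 0.8.
mass_middle <- pbeta(0.8, 2, 2) - pbeta(0.2, 2, 2)

round(c(centre = dens_centre, edge = dens_edge, middle = mass_middle), 3)
# centre = 1.500, edge = 0.285, middle = 0.792
```

A balanced mixture is therefore roughly five times as dense a priori as a mixing proportion of 0.05, and about 79% of the prior mass lies between 0.2 and 0.8.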
Fitting the model
summary() reports posteriors for $\mu_1$, $\mu_2$, $\tau_1$, $\tau_2$, and $\pi$.
Key estimands
Component means $\mu_1$ and $\mu_2$
The two means characterise the two subpopulations of studies. If the posteriors for $\mu_1$ and $\mu_2$ substantially overlap, the data do not support a bimodal structure.
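One way to quantify how far apart the two component means are is to summarise the posterior of their difference. The sketch below uses simulated stand-ins for posterior draws, since how draws are extracted from a fitted object is not shown here; the column names `mu1` and `mu2` and the 0.2 threshold are assumptions made for illustration.

```r
set.seed(1)

# Simulated stand-ins for posterior draws of the two component means.
# A real analysis would take these from the fitted model instead.
draws <- data.frame(
  mu1 = rnorm(4000, mean = 0.10, sd = 0.05),
  mu2 = rnorm(4000, mean = 0.80, sd = 0.08)
)

# Posterior of the separation between the components.
sep <- draws$mu2 - draws$mu1

# 95% credible interval and median of the separation.
sep_interval <- quantile(sep, c(0.025, 0.5, 0.975))

# Posterior probability that the gap exceeds a chosen practical threshold.
prob_gap <- mean(sep > 0.2)

sep_interval
prob_gap
```

Because the ordering constraint forces the difference to be positive, a threshold that is meaningful on the effect-size scale is more informative than checking whether the interval excludes zero.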
Mixing proportion $\pi$
$\pi$ is the estimated fraction of studies in the first (lower-effect) component. A posterior for $\pi$ concentrated near 0 or 1 indicates near-degenerate mixing: the data are consistent with a single Gaussian. Intermediate values (e.g., 0.2–0.8) suggest genuine bimodality.
Weighted pooled effect
The model-averaged pooled effect is
$$\mu_{\text{pooled}} = \pi\,\mu_1 + (1-\pi)\,\mu_2$$
This is reported as b_Intercept in the output and serves as the overall summary when reporting a single number is required.
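As a small worked example, the pooled effect can be computed draw by draw as the mixing-proportion-weighted average of the two component means, with the weight on the first (lower-effect) component. The helper `pooled_effect` and the three draws below are hypothetical, chosen only to make the arithmetic easy to follow.

```r
# Weighted average of the component means; pi_mix is the weight on component 1.
pooled_effect <- function(mu1, mu2, pi_mix) {
  pi_mix * mu1 + (1 - pi_mix) * mu2
}

# Three made-up posterior draws.
draws <- data.frame(
  mu1    = c(0.10, 0.12, 0.08),
  mu2    = c(0.80, 0.75, 0.85),
  pi_mix = c(0.60, 0.50, 0.70)
)

# One pooled value per draw, then a posterior summary.
pooled <- with(draws, pooled_effect(mu1, mu2, pi_mix))
pooled        # 0.380 0.435 0.311
mean(pooled)
```

Computing the quantity per draw, rather than plugging posterior means into the formula, keeps any posterior dependence between the mixing proportion and the component means.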
Assessing whether the mixture is supported
Compare the mixture model against the standard Gaussian RE model using WAIC or LOO:
```{r}
#| eval: false
fit_re <- bayesma(data)
fit_mix <- bayesma(data, model = "re_mixture")
compare_models(fit_re, fit_mix, labels = c("Gaussian RE", "RE mixture"),
               criterion = "loo")
```
A substantially lower LOO information criterion (equivalently, a higher expected log predictive density) for the mixture model indicates that the two-component structure fits meaningfully better. If the improvement is small, the standard RE model is adequate and more parsimonious.
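A common way to judge whether a LOO difference is substantial is to compare the difference in expected log predictive density (ELPD) with its standard error, both computed from pointwise contributions. The sketch below does this arithmetic in base R on made-up pointwise values for six studies; it does not call bayesma and makes no claim about what compare_models() returns.

```r
# Made-up pointwise ELPD contributions, one value per study.
elpd_re  <- c(-1.9, -2.1, -1.7, -2.4, -2.0, -2.2)  # Gaussian RE model
elpd_mix <- c(-1.2, -1.4, -1.5, -1.3, -1.6, -1.1)  # RE mixture model

# Pointwise differences; positive values favour the mixture.
d <- elpd_mix - elpd_re
n <- length(d)

# Total ELPD difference and its standard error.
elpd_diff <- sum(d)
se_diff   <- sqrt(n * var(d))

c(elpd_diff = elpd_diff, se_diff = se_diff, ratio = elpd_diff / se_diff)
```

A difference that is several standard errors from zero is stronger evidence than one of similar size with a wide standard error, although with few studies the standard error itself is only a rough guide.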
Interpreting a well-separated mixture
When $\mu_1$ and $\mu_2$ are well-separated (credible intervals non-overlapping), investigate moderators that distinguish the two study clusters. The mixture model can then motivate a meta-regression with the suspected moderator:
```{r}
#| eval: false
fit_reg <- meta_reg(data, moderators = ~ patient_type)
```
See Meta-Regression.
Limitations
- Label switching: despite the ordering constraint, MCMC chains can occasionally swap component labels for extreme values of $\pi$. Check trace plots for both components.
- Identifiability degrades with a small number of studies: the mixture adds three extra parameters ($\mu_2$, $\tau_2$, $\pi$) and requires enough studies for both components to be adequately informed.
- The two-component model cannot capture more complex multimodal structures. Three-component mixtures are available but rarely warranted in practice.
For the Stan code underlying this model, see Stan Code — RE Mixture Model.
