Introduction

The bias-corrected Bayesian non-parametric (BC-BNP) model of Verde and Rosner (2025) is a flexible approach to publication bias adjustment that does not impose a parametric form on the selection mechanism. Selection weight models specify a piecewise-constant or smooth weight function, and PET-PEESE assumes the effect estimate is linear in the standard error (PET) or in its square (PEESE). The BC-BNP model instead uses a Bayesian non-parametric mixture to model the joint distribution of true effects and publication indicators.

The key properties of the BC-BNP model are:

  • Non-parametric flexibility: The distribution of true effects is modelled as an infinite mixture using a Dirichlet process prior. This avoids committing to a parametric form for heterogeneity.
  • Bias correction: Publication probabilities are estimated from the data, not fixed in advance.
  • Coherent uncertainty: Uncertainty about the selection mechanism propagates into the posterior for the pooled effect.

Model specification

Let $(\theta_i, \delta_i)$ denote the true effect and publication indicator for study $i$, where $\delta_i = 1$ if the study is published. Only published studies are observed.

The joint distribution of effects and publication indicators is modelled as

$$(\theta_i, \delta_i) \sim G$$

where $G$ is drawn from a Dirichlet process:

$$G \sim \text{DP}(\alpha_0, G_0)$$

with concentration parameter $\alpha_0$ and base distribution $G_0$.
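
To make the role of the concentration parameter and the base distribution concrete, the following is a minimal sketch of a truncated stick-breaking draw from a Dirichlet process, written in base R. It is not bayesma's implementation. The function name r_dp_stick, the truncation level n_atoms, and the standard normal base distribution are assumptions chosen for illustration, and the sketch draws effects only rather than the joint pair of effect and publication indicator.

```r
# Truncated stick-breaking approximation to G ~ DP(alpha0, G0).
# Assumption: G0 = Normal(0, 1) by default, for illustration only.
r_dp_stick <- function(alpha0, n_atoms = 200,
                       r_base = function(n) rnorm(n, mean = 0, sd = 1)) {
  v <- rbeta(n_atoms, 1, alpha0)             # stick-breaking fractions v_h
  w <- v * cumprod(c(1, 1 - v[-n_atoms]))    # w_h = v_h * prod_{l < h} (1 - v_l)
  w <- w / sum(w)                            # renormalise after truncation
  list(atoms = r_base(n_atoms), weights = w)
}

set.seed(1)
g <- r_dp_stick(alpha0 = 1)

# Draw 30 study-level effects from the discrete random measure G.
theta <- sample(g$atoms, size = 30, replace = TRUE, prob = g$weights)

# Ties among the draws are the clusters induced by the Dirichlet process.
n_clusters <- length(unique(theta))
```

Because $G$ is discrete with probability one, several studies share the same atom. Re-running the sketch with a larger alpha0 spreads the weight over more atoms and so tends to produce more distinct values.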

The selection mechanism enters through the publication probability:

$$P(\delta_i = 1 \mid \theta_i, s_i) = \pi(s_i, \theta_i)$$

where $s_i$ is the standard error of study $i$. In the BC-BNP model, $\pi$ is estimated non-parametrically as a function of the observed precision $1/s_i$ and the effect $\theta_i$.
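
A small simulation can show why a publication probability that depends on $\theta_i$ and $s_i$ distorts a naive pooled estimate. The sketch below is illustrative only. The logistic form of pub_prob, its coefficients, and the simulated effect and standard error distributions are all hypothetical, and the BC-BNP model does not assume any such parametric form.

```r
set.seed(42)

k_all <- 5000                                  # all studies, published or not
theta <- rnorm(k_all, mean = 0.1, sd = 0.2)    # true effects theta_i
s     <- runif(k_all, min = 0.05, max = 0.5)   # standard errors s_i
y     <- rnorm(k_all, mean = theta, sd = s)    # effect estimates y_i

# Hypothetical publication probability pi(s, theta), increasing in theta / s.
# Purely illustrative; not the mechanism estimated by the BC-BNP model.
pub_prob <- function(s, theta) plogis(-1 + 2 * theta / s)

delta <- rbinom(k_all, size = 1, prob = pub_prob(s, theta))  # 1 = published

naive_all       <- mean(y)              # mean over every simulated study
naive_published <- mean(y[delta == 1])  # mean over published studies only

c(all = naive_all, published = naive_published)
```

Under this rule, studies with larger true effects are more likely to be published, so the mean over published studies exceeds the mean over all studies. Bias correction aims to recover the latter from the former.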

The likelihood for the observed data is

$$p(\mathbf{y} \mid G) = \prod_{i=1}^{k} \int \mathcal{N}(y_i \mid \theta_i, s_i^2) \cdot \pi(s_i, \theta_i) \, dG(\theta_i)$$

Key estimands

The primary estimand is the mean of the marginal distribution of $\theta$ under $G$:

$$\mu_{\text{adj}} = \mathbb{E}_G[\theta]$$

This is the bias-corrected pooled effect. The BC-BNP model also provides a posterior predictive distribution for $\theta$ in a new study drawn from the same (corrected) distribution.

Concentration parameter

The Dirichlet process concentration $\alpha_0$ controls the prior expected number of distinct components in the mixture. Higher $\alpha_0$ allows more clusters. bayesma uses a $\text{Gamma}(1, 1)$ prior on $\alpha_0$ by default.
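
For a fixed $\alpha_0$ and $k$ studies, the prior expected number of distinct clusters under a Dirichlet process is $\sum_{i=1}^{k} \alpha_0 / (\alpha_0 + i - 1)$, a standard result. The base R sketch below evaluates it. The helper name expected_clusters is hypothetical and not part of bayesma, and the values of $\alpha_0$ and $k$ are chosen only for illustration.

```r
# Prior expected number of clusters among k draws from DP(alpha0, G0):
# E[K] = sum_{i = 1}^{k} alpha0 / (alpha0 + i - 1)
expected_clusters <- function(alpha0, k) {
  sum(alpha0 / (alpha0 + seq_len(k) - 1))
}

# Illustrative values of alpha0 for a meta-analysis of k = 20 studies.
alpha_grid <- c(1, 5, 10)
prior_k <- sapply(alpha_grid, expected_clusters, k = 20)
names(prior_k) <- paste0("alpha0 = ", alpha_grid)
prior_k
```

With $\alpha_0 = 1$ the sum reduces to the harmonic number $H_{20}$, about 3.6 clusters, and the expectation grows with $\alpha_0$.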

Fitting the BC-BNP model

fit_bcbnp <- bayesma(
  data,
  model_type    = "bc_bnp",
  alpha_prior   = gamma(1, 1),
  p_bias_prior  = beta(1, 1)
)

p_bias_prior sets the prior on the probability that any given study would have been suppressed. A $\text{Beta}(1, 1)$ prior is uniform on $[0, 1]$, so it does not favour any particular value of that probability.

Comparison with parametric models

| Property | BC-BNP | Selection model | PET-PEESE |
|---|---|---|---|
| Selection mechanism | Non-parametric | Piecewise weight function | Linear in SE |
| Effect distribution | Dirichlet process | Parametric (Normal, $t$) | Normal |
| Computational cost | High | Moderate | Low |
| Sensitivity to $k$ | High ($k > 20$ recommended) | Moderate | Low |

The BC-BNP model is the most flexible of the three. It is most informative in large meta-analyses ($k > 20$) where the non-parametric component can be identified from the data. In smaller meta-analyses, the posterior for $\mu_{\text{adj}}$ is sensitive to the priors on $\alpha_0$ and $\pi$.

Prior sensitivity

Because the Dirichlet process prior has a strong influence on the posterior when $k$ is moderate, a sensitivity analysis over $\alpha_0$ is recommended. Compare the bias-corrected estimate across:

fit_alpha1  <- bayesma(data, model_type = "bc_bnp", alpha_prior = gamma(1, 1))
fit_alpha5  <- bayesma(data, model_type = "bc_bnp", alpha_prior = gamma(1, 0.2))
fit_alpha10 <- bayesma(data, model_type = "bc_bnp", alpha_prior = gamma(1, 0.1))

If gamma() takes a shape and a rate, as the object names suggest, these priors have means 1, 5 and 10. If the bias-corrected estimate is stable across them, the conclusion is robust to the prior on $\alpha_0$.
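
One way to tabulate such a comparison is sketched below in base R. It assumes that posterior draws of $\mu_{\text{adj}}$ have already been extracted from each fit as numeric vectors. How to extract them from a bayesma object is not shown here and is not assumed. The helper name summarise_sensitivity is hypothetical, and the draws in the example are simulated placeholders, not results from any model.

```r
# Summarise posterior draws of mu_adj from several fits side by side.
# `draws` is a named list of numeric vectors, one per prior setting.
summarise_sensitivity <- function(draws) {
  t(sapply(draws, function(x) {
    c(mean = mean(x), quantile(x, probs = c(0.025, 0.975)))
  }))
}

# Placeholder draws only, standing in for draws extracted from real fits.
set.seed(123)
placeholder_draws <- list(
  alpha1  = rnorm(4000, mean = 0.20, sd = 0.05),
  alpha5  = rnorm(4000, mean = 0.21, sd = 0.05),
  alpha10 = rnorm(4000, mean = 0.22, sd = 0.06)
)

sens_table <- summarise_sensitivity(placeholder_draws)
sens_table
```

Each row gives the posterior mean and a 95% equal-tailed interval for one prior setting. Substantial overlap between rows would indicate a stable estimate.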