Skip to contents

Introduction

Selection models address publication bias by explicitly modelling the mechanism by which studies enter (or fail to enter) the published literature. Rather than adjusting the effect estimate via regression, selection models embed the publication process into the likelihood, correcting inference about μ\mu by down-weighting observations that are over-represented due to selection.

bayesma implements two complementary families:

  1. The Copas selection model — a latent-variable model in which selection depends on study precision.
  2. Vevea-Hedges weight-function models — a step-function model in which selection depends on the reported p-value.

Both are available through bayesma().

Copas selection model

Mechanism

Copas & Shi (2000) model each study as having been published if a latent selection variable exceeds zero:

Zi=γ0+γ1SEi+δi,δi𝒩(0,1) Z_i = \gamma_0 + \frac{\gamma_1}{SE_i} + \delta_i, \qquad \delta_i \sim \mathcal{N}(0, 1)

Study ii is observed if Zi>0Z_i > 0. The selection parameters (γ0,γ1)(\gamma_0, \gamma_1) control the shape of the selection mechanism:

  • γ0\gamma_0 is a baseline selection intercept (more negative → more severe selection overall).
  • γ1\gamma_1 governs how much easier large-precision (small-SESE) studies are to publish. Positive γ1\gamma_1 means precise studies are more likely to be published regardless of their result.

The selection variable ZiZ_i is correlated with the observed effect yiy_i through a joint bivariate normal:

(yiZi)𝒩((θiγ0+γ1/SEi),(si2ρsiρsi1)) \begin{pmatrix} y_i \\ Z_i \end{pmatrix} \sim \mathcal{N}\!\left(\begin{pmatrix} \theta_i \\ \gamma_0 + \gamma_1 / SE_i \end{pmatrix},\; \begin{pmatrix} s_i^2 & \rho\,s_i \\ \rho\,s_i & 1 \end{pmatrix}\right)

The correlation ρ\rho captures the tendency for studies with large effects to be published (effect-dependent selection). Under the null ρ=0\rho = 0, selection depends only on precision, not on the effect magnitude.

Priors

μ𝒩(0,1),τHalf-Cauchy(0,0.5) \mu \sim \mathcal{N}(0, 1), \qquad \tau \sim \text{Half-Cauchy}(0, 0.5)

γ0,γ1𝒩(0,1),ρUniform(1,1) \gamma_0, \gamma_1 \sim \mathcal{N}(0, 1), \qquad \rho \sim \text{Uniform}(-1, 1)

Informative priors can be supplied via prior_copas():

#| eval: false
fit_copas <- bayesma(
  data,
  model        = "copas",
  prior_copas  = prior_copas(gamma0 = normal(0, 1), rho = uniform(-1, 0))
)

Restricting ρ0\rho \leq 0 encodes the expectation that publication bias inflates effects upward.

Vevea-Hedges weight-function models

Mechanism

Vevea & Hedges (1995) model selection as a step function of the two-tailed p-value. Studies are retained with probability ωj\omega_j if their p-value falls in interval jj:

p(study observedpi)=ωj,pi(aj1,aj] p(\text{study observed} \mid p_i) = \omega_j, \quad p_i \in (a_{j-1},\; a_j]

The likelihood is reweighted so that over-represented p-value regions (e.g., p<.05p < .05) contribute proportionally less to the pooled estimate.

bayesma implements the one-sided step function by default, with cut-points at .025,.05,.10,.25,.50,1.0.025,\; .05,\; .10,\; .25,\; .50,\; 1.0:

ω1ω2ω6=1 \omega_1 \geq \omega_2 \geq \cdots \geq \omega_6 = 1

The final weight ω6=1\omega_6 = 1 is fixed as the reference (non-significant studies), and each ωj\omega_j gives the relative probability that a study in p-value category jj is published compared to a study with p>.50p > .50.

Priors

The weight parameters are assigned a Dirichlet prior to ensure proper identification:

(ω1,,ω5)Dirichlet(𝟏) (\omega_1, \ldots, \omega_5) \sim \text{Dirichlet}(\mathbf{1})

Custom weights can be specified via prior_weight_function().

One-sided vs. two-sided step function

#| eval: false
fit_wf1 <- bayesma(
  data,
  model       = "weight_function",
  weight_type = "one_sided"   # default
)

fit_wf2 <- bayesma(
  data,
  model       = "weight_function",
  weight_type = "two_sided"
)

The two-sided version uses the absolute p-value and is more appropriate when effects can be in either direction.

Fitting and comparing selection models

#| eval: false
fit_re    <- bayesma(data)
fit_copas <- bayesma(data, model = "copas")
fit_wf    <- bayesma(data, model = "weight_function")

compare_models(fit_re, fit_copas, fit_wf,
               labels = c("Random-effects", "Copas", "Weight function"))

compare_models() overlays the posteriors for μ\mu under each model, making it easy to see how much the publication-bias correction shifts the estimate and whether the two selection models agree.

Interpreting the output

For the Copas model, summary() returns:

Parameter Interpretation
mu Bias-corrected pooled effect
tau Between-study heterogeneity
gamma0 Selection intercept
gamma1 Precision-dependent selection
rho Effect-dependent selection correlation

For the weight-function model:

Parameter Interpretation
mu Bias-corrected pooled effect
tau Between-study heterogeneity
omega_j Relative publication probability for p-value category jj

Limitations

  • Both models require sufficient studies (k20k \geq 20 is a practical minimum for the weight-function model; Copas can work with fewer but is sensitive to the prior on ρ\rho).
  • The Copas model assumes a specific functional form for the selection mechanism. If the true mechanism differs, the correction can be imprecise.
  • Weight-function models assume that selection operates exclusively through p-values. Selection based on effect size direction or magnitude (without reference to p-values) is not captured.
  • Neither model identifies publication bias if the filed studies are completely missing (i.e., the selection mechanism is so strong that no unpublished studies share observable characteristics with the published set).

For a model-averaged approach that includes both PET-PEESE and weight functions as bias components, see Robust Bayesian Meta-Analysis (RoBMA).

References

Copas J, Shi JQ (2000). Meta-analysis, funnel plots and sensitivity analysis. Biostatistics, 1(3), 247–262.

Vevea JL, Hedges LV (1995). A general linear model for estimating effect size in the presence of publication bias. Psychometrika, 60(3), 419–435.