Model description
The Copas selection model (Copas & Shi, 2001) models publication probability as a function of study precision (sample size). Smaller studies — which have higher standard errors — are less likely to be published than larger studies. The model estimates both the pooled effect and the selection mechanism jointly.
Mathematical specification
Outcome model:
Selection mechanism:
where is the standard normal CDF, controls the baseline publication probability, and controls the association between precision () and publication.
Joint bivariate model:
The outcome and the selection indicator are jointly modelled as a bivariate normal:
where is the correlation between the true outcome and the latent propensity to publish, and .
The observed-data likelihood (conditioning on ) is:
where .
Priors:
Stan code
data {
int<lower=1> N;
vector[N] y;
vector<lower=0>[N] se;
}
parameters {
real mu;
real<lower=0> tau;
real gamma0;
real<lower=0> gamma1;
real<lower=-1, upper=1> rho;
}
model {
target += normal_lpdf(mu | 0, 1);
target += cauchy_lpdf(tau | 0, 0.5);
target += normal_lpdf(gamma0 | 0, 1);
target += normal_lpdf(gamma1 | 0, 1);
target += uniform_lpdf(rho | -1, 1);
for (i in 1:N) {
real sigma_i = sqrt(square(se[i]) + square(tau));
real z_sel = gamma0 + gamma1 / se[i];
real z_adj = (z_sel + rho * (y[i] - mu) / sigma_i) / sqrt(1 - square(rho));
target += normal_lpdf(y[i] | mu, sigma_i) + normal_lcdf(z_adj | 0, 1)
- normal_lcdf(z_sel | 0, 1);
}
}
generated quantities {
real b_Intercept = mu;
}How bayesma calls this model
bayesma(
data,
model_type = "selection_copas",
gamma_prior = list(gamma0 = normal(0, 1), gamma1 = half_normal(0, 1))
)Parameterisation notes
- enforces the direction constraint: larger studies (smaller ) are more likely to be published.
- captures the relationship between effect magnitude and publication. Positive means larger effects are more likely to be published (a common publication bias mechanism).
- The normalising denominator
normal_lcdf(z_sel | 0, 1)corrects for the fact that only published studies are observed.
Sensitivity analysis
The Copas model is not identified without prior information on either or . bayesma provides a sensitivity plot over a grid of values:
sensitivity_plot(fit_copas, type = "copas_grid")Known sampling difficulties
The Copas likelihood is non-convex and can have multiple local modes in the space. Use at least 4 chains, inspect all traces, and run with adapt_delta = 0.99.
