6.2 Meta-Analysis
Meta-analysis aims to pool the data from several studies, such as the application of a tutoring program in several schools or treatment using a drug in several clinical trials.
The Bayesian framework is particularly convenient for meta-analysis, because each previous study can be treated as providing a noisy measurement of some underlying quantity of interest. The model then follows directly from two components, a prior on the underlying quantities of interest and a measurement-error style model for each of the studies being analyzed.
Treatment Effects in Controlled Studies
Suppose the data in question arise from a total of \(M\) studies providing paired binomial data for a treatment and control group. For instance, the data might be post-surgical pain reduction under a treatment of ibuprofen Warn, Thompson, and Spiegelhalter (2002) or mortality after myocardial infarction under a treatment of beta blockers (Gelman et al. 2013, Section 5.6).
Data
The clinical data consists of \(J\) trials, each with \(n^t\) treatment cases, \(n^c\) control cases, \(r^t\) successful outcomes among those treated and \(r^c\) successful outcomes among those in the control group. This data can be declared in Stan as follows.13
data {
int<lower=0> J;
int<lower=0> n_t[J]; // num cases, treatment
int<lower=0> r_t[J]; // num successes, treatment
int<lower=0> n_c[J]; // num cases, control
int<lower=0> r_c[J]; // num successes, control
}
Converting to Log Odds and Standard Error
Although the clinical trial data are binomial in its raw format, it may be transformed to an unbounded scale by considering the log odds ratio \[ y_j = \log \left( \frac{r^t_j / (n^t_j - r^t_j)} {r^c_j / (n^c_j - r^c_j)} \right) \ \ = \ \ \log \left( \frac{r^t_j}{n^t_j - r^t_j} \right) - \log \left( \frac{r^c_j}{n^c_j - r^c_j} \right) \] and corresponding standard errors \[ \sigma_j = \sqrt{ \frac{1}{r^T_i} + \frac{1}{n^T_i - r^T_i} + \frac{1}{r^C_i} + \frac{1}{n^C_i - r^C_i} }. \]
The log odds and standard errors can be defined in a transformed parameter block, though care must be taken not to use integer division.14
transformed data {
real y[J];
real<lower=0> sigma[J];
for (j in 1:J)
y[j] = log(r_t[j]) - log(n_t[j] - r_t[j])
- (log(r_c[j]) - log(n_c[j] - r_c[j]);
for (j in 1:J)
sigma[j] = sqrt(1 / r_t[j] + 1 / (n_t[j] - r_t[j])
+ 1 / r_c[j] + 1 / (n_c[j] - r_c[j]));
}
This definition will be problematic if any of the success counts is zero or equal to the number of trials. If that arises, a direct binomial model will be required or other transforms must be used than the unregularized sample log odds.
Non-Hierarchical Model
With the transformed data in hand, two standard forms of meta-analysis can be applied. The first is a so-called “fixed effects” model, which assumes a single parameter for the global odds ratio. This model is coded in Stan as follows.
parameters {
real theta; // global treatment effect, log odds
}
model {
y ~ normal(theta, sigma);
}
The sampling statement for y
is vectorized; it has the same
effect as the following.
for (j in 1:J)
y[j] ~ normal(theta, sigma[j]);
It is common to include a prior for theta
in this model, but it
is not strictly necessary for the model to be proper because y
is fixed and \(\mathsf{normal}(y|\mu,\sigma) = \mathsf{normal}(\mu|y,\sigma)\).
Hierarchical Model
To model so-called “random effects,” where the treatment effect may vary by clinical trial, a hierarchical model can be used. The parameters include per-trial treatment effects and the hierarchical prior parameters, which will be estimated along with other unknown quantities.
parameters {
real theta[J]; // per-trial treatment effect
real mu; // mean treatment effect
real<lower=0> tau; // deviation of treatment effects
}
model {
y ~ normal(theta, sigma);
theta ~ normal(mu, tau);
mu ~ normal(0, 10);
tau ~ cauchy(0, 5);
}
Although the vectorized sampling statement for y
appears
unchanged, the parameter theta
is now a vector. The sampling
statement for theta
is also vectorized, with the
hyperparameters mu
and tau
themselves being given wide
priors compared to the scale of the data.
Rubin (1981) provided a hierarchical Bayesian meta-analysis of the treatment effect of Scholastic Aptitude Test (SAT) coaching in eight schools based on the sample treatment effect and standard error in each school.
Extensions and Alternatives
Smith, Spiegelhalter, and Thomas (1995) and (Gelman et al. 2013, Section 19.4) provides meta-analyses based directly on binomial data. Warn, Thompson, and Spiegelhalter (2002) consider the modeling implications of using alternatives to the log-odds ratio in transforming the binomial data.
If trial-specific predictors are available, these can be included directly in a regression model for the per-trial treatment effects \(\theta_j\).
References
Gelman, Andrew, J. B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. Third. London: Chapman &Hall/CRC Press.
Rubin, Donald B. 1981. “Estimation in Parallel Randomized Experiments.” Journal of Educational Statistics 6: 377–401.
Smith, Teresa C., David J. Spiegelhalter, and Andrew Thomas. 1995. “Bayesian Approaches to Random-Effects Meta-Analysis: A Comparative Study.” Statistics in Medicine 14 (24): 2685–99.
Warn, David E., S. G. Thompson, and David J. Spiegelhalter. 2002. “Bayesian Random Effects Meta-Analysis of Trials with Binary Outcomes: Methods for the Absolute Risk Difference and Relative Risk Scales.” Statistics in Medicine 21: 1601–23.