NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Dahabreh IJ, Trikalinos TA, Lau J, et al. An Empirical Assessment of Bivariate Methods for Meta-Analysis of Test Accuracy [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2012 Nov.


Appendix D. Worked Meta-Analysis Example

Here we present a worked meta-analysis example using several of the methods we employed in the report. In addition to the results presented in the main text of the report, we present additional model diagnostics for maximum likelihood and Bayesian methods. Our data were derived from Table 1 of Arends et al. (Med Decis Making, 2008) and pertain to the test performance of aspiration cytologic examination of the breast for the detection of cancer. The original source of the data is Giard and Hermans (Cancer, 1992). We reproduce the data in Table D-1.

Table D-1. Example meta-analysis data.


We used Stata version IC/12 (Stata Corp., College Station, TX) to implement all non-Bayesian analyses presented in the report; nonetheless, the code should be easy to translate into other statistical packages that provide maximum likelihood implementations for general and generalized linear models. To run our code, users will need a version of Stata that includes commands for mixed effects generalized linear models (version 10 or later), as well as the user-contributed packages metan, metandi, metareg, and mvmeta. Our data are saved in a file named example.dta.
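As a minimal setup sketch, the user-contributed commands can usually be installed from within Stata. The lines below assume that all four packages are available from the SSC archive and that example.dta holds the four cell counts under the variable names used in the code that follows (TP, FP, FN, TN). Current package versions may differ from those used for the report, so options and output may not match exactly.

```stata
/* one-time installation of the user-contributed commands (assumes SSC access) */
/* if a package is not found on SSC, "findit <name>" lists other sources */
ssc install metan
ssc install metandi
ssc install metareg
ssc install mvmeta

/* check that the data set contains the four numeric cell counts */
use example.dta, clear
confirm numeric variable TP FP FN TN
describe TP FP FN TN
```

If confirm stops with an error, the variable names in the data set do not match those assumed by the code and need to be renamed first.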

/* enter the data */
use example.dta, clear

/* some data transformations */
/* logit sensitivity, with continuity correction */
generate b1 = logit(TP/(TP + FN))
replace b1 = logit((TP+0.5)/(TP+FN+2*0.5)) if TN == 0 | FP == 0 | TP == 0 | FN == 0
generate V11 = 1/TP + 1/FN if TP != 0 & FN != 0
replace V11 = 1/(TP+0.5) + 1/(FN+0.5) if TN == 0 | FP == 0 | TP == 0 | FN == 0
generate se_b1 = sqrt(V11)

/* logit specificity, with continuity correction */
generate b2 = logit(TN/(TN+FP))
replace b2 = logit((TN+0.5)/(TN+FP+2*0.5)) if TN == 0 | FP == 0 | TP == 0 | FN == 0
generate V22 = 1/TN + 1/FP
replace V22 = 1/(TN + 0.5) + 1/(FP + 0.5) if TN == 0 | FP == 0 | TP == 0 | FN == 0
generate se_b2 = sqrt(V22)

/***********************************************************/
/* univariate meta-analyses of sensitivity and specificity */
/***********************************************************/
/* normal within-study likelihood */
/* FE inverse variance */
metan b1 se_b1, fixedi nograph z /* summary logit sensitivity */
metan b2 se_b2, fixedi nograph z /* summary logit specificity */

/* normal within-study likelihood */
/* DerSimonian-Laird method for RE */
metan b1 se_b1, randomi nograph z /* summary logit sensitivity */
metan b2 se_b2, randomi nograph z /* summary logit specificity */

/* normal within-study likelihood */
/* REML estimation for RE */
metareg b1, wsse(se_b1) reml z /* summary logit sensitivity */
metareg b2, wsse(se_b2) reml z /* summary logit specificity */

/* exact within-study likelihood */
/* ML for random effects */
generate id = _n /* study id */
generate disease = TP + FN /* total individuals with disease */
xtmelogit TP || id:, binomial(disease) intp(5) /* intercept-only model */

/* exact within-study likelihood */
/* ML for random effects */
generate healthy = FP + TN /* total individuals without disease */
xtmelogit TN || id:, binomial(healthy) intp(5) /* intercept-only model */

/***********************************************************/
/* bivariate meta-analyses of sensitivity and specificity */
/***********************************************************/
/* normal within-study likelihood */
/* generalized DerSimonian-Laird method for RE */
mvmeta b V, corr(0) mm /* the within-study correlation is set to zero */

/* normal within-study likelihood */
/* REML estimation for RE */
mvmeta b V, corr(0) reml /* the within-study correlation is set to zero */

/* exact within-study likelihood */
/* ML for random effects */
metandi TP FP FN TN

/**************/
/* ROC curves */
/**************/

/* sROC unweighted */
/* b2 is logit specificity, so logit(FPR) = logit(1 - specificity) = -b2 */
generate D = b1 + b2 /* logit sensitivity minus logit FPR */
generate S = b1 - b2 /* logit sensitivity plus logit FPR */
regress D S /* unweighted Moses-Littenberg model */

/*save estimates to obtain the graph*/
matrix estimates = e(b)
 local beta_sroc_unweighted = estimates[1,1]
 local alpha_sroc_unweighted = estimates[1,2]
 local a_un = `alpha_sroc_unweighted'/(1 - `beta_sroc_unweighted')
 local b_un = (1 + `beta_sroc_unweighted')/(1 - `beta_sroc_unweighted')

/* sROC weighted */
/* obtain the weights, with continuity correction if needed */
generate se = sqrt(1/TP + 1/FP + 1/FN + 1/TN)
replace se = sqrt(1/(TP+0.5) + 1/(FP+0.5) + 1/(FN+0.5) + 1/(TN+0.5)) if se == .
vwls D S, sd(se) /* weighted Moses-Littenberg model */

matrix estimates = e(b)
 local beta_sroc_weighted = estimates[1,1]
 local alpha_sroc_weighted = estimates[1,2]
 local a_w = `alpha_sroc_weighted'/(1 - `beta_sroc_weighted')
 local b_w = (1 + `beta_sroc_weighted')/(1 - `beta_sroc_weighted')

/* joint graph for comparison */
graph two (function y = invlogit(`a_un' + `b_un' * logit(x)), ///
           lcol(black) lpat(dash) n(1000) range(0 1)) ///
          (function y = invlogit(`a_w' + `b_w' * logit(x)), ///
           lcol(black) n(1000) range(0 1)) ///
          ||, ylabel(0 0.2 0.4 0.6 0.8 1.0) ///
           aspectratio(1) scheme(s1mono) ///
           plotregion(style(none)) ///
           xtitle("1 - specificity") ///
           ytitle("sensitivity") ///
           legend(off)

After the last command, we obtain Figure D-1.

This is a plot of 2 summary receiver operating characteristic curves for the example meta-analysis presented in Table D-1. The graph shows that the curves “track together” throughout the receiver operating characteristic space, indicating that the methods compared produce similar results.

Figure D-1

SROC curves for the example in Table D-1.

The dashed curve is derived from the unweighted analysis. The solid line is derived from the weighted analysis.
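The intercept and slope used to draw the curves in Figure D-1 (a_un, b_un, a_w, and b_w in the code) follow from rearranging the Moses-Littenberg regression. The short derivation below is a sketch in notation introduced here for illustration: η denotes logit sensitivity, ξ denotes logit(1 − specificity), and α and β are the intercept and slope of the regression of D on S.

```latex
\begin{align*}
D &= \eta - \xi, \qquad S = \eta + \xi, \qquad
\xi = \operatorname{logit}(1 - \text{specificity}) = -\operatorname{logit}(\text{specificity}),\\
D &= \alpha + \beta S
  \;\Longrightarrow\; \eta - \xi = \alpha + \beta(\eta + \xi)
  \;\Longrightarrow\; (1 - \beta)\,\eta = \alpha + (1 + \beta)\,\xi,\\
\eta &= \underbrace{\frac{\alpha}{1 - \beta}}_{a}
      + \underbrace{\frac{1 + \beta}{1 - \beta}}_{b}\,\xi,
\qquad
\text{sensitivity} = \operatorname{logit}^{-1}\!\bigl(a + b\,\operatorname{logit}(1 - \text{specificity})\bigr).
\end{align*}
```

The rearrangement requires β ≠ 1. When β = 0 the slope b equals 1 and the curve is symmetric, with a constant log diagnostic odds ratio equal to α.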

/* use mixed effects logistic regression to fit the bivariate model */
/* then use the estimates to fit different curves as in Arends et al. 2008*/

use example.dta, clear

generate persons = _n
generate n1 = TP + FN
generate n0 = TN + FP
generate detect1 = TP
generate detect0 = TN
reshape long n detect, i(persons) j(d1)
generate d0 = -(1 - d1)/* data transformation to replicate analyses in Arends et al. 2008 */

/* fit the bivariate model */
xtmelogit detect d1 d0, nocons || persons: d1 d0, ///
 nocons covariance(un) ///
 binomial(n) diff intp(10) refineopts(iterate(3))
 matrix estimates = e(b)
 matrix variances = e(V)

 local mean_se = estimates[1,1]
 local mean_sp = estimates[1,2]

nlcom exp(2 * [lns1_1_1]_b[_cons])
 matrix var_mean_se = r(b)
 local var_mean_se = var_mean_se[1,1]
 nlcom exp(2 * [lns1_1_2]_b[_cons])
 matrix var_mean_sp = r(b)
 local var_mean_sp = var_mean_sp[1,1]

nlcom exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons])
 matrix estimates = r(b)
 local cov_se_sp = estimates[1,1]

/* now obtain and store estimates for the 5 ROC curves */
/* eta on ksi */
nlcom (exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons]))/exp(2 * [lns1_1_2]_b[_cons])
 matrix estimate = r(b)
 local beta_eta_on_ksi = estimate[1,1]
nlcom _b[d1] - exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons])/exp(2 * [lns1_1_2]_b[_cons]) * _b[d0]
 matrix estimate = r(b)
 local alpha_eta_on_ksi = estimate[1,1]

/*ksi on eta*/
nlcom exp(2 * [lns1_1_1]_b[_cons])/((exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons])))
 matrix estimate = r(b)
 local beta_ksi_on_eta = estimate[1,1]
nlcom _b[d1] - exp(2 * [lns1_1_1]_b[_cons])/(exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons])) * _b[d0]
 matrix estimate = r(b)
 local alpha_ksi_on_eta = estimate[1,1]

/* D on S */
nlcom (exp(2 * [lns1_1_1]_b[_cons]) + (exp([lns1_1_1]_b[_cons]) * ///
 exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons])))/(exp(2 * [lns1_1_2]_b[_cons]) + ///
 (exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons])))
 matrix estimate = r(b)
 local beta_d_on_s = estimate[1,1]
nlcom _b[d1] - (exp(2 * [lns1_1_1]_b[_cons]) + (exp([lns1_1_1]_b[_cons]) * ///
 exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons])))/(exp(2 * ///
 [lns1_1_2]_b[_cons]) + (exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons]))) * _b[d0]
 matrix estimate = r(b)
 local alpha_d_on_s = estimate[1,1]

/* R & G */
nlcom sqrt(exp(2 * [lns1_1_1]_b[_cons]))/sqrt(exp(2 * [lns1_1_2]_b[_cons]))
   matrix estimate = r(b)
   local beta_r_g = estimate[1,1]
nlcom _b[d1] - sqrt(exp(2 * [lns1_1_1]_b[_cons]))/sqrt(exp(2 * [lns1_1_2]_b[_cons]))* _b[d0]
   matrix estimate = r(b)
   local alpha_r_g = estimate[1,1]

/* MAR */
nlcom (exp(2 * [lns1_1_1]_b[_cons]) - exp(2 * [lns1_1_2]_b[_cons]) + ///
 sqrt((exp(2 * [lns1_1_1]_b[_cons]) - exp(2 * [lns1_1_2]_b[_cons]))^2 + ///
 4*(exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons]))^2))/ ///
 (2*exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons]))
    matrix estimate = r(b)
    local beta_mar = estimate[1,1]

nlcom _b[d1] - _b[d0]*((exp(2*[lns1_1_1]_b[_cons]) - ///
 exp(2*[lns1_1_2]_b[_cons]) + sqrt((exp(2*[lns1_1_1]_b[_cons]) - exp(2*[lns1_1_2]_b[_cons]))^2 + ///
 4*(exp([lns1_1_1]_b[_cons])*exp([lns1_1_2]_b[_cons])*tanh([atr1_1_1_2]_b[_cons]))^2))/ ///
 (2*exp([lns1_1_1]_b[_cons]) * exp([lns1_1_2]_b[_cons]) * tanh([atr1_1_1_2]_b[_cons])))
    matrix estimate = r(b)
    local alpha_mar = estimate[1,1]
/* plot the curves */
/* ROC space */
graph two (function y = invlogit(`alpha_eta_on_ksi' + `beta_eta_on_ksi'*logit(x)), ///
    lcol(green) range(0 1) n(1000)) ///
  (function y = invlogit(`alpha_ksi_on_eta' + `beta_ksi_on_eta'*logit(x)), ///
    lcol(red) range(0 1) n(1000)) ///
  (function y = invlogit(`alpha_d_on_s' + `beta_d_on_s'*logit(x)), ///
    lcol(blue) range(0 1) n(1000)) ///
  (function y = invlogit(`alpha_r_g' + `beta_r_g'*logit(x)), ///
    lcol(orange) range(0 1) n(1000)) ///
  (function y = invlogit(`alpha_mar' + `beta_mar'*logit(x)), ///
    lcol(purple) range(0 1) n(1000)) ///
  ||, scheme(s1mono) plotregion(style(none)) ///
    aspectratio(1) ///
    xtitle(" " "1 - Specificity", size(*0.7)) ytitle(" " "Sensitivity", size(*0.7)) ///
    xlabel(0 "0" 0.2 "0.2" 0.4 "0.4" 0.6 "0.6" 0.8 "0.8" 1 "1.0", labsize(*0.7)) ///
    ylabel(0 "0" 0.2 "0.2" 0.4 "0.4" 0.6 "0.6" 0.8 "0.8" 1 "1.0", angle(0) labsize(*0.7)) ///
    legend(lab(1 " {&eta}~{&xi}") lab(2 "{&xi}~{&eta}") lab(3 "D~S") lab(4 "R & G") lab(5 "MAR"))

After the last command, we obtain Figure D-2.

This is a plot of 5 summary receiver operating characteristic curves based on the bivariate meta-analysis model. In this example (the same presented in Table D-1 and Figure D-1), the curves from the five methods “track together” throughout the receiver operating characteristic space, indicating that the methods compared produce fairly similar, but not identical, estimates of the curve parameters. Please see the main text of the report for a discussion of why the curves may produce different results.

Figure D-2

Alternative HSROC curves for the example in Table D-1.
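The five curves in Figure D-2 are all straight lines in logit space, η = α + βξ, and differ only in the slope β. The block below restates what the nlcom expressions in the code compute. The symbols are introduced here for illustration: μ_η and μ_ξ stand for the fixed effects _b[d1] and _b[d0], and σ_η, σ_ξ, and ρ for the between-study standard deviations and correlation on the scale on which xtmelogit stores them.

```latex
\begin{align*}
\sigma_\eta &= \exp(\texttt{lns1\_1\_1}), \quad
\sigma_\xi = \exp(\texttt{lns1\_1\_2}), \quad
\rho = \tanh(\texttt{atr1\_1\_1\_2}), \quad
\sigma_{\eta\xi} = \rho\,\sigma_\eta\,\sigma_\xi,\\[4pt]
\beta_{\eta \sim \xi} &= \frac{\sigma_{\eta\xi}}{\sigma_\xi^2}, \qquad
\beta_{\xi \sim \eta} = \frac{\sigma_\eta^2}{\sigma_{\eta\xi}}, \qquad
\beta_{D \sim S} = \frac{\sigma_\eta^2 + \sigma_{\eta\xi}}{\sigma_\xi^2 + \sigma_{\eta\xi}},\\[4pt]
\beta_{\text{R\&G}} &= \frac{\sigma_\eta}{\sigma_\xi}, \qquad
\beta_{\text{MAR}} = \frac{\sigma_\eta^2 - \sigma_\xi^2
   + \sqrt{(\sigma_\eta^2 - \sigma_\xi^2)^2 + 4\,\sigma_{\eta\xi}^2}}{2\,\sigma_{\eta\xi}},\\[4pt]
\alpha &= \mu_\eta - \beta\,\mu_\xi \quad \text{for each choice of } \beta .
\end{align*}
```

Because α = μ_η − βμ_ξ in every case, all five lines pass through the summary point (μ_ξ, μ_η). If ρ = 1 the five slopes all reduce to σ_η/σ_ξ, whereas the ξ~η and MAR slopes are undefined when σ_ηξ = 0.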

/* logit space */
graph two (function y = (`alpha_eta_on_ksi' + `beta_eta_on_ksi'*(x)), ///
    lcol(green) range(-6 1) n(1000)) ///
 (function y = (`alpha_ksi_on_eta' + `beta_ksi_on_eta'*(x)), lcol(red) range(-6 1) n(1000)) ///
 (function y = (`alpha_d_on_s' + `beta_d_on_s'*(x)), lcol(blue) range(-6 1) n(1000)) ///
 (function y = (`alpha_r_g' + `beta_r_g'*(x)), lcol(orange) range(-6 1) n(1000)) ///
 (function y = (`alpha_mar' + `beta_mar'*(x)), lcol(purple) range(-6 1) n(1000)) ///
 ||, scheme(s1mono) plotregion(style(none)) ///
     xtitle(" " "logit(1 - specificity)", size(*0.7)) ytitle(" " "logit(sensitivity)", size(*0.7)) ///
     legend(lab(1 " {&eta}~{&xi}") lab(2 "{&xi}~{&eta}") lab(3 "D~S") lab(4 "R & G") lab(5 "MAR")) ///
     aspectratio(1)

After the last command, we obtain Figure D-3.

This is a plot of 5 summary receiver operating characteristic curves based on the bivariate meta-analysis model; these are the same curves shown in Figure D-2 but plotted in the logit space. Because of the transformation to the logit scale, this graph emphasizes that the alternative parameterizations of the summary receiver operating characteristic curve can result in different intercepts and slopes for the fitted curve. Please see the main text of the report for a discussion of why these differences occur.

Figure D-3

Alternative HSROC curves for the example in Table D-1 (logit space).
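The bivariate model fitted with xtmelogit reports its results on the logit and log-standard-deviation scales. As a hedged sketch (not part of the original analysis code), the lines below back-transform them to the probability scale. They assume that the xtmelogit fit is still the active estimation result and that _b[d0] estimates logit(1 − specificity), as implied by the coding of d0 in the data transformation. The local macro names are introduced here for illustration.

```stata
/* between-study standard deviations and correlation on their natural scales */
nlcom (sd_eta: exp([lns1_1_1]_b[_cons])) ///
      (sd_ksi: exp([lns1_1_2]_b[_cons])) ///
      (rho: tanh([atr1_1_1_2]_b[_cons]))

local z = invnormal(0.975)

/* summary sensitivity: 95% CI built on the logit scale, then back-transformed */
lincom _b[d1]
local sens    = invlogit(r(estimate))
local sens_lo = invlogit(r(estimate) - `z'*r(se))
local sens_hi = invlogit(r(estimate) + `z'*r(se))
display "summary sensitivity: `sens' (95% CI `sens_lo' to `sens_hi')"

/* summary specificity = 1 - invlogit(logit FPR); the CI limits swap places */
lincom _b[d0]
local spec    = 1 - invlogit(r(estimate))
local spec_lo = 1 - invlogit(r(estimate) + `z'*r(se))
local spec_hi = 1 - invlogit(r(estimate) - `z'*r(se))
display "summary specificity: `spec' (95% CI `spec_lo' to `spec_hi')"
```

Back-transforming the limits of the logit-scale interval keeps the confidence limits inside the interval from 0 to 1, which a delta-method interval computed directly on the probability scale does not guarantee.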

For Bayesian analyses, users should use the model presented in Appendix B. In this example, we ran three chains initialized using different starting values. Figure D-4 shows trace plots of the last 10,000 of the 20,000 burn-in iterations for logit-sensitivity, logit-specificity, and the between-study correlation; other parameters need to be monitored as well.

This figure presents trace plots (a kind of line graph) for 3 model parameters of the Bayesian analysis (as described in the text). For each parameter 3 chains have been initialized. The chains – for all 3 parameters – are mixing well, indicating that the algorithm has converged successfully.

Figure D-4

Trace plots for summary sensitivity, specificity, and correlation.

Dashed lines indicate the medians of the posterior distributions.

Convergence was also assessed with the Gelman-Rubin diagnostic. In this example, after 20,000 iterations, the final median values of the statistic satisfied 0.99 < R < 1.01 for logit-sensitivity, logit-specificity, the between-study correlation, and the between-study variances of sensitivity and specificity, indicating that the model had converged. After convergence, we ran the model for an additional 10,000 iterations and used the results to obtain density plots and summary statistics for the parameters of interest. For example, we obtained the following density plots for the posterior distributions of the summary sensitivity, specificity, and their correlation (Figure D-5).

This figure presents posterior density plots for three model parameters of the Bayesian analysis (as described in the text). All three posterior distributions are unimodal but non-symmetric, indicating that the Bayesian 95% credibility intervals may better reflect uncertainty in the model estimates (compared to the 95% confidence intervals obtained from maximum likelihood analyses).

Figure D-5

Posterior densities for sensitivity, specificity, and correlation. Red lines are kernel densities.
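For reference, the Gelman-Rubin statistic mentioned above compares the variability between chains with the variability within chains. The block below gives the original variance-based form as a sketch. The report does not state which variant was computed, and software implementations may use modified versions (for example, ones based on interval widths). The notation is introduced here for illustration: m chains (here m = 3) of n retained iterations each, with θ_ij the i-th draw of a parameter in chain j.

```latex
\begin{align*}
W &= \frac{1}{m}\sum_{j=1}^{m} s_j^2, \qquad
s_j^2 = \frac{1}{n-1}\sum_{i=1}^{n}\bigl(\theta_{ij} - \bar\theta_{\cdot j}\bigr)^2,\\
B &= \frac{n}{m-1}\sum_{j=1}^{m}\bigl(\bar\theta_{\cdot j} - \bar\theta_{\cdot\cdot}\bigr)^2,\\
\widehat{\operatorname{var}}^{+}(\theta) &= \frac{n-1}{n}\,W + \frac{1}{n}\,B, \qquad
\widehat{R} = \sqrt{\frac{\widehat{\operatorname{var}}^{+}(\theta)}{W}} .
\end{align*}
```

Values of the statistic close to 1 indicate that the chains are sampling from essentially the same distribution, which is why values between 0.99 and 1.01 were read as evidence of convergence.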

For comparison, results from all meta-analysis methods for this example are summarized in Table D-2.

Table D-2. Summary results from all meta-analysis methods (for the example in Table D-1).


Bookshelf ID: NBK115739
