Systematic reviews of diagnostic test accuracy.
Aertgeerts B, Altman D, Antes G, Bachmann L, Bossuyt P, Buchner H, Bunting P, Buntinx F, Craig J, D'Amico R, de Vet R, Deeks J, Doust R, Egger M, Eisinga A, Filippini G, Falck-Ytter Y, Gatsonis C, Glas A, Glasziou P, Grossenbacher F, Harbord R, Hilden J, Hooft L, Horvath A, Hyde C, Irwig L, Kjeldstrøm M, Macaskill P, Mallett S, Mitchell R, Moore T, Moustgaard R, Oosterhuis W, Pai M, Paliwal P, Pewsner D, Reitsma H, Riis J, Riphagen I, Rutjes A, Scholten R, Smidt N, Sterne J, Takwoingi Y, van der Windt D, Vlassov V, Watine J, Whiting P.
- Dutch Cochrane Centre and Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands.
Abstract
More and more systematic reviews of diagnostic test accuracy studies are being published, but they can be methodologically challenging. In this paper, the authors present some of the recent developments in the methodology for conducting systematic reviews of diagnostic test accuracy studies. Restrictive electronic search filters are discouraged, as is the use of summary quality scores. Methods for meta-analysis should take into account the paired nature of the estimates and their dependence on threshold. Authors of these reviews are advised to use the hierarchical summary receiver-operating characteristic or the bivariate model for the data analysis. Challenges that remain are the poor reporting of original diagnostic test accuracy studies and difficulties with the interpretation of the results of diagnostic test accuracy research.
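The paired analysis the abstract recommends starts from each study's 2×2 table. The sketch below (with purely illustrative counts, not data from this review) computes the per-study logit-transformed sensitivity and specificity pairs, together with their within-study variances, which are the usual inputs to a bivariate random-effects model; fitting the full model, which adds between-study variances and their correlation, requires a mixed-model package and is not shown here.

```python
import math

# Hypothetical 2x2 counts per study: (TP, FP, FN, TN).
# Illustrative numbers only; not taken from the review.
studies = [(45, 10, 5, 90), (30, 20, 15, 70), (60, 8, 12, 100)]

def logit_pairs(tp, fp, fn, tn, cc=0.5):
    """Per-study logit(sensitivity) and logit(specificity) with their
    approximate variances, using a 0.5 continuity correction. These
    paired values are the input to a bivariate random-effects model."""
    tp, fp, fn, tn = tp + cc, fp + cc, fn + cc, tn + cc
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    logit = lambda p: math.log(p / (1 - p))
    var_lsens = 1 / tp + 1 / fn  # delta-method variance of logit(sens)
    var_lspec = 1 / tn + 1 / fp  # delta-method variance of logit(spec)
    return (logit(sens), var_lsens), (logit(spec), var_lspec)

for s in studies:
    (ls, vs), (lp, vp) = logit_pairs(*s)
    print(f"logit(sens)={ls:.2f} (var {vs:.3f}), "
          f"logit(spec)={lp:.2f} (var {vp:.3f})")
```

Modeling the two logits jointly, rather than pooling each separately, is what lets the bivariate approach respect the pairing of the estimates and their negative correlation across thresholds.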
Figure 1
Review authors’ judgments about quality items, presented as percentages across all included studies. Based on a re-analysis of data from a systematic review on magnetic resonance imaging for multiple sclerosis [29]. The item “acceptable delay between tests” did not apply in this review. The authors considered the relative lack of an acceptable reference standard the main weakness of the review.
Ann Intern Med. 2008 Dec 16;149(12):889-897.
Figure 2
Panels a and b: ROC plot showing pairs of sensitivity and specificity values for the included studies. The height of each rectangle is proportional to the number of patients with bladder cancer in that study; its width corresponds to the number of patients without bladder cancer. Panel a shows the summary ROC curve that can be drawn through these values. Panel b shows the summary point estimate (black spot) and the 95% confidence region around it. Based on a re-analysis of the data from Glas et al. [10].
Figure 3
Forest plots of sensitivity and specificity of a tumor marker for bladder cancer. Based on a re-analysis of the data from Glas et al. [10].
Figure 4
Direct comparison of two index tests for bladder cancer: cytology (squares) and bladder tumor antigen (BTA; diamonds). Panel a shows the summary ROC curves that can be drawn through these values. Panel b shows the summary point estimates of sensitivity and specificity (black spots) and the 95% confidence regions around them. The two tests clearly show a trade-off between sensitivity and specificity: cytology has a significantly higher specificity (ellipse closest to the y-axis; lower arrow on the ROC curve), whereas BTA has a significantly higher sensitivity (higher ellipse; arrow pointing at the higher ROC curve). Which test is considered ‘best’ will depend on the role of the test in practice. Based on a re-analysis of the data from Glas et al. [10].