Radiology. 2015 Mar;274(3):781-9. doi: 10.1148/radiol.14141160. Epub 2014 Oct 27.

Reporting diagnostic accuracy studies: some improvements after 10 years of STARD.

Author information

From the Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1105 AZ, Amsterdam, the Netherlands (D.A.K., J.W., M.M.L., P.M.M.B.); Dutch Cochrane Centre, Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands (W.A.V.E.); Dutch Cochrane Centre, Julius Center for Health Sciences and Primary Care, University Medical Centre Utrecht, Utrecht, the Netherlands (L.H.); and Department of Epidemiology, University Medical Center Groningen, Groningen, the Netherlands (N.S.).



Purpose

To evaluate how diagnostic accuracy study reports published in 2012 adhered to the Standards for Reporting of Diagnostic Accuracy (STARD) statement and whether there were any differences in reporting compared with 2000 and 2004.

Materials and Methods

PubMed was searched for studies published in 12 high-impact-factor journals in 2012 that evaluated the accuracy of one or more diagnostic tests against a clinical reference standard. Two independent reviewers scored reporting completeness of each article with the 25-item STARD checklist. Mixed-effects modeling was used to analyze differences in reporting with previous evaluations from articles published in 2000 and 2004.

Results

Included were 112 articles. The overall mean number of STARD items reported in 2012 was 15.3 ± 3.9 (standard deviation; range, 6.0-23.5). There was an improvement of 3.4 items (95% confidence interval: 2.6, 4.3) compared with studies published in 2000, and an improvement of 1.7 items (95% confidence interval: 0.9, 2.5) compared with studies published in 2004. Significantly more items were reported for single-gate studies compared with multiple-gate studies (16.8 vs 12.1, respectively; P < .001) and for studies that evaluated imaging tests compared with laboratory tests and other types of tests (17.0 vs 14.0 vs 14.5, respectively; P < .001).

Conclusion

Completeness of reporting improved in the 10 years after the launch of STARD, but it remains suboptimal for many articles. Reporting of inclusion criteria and sampling methods for recruiting patients, information about blinding, and confidence intervals for accuracy estimates are in need of further improvement.

[Indexed for MEDLINE]
