Find duplicates among the PubMed, EMBASE, and Cochrane Library Databases in systematic review

PLoS One. 2013 Aug 20;8(8):e71838. doi: 10.1371/journal.pone.0071838. eCollection 2013.

Abstract

Background: Finding duplicates is an important phase of systematic review. However, no consensus regarding the methods to find duplicates has been provided. This study aims to describe a pragmatic strategy of combining auto- and hand-searching duplicates in systematic review and to evaluate the prevalence and characteristics of duplicates.

Methods and findings: Literatures regarding portal vein thrombosis (PVT) and Budd-Chiari syndrome (BCS) were searched by the PubMed, EMBASE, and Cochrane library databases. Duplicates included one index paper and one or more redundant papers. They were divided into type-I (duplicates among different databases) and type-II (duplicate publications in different journals/issues) duplicates. For type-I duplicates, reference items were further compared between index and redundant papers. Of 10936 papers regarding PVT, 2399 and 1307 were identified as auto- and hand-searched duplicates, respectively. The prevalence of auto- and hand-searched redundant papers was 11.0% (1201/10936) and 6.1% (665/10936), respectively. They included 3431 type-I and 275 type-II duplicates. Of 11403 papers regarding BCS, 3275 and 2064 were identified as auto- and hand-searched duplicates, respectively. The prevalence of auto- and hand-searched redundant papers was 14.4% (1640/11403) and 9.1% (1039/11403), respectively. They included 5053 type-I and 286 type-II duplicates. Most of type-I duplicates were identified by auto-searching method (69.5%, 2385/3431 in PVT literatures; 64.6%, 3263/5053 in BCS literatures). Nearly all type-II duplicates were identified by hand-searching method (94.9%, 261/275 in PVT literatures; 95.8%, 274/286 in BCS literatures). Compared with those identified by auto-searching method, type-I duplicates identified by hand-searching method had a significantly higher prevalence of wrong items (47/2385 versus 498/1046, p<0.0001 in PVT literatures; 30/3263 versus 778/1790, p<0.0001 in BCS literatures). Most of wrong items originated from EMBASE database.

Conclusion: Given the inadequacy of a single strategy of auto-searching method, a combined strategy of auto- and hand-searching methods should be employed to find duplicates in systematic review.

MeSH terms

  • Budd-Chiari Syndrome / therapy
  • Duplicate Publications as Topic*
  • Humans
  • Portal Vein / pathology
  • PubMed* / statistics & numerical data
  • Systematic Reviews as Topic
  • Thrombosis / therapy

Grants and funding

These authors have no support or funding to report.