PLoS One. 2012;7(12):e52249. doi: 10.1371/journal.pone.0052249. Epub 2012 Dec 20.

FastUniq: a fast de novo duplicates removal tool for paired short reads.

Author information

  • National Engineering Laboratory for Breeding of Endangered Medicinal Materials, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences & Peking Union Medical College, Beijing, People's Republic of China.

Abstract

Duplicates introduced by PCR amplification are a major issue in paired short reads from next-generation sequencing platforms. These duplicates can seriously affect research applications, such as scaffolding in whole-genome sequencing and the discovery of large-scale genome variations, and are usually removed. We present FastUniq, a fast de novo tool for removing duplicates in paired short reads. FastUniq identifies duplicates by comparing sequences between read pairs and does not require a complete genome sequence as a prerequisite. FastUniq can handle reads of different lengths simultaneously and runs efficiently: its running time increases linearly with the number of reads, at an average rate of 87 million reads per 10 minutes. FastUniq is freely available at http://sourceforge.net/projects/fastuniq/.
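The idea of reference-free duplicate removal by comparing sequences between read pairs can be illustrated with a minimal sketch. This is not FastUniq's implementation: the function name `remove_duplicate_pairs`, the in-memory hash set, and the rule that only exactly identical pairs count as duplicates are all assumptions made for illustration. The abstract does not say how FastUniq compares reads of different lengths, so here reads of different lengths are simply treated as distinct.

```python
from typing import Iterable, Iterator, Tuple

# A read pair is represented only by its two sequences (read 1, read 2).
# Assumption: quality scores and read names are ignored for comparison.
ReadPair = Tuple[str, str]


def remove_duplicate_pairs(pairs: Iterable[ReadPair]) -> Iterator[ReadPair]:
    """Yield each distinct read pair once, keeping the first occurrence.

    Hypothetical helper: a pair is a duplicate only if BOTH sequences are
    exactly identical to those of a pair seen earlier. No reference genome
    or alignment is involved.
    """
    seen = set()
    for read1, read2 in pairs:
        key = (read1, read2)
        if key not in seen:
            seen.add(key)
            yield key


pairs = [
    ("ACGT", "TTGA"),
    ("ACGT", "TTGA"),    # exact duplicate of the first pair: dropped
    ("ACGT", "TTGC"),    # same read 1, different read 2: kept
    ("ACGTAC", "TTGA"),  # read 1 has a different length: kept in this sketch
]
unique_pairs = list(remove_duplicate_pairs(pairs))
print(unique_pairs)
```

A hash set keeps the sketch short and makes a single pass over the input, at the cost of holding every distinct pair in memory. A real tool would also have to parse FASTQ records and decide how to treat reads of unequal length.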

PMID:
23284954
PMCID:
PMC3527383
DOI:
10.1371/journal.pone.0052249
[PubMed - indexed for MEDLINE]
Free PMC Article