PCAP: a whole-genome assembly program.
Department of Computer Science, Iowa State University, Ames, Iowa 50011-1040, USA. xqhuang@cs.iastate.edu
We describe a whole-genome assembly program named PCAP for processing tens of millions of reads. The PCAP program has several features that address efficiency and accuracy issues in assembly. Multiple processors are used to perform the most time-consuming computations in assembly. A more sensitive method is used so that overlaps are not missed because of sequencing errors. Repetitive regions of reads are detected on the basis of many overlaps with other reads, instead of many shorter word matches with other reads. Contaminated end regions of reads are identified and removed. The consensus sequence for a contig is generated from an alignment of the reads in the contig, in which both base quality values and coverage information are used to determine every consensus base. The PCAP program was tested on a mouse whole-genome data set of 30 million reads and a human Chromosome 20 data set of 1.7 million reads. The program is freely available for academic use.
PMID: 12952883 [PubMed - indexed for MEDLINE]
PMCID: PMC403719
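
The abstract states that every consensus base is determined from both base quality values and coverage. The sketch below is one simple way to illustrate that idea and is not PCAP's actual algorithm: it assumes each alignment column is given as a list of (base, quality) pairs, picks the base with the largest summed quality, and writes the base in lowercase when coverage is below a threshold. The function names, the `min_coverage` parameter, and the lowercase convention are all hypothetical choices made for this illustration.

```python
from collections import defaultdict


def consensus_base(column, min_coverage=2):
    """Call one consensus base from an alignment column.

    column: list of (base, quality) pairs, one per read covering the position.
    Hypothetical rule (not taken from PCAP): the base with the highest summed
    quality wins; low-coverage calls are reported in lowercase.
    """
    if not column:
        return "N"  # no read covers this position
    score = defaultdict(int)
    for base, qual in column:
        score[base] += qual  # quality-weighted vote
    # Iterate bases in sorted order so that ties are broken deterministically.
    best = max(sorted(score), key=lambda b: score[b])
    if len(column) < min_coverage:
        return best.lower()  # flag a call supported by too few reads
    return best


def consensus(columns, min_coverage=2):
    """Concatenate per-column calls into a consensus string."""
    return "".join(consensus_base(c, min_coverage) for c in columns)
```

In this sketch a single high-quality read can outvote several low-quality reads, while the coverage check keeps thinly supported positions visible to downstream review.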