BMC Genomics. 2016 Mar 5;17:187. doi: 10.1186/s12864-016-2531-7.

Evaluation of DISCOVAR de novo using a mosquito sample for cost-effective short-read genome assembly.

Author information

1. Eck Institute for Global Health, University of Notre Dame, South Bend, IN, 46556, USA. rlove1@nd.edu.
2. Department of Biological Sciences, University of Notre Dame, South Bend, IN, 46556, USA. rlove1@nd.edu.
3. Genome Sequencing and Analysis Program, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02142, USA. neilw@broadinstitute.org.
4. Genome Sequencing and Analysis Program, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02142, USA. jaffe@broadinstitute.org.
5. Eck Institute for Global Health, University of Notre Dame, South Bend, IN, 46556, USA. nbesansk@nd.edu.
6. Department of Biological Sciences, University of Notre Dame, South Bend, IN, 46556, USA. nbesansk@nd.edu.
7. Genome Sequencing and Analysis Program, Broad Institute of MIT and Harvard, 415 Main St, Cambridge, MA, 02142, USA. neafsey@broadinstitute.org.

Abstract

BACKGROUND:

De novo reference assemblies that are affordable, practical to produce, and of sufficient quality for most downstream applications remain an unattained goal for many taxa. Insects, which often have highly heterozygous genomes and may yield too little DNA from individual specimens for long-read sequencing library construction, can be particularly hard to assemble using inexpensive short-read sequencing data. The large number of insect species with medical or economic importance makes this a critical problem to address.

RESULTS:

Using the assembler DISCOVAR de novo, we assembled the genome of the African malaria mosquito Anopheles arabiensis using 250 bp reads from a single library. The resulting assembly had a contig N50 of 22,433 bp, and recovered the gene set nearly as well as the ALLPATHS-LG AaraD1 An. arabiensis assembly produced with reads from three sequencing libraries and much greater resources. DISCOVAR de novo appeared to perform better than ALLPATHS-LG in regions of low complexity.
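
For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the total assembly length. The function name `contig_n50` and the toy lengths are illustrative assumptions and are not taken from the study or from DISCOVAR de novo.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Hypothetical helper for illustration only; it is not part of
    DISCOVAR de novo or ALLPATHS-LG.
    """
    if not lengths:
        raise ValueError("need at least one contig length")

    # Walk the contigs from longest to shortest, accumulating length
    # until at least half of the total assembly length is covered.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (made-up values, not data from the paper).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = contig_n50(toy_lengths)
```

Under this definition, an N50 of 22,433 bp means that contigs of at least that length together account for at least half of the assembled bases.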

CONCLUSIONS:

DISCOVAR de novo performed well in assembling the genome of an insect of medical importance, using simpler sequencing input than previous anopheline assemblies. We have shown that this program is a viable tool for cost-effective assembly of a modestly sized insect genome.

PMID: 26944054
PMCID: PMC4779211
DOI: 10.1186/s12864-016-2531-7
[Indexed for MEDLINE]
Free PMC Article
