Organism name:
Glossina pallidipes (tsetse fly)
Glossina Genomes Consortium
Assembly level:
Genome representation:
RefSeq category:
representative genome
GenBank assembly accession:
GCA_000688715.1 (latest)
RefSeq assembly accession:
RefSeq assembly and GenBank assembly identical:
WGS Project:
Assembly method:
ALLPATHS-LG v. August 2013
Genome coverage:
Sequencing technology:

IDs: 180601 [UID] 1085778 [GenBank]


Background: A total of 6 DNA isolates from individual female Glossina pallidipes tsetse flies were provided in TE buffer, courtesy of Dr. Serap Aksoy, Yale University. The sequencing plan followed the recommendations provided in the ALLPATHS-LG assembler manual. This ... model requires 45x sequence coverage each of fragments (overlapping paired reads, approx. 180bp in length) and 3kb paired-end (PE) reads, as well as 5x coverage of 8kb PE reads. For fragments we used DNA samples pooled from the mother, named M2, and for all jumping libraries (3 and 8kb) we used a pool of the offspring DNA samples MD2 1-4.
 Various assembly metrics are summarized below. Total assembled sequence coverage of Illumina reads, using the ALLPATHS-LG software (Broad Institute) and a genome size estimate of 400Mb, was 78x (overlapping reads 49x, 2.0kb PE 9x, 3.5kb PE 13x, 4.1kb PE 7x). This first draft assembly was referred to as G. pallidipes 1.0.
 In the G. pallidipes 1.0 assembly, small scaffold gaps were closed with Illumina read mapping and local assembly. Contaminating contigs, trimmed vector in the form of X's, and ambiguous bases as N's in the sequence were removed. NCBI requires that all contigs 200bp and smaller be removed. Removing these contigs was the final step in preparation for submitting the 1.0.3 assembly. The G. pallidipes 1.0.3 assembly is made up of a total of 1726 scaffolds with an N50 scaffold length of over 1Mb (N50 contig length was 167kb). The total contig assembly spans 356Mb.
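 The coverage figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the submitters' pipeline: the function names are hypothetical, and the implied read-base total is only a rough illustration derived from the 400Mb genome size estimate.

```python
# Per-library coverage figures as quoted in the background text.
library_coverage = {
    "overlapping reads": 49,
    "2.0kb PE": 9,
    "3.5kb PE": 13,
    "4.1kb PE": 7,
}

# Genome size estimate quoted in the background text (400Mb).
GENOME_SIZE_ESTIMATE_BP = 400_000_000


def total_coverage(coverage_by_library):
    """Sum the per-library fold coverage (hypothetical helper)."""
    return sum(coverage_by_library.values())


def implied_read_bases(fold_coverage, genome_size_bp):
    """Approximate number of sequenced bases implied by a fold coverage."""
    return fold_coverage * genome_size_bp


total = total_coverage(library_coverage)
bases = implied_read_bases(total, GENOME_SIZE_ESTIMATE_BP)
print(f"total coverage: {total}x")
print(f"implied read bases: {bases:,}")
```

 The per-library values sum to the stated 78x, which at a 400Mb genome size corresponds to roughly 31.2 billion assembled read bases.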
 For questions regarding this G. pallidipes assembly please contact Dr. Wesley Warren, Washington University School of Medicine, or Dr. Serap Aksoy, Yale University. Downloads of the sequence data are available via the NCBI SRA database. Funding for the sequence characterization of Glossina pallidipes was provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).
 DNA samples can be obtained from: Dr. Serap Aksoy, Department of Epidemiology of Microbial Diseases, Yale School of Public Health, 60 College St., 626 LEPH, New Haven, CT 06510
 This work was supported by NIH-NHGRI grant 5U54HG00307907 to RKW, Director of The Genome Institute at Washington University.
 DNA source - Dr. Serap Aksoy, Yale University, New Haven, CT.
 Sequencing - The Genome Institute, Washington University School of Medicine, St Louis, MO. 
 Sequence assembly - The Genome Institute, Washington University School of Medicine, St Louis, MO.
 Citation upon use of this assembly in a manuscript: 

 It is requested that users of this Glossina pallidipes sequence assembly acknowledge Dr. Serap Aksoy and The Genome Institute, Washington University School of Medicine in any publications that result from use of this sequence assembly.
 Assembly stats:
 *** Contiguity: Contig ***
 Total contig number: 6859
 Total contig bases: 351919810 bp
 Average contig length: 51308 bp
 Maximum contig length: 2108708 bp
 N50 contig length: 167200 bp
 N50 contig number: 560
 *** Contiguity: Supercontig ***
 Total supercontig number: 1726
 Average supercontig length: 203893 bp
 Maximum supercontig length: 5818829 bp
 N50 supercontig length: 1031162 bp
 N50 supercontig number: 94
 *** Scaffold Distribution ***
 Scaffolds > 1M: 101
 Scaffold 250K--1M: 245
 Scaffold 100K--250K: 181
 Scaffold 10--100K: 283
 Scaffold 5--10K: 84
 Scaffold 2--5K: 289
 Scaffold 0--2K: 543
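 For readers unfamiliar with the N50 figures above, the sketch below shows one common way to compute an N50 length and the corresponding sequence count from a list of sequence lengths. It is an illustration under the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total bases), not the submitters' or NCBI's actual code. The function name `n50_and_count` and the toy lengths are hypothetical.

```python
def n50_and_count(lengths):
    """Return (N50 length, number of sequences needed to reach it).

    Sequences are taken longest first until their cumulative length
    reaches at least half of the total length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example (made-up lengths, not from this assembly).
toy_lengths = [10, 8, 6, 4, 2]
n50, count = n50_and_count(toy_lengths)
print(n50, count)
```

 In the toy example the total is 30, so half is 15; the two longest sequences (10 + 8 = 18) are enough to pass it, giving an N50 of 8 reached with 2 sequences.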

Global statistics

Total sequence length: 357,332,231
Total assembly gap length: 5,412,421
Gaps between scaffolds: 0
Number of scaffolds: 1,726
Scaffold N50: 1,038,751
Scaffold L50: 95
Number of contigs: 6,859
Contig N50: 167,200
Contig L50: 560
Total number of chromosomes and plasmids: 0
Number of component sequences (WGS or clone): 6,859
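
The global statistics can be checked against the submitter's assembly stats with simple arithmetic. The sketch below is an illustrative consistency check only; the variable names are hypothetical, and the values are copied from the figures reported on this page.

```python
# Values copied from the statistics reported on this page.
total_sequence_length = 357_332_231   # Global statistics
total_gap_length = 5_412_421          # Global statistics
total_contig_bases = 351_919_810      # Submitter's "Total contig bases"
contig_count = 6_859                  # Number of contigs

# Ungapped length: total sequence length minus gap bases.
ungapped_length = total_sequence_length - total_gap_length

# Average contig length, rounded to the nearest base.
average_contig_length = round(total_contig_bases / contig_count)

# Fraction of the assembly that is gap sequence.
gap_fraction = total_gap_length / total_sequence_length

print(f"ungapped length: {ungapped_length:,}")
print(f"average contig length: {average_contig_length:,}")
print(f"gap fraction: {gap_fraction:.2%}")
```

The ungapped length matches the submitter's total contig bases, and the rounded average agrees with the reported average contig length of 51308 bp.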

Global assembly definition

Assembly Unit Name: Primary Assembly
The primary assembly unit does not have any assembled chromosomes or linkage groups.
Please download the full sequence report for information on the scaffolds.
