Glossina_palpalis_gambiensis-2.0.1

Organism name:
Glossina palpalis gambiensis (tsetse fly)
Isolate:
146720
Sex:
female
BioSample:
SAMN01796024
BioProject:
PRJNA172847
Submitter:
Glossina Genomes Consortium
Date:
2015/01/15
Assembly level:
Scaffold
Genome representation:
full
RefSeq category:
representative genome
GenBank assembly accession:
GCA_000818775.1 (latest)
RefSeq assembly accession:
n/a
RefSeq assembly and GenBank assembly identical:
n/a
WGS Project:
JXJN01
Assembly method:
ALLPATHS-LG November 2014
Genome coverage:
138x
Sequencing technology:
Illumina

IDs: 243061 [UID] 1477418 [GenBank]

Comment


 Background:
 Multiple DNA isolates of the tsetse fly Glossina p. gambiensis were provided in TE buffer courtesy of Dr. Serap Aksoy, Yale University. Only Glossina palpalis gambiensis females (no mothers or progeny) that had received 3 tetracycline treated ... blood meals were used. The pupae were obtained from the colony maintained at the Institute of Zoology, Department of Entomology, Slovak Academy of Sciences. The sequencing plan followed the recommendations in the ALLPATHS-LG assembler manual. This model requires 45x sequence coverage each of fragment reads (overlapping paired reads, 180bp length) and 3kb paired end (PE) reads, as well as 5x coverage of 8kb PE reads. The various assembly metrics are summarized below. Total sequence coverage of Illumina instrument reads was 138X (overlapping reads 83x, 3.0kb PE 42x, 7.0kb PE 13x), based on a genome size estimate of 400Mb; the reads were assembled with the ALLPATHS-LG software (Broad Institute). This first working draft assembly was referred to as G. p. gambiensis 2.0. In the G. p. gambiensis 2.0 assembly, small scaffold gaps were closed with Illumina read mapping and local assembly. Contaminating contigs, trimmed vector in the form of X's, and ambiguous bases as N's in the sequence were removed. NCBI requires that all contigs 200bp and smaller be removed. Removing these contigs was the final step in preparing the G. p. gambiensis 2.0.1 assembly for submission. The G. p. gambiensis 2.0.1 assembly comprises 3927 scaffolds with an N50 scaffold length of nearly 532kb (N50 contig length is 24.5kb). The total scaffold assembly spans 386Mb.
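 The coverage figures quoted above can be checked with simple arithmetic. The following is a minimal sketch, not part of the submitted record: it sums the per-library coverage and converts it to an approximate base count using the 400Mb genome size estimate. The variable names are illustrative assumptions.

```python
# Per-library fold coverage as quoted in the Background comment.
library_coverage = {
    "overlapping fragment reads": 83,
    "3.0kb paired end": 42,
    "7.0kb paired end": 13,
}

# Genome size estimate stated in the record (400Mb).
GENOME_SIZE_ESTIMATE_BP = 400_000_000

total_coverage = sum(library_coverage.values())

# Coverage = sequenced bases / genome size, so bases = coverage * genome size.
approx_sequenced_bases = total_coverage * GENOME_SIZE_ESTIMATE_BP

print(f"Total coverage: {total_coverage}x")
print(f"Approximate sequenced bases: {approx_sequenced_bases / 1e9:.1f} Gb")
```

 The sum reproduces the 138x total reported for the assembly; the base count is only as good as the 400Mb genome size estimate.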
 For questions regarding this G. p. gambiensis assembly, please contact Dr. Wesley Warren, Washington University School of Medicine (wwarren@genome.wustl.edu) or Dr. Serap Aksoy, Yale University (serap.aksoy@yale.edu). Downloads of the sequence data are available via the NCBI SRA database. Funding for the sequence characterization of Glossina p. gambiensis was provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).
 DNA samples can be obtained from: Dr. Serap Aksoy, Department of Epidemiology of Microbial Diseases, Yale School of Public Health, 60 College St., 626 LEPH, New Haven, CT 06510
 Credits:
 This work was supported by NIH-NHGRI grant 5U54HG00307907 to RKW, Director of The Genome Institute at Washington University.
 DNA source - Dr. Serap Aksoy, Yale University, New Haven, CT.
 Sequencing - The Genome Institute, Washington University School of Medicine, St Louis, MO. 
 Sequence assembly - The Genome Institute, Washington University School of Medicine, St Louis, MO.
 Citation upon use of this assembly in a manuscript: 

 It is requested that users of this Glossina p. gambiensis sequence assembly acknowledge Dr. Serap Aksoy and The Genome Institute, Washington University School of Medicine in any publications that result from use of this sequence assembly. 
 Assembly stats:
 *** Contiguity: Contig ***
 Total contig number: 31322
 Total contig bases: 346051769 bp
 Average contig length: 11048 bp
 Maximum contig length: 483013 bp
 N50 contig length: 24489 bp
 N50 contig number: 3801
 *** Contiguity: Supercontig ***
 Total supercontig number: 3927
 Average supercontig length: 88121 bp
 Maximum supercontig length: 3440817 bp
 N50 supercontig length: 531872 bp
 N50 supercontig number: 181
 *** Scaffold Distribution ***
 Scaffolds > 1M: 56
 Scaffold 250K--1M: 357
 Scaffold 100K--250K: 326
 Scaffold 10--100K: 567
 Scaffold 5--10K: 509
 Scaffold 2--5K: 1147
 Scaffold 0--2K: 965
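 For readers unfamiliar with the N50 metrics above, the sketch below shows one common way to compute an N50 length and N50 number from a list of sequence lengths, and checks that the scaffold size bins add up to the reported supercontig count. The function name and the toy lengths are hypothetical; this is not the code used by the submitters.

```python
def n50(lengths):
    """Return (N50 length, N50 number) for a list of sequence lengths.

    N50 length: the length L such that sequences of length >= L
    together contain at least half of all bases.
    N50 number: how many of the longest sequences are needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical lengths, total 300 bases).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
toy_n50_length, toy_n50_number = n50(toy_lengths)
print(toy_n50_length, toy_n50_number)

# Scaffold size bins as listed in the Scaffold Distribution block.
scaffold_bins = {
    "> 1M": 56,
    "250K--1M": 357,
    "100K--250K": 326,
    "10--100K": 567,
    "5--10K": 509,
    "2--5K": 1147,
    "0--2K": 965,
}
total_scaffolds = sum(scaffold_bins.values())
print(total_scaffolds)
```

 The bin counts sum to 3927, which agrees with the total supercontig number given in the submitter's statistics.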

Global statistics

Total sequence length: 380,104,241
Total ungapped length: 346,049,078
Gaps between scaffolds: 0
Number of scaffolds: 3,926
Scaffold N50: 575,037
Scaffold L50: 187
Number of contigs: 31,320
Contig N50: 24,489
Contig L50: 3,801
Total number of chromosomes and plasmids: 0
Number of component sequences (WGS or clone): 31,320
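
A few derived quantities follow directly from the global statistics. The sketch below is an illustration with assumed variable names: it computes the number of gap bases, the gapped fraction of the assembly, and the number of within-scaffold gaps, on the assumption that every gap inside a scaffold separates exactly two contigs.

```python
# Values copied from the Global statistics block.
total_sequence_length = 380_104_241
total_ungapped_length = 346_049_078
number_of_scaffolds = 3_926
number_of_contigs = 31_320

# Bases that fall in gaps between contigs within scaffolds.
gap_bases = total_sequence_length - total_ungapped_length
gap_fraction = gap_bases / total_sequence_length

# Assumption: each within-scaffold gap joins two contigs, so a scaffold
# built from k contigs contains k - 1 gaps.
spanned_gaps = number_of_contigs - number_of_scaffolds

print(f"Gap bases: {gap_bases:,}")
print(f"Gapped fraction: {gap_fraction:.2%}")
print(f"Spanned gaps: {spanned_gaps:,}")
```

The spanned gap count obtained this way, 27,394, matches the value NCBI reports for the unplaced scaffolds in the assembly statistics table.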

Global assembly definition

Assembly Unit Name
Primary Assembly
The primary assembly unit does not have any assembled chromosomes or linkage groups.
Please download the full sequence report for information on the scaffolds.

Assembly statistics

Molecule   Total Length   Scaffold Count   Ungapped Length   Scaffold N50   Spanned Gaps   Unspanned Gaps
unplaced   380,104,241    3,926            346,049,078       575,037        27,394         0