
Organism name:
Cyprinodon variegatus (sheepshead minnow)
Aquatic Genome Models
Assembly level:
Genome representation:
RefSeq category:
representative genome
GenBank assembly accession:
GCA_000732505.1 (latest)
RefSeq assembly accession:
GCF_000732505.1 (latest)
RefSeq assembly and GenBank assembly identical:
no
  • Only in RefSeq: chromosome MT (in non-nuclear assembly-unit)
  • Data displayed for RefSeq version
WGS Project:
Assembly method:
AllPaths v. May 2014
Genome coverage:
Sequencing technology:

IDs: 199731 [UID] 1177628 [GenBank] 2779528 [RefSeq]



Background: The sheepshead minnow, Cyprinodon variegatus, has been widely used in assessing water quality via toxicity tests, and more recently in identifying the effects of environmental exposure on gene expression patterns. Sequencing the genome of the sheepshead minnow will ... provide novel insights into both short- and long-term effects of environmental exposure.
The DNA used for sequencing, provided courtesy of Dr. Diane Nacci, was derived from a single female animal identified as N-32, collected on 17 September 2010 at Navarre, FL, USA.
The sequencing plan followed the recommendations provided in the AllPaths-LG assembler manual. This model requires 45x sequence coverage of overlapping Illumina reads (frags), 15x coverage of 3 kb Illumina paired-end (PE) reads, and 5x coverage of 8 kb Illumina PE reads. Total assembled sequence coverage of Illumina instrument reads was 81x, using a genome size estimate of 1.0 Gb.
The first draft assembly was performed with AllPaths-LG (Gnerre et al 2010) and was referred to as C. variegatus 1.0. In the C. variegatus 1.0 assembly, small scaffold gaps were closed with Illumina read mapping and local assembly. Contaminating contigs, trimmed vector in the form of X's, and ambiguous bases as N's in the sequence were removed. All scaffolds (singletons) and contigs within scaffolds that were 200 bp or less in length were removed from the assembly. Removing these contigs was the last step in preparation for submitting the final 1.0.2 assembly.
The C. variegatus 1.0.2 assembly is made up of a total of 9258 scaffolds with an N50 scaffold length over 790 Kb (N50 contig length was 20.8 Kb). The total scaffold assembly spans over 1.02 Gb.
For questions regarding this sheepshead minnow assembly, please contact Dr. Wesley Warren, Washington University School of Medicine (wwarren@genome.wustl.edu). Downloads of the sequence data are available via the NCBI SRA database.
 This work was supported by NIH grant R24 RR032658-01 to Dr. Warren, Washington University School of Medicine.
 DNA samples can be obtained from: Diane E. Nacci, Ph.D., US Environmental Protection Agency, Office of Research and Development, Atlantic Ecology Division, Population Ecology Branch, 27 Tarzwell Drive, Narragansett, RI 02882, nacci.diane@epa.gov

 DNA source - Dr. Diane Nacci, Dr. Dina Proestou (USDA, ARS)
 Sequencing - The Genome Institute, Washington University School of Medicine, St Louis, MO. 
 Sequence assembly - The Genome Institute, Washington University School of Medicine, St Louis, MO.
 Citation upon use of this assembly in a manuscript: 
 It is requested that users of this Cyprinodon variegatus sequence assembly acknowledge Dr. Wesley Warren and The Genome Institute, Washington University School of Medicine in any publications that result from use of this sequence assembly. 
 Assembly stats:
 *** Contiguity: Contig ***
 Total contig number: 110958
 Total contig bases: 899524286 bp
 Average contig length: 8107 bp
 Maximum contig length: 275912 bp
 N50 contig length: 20803 bp
 N50 contig number: 11158
 *** Contiguity: Supercontig ***
 Total supercontig number: 9258
 Average supercontig length: 97162 bp
 Maximum supercontig length: 3860663 bp
 N50 supercontig length: 791614 bp
 N50 supercontig number: 337
 *** Scaffold Distribution ***
 Scaffolds > 1M: 240
 Scaffold 250K--1M: 744
 Scaffold 100K--250K: 561
 Scaffold 10--100K: 1164
 Scaffold 5--10K: 550
 Scaffold 2--5K: 1917
 Scaffold 0--2K: 4082

Global statistics

Total sequence length: 1,035,184,475
Total ungapped length: 899,540,766
Gaps between scaffolds: 0
Number of scaffolds: 9,259
Scaffold N50: 835,301
Scaffold L50: 365
Number of contigs: 110,959
Contig N50: 20,803
Contig L50: 11,158
Total number of chromosomes and plasmids: 1
Number of component sequences (WGS or clone): 110,959
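
The N50 and L50 figures listed here follow the usual definition: with sequences sorted from longest to shortest, N50 is the length of the sequence at which the running total first reaches half of the summed length, and L50 is the number of sequences needed to get there (the submitter's statistics above label the same count "N50 contig number"). The following is a minimal Python sketch of that definition. The function name `n50_l50` and the toy lengths are illustrative assumptions, not NCBI's or the submitter's code, and the result depends on whether gap bases are counted in the lengths supplied.

```python
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of sequence lengths.

    N50: length of the sequence at which the cumulative sum, taken from
         longest to shortest, first reaches half of the total length.
    L50: how many sequences were needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must contain at least one sequence")


# Toy example (hypothetical lengths, not taken from this assembly):
# total = 300, half = 150, reached after 80 + 70.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (70, 2)
```

Applied to the per-scaffold or per-contig lengths in the full sequence report, a function like this should give values comparable to those reported here, provided the same set of sequences and the same treatment of gaps are used.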


Global assembly definition

Assembly Unit Name
Primary Assembly
The primary assembly unit does not have any assembled chromosomes or linkage groups.
Please download the full sequence report for information on the scaffolds.

Assembly statistics

Mitochondrion MT: 16,500