Chlorocebus_sabeus 1.1

Organism name:
Chlorocebus sabaeus (green monkey)
Isolate:
1994-021
Sex:
male
BioSample:
SAMN01760484
BioProject:
PRJNA168621
Submitter:
Vervet Genomics Consortium
Date:
2014/03/25
Synonyms:
chlSab2
Assembly level:
Chromosome
Genome representation:
full
Excluded from RefSeq:
  • superseded by newer assembly for species
GenBank assembly accession:
GCA_000409795.2 (latest)
RefSeq assembly accession:
GCF_000409795.2 (suppressed) see latest RefSeq assembly for this species
RefSeq assembly and GenBank assembly identical:
no
  • Only in RefSeq: chromosome MT
  • Data displayed for GenBank version
WGS Project:
AQIB01
Assembly method:
ALLPATHS and Newbler v. 13-Feb-2013
Genome coverage:
95x
Sequencing technology:
454 Titanium; Illumina HiSeq; ABI

IDs: 132581 [UID] 975068 [GenBank] 1024848 [RefSeq]

There are 3 assemblies for this organism

Comment

Chlorocebus aethiops sabaeus (vervet) Sequence Assembly Release Notes
 The vervet DNA for shotgun sequencing and for BAC libraries is derived from an adult male vervet monkey (Chlorocebus aethiops sabaeus; animal ID 1994-021) within the vervet research colony housed a ... the Wake Forest Primate Facility to create the BAC library CHORI-252. A total of 362,969 BAC end sequences have been generated from this library. A total of 143 CHORI-252 BACs (approx. 18 Mb) have been finished and submitted to GenBank; of these, 29 were finished and submitted for the MHC region. Whole-genome sequences were generated on the Roche 454 Titanium instrument at these coverage levels (assuming a vervet genome size of approx. 2.9 Gb): fragment, 10x; 3 kbp, 8x; and 8 kbp, 1x. Total sequence genome coverage on the Illumina HiSeq instrument was 95x (45x fragments, 45x 3 kb and 5x 8 kb).
 Two independent assemblies were built with the appropriate sequence data, using the ALLPATHS (Broad Institute) and Newbler (Roche) assemblers. Based on superior contig and scaffold contiguity, the ALLPATHS assembly was chosen as the reference. The unique sequences from the Newbler assembly were then merged into the ALLPATHS assembly using graph accordance methods (Yao et al. 2011 Oct 23, Bioinformatics).
 After assembly, we integrated 170 finished BACs (including the MHC region) by merging them into the 1.0 assembly. The top scaffold that each BAC mapped to was identified by MEGABLAST (-e 1e-20 -W 200 -p 98). Contigs of the top scaffold that the BAC mapped to were identified by BLASTN (-W 150 -F F). A Perl script was used to create a new contig for each BAC, extend the contig if the 5' and 3' overlapping contigs were longer than the BAC, and adjust flanking gaps accordingly. We then sorted scaffolds by decreasing length, assigned new sequence identifiers to contigs and scaffolds, and extended 20-bp and 50-bp gaps to 100 bp as per NCBI's guideline.
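 The gap-extension step described above can be illustrated with a minimal sketch. It assumes gaps are represented as runs of N in a plain sequence string, which is a simplification: the release notes do not say how the step was implemented, and the function name extend_short_gaps and its parameters are hypothetical.

```python
import re


def extend_short_gaps(sequence, short_lengths=(20, 50), target=100):
    """Resize runs of N whose length is in short_lengths to `target` Ns.

    Assumption: gaps are encoded as runs of 'N'. Runs of any other
    length are left unchanged.
    """
    def resize(match):
        run = match.group(0)
        return "N" * target if len(run) in short_lengths else run

    return re.sub(r"N+", resize, sequence)


# Toy scaffold with a 20-bp gap, a 50-bp gap and a 7-bp gap.
example = "ACGT" + "N" * 20 + "TT" + "N" * 50 + "GG" + "N" * 7 + "A"
fixed = extend_short_gaps(example)
print(len(example), len(fixed))
```

 Only the 20-bp and 50-bp runs are resized here; the 7-bp run is kept, since the notes mention those two gap sizes specifically.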
 In the final assembly, referred to as Chlorocebus_sabeus 6.0.3, there were 162,907 contigs with an N50 contig length of 88 kb and 2,205 supercontigs with an N50 supercontig length of 45 Mb. A total of 2.73 Gb of sequence was assembled in contigs. Including estimated gap sizes, over 2.74 Gb were ordered and oriented along chromosomes and 27.6 Mb were placed along the CAE*_random chromosomes, with only 18.34 Mb remaining unlocalized. After organizing Chlorocebus_sabeus 6.0.3 into chromosomal AGP files, we labeled this first vervet release as 1.0.
 *******************************************
 Chlorocebus aethiops sabaeus Sequence and Assembly Credits

 DNA source - Dr. Jay Kaplan, Wake Forest Primate Facility, Wake Forest, NC.
 Genome Sequence - The Genome Institute, Washington University School of Medicine, St Louis, MO and Department of Human Genetics, McGill University, Montreal, Canada.
 Sequence Assembly - The Genome Institute, Washington University School of Medicine, St Louis, MO.
 BAC library - Dr. Pieter DeJong, CHORI, Oakland, CA.
 Assembly curation - Jessica Wasserscheid, Nikoleta Juretic, Dr. Ken Dewar, McGill University, Montreal, QC, Canada; LaDeana Hillier, The Genome Institute, Washington University School of Medicine, St Louis, MO.
 FISH Mapping Data - Mariano Rocchi, Department of Biology, University of Bari, Bari, Italy.
 cDNA data - RNA source was Dr. Nelson Freimer, Semel Institute for Neuroscience and Human Behavior, University of California Los Angeles, CA, USA.
 Funding for the sequence characterization of the vervet genome is being provided by NHGRI. 
 Author List: Nelson Freimer, George Weinstock, Richard K. Wilson, Wesley C. Warren
 *******************************************
 Chromosome Lengths:
 column 1 = chromosome column 2 = chromosome length (including estimated gap sizes)
 CAE1 126035930
 CAE2 90373283
 CAE3 92142175
 CAE4 91010382
 CAE5 75399963
 CAE6 50890351
 CAE7 135778131
 CAE8 139301422
 CAE9 125710982
 CAE10 128595539
 CAE11 128539186
 CAE12 108555830
 CAE13 98384682
 CAE14 107702431
 CAE15 91754291
 CAE16 75148670
 CAE17 71996105
 CAE18 72318688
 CAE19 33263144
 CAE20 130588469
 CAE21 127223203
 CAE22 101219884
 CAE23 82825804
 CAE24 84932903
 CAE25 85787240
 CAE26 58131712
 CAE27 48547382
 CAE28 21531802
 CAE29 24206276
 CAEX 130038232
 CAEY 6181219
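 As an illustration of how the two-column chromosome length listing can be consumed, the sketch below parses the name/length pairs and totals them. The variable and function names are hypothetical; the values are copied from the listing above.

```python
# Chromosome name / length pairs as given in the release notes
# (lengths include estimated gap sizes).
CHROMOSOME_LENGTHS = """
CAE1 126035930 CAE2 90373283 CAE3 92142175 CAE4 91010382 CAE5 75399963
CAE6 50890351 CAE7 135778131 CAE8 139301422 CAE9 125710982 CAE10 128595539
CAE11 128539186 CAE12 108555830 CAE13 98384682 CAE14 107702431 CAE15 91754291
CAE16 75148670 CAE17 71996105 CAE18 72318688 CAE19 33263144 CAE20 130588469
CAE21 127223203 CAE22 101219884 CAE23 82825804 CAE24 84932903 CAE25 85787240
CAE26 58131712 CAE27 48547382 CAE28 21531802 CAE29 24206276 CAEX 130038232
CAEY 6181219
"""


def parse_lengths(text):
    """Turn whitespace-separated 'name length' pairs into a dict."""
    tokens = text.split()
    return {name: int(length) for name, length in zip(tokens[0::2], tokens[1::2])}


lengths = parse_lengths(CHROMOSOME_LENGTHS)
total = sum(lengths.values())
longest = max(lengths, key=lengths.get)
shortest = min(lengths, key=lengths.get)
print(len(lengths), total, longest, shortest)
```

 Because the parser splits on any whitespace, it works whether the pairs are on one line or one pair per line.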
 *******************************************
 Chlorocebus_sabeus 6.0.3 assembly statistics:
 *** Contiguity: Contig ***
 Total contig number: 162907
 Total contig bases: 2734267806 bp
 Average contig length: 16784 bp
 Maximum contig length: 1051246 bp
 N50 contig length: 88741 bp
 N50 contig number: 7870
 *** Contiguity: Supercontig ***
 Total supercontig number: 2205
 Average supercontig length: 1240031 bp
 Maximum supercontig length: 126332868 bp
 N50 supercontig length: 45002363 bp
 N50 supercontig number: 19
 Scaffolds > 1M: 147
 Scaffolds 250K-1M: 47
 Scaffolds 100K-250K: 34
 Scaffolds 10-100K: 473
 Scaffolds 5-10K: 235
 Scaffolds 2-5K: 434
 Scaffolds 0-2K: 835
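 For readers unfamiliar with the "N50 length" and "N50 number" statistics reported above, the following sketch computes both under the commonly used definition: sort sequences from longest to shortest and stop at the first one where the running total reaches half of all bases. This is an illustration, not the submitters' script; the assembler's exact convention may differ slightly, and the function name n50_stats is hypothetical.

```python
def n50_stats(lengths):
    """Return (N50 length, N50 number) for a list of sequence lengths.

    N50 length: length of the sequence at which the cumulative sum,
    taken from longest to shortest, first reaches half the total.
    N50 number: how many sequences were needed to get there.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise ValueError("lengths must not be empty")


# Toy example: total is 30, so half is 15; 10 + 8 = 18 reaches it.
print(n50_stats([2, 4, 6, 8, 10]))
```

 Applied to the full list of contig lengths, the same procedure would be expected to give values like the reported N50 contig length and N50 contig number, provided the same definition was used.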


Global statistics

Total sequence length: 2,789,639,778
Total ungapped length: 2,752,002,658
Gaps between scaffolds: 29
Number of scaffolds: 2,021
Scaffold N50: 81,825,804
Scaffold L50: 14
Number of contigs: 162,723
Contig N50: 90,449
Contig L50: 7,735
Total number of chromosomes and plasmids: 31
Number of component sequences (WGS or clone): 163,017
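A couple of quantities follow directly from these global statistics. The sketch below derives the number of gap bases and checks that contigs equal scaffolds plus spanned gaps. The spanned-gap count (160,702) is taken from the "All" row of the assembly statistics table further down; the variable names are hypothetical.

```python
# Values copied from the global statistics block.
total_length = 2_789_639_778
ungapped_length = 2_752_002_658
scaffolds = 2_021
contigs = 162_723
# From the "All" row of the assembly statistics table.
spanned_gaps = 160_702

# Bases that are gap (N) rather than sequence.
gap_bases = total_length - ungapped_length
gap_fraction = gap_bases / total_length

# Each spanned gap splits one scaffold into one more contig,
# so contigs should equal scaffolds plus spanned gaps.
contigs_expected = scaffolds + spanned_gaps

print(gap_bases, f"{gap_fraction:.2%}", contigs_expected == contigs)
```

The gap total here covers all gaps, spanned and unspanned, since it is simply the difference between gapped and ungapped length.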

Global assembly definition

Assembly Unit Name
Primary Assembly
Assembly Unit: Primary Assembly (GCA_000409805.2)
Molecule name  GenBank sequence      RefSeq sequence  Unlocalized sequences count
Chromosome 1   CM001941.2        =   NC_023642.1      9
Chromosome 2   CM001942.1        =   NC_023643.1      56
Chromosome 3   CM001943.2        =   NC_023644.1      2
Chromosome 4   CM001944.2        =   NC_023645.1      5
Chromosome 5   CM001945.1        =   NC_023646.1      6
Chromosome 6   CM001946.1        =   NC_023647.1      13
Chromosome 7   CM001947.1        =   NC_023648.1      14
Chromosome 8   CM001948.2        =   NC_023649.1      13
Chromosome 9   CM001949.2        =   NC_023650.1      73
Chromosome 10  CM001950.2        =   NC_023651.1      21
Chromosome 11  CM001952.2        =   NC_023652.1      3
Chromosome 12  CM001953.2        =   NC_023653.1      5
Chromosome 13  CM001954.1        =   NC_023654.1      6
Chromosome 14  CM001955.2        =   NC_023655.1      6
Chromosome 15  CM001956.2        =   NC_023656.1      5
Chromosome 16  CM001957.1        =   NC_023657.1      16
Chromosome 17  CM001958.2        =   NC_023658.1      12
Chromosome 18  CM001959.2        =   NC_023659.1      3
Chromosome 19  CM001960.2        =   NC_023660.1      11
Chromosome 20  CM001961.2        =   NC_023661.1      15
Chromosome 21  CM001962.2        =   NC_023662.1      15
Chromosome 22  CM001963.2        =   NC_023663.1      12
Chromosome 23  CM001964.1        =   NC_023664.1      23
Chromosome 24  CM001965.2        =   NC_023665.1      4
Chromosome 25  CM001966.2        =   NC_023666.1      5
Chromosome 26  CM001967.1        =   NC_023667.1      0
Chromosome 27  CM001968.1        =   NC_023668.1      33
Chromosome 28  CM001969.1        =   NC_023669.1      7
Chromosome 29  CM001970.2        =   NC_023670.1      0
Chromosome X   CM001951.2        =   NC_023671.1      122
Chromosome Y   CM001940.1        =   NC_023672.1      25
unplaced       n/a               n/a n/a              1,432

Assembly statistics

Molecule       Sequence role          Total length   Scaffold count  Ungapped length  Scaffold N50  Spanned gaps  Unspanned gaps
All            Assembled molecule     2,789,639,778  2,021           2,752,002,658    81,825,804    160,702       29
Chromosome 1   All                    126,131,484    11              124,627,044      65,614,300    20,165        1
Chromosome 1   Assembled molecule     126,035,930    2               124,535,029      65,614,300    20,161        1
Chromosome 1   Unlocalized scaffolds  95,554         9               92,015           14,101        4             0
Chromosome 2   All                    91,259,313     58              90,109,137       49,887,083    1,908         1
Chromosome 2   Assembled molecule     90,373,283     2               89,240,035       49,887,083    1,868         1
Chromosome 2   Unlocalized scaffolds  886,030        56              869,102          23,270        40            0
Chromosome 3   All                    92,195,678     4               91,091,802       47,752,866    1,391         1
Chromosome 3   Assembled molecule     92,142,175     2               91,038,299       47,752,866    1,391         1
Chromosome 3   Unlocalized scaffolds  53,503         2               53,503           50,916        0             0
Chromosome 4   All                    91,105,315     7               90,047,639       45,081,117    1,435         1
Chromosome 4   Assembled molecule     91,010,382     2               89,953,615       45,081,117    1,434         1
Chromosome 4   Unlocalized scaffolds  94,933         5               94,024           28,683        1             0
Chromosome 5   All                    75,455,240     8               74,275,552       45,109,688    2,193         1
Chromosome 5   Assembled molecule     75,399,963     2               74,220,285       45,109,688    2,192         1
Chromosome 5   Unlocalized scaffolds  55,277         6               55,267           10,200        1             0
Chromosome 6   All                    51,332,646     15              50,047,620       28,238,676    3,378         1
Chromosome 6   Assembled molecule     50,890,351     2               49,611,451       28,238,676    3,335         1
Chromosome 6   Unlocalized scaffolds  442,295        13              436,169          309,086       43            0
Chromosome 7   All                    135,996,728    16              134,909,211      117,867,177   2,261         1
Chromosome 7   Assembled molecule     135,778,131    2               134,695,161      117,867,177   2,255         1
Chromosome 7   Unlocalized scaffolds  218,597        14              214,050          21,387        6             0
Chromosome 8   All                    139,606,421    15              138,455,707      96,654,814    2,250         1
Chromosome 8   Assembled molecule     139,301,422    2               138,154,570      96,654,814    2,240         1
Chromosome 8   Unlocalized scaffolds  304,999        13              301,137          52,097        10            0
Chromosome 9   All                    128,653,568    75              127,460,780      88,869,956    3,873         1
Chromosome 9   Assembled molecule     125,710,982    2               124,555,778      88,869,956    3,697         1
Chromosome 9   Unlocalized scaffolds  2,942,586      73              2,905,002        190,502       176           0
Chromosome 10  All                    129,427,755    23              127,993,544      105,070,696   16,611        1
Chromosome 10  Assembled molecule     128,595,539    2               127,166,577      105,070,696   16,581        1
Chromosome 10  Unlocalized scaffolds  832,216        21              826,967          566,739       30            0
Chromosome 11  All                    128,587,664    5               127,390,151      93,156,124    7,444         1
Chromosome 11  Assembled molecule     128,539,186    2               127,342,693      93,156,124    7,441         1
Chromosome 11  Unlocalized scaffolds  48,478         3               47,458           25,437        3             0
Chromosome 12  All                    108,573,933    7               107,504,334      91,780,694    2,263         1
Chromosome 12  Assembled molecule     108,555,830    2               107,487,139      91,780,694    2,262         1
Chromosome 12  Unlocalized scaffolds  18,103         5               17,195           3,795         1             0
Chromosome 13  All                    98,537,492     8               96,792,878       70,951,432    20,693        1
Chromosome 13  Assembled molecule     98,384,682     2               96,648,038       70,951,432    20,674        1
Chromosome 13  Unlocalized scaffolds  152,810        6               144,840          38,475        19            0
Chromosome 14  All                    107,732,398    8               106,369,707      87,182,210    13,388        1
Chromosome 14  Assembled molecule     107,702,431    2               106,339,740      87,182,210    13,388        1
Chromosome 14  Unlocalized scaffolds  29,967         6               29,967           4,927         0             0
Chromosome 15  All                    91,923,879     7               90,875,490       64,272,882    1,506         1
Chromosome 15  Assembled molecule     91,754,291     2               90,705,972       64,272,882    1,499         1
Chromosome 15  Unlocalized scaffolds  169,588        5               169,518          129,548       7             0
Chromosome 16  All                    75,385,612     18              74,240,842       54,492,653    2,795         1
Chromosome 16  Assembled molecule     75,148,670     2               74,006,318       54,492,653    2,788         1
Chromosome 16  Unlocalized scaffolds  236,942        16              234,524          20,716        7             0
Chromosome 17  All                    72,276,401     14              71,072,988       53,420,528    1,244         1
Chromosome 17  Assembled molecule     71,996,105     2               70,794,876       53,420,528    1,235         1
Chromosome 17  Unlocalized scaffolds  280,296        12              278,112          43,559        9             0
Chromosome 18  All                    72,338,675     5               71,306,875       46,092,638    1,083         1
Chromosome 18  Assembled molecule     72,318,688     2               71,287,888       46,092,638    1,082         1
Chromosome 18  Unlocalized scaffolds  19,987         3               18,987           8,423         1             0
Chromosome 19  All                    33,469,961     12              32,361,868       32,263,144    1,869         1
Chromosome 19  Assembled molecule     33,263,144     1               32,157,988       32,263,144    1,853         1
Chromosome 19  Unlocalized scaffolds  206,817        11              203,880          34,081        16            0
Chromosome 20  All                    133,634,921    16              132,485,500      129,588,469   3,579         1
Chromosome 20  Assembled molecule     130,588,469    1               129,454,487      129,588,469   3,415         1
Chromosome 20  Unlocalized scaffolds  3,046,452      15              3,031,013        1,557,941     164           0
Chromosome 21  All                    127,727,582    16              126,584,169      126,223,203   3,520         1
Chromosome 21  Assembled molecule     127,223,203    1               126,083,898      126,223,203   3,494         1
Chromosome 21  Unlocalized scaffolds  504,379        15              500,271          254,123       26            0
Chromosome 22  All                    101,434,026    13              100,152,878      100,219,884   13,008        1
Chromosome 22  Assembled molecule     101,219,884    1               99,943,137       100,219,884   12,998        1
Chromosome 22  Unlocalized scaffolds  214,142        12              209,741          34,171        10            0
Chromosome 23  All                    83,237,293     24              82,184,051       81,825,804    1,411         1
Chromosome 23  Assembled molecule     82,825,804     1               81,784,379       81,825,804    1,380         1
Chromosome 23  Unlocalized scaffolds  411,489        23              399,672          22,200        31            0
Chromosome 24  All                    84,983,655     5               83,693,488       83,932,903    18,853        1
Chromosome 24  Assembled molecule     84,932,903     1               83,643,612       83,932,903    18,847        1
Chromosome 24  Unlocalized scaffolds  50,752         4               49,876           33,630        6             0
Chromosome 25  All                    85,838,003     6               84,802,276       84,787,240    1,479         1
Chromosome 25  Assembled molecule     85,787,240     1               84,753,853       84,787,240    1,476         1
Chromosome 25  Unlocalized scaffolds  50,763         5               48,423           25,996        3             0
Chromosome 26  Assembled molecule     58,131,712     1               57,099,497       57,131,712    1,313         1
Chromosome 27  All                    48,884,349     34              47,829,923       47,547,382    1,009         1
Chromosome 27  Assembled molecule     48,547,382     1               47,501,612       47,547,382    976           1
Chromosome 27  Unlocalized scaffolds  336,967        33              328,311          16,446        33            0
Chromosome 28  All                    21,960,947     8               20,913,826       20,531,802    1,206         1
Chromosome 28  Assembled molecule     21,531,802     1               20,489,661       20,531,802    1,188         1
Chromosome 28  Unlocalized scaffolds  429,145        7               424,165          379,253       18            0
Chromosome 29  Assembled molecule     24,206,276     1               23,190,358       23,206,276    432           1
Chromosome X   All                    144,378,445    123             141,513,464      130,038,232   5,445         0
Chromosome X   Assembled molecule     130,038,232    1               127,577,269      130,038,232   4,861         0
Chromosome X   Unlocalized scaffolds  14,340,213     122             13,936,195       9,014,310     584           0
Chromosome Y   All                    7,326,878      26              7,213,405        6,181,219     300           0
Chromosome Y   Assembled molecule     6,181,219      1               6,102,628        6,181,219     244           0
Chromosome Y   Unlocalized scaffolds  1,145,659      25              1,110,777        312,249       56            0
unplaced       Assembled molecule     17,905,528     1,432           17,406,654       85,727        1,397         0