Macaca_fascicularis_5.0

Organism name:
Macaca fascicularis (crab-eating macaque)
Sex:
female
BioSample:
SAMN00811240
BioProject:
PRJNA20409
Submitter:
Washington University (WashU)
Date:
2013/06/12
Synonyms:
macFas5
Assembly level:
Chromosome
Genome representation:
full
Excluded from RefSeq:
  • superseded by newer assembly for species
GenBank assembly accession:
GCA_000364345.1 (latest)
RefSeq assembly accession:
GCF_000364345.1 (suppressed) see latest RefSeq assembly for this species
RefSeq assembly and GenBank assembly identical:
no
  • Only in RefSeq: chromosome MT
  • Data displayed for GenBank version
WGS Project:
AQIA01
Assembly method:
SOAPdenovo v. 1.0.5, SRPRISM v. 2.4; ARGO v. 0.1
Genome coverage:
68x
Sequencing technology:
Illumina HiSeq

IDs: 704988 [UID] 704988 [GenBank] 779818 [RefSeq]

There are 13 assemblies for this organism

Comment

Macaca fascicularis (cynomolgus macaque) Sequence Assembly Release Notes
 The cynomolgus macaque DNA for shotgun sequencing was derived from a 5.8-year-old female provided by Dr. Jay Kaplan. The animal originated from "Tinjil", not a native location for cynomolgus, rather ... it is an island off the south coast of Java that was seeded with monkeys by the Washington National Primate Center. The original animal was trapped in eastern Sumatra. Sequences were generated on the Illumina HiSeq for assisted and de novo assembly. Sequence genome coverage for each paired-end read type is as follows: 50x from 300-500 bp inserts, 10x from 3 kb inserts, and 2x from 8 kb inserts.
 Two independent assemblies were built with all sequence data, using an assisted assembler and the de novo assembler SOAP. The workflow for the assisted assembly is as follows:
 1) Map reads to the reference and filter alignments using SRprism (unpublished, but in the process of being written up), which reports all alignments of equally good quality. Filtering is done by first computing the per-library histogram of insert sizes seen in alignments, deciding which range to use (usually the tightest 99th percentile), and then retaining paired reads that have the correct orientation and an insert size in the desired range. Different data types (Illumina, traces, SOLiD, 454) have slightly different filtering criteria.
 2) Use the mapped and filtered reads to build consensus contigs.
 3) Find consecutive contigs that are bridged by mate pairs having 30-mers each on either side of the gap, then perform de novo assembly in the gaps between bridged contigs. 30-mers from reads are used to build an index for the de novo assembly; only filtered-out reads and reads mapped to contig ends go into that index. A predefined maximum gap size and number of iterations limit the resources spent on any particular gap.
 4) Find structural differences between the scaffolds built and the reference by using paired reads with mates on different scaffolds, and do de novo gap filling between reordered scaffolds [in progress].
 The reference genomes used to align cynomolgus macaque reads were the published rhesus macaque assembly (MMUL_1) and an updated rhesus macaque assembly not yet published (courtesy of Aleksey Zimin). Using the assisted assembly as the reference, we aligned and merged the de novo assembly using the GAA tool.
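 The insert-size filtering described in step 1 can be sketched roughly as follows. This is only an illustration, not SRprism's code: the release notes do not define "tightest 99th percentile", so the sketch assumes it means the narrowest window that contains 99% of the observed insert sizes for a library. The function names `tightest_range` and `filter_pairs` and the fields `insert_size` and `proper_orientation` are hypothetical.

```python
def tightest_range(insert_sizes, percent=99):
    """Return the narrowest (lo, hi) window holding `percent`% of the sizes.

    Assumption: this is one reading of "tightest 99th percentile".
    """
    if not insert_sizes:
        raise ValueError("need at least one insert size")
    sizes = sorted(insert_sizes)
    n = len(sizes)
    # Integer ceiling of percent * n / 100, at least one observation.
    k = max(1, (percent * n + 99) // 100)
    best = (sizes[0], sizes[k - 1])
    # Slide a window of k consecutive sorted values and keep the narrowest.
    for i in range(1, n - k + 1):
        lo, hi = sizes[i], sizes[i + k - 1]
        if hi - lo < best[1] - best[0]:
            best = (lo, hi)
    return best


def filter_pairs(pairs, percent=99):
    """Keep pairs with correct orientation and an in-range insert size.

    `pairs` is a list of dicts with the hypothetical keys
    'insert_size' (int) and 'proper_orientation' (bool).
    """
    lo, hi = tightest_range([p["insert_size"] for p in pairs], percent)
    return [
        p for p in pairs
        if p["proper_orientation"] and lo <= p["insert_size"] <= hi
    ]


# Toy library: 99 plausible insert sizes plus one chimeric outlier.
sizes = list(range(350, 449)) + [5000]
pairs = [{"insert_size": s, "proper_orientation": True} for s in sizes]
pairs[0]["proper_orientation"] = False  # one mis-oriented pair
kept = filter_pairs(pairs)
print(tightest_range(sizes), len(kept))
```

 In this toy library the outlier at 5000 bp falls outside the chosen window and the mis-oriented pair is dropped as well, so 98 of the 100 pairs are retained.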
 In the final assembly, referred to as Macaca_fascicularis_5.0, there were 102,878 contigs with an N50 contig length of 85 kb. There were 7,627 supercontigs (scaffolds) with an N50 supercontig length of 144 Mb. A total of 2.8 Gb was assembled in contigs.
 ****************************************************
 Macaca fascicularis Sequence and Assembly Credits

 DNA source - Dr. Jay Kaplan, Wake Forest Primate Facility, Wake Forest, NC.
 Genome Sequence - The Genome Institute, Washington University School of Medicine, St Louis, MO.
 Sequence Assembly - Richa Agarwala, Sergey Shiryaev, NCBI and The Genome Institute, Washington University School of Medicine, St Louis, MO.
 Assembly curation - LaDeana Hillier, The Genome Institute, Washington University School of Medicine, St Louis, MO.
 FISH mapping data - Mariano Rocchi, Department of Biology, University of Bari, Bari, Italy.
 Funding for the sequence characterization of the cynomolgus macaque genome was provided by NHGRI. 
 Author List: Richard K. Wilson, Wesley C. Warren
 ****************************************************
 Chromosome lengths
 Column 1 = Chromosome
 Column 2 = Chromosome length (including estimated gap sizes)
 Column 3 = Chromosome sequence length (without including estimated gap sizes)

 MFA1   227556264  217433370
 MFA2   192460366  186559336
 MFA3   192294377  180410849
 MFA4   170955103  164881207
 MFA5   189454096  183527297
 MFA6   181584905  175247550
 MFA7   171882078  164071319
 MFA8   146850525  140657447
 MFA9   133195287  127272501
 MFA10  96509753   90761517
 MFA11  137757926  132144036
 MFA12  132586672  127191125
 MFA13  111193037  106335528
 MFA14  130733371  123895447
 MFA15  112612857  107712928
 MFA16  80997621   74103573
 MFA17  96864807   92008008
 MFA18  75711847   71766527
 MFA19  59248254   51391499
 MFA20  78541002   72393001
 MFAX   152835861  144357465
 An additional 69.8Mb of sequence is unlocalized.
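 Because column 2 includes estimated gap sizes and column 3 does not, their difference gives the estimated gap bases per chromosome. A minimal sketch, assuming the table is read as whitespace-separated name/length/length triples; the helper names `parse_lengths` and `gap_summary` are hypothetical, and only three rows are included as an excerpt:

```python
def parse_lengths(flat):
    """Parse 'NAME with_gaps without_gaps ...' triples into a dict."""
    tokens = flat.split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected name/length/length triples")
    rows = {}
    for i in range(0, len(tokens), 3):
        rows[tokens[i]] = (int(tokens[i + 1]), int(tokens[i + 2]))
    return rows


def gap_summary(rows):
    """Map each chromosome to (estimated gap bases, gap fraction of length)."""
    return {
        name: (with_gaps - without_gaps, (with_gaps - without_gaps) / with_gaps)
        for name, (with_gaps, without_gaps) in rows.items()
    }


# Excerpt of the chromosome length table (values copied from the notes).
excerpt = (
    "MFA1 227556264 217433370 "
    "MFA19 59248254 51391499 "
    "MFAX 152835861 144357465"
)

for name, (gap_bp, gap_frac) in gap_summary(parse_lengths(excerpt)).items():
    print(f"{name}: {gap_bp} bp in estimated gaps ({gap_frac:.1%})")
```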
 ***********************************************************************************
 Assembly statistics:
 *** Contiguity: Contig ***
 Total contig number: 102878
 Total contig bases: 2805274345 bp
 Average contig length: 27268 bp
 Maximum contig length: 764150 bp
 N50 contig length: 85974 bp
 N50 contig number: 9304

 Major contig (> 500 bp) number: 81458
 Major_contig bases: 2801848696 bp
 Major_contig avg contig length: 34396
 Major_contig N50 contig length: 86137
 Major_contig N50 contig number: 9284

 *** Contiguity: Supercontig ***
 Total supercontig number: 7627
 Average supercontig length: 367808 bp
 Maximum supercontig length: 221345846 bp
 N50 supercontig length: 144445942 bp
 N50 supercontig number: 8

 Major supercontig (> 500 bp) number: 7587
 Major_supercontig bases: 2805261977 bp
 Major_supercontig avg ...
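 For readers unfamiliar with the N50 figures above, the usual definition can be sketched as follows: sort the sequences from longest to shortest and accumulate lengths until at least half of the total bases are covered. The length reached at that point is the N50 length, and the number of sequences needed is what these notes call the "N50 number" (often called L50 elsewhere). The function name `n50_l50` and the toy lengths are illustrative assumptions, not data from this assembly.

```python
def n50_l50(lengths):
    """Return (N50 length, number of sequences needed to reach it)."""
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Stop at the first sequence that brings the total to half or more.
        if running >= half:
            return length, count


toy_contigs = [10, 8, 6, 4, 2]  # total 30, half is 15
print(n50_l50(toy_contigs))     # (8, 2): 10 + 8 = 18 >= 15
```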

Global statistics

Total sequence length                        2,946,827,162
Total ungapped length                        2,803,850,123
Gaps between scaffolds                       24
Number of scaffolds                          7,624
Scaffold N50                                 88,649,475
Scaffold L50                                 14
Number of contigs                            87,763
Contig N50                                   86,040
Contig L50                                   9,296
Total number of chromosomes and plasmids     21
Number of component sequences (WGS or clone) 87,763

Global assembly definition

Assembly Unit: Primary Assembly (GCA_000364355.1)

Molecule name   GenBank sequence      RefSeq sequence   Unlocalized sequences count
Chromosome 1    CM001919.1        =   NC_022272.1       28
Chromosome 2    CM001920.1        =   NC_022273.1       20
Chromosome 3    CM001921.1        =   NC_022274.1       41
Chromosome 4    CM001922.1        =   NC_022275.1       69
Chromosome 5    CM001923.1        =   NC_022276.1       22
Chromosome 6    CM001924.1        =   NC_022277.1       15
Chromosome 7    CM001925.1        =   NC_022278.1       25
Chromosome 8    CM001926.1        =   NC_022279.1       17
Chromosome 9    CM001927.1        =   NC_022280.1       27
Chromosome 10   CM001928.1        =   NC_022281.1       18
Chromosome 11   CM001929.1        =   NC_022282.1       19
Chromosome 12   CM001930.1        =   NC_022283.1       9
Chromosome 13   CM001931.1        =   NC_022284.1       12
Chromosome 14   CM001932.1        =   NC_022285.1       18
Chromosome 15   CM001933.1        =   NC_022286.1       19
Chromosome 16   CM001934.1        =   NC_022287.1       22
Chromosome 17   CM001935.1        =   NC_022288.1       29
Chromosome 18   CM001936.1        =   NC_022289.1       5
Chromosome 19   CM001937.1        =   NC_022290.1       28
Chromosome 20   CM001938.1        =   NC_022291.1       17
Chromosome X    CM001939.1        =   NC_022292.1       12
unplaced        n/a               n/a n/a               7,107

Assembly statistics

Molecule       Sequence role          Total length   Scaffold count  Ungapped length  Scaffold N50  Spanned gaps  Unspanned gaps
All            Assembled molecule     2,946,827,162  7,624           2,803,850,123    88,649,475    80,139        24
Chromosome 1   All                    228,955,476    30              218,712,996      138,665,737   6,472         1
Chromosome 1   Assembled molecule     227,556,264    2               217,433,370      138,665,737   6,365         1
Chromosome 1   Unlocalized scaffolds  1,399,212      28              1,279,626        152,538       107           0
Chromosome 2   All                    192,909,870    22              186,971,721      125,741,623   4,310         1
Chromosome 2   Assembled molecule     192,460,366    2               186,559,336      125,741,623   4,278         1
Chromosome 2   Unlocalized scaffolds  449,504        20              412,385          63,034        32            0
Chromosome 3   All                    193,512,962    43              181,472,993      129,674,376   4,931         1
Chromosome 3   Assembled molecule     192,294,377    2               180,410,849      129,674,376   4,831         1
Chromosome 3   Unlocalized scaffolds  1,218,585      41              1,062,144        64,819        100           0
Chromosome 4   All                    173,232,586    71              166,827,843      119,724,987   4,140         1
Chromosome 4   Assembled molecule     170,955,103    2               164,881,207      119,724,987   3,886         1
Chromosome 4   Unlocalized scaffolds  2,277,483      69              1,946,636        65,370        254           0
Chromosome 5   All                    190,271,649    25              184,225,783      105,597,079   4,219         2
Chromosome 5   Assembled molecule     189,454,096    3               183,527,297      105,597,079   4,132         2
Chromosome 5   Unlocalized scaffolds  817,553        22              698,486          87,690        87            0
Chromosome 6   All                    181,913,984    17              175,554,307      132,081,036   4,197         1
Chromosome 6   Assembled molecule     181,584,905    2               175,247,550      132,081,036   4,173         1
Chromosome 6   Unlocalized scaffolds  329,079        15              306,757          74,686        24            0
Chromosome 7   All                    172,638,558    27              164,699,552      107,376,033   4,657         1
Chromosome 7   Assembled molecule     171,882,078    2               164,071,319      107,376,033   4,588         1
Chromosome 7   Unlocalized scaffolds  756,480        25              628,233          53,086        69            0
Chromosome 8   All                    147,524,517    19              141,254,680      100,378,728   3,220         1
Chromosome 8   Assembled molecule     146,850,525    2               140,657,447      100,378,728   3,169         1
Chromosome 8   Unlocalized scaffolds  673,992        17              597,233          423,537       51            0
Chromosome 9   All                    133,925,568    29              127,928,433      92,187,018    3,404         1
Chromosome 9   Assembled molecule     133,195,287    2               127,272,501      92,187,018    3,353         1
Chromosome 9   Unlocalized scaffolds  730,281        27              655,932          46,688        51            0
Chromosome 10  All                    96,855,220     20              91,082,612       60,750,290    3,221         1
Chromosome 10  Assembled molecule     96,509,753     2               90,761,517       60,750,290    3,192         1
Chromosome 10  Unlocalized scaffolds  345,467        18              321,095          27,377        29            0
Chromosome 11  All                    138,871,270    21              133,185,823      101,971,489   4,037         1
Chromosome 11  Assembled molecule     137,757,926    2               132,144,036      101,971,489   3,979         1
Chromosome 11  Unlocalized scaffolds  1,113,344      19              1,041,787        603,909       58            0
Chromosome 12  All                    132,992,465    12              127,546,852      106,729,359   2,917         2
Chromosome 12  Assembled molecule     132,586,672    3               127,191,125      106,729,359   2,881         2
Chromosome 12  Unlocalized scaffolds  405,793        9               355,727          70,353        36            0
Chromosome 13  All                    111,518,885    15              106,613,556      88,649,475    2,787         2
Chromosome 13  Assembled molecule     111,193,037    3               106,335,528      88,649,475    2,745         2
Chromosome 13  Unlocalized scaffolds  325,848        12              278,028          75,808        42            0
Chromosome 14  All                    132,195,074    20              125,296,749      67,271,947    3,413         1
Chromosome 14  Assembled molecule     130,733,371    2               123,895,447      67,271,947    3,362         1
Chromosome 14  Unlocalized scaffolds  1,461,703      18              1,401,302        992,516       51            0
Chromosome 15  All                    113,086,136    21              108,135,860      98,078,252    3,044         1
Chromosome 15  Assembled molecule     112,612,857    2               107,712,928      98,078,252    2,990         1
Chromosome 15  Unlocalized scaffolds  473,279        19              422,932          57,610        54            0
Chromosome 16  All                    82,009,341     24              75,009,131       57,635,534    3,313         1
Chromosome 16  Assembled molecule     80,997,621     2               74,103,573       57,635,534    3,226         1
Chromosome 16  Unlocalized scaffolds  1,011,720      22              905,558          90,896        87            0
Chromosome 17  All                    97,546,542     31              92,614,488       53,877,095    2,293         1
Chromosome 17  Assembled molecule     96,864,807     2               92,008,008       53,877,095    2,227         1
Chromosome 17  Unlocalized scaffolds  681,735        29              606,480          47,900        66            0
Chromosome 18  All                    75,807,056     7               71,849,476       48,683,468    1,696         1
Chromosome 18  Assembled molecule     75,711,847     2               71,766,527       48,683,468    1,685         1
Chromosome 18  Unlocalized scaffolds  95,209         5               82,949           28,324        11            0
Chromosome 19  All                    60,553,053     30              52,581,056       33,250,003    3,739         1
Chromosome 19  Assembled molecule     59,248,254     2               51,391,499       33,250,003    3,627         1
Chromosome 19  Unlocalized scaffolds  1,304,799      28              1,189,557        71,052        112           0
Chromosome 20  All                    78,739,555     19              72,565,362       44,637,171    2,652         1
Chromosome 20  Assembled molecule     78,541,002     2               72,393,001       44,637,171    2,624         1
Chromosome 20  Unlocalized scaffolds  198,553        17              172,361          21,082        28            0
Chromosome X   All                    153,601,299    14              145,064,668      93,731,725    4,531         1
Chromosome X   Assembled molecule     152,835,861    2               144,357,465      93,731,725    4,482         1
Chromosome X   Unlocalized scaffolds  765,438        12              707,203          214,373       49            0
unplaced       Assembled molecule     58,166,096     7,107           54,656,182       22,498        2,946         0
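
For each chromosome, the statistics above report three sequence roles: All, Assembled molecule, and Unlocalized scaffolds. In the figures shown, the additive columns (total length, scaffold count, ungapped length, spanned gaps) for "All" appear to equal the sum of the other two roles. The sketch below checks that relationship for two chromosomes; the relationship is inferred from the numbers rather than stated on the page, and the names `ROWS` and `is_additive` are hypothetical.

```python
# Each tuple is (All, Assembled molecule, Unlocalized scaffolds),
# with values copied from the per-chromosome statistics.
ROWS = {
    "Chromosome 1": {
        "total_length": (228955476, 227556264, 1399212),
        "scaffold_count": (30, 2, 28),
        "ungapped_length": (218712996, 217433370, 1279626),
        "spanned_gaps": (6472, 6365, 107),
    },
    "Chromosome 18": {
        "total_length": (75807056, 75711847, 95209),
        "scaffold_count": (7, 2, 5),
        "ungapped_length": (71849476, 71766527, 82949),
        "spanned_gaps": (1696, 1685, 11),
    },
}


def is_additive(triple):
    """True if the 'All' value equals assembled plus unlocalized."""
    all_value, assembled, unlocalized = triple
    return all_value == assembled + unlocalized


for molecule, metrics in ROWS.items():
    for metric, triple in metrics.items():
        status = "ok" if is_additive(triple) else "MISMATCH"
        print(f"{molecule:14s} {metric:16s} {status}")
```

Scaffold N50 is deliberately left out of the check, since N50 is a property of the length distribution and is not expected to add up across roles.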