P_pygmaeus_2.0.2

  • Record removed. This version of the assembly has been suppressed.
Organism name: Pongo abelii (Sumatran orangutan)
Isolate: ISIS 71
Sex: female
BioSample: SAMN02981238
BioProject: PRJNA20869
Submitter: Orangutan Genome Sequencing Consortium
Date: 2010/03/08
Assembly level: Chromosome
Genome representation: full
GenBank assembly accession: n/a
RefSeq assembly accession: GCF_000001545.4 (suppressed)
RefSeq assembly and GenBank assembly identical: n/a
WGS Project: ABGA01
Genome coverage: 6x

IDs: 395158 [UID] 395158 [RefSeq]

Comment

The Pongo abelii whole genome shotgun data from primary donor-derived reads (Susie, a female Sumatran orangutan housed at the Gladys Porter Zoo, Brownsville, TX) were assembled with PCAP (Huang 2006) using stringent parameters derived by eliminating detectable global mis-assemblies ... (interchromosomal cross-overs determined by alignment of the orangutan genome against the human genome) larger than 50 kb. Sequences were obtained from plasmids, fosmids, and BAC-end sequences (from CHORI-276, also obtained from "Susie"). A fingerprint map with a target of 12X clone coverage is currently in progress.

For the 3.09 Gb of total sequence ordered and oriented along the chromosomes, gap sizes between supercontigs were estimated from the size of the corresponding gap in human, with a maximum allowed gap size of 30 kb.
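The gap-size rule described above can be sketched as follows. This is a minimal illustration, not the consortium's code: the function name, the use of human coordinates of the two flanking supercontigs as input, and the floor of zero for overlapping or inverted flanks are all assumptions. Only the 30 kb cap comes from the text.

```python
MAX_GAP_BP = 30_000  # maximum gap size stated in the assembly notes


def estimate_gap(human_left_end: int, human_right_start: int,
                 max_gap: int = MAX_GAP_BP) -> int:
    """Estimate the gap between two adjacent supercontigs.

    human_left_end    -- human coordinate where the left supercontig's alignment ends
    human_right_start -- human coordinate where the right supercontig's alignment starts

    The estimate is the distance in human, clamped to [0, max_gap].
    The lower bound of 0 is an assumption made for this sketch.
    """
    human_distance = human_right_start - human_left_end
    return max(0, min(human_distance, max_gap))
```

For example, flanks that are 5,000 bp apart in human yield a 5,000 bp gap, while flanks 100,000 bp apart yield the 30,000 bp cap.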

The assembly data were aligned against the human genome at UCSC (B. Raney) utilizing BLASTZ (Schwartz 2003) to align and score non-repetitive orangutan regions against repeat-masked human sequence. Alignment chains differentiated between orthologous and paralogous alignments (Kent 2003) and only "reciprocal best" alignments were retained in the alignment set. The orangutan AGP files were generated from these alignments in a manner similar to that already described (The Chimpanzee Genome Sequencing and Analysis Consortium 2005). Documented inversions based primarily on FISH data (Rocchi, personal communication), as well as inversions suggested by the assembly and supported by additional mapping data (e.g. fosmid end sequences aligned against the human assembly; Chen and Eichler, personal communication), were introduced, as was the separation of alignments to human chromosome 2 into orangutan chromosomes 2A and 2B. Finally, 78 finished BAC (CHORI-273) clones were integrated into the final chromosomal sequences.
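The "reciprocal best" filter mentioned above can be illustrated with a simplified sketch. The actual UCSC procedure operates on chained alignments in genome coordinates; here each alignment is reduced to a hypothetical (orangutan region, human region, score) tuple, and the function name and tie-breaking rule (first seen wins) are assumptions made for illustration only.

```python
def reciprocal_best(alignments):
    """Keep alignments that are the top-scoring hit in both directions.

    alignments -- iterable of (orang_id, human_id, score) tuples
    Returns a sorted list of the retained tuples.
    """
    best_for_orang = {}  # orang_id -> (human_id, score)
    best_for_human = {}  # human_id -> (orang_id, score)
    for orang_id, human_id, score in alignments:
        if orang_id not in best_for_orang or score > best_for_orang[orang_id][1]:
            best_for_orang[orang_id] = (human_id, score)
        if human_id not in best_for_human or score > best_for_human[human_id][1]:
            best_for_human[human_id] = (orang_id, score)
    # An alignment survives only if each side is the other's best partner.
    return sorted(
        (orang_id, human_id, score)
        for orang_id, (human_id, score) in best_for_orang.items()
        if best_for_human[human_id][0] == orang_id
    )
```

In this sketch, a region whose best human partner prefers a different orangutan region is dropped, which is the property that separates one-to-one (likely orthologous) pairs from secondary (likely paralogous) hits.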

Background information on the orangutan genome sequencing project and the initial news release about the orangutan assembly can be found on the Washington University School of Medicine and NHGRI websites. 

Bulk downloads of the sequence and annotation data are available via GenBank, Ensembl, DDBJ and the UCSC Genome Browser. The complete set of sequence reads is available at the NCBI Trace Archive.

Credits:

Genome Sequence - Washington University School of Medicine, Baylor College of Medicine

Sequence Assembly and Chromosomal Sequence/AGP Construction - Washington University School of Medicine

BAC library - Yuko Yoshinaga in Pieter de Jong's laboratory at Children's Hospital Oakland Research Institute, Oakland, California, USA (http://bacpac.chori.org)

Fosmid Library - Washington University School of Medicine

Fingerprint Map (in progress) - Washington University School of Medicine

Cytogenetic Mapping and Human/Orang Breakpoint Analyses - Mariano Rocchi, Department of Genetics and Microbiology, University of Bari, Bari, Italy (http://www.biologia.uniba.it/DIGEMI/)

Fosmid End Placement against the human genome used for breakpoint/inversion analyses during AGP construction - Lin Chen, Evan Eichler, Department of Genome Sciences, University of Washington, Seattle, Washington, USA

Funding for the sequence characterization of the orangutan genome is being provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).

Assembly name: P_pygmaeus_2.0.2
Coverage: 6x

Eight contigs were suppressed because they may contain contamination.
Singleton scaffolds KE647387-KE650770 were added in August 2013 because the component is in the minus orientation, so the scaffold and component sequences are not identical.

Global statistics

Total sequence length: 3,441,244,233
Total ungapped length: 3,093,565,778
Gaps between scaffolds: 17,860
Number of scaffolds: 79,342
Scaffold N50: 747,460
Scaffold L50: 1,057
Number of contigs: 408,552
Contig N50: 15,648
Contig L50: 55,289
Total number of chromosomes and plasmids: 25
Number of component sequences (WGS or clone): 408,686
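For readers unfamiliar with the N50 and L50 figures above, the following sketch shows the usual definition: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length, and L50 is the number of sequences in that set. The function name is hypothetical, and whether the reported values were computed on gapped or ungapped lengths is not stated in this record.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of scaffold or contig lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")
```

With lengths 80, 70, 50, 40, 30 and 20 (total 290), the two longest sequences already cover 150 of the 145 needed, so the function returns an N50 of 70 and an L50 of 2.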

Global assembly definition

Assembly Unit Name
Primary Assembly
non-nuclear
Assembly Unit: Primary Assembly (GCF_000001525.3)
Molecule name   GenBank sequence       RefSeq sequence   Unlocalized sequences count
Chromosome 1    CM000550.1        =    NC_012591.1       5,639
Chromosome 2A   CM000551.1        =    NC_012592.1       2,276
Chromosome 2B   CM000552.1        =    NC_012593.1       2,553
Chromosome 3    CM000553.1        =    NC_012594.1       3,491
Chromosome 4    CM000554.1        =    NC_012595.1       3,851
Chromosome 5    CM000555.1        =    NC_012596.1       2,909
Chromosome 6    CM000556.1        =    NC_012597.1       2,432
Chromosome 7    CM000557.1        =    NC_012598.1       2,691
Chromosome 8    CM000558.1        =    NC_012599.1       2,632
Chromosome 9    CM000559.1        =    NC_012600.1       2,239
Chromosome 10   CM000560.1        =    NC_012601.1       5,798
Chromosome 11   CM000561.1        =    NC_012602.1       746
Chromosome 12   CM000562.1        =    NC_012603.1       2,014
Chromosome 13   CM000563.1        =    NC_012604.1       1,656
Chromosome 14   CM000564.1        =    NC_012605.1       1,269
Chromosome 15   CM000565.1        =    NC_012606.1       1,500
Chromosome 16   CM000566.1        =    NC_012607.1       1,523
Chromosome 17   CM000567.1        =    NC_012608.1       1,453
Chromosome 18   CM000568.1        =    NC_012609.1       1,238
Chromosome 19   CM000569.1        =    NC_012610.1       1,162
Chromosome 20   CM000570.1        =    NC_012611.1       884
Chromosome 21   CM000571.1        =    NC_012612.1       733
Chromosome 22   CM000572.1        =    NC_012613.1       409
Chromosome X    CM000573.1        =    NC_012614.1       1,965
unplaced        n/a               n/a  n/a               8,447

Assembly statistics

Molecule       Sequence Role          Total Length   Scaffold Count  Ungapped Length  Scaffold N50  Spanned Gaps  Unspanned Gaps
All            Assembled molecule     3,441,227,734  79,341          3,093,549,279    747,460       329,210       17,860
Chromosome 1   All                    265,347,102    6,784           247,739,428      692,802       27,333        1,146
Chromosome 1   Assembled molecule     229,942,017    1,145           216,062,818      919,070       22,418        1,146
Chromosome 1   Unlocalized scaffolds  35,405,085     5,639           31,676,610       8,701         4,915         0
Chromosome 2A  All                    126,669,488    2,988           116,914,153      642,468       11,769        713
Chromosome 2A  Assembled molecule     113,028,656    712             104,617,294      778,113       10,043        713
Chromosome 2A  Unlocalized scaffolds  13,640,832     2,276           12,296,859       8,217         1,726         0
Chromosome 2B  All                    149,759,979    3,276           139,692,184      941,643       13,987        724
Chromosome 2B  Assembled molecule     135,000,294    723             126,498,174      1,053,030     11,970        724
Chromosome 2B  Unlocalized scaffolds  14,759,685     2,553           13,194,010       7,480         2,017         0
Chromosome 3   All                    222,374,710    4,343           208,583,915      961,141       20,227        853
Chromosome 3   Assembled molecule     202,140,232    852             190,357,224      1,109,925     17,533        853
Chromosome 3   Unlocalized scaffolds  20,234,478     3,491           18,226,691       7,937         2,694         0
Chromosome 4   All                    221,124,155    5,317           206,625,489      638,335       21,207        1,467
Chromosome 4   Assembled molecule     198,332,218    1,466           186,137,069      800,814       18,147        1,467
Chromosome 4   Unlocalized scaffolds  22,791,937     3,851           20,488,420       8,247         3,060         0
Chromosome 5   All                    201,493,775    3,806           188,445,063      1,062,220     18,391        898
Chromosome 5   Assembled molecule     183,952,662    897             172,728,739      1,203,591     16,151        898
Chromosome 5   Unlocalized scaffolds  17,541,113     2,909           15,716,324       8,483         2,240         0
Chromosome 6   All                    190,442,443    3,216           177,914,195      1,105,206     17,131        785
Chromosome 6   Assembled molecule     174,210,431    784             164,090,093      1,251,647     15,117        785
Chromosome 6   Unlocalized scaffolds  16,232,012     2,432           13,824,102       9,042         2,014         0
Chromosome 7   All                    176,319,695    3,858           162,732,837      670,005       17,174        1,168
Chromosome 7   Assembled molecule     157,549,271    1,167           145,688,237      896,189       14,826        1,168
Chromosome 7   Unlocalized scaffolds  18,770,424     2,691           17,044,600       9,744         2,348         0
Chromosome 8   All                    168,319,326    3,534           153,958,699      838,382       15,495        903
Chromosome 8   Assembled molecule     153,482,349    902             140,595,658      935,807       13,540        903
Chromosome 8   Unlocalized scaffolds  14,836,977     2,632           13,363,041       7,695         1,955         0
Chromosome 9   All                    148,762,520    2,959           121,461,004      738,933       12,701        722
Chromosome 9   Assembled molecule     135,191,526    720             109,205,901      895,662       10,952        722
Chromosome 9   Unlocalized scaffolds  13,570,994     2,239           12,255,103       8,577         1,749         0
Chromosome 10  All                    171,830,649    6,653           159,247,864      546,138       17,520        856
Chromosome 10  Assembled molecule     133,410,057    855             124,372,836      857,733       12,229        856
Chromosome 10  Unlocalized scaffolds  38,420,592     5,798           34,875,028       9,318         5,291         0
Chromosome 11  All                    138,255,412    1,551           129,947,183      941,540       13,055        806
Chromosome 11  Assembled molecule     132,107,971    805             124,384,326      996,926       12,213        806
Chromosome 11  Unlocalized scaffolds  6,147,441      746             5,562,857        10,398        842           0
Chromosome 12  All                    147,917,946    2,745           139,133,514      949,027       14,640        732
Chromosome 12  Assembled molecule     136,387,465    731             128,778,486      1,073,908     13,154        732
Chromosome 12  Unlocalized scaffolds  11,530,481     2,014           10,355,028       7,702         1,486         0
Chromosome 13  All                    127,049,429    2,285           103,695,857      890,348       10,175        631
Chromosome 13  Assembled molecule     117,095,149    629             94,686,994       1,089,223     8,928         631
Chromosome 13  Unlocalized scaffolds  9,954,280      1,656           9,008,863        8,080         1,247         0
Chromosome 14  All                    116,196,699    1,754           93,156,591       933,168       9,767         487
Chromosome 14  Assembled molecule     108,868,599    485             86,578,786       994,390       8,788         487
Chromosome 14  Unlocalized scaffolds  7,328,100      1,269           6,577,805        8,041         979           0
Chromosome 15  All                    109,930,228    1,979           85,836,665       648,144       9,109         481
Chromosome 15  Assembled molecule     99,152,023     479             75,962,062       799,990       7,747         481
Chromosome 15  Unlocalized scaffolds  10,778,205     1,500           9,874,603        10,326        1,362         0
Chromosome 16  All                    89,960,131     2,010           81,912,926       652,907       9,688         488
Chromosome 16  Assembled molecule     77,800,216     487             70,802,788       785,641       8,136         488
Chromosome 16  Unlocalized scaffolds  12,159,915     1,523           11,110,138       11,170        1,552         0
Chromosome 17  All                    90,905,349     2,045           83,475,888       561,027       10,879        593
Chromosome 17  Assembled molecule     73,212,453     592             67,082,595       567,124       8,553         593
Chromosome 17  Unlocalized scaffolds  17,692,896     1,453           16,393,293       59,850        2,326         0
Chromosome 18  All                    100,653,153    1,645           79,463,256       936,823       7,593         409
Chromosome 18  Assembled molecule     94,050,890     407             73,516,577       994,733       6,786         409
Chromosome 18  Unlocalized scaffolds  6,602,263      1,238           5,946,679        6,689         807           0
Chromosome 19  All                    68,617,174     2,245           58,278,828       140,387       10,032        1,084
Chromosome 19  Assembled molecule     60,714,840     1,083           51,368,159       167,656       8,767         1,084
Chromosome 19  Unlocalized scaffolds  7,902,334      1,162           6,910,669        9,711         1,265         0
Chromosome 20  All                    68,042,155     1,258           62,826,650       698,716       6,857         375
Chromosome 20  Assembled molecule     62,736,349     374             58,062,051       747,367       6,185         375
Chromosome 20  Unlocalized scaffolds  5,305,806      884             4,764,599        7,710         672           0
Chromosome 21  All                    52,534,223     1,227           36,747,533       279,752       4,075         496
Chromosome 21  Assembled molecule     48,394,510     494             33,052,923       336,758       3,507         496
Chromosome 21  Unlocalized scaffolds  4,139,713      733             3,694,610        7,376         568           0
Chromosome 22  All                    50,079,548     758             33,436,212       442,835       4,230         351
Chromosome 22  Assembled molecule     46,535,552     349             30,217,974       472,421       3,736         351
Chromosome 22  Unlocalized scaffolds  3,543,996      409             3,218,238        12,543        494           0
Chromosome X   All                    166,443,959    2,658           157,283,252      1,228,248     15,383        692
Chromosome X   Assembled molecule     156,195,299    693             148,147,746      1,315,485     13,965        692
Chromosome X   Unlocalized scaffolds  10,248,660     1,965           9,135,506        6,557         1,418         0
unplaced       Assembled molecule     72,198,486     8,447           65,040,093       12,559        10,792        0
Molecule          Total Length
Mitochondrion MT  16,499