The Pongo abelii whole genome shotgun (WGS) project has the project accession ABGA00000000. This version of the project (01) has the accession number ABGA01000000, and consists of sequences ABGA01000001-ABGA01410168.
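The accession range above can be expanded programmatically. The following is a minimal sketch, assuming the accessions follow the layout visible in the text (a four-letter prefix, a two-digit version, and a six-digit serial number); the helper names `wgs_accession` and `parse_wgs_accession` are hypothetical and not part of any NCBI tool.

```python
# Sketch: build and parse WGS contig accessions such as ABGA01000001.
# Assumption: prefix (4 letters) + version (2 digits) + serial (6 digits),
# as seen in the range ABGA01000001-ABGA01410168.

def wgs_accession(prefix: str, version: int, serial: int, width: int = 6) -> str:
    """Format one contig accession from its parts."""
    return f"{prefix}{version:02d}{serial:0{width}d}"


def parse_wgs_accession(accession: str) -> tuple[str, int, int]:
    """Split an accession into (prefix, version, serial)."""
    return accession[:4], int(accession[4:6]), int(accession[6:])


first = parse_wgs_accession("ABGA01000001")
last = parse_wgs_accession("ABGA01410168")

# Number of contig records covered by the stated range.
contig_count = last[2] - first[2] + 1
```

Under these assumptions, `contig_count` is the number of contig accessions in version 01 of the project.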
The Pongo abelii whole genome shotgun data from primary donor-derived reads (Susie, a female Sumatran orangutan housed at the Gladys Porter Zoo, Brownsville, TX) were assembled with PCAP (Huang 2006) using stringent parameters, derived by eliminating detectable global mis-assemblies (interchromosomal cross-overs determined by alignment of the orangutan genome against the human genome) larger than 50 kb. Sequences were obtained from plasmids, fosmids, and BAC-end sequences (from CHORI-276, also derived from Susie). A fingerprint map with a target of 12X clone coverage is currently in progress.
For the 3.09 Gb of total sequence ordered and oriented along the chromosomes, gap sizes between supercontigs were estimated based on their size in human, with a maximum allowed gap size of 30 kb.
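The gap-size rule can be illustrated with a short sketch. It assumes each gap is estimated from the distance between the human coordinates of the two flanking supercontigs and then capped at 30 kb; the function name `estimate_gap` and the floor of zero for overlapping placements are assumptions made for illustration, not details given in the assembly description.

```python
# Sketch: estimate the gap between two adjacent supercontigs from the size
# of the corresponding interval in human, capped at 30 kb.
# Assumption: negative distances (overlapping placements) are floored at 0.

MAX_GAP = 30_000  # maximum allowed gap size, in bases


def estimate_gap(prev_human_end: int, next_human_start: int,
                 max_gap: int = MAX_GAP, min_gap: int = 0) -> int:
    """Return the gap size to place between two supercontigs."""
    human_gap = next_human_start - prev_human_end
    return max(min_gap, min(human_gap, max_gap))


small_gap = estimate_gap(1_000, 6_000)      # human interval of 5 kb is used as-is
capped_gap = estimate_gap(1_000, 100_000)   # human interval of 99 kb is capped
overlap_gap = estimate_gap(5_000, 4_000)    # overlapping placements get no gap
```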
The assembly data were aligned against the human genome at UCSC (B. Raney) using BLASTZ (Schwartz 2003) to align and score non-repetitive orangutan regions against repeat-masked human sequence. Alignment chains differentiated between orthologous and paralogous alignments (Kent 2003), and only 'reciprocal best' alignments were retained in the alignment set. The orangutan AGP files were generated from these alignments in a manner similar to that already described (The Chimpanzee Genome Sequencing and Analysis Consortium 2005). Documented inversions based primarily on FISH data (Rocchi, personal communication), as well as inversions suggested by the assembly and supported by additional mapping data (e.g. fosmid end sequences placed against the human assembly; Chen and Eichler, personal communication), were introduced, as was the separation of alignments to human chromosome 2 into orangutan chromosomes 2A and 2B. Finally, 78 finished BAC (CHORI-273) clones were integrated into the final chromosomal sequences.
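The 'reciprocal best' criterion can be sketched on toy data. This is a simplified illustration, not the UCSC chain/net pipeline: it assumes each alignment is reduced to a single (orangutan region, human region, score) triple, and the region labels and scores below are invented for the example.

```python
# Sketch: keep only alignments where the orangutan region's best-scoring
# human partner also has that orangutan region as its best-scoring partner.
# Assumption: one score per (orangutan, human) pair; real chains are richer.

def reciprocal_best(alignments):
    """Return the sorted list of reciprocal-best (orang, human, score) triples."""
    best_for_orang = {}  # orangutan region -> (human region, score)
    best_for_human = {}  # human region -> (orangutan region, score)
    for orang, human, score in alignments:
        if orang not in best_for_orang or score > best_for_orang[orang][1]:
            best_for_orang[orang] = (human, score)
        if human not in best_for_human or score > best_for_human[human][1]:
            best_for_human[human] = (orang, score)
    return sorted(
        (orang, human, score)
        for orang, (human, score) in best_for_orang.items()
        if best_for_human[human][0] == orang
    )


# Hypothetical toy alignments (labels and scores are illustrative only).
toy = [
    ("o1", "h1", 90),
    ("o1", "h2", 70),
    ("o2", "h1", 80),
    ("o2", "h3", 60),
    ("o3", "h2", 75),
]
kept = reciprocal_best(toy)
```

Here `o2` is dropped because its best human partner, `h1`, aligns better to `o1`; this is the kind of paralogous or secondary hit that the reciprocal filter is meant to exclude.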
Background information on the orangutan genome sequencing project and the initial news release about the orangutan assembly can be found on the Washington University School of Medicine and NHGRI websites.
Bulk downloads of the sequence and annotation data are available via GenBank, Ensembl, DDBJ and the UCSC Genome Browser. The complete set of sequence reads is available at the NCBI Trace Archive.
Genome Sequence - Washington University School of Medicine, Baylor College of Medicine
Sequence Assembly and Chromosomal Sequence/AGP Construction - Washington University School of Medicine
BAC library - Yuko Yoshonaga in Pieter de Jong's laboratory at Children's Hospital Oakland Research Institute, Oakland, California, USA (http://bacpac.chori.org)
Fosmid Library - Washington University School of Medicine
Fingerprint Map (in progress) - Washington University School of Medicine
Cytogenetic Mapping and Human/Orang Breakpoint Analyses - Mariano Rocchi, Department of Genetics and Microbiology, University of Bari, Bari, Italy (http://www.biologia.uniba.it/DIGEMI/)
Fosmid End Placement against the human genome used for breakpoint/inversion analyses during AGP construction - Lin Chen, Evan Eichler, Department of Genome Sciences, University of Washington, Seattle, Washington, USA
Funding for the sequence characterization of the orangutan genome is being provided by the National Human Genome Research Institute (NHGRI), National Institutes of Health (NIH).
Assembly name: P_pygmaeus_2.0.2
Eight contigs were suppressed because they may contain contamination.
Singleton scaffolds KE647387-KE650770 were added in August 2013 because the component is in the minus orientation, so the scaffold and component sequences are not identical.
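A component placed in the minus orientation contributes its reverse complement to the scaffold, which is why the two sequences differ. A minimal sketch of that relationship, assuming plain DNA letters plus N (the function name `reverse_complement` is illustrative):

```python
# Sketch: a minus-orientation component appears in the scaffold as its
# reverse complement. Assumption: only A, C, G, T, N in either case.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


component = "ACGTN"
scaffold = reverse_complement(component)
```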