| Notes for Build 34 |
 |
Assembly:
Build 34 was assembled using HTGS phase 3, single fragment HTGS phase 2 and WGS contigs from the MGSCv3. In this build chromosomes 1,3,5,6,7,8,9,10 and 12-19 were automatically assembled using a hand-curated Tiling Path File (TPF) consisting of clone-based sequence and WGS contigs. The HTGS phase 3 sequence is assembled by hand into non-redundant contigs and these are used as an input to the assembly.
Assembly instructions for chromosomes 2,4,11 and X were provided as AGP files by the Sanger Institute. The Y chromosome was built by hand in collaboration with the Washington University Genome Sequencing Center and the Page lab at the Whitehead Institute for Biomedical Research. Currently, only the short arm of the Y has reliable mapping data and thus most of the contigs on the Y chromosome are unplaced.
Annotation:
Delayed release: Due to a change in software, the following maps are not immediately available but will be added shortly:
- Variation (SNP)
- MICER
- Phenotype
The following information and links will also be updated shortly:
- Map matrix for STS maps (The Green ball figure)
- Polymorphism column for the STS maps
- STS links for the MGI map
- Insert Sequence Links for the BES clone map
An additional feature will be added to the STS map with the release of Build 34: a direct link to additional placements of that STS. This will provide direct access to these additional placements without having to perform a search.
New Maps available:
Gene Trap clones: Sequence tags from ES cell lines containing a gene trap integration have been aligned to Build 34. Sequence tags are either derived from plasmid rescue (and thus represent genomic sequence) or are derived via RACE (and thus represent cDNA sequence). Both types of sequences are represented on this track. Sequences provided by The International Gene Trap Consortium are noted. The alignment(s) with the highest score are shown.
verbose text: shows the percent identity of the alignment (average of all exons for cDNA sequence), the percent coverage of the gene trap sequence, the score for the alignment, and the strain from which the sequence was derived.
colors: For genomic sequences, all alignments with a percent identity >95% and with >85% of the gene trap sequence involved in the alignment are colored blue. For cDNA sequences, an exon is colored blue if it aligns with a percent identity >95%. The line connecting the exons is blue if >85% of the gene trap sequence is involved in the alignment; otherwise it is orange.
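The coloring rules for the Gene Trap track can be restated as a minimal sketch. The function and constant names are hypothetical and are not part of the Map Viewer software. The notes do not say which color is used for a genomic alignment or an exon that misses the thresholds, so those cases return None here instead of guessing.

```python
from typing import Optional

IDENTITY_CUTOFF = 95.0  # percent identity threshold quoted in the notes
COVERAGE_CUTOFF = 85.0  # percent of the gene trap sequence in the alignment


def genomic_color(identity: float, coverage: float) -> Optional[str]:
    """Color of a genomic (plasmid rescue) alignment.

    Returns None when the thresholds are not met, because the notes do not
    state the color used in that case.
    """
    if identity > IDENTITY_CUTOFF and coverage > COVERAGE_CUTOFF:
        return "blue"
    return None


def exon_color(identity: float) -> Optional[str]:
    """Color of a single exon in a cDNA (RACE) alignment."""
    if identity > IDENTITY_CUTOFF:
        return "blue"
    return None  # color not stated in the notes


def connector_color(coverage: float) -> str:
    """Color of the line connecting the exons of a cDNA alignment."""
    return "blue" if coverage > COVERAGE_CUTOFF else "orange"
```

Both comparisons are strict, following the ">95%" and ">85%" wording, so an alignment at exactly 95% identity does not qualify as blue in this sketch.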
MICER: These are end sequences from the Mutagenic Insertion and Chromosome Engineering Resource (MICER) produced by the Bradley lab at the Sanger Institute. The clones from which these sequences are derived are ready made vectors for generating knockout mice and for chromosome engineering.
Display Changes:
Genes map:
- A linkout to the GENSAT project was added
- Evidence codes have been changed: Previously a single code was used to represent both the evidence and the alignment quality. Now, the evidence code displayed represents the evidence used to support the model. Values are Best RefSeq, mRNA, protein, -, external. The alignment quality is represented by the color:
- Blue: identical or mismatch
- Orange: poor or no alignment
- Gray: No annotated CDS
- Black: external support
- RefSeq Transcripts (was RNA): In addition to renaming this map, links to UniGene (ug) and BLink were added. The alignment quality of the transcript is now also stated explicitly.
|
| Notes for Build 33 |
 |
Assembly:
Build 33 was assembled using HTGS phase 3 (finished), single fragment HTGS phase 2 and the MGSCv3. A tiling path was hand curated by combining data from clone based Tiling Path Files (TPFs) and the MGSCv3. Chromosome 11 is essentially finished and the Sanger Center provided the assembly for this chromosome. For more information, click here.
Annotation:
Annotation of variation (SNPs)
Variation data is not currently available for this build. Variation data will be added to this build with the release of dbSNP Build 123 on or about September 13, 2004.
First use of PREDICTED in annotation of mouse
The word PREDICTED has been added to the title of RNA records with accessions beginning with XM and XR and to the title of protein accessions beginning with XP. PREDICTED therefore appears in the definition line seen in retrievals from Entrez nucleotide and protein and in BLAST results. PREDICTED means that these sequences are derived from genomic placements and not directly from a cDNA. It does not mean that the gene itself is predicted, although it may be.
MapViewer:
Maps added as part of this release include:
- Repeats: alignment of common repetitive elements
The May, 2002 version of RepeatMasker was executed using these flags:
- -w (invoking MaskerAid)
- -no_is
- -cutoff 255
- -frag 20000
The placement ids from RepeatMasker were retained to facilitate individual integration events.
- RnUniG: alignment of rat mRNAs, labeled according to the UniGene cluster to which they belong.
- Rn ESTS: alignment of rat mRNAs (including ESTs)
- CpG: CpG islands found using the algorithm of Takai and Jones, 2002.
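The RepeatMasker flags listed for the Repeats map can be assembled into a command line as in the sketch below. Only the four flags come from these notes. The executable name `RepeatMasker` on the search path and the input file `chr1.fa` are assumptions made for illustration, and the flags are those quoted for the May 2002 version, so other versions may not accept them.

```python
import shlex
from typing import List


def repeatmasker_command(fasta_path: str,
                         executable: str = "RepeatMasker") -> List[str]:
    """Build an argument list using the flags quoted in the Build 33 notes.

    The executable name and the input path are assumptions; only the flags
    themselves are taken from the notes.
    """
    return [
        executable,
        "-w",                # invokes MaskerAid, per the notes
        "-no_is",
        "-cutoff", "255",
        "-frag", "20000",
        fasta_path,
    ]


# Hypothetical input file, used only to show the resulting command line.
cmd = repeatmasker_command("chr1.fa")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the arguments as a list lets the command be passed to `subprocess.run` without shell quoting concerns.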
|
| Notes for Build 32 |
 |
Assembly:
Build 32 is a composite assembly, integrating HTGS and MGSCv3 sequence. Chromosomes 2,4,5,7,11,15,18,19,X and Y were assembled using clone-based Tiling Path Files (TPFs) that integrate whole genome shotgun sequence where appropriate. Chromosomes 1,3,6,8,9,10,12,13,14,16 and 17 were assembled using the MGSCv3 as a tiling path and integrating HTGS sequence where appropriate. For more information, click here.
In addition, we are now displaying and annotating an alternative view of mouse chromosome 16 that was assembled by Celera using a whole genome shotgun method and described in:
Mural RJ, et al. (2002) Science Aug 23;297(5585):1278.
Annotation:
Versioning of annotation
The version of an annotation set is now displayed on the Map Viewer page. The version is incremented when data in one or more maps are updated, for example when a new dbSNP build is released and the Variation map is changed accordingly. The statistics page now supports reporting by version. In contrast, a Build is incremented only with a change in the reference sequence (assembly) itself, and the initial version for that new build is set to one (1).
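The relationship between builds and annotation versions can be illustrated with a small sketch. The class and method names are hypothetical. Stepping the build number by exactly one is also an assumption made for simplicity, since released build numbers are not always consecutive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotationRelease:
    """A (build, version) pair as described in the versioning note."""
    build: int
    version: int = 1  # the initial version for a new build is one

    def with_updated_maps(self) -> "AnnotationRelease":
        # Map data changed (for example a new dbSNP build): same assembly,
        # so only the version is incremented.
        return AnnotationRelease(self.build, self.version + 1)

    def with_new_assembly(self) -> "AnnotationRelease":
        # The reference sequence changed: new build, version reset to one.
        # Incrementing by exactly one is an assumption of this sketch.
        return AnnotationRelease(self.build + 1, 1)
```

For example, `AnnotationRelease(32).with_updated_maps()` gives build 32 version 2, and a subsequent `with_new_assembly()` gives build 33 version 1.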
Gene Annotation
The algorithm for placing mRNAs on the genome was improved to:
- align small internal exons
- generate full-length alignments by extending the alignment to cover short regions at the ends of the transcripts
These changes should be apparent in the UniGene and EST maps as well as the exon annotation on the Reference sequences.
The number of genes annotated on the reference genome has decreased, and the number of models identified as pseudogenes has increased. This is primarily due to a change in the algorithm used to model genes, mRNAs and proteins, which gives more weight to coding propensity and matches to existing proteins, and checks more rigorously for changes in frame. This method, developed by Alexandre Souvorov and named Gnomon, has replaced GenomeScan as our standard method of predicting gene models. It is discussed in more detail here.
In this method, any gene model that results in a frameshift or premature termination relative to a set of conserved proteins is flagged as a probable pseudogene. That pseudogene is retained as the annotation unless: (1) the gene model corresponds to the best placement of a RefSeq mRNA from a protein-coding gene, or (2) the gene is identified as protein-coding by best placement of known mRNAs. If mRNA aligns well to the model, a model RNA product is generated (RefSeq accession of the format XR_xxxxxx); otherwise the gene is annotated as /pseudo with no product.
Because of the above, there are now three sources of pseudogene annotation:
- Alignment of a genomic RefSeq accession, with the pseudogene annotation transferred by alignment
- Alignment of a RefSeq RNA from a pseudogene (NR_xxxxxx)
- Evaluation of protein-coding propensity by Gnomon.
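The pseudogene decision described for Gnomon-based annotation can be summarized as a sketch. This is an illustration of the rules as worded in these notes, not the Gnomon implementation. The field names and the returned labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeneModel:
    """Hypothetical summary of the evidence attached to one gene model."""
    has_frameshift: bool
    has_premature_stop: bool
    best_refseq_mrna_placement: bool  # best placement of a RefSeq mRNA from a protein-coding gene
    known_mrna_protein_coding: bool   # protein-coding by best placement of known mRNAs
    mrna_aligns_well: bool


def annotate(model: GeneModel) -> str:
    """Return a hypothetical label for how the model would be annotated."""
    flagged = model.has_frameshift or model.has_premature_stop
    if not flagged:
        return "gene"
    # Exceptions (1) and (2): the pseudogene call is not retained.
    if model.best_refseq_mrna_placement or model.known_mrna_protein_coding:
        return "gene"
    # Retained pseudogene: the product depends on mRNA support.
    if model.mrna_aligns_well:
        return "pseudogene_with_XR_product"
    return "pseudo_no_product"
```

The order of the checks mirrors the text: the frameshift or premature termination flag comes first, then the two exceptions, then the choice between an XR_ model RNA and /pseudo with no product.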
MapViewer:
Release of this build involves a change of software and the addition of several maps.
- Comparative map: View conserved segments of the human genome.
- Software changes:
- view annotation on alternate assemblies in the same view.
- when verbose mode is not selected, information concerning the loci shown is obtainable in a pop-up menu.
|
| Notes for Build 31 |
 |
| Build 31 was not released due to assembly problems |
| Notes for Build 30 |
 |
Assembly:
Build 30 is a composite assembly composed of HTGS phase 3 sequence and the MGSCv3. For more details on this process see the Mouse Assembly Page.
Annotation:
There are no changes in the annotation.
MapViewer:
Gene model representation: In previous releases, a single mismatch between an mRNA and the genome sequence would lead to a conflict, and the gene model would be represented in orange. These parameters have now been relaxed: mismatches are allowed down to 95% identity. A gap in the alignment between an mRNA and the genome will still result in a conflict.
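The change in the conflict rule can be sketched as two small predicates. The function names are hypothetical, and treating exactly 95% identity as acceptable is an assumption based on the phrase "down to 95% identity".

```python
MIN_IDENTITY = 95.0  # percent identity threshold quoted in the notes


def has_conflict_previous(mismatches: int, has_gap: bool) -> bool:
    """Earlier rule: any mismatch or any gap is a conflict."""
    return has_gap or mismatches > 0


def has_conflict_build30(percent_identity: float, has_gap: bool) -> bool:
    """Relaxed rule: a gap is still a conflict, mismatches are tolerated
    as long as identity stays at or above the threshold (boundary assumed)."""
    return has_gap or percent_identity < MIN_IDENTITY


def model_color(conflict: bool) -> str:
    # The notes say a conflicting model is drawn in orange; the color of a
    # non-conflicting model is not stated here, so a generic label is used.
    return "orange" if conflict else "no-conflict color"
```

Under the relaxed rule an ungapped alignment at 97% identity no longer produces a conflict, while the same alignment with a gap still does.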
FES_Clone map: Fosmid end sequences that were generated as part of the Whole Genome Shotgun sequencing effort are being mapped onto the assembly. |
| Notes for Build 29/Build 3 |
 |
Assembly:
No changes to the assembly process were made in this build.
Annotation:
MapViewer:
Three new maps were added to the map viewer:
Strain map: This map allows users to visualize all of the sequence for a given region of the genome. Currently, the MGSCv3 is being used as the reference assembly, and all of the BAC-based contigs (NT_XXXXXX contigs) are being treated as separate haplotypes, or strains. When viewing the strain map, a blue line to the left of the thin grey line indicates the haplotype that is being viewed. All additional maps in the view will reflect the annotation applied to that haplotype. Orange lines to the left of the thin grey line show regions of the genome where sequence from other haplotypes is available. Clicking on the label for one of these haplotypes will move it to the right of the thin grey line, and the annotation on that particular haplotype will now be displayed.
QTL map: This map represents QTL data that is stored at MGI. Currently, all QTLs are shown as a single point on the genome, associated with the peak LOD score for that QTL.
Hs_EST: This map represents the alignment of all human ESTs to the mouse genome.
Hs_UniGene: This map takes EST alignments shown in the Hs_EST map, and clusters these based on shared introns, and alignment to a common position on the genome. |
| Notes for Build 27/Build 2 |
 |
Assembly:
Build 27 consists of finished HTGS sequence assembled into non-redundant contigs. These contigs have accession numbers of the form NT_XXXXXX. Build 2 consists of the Mouse Genome Sequencing Consortium's (MGSC) Whole Genome Shotgun (WGS) assembly of the mouse genome. WGS contigs in this assembly have accession numbers of the form CAAA01XXXXXX.
WGS supercontigs have accession numbers of the form NW_XXXXXX.
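The three accession formats above can be told apart with a short classifier. This sketch assumes that each X stands for a single digit and that there are exactly six of them, which the notes imply but do not state. The function name, the labels, and the example accessions are made up for illustration.

```python
import re
from typing import Optional

# Patterns follow the accession forms quoted in the notes; reading X as
# exactly one digit is an assumption.
_PATTERNS = [
    (re.compile(r"NT_\d{6}"), "finished HTGS contig"),
    (re.compile(r"NW_\d{6}"), "WGS supercontig"),
    (re.compile(r"CAAA01\d{6}"), "WGS contig"),
]


def classify_accession(accession: str) -> Optional[str]:
    """Return a descriptive label for an accession, or None if unrecognized."""
    for pattern, label in _PATTERNS:
        if pattern.fullmatch(accession):
            return label
    return None


# Placeholder accessions, not real records.
for acc in ("NT_000001", "NW_000001", "CAAA01000001", "NM_000001"):
    print(acc, classify_accession(acc))
```

Using `fullmatch` rejects accessions with extra characters, such as a version suffix, which would need to be stripped first if present.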
Annotation:
This release brought annotation of the mouse in line with the annotation of the human genome. In addition, BAC end sequences were aligned to the assembly to produce a robust clone map (BES). |
| Notes for Build 25 |
 |
Assembly:
All contigs consist of finished (HTGS phase 3) sequence.
Annotation:
Currently only STSs and variation are being annotated on these contigs.
MapViewer:
Fingerprint Contig Maps were added to the Map Viewer with this release. There are two fingerprint-based maps: the FPC map and the FPC-STS map.
Configuration: Maps&Options has replaced the previous Display Settings link. It has been made more obvious by a new contrasting background color, and by being accessible not only within the blue bar at the top of the screen but also from the blue column at the left. Other changes in the basic display include:
- Adding an option to set the format of the compact (thumbnail) view of the chromosome in the blue column at the left.
- Clarifying the zoom options in the left column by reducing the number of choices and providing a mouse over to indicate the fraction of the chromosome to display (from 1/10000, 1/1000, 1/100, 1/10 or 1/1).
- Within the Maps&Options box itself, allowing configuration of the thumbnail view.
- Displaying the name of the master map in red.
- Adding a link to the BLAST search page.
A "Data as table" option is now available. Data in a particular genome view can be viewed in a tabular format and downloaded as a tab-delimited file. |
|