| Mus musculus - laboratory mouse data and search tips | Revised June 28, 2007 |
The Map Viewer help document describes how to use the Map Viewer software. This page describes the data available for Mus musculus (mouse) and the search tips specific to that organism. You can also return to the Mus musculus genome view search page. The Map Viewer home page allows you to search the genome data of any organism represented in Map Viewer.
|
|
|
| The Map Viewer provides a view of mouse data from a variety of sources, including sequence, genetic linkage, radiation hybrid, and YAC maps, described below. |
| Mouse Genomic Sequence Data: finished BACs and whole genome shotgun (WGS) data; one assembly per strain |
|
|
The current mouse reference genome build is a largely finished reference assembly, produced by NCBI in consultation with the Mouse Genome Sequencing Consortium, that contains small amounts of WGS and HTGS Draft sequence. Several assemblies exist for the mouse genome, each containing data from a given mouse strain:
When viewing any of the sequence maps, Map Viewer displays the sequence data and annotations for only one assembly/strain at a time. To see which strains have data on a particular chromosome, view the Assembly map for that chromosome; it can be added to the display using the Maps & Options dialog box. By default, the sequence maps show the data and annotations from the reference assembly. To see the data from another strain, go to Maps & Options, select the desired strain in the Assembly menu, select the maps from the list on the right, press the Change Assembly button, and click OK. That refreshes the display of all the sequence maps currently in your browser window so that they show the sequence data and annotations from the desired strain. Note that any single chromosome, or chromosome region, might contain data from only a subset of the mouse strains. Likewise, any single sequence map (e.g., contig, component, gene_sequence), or region of that sequence map, might contain data from only a subset of mouse strains. There is no chromosome or sequence map that contains data from all the strains. Additional details are provided in the description of the Assembly map. Separate documents provide information about the process used to assemble and annotate the finished and draft BACs at NCBI, the MGSCv3 assembly process, the release notes for each build of the genome, and statistics for the current build (32). A glossary of genome-related terms is also available. |
| Mouse BLAST Databases |
|
|
The complete set of mouse sequence databases available for BLAST searching is shown on the mouse BLAST page, which includes a link to the database descriptions. |
| Additional Mouse Genome Resources |
|
|
In addition to the mouse data displayed in the Map Viewer, NCBI provides access to Human-Mouse Homology Maps, synteny maps that compare genes in homologous segments of DNA from human and mouse. The Mouse Genome Resources page includes links to both Map Viewer and the Homology Maps. It also brings together information on diverse mouse-related resources from multiple centers, including sequence, mapping, and clone information as well as pointers to strain and mutant resources. |
|
|
| The available maps for Mus musculus include: |
| Cytogenetic Maps |
|
| Ideogram |
Chromosome banding pattern. |
| Sequence Maps |
|
| Ab initio |
Models generated by Gnomon. Models with e-values below 0.0001 are shown in dark brown on the map; other models are shown in light brown. Please note that this process predicts exons and not all possible mRNAs, so there is only one model per putative gene. The labels on the map are linked to the protein record of the highest scoring match to the model's predicted protein. Note that Gnomon models may also be included in the Gene_Sequence map, in regions where confirmed models have not yet been identified. |
| Assembly |
The Assembly map allows users to visualize all of the sequence data available for a given region of the genome, and separates the data by assembly. Each assembly contains sequence data and annotations from a different mouse strain. Data are currently available from the following strains:
When viewing the Assembly map, a blue vertical line indicates the assembly that is being viewed. The reference assembly is shown as the blue line by default. The orange vertical lines show regions of the genome where sequence data from other assemblies are available. The Maps & Options dialog box allows you to change the assembly being displayed. To do that, select the desired assembly in the Assembly menu, select Assembly from the list of maps on the right, press the Change Assembly button, and then click OK. When the display is refreshed, the line color for your selected assembly will change from orange to blue, and it will move to the right side of the Assembly map. (Conversely, the other assembly or assemblies available for the chromosome region will now be shown as vertical orange lines on the left side of the map.) |
| BES_Clone | Alignment of BAC end sequences to the assembled genomic sequence. BAC end sequences were generated by TIGR (http://www.tigr.org/tdb/bac_ends/mouse/bac_end_intro.html). During the alignment process, at least 50% of the BAC end had to align to the genome with >96% identity. All hits with the best bit score were kept. For example, if a BAC end sequence hit two places on the genome with the same high bit score, both of those hits are shown. Various colors are used in the graphic display to show (1) the quality of alignment of a BAC end to the assembled genomic sequence (e.g., does the BAC end hit the genome uniquely, does it contain repetitive sequence?), and (2) the relationship between two BAC ends (e.g., are they at the expected orientation and distance from each other, are they on different chromosomes, or is a virtual relationship estimated between a BAC end that has been sequenced, and its unsequenced mate pair?). The examples below provide more detail. When the BES_Clone map is displayed as the master map (described in the main Map Viewer help document), the values shown in the "end1" and "end2" columns provide the accession numbers of the sequence records for those ends, which were deposited into dbGSS.
|
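The BES_Clone filtering rule described above (at least 50% of the BAC end aligned, more than 96% identity, all hits tied for the best bit score kept) can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not NCBI's pipeline code; the `Hit` record, its field names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one alignment of a BAC end to the genome."""
    chrom: str
    start: int
    aligned_len: int   # bases of the BAC end covered by the alignment
    identity: float    # percent identity over the aligned bases
    bit_score: float


def best_hits(end_len, hits, min_cov=0.5, min_identity=96.0):
    """Apply the coverage and identity thresholds, then keep every hit
    that is tied for the highest bit score."""
    passing = [
        h for h in hits
        if h.aligned_len >= min_cov * end_len and h.identity > min_identity
    ]
    if not passing:
        return []
    top = max(h.bit_score for h in passing)
    return [h for h in passing if h.bit_score == top]


# Invented example: a 600-base BAC end with four candidate alignments.
example_hits = [
    Hit("5", 1000, 580, 99.0, 1050.0),   # passes, best score
    Hit("12", 2000, 580, 99.0, 1050.0),  # passes, tied for best score
    Hit("7", 300, 200, 99.5, 380.0),     # fails: covers less than 50% of the end
    Hit("3", 50, 590, 95.0, 1100.0),     # fails: identity not above 96%
]
kept = best_hits(600, example_hits)
```

With these invented values, `kept` holds the two hits on chromosomes 5 and 12, which corresponds to the case in the text where a BAC end hits two places with the same high bit score and both are shown.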
| Component | The component map provides the tiling path of GenBank accessions used to build each "NT_xxxxxx" contig, and the tiling path of GenBank "CAAA01xxxxxx" or "AAHY01xxxxxx" accessions used to build the "NW_xxxxxx" WGS contigs. |
| Contig | Shows the chromosomal placement of NT_xxxxxx and NW_xxxxxx contigs. |
| Ensembl Genes | Alignment of genes annotated on the genomic contigs by Ensembl. |
| Ensembl Transcripts | Alignment of individual transcripts to the assembled genomic sequence by Ensembl. |
| GenBank_DNA | Shows the placement of mouse genomic DNA sequences from GenBank that were not used in the assembly of contigs. The placement is based on the alignment of the sequences to the components of the contigs. It includes mouse genomic sequences longer than 500 bp that have at least 97% identity to the components for at least 98 base pairs. If a sequence extends beyond a contig, that portion of sequence is not shown. The 'hits' link leads to a tabular display that shows the matching regions (base spans) of the assembly component and the GenBank genomic DNA record that has been aligned to it. Orange lines represent unfinished (phase 1 and 2) HTGS sequences that have been aligned to the assembled genome. Blue lines represent other mouse genomic DNA records that have been aligned to the assembled genome. The length of a line represents the uppermost and lowermost points on the genome assembly to which sequence fragments from a single GenBank record were aligned. Thick parts of a line represent fragments of sequence from a GenBank record that have been aligned to the assembled genomic sequence, and the thin parts of a line connect the fragments that come from a single GenBank record. When the GenBank_DNA map is displayed as the master map, in the default verbose mode, the descriptive text includes a bases column, which shows the total number of bases in the GenBank record that was aligned to the genome, and a status column, which shows the total number of bases from that record that were aligned to the genome, how many separate pieces of sequence from that record were aligned, and whether those pieces were shuffled to make the alignment. |
| Genes_Sequence | Genes that have been annotated on the genomic contigs. This includes known and putative genes placed as a result of alignments of mRNAs to the contigs. If multiple models exist for a single gene, corresponding to splicing variants, the Gene_Sequence map presents a flattened view of all the exons that can be spliced together in various ways. For example, if one splice variant uses exons 1, 3, 4, and another splice variant uses exons 2, 3, 4, the Gene_Sequence map shows exons 1, 2, 3, 4. (In comparison, the Transcript (RNA) map shows what combinations of exons are valid based on mRNA sequences from RefSeq and GenBank.) Genes shown on the left of the grey line are transcribed in the - orientation (from bottom up), and those on the right in the + orientation (from top down). When Gene_Sequence is selected as the Master map, the verbose display (detailed labeling, shown by default) includes arrows to the right of each gene name that indicate its direction of transcription, as well as links to:
Additional information about these links is also provided below, under view/download sequence data from a chromosome region. Gene models are shown in five colors, depending on the type of evidence that was used to construct the models. The one or two letter code shown in the evidence column (that is displayed when Gene_Sequence is the master map) also indicates the type of evidence. |
|
|
Additional Notes: In general, a gene model is shown in blue if there is a clean alignment between a RefSeq or GenBank mRNA sequence and the genomic sequence, and if there is an exact match between the protein product that was annotated in the mRNA sequence record and the conceptual translation of the genomic sequence gene model. A gene model is shown in orange if there is some discrepancy between the mRNA sequence and the gene model, either in the alignment of the two and/or in their protein products. Examples of the former can include gaps, or the alignment of an mRNA to two or more genomic regions. Examples of the latter can include differences between the amino acid sequence given in an mRNA sequence record and the conceptual translation of the corresponding gene model, or premature termination of a coding region in the genomic sequence. Both of those can be caused by base pair mismatches between the mRNA and genomic sequence. Models with Interim LocusIDs (evidence code I) may be paralogs, genes not yet curated, duplications caused by assembly errors, or pseudogenes. The genome assembly and annotation pipeline assigns interim IDs when there is no unambiguous solution to what they should be. Interim LocusIDs are always associated with RefSeq XM_* accessions (model mRNAs), although supporting alignments may (or may not) include RefSeq NM_* accessions (known mRNAs). More about RefSeq and RefSeq accessions... |
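The flattened view described for the Gene_Sequence map, where exons from all splice variants of a gene are merged into one set, can be illustrated with a short Python sketch. It uses the exon numbers from the example in the text; the function name is hypothetical, and the real map works on genomic coordinates rather than exon numbers.

```python
def flatten_exons(variants):
    """Return the sorted union of the exon identifiers used by any
    splice variant, which is the flattened per-gene view."""
    return sorted(set().union(*variants))


# The two splice variants from the example in the text.
variant_a = [1, 3, 4]
variant_b = [2, 3, 4]
flattened = flatten_exons([variant_a, variant_b])  # [1, 2, 3, 4]
```

The RefSeq RNA map, by contrast, would keep `variant_a` and `variant_b` as separate entries, since it shows which combinations of exons are supported by mRNA sequences.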
| GeneTrap | Alignment of sequence tags from ES cell lines containing a gene trap integration to the assembled genomic sequence. Sequence tags are either derived from plasmid rescue and thus represent genomic sequence, or are derived via RACE and thus represent cDNA sequence. Both types of sequences are represented on this track. Sequences provided by The International Gene Trap Consortium are noted. The alignment(s) with the highest score are shown. |
| MICER | Alignment of end sequences from MICER clones to the assembled genomic sequence. The MICER (Mutagenic Insertion and Chromosome Engineering Resource) clones are produced by the Bradley lab at the Sanger Institute. The clones from which these sequences are derived are ready-made vectors for generating knockout mice and for chromosome engineering. |
| Hs_RNAs | Alignment of individual human transcripts to the assembled genomic sequence. The corresponding alignment of EST clusters is shown in the Hs_UniGene map, described below. |
| Hs_UniGene | Alignment of human EST clusters to the assembled genomic sequence. ESTs are clustered based on shared introns and alignment to a common position on the genome. Those ESTs can come from one or more UniGene clusters, whose IDs are noted by the EST cluster. (UniGene clusters are made with a different build procedure, so there is not necessarily a one-to-one correspondence between EST clusters on the Hs_UniGene map and clusters in the UniGene resource.) |
| Mm_RNAs | Alignment of individual mouse transcripts to the assembled genomic sequence. The corresponding alignment of EST clusters is shown in the Mm_UniGene map, described below. |
| Mm_UniGene | Alignment of mouse EST clusters to the assembled genomic sequence. ESTs are clustered based on shared introns and alignment to a common position on the genome. Those ESTs can come from one or more UniGene clusters, whose IDs are noted by the EST cluster. (UniGene clusters are made with a different build procedure, so there is not necessarily a one-to-one correspondence between EST clusters on the Mm_UniGene map and clusters in the UniGene resource.) |
| Rn_RNAs | Alignment of rat transcripts to the assembled genomic sequence. The corresponding alignment of EST clusters is shown in the Rn_UniGene map, described below. |
| Rn_UniGene | Alignment of rat EST clusters to the assembled genomic sequence. ESTs are clustered based on shared introns and alignment to a common position on the genome. Those ESTs can come from one or more UniGene clusters, whose IDs are noted by the EST cluster. (UniGene clusters are made with a different build procedure, so there is not necessarily a one-to-one correspondence between the placement on the Rn_UniGene display and clusters in the UniGene resource.) |
| Phenotype |
Map of Quantitative Trait Loci. This map was obtained from Mouse Genome Informatics (MGI). The data are represented as single points along the chromosome, as the QTL is currently associated with the marker that gave the highest LOD score. Additional information about Quantitative Trait Loci is provided by Griffiths, et al., in An Introduction to Genetic Analysis, available online through the Entrez Books database. |
| Repeats | Position of repetitive elements
The May, 2002 version of RepeatMasker was executed using these flags:
|
| STS | Placement of STSs from a variety of sources onto the assembled genomic sequence (the NT_xxxxxx contigs, described above) using Electronic-PCR (e-PCR). Current sources of STS data are MGI, WI-MRC RH, WI-Genetic, and WI-YAC. |
| RefSeq RNA | Diagrams of the RNAs that are predicted on the genomic contigs. The RefSeq RNA map and Gene_Sequence map are built in the same way, using the same types of evidence, described above. However, the Gene_Sequence map shows a view of all the exons in a gene, while the RNA map shows the combinations of exons (i.e., splice variants) that are valid, based on mRNA sequences. |
| Variation | Alignment of genetic variation data from dbSNP onto the assembled genomic sequence (the NT_xxxxxx contigs, described above). |
| Genetic Linkage Maps |
|
| MGI Integrated Genetic |
The mouse linkage map displayed in this viewer has been obtained from MGI (Blake et al., 2000), using the consensus marker and map information available from the MGI ftp site. Only a subset of the available genetically mapped markers were used in this viewer: primarily genes and markers with associated sequence data. MGD integrates primary mapping data from multiple sources. The MGI consensus map is built from these data, from work of the Mouse Chromosome Committees, and the MGI curation efforts. For more details, or to obtain different map views, go to the MGI Maps and Mapping Data page, or to the Mouse Genome Informatics Home Page. |
| Whitehead_Genetic | This map, obtained from the Whitehead Institute, consists of 6331 simple sequence length polymorphisms (SSLPs) mapped on an Ob x Cast F2 intercross at an average resolution of 1.1 cM. Primary data for this map can be obtained from the Whitehead Mouse Mapping Web Site. |
| Radiation Hybrid and YAC Maps |
|
| Whitehead/MRC_RH | A whole-genome RH map of genetic loci, STSs representing ESTs, and 2,114 reference SSLPs (simple sequence length polymorphisms), mapped onto the T31 RH panel. The project is described by Hudson, et al., Nature Genetics, October 2001, and the T31 panel is described by McCarthy, et al., Genome Research, 1997. This is a collaboration between MRC-Harwell and the Whitehead Institute. These data will be described in an upcoming manuscript. The most recent statistics are available from the WICGR Mouse RH Map Home Page. Scale = cR3000. Total number of centiRays across the genome = 29,559. Kb/cR: 98.8 kilobases per cR3000. |
| Whitehead_YAC | This map consists of roughly 10,000 STSs, derived from simple sequence length polymorphisms, assembled into contigs on a library of yeast artificial chromosome clones (~820 kb average insert length). Primary data for this map can be obtained at the WIBR physical mapping web page. These data are described in Haldi et al. |
|
|
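The two scale figures quoted for the Whitehead/MRC_RH map can be combined to convert map distances into approximate physical distances. The sketch below assumes that the 98.8 kb per cR3000 figure is a genome-wide average, so any local conversion is only a rough estimate; the function name is hypothetical.

```python
TOTAL_CR = 29_559   # total centiRays (cR3000) across the genome, from the text
KB_PER_CR = 98.8    # average kilobases per cR3000, from the text


def cr_to_kb(centirays):
    """Approximate physical length in kilobases for a span measured in cR3000,
    assuming the genome-wide average applies uniformly."""
    return centirays * KB_PER_CR


genome_gb = cr_to_kb(TOTAL_CR) / 1_000_000  # kilobases to gigabases
print(f"Implied genome length: {genome_gb:.2f} Gb")
```

Multiplying the two published figures gives an implied genome length of roughly 2.9 Gb, and a 10 cR3000 interval corresponds to about 1 Mb under the same assumption.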
| Clones |
|
| Components of Sequence Assembly |
|
|
| GenBank Accessions |
|
|
| Genes |
|
|
| Phenotypes |
|
|
| Polymorphisms |
|
|
| STSs |
|
|
|
|
| Verbose Mode |
|
|
By default, the master map at the right side of the display is shown in verbose mode, which provides descriptive information (as available) for each object on the master map. |
| Orientation |
|
| Object Location | Symbol | Meaning |
| Plus strand | | Genes shown to the right of the grey line are transcribed in the + orientation (from top down); contigs with a + orientation are read from top down |
| Minus strand | | Genes shown to the left of the grey line are transcribed in the - orientation (from bottom up); contigs with a - orientation are read from bottom up |
| Unknown | ? | The orientation of the map element is unknown. |
| Links to Related Resources |
|
| Each map element displayed in your search results will be associated with a number of links (when available) that lead to additional information. The links include: |
| Linked Text | Link Action | Description |
| Map element | Map View | The results of a search list the map elements that contain your search term. Those elements can be present in one or more maps. Following the link for a particular map element leads to a graphical view of the chromosomal region that contains the element. |
| MGI | Mouse Genome Informatics | Links to gene information on the Mouse Genome Informatics web site at The Jackson Laboratory |
| sv | Sequence Viewer | Graphically shows the position of the map element within the sequence region. The display includes a graphic depiction of the coding region (CDS), RNA, and gene features that have been annotated on that sequence region. A 2 Kb section of sequence is shown below that, with corresponding graphic annotations of the features. The left and right arrows at either end of the sequence data allow you to move upstream and downstream. |
| pr | Protein | Links to the corresponding protein sequence record in the Entrez Protein database. |
| dl | Download Sequence | Opens a form that allows you to download a region of a chromosome. The form has two parts: (1) the top part allows you to enter chromosome coordinates in text boxes, and (2) the bottom part displays the NT_* contigs (or portions of them) that are found in that chromosome region. Note that part 1 shows the position (base span) of the region on the chromosome, and part 2 shows the position of the region on the contig. The "strand" column for each contig shows whether that contig is on the plus or minus strand of the chromosome. Therefore, if a contig is on the minus strand, increasing the value of the 3' chromosome coordinate will decrease the value of the 5' contig coordinate. The options to "Display, Save to Disk, and View Evidence" allow you to view the individual contigs in the region (or portions of them, depending on the chromosome region specified). By default, the dl link beside each gene displays the chromosome and contig coordinates for the span of that gene. To view/save additional sequence data upstream and downstream of the gene, simply adjust the chromosome coordinates and press the "Change Region" button. Note that the contig coordinates will also change. |
| ev | Evidence Viewer | Graphical display of the biological evidence supporting a particular gene model. It displays all RefSeq models, GenBank mRNAs, annotated known or potential transcripts, and ESTs that align to the genomic sequence region of interest. (more...) |
| mm | Model Maker | Allows you to view the evidence that was used to build a gene model on assembled genomic sequence, and to create your own version of the model by selecting exons of interest. Model Maker is accessible from sequence maps that were analyzed at NCBI and displayed in Map Viewer. To see an example, follow the "mm" link beside any gene annotated on the human "Gene_Sequence" map. (more...) |
| hm | HomoloGene | A resource of curated and calculated orthologs for genes as represented by UniGene or by annotation of genomic sequences. (more about HomoloGene...) |
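The relationship between chromosome and contig coordinates described for the dl (Download Sequence) form can be made concrete with a small Python sketch. It assumes 1-based, inclusive coordinates and a contig placed as one ungapped block on the chromosome; the function names and the example placement (a contig occupying chromosome positions 1001 to 2000) are invented for illustration and are not taken from Map Viewer.

```python
def chrom_to_contig(pos, contig_start, contig_end, strand):
    """Convert a chromosome position to a position on the contig.

    contig_start, contig_end: chromosome span occupied by the contig.
    strand: "+" if the contig is on the plus strand, "-" if on the minus strand.
    """
    if not contig_start <= pos <= contig_end:
        raise ValueError("position lies outside the contig")
    if strand == "+":
        return pos - contig_start + 1
    if strand == "-":
        # On the minus strand the contig is read in the opposite direction.
        return contig_end - pos + 1
    raise ValueError("strand must be '+' or '-'")


def region_on_contig(chrom_from, chrom_to, contig_start, contig_end, strand):
    """Return the (low, high) contig coordinates for a chromosome region."""
    a = chrom_to_contig(chrom_from, contig_start, contig_end, strand)
    b = chrom_to_contig(chrom_to, contig_start, contig_end, strand)
    return (min(a, b), max(a, b))


# Hypothetical minus-strand contig occupying chromosome positions 1001..2000.
before = region_on_contig(1200, 1300, 1001, 2000, "-")  # (701, 801)
after = region_on_contig(1200, 1400, 1001, 2000, "-")   # (601, 801)
```

In this invented case, raising the upper chromosome coordinate from 1300 to 1400 lowers the start of the region on the contig from 701 to 601, which is the behavior the form shows for minus-strand contigs.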
| STS Maps Legend |
|
| Colored dots indicate uniqueness of STS positions |
|
| Polymorphism Column |
|
|
The polymorphism column indicates whether the marker has been used to detect a polymorphism, with Y for yes and N for no. |
| Detailed Marker Information |
|
|
To see detailed mapping information about a marker, follow the link for that marker to its UniSTS record. |
|
|
| Searchable Terms |
|
| Text terms |
|
The Mus musculus data are searchable with the following types of terms:
The system will retrieve mapped objects containing the search terms in their descriptions. |
| Truncation |
|
| Search terms can also be truncated at the right end only, using an asterisk (*) as a wild card to represent zero to many characters. See the truncation section of the general Map Viewer Help document for more details. |
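Right-end truncation can be pictured as a prefix match. The following Python sketch only illustrates the rule stated above (a trailing asterisk stands for zero or more characters); it is not the Map Viewer search code, the function name is hypothetical, and the case-insensitive comparison is an assumption.

```python
def matches_truncated(term, word):
    """Return True if `word` matches `term`, where a trailing '*' in `term`
    stands for zero or more characters. Comparison ignores case (an assumption)."""
    term, word = term.lower(), word.lower()
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term


matches_truncated("adh*", "adhesion")   # True: '*' covers "esion"
matches_truncated("adh*", "adh")        # True: '*' may match zero characters
matches_truncated("adh*", "cadherin")   # False: truncation applies at the right end only
```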
| Map Positions or Regions |
|
As noted in the Search By Position section of the Entrez Map Viewer general help document, there are three main ways to search by map position from the Map View of a chromosome:
|
| Query options |
|
| Boolean Operators |
|
|
If multiple terms are entered, they will automatically be combined with a Boolean AND, as mentioned in the Text Terms section above. Adjacency searches are not supported at present. For example, a query entered as cell adhesion will be processed as cell AND adhesion and will retrieve records with descriptions that contain cell matrix adhesion as well as cell adhesion.
You can choose to use any Boolean operators (AND, OR, NOT) in your query. Boolean operators must be written in upper case.
The general syntax for a Boolean Query is: The available search fields and their corresponding abbreviations (qualifiers) are listed below. By default, Boolean operators are processed from left to right. The order in which Entrez processes a search statement can be changed by enclosing individual concepts in parentheses. The terms inside the parentheses are processed first as a unit and then incorporated into the overall strategy. Additional details about Boolean Operators are provided in the Entrez Help document. |
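The rules in this section (adjacent terms joined by an implicit AND, upper-case operators, left-to-right processing) can be sketched as a small evaluator over a list of descriptions. This is a simplified model for illustration only: it matches whole words, ignores parentheses and field qualifiers, and uses hypothetical function names and example descriptions.

```python
OPERATORS = {"AND", "OR", "NOT"}  # operators are recognized only in upper case


def tokenize(query):
    """Split on whitespace and insert an implicit AND between adjacent terms."""
    tokens = []
    for word in query.split():
        if tokens and word not in OPERATORS and tokens[-1] not in OPERATORS:
            tokens.append("AND")
        tokens.append(word)
    return tokens


def evaluate(query, descriptions):
    """Return the indices of descriptions that match the query,
    processing the operators strictly from left to right."""
    def lookup(term):
        return {i for i, d in enumerate(descriptions)
                if term.lower() in d.lower().split()}

    tokens = tokenize(query)
    if not tokens:
        return set()
    result = lookup(tokens[0])
    for op, term in zip(tokens[1::2], tokens[2::2]):
        hits = lookup(term)
        if op == "AND":
            result &= hits
        elif op == "OR":
            result |= hits
        else:  # NOT
            result -= hits
    return result


# Invented descriptions for illustration.
descriptions = [
    "cell adhesion molecule",
    "cell matrix adhesion",
    "cell cycle",
    "focal adhesion kinase",
]
evaluate("cell adhesion", descriptions)               # {0, 1}
evaluate("cycle OR adhesion NOT cell", descriptions)  # {3}
```

The first query shows why `cell adhesion` also retrieves "cell matrix adhesion": the two terms are combined with AND rather than treated as an adjacent phrase. The second shows left-to-right processing, where the OR is applied before the NOT.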
| Search fields |
|
|
If desired, you can restrict the search for a term to a particular field by placing the field qualifier in square brackets [] after the term. It is not necessary to include a space between the search term and the field specifier. If no field qualifier is used, the system will search all fields. Terms can be combined with Boolean operators, as described above. |
| Search field | Description | Qualifier |
|---|---|---|
| accession | the nucleotide accession of a GenBank component or the nucleotide or protein accessions for RefSeqs | [accession], [acc], [accn] |
| chromosome | the chromosome number | [chr] |
| id | the integer identifier for a particular type of object; useful in combination with type | [id] |
| map name | the name of the map (The general Map Viewer Help document provides a list of map names. Use the character string in the "URL value" column.) | [map_name],[map] |
| symbol | the gene symbol or other short name; includes clone names, marker names, and alternate symbols (also referred to as aliases or synonyms; see Text Terms section above for example) | [sym] |
| title | gene name, symbol, or description | [title], [ti], [titl] |
| type | type of mapped object; most useful in combination with id. Options are: component, contig, fpcclone, locus, snp, sts | [obj_type] |
|
|
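Field-qualified queries can be assembled programmatically from the qualifiers in the table above. The helper below is a minimal sketch: the qualifier abbreviations come from the table, while the function names, the search terms, and the chromosome and id values are hypothetical examples.

```python
# One qualifier per search field, taken from the table above.
QUALIFIERS = {
    "accession": "acc",
    "chromosome": "chr",
    "id": "id",
    "map name": "map",
    "symbol": "sym",
    "title": "ti",
    "type": "obj_type",
}


def qualified(term, field=None):
    """Append the field qualifier in square brackets; no qualifier searches all fields."""
    if field is None:
        return term
    return f"{term}[{QUALIFIERS[field]}]"


def join_terms(op, *parts):
    """Join query parts with an upper-case Boolean operator."""
    if op not in {"AND", "OR", "NOT"}:
        raise ValueError("operator must be AND, OR, or NOT in upper case")
    return f" {op} ".join(parts)


def group(expr):
    """Wrap an expression in parentheses so that it is processed first as a unit."""
    return f"({expr})"


query = join_terms(
    "AND",
    group(join_terms("OR", qualified("kinase", "title"),
                     qualified("phosphatase", "title"))),
    qualified("7", "chromosome"),
)
# query == "(kinase[ti] OR phosphatase[ti]) AND 7[chr]"
```

Grouping the OR clause in parentheses matters here because, without it, left-to-right processing would apply the operators in the order written rather than treating the two title terms as one concept.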
| If you would like to create WWW links to the Map Viewer, the instructions for constructing URLs are given in the general Map Viewer Help document. You can construct URLs that either perform a search or display a specific mapped object or chromosomal region. |
|
Questions or Comments? Write to the NCBI Service Desk |