Sample Entrez Gene Record (GeneID 4292: human MLH1)
Now that you are viewing the detailed record for GeneID 4292: human MLH1, we will talk about:
Information common to both Entrez Gene and RefSeq records
RefSeq (covered in the previous modules on "Types of Databases"
and "Format of Sequence record") and Entrez Gene are essentially
companion resources; Gene provides a broader umbrella of information.
That is, a Gene record includes info from the RefSeq record
as well as other info not in the RefSeq record. Therefore,
some types/sections of info in an Entrez Gene record are also
found in the RefSeq record, e.g.
Information unique to Entrez Gene records
- record summary (everything above the "genomic regions, transcripts, and products" section) -- highlights:
- Gene ID is a stable ID for that particular locus in that organism
(it remains the same even if info about the locus changes, such
as gene symbol, genomic position, etc.).
- Official gene symbol and which organization provided it
- Aliases/alternative symbols by which the gene might have been known in earlier times
- Brief summary of what is known about the gene
- Genomic regions, transcripts, and products
- Shows the genomic position of the MLH1 gene (bp coordinates on the chromosome)
- Links to reference sequence records that organize
sequence data according to the central dogma of biology:
- genomic DNA (chromosome record NC_000003)
- mRNA (RefSeq record NM_000249)
- protein (RefSeq record NP_000240)
- Genomic context
- Shows the broader genomic context of MLH1, including
flanking genes. The genomic context view is useful to researchers
in a number of ways. For example, one use is to identify genes
that might share a regulatory region and therefore be co-expressed.
- Bibliography (GeneRIFs)
- The bibliography in an Entrez Gene record (and the "PubMed" Link)
is similar to the list of references associated with the
corresponding RefSeq mRNA NM_000249 (and accessible from the
"PubMed (RefSeq)" link for that mRNA).
However, the reference lists associated with RefSeq and Entrez Gene
records might sometimes be out of sync because RefSeq records are
generally updated first and Entrez Gene follows later.
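The identifiers mentioned in the list above (GeneID 4292 and the RefSeq accessions NC_000003, NM_000249, NP_000240) can also be used programmatically. The following is a minimal sketch, not part of the original module: it assumes NCBI's E-utilities web service as the access route, and the helper names `refseq_molecule_type` and `eutils_url` are hypothetical names chosen for illustration. It only builds request URLs and does not contact the server.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities (an assumption of this sketch;
# the tutorial itself only uses the web interface).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

# The three RefSeq accession prefixes that appear in this Gene record,
# arranged along the central dogma: DNA -> mRNA -> protein.
REFSEQ_PREFIXES = {
    "NC_": "genomic DNA (chromosome)",
    "NM_": "mRNA",
    "NP_": "protein",
}


def refseq_molecule_type(accession):
    """Return the molecule type implied by a RefSeq prefix, or None
    if the accession does not start with one of the prefixes above."""
    return REFSEQ_PREFIXES.get(accession[:3])


def eutils_url(tool, **params):
    """Build (but do not send) an E-utilities request URL."""
    return f"{EUTILS_BASE}/{tool}.fcgi?{urlencode(params)}"


# Summary of the Gene record, addressed by its stable GeneID.
gene_summary_url = eutils_url("esummary", db="gene", id="4292")

# The RefSeq mRNA record in GenBank flat-file format.
mrna_record_url = eutils_url(
    "efetch", db="nuccore", id="NM_000249", rettype="gb", retmode="text"
)

# PubMed citations linked to the Gene record (the "PubMed" link).
gene_pubmed_url = eutils_url("elink", dbfrom="gene", db="pubmed", id="4292")

for accession in ("NC_000003", "NM_000249", "NP_000240"):
    print(accession, "->", refseq_molecule_type(accession))
print(gene_summary_url)
print(mrna_record_url)
print(gene_pubmed_url)
```

Because the GeneID stays the same when the gene symbol or genomic position changes, a URL keyed on the GeneID should keep pointing at the same locus, whereas one keyed on the gene symbol might not.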
Other types of info found in an Entrez Gene record and not in a RefSeq record
This demonstrates how Entrez Gene is an umbrella over other databases and provides more information about a gene than can be found in individual records of other Entrez databases. For example, an Entrez Gene record includes, when available, information about:
- Additional links to resources of interest to certain user groups (e.g., clinical resources, consumer health, genetic variation)
- Gene Ontology
- Link to MapViewer display of homologous mouse and rat genes
- List of markers found in gene
- Related Sequences (a representative sample of genomic and mRNA records
for the gene)
- NOTE: The set of Genomic records from U17839 through
U17857 is a segmented set -- a lab sequenced and submitted exons from
the genomic DNA molecule. By GenBank/EMBL/DDBJ rules, each sequence
record must contain a contiguous piece of sequence from a single
molecule type. The exons are all from genomic DNA, but they are not
contiguous because the lab didn't sequence the intervening introns,
so each exon sequence needed to be submitted separately. In such a
case, the last exon record typically contains the CDS feature and
its amino acid translation.
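To make the segmented set in the NOTE above concrete, the sketch below lists every accession "from U17839 through U17857". It is only an illustration: it assumes the records in the set carry consecutive accession numbers, as the wording of the range suggests, and `expand_accession_range` is a hypothetical helper name, not an NCBI tool.

```python
def expand_accession_range(first, last):
    """List all accessions from `first` through `last`, inclusive.

    Assumes both accessions share the same letter prefix and that the
    numeric parts are consecutive (an assumption of this sketch).
    """
    prefix = first.rstrip("0123456789")
    if last.rstrip("0123456789") != prefix:
        raise ValueError("accessions must share the same prefix")

    first_digits = first[len(prefix):]
    last_digits = last[len(prefix):]
    if not first_digits or not last_digits:
        raise ValueError("accessions must end in a number")

    start, end = int(first_digits), int(last_digits)
    if end < start:
        raise ValueError("range must be in ascending order")

    # Keep the zero-padding width of the original accession.
    width = len(first_digits)
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


exon_records = expand_accession_range("U17839", "U17857")
print(len(exon_records))   # 19 records in the set
print(exon_records[0])     # U17839
print(exon_records[-1])    # U17857, the last record in the set
```

Following the NOTE, the last record in such a set is the one to check first for the CDS feature and its amino acid translation.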
"Additional Links" at the bottom of the record includes links to resources relevant for:
- clinicians (e.g., GeneTests)
- patients (e.g., Genes and Disease)
- resources of special interest to certain groups (e.g., Mismatch Repair Genes Variant Database)