Drosophila melanogaster genome data and search tips

Revised March 27, 2008

The Map Viewer help document describes in general how to use the Map Viewer software. This page describes the data available for Drosophila melanogaster, and the search tips specific to that organism. You can also return to the Drosophila melanogaster genome view search page. The Map Viewer home page allows you to search the genome data of any organism represented in Map Viewer.

  1. Scope of Data
  2. Available Maps
  3. Constructing Queries
  4. Constructing URLs

Scope of Data

The Map Viewer provides a view of Drosophila melanogaster data from a variety of sources described below.

Drosophila melanogaster Genomic Sequence Data

The current Drosophila melanogaster genome annotation (release 5.2) is based on the Release 5 genome assembly. This assembly is based on the original genome sequence determined in a collaboration between Celera and the Berkeley Drosophila Genome Project (BDGP), and is described in the March 24, 2000 issue of Science. Additional sequencing ("finishing") and re-assembly were carried out by the BDGP (Celniker et al., 2002). Release 5 is the first assembly to incorporate heterochromatic scaffolds, and it is a non-redundant assembly of all available assembled genomic sequence. Annotation updates are provided by the FlyBase Consortium (Misra et al., 2002).

BLAST Drosophila melanogaster Genomic Sequence

The complete set of Drosophila melanogaster sequence databases available for BLAST searching is shown in the pop-up menu on the Drosophila melanogaster BLAST page, which includes a link to the database descriptions. In addition, those interested in comparative genomics may use the arthropod genomes BLAST page, which includes individual and combined sequence databases for Drosophila melanogaster, Drosophila pseudoobscura, Anopheles gambiae, Apis mellifera and others.

Additional Drosophila melanogaster Genome Resources

In addition to the Drosophila melanogaster data available in the Map Viewer and through BLAST, links to NCBI resources and external sites are available from the Drosophila Genome Resource Guide.

Available Maps

The available maps for Drosophila melanogaster include:

Sequence Maps

Ab initio

Shows models generated by Gnomon. Gnomon uses protein alignments in addition to transcript alignments. To capture as much of the coding information in this assembly as possible, Gnomon models may represent partial as well as complete coding sequences. Models with a completely supported CDS are blue, models with a partially supported CDS are green, and pure ab initio predictions are brown. Pure ab initio status indicates that the model was built without the support of mRNA or protein alignments, either because the sequence failed to align to the genome or because Gnomon ignored an alignment whose score fell below a predetermined threshold.
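The color scheme can be restated as a small lookup. This is only an illustrative sketch: it assumes that alignment support is summarized as the number of CDS bases covered by alignments that Gnomon accepted, which is a simplification introduced here and not a description of Gnomon's internals. The function name and its parameters are hypothetical.

```python
def gnomon_model_color(cds_length, supported_length):
    """Map the degree of CDS support to the display color described above.

    cds_length       -- length of the model's CDS in bases (hypothetical input)
    supported_length -- CDS bases covered by accepted mRNA or protein
                        alignments (hypothetical summary of support)
    """
    if supported_length <= 0:
        return "brown"   # pure ab initio: no accepted alignment support
    if supported_length >= cds_length:
        return "blue"    # completely supported CDS
    return "green"       # partially supported CDS


print(gnomon_model_color(900, 900))
print(gnomon_model_color(900, 450))
print(gnomon_model_color(900, 0))
```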


Component

Provides the tiling path of GenBank accessions used to build each "NC_xxxxxx", "NT_xxxxxx", and "NW_xxxxxxxxx" contig, which are described below.


Contig

Shows the chromosomal placement of NC_xxxxxx, NT_xxxxxx, and NW_xxxxxxxxx contigs on the assembled genome sequence. Individual GenBank records used to assemble the contigs are shown on the Component map, described above.


Dm RNA

Shows the alignment of individual Drosophila melanogaster transcripts to the assembled genomic sequence. The corresponding alignment of EST clusters is shown in the Dm_UniG map, described below.

Dros RNA

Shows the alignment of individual Drosophila transcripts (excluding transcripts from D. melanogaster) to the assembled genomic sequence.


Ins RNA

Shows the alignment of individual insect transcripts (excluding Drosophila transcripts) to the assembled genomic sequence. The corresponding alignment of insect EST clusters is shown in the Ins_UniG map, described below.


GenBank_DNA

Shows the placement of Drosophila melanogaster genomic DNA sequences from GenBank that were not used as components in the assembly. Placement is based on the alignment of the sequences to the assembled genomic scaffolds or chromosomes. It includes Drosophila melanogaster genomic sequences longer than 500 bp that have at least 97% identity to the components for at least 98 base pairs. If a sequence extends beyond a contig, that portion of sequence is not shown. The 'hits' link leads to a tabular display that shows the matching regions (base spans) of the assembly component and the GenBank genomic DNA record that has been aligned to it.

Each line spans from the uppermost to the lowermost point on the genome assembly to which sequence fragments from a single GenBank record were aligned.

When the GenBank_DNA map is displayed as the master map in the default verbose mode, the descriptive text includes several columns:

  • Total Bases: the total number of bases in the GenBank record
  • Aligned Bases: the total number of bases from that record that were aligned to the genome
  • % identity: the percent identity of the alignment
  • % coverage: the percentage of the GenBank record that aligned to the genome
  • Alignment-length ratio: the ratio of the alignment length in the genome to the alignment length of the GenBank record
  • Strain: the strain from which the GenBank record was derived, when available
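The relationship between the line span and the verbose-mode columns can be sketched in a few lines of Python. This is a minimal illustration, not the Map Viewer's own computation: the `Fragment` type, the 1-based inclusive coordinates, and the assumption that fragments do not overlap are all introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One aligned piece of a GenBank record (hypothetical structure)."""
    genome_start: int   # 1-based, inclusive, on the assembly
    genome_end: int
    record_start: int   # 1-based, inclusive, on the GenBank record
    record_end: int


def summarize_record(total_bases, fragments):
    """Derive the line span and summary columns for one GenBank record."""
    aligned_bases = sum(f.record_end - f.record_start + 1 for f in fragments)
    genome_length = sum(f.genome_end - f.genome_start + 1 for f in fragments)
    return {
        # Line drawn on the map: uppermost to lowermost aligned point.
        "span": (min(f.genome_start for f in fragments),
                 max(f.genome_end for f in fragments)),
        "total_bases": total_bases,
        "aligned_bases": aligned_bases,
        # Share of the GenBank record that aligned, as a percentage.
        "pct_coverage": 100.0 * aligned_bases / total_bases,
        # Alignment length on the genome over alignment length on the record.
        "alignment_length_ratio": genome_length / aligned_bases,
    }


fragments = [Fragment(1001, 1100, 1, 100), Fragment(2001, 2200, 301, 500)]
summary = summarize_record(1000, fragments)
print(summary["span"], summary["pct_coverage"])
```

Percent identity is left out because it depends on the base-by-base alignment, which this sketch does not model.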

Genes_Sequence

Shows genes that have been annotated on the genomic contigs by FlyBase.

If multiple models exist for a single gene, corresponding to splicing variants, the Gene_Sequence map presents a flattened view of all the exons that can be spliced together in various ways. For example, if one splice variant uses exons 1, 3, 4, and another splice variant uses exons 2, 3, 4, the Gene_Sequence map shows exons 1, 2, 3, 4. (In comparison, the Transcript (RNA) map shows what combinations of exons are valid based on mRNA sequences from RefSeq and GenBank.)
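The flattened view amounts to taking the union of the exons used by every splice variant. The sketch below reproduces the example from the paragraph above; identifying exons by simple integers and the function name `flatten_exons` are simplifications made for illustration.

```python
def flatten_exons(splice_variants):
    """Return the sorted union of the exons used by any splice variant."""
    exons = set()
    for variant in splice_variants:
        exons.update(variant)
    return sorted(exons)


# Two splice variants of one gene, as in the example above.
variants = [[1, 3, 4], [2, 3, 4]]
print(flatten_exons(variants))  # [1, 2, 3, 4]
```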

Genes shown on the left of the grey line are transcribed in the - orientation (from bottom up), and those on the right in the + orientation (from top down).

When Gene_Sequence is selected as the Master map, the verbose display (detailed labeling, shown by default) includes an arrow to the right of each gene name that indicates its direction of transcription, as well as links to related resources.

Additional information about these links is provided in the Map Viewer Help Document, under Links to Related Resources.

RefSeq Transcripts

Shows diagrams of the RNAs that are predicted on the genomic contigs. The Transcript map and Gene_Sequence map are built in the same way, using the same types of evidence, described above. The Gene_Sequence map, however, shows a view of all the exons in a gene, while the Transcript map shows the combinations of exons (i.e., splice variants) that are valid, based on mRNA transcript annotations provided by FlyBase.


Repeats

Position of repetitive elements, calculated with RepeatMasker v3.1.8, using repeat library RELEASE 20061006.


STS

Shows the placement of STSs from a variety of sources onto the assembled genomic sequence (described above) using Electronic-PCR (e-PCR).

Dm_UniG

Shows the alignment of Drosophila melanogaster EST clusters to the assembled genomic sequence. ESTs are clustered based on shared introns and alignment to a common position on the genome. Those ESTs can come from one or more UniGene clusters, whose IDs are noted by the EST cluster. (UniGene clusters are made with a different build procedure, so there is not necessarily a one-to-one correspondence between EST clusters on the Dm_UniG map and clusters in the UniGene resource.)

Ins_UniG

Shows the alignment of EST clusters from other insect species to the assembled genomic sequence. ESTs are clustered based on shared introns and alignment to a common position on the genome. Those ESTs can come from one or more UniGene clusters, whose IDs are noted by the EST cluster. (UniGene clusters are made with a different build procedure, so there is not necessarily a one-to-one correspondence between EST clusters on the Ins_UniG map and clusters in the UniGene resource.)

Cytogenetic Maps

ideogram

Cytogenetic map (banding pattern) of euchromatic arms from polytene chromosomes.

Genetic Maps

genetic

Meiotic recombination map of mutations in known and unknown genes determined by linkage analyses.

Constructing Queries

Searchable Terms

The Map Viewer supports searching on any term that describes an element on any map, including:

  • gene symbol
    A search for cnn AND genes[map] will retrieve the locus cnn.
  • GenBank accession number
    A search for U35621[acc] will retrieve the Dm RNA map for the region containing the centrosomin gene.
  • text words
    For example, a search for actin will retrieve all map objects containing that word in their description. If multiple terms are entered, they will automatically be combined with the 'AND' Boolean operator.
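The query strings above follow a simple pattern: an optional field tag in square brackets after a term, with terms joined by AND. The helpers below are a minimal sketch of that pattern for building such strings programmatically. The function names are hypothetical, and only the field tags that appear in the examples above (`map`, `acc`) are used.

```python
def field_term(term, field=None):
    """Attach a field restriction such as [map] or [acc] to a search term."""
    return f"{term}[{field}]" if field else term


def and_query(*terms):
    """Join terms with AND, which is also how unjoined terms are combined."""
    return " AND ".join(terms)


print(and_query("cnn", field_term("genes", "map")))  # cnn AND genes[map]
print(field_term("U35621", "acc"))                   # U35621[acc]
```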

Map Positions

As noted in the Search By Position section of the Map Viewer help document, there are three main ways to search by map position from the Map View of a chromosome:
  1. enter a range of interest in the Region text boxes in the sidebar
  2. click on the region of interest in the chromosome thumbnail graphic in the sidebar
  3. click on a region of interest in the enlarged Map View of the chromosome

For Drosophila melanogaster, the following types of map positions can be entered in option 1:
  • symbols - you can enter marker names or alternate marker names (aliases) to display the region of the chromosome between those mapped elements. Note that both mapped elements must be present on maps that share the same coordinate system for the range search to work properly.

  • numerical positions - It is not necessary to specify units. The Map Viewer will interpret the range in the units of the master map. Note that for a sequence map, base pair positions may be entered in any of the following formats: 1000000 or 1,000,000 or 1M or 1000K.

    It is not necessary to enter a value in both Region text boxes. If you enter a value only in the upper box, the Map Viewer will display the region of the chromosome starting from that point and ending at the lower end of the chromosome. If you enter a value only in the lower box, the Map Viewer will display the region of the chromosome starting at the upper end of the chromosome and ending at the value entered.
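The accepted base pair formats and the handling of an empty Region box can be sketched as follows. This is an illustration of the rules described above and not the Map Viewer's own parser. The function names are hypothetical, the chromosome length passed in is a made-up value, and treating base 1 as the upper end of the chromosome is an assumption. Only whole-number inputs are handled, since those are the only formats listed.

```python
def parse_position(text):
    """Convert '1000000', '1,000,000', '1M' or '1000K' to a base pair count."""
    value = text.strip().replace(",", "").upper()
    multiplier = 1
    if value.endswith("M"):
        multiplier, value = 1_000_000, value[:-1]
    elif value.endswith("K"):
        multiplier, value = 1_000, value[:-1]
    return int(value) * multiplier


def resolve_region(upper_box, lower_box, chromosome_length):
    """Fill in the missing end of a range when one Region box is empty."""
    # Empty upper box: start at the upper end of the chromosome (assumed base 1).
    start = parse_position(upper_box) if upper_box.strip() else 1
    # Empty lower box: run to the lower end of the chromosome.
    end = parse_position(lower_box) if lower_box.strip() else chromosome_length
    return start, end


for text in ("1000000", "1,000,000", "1M", "1000K"):
    print(text, "->", parse_position(text))

# 20,000,000 is an arbitrary chromosome length used only for this example.
print(resolve_region("1M", "", 20_000_000))
print(resolve_region("", "500K", 20_000_000))
```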

General Tips

As mentioned in the Searchable Terms section of the Map Viewer Help Document, any term entered in the query box will be treated as an independent entity to be joined by the 'AND' Boolean operator. It is also possible to construct more complex queries by using explicit Boolean operators (AND, OR, NOT), field restriction, or limiting retrieval to records that have certain properties.

The Advanced Search page allows you to use a number of query options by simply checking boxes or radio buttons that represent various search fields, properties, and object types. It also allows you to limit your query to one or more chromosomes. The Advanced Search page is accessible from the header region of the genome view page.

Constructing URLs that link to Map Viewer

If you would like to create WWW links to the Map Viewer, the instructions for constructing URLs are given in the general Map Viewer Help document. You can construct URLs that either perform a search or display a specific mapped object or chromosomal region.
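The two kinds of link can be assembled with standard URL encoding, as in the hedged sketch below. The base address, the path names, and the parameter names (`taxid`, `query`, `chr`, `beg`, `end`) are placeholders chosen for illustration and are not taken from this page. The general Map Viewer Help document remains the authority on the actual URL syntax. The value 7227 is the NCBI Taxonomy ID for Drosophila melanogaster.

```python
from urllib.parse import urlencode

# Placeholder base address, NOT the real Map Viewer endpoint.
BASE_URL = "https://example.org/mapview"
DMEL_TAXID = 7227  # NCBI Taxonomy ID for Drosophila melanogaster


def search_url(query, taxid=DMEL_TAXID):
    """Build a link that performs a search (parameter names are assumed)."""
    return f"{BASE_URL}/search?" + urlencode({"taxid": taxid, "query": query})


def region_url(chromosome, begin, end, taxid=DMEL_TAXID):
    """Build a link that displays a chromosomal region (names are assumed)."""
    params = {"taxid": taxid, "chr": chromosome, "beg": begin, "end": end}
    return f"{BASE_URL}/maps?" + urlencode(params)


print(search_url("cnn AND genes[map]"))
print(region_url("2R", 1_000_000, 2_000_000))
```

`urlencode` takes care of escaping the spaces and square brackets in a field-restricted query, which is easy to get wrong when assembling a URL by hand.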

Questions or Comments?
Write to the NCBI Service Desk