Basic Operation

Step 1: Introduction

This tutorial provides a broad overview of how to use Genome Workbench to analyze and display data. Before beginning this tutorial, download and install Genome Workbench from the install page. If you do not have administrative privileges on your computer, please install the program somewhere within your user home folder instead of the default location.

The illustrations in this tutorial were made on MS Windows 7. Users of Unix/Linux or of other MS Windows versions might see minor variations in default settings, column order, window size, and other insignificant details.

Step 2: Startup

When Genome Workbench starts up, the Main Application Window appears with several different panes. The Project Tree View is on the left and it is empty upon startup. This is where the data you load, the analyses you do, and the views you create will be stored.

The Views (as referred to in this and other tutorials) are windows within the Genome Workbench application that show information about various aspects of the application's work and data. Views provided by the application are available through the View drop-down menu; data views are created by users.

The Main Application Window has many views and is completely configurable. You can add a view to the screen by selecting a view that is not currently showing from the View drop-down menu. You can close a view by clicking on the X in the right corner of the view tab. You can click on a particular view and drag it where you want it. You can also stack views so that they appear in folder tabs, like the Event View and the Task View shown below. When you click on a view to move it, you will see icons showing the docking options for the view. Dropping the view on the icon of your choice docks it in the desired position. Genome Workbench will remember your configuration, so you only have to do this once.

Opening screen

Step 3: Search for Genes

To start, let us search for some data in the public databases at NCBI. Select the Search View from the Main Application Window (if you have opened multiple views). The Search View provides a single interface for many of the most frequent kinds of searches done in Genome Workbench.

The Search View is also accessible from the main menu as Tools => Search or View => Search, and from the toolbar by clicking on the binoculars icon.

Let us now search for the gene superoxide dismutase in the Entrez Gene database. To do this, follow the steps below:

  • Make sure the Search Tool in the Search View says: Search NCBI Public Databases
  • Select Entrez Gene for the NCBI Database.
  • Enter the gene name superoxide dismutase in the search box located to the right of the Select NCBI Database drop-down list.
  • Press the Enter key or click the Start button.
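
For readers who want to reproduce a similar query outside the application, the steps above can be sketched against the public NCBI E-utilities ESearch service. This is only an illustration: the tutorial does not say how Genome Workbench performs the search internally, and the helper name build_gene_search_url and the retmax default are assumptions made for this example. The sketch only builds the request URL and does not contact NCBI.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_gene_search_url(term: str, max_results: int = 20) -> str:
    """Return an ESearch URL that queries the Entrez Gene database.

    Hypothetical helper for illustration; not part of Genome Workbench.
    """
    params = {
        "db": "gene",            # Entrez Gene, as selected in the Search View
        "term": term,            # free-text query, e.g. the gene name
        "retmax": max_results,   # cap on the number of IDs returned
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_gene_search_url("superoxide dismutase")
print(url)
```

Opening the printed URL in a browser should return an XML list of matching Gene IDs, which corresponds roughly to the rows of the result table in the Search View.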

Select search

Step 4: Search for Genes (continued)

You should see a set of results like those shown below. You can adjust the width of the columns by clicking on the divider between the column headings. If you right-click in the column header you can choose to turn some columns on/off.

Search dialog select

Select the item in the list corresponding to the human variant (Organism is Homo sapiens and Label is SOD1; it might take a moment to find) and right-click it. Choose Add to Project from the contextual menu. You should see a dialog like the one below. The Entrez Gene database formats an object that contains a wealth of information about the gene and its placement on various assemblies. All of this information is available to Genome Workbench.

Search dialog disambiguator

Click OK in the dialog to create a new project. We will use the defaults from this dialog. In the future you will have the option to add it to an existing project or to change how it appears in the project tree.

Step 5: Viewing your data

Once the project is loaded the main window should look somewhat like the image below. The workspace area should now contain a blue notebook for a workspace and a green notebook for a project.

New project

Genome Workbench organizes your data into workspaces and projects. A workspace contains one or more projects. Projects are genome data loaded from files or external databases. A project can be shared between several workspaces. In addition, both projects and workspaces can hold separate notes and descriptors: a project can contain notes about how the project was created, what analyses were run, and when and what sorts of editing operations were performed.
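
The relationship between workspaces and projects can be pictured with a small data model. The classes below are a hypothetical sketch written for this tutorial; they are not Genome Workbench's own classes or file format, and the field names are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """Illustrative stand-in for a project: loaded data plus notes."""
    name: str
    notes: list[str] = field(default_factory=list)  # e.g. how the project was created
    data: list[str] = field(default_factory=list)   # loaded items, by label


@dataclass
class Workspace:
    """Illustrative stand-in for a workspace holding one or more projects."""
    name: str
    notes: list[str] = field(default_factory=list)
    projects: list[Project] = field(default_factory=list)


sod1 = Project("SOD1 project", notes=["Created from an Entrez Gene search"])
sod1.data.append("SOD1")

# The same Project object can be referenced from more than one workspace.
ws_a = Workspace("Workspace A", projects=[sod1])
ws_b = Workspace("Workspace B", projects=[sod1])
print(ws_a.projects[0] is ws_b.projects[0])
```

Because both workspaces hold a reference to the same object, a note added to the project through one workspace is visible from the other, which is the sense in which a project is shared.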

Let us now take a look at this data. To open a view, select the data in the project tree, right-click, and choose Open New View. You could also choose View => Open New View from the menu bar or double-click the data. When the dialog appears, choose Graphical View.

Open view dialog

Genome Workbench will now ask you which sequence you wish to view, in a dialog like the one below. As mentioned previously, the Entrez Gene object holds references to many sequences in many different placements for each gene. We will start by looking at the placement of the gene on the reference chromosome. This sequence can be identified using the description column in the dialog. It can also be identified through patterns of accessions:

  • Accessions beginning with NC_ are reference sequence chromosomes.
  • Accessions beginning with NT_ are reference sequence contigs.
  • Accessions beginning with NG_ are reference sequences that have had some degree of human curation.
  • Accessions beginning with NM_ are reference sequence mRNAs.
  • Accessions beginning with NP_ are reference sequence proteins.
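
The prefix rules above amount to a simple lookup table. The sketch below encodes only the five prefixes listed in this tutorial; the function name describe_accession is made up for the example, and other accession prefixes exist that are not covered here.

```python
# Prefix meanings exactly as listed in the tutorial text.
REFSEQ_PREFIXES = {
    "NC_": "reference sequence chromosome",
    "NT_": "reference sequence contig",
    "NG_": "reference sequence with some degree of human curation",
    "NM_": "reference sequence mRNA",
    "NP_": "reference sequence protein",
}


def describe_accession(accession: str) -> str:
    """Return the tutorial's description for an accession, based on its prefix."""
    prefix = accession[:3].upper()
    return REFSEQ_PREFIXES.get(prefix, "unknown (prefix not covered here)")


print(describe_accession("NC_000021.8"))  # prints "reference sequence chromosome"
```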

View select sequences

As you start using views, Genome Workbench will remember the views you have used most recently and present them to you in a shorter menu.

Step 6: The Graphical View

Select the desired sequence and click the Finish button.

The graphical view shows the public annotations on a sequence using both color and arrangement to show the relationships. In the view below different annotations are shown in different colors:

  • Green bars represent genes.
  • Blue bars represent transcripts / mRNAs.
  • Red bars represent coding regions / proteins.

For more details on the colors and arrangements used, please see the full GSV legend document.

At the bottom of the graphical view there are several controls. The Content drop-down menu is highlighted in the figure below.

In addition to the Central Dogma annotations (genes, mRNAs, and coding regions), other annotations are available. These include:

  • At the top of the image there is a row of short blue tick marks. These marks represent variations from dbSNP for this sequence. As you zoom in and out they will become available as selectable objects.
  • Just beneath the variations there is a set of blue bars representing the components that are used to assemble this sequence. Larger genomic sequences are split into many smaller pieces and reassembled from these chunks. The blue bars show you where the chunk boundaries are and what the approximate overlap between the chunks is.
  • Many other features including sequence tagged sites (STSs, visible in the image below) are shown as black bars underneath the genes and the gene products.

Gene in graphical view

There are several ways to zoom in and out in the graphical view. One way, shown below, uses a zoom slider. To show the zoom slider, press and hold the Z key. You can then zoom in and out by left-clicking and dragging the mouse up and down. The view will change the level of zoom in real time.

Graphical view zoom slider

You can also pan the view from left to right (right to left) by left-clicking and dragging from left to right (right to left).

A third way to zoom in is called rectangular or regional zoom. It is available by holding down the R key, left-clicking, and dragging over a region. When you do this you will see a view like the one below. Try this now to zoom in to a region around an exon as shown below.

Graphical view rectangular zoom

Step 8: Zoomed In Detail

The Graphical View balances the depth of detail with the depth of zoom: the further you zoom in, the more detail you will see. In the image below the view is zoomed all the way in to the actual sequence. The gray sequence bars across the top are now duplicated, showing both the forward and reverse (complemented) strand of the chromosomal sequence. In addition, inside the protein coding region the letters of the protein are inscribed and spaced out to account for the codon boundaries and reading frame.

If you select a coding region annotation you will see (inscribed beneath each amino acid residue) the letters of the codon actually responsible for that amino acid.
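
The two strands and the codon spacing described above can be reproduced on a small scale in a few lines. This is a generic sketch, not code from Genome Workbench; the function names are invented for the example and the 9-base sequence is made up rather than taken from SOD1.

```python
# Complement of each DNA base (upper case, unambiguous bases only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse-complemented strand of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def codons(seq: str, frame: int = 0) -> list[str]:
    """Split a sequence into complete codons starting at `frame` (0, 1, or 2)."""
    seq = seq.upper()
    # Stop early enough that every slice is a full three-base codon.
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


example = "ATGGCGACG"  # made-up sequence for illustration
print(reverse_complement(example))  # prints CGTCGCCAT
print(codons(example))              # prints ['ATG', 'GCG', 'ACG']
```

Changing the frame argument shifts the codon boundaries, which is why the view spaces the protein letters according to the reading frame of the coding region.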

If you hover over an annotation or over the blue sequence bars, you will receive a tool tip popup providing additional information. The images below show two tool tips: one over a feature, showing information about the feature as well as the GenBank display for this feature, and one for the sequence, providing details both about the organism involved and the location over which the mouse is positioned.

Graphical view tooltip 1


Graphical view tooltip 2

For more details on the graphical view legend, please see the Graphical View Legend Document.

Step 9: Graphical View Configuration

The graphical view supports a wide range of visual customizations to make it easier to understand the data presented. The controls for customization are at the bottom of the graphical view (highlighted in the image below). Each track can be customized as well (also highlighted in the image below). The track controls will appear when you place the mouse over the title bar.

Among the things you can change are:

  • Decorations. Annotations such as mRNA and coding region features can be displayed with a wide variety of decorations, such as circle or square anchors, arrow fletchings, and different kinds of arrow heads.
  • Spacing. For many displays a more compact view size is preferred. To access this, choose the Compact option for size.
  • Content. In many cases you may be interested in seeing just a subset of features. The Content drop-down menu lets you choose what kinds of features to show. For example, the MolBio option shows just genes, mRNAs, and coding region annotations.
  • Track Order. To change the order of the tracks click in the track's title bar and drag.

The graphical view will remember your settings, so that the next time you create a graphical view your settings will apply. This is true even if you exit Genome Workbench. In addition, you can create and save several different Themes of settings and switch between styles without having to remember the individual settings.

Graphical view preferences

Step 10: Launching Tools from the Graphical View

In the graphical view everything you see is selectable, and you can perform actions on the things you select. To select an item, click on it and the object will be highlighted.

Let us run a BLAST search on a sequence in the graphical view. There is one gene annotated here: SOD1. Select a region of the genome containing this gene by clicking in the ruler above the sequence and dragging a gray rectangle to cover the gene model. Then choose Tools => Run Tool from the main menu and the Run Tool dialog will appear. This tool will submit the sequence to the NCBI BLAST service for alignment against a set of sequences. When you choose this option you should see a dialog like the one below.

Select mrna align to neighbors

There may be more than one sequence listed in the dialog. Choose the one whose label starts with NC_000021.8.

For this particular example, there are a couple of things to change. First, choose MegaBLAST from the Program menu. Second, enter the Entrez query biomol mrna[prop] (this ensures that only molecules known to be mRNAs are considered).
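
The two settings chosen above have counterparts in the public NCBI BLAST URL API, which can help in understanding what the dialog is asking for. The sketch below only assembles the form fields for a submission to https://blast.ncbi.nlm.nih.gov/Blast.cgi and sends nothing. It is not a description of how Genome Workbench submits the job; the helper name, the placeholder query sequence, and the choice of the nt database are assumptions made for the example, since the tutorial does not name the database.

```python
from urllib.parse import urlencode


def build_megablast_request(query: str, entrez_query: str, database: str = "nt") -> str:
    """Return the form-encoded body of a MegaBLAST submission.

    Hypothetical helper for illustration; it builds the request but does not send it.
    """
    params = {
        "CMD": "Put",                  # submit a new search
        "PROGRAM": "blastn",           # MegaBLAST is a mode of blastn
        "MEGABLAST": "on",
        "DATABASE": database,          # assumption: not specified in the tutorial
        "QUERY": query,                # sequence of the selected region
        "ENTREZ_QUERY": entrez_query,  # restricts the sequences searched against
    }
    return urlencode(params)


# Placeholder sequence standing in for the region selected in the graphical view.
body = build_megablast_request("ACGTACGTACGTACGTACGT", "biomol mrna[prop]")
print(body)
```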

Align to neighbors

Click Next.

There are many parameters that can be changed for BLAST. We are going to accept the defaults.

Blast parameters

Leave these options as they are and click Next.

We will add the results to the existing project in this tutorial, but you can also explore other options in the dialog below.

Blast results project

Click Finish.

Once the sequence has been submitted, it will be entered into the BLAST polling system for retrieval. The Task View will help you track your job's progress.
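
Polling means asking repeatedly whether a job has finished and waiting between attempts. The loop below is a generic sketch of that idea, assuming a caller-supplied check_status function; the status strings, the function names, and the retry limits are illustrative and do not describe Genome Workbench's actual polling system.

```python
import time
from typing import Callable


def poll_until_done(check_status: Callable[[], str],
                    interval_s: float = 1.0,
                    max_attempts: int = 10) -> bool:
    """Call `check_status` until it reports "READY"; return False if attempts run out."""
    for attempt in range(max_attempts):
        if check_status() == "READY":
            return True
        if attempt < max_attempts - 1:
            time.sleep(interval_s)  # wait before asking again
    return False


# Stand-in for a real status check: reports "WAITING" twice, then "READY".
_states = iter(["WAITING", "WAITING", "READY"])
finished = poll_until_done(lambda: next(_states), interval_s=0.0)
print(finished)
```

In practice the interval would be seconds or longer, so that the remote service is not queried more often than necessary.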

Blast poll

Step 11: Viewing Results

When our alignment job is finished, the data will be added to the Graphical View automatically. The alignment will also be added to the Project Tree View.

Viewing results

Incidentally, you can arrange the data in the Data folder. If you select the folder and then right-click (or control-click on Mac OS), you will find options in the context menu to create a new folder. You can then cut and paste or drag-and-drop items into new locations. As you collect more and more results in a project, these folders will help you keep track of what each set of results means.

New data folder

Step 12: Manipulating Alignments

The graphical view shows the newly obtained BLAST alignments as a set of disconnected bars hanging underneath our models. While this gives us some idea of which regions of the genome are likely to have coding potential, it also clutters the picture significantly. Genome Workbench provides a couple of tools to make this visualization easier.

Let us clean the alignment up by using the Clean Up Alignments tool. Select the alignment you have just created by left-clicking it. It is important that you select the alignment, and therefore make it the active object in the system, because the tool only works on the whole alignment. If you leave the region (the gray area selected in the previous step) active, or make it active again by clicking on it, the tool will not be able to perform the operation and will display an error message.

Select the Tools => Run Tool command or click the tool icon. You can also right-click the alignment and select the Run Tool command.

The system will present the following screen.

Select tool

Select the Clean Up Alignments tool and click Next (if you select the tool by left-clicking, you do not have to click the Next button).

Cleanup alignments

Select the set of alignments you need (for this exercise we select all the alignments) and click Next. Then choose Add to an existing project and click Finish. The graphical view will change to look like the image below, and a new item (or several items) will appear in the project tree.

Cleaned alignments view

Step 13: Saving the project

If you want to save your project, select File => Save from the drop-down menu.

Save project

The system will present you with the options to name your project and to select the location.

Saving options

Click Save. Your project is now saved under the name and in the location you selected, and can be opened again for future use, copied, sent via e-mail, etc.

Step 14: Finished

Congratulations, you have completed the first tutorial!

In this tutorial we examined several aspects of how Genome Workbench allows you to work with data, including:

  • Searching for data using NCBI public databases such as Entrez Gene.
  • Exploring projects in the workspace explorer portion of the Genome Workbench interface.
  • Creating and navigating in views such as the graphical view.
  • Running analyses such as creating alignments of sets of sequences.

In all of these explorations there are some common themes that will help you get the most out of Genome Workbench, including:

  • Use of context menus. If you right-click (or control-click on Mac OS) on a view or on a selection, you will receive information about that object, including things that can be done to that object.
  • Customization. No single set of view parameters is correct for everyone. Genome Workbench contains some easy ways to customize each view so that the views can provide you with more information relevant to you.
  • Selections. Things you see on the screen are selectable, whether they are annotations on sequence, rows of an alignment, or swaths of sequence themselves. Genome Workbench uses selections as inputs to analyses.

Last updated: 2013-03-18T10:43:39-04:00