Working with Multiple Views

Step 1: Introduction

This tutorial was prepared using Genome Workbench version 2.6.0.

This tutorial demonstrates how to manipulate views in Genome Workbench. After completing it you will know how to create new views, create workspace splitters and tabs, and move views between different locations. You will also learn how views communicate with each other and how selections made in one view are reflected in other views of Genome Workbench.

In order to get the full benefit of this tutorial you will need to download the sample data set for the Barcode project.

You should complete the Basic Operation tutorial first.

Step 2: Get the tools and sample data

The sample data set contains a project that references a set of sequences for the Collembola species. These sequences are all examples of cytochrome oxidase from several closely related organisms. In addition the project contains a protein multiple alignment generated using MUSCLE and a phylogenetic tree reconstructed from this alignment.

MUSCLE is not distributed with Genome Workbench, but it is available here: http://www.drive5.com/muscle/

The sample data can be obtained directly from: ftp://ftp.ncbi.nlm.nih.gov/toolbox/gbench/tutorial/Tutorial3/BX530088_BX572102.comp.zip

Step 3: Opening a Multiple Alignment View

Open Genome Workbench. It will present you with the opening screen.

Opening screen

Let us start by opening the example workspace that you downloaded. Choose the open folder icon from the main toolbar or choose File=>Open from the menu bar.

Open file

Choose Projects or Workspace from the left side of the dialog and click the button with the ... on the right side. Then navigate to the barcode.gbw file that you have downloaded, select it, and click Open. Then click Finish.

The system will open the barcode project.

Barcode project data

Next, open a multiple alignment view on the protein alignment. You can do this by selecting the multiple alignment in the project tree (the MUSCLE alignment), right-clicking and choosing Open New View. An easier way is to double-click the item; this brings up the Open View dialog, as shown below. Alternatively, choose View=>Open New View from the main menu.

Inset select new view

Select Multiple Alignment View and click Next.

Step 4: Multiple Alignment View Features

The default view for the multiple alignment will appear. You will see an image like the one below.

Multiple alignment default

There are several features to note:

  • As with the graphical view, you can zoom interactively on this view by using Z + Left Click/Drag for the interactive zoom slider, or R + Left Click/Drag for a range zoom.
  • A tooltip will appear if you hover over a location. This tooltip tells you which sequence you are hovering on, as well as the position on that sequence (in both sequence coordinate space and alignment coordinate space).
  • The header row contains a set of column headings. You decide which columns are visible: you can rearrange the columns using drag-and-drop, or right-click on the header and choose Settings to bring up a menu for selecting additional columns.

Column settings

Step 5: Coloration Schemes

Now we will look at alternate ways to score and color an alignment. Genome Workbench provides a variety of means for scoring alignments. If you right-click in the alignment view you will see a context menu like the one below. Choose Coloration => Select Method... to see the list of available schemes for coloring proteins.

Select coloration method

The default coloration scheme is called Column Quality score Protein. This scheme assigns scores to residues based on how well a particular residue agrees with the others in a column.

Coloration methods protein

Let us change the method to one that scores based on a hydropathy scale. This method scores each amino acid residue independently and provides a colorimetric scale between hydrophobic (red) and hydrophilic (blue). Click on Hydropathy Scale and click Select. Once this is set you should see the multiple alignment view change to look like the image below.

Coloration applied

Step 6: More About Coloration Schemes

Each coloration method offers its own configuration settings. While most people will not need to change many of these settings, some are notable, so let us look at how to change them. Right-click in the multiple alignment view and choose Coloration => Method Properties....

You should see the menu as in the image below.

Method properties menu

Choosing this brings up a properties dialog for the coloration scheme.

Scoring method properties

There are several things to note here:

  • The colors used are configurable. The default for the hydropathy scale is red for hydrophobic and blue for hydrophilic. In addition, the color used for neutral is provided. You may change each of these colors. In addition, there is a slider above the color scale so you can select the degree of gradation between colors.
  • When using consensus scoring, you can choose to provide a window for averaging across an alignment. The default value is 1, in which case the average score for a column is the average over all residues in that column. You can change this to include adjacent columns as well. For hydropathy this provides a means of identifying regions of the alignment that are more or less hydrophobic or hydrophilic than expected.
  • There is a check box to toggle consensus scoring. Toggling this changes the calculation of score so that coloration is based on the difference from the average score in a column rather than on the single score provided for the amino acid. Choosing this allows you to investigate variance within a column. Please check Use consensus now and click OK. You should see the screen change to match the image below.
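The consensus and window options above can be illustrated with a small sketch. It is not Genome Workbench's implementation: the Kyte-Doolittle values, the centered window, the gap handling and the function names `window_average` and `consensus_score` are all assumptions made for illustration, since the tutorial does not specify them.

```python
# Illustrative only: Kyte-Doolittle hydropathy values are assumed here; the
# tutorial does not say which scale Genome Workbench uses.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_average(rows, col, window=1):
    """Mean hydropathy of all non-gap residues in a window centered on `col`.

    A window of 1 covers only the column itself (assumed interpretation).
    """
    half = (window - 1) // 2
    first = max(0, col - half)
    last = min(len(rows[0]) - 1, col + half)
    scores = [
        KYTE_DOOLITTLE[row[c]]
        for row in rows
        for c in range(first, last + 1)
        if row[c] in KYTE_DOOLITTLE  # skip gaps and unknown residues
    ]
    return sum(scores) / len(scores) if scores else 0.0


def consensus_score(rows, row_index, col, window=1):
    """Residue score minus the windowed column average (None for a gap)."""
    residue = rows[row_index][col]
    if residue not in KYTE_DOOLITTLE:
        return None
    return KYTE_DOOLITTLE[residue] - window_average(rows, col, window)


# A toy three-row alignment (hypothetical data, '-' marks a gap).
alignment = ["MIVL", "MKVL", "M-AL"]
print(consensus_score(alignment, 0, 1))            # I against the average of I and K
print(consensus_score(alignment, 1, 1, window=3))  # K against columns 0 to 2
```

With consensus scoring switched off, the color would depend on the residue's own score alone; with it switched on, a residue stands out only when it differs from its column (or window) average.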

Method properties applied

Step 7: Adding a Phylogenetic Tree View

Next let us add a phylogenetic tree view. Select the Phylogenetic tree item in the Project Tree.

Select phylogenetic tree

Right-click the selected tree and choose Open New View, or select View => Open New View.

Select tree view

Click Tree View. You should see a view like the one in the image below appear.

Phylogenetic tree view

This tree was constructed from the alignment in this project. It was obtained by running the tool at Tools => Run Tool, choosing Phylogenetic Tree Builder Tool, and using the Neighbor Joining algorithm.

Step 8: Phylogenetic Tree View Features

The phylogenetic tree view offers a variety of ways to manipulate and edit trees. We will discuss a few of these below.

Layout Options

The phylogenetic tree offers several different methods to lay out the nodes in the tree. To choose a layout option, right-click on the tree image in the right part of the screen. In the pop-up menu select Layout.

Phytree radial menu

Available options are:

  • Rectangular Cladogram - the default view
  • Slanted Cladogram - provides a triangular view of the tree
  • Radial Tree - shows the tree in radial format
  • Circular Tree - shows the tree in circular format
  • Force Layout - shows the tree in radial format, using a physics-based approximation to adjust the layout

The Force Layout more adequately reflects the fact that the tree itself is unrooted, since it is a reconstruction.

Phytree force layout

The phylogenetic tree offers a powerful search implemented via the search bar at the top of the window. Two search methods are implemented within the single interface:

  • Simple string matching
  • Full query search

Simple string matching allows you to type in some text and then press enter or Start to search for that text within all node properties in the current tree. If your text includes blanks, enclose it in quotes to force the search tool to use simple string matching. If the text in the search box has blank spaces and is not enclosed in quotes, the search engine will attempt to parse it according to the query language syntax.

Phytree search results

After a query is executed, the matching nodes replace any currently selected nodes. To enhance visualization of the results, check the Filter option on the toolbar, which draws the nodes not selected by the query semi-transparently.

Search results filtered

Full query search allows you to create logical queries similar to how you select records in an SQL database. In this format, you use the node properties and compare them with other properties or values of your choice. Queries can be built from a combination of comparisons, such as equal and greater-than, combined with logical operators, such as AND and OR. Logical operators may be given in upper or lower case. While you type a query, node property names are highlighted in blue. To execute the query, press Enter in the search box or click the Start button. While a query is running you will not be able to manipulate the tree. If a query takes too long, click the Stop button to stop it.

The full query search syntax allows for a number of comparison and logical operators that you can combine with values you enter in the search string and properties from the current tree. The valid query elements include:

  • String, numeric and boolean values (such as 5, 0.2, true, "mitochondrial")
  • Node properties (such as seq-id, dist, organism, cluster-id)
  • Simple comparisons: <, <=, >, >=, =, !=
  • The 'Like' Comparison which allows wildcards: organism like Desoria*
  • 'Between' comparison: dist between 0.02 and 0.05
  • 'In' comparison: seq-id in (AAT66216, AAT66240.1)
  • Logical Operators: AND, OR, XOR, NOT

Some valid queries for the sample project are:

organism = "Archisotoma polaris" and seq-id = "AAT66228"

dist between 0.002 and 0.003 or seq-id==AAT66206

label like "AAT6619*" xor dist > 0.002

seq-id in (AAT66197, AAT66220, AAT66229)
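To make the meaning of these comparisons concrete, the sketch below evaluates equivalent predicates over a few node records in Python. It only models the semantics described above and is not the query engine used by Genome Workbench. The node values, the helper names `like`, `between` and `select`, the inclusive bounds of `between` and the case-sensitive wildcard match are all assumptions.

```python
from fnmatch import fnmatchcase

# Hypothetical node properties; the organism/dist values are invented for illustration.
nodes = [
    {"seq-id": "AAT66228", "organism": "Archisotoma polaris", "dist": 0.0025},
    {"seq-id": "AAT66206", "organism": "Desoria sp.", "dist": 0.0100},
    {"seq-id": "AAT66197", "organism": "Hypogastrura concolor", "dist": 0.0200},
]


def like(value, pattern):
    """Wildcard match, as in: organism like Desoria*"""
    return fnmatchcase(str(value), pattern)


def between(value, low, high):
    """Range test, as in: dist between 0.002 and 0.003 (inclusive bounds assumed)."""
    return low <= value <= high


def select(predicate):
    """Return the seq-ids of the nodes for which the predicate holds."""
    return [n["seq-id"] for n in nodes if predicate(n)]


# dist between 0.002 and 0.003 or seq-id = AAT66206
print(select(lambda n: between(n["dist"], 0.002, 0.003) or n["seq-id"] == "AAT66206"))

# seq-id in (AAT66197, AAT66220, AAT66229)
print(select(lambda n: n["seq-id"] in ("AAT66197", "AAT66220", "AAT66229")))

# organism like Desoria* xor dist > 0.002  (xor is true when exactly one side holds)
print(select(lambda n: like(n["organism"], "Desoria*") != (n["dist"] > 0.002)))
```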

If a search returns multiple nodes (our example uses a like AAT6622* search), you can view the nodes one at a time. To do so, uncheck the Select All check box and then use the Prev and Next arrow buttons to step through the selected nodes individually. The search result and two individual node views are illustrated below.

View nodes 1

View nodes 2

View nodes 3

Distances

Phytree use distances

Phytree distances

As generated in Genome Workbench, the phylogenetic tree itself marks each node with a computed distance from the presumed root. This distance can be used to alter the rendering to show graphically how each node is related based on its distance from the root. This option is available in the context menu by right-clicking and selecting Layout -> Use Distances, as shown on the right.

Distances can be used in the Rectangular Cladogram as well as in the Radial and Circular views.

Support for Distance-based Circular Trees

Starting with Genome Workbench version 2.10.5 circular trees may also be rendered to show distances between the root and leaves based on true distances.

To enable this feature, make sure the "Use Distances" check box is checked in the Layout sub-menu of the context (right-click) menu.

use distances

The resulting tree will be rendered in accordance with the real distances.

distance based tree

Labeling

Phytree settings new label

Phytree organisms

The phylogenetic tree, by default, displays the sequence identifier at each node. This can be changed by using the Settings option in the right-click context menu. When you select this, you see a tabbed dialog like the one on the right. Select the Labels heading to change the labels. This dialog contains some simple labeling options to select the accession or organism name. In addition, you can select Custom Labels to construct a label from the available properties in the tree. The example here uses the label

$(label) - $(organism)

to construct a label containing the organism name. The drop-down and Insert button on this page may be used to insert the properties without needing to know the syntax.
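The effect of such a template can be sketched as a simple placeholder substitution. This is an illustration of the `$(property)` syntax only; the function name `expand_label` and the choice to render a missing property as an empty string are assumptions, not documented Genome Workbench behavior.

```python
import re


def expand_label(template, properties):
    """Replace each $(name) with the node's property value ('' if missing)."""
    return re.sub(
        r"\$\(([^)]+)\)",
        lambda match: str(properties.get(match.group(1), "")),
        template,
    )


# Hypothetical node properties for one leaf of the sample tree.
node = {"label": "AAT66197", "organism": "Hypogastrura concolor"}
print(expand_label("$(label) - $(organism)", node))
```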

Once the labels are set, you will see the phylogenetic tree view change to match the image on the right. Each node is now marked with the sequence accession as well as the species name.

Node Markers

Phytree markers

Sometimes it may be desirable to highlight individual nodes in the tree with a marker attribute that will change the color and size of the displayed node. Markers are added as a property in the "Node Properties" dialog. You can display this dialog by using the Properties option in the right-click context menu for the node.

Markers are created by adding a property with name "marker" to the properties list for a node. The marker value may include one or more colors and, optionally, a size parameter. Each color is specified as RGB values between 0 and 255 enclosed in square brackets, e.g. [64 0 128]. The numbers may be separated by commas and/or spaces. If a fourth value, commonly called the alpha channel, is given between the brackets, it is ignored. When multiple colors are given, the marker is divided evenly between the given colors, and looks much like a pie chart.

Examples

Red marker, default size

[255 0 0]

Marker that is 50% red and 50% green with large size

[0 255 0] [255 0 0] size=4
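If marker values are generated by a script, it can help to check them before pasting them into the Node Properties dialog. The following is a minimal sketch of such a check based only on the format described above; the function name `parse_marker`, the integer-only size and the case-insensitive `size` keyword are assumptions.

```python
import re


def parse_marker(value):
    """Return (colors, size) from a marker string such as '[255 0 0] size=4'."""
    colors = []
    for group in re.findall(r"\[([^\]]*)\]", value):
        # Channels may be separated by commas and/or spaces.
        channels = [int(x) for x in re.split(r"[,\s]+", group.strip()) if x]
        if len(channels) not in (3, 4) or any(not 0 <= c <= 255 for c in channels):
            raise ValueError(f"bad color: [{group}]")
        colors.append(tuple(channels[:3]))  # a fourth (alpha) value is ignored
    if not colors:
        raise ValueError("a marker needs at least one color")
    size = re.search(r"size\s*=\s*(\d+)", value, re.IGNORECASE)
    return colors, int(size.group(1)) if size else None


print(parse_marker("[255 0 0]"))
print(parse_marker("[0 255 0] [255, 0, 0, 128] size=4"))
```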

Subtree Boundaries

Phytree bounded force

Phytree bounded slanted cladogram

Phytree bounded rect cladogram

The phylogenetic tree supports adding a colored boundary to one or more subtrees. Boundaries are added as a property to the parent node of the subtree using the "Node Properties" dialog. You can display this dialog by using the Properties option in the right-click context menu for the node.

Boundaries are created by adding a property with name "$NODE_BOUNDED" to the properties list for a node. There are several parameters for a boundary including its shape, color, border width and whether or not the boundary should include text. It is also possible to define different boundary shapes for each of the different layout methods. Parameters other than the shape will remain the same for each layout method.

Parameters for the boundary regions are not case-sensitive. Colors are specified in the format [0..255, 0..255, 0..255, 0..255] for red, green, blue and, optionally, alpha. The numbers may be separated by spaces and/or commas. Parameters that require a value are specified in the form "parameter=x", and the possible values for 'x' are shown below. Boolean parameters can be 'true', 'yes', 'y', 'false', 'no', or 'n'. Parameters such as color and border that apply to more than one boundary shape will be applied to all applicable shapes.

Shape Parameters

The following parameters specify the shapes to be used for different layouts. If the same boundary shape is to be used for all layouts, specify only the 'Shape' parameter. To override the 'Shape' parameter for other layouts, specify the shape for that layout.

Shape={Rectangle, RoundedRectangle, Triangle}

RectCladogram={Rectangle, RoundedRectangle, Triangle}

SlantedCladogram={Rectangle, RoundedRectangle, Triangle}

Radial={Rectangle, RoundedRectangle, Triangle}

ForceLayout={Rectangle, RoundedRectangle, Triangle}

Appearance Parameters

These parameters apply to all the different shapes. The boundary color is specified as [r, g, b, a] without the 'keyword=' syntax and it can include an optional transparency, or alpha, value where 0 is fully transparent and 255 is fully opaque. The 'DrawEdge' parameter adds a 1-pixel border to the boundary. The edge color defaults to black but can be changed with the 'EdgeColor' parameter. 'Border' expands the overall shape by a specified number of pixels and 'Corner' rounds off the corners in RoundedRectangles and Triangles. Since rounding corners brings corners inward, it may be helpful to increase 'Border' to compensate. If 'IncludeText' is true, the boundary shape will be expanded to include node labels.

[0..255, 0..255, 0..255, 0..255]

Border=n

Corner=n

DrawEdge={true, false}

EdgeColor=[0..255, 0..255, 0..255, 0..255]

IncludeText={true, false}

Triangle Parameters

This last set of parameters applies only to triangles. If 'AxisAligned' is true then the shape is aligned with the nearest x or y axis. This defaults to 'true'. The 'TextBox' parameter forces the text of the bounded nodes to be placed in a square box rather than expanding the triangle to include the text. Lastly, 'TriOffset' is the distance behind the root node at which the triangle apex should be placed. It defaults to '40' units.

AxisAligned={true, false}

TextBox={true, false}

TriOffset=n

Examples

Green rectangle boundary for all layouts with text included but no border or edge.

[0 255 0 255]

shape=Rectangle

IncludeText=true

Red triangle that does not include a text box and has rounded corners and a black edge.

[255 0 0 128]

Shape=Triangle

corner=10

border=10

textbox=false

drawedge=y

AxisAligned=false

Blue rectangle with rounded corners for the rectangular cladogram layout, triangle for slanted cladogram and force layouts and rounded rectangle for radial layouts. Boundaries will not be expanded to include text. Corners will be rounded and a 10-pixel border will expand the boundary size.

[0 0 255 128]

shape=RoundedRectangle

SlantedCladogram=Triangle

Radial=RoundedRectangle

ForceLayout=Triangle

drawedge=n

corner=10

border=10

textbox=false

IncludeText=false

AxisAligned=false
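The parameter rules above (case-insensitive names, boolean spellings, a bare color without 'keyword=') can be summarized in a small validation sketch. It is written from the description in this tutorial, not from the Genome Workbench source: the function names, the assumption that numeric parameters are whole numbers, and the assumption that parameters may be separated by any whitespace are all hypothetical.

```python
import re

SHAPES = {"rectangle", "roundedrectangle", "triangle"}
LAYOUT_KEYS = {"shape", "rectcladogram", "slantedcladogram", "radial", "forcelayout"}
BOOL_KEYS = {"drawedge", "includetext", "axisaligned", "textbox"}
INT_KEYS = {"border", "corner", "trioffset"}
TRUE, FALSE = {"true", "yes", "y"}, {"false", "no", "n"}
PAIR = re.compile(r"(\w+)\s*=\s*(\[[^\]]*\]|\S+)")   # parameter=x
COLOR = re.compile(r"\[([^\]]*)\]")                  # [r g b] or [r g b a]


def parse_color(text):
    """Turn '[0, 255, 0, 255]' into a tuple of 3 or 4 channel values."""
    channels = [int(x) for x in re.split(r"[,\s]+", text.strip("[] \t")) if x]
    if len(channels) not in (3, 4) or any(not 0 <= c <= 255 for c in channels):
        raise ValueError(f"bad color: {text}")
    return tuple(channels)


def parse_boundary(value):
    """Parse a $NODE_BOUNDED value into a dict keyed by lower-cased names."""
    result = {}
    for key, raw in PAIR.findall(value):
        key, low = key.lower(), raw.lower()
        if key in LAYOUT_KEYS and low in SHAPES:
            result[key] = low
        elif key in BOOL_KEYS and low in TRUE | FALSE:
            result[key] = low in TRUE
        elif key in INT_KEYS and raw.isdigit():
            result[key] = int(raw)
        elif key == "edgecolor":
            result[key] = parse_color(raw)
        else:
            raise ValueError(f"unrecognised parameter: {key}={raw}")
    # The boundary color is the bracketed value left once all pairs are removed.
    bare = COLOR.search(PAIR.sub(" ", value))
    if bare:
        result["color"] = parse_color(bare.group(0))
    return result


red_triangle = """[255 0 0 128]
Shape=Triangle
corner=10
border=10
textbox=false
drawedge=y
AxisAligned=false"""
print(parse_boundary(red_triangle))
```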

Saving Images

If you need to save a screen capture of the current tree, select Save Images... from the File menu to bring up the Save Images dialog. The dialog allows you to save the tree as a single image, or to divide the image into equal-sized tiles (sub-images) and save those to a directory. When saving the images, you can, via Printing Guides, display cutting markers and names of adjacent image tiles in the image margins. This is useful for saving images that will be printed and then reassembled into a poster presentation.

Save images 1

In the Save Images dialog, use the Partitions slider to subdivide the image into multiple sub-images, each of which will be saved to a separate file in the directory name given by Directory. The names of the image files are displayed on each tile and are a combination of File Name and the image's index given according to the numbering scheme in Numbering. Use the Image Size to specify the size of each individual image saved, and use proportions to set the width-to-height ratio to make images as small as possible or to force them to a standard (paper) size.

Save images 3

Save images 2

Loading Attributes

Load attributes 1

To update the properties of nodes in a tree from a file, right-click on the background to bring up the context menu, and then select Load Attributes. The Loading Attributes feature allows you to update the properties of nodes in a tree by loading them from a flat file. The attributes in the file can include both updates to existing node attributes and new attributes. The sequence identifier property, seq-id, is used as the key to match nodes in the file to nodes in the tree. This implies that the feature cannot be used to directly update nodes that do not have a seq-id.

The file that provides the updates to the node properties has a well-defined format, an example of which is shown below. The first line of the file must contain the file-identifier:

#BKBTA-1

The next line must specify the names of all the node properties that are given in the file. The list of property names should start with # and the individual properties should be separated by spaces or tabs. The first property has to be a key value that can be used to look up the elements in the tree that will be modified by the corresponding row in the attribute table. Attribute rows in which the first element (the key value) does not match any node in the tree are ignored.

#seq-id cluster-id label dist

After these two lines, the following lines contain the actual node identifiers and properties to update. Additionally, any lines after the first two lines that start with # are read as comments and will be ignored. The list of properties for each node must be separated by tabs, not spaces.

AAT66197 2 Hypogastrura concolor 0.02

This is an example that provides a cluster-id for a set of nodes in the sample barcode project:

#BKBTA-1

#seq-id cluster-id

#Add a cluster id to the tree

AAT66197 9

AAT66196 9

AAT66189 9

AAT66223 2

AAT66236 2

AAT66216 2

AAT66230 2

AAT66203 2

AAT66195 2

Load attributes 2
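For larger trees, an attribute file like the one above is easier to generate than to type. The sketch below builds the file text from a Python mapping, following the format rules described in this section (identifier line, header line, optional comment lines, tab-separated data rows). The function name `build_attribute_file` and the strict check that every row has one value per header property are assumptions for illustration.

```python
def build_attribute_file(property_names, rows, comment=None):
    """Return the text of an attribute file for the Load Attributes feature.

    property_names: header names, with the key property (e.g. seq-id) first.
    rows: iterable of value sequences, one per node, in header order.
    """
    lines = ["#BKBTA-1", "#" + " ".join(property_names)]
    if comment:
        lines.append("#" + comment)  # lines starting with # are comments
    for row in rows:
        values = [str(v) for v in row]
        if len(values) != len(property_names):
            raise ValueError(f"expected {len(property_names)} values, got {row!r}")
        if any("\t" in v or "\n" in v for v in values):
            raise ValueError(f"values may not contain tabs or newlines: {row!r}")
        lines.append("\t".join(values))  # data rows must be tab-separated
    return "\n".join(lines) + "\n"


clusters = {"AAT66197": 9, "AAT66196": 9, "AAT66189": 9, "AAT66223": 2}
text = build_attribute_file(["seq-id", "cluster-id"], clusters.items(),
                            comment="Add a cluster id to the tree")
print(text, end="")
```

The returned text can be written to any file name of your choice and then selected in the Load Attributes dialog.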

Re-rooting Tree

To split an existing branch and place the root in the middle, right-click on the branch and, when you see it has been highlighted, select “Place Root at Middle of Branch”.

branch middle menu

This will split the selected edge in two and place a new root for the tree in the middle. If you do not want the root to be exactly in the middle, then after the tree is re-rooted, you can edit the distance property “dist” of the two children of the new root node to represent the position you prefer.

branch middle result

Midpoint rooting is also supported. Use the option menu to select “Set Midpoint Root” which searches the tree for its “middle point”.

midpoint root menu

This computes all the leaf to leaf distances in the tree and selects the longest one. The distance-based midpoint of the path between these leaves is then found and a new node is added at that point. The added node is then made to be the new root of the tree.
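The midpoint search described above can be sketched in a few lines. This is an illustration of the general technique on a toy tree, not the code used by Genome Workbench; the adjacency-dictionary representation, the function names `paths_from` and `midpoint`, and the sample branch lengths are assumptions. The sketch only locates the point where the new root would go and expects a tree with at least two leaves and positive branch lengths.

```python
def paths_from(tree, start):
    """Depth-first walk returning {node: (distance, path)} measured from `start`."""
    seen = {start: (0.0, [start])}
    stack = [start]
    while stack:
        node = stack.pop()
        dist, path = seen[node]
        for neighbour, length in tree[node].items():
            if neighbour not in seen:
                seen[neighbour] = (dist + length, path + [neighbour])
                stack.append(neighbour)
    return seen


def midpoint(tree):
    """Return (u, v, offset): the new root lies `offset` along edge u-v from u."""
    leaves = [n for n, nbrs in tree.items() if len(nbrs) == 1]
    best_dist, best_path = -1.0, None
    for leaf in leaves:  # compare all leaf-to-leaf distances, keep the longest
        reach = paths_from(tree, leaf)
        for other in leaves:
            dist, path = reach[other]
            if dist > best_dist:
                best_dist, best_path = dist, path
    half, walked = best_dist / 2, 0.0
    for u, v in zip(best_path, best_path[1:]):  # walk the path to its midpoint
        if walked + tree[u][v] >= half:
            return u, v, half - walked
        walked += tree[u][v]


# Toy unrooted tree as an undirected weighted adjacency dictionary.
tree = {
    "A": {"X": 1.0},
    "B": {"X": 2.0},
    "X": {"A": 1.0, "B": 2.0, "Y": 1.0},
    "Y": {"X": 1.0, "C": 4.0, "D": 1.0},
    "C": {"Y": 4.0},
    "D": {"Y": 1.0},
}
print(midpoint(tree))
```

In this toy tree the longest leaf-to-leaf path runs from B to C (length 7), so the midpoint falls on the branch between Y and C, 0.5 units from Y.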

midpoint root result

Step 9: Arranging Windows

One of the powerful features in Genome Workbench is the ability to move the views where you'd like them, create tabbed stacks and resize any view. Our goal here is to take the search view and the selection inspector and dock them with the tab group on the bottom left. We will then resize the selection inspector and use it to inspect the nodes in our Phylogenetic Tree.

Window dock

Click on the Search view tab and drag it over the title bar in the bottom left view. As you drag, you'll see the dock icons appear giving you choices where to put the view. Choose the center icon when over the bottom left view. See right.

Do the same thing with the Selection Inspector. Then resize the bottom panel by moving the mouse over the divider and when it changes to a double arrow, click and drag. All the frames are resizable using the same technique.

Feel free to experiment with moving, docking and undocking, and resizing windows to find the set up that works for you.

Then go ahead and click on a node in our phylogenetic tree, and the selection inspector will show the item dynamically. The item displayed changes based on where you click in the tree.

Once this is completed, you should have a view that looks like the image below.

All views

Step 10: Interactions Between Views

Multiple alignment selection

So why would you want to go to the trouble of arranging views like this? The primary reason to do this is to see several aspects of the same data simultaneously. Genome Workbench provides this ability. To see it in action, open a Multiple Alignment View on the MUSCLE alignment in the Project Tree View.

Multiple alignment view

Dock the Multiple Alignment View on the bottom of the gBench window as we did in the previous step. Your view should look like the view on the right.

Show only selected chosen

Show only selected

If you click on a node in the Phylogenetic Tree view, the corresponding rows will highlight in the Multiple Alignment View. There can be many rows in the Multiple Alignment View so there are two ways to see the relevant rows.

The first way is to right-click (or control-click) on a description in the Multiple Alignment View and select Hide/Show -> Show Only Selected from the contextual menu.

Move selected items up

Move selected items chosen

The second way is to right-click (or control-click) on a description in the Multiple Alignment View and select Move Selected Items Up from the contextual menu.

You can reverse this operation at any time by right-clicking and selecting Hide/Show -> Show All.

Step 11: Finished

This completes this tutorial. In this tutorial, we covered:

  • How to create different kinds of views on your data (Multiple Alignment View, Phylogenetic Tree View, Selection Inspector)
  • How to use scoring and coloration schemes in the multiple alignment view to see differences in your data.
  • How to manipulate the phylogenetic tree view to provide more informative displays.
  • How to arrange views to provide several different views of the same data on the screen at once.
  • How to see selections shown between different views.

Tree View on the Web

Tree View on the Web service is also available at https://www.ncbi.nlm.nih.gov/projects/treeview/

Current Version is 2.12.10 (released August 20, 2018)

Last updated: 2017-11-04T03:25:26Z