General Questions
What is Genome Workbench?
Genome Workbench is an integrated application for viewing and
analyzing sequences. Genome Workbench can be used to browse data in
GenBank and combine this data with your own private data.
What platforms does Genome Workbench run on?
Genome Workbench runs natively on Windows 2000/XP/Vista, Linux, MacOS X
(10.4) and various flavors of Unix. We provide binary snapshots of
the application for Windows and MacOS X; the source snapshot can be
used on any platform.
Are there any mailing lists for Genome Workbench?
Yes - there are several. Some useful e-mail addresses to know:
- gbench@ncbi.nlm.nih.gov. This is a public
list for discussion of Genome Workbench. Any member of the
public may post to this list, with moderation; in addition,
membership is open.
- gbench-announce@ncbi.nlm.nih.gov. This is a
public list for all Genome Workbench announcements. Any member
of the public may belong to this list; posting to this list is
limited to approved announcements from NCBI.
- gbench-bugs@ncbi.nlm.nih.gov. This is an
e-mail alias for reporting bugs with Genome Workbench. This
alias is the recipient of all feedback reports supplied by Genome
Workbench. Membership is limited to the developers of Genome
Workbench; however, anyone can post to this list.
- cpp-gui@ncbi.nlm.nih.gov. This is a list for
discussion of Genome Workbench development and general GUI
development using the NCBI C++ ToolKit.
Help! I want to do something, but Genome Workbench can't do it! What
can I do?
Post a message to the main mailing list,
gbench@ncbi.nlm.nih.gov. The list is read by many
knowledgeable users at NCBI, as well as application developers.
Chances are, we can help find a solution.
Genome Workbench Installation
I've installed Genome Workbench, but it won't run (or takes a really
long time to start). What is going on?
The most common problem here is that you are running Genome Workbench
behind a restrictive firewall that prevents outbound connections.
The possible solutions are:
- Ask your network administrators to permit outbound connections.
Blocked outbound connections are the most common cause of problems.
- Configure Genome Workbench to understand your firewall
settings. This configuration can be done from within Genome
Workbench by following the steps below:
- Choose the menu item Tools -> Configure Network
Connection.
- There are two drop-down menus in the section entitled
NCBI Connection Settings. Please choose the
options Connect via NCBI Dispatcher in the
first drop-down and Reuse connections to NCBI
services if possible in the second.
- If you have an institution-wide web proxy, please enter the
settings into the section entitled HTTP Proxy
Settings.
- If you experience extreme slow-downs with connections,
please try changing the second drop-down back to
Reuse connections. This may not work behind
a firewall, depending on your firewall's settings.
- The configuration steps above establish some defaults in a
configuration file that you can edit manually if you need to. If
you have completed the steps above, you do not need to make any
of the changes below; the changes below are merely provided for
debugging purposes. An easy place to start is this file, which
contains settings that should work for most users behind a
firewall. This file should be renamed and placed in
your Genome Workbench user-specific directory:
- Windows: C:\Documents and Settings\<login name>\Application Data\GenomeWorkbench\gbench.ini
- Unix/MacOS X: $HOME/.gbench/gbench.ini
If you experience any problems, particularly with speed, please
remove the line that sets STATELESS and try again. If you
are having trouble with the link above, the file contains the
following:
[CONN]
FIREWALL = TRUE
STATELESS = FALSE
- Configure Genome Workbench to understand your HTTP proxy. If
you use a corporate HTTP proxy, you may need to adjust Genome
Workbench to respect the proxy settings. A sample configuration
file is below; this belongs in the same place as the example
above:
[CONN]
FIREWALL = TRUE
STATELESS = FALSE
HTTP_PROXY_HOST = 10.0.100.1
HTTP_PROXY_PORT = 3128
TIMEOUT = 60
- The firewall settings are available from inside Genome
Workbench by using the menu command Tools -> Configure
Network Connection.
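The manual configuration files shown above can also be generated with a short script. The following is a minimal Python sketch and is not part of Genome Workbench: the helper names build_conn_config, render, and default_ini_path are hypothetical, the keys and values are copied from the examples above, and the use of the APPDATA environment variable to find the Windows directory is an assumption. Review the output before placing it in your user-specific directory.

```python
import configparser
import io
import os
import sys


def build_conn_config(proxy_host=None, proxy_port=None, timeout=None):
    """Build the [CONN] section shown above (hypothetical helper)."""
    config = configparser.ConfigParser()
    # Keep the upper-case key names; configparser lower-cases keys by default.
    config.optionxform = str
    config["CONN"] = {"FIREWALL": "TRUE", "STATELESS": "FALSE"}
    # The proxy and timeout entries are only written when supplied.
    if proxy_host is not None:
        config["CONN"]["HTTP_PROXY_HOST"] = proxy_host
    if proxy_port is not None:
        config["CONN"]["HTTP_PROXY_PORT"] = str(proxy_port)
    if timeout is not None:
        config["CONN"]["TIMEOUT"] = str(timeout)
    return config


def render(config):
    """Return the configuration as INI text."""
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


def default_ini_path():
    """Guess the user-specific gbench.ini location described above."""
    if sys.platform.startswith("win"):
        # Assumption: APPDATA points at the user's "Application Data" folder.
        base = os.environ.get("APPDATA", os.path.expanduser("~"))
        return os.path.join(base, "GenomeWorkbench", "gbench.ini")
    return os.path.join(os.path.expanduser("~"), ".gbench", "gbench.ini")


if __name__ == "__main__":
    # Print the target path and the proxy example instead of writing a file.
    print(default_ini_path())
    print(render(build_conn_config("10.0.100.1", 3128, 60)))
```

Calling build_conn_config() with no arguments yields only the two firewall settings; passing a proxy host, port, and timeout adds the corresponding lines from the proxy example.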
Genome Workbench Interface
What is a Project? What is a Workspace?
In short: Projects Hold Data. Workspaces Hold
Projects. It is best to combine data that go together
inside projects, and to use workspaces to hold collections of
projects that may or may not go together.
Genome Workbench organizes your data into Projects. A Genome
Workbench Project is a combination of data that semantically go
together. Data inside a project can communicate more easily with
other data inside the same project. For example, if you are
reviewing a gene, it would be best to organize the data for the
gene in one project, combining references to genomic sequences,
transcripts, proteins, and any local analyses done on the sequence.
Workspaces are collections of projects. Workspaces provide
convenient ways to organize sets of related data, such as a set of
projects relating to all the genes you are studying in mouse, or
presenting different chunks of data in different organisms.
What is the Project Tree?
The Project Tree is a view on your workspace and projects. The
project tree shows you a hierarchical expansion of your data, and
allows you to group data items into folders.
The Project Tree is available from the main menu at View ->
Project Tree. It is on by default, and appears on the
left-hand side.
What is the Selection Inspector?
The selection inspector provides a means for evaluating all the
selected objects in Genome Workbench. The selection inspector has
three modes of operation (Table, Brief Text, and Full Text),
selectable by using the icons in the right-hand corner of the view.
The modes are:
- Table provides a tabular list of the most
common attributes of selected objects.
- Brief Text provides a short textual
description of the selected objects.
- Full Text provides a verbose textual
printing of selected objects; this mode implies the GenBank
Flat File format for any selected features.
The strength of the selection inspector is in aggregating
selections across views. The selection inspector
features a drop down menu that indicates which view's selections
are being shown. One of the options is to show selections
from all views. This mode allows you to combine
selections from the project tree with selections from any other
view.
What is the Data Mining View?
The data mining view is a view that combines many modes of searching
into one interface. From the data mining view, you can search for
items in the public sequence repository; you can search for gene
records from Entrez Gene; you can search for annotations in a given
view; and you can search for patterns of sequences.
The data mining view is on by default, and is available from
View -> Data Mining View. It is generally docked
along the bottom.
How do I load data into Genome Workbench?
There are a few ways to do this.
First, you can Import data into Genome Workbench.
Importing data allows you to load accessions from GenBank directly.
Importing also allows you to read data from files. Import is
available through File -> Import. You must decide
whether you want the imported data to be loaded into a new project,
or added to the current project.
Second, you can load data through the Data Mining View. In
the data mining view you can search several public databases in
Entrez (Entrez Protein, Entrez Nucleotide, and Entrez Gene) and
load data directly from your query.
What data formats does Genome Workbench support?
Lots... The currently supported list of file formats for import
includes:
- FASTA sequence files
- GFF2/GTF format (NOTE: GFF3 support will be added soon)
- RepeatMasker .out format
- Sequin-style 5-Column Feature Table format
- Newick-format phylogenetic trees
- Phrap/ACE assembly files
- AGP sequence assembly files
- NCBI ASN.1 objects (in ASN.1 text or binary or in XML format)
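If your private data is not yet in one of these formats, FASTA is usually the simplest to produce. The following is a minimal Python sketch, assuming plain (identifier, sequence) pairs; the function name to_fasta and the example identifier are hypothetical. The resulting text can be saved to a file and loaded through File -> Import.

```python
def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as FASTA text."""
    lines = []
    for identifier, sequence in records:
        # Each record starts with a ">" header line.
        lines.append(f">{identifier}")
        # Wrap the sequence into fixed-width lines.
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # "my_sequence_1" is a made-up identifier for illustration.
    print(to_fasta([("my_sequence_1", "ACGT" * 20)]), end="")
```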
Questions About Specific Tools
Help! I'm trying to use a tool and I can't!
The most common problems concern selections.
Genome Workbench expects that you have already identified the data
that you wish to work with when executing a tool. For example, if
you wish to BLAST a region of a genome, or to run ORF finder on a
putative cDNA, you must identify the region of the genome or the
cDNA first.
Genome Workbench makes extensive use of selections.
Selections are visible in views, either by using dark background
shading or by using a highlight box.
How do I run Blast2Sequences? I want to align some sequences that
I have loaded into Genome Workbench.
You will need to select the sequences or ranges on sequences
before you align. You may need to use the Selection
Inspector to do this; in particular, pay attention to
the drop down that indicates where your selections come from.
You may need to use the option to select from All
Views, particularly if you wish to combine selections
from the Project Tree and another view.
Once you have selected your items, choose Tools ->
Alignments -> BLAST Sequences from the main menu. You
will receive a dialog providing the options for BLAST. In
general, the defaults work well. You will need to clarify the
query and subject sequences, as well as the BLAST program to run
(such as BLASTp, BLASTn, MegaBLAST, etc.).
Can I run a BLAST search? How do I BLAST a sequence against the NCBI
public database?
You can execute a BLAST search against the public databases using
the tool at Tools -> Alignments -> BLAST Database
Search. The dialog will walk you through the available options.
I've loaded some sequences. Can I run CLUSTALw or MUSCLE to
generate an alignment?
Yes - support for CLUSTALw and MUSCLE is integrated into Genome
Workbench. You can find these in the Tools menu at Tools ->
Alignments -> CLUSTALw and Tools -> Alignments
-> MUSCLE. Genome Workbench will need to be able to
find the binary to execute each of these programs. If you have
installed the applications with CLUSTALw and MUSCLE available in
your runtime PATH, then Genome Workbench will work out of the box.
If Genome Workbench cannot locate the applications, it will ask you
for their locations.
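To check in advance whether the programs are visible on your PATH, a small script can help. The following is a minimal Python sketch; the executable names "clustalw" and "muscle" are assumptions, since the actual binary names depend on how the programs were installed, and locate_tools is a hypothetical helper.

```python
import shutil


def locate_tools(names=("clustalw", "muscle")):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A result of None for a tool means it was not found under that name, in which case Genome Workbench can be expected to ask for its location.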