What is IBIS?

IBIS is the Inferred Biological Interactions Server, which has been
developed at NCBI to organize, analyze and predict interactions between
proteins and other biomolecules. The unique feature of IBIS is that it
identifies and predicts the protein interaction partners together with the
locations of their binding sites on the query sequence/structure. IBIS
groups binding sites into four categories: protein, small chemical,
nucleic acid and peptide binding sites. For a given sequence/structure
query with unknown binding sites, IBIS first reports the physical interactions
made by this query. These so-called
"observed interactions" have been directly observed in experimentally determined
structures and are denoted by a red letter "o" at the beginning of the
row.
Second, IBIS can infer binding sites by homology, by inspecting the
protein complexes formed by close homologs of a given query. To ensure
biological relevance of inferred binding sites, IBIS clusters them based on
their sequence and structure
conservation. Only those binding sites that are evolutionarily conserved
among non-redundant homologous proteins are considered in the prediction.
Additionally, binding site clusters are verified by comparing them with the curated
binding site annotations from the Conserved Domain Database (CDD(3)), if
present, and, in the case of protein-protein binding sites, by comparing them with
the interfaces confirmed by the PISA algorithm. After binding sites are
clustered, position specific scoring matrices (PSSMs) are constructed based on
the binding site alignments and are subsequently used (together with
other measures) to rank binding sites with respect to their closeness to the
query and their biological relevance.
How do I search IBIS?
IBIS can be searched by a four-character PDB code (Ex: 1XBB).
If known, a single-letter
chain identifier can also be supplied along with the PDB code
(Ex: 1XBBA).
IBIS can also be searched using the GI (GenBank identifier) of any
protein sequence (Ex: 1668706).
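The accepted query forms can be illustrated with a small Python sketch. This is not IBIS code: the function name classify_query and the regular expressions are assumptions inferred only from the examples in this FAQ (1XBB, 1XBBA, 1668706), and the server may accept other forms. All-digit strings are treated as GIs, and the case of the chain identifier is preserved because chain identifiers are case sensitive.

```python
import re

# Assumed patterns, inferred from the FAQ examples only.
# A PDB code is taken to be a digit followed by three alphanumeric characters,
# optionally followed by a single-character chain identifier.
PDB_RE = re.compile(r"^([0-9][A-Za-z0-9]{3})([A-Za-z0-9])?$")
GI_RE = re.compile(r"^[0-9]+$")


def classify_query(text):
    """Return (kind, identifier, chain) for an IBIS-style query string."""
    query = text.strip()
    # All-digit strings are treated as GIs before trying the PDB pattern.
    if GI_RE.match(query):
        return ("gi", query, None)
    match = PDB_RE.match(query)
    if match:
        # The PDB code is upper-cased; the chain keeps its original case.
        return ("pdb", match.group(1).upper(), match.group(2))
    raise ValueError("not a PDB code, PDB code plus chain, or GI: %r" % text)
```

For example, classify_query("1xbbA") yields a PDB query for structure 1XBB, chain A, while classify_query("1668706") yields a GI query.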
How does IBIS find inferred interactions for protein
sequences without a known structure? 
IBIS can also be searched for sequences with no known structure by using their NCBI
GenBank identifier (GI). With the recent advances of the Structural
Genomics Initiative, a sequence is likely to have a homolog with a known structure.
In the current version, for a given protein sequence query, a BLAST search is
performed against the sequences of all structures in MMDB(2) to find the closest
homolog. The homolog is chosen with a strict, conservative threshold,
requiring at least 30% sequence identity and at least 80% of the query sequence
to be aligned. The interactions for the closest
homolog are then displayed in IBIS. In the future, we intend to map the inferred
binding sites directly onto the query sequence.
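The two thresholds described above (at least 30% identity and at least 80% of the query aligned) can be sketched as a filter over BLAST hits. The Hit record and the closest_homolog function are hypothetical names used only for illustration. Picking the eligible hit with the highest identity is also an assumption, since the FAQ does not say how "closest" is decided.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical record for one BLAST hit against a structure sequence."""
    subject_id: str
    percent_identity: float       # identity over the aligned region, 0-100
    aligned_query_residues: int   # number of query residues in the alignment


def closest_homolog(hits, query_length, min_identity=30.0, min_coverage=0.80):
    """Return the best hit passing both thresholds, or None if none qualifies."""
    eligible = [
        h for h in hits
        if h.percent_identity >= min_identity
        and h.aligned_query_residues / query_length >= min_coverage
    ]
    if not eligible:
        return None
    # Assumption: "closest" means highest percent identity among eligible hits.
    return max(eligible, key=lambda h: h.percent_identity)
```

A hit with 60% identity that covers only half of the query is rejected, while a hit with 45% identity covering 90% of the query is accepted.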
Why are there no results reported for my query?
If the query is a PDB identifier and does not have any homologs with more
than 30% sequence identity over the structurally superimposed region, IBIS will
not report any interactions.
If the query is a protein GI (NCBI GenBank identifier) and does not have any
homologs in the structure database that satisfy the criteria described above,
IBIS will not report any interactions.
Only homologs that display native biomolecular interactions are used for
inferring binding sites/interactions.
How are different types of interactions defined and how
do I navigate between them?
Four different types of interactions/binding sites are currently
presented in IBIS: protein-protein, protein-chemical, protein-nucleic acid and
protein-peptide. The tabs Protein, Chemical, DNA/RNA and
Peptide on the top left corner display the interactions/binding sites for
the respective type. The user may navigate between different types of interactions by clicking on
these tabs.
Protein-chemical, protein-nucleic acid and protein-peptide interactions
are based on the full chain sequence of the query.
Protein-protein interactions are annotated and inferred for each domain of the
query. Domains on the query are mapped by using the CDD(3) database and CD-Search(4). If
no domains map to the query, the full chain sequence
of the query is used instead. Different query domains can be selected by clicking on the
corresponding domain bubble below the query in the graphic
at the top of the page.
How is the IBIS web page organized? 
Each web page pertains to a single query sequence/structure or protein domain.
There are three main parts of the web page display.
A summary graphic is at the top of the page. The query sequence is represented
by a "ruler". Bubbles below the sequence show the footprints of any conserved
domains detected for the query. Underneath is a list of binding site clusters
(see below), with locations of binding site residues indicated by triangles.
Each cluster is labeled by the name of the interacting partner. Clicking on the
partner name expands the corresponding row in the detailed table.
Below there is a table listing the details of each binding site cluster (see
below) for the selected type of interaction (Protein, Chemical, DNA/RNA and
Peptide). The table contains a list of binding site clusters. Click on the [+]
button at the left side of the cluster to see the cluster features. To close the
cluster click on the [-] button.
On the left bar there are Search tools for exploring the interaction data of a
given query protein.
What is a "binding site cluster"? 
A cluster consists of a collection of structures that are
related to the query structure. All members of the cluster contain similar
binding sites. Homology is inferred by comparing the query with similar
structures (currently a 30% identity threshold is used) determined using the VAST
algorithm for structure-structure superposition. IBIS displays only those
clusters that contain evolutionarily-conserved binding sites among non-redundant
homologous proteins. Similarity between binding sites is measured in terms of
sequence similarity, and positions that overlap structurally are assigned
a higher weight. A "singleton" cluster has only one non-redundant member (after
members with more than 90% identity are purged).
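The 90% redundancy rule and the singleton definition can be sketched as follows. The greedy, order-dependent purge shown here is an assumption made for illustration; the FAQ states only the 90% threshold, not the procedure. The identity argument is a caller-supplied function returning the percent identity between two members.

```python
def purge_redundant(members, identity, threshold=90.0):
    """Keep a member only if it is not more than `threshold` percent
    identical to any member already kept (greedy, in input order)."""
    kept = []
    for member in members:
        if all(identity(member, other) <= threshold for other in kept):
            kept.append(member)
    return kept


def is_singleton(members, identity, threshold=90.0):
    """A cluster is a singleton if one non-redundant member remains."""
    return len(purge_redundant(members, identity, threshold)) == 1
```

With three members where the first two are 95% identical and the third is about 40% identical to both, the purge keeps the first and third members, so the cluster is not a singleton.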
For each cluster, binding site residues are illustrated in
the summary graphic and further details are provided in the corresponding row of
the table. Clusters that contain an interaction observed in the query structure
(or, for a sequence query, the closest structure homolog) are marked by the letter
"o" at the beginning of the row. By expanding the cluster row, one can see
additional information about its members.
What do the columns in the interaction summary tables
mean? 
Each row in the table corresponds to a binding site cluster. Each column in
the table is defined as follows.
"Interaction partner" - name of a representative interaction partner that
interacts either with the actual query ("observed" interactions) or with the
homologs of the query from a given binding site cluster. For protein-protein
interactions, the CDD(3) domain name of the partner is used. When
there are no domain assignments on the interacting chain, "No domain assigned"
is displayed. For protein-chemical interactions, this column reports the name of
the small molecule bound to the representative member of the cluster. For
protein-nucleic acid and protein-peptide interactions, the column reports the
first 20 nucleotides/residues from the interaction partner of the representative
cluster member.
"Ranking score" - the score that ranks the binding site
clusters in terms of their biological relevance and similarity to the query. The
components of the ranking score include the sequence-PSSM score; the average
sequence identity between the query and cluster members, calculated over the
whole structure-structure alignment; the number of interfacial contacts; and the
fraction of conserved columns (conservation calculated as entropy) in the binding site
alignment. All components of the ranking score are normalized and a Z-score is calculated;
all clusters are then ranked with respect to the Z-score. The ranking score is
not defined for "singleton" clusters. Clusters with CDD annotations are
displayed in the table before clusters without annotations.
"Number of cluster members" - the number of homologs in the cluster.
"Average percent identity to query" - the average sequence identity between
the query and the cluster members calculated over the whole structure-structure
alignment.
"Number of binding site residues" - the number of residues in the union of binding sites mapped from
all members of the cluster to the query.
"Number of chemicals" (for protein-chemical interactions) - the
number of different chemicals present in a given binding site cluster.
"Curator annotation" - binding site annotation(s) from CDD(3) that overlap
by more than 50% with the sites annotated by IBIS. Binding sites in proteins that
have been manually annotated in CDD(3) families are regarded as more reliable. These
binding sites are elevated to the top of the table irrespective of their ranking score.
"Taxonomic diversity" - the last common ancestor of the proteins from a given
cluster listed with a link to NCBI Taxonomy Browser to explore all taxonomic
groups represented in the cluster.
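The ordering of rows described for the "Ranking score" and "Curator annotation" columns can be sketched as below. This is only an illustration under stated assumptions: the FAQ does not say how the normalized components are combined, so this sketch converts each component to a Z-score across clusters and sums them with equal weight. Placing singleton clusters (which have no ranking score) last within their group is also an assumption. The dictionary keys are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def rank_clusters(clusters):
    """Order clusters for display and return (name, score) pairs.

    Each cluster is a dict with the hypothetical keys 'name',
    'components' (e.g. PSSM score, average identity, number of contacts,
    fraction of conserved columns), 'members' (non-redundant count)
    and 'cdd_annotated'.
    """
    # Singleton clusters receive no ranking score.
    scored = [c for c in clusters if c["members"] > 1]
    combined = {c["name"]: 0.0 for c in scored}
    if scored:
        for i in range(len(scored[0]["components"])):
            column = [c["components"][i] for c in scored]
            for cluster, z in zip(scored, z_scores(column)):
                # Assumption: equal-weight sum of per-component Z-scores.
                combined[cluster["name"]] += z

    def sort_key(cluster):
        score = combined.get(cluster["name"])
        # CDD-annotated first, then scored clusters by descending score.
        return (not cluster["cdd_annotated"], score is None, -(score or 0.0))

    ordered = sorted(clusters, key=sort_key)
    return [(c["name"], combined.get(c["name"])) for c in ordered]
```

In this sketch an annotated cluster appears first even if it is a singleton with no score, mirroring the statement that annotated binding sites are elevated irrespective of their ranking score.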
What is the meaning of the data that appears when I
expand a cluster? 
Each binding site cluster can be expanded (by clicking the [+] button), making it
possible to see the properties of all members in the cluster and their
interaction partners. Up to ten non-redundant members are displayed (non-redundancy is
defined with respect to 90% sequence identity). The complete list of members is
displayed by clicking the "See all members" link at the bottom of the list of
cluster members. Details of the columns in the inner table are as follows.
"Interaction partner" (for protein-chemical, protein-nucleic
acid and protein-peptide interactions) - the name of the interaction partner
for a given cluster member. The naming convention is the same as for the master
table. These names are hyperlinked to PubChem(7), CDD(3), and MMDB(2) according to the
type of interaction.
"Structure of complex" (for protein-protein interactions) - PDB code of the homologous cluster member.
"Interacting chain(s)" - for protein-protein interactions, the notation "A::B"
means that the query protein aligns with chain A of the structure homolog, and
that chains A and B of that structure complex interact with one another. If this
is an observed interaction, both chains are from the query structure complex.
For other types of interactions, only one chain is listed, showing which chain
in the structure homolog aligns with the query protein.
"% identity to query" - the average sequence identity between the query and
the cluster members calculated over the whole structure-structure alignment.
"Curator annotation" - same as the master table. Cluster members with
annotations are elevated to the top of the inner table irrespective of their
ranking score.
"Binding site alignment" - Multiple sequence alignment of the binding site
residues of the homologous cluster members also aligned to the query. Residue
conservation is calculated in terms of the relative entropy and color is used to
depict the degree of conservation: red indicates highly conserved residues, blue
indicates moderately conserved residues, and non-conserved residues are shown in black.
"View Binding Sites" - clicking this button launches the NCBI
graphical viewer Cn3D(5) to view binding site residues. See "How do I view
the binding sites in the protein structures?" for more info.
"PISA" (for protein-protein interactions) - shows whether a given
protein-protein interaction has been validated by PISA. "N/A" refers to those
cases where PISA does not provide any information. By default, only interactions
validated with PISA are displayed. Uncheck the "PISA validation" checkbox on
the left bar to show all interactions.
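The relative-entropy coloring mentioned for the "Binding site alignment" column can be illustrated with a short sketch. Relative entropy of an alignment column is computed here as the sum over residue types of p * log2(p / q), where p is the observed frequency in the column and q is a background frequency. The uniform background over the 20 amino acids and the color cutoffs (3.0 and 1.5 bits) are assumptions chosen only for illustration; the FAQ does not give the values IBIS uses.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def relative_entropy(column, background=None):
    """Relative entropy (in bits) of one alignment column.

    Gaps and unknown characters are ignored. With no background given,
    a uniform distribution over the 20 amino acids is assumed.
    """
    residues = [r for r in column.upper() if r in AMINO_ACIDS]
    if not residues:
        return 0.0
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    total = len(residues)
    score = 0.0
    for aa, count in Counter(residues).items():
        p = count / total
        score += p * math.log2(p / background[aa])
    return score


def conservation_color(score, high=3.0, medium=1.5):
    """Map a score to a display color; the cutoffs are hypothetical."""
    if score >= high:
        return "red"
    if score >= medium:
        return "blue"
    return "black"
```

Under the uniform background, a fully conserved column scores log2(20) bits, the maximum possible, and a column containing each amino acid exactly once scores zero.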
How can I access the IBIS web page for another query structure? 
Enter the PDB code or PDB code along with chain identifier (case sensitive) or
a GI in the Search PDB ID/GI box on the upper right of the web page. Ex:
1xbb, 1xbbA, 1668706.
What is a singleton cluster? 
"Singleton" means a cluster which has only one
non-redundant member (after members with more than 90% identity are purged).
How do you "expand a cluster"? 
The table contains a list of clusters. Click on the [+]
button at the left for the cluster you are interested in to see the
cluster features. The [+] button becomes a [-] button. When you are
done browsing, to close the cluster click on the [-] button.
How do I view the binding sites in the protein
structures? 
Structures can be viewed with the Cn3D software (http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml).
The first step is to expand a binding site cluster and then select the
specific structures or all structures in the table by using the checkboxes.
Then clicking on the button [Show Binding Sites] will show the query
structure superimposed with the selected structures and the binding sites
depicted with side chains. The sequence viewer window of Cn3D also
highlights the binding site residues based on sequence conservation in the
aligned column.
How do you use the various "Advanced Search" features? 
The search features enable you to filter the interactions based on
various criteria. For example, one might be interested in the
chemical binding sites and want to know if there is a structure
with a particular chemical bound to it. Importantly, the search results
apply only to the IBIS data corresponding to the current query
structure.
Keyword Search: Type the keyword into the box to the right of
"Enter Keyword:" then click the [GO] button at the bottom of the search box.
Advanced Search: There are three flavors of advanced search; click on the [>]
box to see the options.
PISA validation 
PISA(6) is a new method for automatic detection of macromolecular
assemblies within the Protein Data Bank entries that are the results of X-ray
diffraction experiments. It is used to validate oligomeric states and
interfaces of interactions between different protein chains. Only
clusters/binding sites with at least one PISA-validated interaction are considered biologically relevant and
shown by default. Additional sites can be viewed by unchecking the 'PISA Validation' checkbox on the left side bar.
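The default display rule can be sketched as a simple filter. The cluster representation and the status labels ("validated", "not validated", "N/A") are hypothetical and used only to illustrate the rule that a cluster is shown by default when at least one of its interactions is PISA validated.

```python
def visible_clusters(clusters, pisa_validation=True):
    """Return the clusters shown for the current checkbox state.

    Each cluster is a dict with a hypothetical 'pisa' key holding one
    status string per member interaction.
    """
    if not pisa_validation:
        # Checkbox unchecked: show every cluster.
        return list(clusters)
    return [
        c for c in clusters
        if any(status == "validated" for status in c["pisa"])
    ]
```

With the checkbox checked, a cluster whose members are all "N/A" or "not validated" is hidden; unchecking it returns the full list.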
What are non-biological chemical sites? 
Non-biological chemical sites are binding sites formed by non-biological
molecules (such as buffers, salts, detergents, solvents and ions added
during purification and/or crystallization). These sites are hidden
by default, but can be viewed by selecting the option "Non-Biological
Chemical Sites: Show" on the
left side bar.
References:
1) Shoemaker BA, Zhang D, Thangudu RR, Tyagi M, Fong JH, Marchler-Bauer A,
Bryant SH, Madej T, Panchenko AR: Inferred Biomolecular Interaction Server - a
web server to analyze and predict protein interacting partners and binding sites. Nucleic acids research 2009, 37(1):1-7.
2) Chen J, Anderson JB, DeWeese-Scott C, Fedorova ND, Geer LY, He S, Hurwitz
DI, Jackson JD, Jacobs AR, Lanczycki CJ et al: MMDB: Entrez's 3D-structure
database. Nucleic acids research 2003, 31(1):474-477.
3) Marchler-Bauer A, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M et al: CDD: specific functional annotation with the Conserved Domain Database. Nucleic acids research 2009, 37(Database issue):D205-210.
4) Marchler-Bauer A, Bryant SH: CD-Search: protein domain annotations on the
fly. Nucleic acids research 2004, 32(Web Server issue):W327-331.
5) Wang Y, Geer LY, Chappey C, Kans JA, Bryant SH: Cn3D: sequence and
structure views for Entrez. Trends in biochemical sciences 2000, 25(6):300-302.
6) Krissinel E, Henrick K: Inference of macromolecular assemblies from
crystalline state. Journal of molecular biology 2007, 372:774-797.
PISA web server: http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html
7) Wang Y, Xiao J, Suzek TO, Zhang J, Wang J, Bryant SH: PubChem: a public
information system for analyzing bioactivities of small molecules. Nucleic acids
research 2009, 37(Web Server issue):W623-633.