What is IBIS?

IBIS is the Inferred Biomolecular Interaction Server, which has been
developed at NCBI to organize, analyze and predict interactions between
proteins and other biomolecules. The unique feature of IBIS is that it
identifies and predicts the protein interaction partners together with the
locations of their binding sites on the query sequence/structure. IBIS
categorizes binding sites into five categories: protein, small chemical,
nucleic acid, peptide and ion binding sites. For a given sequence/structure
query with unknown binding sites, IBIS first reports the physical interactions
made by this query. These so-called "observed interactions" have been directly
observed in experimentally determined structures and are denoted by a red
letter "o" at the beginning of the row.
Second, IBIS can infer binding sites by homology, by inspecting the
protein complexes formed by close homologs of a given query. To ensure
biological relevance of inferred binding sites, IBIS clusters them based on
their sequence and structure
conservation. Only those binding sites that are evolutionarily conserved
among non-redundant homologous proteins are considered in the prediction.
Additionally, binding site clusters are verified by comparing them with the curated
binding site annotations from the Conserved Domain Database (CDD(4)), if
present, and, in the case of protein-protein binding sites, by comparing them with
the interfaces confirmed by the PISA algorithm. After binding sites are
clustered, position-specific scoring matrices (PSSMs) are constructed based on
the binding site alignments and are subsequently used (together with
other measures) to rank binding sites with respect to their closeness to the
query and their biological relevance.
How do I search IBIS?
IBIS can be searched by a four-character PDB code (Ex: 1XBB). If known, a
single-letter chain identifier can also be supplied along with the PDB code
(Ex: 1XBBA).
IBIS can also be searched using the GI (GenBank identifier) of any
protein sequence (Ex: 1668706).
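As an illustration of the three accepted query forms, the following is a minimal sketch of how a query string might be classified before it is submitted. The function name and the patterns are assumptions made for this example and do not describe how IBIS itself parses queries. In particular, any all-digit string is treated here as a GI.

```python
import re

# Hypothetical patterns for this sketch only; not taken from IBIS.
# A PDB code is assumed to be one digit followed by three letters/digits,
# optionally followed by a single chain character.
_PDB_RE = re.compile(r"([0-9][A-Za-z0-9]{3})([A-Za-z0-9])?")
_GI_RE = re.compile(r"[0-9]+")


def classify_query(text):
    """Return (kind, identifier, chain) for a query string.

    kind is "gi" or "pdb"; chain is None when no chain was supplied.
    """
    query = text.strip()
    # Assumption: all-digit strings are GIs, even if they are 4 or 5 long.
    if _GI_RE.fullmatch(query):
        return ("gi", query, None)
    match = _PDB_RE.fullmatch(query)
    if match:
        code, chain = match.groups()
        # The examples in the text use both 1XBB and 1xbb, so the code is
        # upper-cased here; the chain identifier is kept exactly as typed.
        return ("pdb", code.upper(), chain)
    raise ValueError(f"not a PDB code, PDB code with chain, or GI: {text!r}")
```

For example, classify_query("1xbbA") yields ("pdb", "1XBB", "A") under these assumptions.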
How does IBIS find inferred interactions for protein sequences without a known structure?
IBIS can also be searched for sequences with no known structure using their NCBI GenBank identifier (GI). With the recent advances of the Structural
Genomics Initiative, a sequence is likely to have a homolog with a known structure.
In the current version, for a given protein sequence query, a BLAST search is
performed against the sequences of all structures in MMDB(3) to find the closest
homolog. The homolog is chosen with a strict, conservative threshold, requiring
at least 25% sequence identity and at least 80% of the query sequence
to be aligned. The interactions for the closest
homolog are then displayed in IBIS. In the future, we intend to map the inferred
binding sites directly onto the query sequence.
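The homolog selection described above can be sketched as follows. This is a minimal illustration, not the IBIS implementation: the BlastHit record and its field names are hypothetical, and "closest" is assumed to mean the passing hit with the highest sequence identity (ties broken by the number of aligned query residues).

```python
from dataclasses import dataclass

MIN_IDENTITY_PCT = 25.0    # from the text: at least 25% sequence identity
MIN_QUERY_COVERAGE = 0.80  # from the text: at least 80% of the query aligned


@dataclass
class BlastHit:
    # Hypothetical record for this sketch; not a BLAST output format.
    structure_id: str
    identity_pct: float
    aligned_query_residues: int


def closest_homolog(hits, query_length):
    """Return the best hit that passes both thresholds, or None."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    passing = [
        h for h in hits
        if h.identity_pct >= MIN_IDENTITY_PCT
        and h.aligned_query_residues / query_length >= MIN_QUERY_COVERAGE
    ]
    # Assumption: highest identity wins, then the longest aligned region.
    return max(
        passing,
        key=lambda h: (h.identity_pct, h.aligned_query_residues),
        default=None,
    )
```

A hit with 90% identity that covers only half of the query is rejected by this filter, while a hit with 40% identity covering 95% of the query is accepted.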
Why are there no results reported for my query?
If the query is a PDB identifier and does not have any homologs with more
than 25% sequence identity over the structurally superimposed region, IBIS will
not report any interactions.
If the query is a protein GI (NCBI GenBank identifier) and does not have any homologs in the structure database that satisfy the criteria described above, IBIS will not report any interactions.
How are different types of interactions defined and how do I navigate between them?
Five different types of interactions/binding sites are currently presented in IBIS: protein-protein, protein-chemical, protein-nucleic acid, protein-peptide and protein-ion. The tabs "Protein", "Chemical", "DNA/RNA", "Peptide" and "Ion" in the top left corner show the interactions/binding sites for the respective type. The user may navigate between different types of interactions by clicking on these tabs.
Protein-chemical, protein-nucleic acid, protein-peptide and protein-ion interactions are based on the full chain sequence of the query.
Protein-protein interactions are annotated and inferred for each domain of the
query. Domains on the query are mapped using the CDD(4) database and the CD-Search method(5). If no domains are mapped to the query, the full chain sequence of the query is used. Different query domains can be selected by clicking on the
corresponding domain bubble below the query in the graphic
at the top of the page.
How is the IBIS web page organized? 
Each web page pertains to a single query sequence/structure or protein domain.
There are three main parts of the web page display.
A summary graphic is at the top of the page. The query sequence is represented
by a "ruler". Bubbles below the sequence show the footprints of any conserved
domains detected for the query. Underneath is a list of binding site clusters
(see below), with locations of binding site residues indicated by triangles.
Each cluster is labeled by the name of the interacting partner. Clicking on the
partner name expands the corresponding row in the detailed table.
Below there is a table listing the details of each binding site cluster for the selected type of interaction (Protein, Chemical, DNA/RNA, Peptide, and Ion). Click on the [+] button at the left side of the cluster to see the cluster features. To close the cluster click on the [-] button.
On the left bar there are Search tools for exploring the interaction data of a
given query protein.
What is a "binding site cluster"? 
A cluster consists of a collection of structures that are
related to the query. All members of the cluster should contain similar
binding sites. Homology is inferred by comparing the query with similar
structures determined using the VAST algorithm for structure-structure superposition.
Currently, a 25% identity threshold is used for all interactions except homooligomeric interactions (protein-protein interactions between two domains belonging to the same CDD family). Since homooligomeric states and interfaces are not very well conserved, a more stringent threshold of 50% identity between the query and all cluster members is used for inferring homooligomeric states. IBIS displays only those
clusters that contain evolutionarily conserved binding sites among non-redundant
homologous proteins. Similarity between binding sites is measured in terms of
sequence similarity, and those positions which overlap structurally are assigned
a higher weight. A "singleton" cluster has only one non-redundant member (after
members with more than 90% identity are purged); singleton clusters are displayed at the bottom of the list.
For each cluster, binding site residues are illustrated in
the summary graphic and further details are provided in the corresponding row of
the table. Clusters that contain an interaction observed in the query structure
(for a sequence query, the closest structure homolog) are marked by the letter
"o" at the beginning of the row. By expanding the cluster row, one can see
additional information about its members.
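The identity thresholds and the singleton rule described above can be sketched as follows. The text does not say how redundant members are purged, so the greedy procedure below is an assumption made for illustration. The function names and the pct_identity callback are hypothetical.

```python
REDUNDANCY_CUTOFF_PCT = 90.0  # from the text: members above 90% identity are purged


def identity_threshold(is_homooligomer):
    """Minimum percent identity to the query, as stated in the text."""
    return 50.0 if is_homooligomer else 25.0


def non_redundant(members, pct_identity):
    """Greedy purge (an assumption, not the documented IBIS procedure).

    A member is kept only if it is not more than 90% identical to any
    member that has already been kept. pct_identity(a, b) is a caller
    supplied function returning percent identity between two members.
    """
    kept = []
    for member in members:
        if all(pct_identity(member, k) <= REDUNDANCY_CUTOFF_PCT for k in kept):
            kept.append(member)
    return kept


def is_singleton(members, pct_identity):
    """True when exactly one non-redundant member remains after purging."""
    return len(non_redundant(members, pct_identity)) == 1
```

With a greedy purge the result can depend on the order of the members, which is one reason this should be read as a sketch rather than as the method used by the server.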
What do the columns in the interaction summary tables mean?
Each row in the table corresponds to a binding site cluster. Each column in
the table is defined as follows.
"Interaction partner" - the name of a representative interaction partner that
interacts either with the actual query ("observed" interactions) or with the
homologs of the query from a given binding site cluster. For protein-protein
interactions, the CDD(4) domain name of the partner is used. When
there are no domain assignments on the interacting chain, "No domain assigned"
is displayed. For protein-chemical interactions, this column reports the name of
the small molecule bound to the representative member of the cluster. For
protein-nucleic acid and protein-peptide interactions, the column reports the
first 20 nucleotides/residues from the interaction partner of the representative
cluster member.
"Ranking score" - the score that ranks the binding site
clusters in terms of their biological relevance and similarity to the query. The
components of the ranking score include the sequence-PSSM score; the average
sequence identity between the query and cluster members, calculated over the
whole structure-structure alignment; the number of interfacial contacts; and the
fraction of conserved (calculated as entropy) columns in the binding site
alignment. All components of the ranking score are normalized, and
all clusters are then ranked with respect to the Z-score. The ranking score is
not defined for "singleton" clusters. Clusters with CDD annotations are
displayed in the table before clusters without annotations.
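One possible reading of the "Ranking score" entry is sketched below. The text does not state how the normalized components are combined, so this example assumes that each component is converted to a Z-score across clusters and that the Z-scores are summed with equal weights. The dictionary keys are hypothetical names for the four components. Singleton clusters receive no score and are listed last, and clusters with CDD annotations are listed first, as described in the text.

```python
from statistics import mean, pstdev

# Hypothetical labels for the four components named in the text.
COMPONENTS = ("pssm_score", "avg_identity", "n_contacts", "conserved_fraction")


def z_scores(values):
    """Standardize a list of numbers; a constant list maps to all zeros."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rank_clusters(clusters):
    """Return [(name, score_or_None), ...] in display order.

    Each cluster is a dict with "name", "n_members" (non-redundant count),
    "cdd_annotated" (bool) and one number per entry in COMPONENTS.
    """
    scored = [c for c in clusters if c["n_members"] > 1]
    singletons = [c for c in clusters if c["n_members"] <= 1]
    totals = [0.0] * len(scored)
    if scored:
        for name in COMPONENTS:
            column = [c[name] for c in scored]
            for i, z in enumerate(z_scores(column)):
                totals[i] += z  # assumption: equal-weight sum of Z-scores
    ranked = sorted(
        zip(scored, totals),
        key=lambda pair: (not pair[0]["cdd_annotated"], -pair[1]),
    )
    return [(c["name"], total) for c, total in ranked] + [
        (c["name"], None) for c in singletons
    ]
```

In this sketch a CDD-annotated cluster is placed ahead of an unannotated one even when its summed Z-score is lower.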
"Number of cluster members" - the number of homologs in the cluster.
"Average percent identity to query" - the average sequence identity between the query and the cluster members calculated over the whole structure-structure alignment.
"Number of binding site residues" - the union of binding sites mapped from all members of the cluster to the query.
"Number of chemicals" (for protein-chemical interactions) the number of different chemicals present in a given binding site cluster.
"Curator annotation" binding site annotation(s) from the CDD(2) which overlaps more than 50% with the sites annotated by IBIS. Binding sites in proteins that have been manually annotated in CDD(2) families are regarded as more reliable. These binding sites are elevated to the top of the table irrespective of their ranking score.
"Taxonomic diversity" - the last common ancestor of the proteins from a given cluster listed with a link to NCBI Taxonomy Browser to explore all taxonomic groups represented in the cluster.
What is the meaning of the data that appears when I expand a cluster?
Each binding site cluster can be expanded (by clicking the [+] button), making it possible to see the properties of all members in the cluster and their interaction partners. Up to ten non-redundant members are displayed (non-redundancy is defined with respect to 90% sequence identity). The complete list of members is displayed by clicking the "See all members" link at the bottom of the list of cluster members. Details of the columns in the inner table are as follows.
"Interaction partner" (for protein-chemical, protein-nucleic acid, protein-peptide and protein-ion interactions) - the name of the interaction partner for a given cluster member. The naming convention is the same as for the summary table. These names are hyperlinked to PubChem(8), CDD(4), and MMDB(3) according to the type of interaction.
"Structure of complex" (for protein-protein interactions) - PDB code of the homologous cluster member.
"Interacting chain(s)" - for protein-protein interactions, the notation "A::B" means that the query protein aligns with chain A of the structure homolog, and that chains A and B of that structure complex interact with one another. If this is an observed interaction, both chains are from the query structure complex. For other types of interactions, only one chain is listed, showing which chain in the structure homolog aligns with the query protein.
"% identity to query" - the average sequence identity between the query and the cluster members calculated over the whole structure-structure alignment.
"Curator annotation" - the same as in the summary table. Cluster members with annotations are elevated to the top of the inner table irrespective of their ranking score.
"Binding site alignment" - multiple sequence alignment of the binding site residues of the homologous cluster members, also aligned to the query. Residue conservation is calculated in terms of the relative entropy, and color is used to depict the degree of conservation: red indicates highly conserved residues, blue indicates medium conservation, and non-conserved residues are shown in black. All singletons are shown in grey.
"View Binding Sites" - clicking this button launches the NCBI graphical viewer Cn3D(6) to view binding site residues. See "How do I view the binding sites in the protein structures?" for more information.
"PISA" (for protein-protein interactions) - shows whether a given protein-protein interaction has been validated by PISA. "N/A" refers to those cases where PISA does not provide any information. By default, a cluster is displayed only if it contains at least one PISA-validated interaction or if all of its interactions have "N/A" PISA status. Clear the "PISA validation" checkbox on the left bar to show all interactions.
How can I access the IBIS web page for another query structure? 
Enter the PDB code or PDB code along with chain identifier (case sensitive) or a GI in the Search PDB ID/GI box on the upper right of the web page. Ex: 1xbb, 1xbbA, 1668706.
What is a singleton cluster? 
"Singleton" means a cluster which has only one non-redundant member (after members with more than 90% identity are purged).
How do you "expand a cluster"? 
The table contains a list of clusters. Click on the [+] button at the left for the cluster you are interested in to see the cluster features. The [+] button becomes a [-] button. When you are
done browsing, to close the cluster click on the [-] button.
How do I view the binding sites in the protein structures?
Structures can be viewed with the Cn3D software (http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml). The first step is to expand a binding site cluster and then select the specific structures or all structures in the table by using the checkboxes. Then clicking on the button [Show Binding Sites] will show the query structure superimposed with the selected structures and the binding sites depicted with side chains. The sequence viewer window of Cn3D also highlights the binding site residues based on sequence conservation in the aligned column.
How do you use the various "Advanced Search" features? 
The search features enable you to filter the interactions based on various criteria. For example, one might be interested in the chemical binding sites and want to know if there is a structure with a particular chemical bound to it. Importantly, the search results apply only to the IBIS data corresponding to the current query structure.
Keyword Search: type the keyword into the box to the right of "Enter Keyword:", then click the [GO] button at the bottom of the search box.
Advanced Search: there are three flavors of advanced search; click on the [>] box to see the options.
PISA validation 
PISA(7) is a new method for automatic detection of macromolecular assemblies within the Protein Data Bank entries that are the results of X-ray diffraction experiments. It is used to validate oligomeric states and interfaces of interactions between different protein chains. Only clusters/binding sites with at least one PISA-validated interaction (or with N/A PISA status for all interactions) are considered biologically relevant and shown by default. Additional sites can be viewed by clearing the "PISA validation" checkbox on the left side bar.
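The default display rule above can be written as a small predicate. This is a sketch only: the status labels "validated", "not validated" and "N/A" are hypothetical strings chosen for the example, and an empty cluster is assumed to be hidden.

```python
def shown_by_default(pisa_statuses):
    """Decide whether a cluster is displayed while PISA validation is on.

    pisa_statuses holds one label per interaction in the cluster. The
    labels used here are assumptions for this sketch.
    """
    statuses = list(pisa_statuses)
    if not statuses:
        return False  # assumption: a cluster with no interactions is hidden
    has_validated = any(s == "validated" for s in statuses)
    all_unknown = all(s == "N/A" for s in statuses)
    return has_validated or all_unknown
```

A cluster in which some interactions are "N/A" and the rest are "not validated" is hidden under this rule, because it contains no validated interaction and is not entirely "N/A".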
What are non-biological chemical sites? 
Non-biological chemical sites are binding sites formed by non-biological
molecules (such as buffers, salts, detergents, solvents and ions added
for the purification and/or crystallization process). These sites are hidden
by default, but can be viewed by selecting the option "Non-Biological
Chemical Sites: Show" on the left side bar.
References:
1) Shoemaker BA, Zhang D, Thangudu RR, Tyagi M, Fong JH, Marchler-Bauer A, Bryant SH, Madej T, Panchenko AR: Inferred Biomolecular Interaction Server - a web server to analyze and predict protein interacting partners and binding sites. Nucleic Acids Res 2010, 38(Database issue):D518-524.
2) Thangudu RR, Tyagi M, Shoemaker BA, Bryant SH, Panchenko AR, Madej T: Knowledge-based annotation of small molecule binding sites in proteins. BMC Bioinformatics 2010, 11:365.
3) Chen J, Anderson JB, DeWeese-Scott C, Fedorova ND, Geer LY, He S, Hurwitz DI, Jackson JD, Jacobs AR, Lanczycki CJ et al: MMDB: Entrez's 3D-structure database. Nucleic Acids Res 2003, 31(1):474-477.
4) Marchler-Bauer A, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M et al: CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res 2009, 37(Database issue):D205-210.
5) Marchler-Bauer A, Bryant SH: CD-Search: protein domain annotations on the fly. Nucleic Acids Res 2004, 32(Web Server issue):W327-331.
6) Wang Y, Geer LY, Chappey C, Kans JA, Bryant SH: Cn3D: sequence and structure views for Entrez. Trends Biochem Sci 2000, 25(6):300-302.
7) Krissinel E, Henrick K: Inference of macromolecular assemblies from crystalline state. J Mol Biol 2007, 372:774-797.
8) Wang Y, Xiao J, Suzek TO, Zhang J, Wang J, Bryant SH: PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic Acids Res 2009, 37(Web Server issue):W623-633.