Chapter 2. Handling diverse protein interaction data: integration, storage and retrieval

Anna Panchenko and Benjamin Shoemaker, NCBI, NIH

 

In this chapter we review current approaches to storing, retrieving and integrating diverse protein interaction data. Data integration methods are widely used to combine the heterogeneous results of computational predictions and protein interaction experiments; among them, statistical meta-analysis and supervised machine learning are becoming increasingly popular. While integration methods reduce complexity and support efficient presentation and analysis of interaction data, the databases themselves provide efficient storage and retrieval. A large variety of interaction databases exists, differing in scope, type and coverage of data as well as in query capabilities. We group protein interaction databases into four categories: comprehensive, specialized, structural and network analysis. This gives a rough grouping of resources based on how they might be used. In particular, one might often start with a search of a comprehensive database and use the results to refine a search in a database with a more specific focus.