Improving interoperability between microbial information and sequence databases

BMC Bioinformatics. 2005 Dec 1;6 Suppl 4(Suppl 4):S23. doi: 10.1186/1471-2105-6-S4-S23.

Abstract

Background: Biological resources are essential tools for biomedical research. Their availability is promoted through on-line catalogues. Common Access to Biological Resources and Information (CABRI) is a service for distribution of biological resources and related data collected by 28 European culture collections. Linking this information to bioinformatics databanks can make the collections' holdings more visible from searches in molecular biology databanks, and vice versa. Identifying links to sequence databases is useful, but annotation and indexing problems, together with compilation errors, arise immediately. In this paper, we present our efforts to identify cross-references between the CABRI catalogues and the EMBL Data Library, together with the results obtained.

Results: An SRS site with both EMBL and CABRI catalogues has been set up. Ad hoc changes to the indexing scripts made it possible to obtain homogeneous index keys, and SRS link features were used to identify links between the databases. After manual checking and comparison with an alternative procedure, about 67,500 valid cross-references were identified, added to the EMBL Data Library, and are now distributed with it. HTML links can be established from EMBL to the CABRI network service. The procedures can be rerun whenever needed.
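
The idea of homogeneous index keys can be illustrated with a minimal sketch. It is written in plain Python and is not the SRS indexing scripts used in this work; the function names, the normalization rule (uppercase, letters and digits only), and all record identifiers are assumptions invented for illustration.

```python
import re
from collections import defaultdict


def normalize_key(designation):
    """Reduce a strain designation to a homogeneous key.

    Assumed rule: uppercase the text and keep only letters and digits,
    so that 'abc-1234', 'ABC 1234' and 'ABC1234' share one key.
    """
    return re.sub(r"[^A-Z0-9]", "", designation.upper())


def find_cross_references(catalogue, sequences):
    """Return sorted (sequence_id, catalogue_id) pairs with matching keys."""
    # Index the catalogue records by their normalized strain designation.
    index = defaultdict(list)
    for cat_id, designation in catalogue.items():
        index[normalize_key(designation)].append(cat_id)

    # Look up each sequence record's strain annotation in that index.
    links = []
    for seq_id, strain in sequences.items():
        for cat_id in index.get(normalize_key(strain), []):
            links.append((seq_id, cat_id))
    return sorted(links)


# Hypothetical records, invented for illustration only.
catalogue = {"CAT:1": "ABC 1234", "CAT:2": "XYZ 77"}
sequences = {"SEQ0001": "abc-1234", "SEQ0002": "QRS 5", "SEQ0003": "XYZ77"}

links = find_cross_references(catalogue, sequences)
print(links)
```

Purely textual matching of this kind can pair unrelated records that happen to share a key, which is consistent with the manual checking step reported above.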

Conclusion: Links between EMBL and CABRI catalogues provide improved access to micro-organisms of certified quality and can have positive effects on biomedical research. Further links between CABRI catalogues and other bioinformatics databases can now easily be defined by using these cross-references. Linking genetic information to natural resources information may serve as a model for the integration of other databases containing empirical data on these materials.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Computational Biology / methods*
  • Database Management Systems
  • Databases as Topic
  • Databases, Factual
  • Databases, Genetic
  • Information Storage and Retrieval / methods*
  • Information Systems
  • Internet
  • Molecular Biology / methods
  • Programming Languages
  • Sequence Alignment
  • Sequence Analysis
  • Software
  • Systems Integration