Mapping data elements to terminological resources for integrating biomedical data sources

BMC Bioinformatics. 2006 Nov 24;7 Suppl 3(Suppl 3):S6. doi: 10.1186/1471-2105-7-S3-S6.

Abstract

Background: Data integration is a crucial task in the biomedical domain, and integrating data sources is one way to achieve it. Data elements (DEs) in particular play an important role in data integration. We combine schema- and instance-based approaches to mapping DEs to terminological resources in order to facilitate the integration of data sources.

Methods: We extracted DEs from eleven disparate biomedical sources. We compared these DEs to concepts and/or terms in biomedical controlled vocabularies and to reference DEs. We also exploited DE values to disambiguate underspecified DEs and to identify additional mappings.
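As an illustration only, the combination of schema-based matching (on DE names) and instance-based matching (on DE values) described above might be sketched as follows. This is a minimal sketch, not the authors' implementation: the toy vocabulary, the value lexicon, and the function names `normalize` and `map_data_element` are all hypothetical, and a real system would query actual terminological resources rather than in-memory dictionaries.

```python
import re

# Hypothetical toy vocabulary: normalized term -> concept label.
# A stand-in for a real controlled vocabulary (assumption).
VOCABULARY = {
    "gene symbol": "Gene Symbol",
    "organism": "Organism",
    "chromosome": "Chromosome",
}

# Hypothetical lexicon used to assign a semantic type to DE values (assumption).
VALUE_TYPES = {
    "homo sapiens": "Organism",
    "mus musculus": "Organism",
}


def normalize(name):
    """Split camelCase, replace underscores/punctuation with spaces, lowercase."""
    name = re.sub(r"([a-z])([A-Z])", r"\1 \2", name)
    name = re.sub(r"[_\W]+", " ", name)
    return name.strip().lower()


def map_data_element(name, values=()):
    """Return (concept, method) for a DE name and its optional sample values."""
    # Schema-based step: exact match of the normalized DE name.
    term = normalize(name)
    if term in VOCABULARY:
        return VOCABULARY[term], "schema"
    # Instance-based step: accept a concept only if every value
    # is typed to the same concept.
    types = {VALUE_TYPES.get(normalize(v)) for v in values}
    if len(types) == 1 and None not in types:
        return types.pop(), "instance"
    return None, "unmapped"
```

In this sketch, a DE named `GeneSymbol` is mapped by name alone, while an underspecified DE such as `tax` is mapped only if all of its sample values share one semantic type.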

Results: 82.5% of the 474 DEs studied were mapped to entries of a terminological resource, and 74.7% of the whole set could be associated with reference DEs. Only 6.6% of the DEs had values that could be semantically typed.

Conclusion: Our study suggests that the integration of biomedical sources can be achieved automatically with limited precision and is largely facilitated by mapping DEs to terminological resources.

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Intramural

MeSH terms

  • Abstracting and Indexing*
  • Algorithms
  • Artificial Intelligence
  • Databases, Factual*
  • Information Storage and Retrieval / methods*
  • Natural Language Processing*
  • Periodicals as Topic*
  • Semantics
  • Software
  • Terminology as Topic*
  • Vocabulary, Controlled*