J Bioinform Comput Biol. 2004 Jun;2(2):375-411.

Techniques for optimization of queries on integrated biological resources.

Author information

Arizona State University, PO Box 876106, Tempe, Arizona 85287-6106, USA.


Today, scientific data are inevitably digitized, stored in a wide variety of formats, and accessible over the Internet. Scientific discovery increasingly involves accessing multiple heterogeneous data sources, integrating the results of complex queries, and applying further analysis and visualization applications in order to collect datasets of interest. Building a scientific integration platform to support these critical tasks requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web, and data that are locally materialized in warehouses or generated by software. The inefficiency of existing approaches can significantly hamper this process, causing lengthy delays in accessing critical resources or even the failure of the system to report any results. Some queries take so long to answer that their results are returned via email, which makes integrating them with other results tedious. This paper presents several issues that need to be addressed to provide seamless and efficient integration of biomolecular data. The challenges identified include capturing and representing the various domain-specific computational capabilities supported by a source, including sequence or text search engines and traditional query processing; developing a methodology to acquire and represent semantic knowledge and metadata about source contents, overlap in source contents, and access costs; and developing cost- and semantics-based decision support tools to select sources and capabilities and to generate efficient query evaluation plans.

[Indexed for MEDLINE]