Electrophoresis. 1993 Dec;14(12):1341-50.

An efficient disk based data structure for rapid searching of quantitative two-dimensional gel databases.

Author information

Image Processing Section, LMMB/NCI/FCRDC, Frederick, MD 21702.

Abstract

Fast access to quantitative two-dimensional (2-D) gel databases is important for rapidly searching for protein differences between sets of 2-D gels from an experiment. The GELLAB-II system organizes corresponding spots from the gels in the database into reference or "Rspot" sets. The numeric Rspot names index fixed regions in the paged composite gel database file. This is adequate for an existing database, but has several problems. (i) Building the initial database requires guessing how much disk space to pre-allocate for each set of corresponding spots (i.e., spots from different gels). If the system runs out of pre-allocated space during this process, it must expand the size of every set of corresponding spots, copying the old database data into the new layout in place on the disk. (ii) When new gels are added or the database is edited, creating a new spot may also trigger this expansion. The time spent and the disk space wasted can be appreciable, depending on the size of the database (on the order of 100 gels). (iii) Because every set of corresponding spots is the same size, space is wasted in most spot sets: they do not need the extra space required by the few spot sets that contain additional fragmented spots. We present a new low-level disk object-based structure and algorithm, paged indexed buckets (PIB), which optimizes disk space usage while having retrieval speed similar to that of the original method.
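
The abstract does not spell out how PIB is laid out, so the following is only a rough in-memory sketch of the general idea it names: an index keyed by Rspot number that points to small buckets allocated on demand, instead of one fixed-size region per Rspot set. The class and function names, the bucket capacity, and the use of chained buckets are all assumptions made for illustration and are not taken from GELLAB-II.

```python
from dataclasses import dataclass, field

BUCKET_CAPACITY = 4  # assumed number of spots per bucket (hypothetical value)


@dataclass
class Bucket:
    spots: list = field(default_factory=list)
    next_bucket: int = -1  # position of the next bucket in the chain, -1 = end


class PagedIndexedBuckets:
    """Toy model: a list of buckets stands in for fixed-size pages on disk."""

    def __init__(self):
        self.buckets = []  # hypothetical "disk pages"
        self.index = {}    # Rspot number -> position of its first bucket

    def _new_bucket(self):
        self.buckets.append(Bucket())
        return len(self.buckets) - 1

    def add_spot(self, rspot, spot):
        # Allocate the first bucket for this Rspot set only when it is needed.
        if rspot not in self.index:
            self.index[rspot] = self._new_bucket()
        b = self.index[rspot]
        # Walk to the last bucket in this Rspot set's chain.
        while self.buckets[b].next_bucket != -1:
            b = self.buckets[b].next_bucket
        # Grow only this chain when its last bucket is full.
        if len(self.buckets[b].spots) == BUCKET_CAPACITY:
            nb = self._new_bucket()
            self.buckets[b].next_bucket = nb
            b = nb
        self.buckets[b].spots.append(spot)

    def get_spots(self, rspot):
        # Follow the chain and collect every spot in insertion order.
        out = []
        b = self.index.get(rspot, -1)
        while b != -1:
            out.extend(self.buckets[b].spots)
            b = self.buckets[b].next_bucket
        return out

    def allocated_slots(self):
        return len(self.buckets) * BUCKET_CAPACITY


def fixed_region_slots(set_sizes):
    """Slots needed when every Rspot set is sized for the largest one."""
    return len(set_sizes) * max(set_sizes)
```

In this toy model, adding a spot to a full set extends only that set's chain, so no other set is copied or resized. With five Rspot sets of one spot each and one set of nine spots, the sketch allocates 8 buckets (32 slots), whereas sizing every set for the largest would need 54 slots.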

PMID: 8137800 [Indexed for MEDLINE]
