BMC Bioinformatics. 2015 Dec 1;16:399. doi: 10.1186/s12859-015-0831-6.

Scalable analysis of Big pathology image data cohorts using efficient methods and high-performance computing strategies.

Author information

1. Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA. tahsin.kurc@stonybrook.edu.
2. Department of Pathology & Laboratory Medicine, Rutgers -- Robert Wood Johnson Medical School, New Brunswick, USA. qixi@rutgers.edu.
3. Rutgers Cancer Institute of New Jersey, New Brunswick, USA. qixi@rutgers.edu.
4. Department of Electrical and Computer Engineering, Rutgers University, New Brunswick, USA. daihou.wang@rutgers.edu.
5. Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA. fusheng.wang@stonybrook.edu.
6. Department of Computer Science, Stony Brook University, Stony Brook, USA. fusheng.wang@stonybrook.edu.
7. Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA. teodoro@unb.br.
8. Department of Computer Science, University of Brasilia, Brasília, Brazil. teodoro@unb.br.
9. Department of Biomedical Informatics, Emory University, Atlanta, USA. lee.cooper@emory.edu.
10. Department of Biomedical Informatics, Emory University, Atlanta, USA. mnalisn@emory.edu.
11. Department of Biomedical Engineering, University of Florida, Gainesville, USA. lin.yang@bme.ufl.edu.
12. Department of Biomedical Informatics, Stony Brook University, Stony Brook, USA. joel.saltz@stonybrookmedicine.edu.
13. Department of Pathology & Laboratory Medicine, Rutgers -- Robert Wood Johnson Medical School, New Brunswick, USA. foran@cinj.rutgers.edu.
14. Rutgers Cancer Institute of New Jersey, New Brunswick, USA. foran@cinj.rutgers.edu.

Abstract

BACKGROUND:

We describe a suite of tools and methods that form a core set of capabilities for researchers and clinical investigators to evaluate multiple analytical pipelines and quantify sensitivity and variability of the results while conducting large-scale studies in investigative pathology and oncology. The overarching objective of the current investigation is to address the challenges of large data sizes and high computational demands.

RESULTS:

The proposed tools and methods take advantage of state-of-the-art parallel machines and efficient content-based image searching strategies. The content-based image retrieval (CBIR) algorithms can quickly detect and retrieve image patches similar to a query patch using a hierarchical analysis approach. The analysis component, based on high-performance computing, can carry out consensus clustering on 500,000 data points using a large shared-memory system.
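To make the consensus clustering step concrete, the following is a minimal sketch of the general technique, not the authors' implementation. It assumes several clustering runs have already produced label lists for the same points, builds a co-association matrix (the fraction of runs in which two points share a cluster), and derives a consensus partition by linking pairs whose co-association exceeds a threshold. The function names `coassociation` and `consensus_partition` and the 0.5 threshold are illustrative assumptions.

```python
from itertools import combinations


def coassociation(labelings):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    clustering runs that placed points i and j in the same cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("every labeling must cover the same points")
        for i in range(n):
            counts[i][i] += 1  # a point always co-occurs with itself
        for i, j in combinations(range(n), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    runs = len(labelings)
    return [[c / runs for c in row] for row in counts]


def consensus_partition(matrix, threshold=0.5):
    """Link pairs whose co-association exceeds `threshold` and return
    the connected components as consensus labels (union-find).
    The threshold value is an illustrative assumption."""
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


if __name__ == "__main__":
    # Three hypothetical clustering runs over five points.
    runs = [[0, 0, 1, 1, 1], [1, 1, 0, 0, 1], [0, 0, 0, 1, 1]]
    print(consensus_partition(coassociation(runs)))
```

A dense pairwise matrix of this kind grows quadratically with the number of points: at 500,000 points it would hold 2.5 × 10^11 entries, which suggests why a large shared-memory system is useful for this step.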

CONCLUSIONS:

Our work demonstrates that efficient CBIR algorithms and high-performance computing can be leveraged for efficient analysis of large microscopy images to meet the challenges of clinically salient applications in pathology. These technologies enable researchers and clinical investigators to make more effective use of the rich informational content contained within digitized microscopy specimens.

PMID: 26627175
PMCID: PMC4667532
DOI: 10.1186/s12859-015-0831-6
[Indexed for MEDLINE]
Free PMC Article