BMC Bioinformatics. 2015 Mar 27;16:102. doi: 10.1186/s12859-015-0534-z.

Aro: a machine learning approach to identifying single molecules and estimating classification error in fluorescence microscopy images.

Author information

1
Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA. allison.cy.wu@gmail.com.
2
Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, La Jolla, CA, USA. sarifkin@ucsd.edu.
3
Section of Ecology, Behavior, and Evolution, Division of Biology, University of California, San Diego, La Jolla, CA, USA. sarifkin@ucsd.edu.

Abstract

BACKGROUND:

Recent techniques for tagging and visualizing single molecules in fixed or living organisms and cell lines have been revolutionizing our understanding of the spatial and temporal dynamics of fundamental biological processes. However, fluorescence microscopy images are often noisy, and it can be difficult to distinguish a fluorescently labeled single molecule from background speckle.

RESULTS:

We present a computational pipeline to distinguish the true signal of fluorescently labeled molecules from background fluorescence and noise. We test our technique on the challenging case of wide-field epifluorescence microscope image stacks from single-molecule fluorescence in situ hybridization experiments on nematode embryos, where there can be substantial out-of-focus light and structured noise. The software recognizes individual mRNA spots by measuring several features of local intensity maxima and classifying them with a supervised random forest classifier. A key innovation of this software is that it estimates, in a statistically principled way, the probability that each local maximum is a true spot, which makes it possible to estimate the error introduced by image classification. This error estimate can be used to assess the quality of the data and to compute a confidence interval for the molecule count, both of which are important for quantitative interpretation of single-molecule experiments.
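To illustrate how per-spot probabilities can translate into an interval on the molecule count, here is a minimal sketch. It is not the method implemented in Aro: it assumes, purely for illustration, that each candidate spot is an independent Bernoulli trial with the classifier's probability, so the count follows a Poisson-binomial distribution. The function names and the example probabilities are hypothetical.

```python
from typing import List, Tuple


def count_distribution(probs: List[float]) -> List[float]:
    """Exact distribution of the number of true spots.

    Assumption (not from the paper): candidates are independent, so the
    count is Poisson-binomial. dist[k] is P(count == k).
    """
    dist = [1.0]  # with no candidates, the count is 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # this candidate is not a true spot
            nxt[k + 1] += mass * p       # this candidate is a true spot
        dist = nxt
    return dist


def count_interval(probs: List[float], level: float = 0.95) -> Tuple[float, int, int]:
    """Return (expected count, lower bound, upper bound).

    The bounds are chosen so that at most (1 - level) / 2 of the
    probability mass lies strictly below lower or strictly above upper.
    """
    dist = count_distribution(probs)
    tail = (1.0 - level) / 2.0

    lower = 0
    cumulative = 0.0
    for k, mass in enumerate(dist):
        cumulative += mass
        if cumulative > tail:
            lower = k
            break

    upper = len(dist) - 1
    cumulative = 0.0
    for k in range(len(dist) - 1, -1, -1):
        cumulative += dist[k]
        if cumulative > tail:
            upper = k
            break

    return sum(probs), lower, upper


if __name__ == "__main__":
    # Hypothetical per-candidate probabilities from a classifier.
    example_probs = [0.98, 0.95, 0.90, 0.60, 0.20, 0.05]
    expected, lo, hi = count_interval(example_probs)
    print(f"expected count {expected:.2f}, 95% interval [{lo}, {hi}]")
```

Under this assumption, an image whose candidates mostly score near 0 or 1 yields a narrow interval, while many ambiguous candidates widen it, which is the sense in which interval width can indicate image quality.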

CONCLUSIONS:

The software classifies spots in these images well, achieving an AUROC above 95% on realistic artificial data, and it outperforms other commonly used techniques on challenging real data. Its interval estimates provide a unique measure of the quality of an image and of confidence in the classification.
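For readers unfamiliar with the AUROC figure quoted above, the following sketch shows one standard way to compute it: as the fraction of (true spot, non-spot) pairs in which the true spot receives the higher score, with ties counted as half. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a true spot, 0 for background.
    scores: classifier scores, higher meaning more spot-like.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # true spot ranked above background
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data: two true spots and two background maxima.
    print(auroc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
```

The pairwise form is quadratic in the number of examples, which is fine for illustration; rank-based formulas give the same value more efficiently on large data sets.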

PMID:
25880543
PMCID:
PMC4450985
DOI:
10.1186/s12859-015-0534-z
[Indexed for MEDLINE]
Free PMC Article
