J Chem Inf Model. 2005 Nov-Dec;45(6):1824-36.

Identifying biologically active compound classes using phenotypic screening data and sampling statistics.

Author information

Howard Hughes Medical Institute, Harvard Institute of Chemistry and Cell Biology, Broad Institute of Harvard and MIT, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, USA.


Scoring the activity of compounds in phenotypic high-throughput assays presents a unique challenge because of the limited resolution and inherent measurement error of these assays. Techniques that leverage the structural similarity of compounds within an assay can be used to improve the hit-recovery rate from screening data. A technique is presented that uses clustering and sampling statistics to predict likely compound activity by scoring entire structural classes. A set of phenotypic assays performed against a commercially available compound library was used as a test set. Using the class-scoring technique, the resultant activity prediction scores were more reproducible than individual assay measurements, and class scoring recovered known active compounds more efficiently than individual assay measurements because class scoring had fewer false positives. Known biologically active compounds were recovered 87% of the time using class scores, suggesting a low false-negative rate that compared well to individual assay measurements. In addition, many weak and potentially novel classes of active compounds, overlooked by individual assay measurements, were suggested.
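The abstract does not spell out the statistics behind class scoring, so the following is only a minimal sketch of one way the general idea could work, not the authors' method. It assumes each compound has a single numeric assay score and that structural classes have already been produced by some clustering step. A class is then scored by comparing its mean assay score against the means of random, same-sized samples drawn from the whole library (a plain resampling test). The function name `class_p_value`, its parameters, and the toy data are all hypothetical.

```python
import random
from statistics import mean


def class_p_value(scores, members, n_samples=2000, seed=0):
    """Empirical p-value for one structural class (illustrative assumption).

    scores  : dict mapping compound id -> assay score for the whole library
    members : list of compound ids belonging to the class being scored
    Returns the smoothed fraction of random same-size samples whose mean
    score is at least as high as the class mean.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    library = list(scores)
    k = len(members)
    observed = mean(scores[c] for c in members)

    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(library, k)  # draw k compounds without replacement
        if mean(scores[c] for c in sample) >= observed:
            hits += 1

    # Add-one smoothing so the estimate is never exactly zero.
    return (hits + 1) / (n_samples + 1)


# Toy library of 100 hypothetical compounds with scores 0..9.
toy_scores = {f"c{i}": float(i % 10) for i in range(100)}
active_class = ["c9", "c19", "c29", "c39", "c49"]   # all high scorers
inactive_class = ["c0", "c1", "c2", "c3", "c4"]     # low scorers

p_active = class_p_value(toy_scores, active_class)
p_inactive = class_p_value(toy_scores, inactive_class)
```

In this toy setup, a class whose members all score highly gets a small empirical p-value, while a class of low scorers does not. Pooling evidence across class members in this way is one plausible route to scores that are less sensitive to the measurement error of any single well.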

[Indexed for MEDLINE]
