    Bioinformatics. 2010 May 15;26(10):1348-56. Epub 2010 Apr 7.

    A CROC stronger than ROC: measuring, visualizing and optimizing early retrieval.

    Source

    Division of Laboratory and Genomic Medicine, Department of Pathology and Immunology, Washington University, St. Louis, MO 63110, USA.

    Abstract

    MOTIVATION:

    The performance of classifiers is often assessed using Receiver Operating Characteristic (ROC) curves [or accumulation curves (AC), also called enrichment curves] and the corresponding areas under the curves (AUCs). However, in many fundamental problems ranging from information retrieval to drug discovery, only the very top of the ranked list of predictions is of any interest, and ROCs and AUCs are not very useful. New metrics, visualizations and optimization tools are needed to address this 'early retrieval' problem.

    RESULTS:

    To address the early retrieval problem, we develop the general concentrated ROC (CROC) framework. In this framework, any relevant portion of the ROC (or AC) curve is magnified smoothly by an appropriate continuous transformation of the coordinates with a corresponding magnification factor. Appropriate families of magnification functions confined to the unit square are derived, and their properties are analyzed together with the resulting CROC curves. The area under the CROC curve (AUC[CROC]) can be used to assess early retrieval. The general framework is demonstrated on a drug discovery problem and used to discriminate the early retrieval performance of five different predictors more accurately. From this framework, we propose a novel metric and visualization, the CROC(exp) (an exponential transform of the ROC curve), as an alternative to other methods. The CROC(exp) provides a principled, flexible and effective way of measuring and visualizing early retrieval performance with excellent statistical power. Corresponding methods for optimizing early retrieval are also described in the Appendix.
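The idea of magnifying the early part of the ROC curve can be illustrated with a minimal sketch. It is not the code from the CROC package. The abstract does not state the formula of the exponential transform, so the sketch assumes the form f(x) = (1 - exp(-alpha x)) / (1 - exp(-alpha)), which maps the unit interval onto itself and stretches the region near x = 0. The function names and the value alpha = 7.0 are illustrative assumptions, and the ROC construction assumes untied scores.

```python
import math


def exp_magnify(x, alpha=7.0):
    """Map x in [0, 1] onto [0, 1], stretching the region near x = 0.

    Assumed form (not taken from the abstract):
    f(x) = (1 - exp(-alpha * x)) / (1 - exp(-alpha)).
    """
    return (1.0 - math.exp(-alpha * x)) / (1.0 - math.exp(-alpha))


def roc_points(labels_by_rank):
    """Build step-wise ROC points from 0/1 labels sorted best-scored first.

    Assumes no tied scores and at least one positive and one negative.
    """
    n_pos = sum(labels_by_rank)
    n_neg = len(labels_by_rank) - n_pos
    tp = fp = 0
    points = [(0.0, 0.0)]
    for label in labels_by_rank:
        if label:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def croc_points(points, alpha=7.0):
    """Transform only the false positive rate axis; keep the true positive rate."""
    return [(exp_magnify(x, alpha), y) for x, y in points]


def area_under(points):
    """Trapezoidal area, exact for the step-wise curves built above."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Two hypothetical rankings of 2 positives among 4 negatives.
early = [1, 0, 0, 0, 0, 1]   # one positive ranked first
middle = [0, 0, 1, 1, 0, 0]  # both positives mid-list

for name, ranking in (("early", early), ("middle", middle)):
    roc = roc_points(ranking)
    print(name, area_under(roc), area_under(croc_points(roc)))
```

Both toy rankings have the same ROC AUC of 0.5, yet under the assumed transform the ranking with a positive in first place keeps an area of 0.5 while the other drops to roughly 0.03. That separation is what an early retrieval metric is meant to expose.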

    AVAILABILITY:

    Datasets are publicly available. Python code and command-line utilities implementing CROC curves and metrics are available at http://pypi.python.org/pypi/CROC/

    CONTACT:

    pfbaldi@ics.uci.edu

    PMID:
    20378557
    [PubMed - indexed for MEDLINE]
    PMCID:
    PMC2865862
    Free PMC Article
