    PLoS Comput Biol. 2009 May;5(5):e1000391. Epub 2009 May 22.

    How to get the most out of your curation effort.

    Source

    Department of Medicine, University of Chicago, Chicago, Illinois, USA.

    Abstract

    Large-scale annotation efforts typically involve several experts who may disagree with each other. We propose an approach for modeling disagreements among experts that assigns each annotation a confidence value (i.e., the posterior probability that it is correct). The approach computes a certainty level for each individual annotation, given annotator-specific parameters estimated from data. We developed two probabilistic models for performing this analysis, compared these models using computer simulation, and tested each model's actual performance on a large data set generated by human annotators specifically for this study. We show that even in the worst-case scenario, when all annotators disagree, our approach allows us to significantly increase the probability of choosing the correct annotation. Along with this publication we make publicly available a corpus of 10,000 sentences annotated according to several cardinal dimensions that we have introduced in earlier work. Each of the 10,000 sentences was annotated 3-fold by a group of eight experts, and a 1,000-sentence subset was additionally annotated 5-fold by five new experts. While the presented data represent a specialized curation task, our modeling approach is general; most data annotation studies could benefit from our methodology.
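
    The idea of attaching a posterior probability to each annotation can be illustrated with a minimal sketch. This is not either of the two models from the paper: it assumes a much simpler setup in which annotators err independently, each annotator has a single accuracy parameter, and errors are spread uniformly over the remaining labels. The function name `label_posterior`, the annotator names, and the accuracy values are all hypothetical.

```python
from math import prod


def label_posterior(votes, accuracy, labels, prior=None):
    """Posterior over the true label given one vote per annotator.

    Simplifying assumptions (not taken from the paper): annotators err
    independently, and a wrong vote is equally likely to land on any of
    the other labels.
    """
    k = len(labels)
    if prior is None:
        # Uniform prior over labels when none is supplied.
        prior = {lab: 1.0 / k for lab in labels}
    unnorm = {}
    for lab in labels:
        # Likelihood of the observed votes if `lab` were the true label.
        likelihood = prod(
            accuracy[a] if v == lab else (1.0 - accuracy[a]) / (k - 1)
            for a, v in votes.items()
        )
        unnorm[lab] = prior[lab] * likelihood
    total = sum(unnorm.values())
    return {lab: p / total for lab, p in unnorm.items()}


# Worst case: three annotators, three different answers.
votes = {"ann1": "A", "ann2": "B", "ann3": "C"}
accuracy = {"ann1": 0.9, "ann2": 0.7, "ann3": 0.6}  # hypothetical values
posterior = label_posterior(votes, accuracy, labels=["A", "B", "C"])
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

    Even with complete disagreement, the posterior is not uniform under these assumptions: the label chosen by the most reliable annotator receives the highest probability, which is the intuition behind the worst-case claim in the abstract.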

    PMID:
    19461884
    [PubMed - indexed for MEDLINE]
    PMCID:
    PMC2678295
    Free PMC Article

