J Am Med Inform Assoc. 2012 Sep-Oct;19(5):913-6. doi: 10.1136/amiajnl-2011-000607. Epub 2012 Jan 29.

Automatic classification of mammography reports by BI-RADS breast tissue composition class.

Author information

Biomedical Informatics Program, Stanford University, Stanford, California 94305-5488, USA.

Abstract

Because breast tissue composition partially predicts breast cancer risk, classification of mammography reports by breast tissue composition is important from both a scientific and clinical perspective. A method is presented for using the unstructured text of mammography reports to classify them into BI-RADS breast tissue composition categories. An algorithm that uses regular expressions to automatically determine BI-RADS breast tissue composition classes for unstructured mammography reports was developed. The algorithm assigns each report to a single BI-RADS composition class: 'fatty', 'fibroglandular', 'heterogeneously dense', 'dense', or 'unspecified'. We evaluated its performance on mammography reports from two different institutions. The method achieves >99% classification accuracy on a test set of reports from the Marshfield Clinic (Wisconsin) and Stanford University. Since large-scale studies of breast cancer rely heavily on breast tissue composition information, this method could facilitate this research by helping mine large datasets to correlate breast composition with other covariates.
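
The kind of regular-expression classifier the abstract describes can be sketched roughly as follows. This is a minimal illustration only: the abstract does not give the authors' actual rules, so the patterns, the `PATTERNS` table, and the `classify_report` function are hypothetical, and only the five class labels come from the abstract.

```python
import re

# Hypothetical patterns for illustration; these are NOT the rules from the paper.
# Order matters: "heterogeneously dense" must be tested before plain "dense",
# otherwise the more general pattern would claim those reports first.
PATTERNS = [
    ("heterogeneously dense", re.compile(r"\bheterogeneously\s+dense\b", re.I)),
    ("dense", re.compile(r"\b(?:extremely\s+)?dense\b", re.I)),
    ("fibroglandular", re.compile(r"\bscattered\s+fibroglandular\b", re.I)),
    ("fatty", re.compile(r"\b(?:fatty|almost\s+entirely\s+fat)\b", re.I)),
]


def classify_report(text: str) -> str:
    """Return a single composition class for one report's free text."""
    for label, pattern in PATTERNS:
        if pattern.search(text):
            return label
    # No composition statement was recognized in the report.
    return "unspecified"


if __name__ == "__main__":
    sample = "The breasts are heterogeneously dense, which may obscure small masses."
    print(classify_report(sample))
```

A sketch this simple would misread phrases such as "dense mass" or negated statements, so a classifier meant for real reports would likely need more specific patterns and validation against labeled reports, as the authors did with data from two institutions.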

PMID: 22291166
PMCID: PMC3422822
DOI: 10.1136/amiajnl-2011-000607
[Indexed for MEDLINE]
Free PMC Article