Demonstration of a software design and statistical analysis methodology with application to patient outcomes data sets

Med Phys. 2013 Nov;40(11):111718. doi: 10.1118/1.4824917.

Abstract

Purpose: As clinical outcomes databases become routine tools within institutions, software tools are needed to support automated statistical analysis of these large data sets and intrainstitutional exchange from independent federated databases to support data pooling. In this paper, the authors present a design approach and analysis methodology that address both issues.

Methods: A software application was constructed to automate analysis of patient outcomes data using a wide range of statistical metrics by combining C#.Net and R code. The accuracy and speed of the code were evaluated using benchmark data sets.

Results: The approach provides the data needed to evaluate combinations of statistical measurements for their ability to identify patterns of interest in the data. Through application of the tools to a benchmark data set for dose-response threshold and to SBRT lung data sets, an algorithm was developed that uses receiver operating characteristic (ROC) curves to identify a threshold value and combines contingency tables, Fisher exact tests, Welch t-tests, and Kolmogorov-Smirnov tests to filter the large data set and identify values demonstrating dose-response. Kullback-Leibler divergences were used to provide additional confirmation.
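
The first stages of the filtering described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation, which combined C#.Net and R. Python is used here only for illustration, and several details are assumptions: the function names are hypothetical, the ROC threshold is chosen by maximizing Youden's J (the abstract does not state the selection criterion), and the toy dose and event values are invented. The Welch t-test, Kolmogorov-Smirnov, and Kullback-Leibler stages are omitted.

```python
from math import comb


def youden_threshold(doses, events):
    """Scan candidate cut-points along the ROC curve and return (cut, J).

    Assumption: the threshold is the cut maximizing Youden's J
    (sensitivity + specificity - 1); the source does not name a criterion.
    """
    n_pos = sum(events)
    n_neg = len(events) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(doses)):
        tp = sum(1 for d, e in zip(doses, events) if e and d >= cut)
        fp = sum(1 for d, e in zip(doses, events) if not e and d >= cut)
        j = tp / n_pos - fp / n_neg  # sensitivity - (1 - specificity)
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


def contingency(doses, events, cut):
    """2x2 table (a, b, c, d): rows are dose >= cut / dose < cut,
    columns are event / no event."""
    a = sum(1 for d, e in zip(doses, events) if d >= cut and e)
    b = sum(1 for d, e in zip(doses, events) if d >= cut and not e)
    c = sum(1 for d, e in zip(doses, events) if d < cut and e)
    d_ = sum(1 for d, e in zip(doses, events) if d < cut and not e)
    return a, b, c, d_


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value: sum the hypergeometric probabilities
    of all tables no more likely than the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Invented toy data: one dose metric per patient, 1 = toxicity observed.
doses = [5, 8, 10, 12, 15, 18, 20, 22, 25, 30]
events = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]

cut, j = youden_threshold(doses, events)
table = contingency(doses, events, cut)
p = fisher_exact_two_sided(*table)
print(cut, round(j, 3), table, round(p, 4))  # 18 0.833 (4, 1, 0, 5) 0.0476
```

In a pipeline of this kind, a dose metric would be kept as a dose-response candidate only if it also passed the remaining tests named in the abstract. Any significance cutoff applied to the p-value would be a further assumption.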

Conclusions: The work demonstrates the viability of the design approach and the software tool for analysis of large data sets.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Data Interpretation, Statistical
  • Databases, Factual
  • Dose-Response Relationship, Radiation
  • Humans
  • Outcome Assessment, Health Care / methods*
  • Programming Languages
  • ROC Curve
  • Radiometry
  • Reproducibility of Results
  • Software Design*
  • Software*
  • Statistics, Nonparametric