Format

Send to

Choose Destination
J Water Health. 2003 Dec;1(4):209-23.

Assessment of statistical methods used in library-based approaches to microbial source tracking.

Author information

1
Southern California Coastal Water Research Project, Westminster, CA 92863, USA. kerryr@sccwrp.org

Abstract

Several commonly used statistical methods for fingerprint identification in microbial source tracking (MST) were examined to assess the effectiveness of pattern-matching algorithms to correctly identify sources. Although numerous statistical methods have been employed for source identification, no widespread consensus exists as to which is most appropriate. A large-scale comparison of several MST methods, using identical fecal sources, presented a unique opportunity to assess the utility of several popular statistical methods. These included discriminant analysis, nearest neighbour analysis, maximum similarity and average similarity, along with several measures of distance or similarity. Threshold criteria for excluding uncertain or poorly matched isolates from final analysis were also examined for their ability to reduce false positives and increase prediction success. Six independent libraries used in the study were constructed from indicator bacteria isolated from fecal materials of humans, seagulls, cows and dogs. Three of these libraries were constructed using the rep-PCR technique and three relied on antibiotic resistance analysis (ARA). Five of the libraries were constructed using Escherichia coli and one using Enterococcus spp. (ARA). Overall, the outcome of this study suggests a high degree of variability across statistical methods. Despite large differences in correct classification rates among the statistical methods, no single statistical approach emerged as superior. Thresholds failed to consistently increase rates of correct classification and improvement was often associated with substantial effective sample size reduction. Recommendations are provided to aid in selecting appropriate analyses for these types of data.

PMID:
15382725
[Indexed for MEDLINE]

Supplemental Content

Loading ...
Support Center