Sci Total Environ. 2019 Mar 10;655:512-519. doi: 10.1016/j.scitotenv.2018.11.022. Epub 2018 Nov 5.

Modeling groundwater nitrate exposure in private wells of North Carolina for the Agricultural Health Study.

Author information

The University of Texas at Austin, Civil, Architectural and Environmental Engineering, 301 E. Dean Keeton St., Austin, TX 78712, United States; Oregon State University, Environmental and Molecular Toxicology, 1007 Agriculture and Life Sciences Building, Corvallis, OR 97331, United States. Electronic address:
Virginia Commonwealth University, Department of Biostatistics, 830 East Main St., Richmond, VA 23298, United States.
Westat, 1600 Research Blvd., Rockville, MD 20850, United States.
National Cancer Institute, Division of Cancer Epidemiology and Genetics, Occupational and Environmental Epidemiology Branch, 9609 Medical Center Dr., Rockville, MD 20850, United States.
U.S. Geological Survey, Water Mission Area, 12201 Sunrise Valley Dr., Reston, VA 20192, United States.


Unregulated private wells in the United States are susceptible to many groundwater contaminants. Ingestion of nitrate, the most common anthropogenic private well contaminant in the United States, can lead to the endogenous formation of N-nitroso compounds, which are known human carcinogens. In this study, we expand upon previous efforts to model private well groundwater nitrate concentration in North Carolina by developing multiple machine learning models and testing against out-of-sample prediction. Our purpose was to develop exposure estimates in unmonitored areas for use in the Agricultural Health Study (AHS) cohort. Using approximately 22,000 private well nitrate measurements in North Carolina, we trained and tested continuous models including a censored maximum likelihood-based linear model, random forest, gradient boosted machine, support vector machine, neural networks, and kriging. Continuous nitrate models had low predictive performance (R² < 0.33), so multiple random forest classification models were also trained and tested. The final classification approach predicted <1 mg/L, 1-5 mg/L, and ≥5 mg/L using a random forest model with 58 variables and maximizing the Cohen's kappa statistic. The final model had an overall accuracy of 0.75 and high specificity for the higher two categories and high sensitivity for the lowest category. The results will be used for the categorical prediction of private well nitrate for AHS cohort participants that reside in North Carolina.
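As an illustration of the evaluation metric named in the abstract, the following minimal sketch bins nitrate concentrations into the three reported categories and computes Cohen's kappa between observed and predicted categories. It is not the authors' code: the function names `nitrate_category` and `cohen_kappa` are hypothetical, the sample values are invented for demonstration, and it assumes a concentration of exactly 1 mg/L falls in the 1-5 mg/L category and exactly 5 mg/L in the ≥5 mg/L category.

```python
from collections import Counter


def nitrate_category(mg_per_l: float) -> str:
    """Map a nitrate concentration (mg/L) to one of the three categories.

    Boundary handling (1.0 -> "1-5", 5.0 -> ">=5") is an assumption
    read off the category labels in the abstract.
    """
    if mg_per_l < 1.0:
        return "<1"
    if mg_per_l < 5.0:
        return "1-5"
    return ">=5"


def cohen_kappa(observed, predicted) -> float:
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    # Observed agreement: fraction of wells where the labels match.
    agreement = sum(o == p for o, p in zip(observed, predicted)) / n

    # Chance agreement: expected matches if labels were assigned
    # independently with the same marginal frequencies.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    labels = set(obs_counts) | set(pred_counts)
    chance = sum(obs_counts[c] * pred_counts[c] for c in labels) / (n * n)

    if chance == 1.0:
        return float("nan")  # kappa is undefined when only one label occurs
    return (agreement - chance) / (1.0 - chance)


# Invented example values, not data from the study.
measured = [0.2, 0.5, 3.0, 7.0]
observed = [nitrate_category(x) for x in measured]
predicted = ["<1", "<1", "<1", ">=5"]
kappa = cohen_kappa(observed, predicted)
```

Kappa discounts agreement that would occur by chance, which matters when one category (here, <1 mg/L) dominates the data and raw accuracy alone could look deceptively high.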


Keywords: Agricultural Health Study; Exposure assessment; Groundwater contamination; Nitrate; Random Forest

[Indexed for MEDLINE]
