Epigenomics. 2019 Aug 30. doi: 10.2217/epi-2019-0206. [Epub ahead of print]

EpiSmokEr: a robust classifier to determine smoking status from DNA methylation data.

Author information

1
Institute for Molecular Medicine Finland, University of Helsinki, 00290 Helsinki, Uusimaa, Finland.
2
Department of Public Health, University of Helsinki, 00290 Helsinki, Uusimaa, Finland.
3
National Institute for Health & Welfare, University of Helsinki, P.O. Box 30, FI-00271 Helsinki, Uusimaa, Finland.
4
Center for Molecular Biology of the University of Heidelberg, Im Neuenheimer Feld 282, 69120 Heidelberg, Baden-Württemberg, Germany.

Abstract

Aim: Smoking strongly influences DNA methylation, with current and never smokers exhibiting different methylation profiles. Methods: To advance the practical applicability of the smoking-associated methylation signals, we used machine learning methodology to train a classifier for smoking status prediction. Results: We show the prediction performance of our classifier on three independent whole-blood datasets, demonstrating its robustness and global applicability. Furthermore, we examine the reasons for biologically meaningful misclassifications through comprehensive phenotypic evaluation. Conclusion: The major contribution of our classifier is its global applicability, without the need for users to determine a dataset-specific threshold value to predict smoking status. We provide an R package, EpiSmokEr (Epigenetic Smoking status Estimator), facilitating the use of our classifier to predict smoking status in future studies.

KEYWORDS:

DNA methylation; epigenetic smoking status; multinomial LASSO; smoking status classifier; tobacco smoking

PMID: 31466478
DOI: 10.2217/epi-2019-0206
