Epigenomics. 2019 Aug 30. doi: 10.2217/epi-2019-0206. [Epub ahead of print]

EpiSmokEr: a robust classifier to determine smoking status from DNA methylation data.

Author information

Institute for Molecular Medicine Finland, University of Helsinki, 00290 Helsinki, Uusimaa, Finland.
Department of Public Health, University of Helsinki, 00290 Helsinki, Uusimaa, Finland.
National Institute for Health & Welfare, University of Helsinki, P.O. Box 30, FI-00271 Helsinki, Uusimaa, Finland.
Center for Molecular Biology of the University of Heidelberg, Im Neuenheimer Feld 282, 69120 Heidelberg, Baden-Württemberg, Germany.


Aim: Smoking strongly influences DNA methylation, with current and never smokers exhibiting different methylation profiles. Methods: To advance the practical applicability of smoking-associated methylation signals, we used machine learning methodology to train a classifier for smoking status prediction. Results: We show the prediction performance of our classifier on three independent whole-blood datasets, demonstrating its robustness and global applicability. Furthermore, we examine the reasons for biologically meaningful misclassifications through comprehensive phenotypic evaluation. Conclusion: The major contribution of our classifier is its global applicability: users do not need to determine a dataset-specific threshold value to predict smoking status. We provide an R package, EpiSmokEr (Epigenetic Smoking status Estimator), facilitating the use of our classifier to predict smoking status in future studies.
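
To illustrate why a multinomial classifier needs no dataset-specific threshold, the following is a minimal sketch, not the EpiSmokEr implementation. It assumes three class labels (never, former, current), two placeholder CpG sites (cpg_a, cpg_b), and made-up sparse coefficients of the kind a LASSO penalty might leave behind. All names and numbers are hypothetical. Each class gets a linear score from the methylation beta values, the scores are converted to probabilities with a softmax, and the predicted status is simply the most probable class.

```python
import math

# Hypothetical coefficients for illustration only; these are NOT the weights
# used by EpiSmokEr. A LASSO penalty sets many weights to exactly zero, so
# each class depends on only a few CpG sites.
COEFFICIENTS = {
    "never":   {"intercept": 0.5, "cpg_a": 4.0,  "cpg_b": 0.0},
    "former":  {"intercept": 1.5, "cpg_a": 0.0,  "cpg_b": 2.0},
    "current": {"intercept": 4.0, "cpg_a": -6.0, "cpg_b": 0.0},
}


def class_probabilities(betas, coefficients=COEFFICIENTS):
    """Return softmax probabilities per class for one sample.

    betas maps CpG site name to a methylation beta value in [0, 1].
    """
    scores = {}
    for label, weights in coefficients.items():
        score = weights["intercept"]
        for site, weight in weights.items():
            if site != "intercept":
                score += weight * betas[site]
        scores[label] = score

    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def predict_smoking_status(betas, coefficients=COEFFICIENTS):
    """Pick the class with the highest probability; no cutoff is tuned."""
    probs = class_probabilities(betas, coefficients)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    sample = {"cpg_a": 0.3, "cpg_b": 0.2}  # hypothetical beta values
    print(class_probabilities(sample))
    print(predict_smoking_status(sample))
```

Because the decision rule is an argmax over class probabilities, the same fitted coefficients can be applied to a new dataset without choosing a cutoff on a continuous score, which is the property the abstract highlights.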


Keywords: DNA methylation; epigenetic smoking status; multinomial LASSO; smoking status classifier; tobacco smoking

