OMICS. 2019 Apr;23(4):207-213. doi: 10.1089/omi.2018.0191. Epub 2019 Feb 22.

Optimism Bias Correction in Omics Studies with Big Data: Assessment of Penalized Methods on Simulated Data.

Zhao Y(1,2,3,4), Dantony E(1,2,3,4), Roy P(1,2,3,4).

Author information

1 Service de Biostatistique-Bioinformatique, Pôle Santé Publique, Hospices Civils de Lyon, Lyon, France.
2 Université de Lyon, Lyon, France.
3 Université Claude Bernard (Lyon 1), Villeurbanne, France.
4 CNRS UMR 5558, Laboratoire de Biométrie et Biologie Évolutive, Équipe Biostatistique Santé, Villeurbanne, France.

Abstract

Big Data generated by omics technologies require simultaneous analysis of large numbers of variables. This leads to complex model selection and to parameter estimates that show optimism bias. This study on simulated data sets examined optimism-bias correction by penalized regression methods in case-control studies that involve clinical and omics variables. Least absolute shrinkage and selection operator (LASSO)-based methods (LASSO-penalized logistic regression, adaptive LASSO, and regularized LASSO for selection + ridge regression) were evaluated using power, false positive rate (FPR), false discovery rate (FDR), and comparisons of estimated versus theoretical parameters. The "ordinary" LASSO overcorrected the optimism bias. The adaptive LASSO with LASSO estimation of the weights was unable to provide a sufficient correction. Importantly, the adaptive LASSO with ridge estimation of the weights showed the best parameter estimation. The regularized LASSO selection showed a slight optimism bias that decreased as the training set size increased. The optimism bias decreased as the number of variables selected among the truly differentially expressed variables increased; however, power, FPR, and FDR were correlated. A compromise between model selection and estimation accuracy should therefore be found. These results might prove useful because Big Data analyses are becoming commonplace in omics/multiomics studies in integrative biology, precision medicine, and planetary health.
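
The selection criteria named in the abstract (power, FPR, FDR) can be illustrated with a minimal sketch. It assumes the usual textbook definitions: power is the share of truly differentially expressed variables that are selected, FPR is the share of null variables that are selected, and FDR is the share of selected variables that are null. The abstract does not spell out the study's exact definitions, so these may differ in detail. The function name `selection_metrics`, its arguments, and the example numbers are hypothetical and are not taken from the study.

```python
def selection_metrics(selected, truly_de, n_variables):
    """Return (power, fpr, fdr) for one variable-selection result.

    selected    -- indices of the variables kept by the penalized model
    truly_de    -- indices of the truly differentially expressed variables
    n_variables -- total number of candidate variables
    """
    selected = set(selected)
    truly_de = set(truly_de)

    true_pos = len(selected & truly_de)    # selected and truly differential
    false_pos = len(selected - truly_de)   # selected but actually null
    n_null = n_variables - len(truly_de)   # number of truly null variables

    # Guard against empty denominators (assumed convention: report 0.0).
    power = true_pos / len(truly_de) if truly_de else 0.0
    fpr = false_pos / n_null if n_null else 0.0
    fdr = false_pos / len(selected) if selected else 0.0
    return power, fpr, fdr


# Hypothetical simulated setting: 1000 candidate variables, of which the
# first 10 are truly differentially expressed; the model selects 12.
truly_de = range(10)
selected = [0, 1, 2, 3, 4, 5, 6, 7, 100, 200, 300, 400]

power, fpr, fdr = selection_metrics(selected, truly_de, n_variables=1000)
print(f"power={power:.3f}  FPR={fpr:.4f}  FDR={fdr:.3f}")
```

All three quantities are computed from the same selected set, which is one way to see why they move together: a weaker penalty that selects more variables tends to raise power, but it also tends to add false positives.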

KEYWORDS:

LASSO; optimism bias; parameter estimation; variable selection

PMID: 30794050
DOI: 10.1089/omi.2018.0191
