A SuperLearner Approach to Predict Run-In Selection in Clinical Trials

Corrado Lanera; Paola Berchialla; Giulia Lorenzoni; Aslihan Şentürk Acar; Valentina Chiminazzo; Danila Azzolina; Dario Gregori; Ileana Baldi

doi:10.1155/2022/4306413

Comput Math Methods Med. 2022 Sep 10:2022:4306413. doi: 10.1155/2022/4306413. eCollection 2022.

Authors

Corrado Lanera¹, Paola Berchialla², Giulia Lorenzoni¹, Aslihan Şentürk Acar³, Valentina Chiminazzo¹, Danila Azzolina¹,⁴, Dario Gregori¹, Ileana Baldi¹

Affiliations

¹ Unit of Biostatistics, Epidemiology and Public Health, Department of Cardiac, Thoracic, Vascular Sciences, and Public Health, University of Padova, Via Loredan, 18, 35121 Padova, Italy.
² Department of Clinical and Biological Sciences, University of Torino, Via Verdi 8, 10124 Torino, Italy.
³ Department of Actuarial Sciences, Hacettepe University, Ankara, Turkey 06800.
⁴ Department of Environmental and Preventive Sciences, University of Ferrara, Via Fossato di Mortara 64B, 44121 Ferrara, Italy.

Abstract

A critical early step in a clinical trial is defining a study sample that appropriately represents the target population from which it will be drawn. Including a "run-in" process in the study design may accomplish this task; however, a traditional run-in requires additional patients and increases time and costs. Using the a-priori data already available could make it possible to skip the run-in period. In this regard, machine learning (ML) techniques, which have recently shown considerable promise in clinical research, can be used to construct individual predictions of the probability of therapy response conditional on patient characteristics. Within this framework, an ensemble ML model composed of 26 algorithms was trained and validated on twin randomized clinical trials to mimic a run-in process. For the Verum (treatment) arm, the SuperLearner (SL) achieved a sensitivity above 70% and a Positive Predictive Value (PPV) of 80%. The results show good performance, suggesting that the approach is useful for simulating the run-in period; trials conducted in similar settings can be used to train an optimal patient selection algorithm that minimizes run-in time and the costs of conducting the trial.
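The two reported metrics can be illustrated with a minimal sketch. It assumes, for illustration only, that the ensemble prediction is a weighted average of base-learner response probabilities and that a patient is selected when that probability reaches a threshold. The function names, weights, threshold, and toy data below are hypothetical and are not taken from the study.

```python
from typing import List, Sequence, Tuple


def ensemble_probability(
    base_probs: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of base-learner probabilities (one row per learner).

    Assumption: the ensemble is a convex combination of its base learners.
    """
    if len(base_probs) != len(weights):
        raise ValueError("need exactly one weight per base learner")
    total = sum(weights)
    n_patients = len(base_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, base_probs)) / total
        for i in range(n_patients)
    ]


def select(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Flag a patient as a predicted responder (1) when prob >= threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def sensitivity_and_ppv(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); PPV = TP / (TP + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sens, ppv


# Toy data (hypothetical): two base learners, five patients.
toy_base_probs = [
    [0.9, 0.8, 0.3, 0.6, 0.2],
    [0.7, 0.6, 0.5, 0.2, 0.4],
]
toy_weights = [0.5, 0.5]
toy_observed = [1, 0, 1, 0, 0]  # 1 = patient actually responded

toy_selected = select(ensemble_probability(toy_base_probs, toy_weights))
toy_sens, toy_ppv = sensitivity_and_ppv(toy_observed, toy_selected)
```

In this reading, sensitivity is the share of true responders that the model would have selected, and PPV is the share of selected patients who actually responded.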

MeSH terms

Algorithms*
Humans
Machine Learning*
Predictive Value of Tests
Research Design