Stat Methods Med Res. 2018 Mar;27(3):785-797. doi: 10.1177/0962280216643116. Epub 2016 Apr 25.

Class-imbalanced subsampling lasso algorithm for discovering adverse drug reactions.

Author information

1 Inserm UMR 1181, Biostatistics, Biomathematics, Pharmacoepidemiology and Infectious Diseases (B2PHI), F-94807 Villejuif, France.
2 Institut Pasteur, UMR 1181, B2PHI, F-75015 Paris, France.
3 Univ. Versailles St Quentin, UMR 1181, B2PHI, F-94807 Villejuif, France.
4 University of Bordeaux, UMR 1219, F-33000 Bordeaux, France.
5 Inserm UMR 1219, Bordeaux Population Health Research Center, Pharmacoepidemiology team, F-33000 Bordeaux, France.
6 Department of Medical Pharmacology, CHU de Bordeaux, F-33000 Bordeaux, France.


Background: All methods routinely used to generate safety signals from pharmacovigilance databases rely on disproportionality analyses of counts aggregating patients' spontaneous reports. Recently, it was proposed to analyze individual spontaneous reports directly using Bayesian lasso logistic regressions. Nevertheless, this raises the issue of choosing an adequate regularization parameter in a variable selection framework while accounting for computational constraints due to the high dimension of the data.

Purpose: Our main objective is to propose a method that exploits the subsampling idea from Stability Selection, a variable selection procedure combining subsampling with a high-dimensional selection algorithm, and adapts it to the specificities of spontaneous reporting data, which are characterized by their large size, their binary nature and their sparsity.

Materials and method: Given the large imbalance between the presence and absence of a given adverse event, we propose an alternative subsampling scheme to that of Stability Selection, resulting in an over-representation of the minority class and a drastic reduction in the number of observations in each subsample. Simulations are used to help define the detection threshold with regard to the average proportion of false signals. They are also used to compare the performance of the proposed sampling scheme with that originally proposed for Stability Selection. Finally, we compare the proposed method to the gamma Poisson shrinker, a disproportionality method, and to a lasso logistic regression approach through an empirical study conducted on the French national pharmacovigilance database and two sets of reference signals.

Results: Simulations show that the proposed sampling strategy performs better in terms of false discoveries and is faster than the equiprobable sampling of Stability Selection. The empirical evaluation illustrates the better performance of the proposed method compared with the gamma Poisson shrinker and the lasso in terms of the number of reference signals retrieved.
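
The general idea of pairing a class-imbalanced subsampling scheme with a lasso logistic regression can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the details it needs, so several choices are assumptions made for illustration only: each subsample draws as many cases as the data contain and a fixed multiple of controls (`n_controls_per_case`), both with replacement; the lasso is fitted by a plain proximal-gradient loop with an untuned step size; a drug counts as selected when its coefficient is strictly positive; and every function and parameter name is hypothetical. The abstract states that the detection threshold applied to the resulting frequencies was calibrated by simulation, so no threshold is applied here.

```python
import math
import random


def imbalanced_subsample(y, n_controls_per_case, rng):
    """Draw row indices that over-represent the minority class (y == 1).

    Assumption: all draws are made with replacement, with as many cases as
    in the data and n_controls_per_case controls per case.
    """
    cases = [i for i, v in enumerate(y) if v == 1]
    controls = [i for i, v in enumerate(y) if v == 0]
    picked_cases = rng.choices(cases, k=len(cases))
    picked_controls = rng.choices(controls, k=n_controls_per_case * len(cases))
    return picked_cases + picked_controls


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def lasso_logistic(X, y, lam, step=0.5, n_iter=200):
    """L1-penalized logistic regression by proximal gradient descent.

    The intercept is not penalized. step and n_iter are illustrative and
    not tuned for convergence on real data.
    """
    n, p = len(X), len(X[0])
    beta, intercept = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad, grad0 = [0.0] * p, 0.0
        for row, target in zip(X, y):
            z = intercept + sum(b * x for b, x in zip(beta, row))
            resid = 1.0 / (1.0 + math.exp(-z)) - target
            grad0 += resid / n
            for j, x in enumerate(row):
                if x:  # binary, sparse covariates: skip the zeros
                    grad[j] += resid * x / n
        intercept -= step * grad0
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


def selection_frequencies(X, y, lam, n_subsamples=20,
                          n_controls_per_case=4, seed=0):
    """Fraction of subsamples in which each covariate gets a positive coefficient."""
    rng = random.Random(seed)
    counts = [0] * len(X[0])
    for _ in range(n_subsamples):
        idx = imbalanced_subsample(y, n_controls_per_case, rng)
        beta = lasso_logistic([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b > 0:
                counts[j] += 1
    return [c / n_subsamples for c in counts]
```

With `X` a list of binary drug-exposure rows and `y` the 0/1 indicator of the adverse event, `selection_frequencies(X, y, lam=0.05)` returns one frequency per drug. Because each subsample contains only a few controls per case, every fit involves far fewer rows than the full database, which is the source of the speed gain the abstract reports.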

[Indexed for MEDLINE]
