Motivation: Many problems in molecular biology, as well as in other areas, involve detecting rare events in unbalanced data. We develop two sample stratification schemes, used in conjunction with neural networks, for rare event detection in such data. Sample stratification is a technique for giving each class in a sample equal influence on decision making. The first scheme stratifies a sample by computing a weighted sum of the derivatives during the backward pass of training. The second scheme uses a modified form of bootstrap aggregating: neural networks are trained on multiple sets, each combining bootstrapped examples of the rare event classes with subsampled examples of the common event classes, and the trained networks then vote on the classification.
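The two schemes described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The inverse-class-size weighting, the equal-size resampling, and all function names (`class_weights`, `stratified_gradient`, `balanced_resample`, `majority_vote`) are assumptions made for illustration, and the neural network itself is left out.

```python
import random
from collections import Counter


def class_weights(labels):
    # Assumed weighting: 1 / (class size), so that the weights of each
    # class sum to 1 and every class has equal total influence.
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def stratified_gradient(per_example_grads, labels):
    # Scheme 1 (sketch): accumulate a weighted sum of per-example
    # derivatives, as would be done during the backward pass.
    weights = class_weights(labels)
    total = [0.0] * len(per_example_grads[0])
    for w, grad in zip(weights, per_example_grads):
        for j, g in enumerate(grad):
            total[j] += w * g
    return total


def balanced_resample(rare, common, rng):
    # Scheme 2 (sketch): bootstrap the rare class (sampling with
    # replacement) and subsample the common class (without replacement)
    # down to the same size. Assumes len(common) >= len(rare).
    n = len(rare)
    boot_rare = [rng.choice(rare) for _ in range(n)]
    sub_common = rng.sample(common, n)
    return boot_rare, sub_common


def majority_vote(votes):
    # Combine the class predictions of several trained networks.
    return Counter(votes).most_common(1)[0][0]
```

In use, one network would be trained per call to `balanced_resample`, and `majority_vote` would be applied to the networks' predictions for each test example.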
Results: Both schemes give rare event classes a better chance of being included in the sample used to train the neural networks, and thus improve classification accuracy for rare event detection. Experiments on two sets of human DNA sequences and one set of Gaussian data indicate that the proposed schemes have the potential to significantly improve the accuracy with which neural networks recognize rare events.