    J Chem Inf Model. 2005 May-Jun;45(3):777-85.

    k Nearest neighbors QSAR modeling as a variational problem: theory and applications.

    Source

    Laboratory for Molecular Modeling, School of Pharmacy, University of North Carolina at Chapel Hill, North Carolina 27599-7360, USA.

    Abstract

    Variable selection k Nearest Neighbor (kNN) QSAR is a popular nonlinear methodology for building correlation models between chemical descriptors of compounds and biological activities. The models are built by finding a subspace of the original descriptor space where activity of each compound in the data set is most accurately predicted as the averaged activity of its k nearest neighbors in this subspace. We have formulated the problem of searching for the optimized kNN QSAR models with the highest predictive power as a variational problem. We have investigated the relative contribution of several model parameters such as the selection of variables, the number (k) of nearest neighbors, and the shape of the weighting function used to evaluate the contributions of k nearest neighbor compound activities to the predicted activity of each compound. We have derived the expression for the weighting function which maximizes the model performance. This optimization methodology was applied to several experimental data sets divided into the training and test sets. We report a significant improvement of both the leave-one-out cross-validated R² (q²) for the training sets and predictive R² of the test sets in all cases. Depending on the data set, the average improvements in the prediction accuracy (prediction R²) for the test sets ranged between 1.1% and 94% and for the training sets (q²) between 3.5% and 118%. We also describe a modified computational procedure for model building based on the use of relational databases to store descriptors and calculate compounds' similarities, which simplifies calculations and increases their efficiency.
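
The prediction scheme described in the abstract (activity predicted as a weighted average over the k nearest neighbors in a selected descriptor subspace, scored by leave-one-out q²) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names `knn_loo_predictions` and `q2`, the `columns` argument standing in for a chosen descriptor subspace, Euclidean distance, and the example weighting `exp(-d)`. In particular, the weighting function below is not the optimized expression derived in the paper, which the abstract does not state.

```python
import numpy as np

def knn_loo_predictions(X, y, k=3, columns=None, weight_fn=None):
    """Leave-one-out kNN predictions.

    Each compound's activity is predicted as the weighted average of the
    activities of its k nearest neighbors (the compound itself excluded),
    with distances measured in the descriptor subspace given by `columns`.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if columns is not None:
        X = X[:, columns]            # hypothetical variable selection: keep a subset of descriptors
    if weight_fn is None:
        def weight_fn(d):            # default: equal weights, i.e. a plain average
            return np.ones_like(d)

    # Pairwise Euclidean distances (an assumed metric) between all compounds.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)   # leave-one-out: a compound is never its own neighbor

    preds = np.empty(len(y))
    for i in range(len(y)):
        nn = np.argsort(dist[i])[:k]         # indices of the k nearest neighbors
        w = weight_fn(dist[i, nn])           # weights as a function of distance
        preds[i] = np.dot(w, y[nn]) / w.sum()
    return preds

def q2(y, preds):
    """Cross-validated R^2: 1 - sum((y - pred)^2) / sum((y - mean(y))^2)."""
    y = np.asarray(y, dtype=float)
    ss_res = ((y - preds) ** 2).sum()
    ss_tot = ((y - y.mean()) ** 2).sum()
    return 1.0 - ss_res / ss_tot

# Toy data (made up for illustration, not from the paper).
X_toy = [[0.0, 5.0], [1.0, 9.0], [2.0, 1.0], [3.0, 7.0]]
y_toy = [0.0, 1.0, 2.0, 3.0]

# Plain average over k=2 neighbors, using only the first descriptor.
preds_uniform = knn_loo_predictions(X_toy, y_toy, k=2, columns=[0])

# Same model with an illustrative distance-decaying weight (an assumption).
preds_weighted = knn_loo_predictions(
    X_toy, y_toy, k=2, columns=[0], weight_fn=lambda d: np.exp(-d)
)

print(q2(y_toy, preds_uniform), q2(y_toy, preds_weighted))
```

Comparing q² across different choices of `columns`, `k`, and `weight_fn` mirrors, in a very reduced form, the three model parameters whose relative contributions the abstract says were investigated.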

    PMID: 15921467
    [PubMed - indexed for MEDLINE]
