Therapeutic antibody development requires selection and engineering of molecules with high affinity and other drug-like biophysical properties. However, optimizing antibody properties such as affinity is often detrimental to other properties such as stability and specificity, which can compromise safety and efficacy. Due to inherent tradeoffs between drug-like biophysical properties, co-optimization of multiple antibody properties remains a difficult and time-consuming process that impedes drug development. Here we have evaluated the use of machine learning to greatly simplify the identification of antibodies with co-optimal levels of affinity and specificity for a clinical-stage antibody (emibetuzumab) that displays high levels of both on-target (affinity) and off-target (non-specific) binding. We mutated sites in the antibody complementarity-determining regions that were predicted to mediate non-specific binding, sorted the antibody libraries for high and low levels of affinity and non-specific binding, and deep sequenced the enriched libraries. Interestingly, we found that machine learning models developed using binary datasets and supervised dimensionality reduction enabled predictions of continuous metrics that were strongly correlated with antibody affinity and non-specific binding. These models illustrated strong tradeoffs between antibody affinity and specificity, as increases in affinity along the co-optimal (Pareto) frontier required progressive reductions in specificity. Notably, models trained with deep learning features enabled extrapolation to predict novel antibody mutations that co-optimized affinity and specificity beyond what was possible for the original antibody library. These findings demonstrate the power of machine learning models to greatly expand the exploration of novel antibody sequence space and accelerate the development of highly potent, drug-like antibodies.
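
The co-optimal (Pareto) frontier mentioned above can be illustrated with a minimal sketch. It assumes each antibody variant has already been assigned two predicted scores, one for affinity and one for specificity, where higher is better for both. The function name `pareto_frontier` and the example scores are hypothetical and are not taken from the study; this is a generic non-dominated filter, not the authors' pipeline.

```python
from typing import List, Tuple

Score = Tuple[float, float]  # (predicted affinity, predicted specificity); assumed higher = better


def pareto_frontier(points: List[Score]) -> List[Score]:
    """Return the variants that no other variant beats on both scores at once."""
    # Sort by affinity (descending), breaking ties by specificity (descending).
    ordered = sorted(set(points), key=lambda p: (-p[0], -p[1]))
    frontier: List[Score] = []
    best_specificity = float("-inf")
    for affinity, specificity in ordered:
        # Every variant seen so far has affinity >= this one, so this variant
        # is non-dominated only if its specificity is strictly higher.
        if specificity > best_specificity:
            frontier.append((affinity, specificity))
            best_specificity = specificity
    return frontier


# Hypothetical scores for six variants (illustrative values only).
variants = [(0.9, 0.2), (0.8, 0.5), (0.7, 0.4), (0.5, 0.9), (0.5, 0.6), (0.3, 0.9)]
frontier = pareto_frontier(variants)
print(frontier)  # [(0.9, 0.2), (0.8, 0.5), (0.5, 0.9)]
```

Reading the result from left to right, each step down in affinity along the frontier buys an increase in specificity, which is the tradeoff the abstract describes.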