Mol Cell Proteomics. 2015 Nov;14(11):2947-60. doi: 10.1074/mcp.M115.050245. Epub 2015 Aug 26.

Machine Learning-based Classification of Diffuse Large B-cell Lymphoma Patients by Their Protein Expression Profiles.

Author information

From the ‡Proteomics and Signal Transduction Group and §Computational Systems Biochemistry, Max Planck Institute of Biochemistry, D-82152 Martinsried, Germany.
¶Institute of Pathology, Campus Benjamin Franklin, Molecular Diagnostics, Charité-Universitätsmedizin Berlin, 12200 Berlin, Germany, and.
‖Institute of Oncology and Hematology, III. Medizinische Klinik, Technische Universität München, 81675 Munich, Germany.


Characterization of tumors at the molecular level has improved our knowledge of cancer causation and progression. Proteomic analysis of their signaling pathways promises to enhance our understanding of cancer aberrations at the functional level, but this requires accurate and robust tools. Here, we develop a state-of-the-art quantitative mass spectrometric pipeline to characterize formalin-fixed paraffin-embedded tissues of patients with closely related subtypes of diffuse large B-cell lymphoma. We combined a super-SILAC approach with label-free quantification (hybrid LFQ) to address situations where the protein is absent in the super-SILAC standard but present in the patient samples. Shotgun proteomic analysis on a quadrupole Orbitrap quantified almost 9,000 tumor proteins in 20 patients. The quantitative accuracy of our approach allowed the segregation of diffuse large B-cell lymphoma patients according to their cell of origin using both their global protein expression patterns and the 55-protein signature obtained previously from patient-derived cell lines (Deeb, S. J., D'Souza, R. C., Cox, J., Schmidt-Supprian, M., and Mann, M. (2012) Mol. Cell. Proteomics 11, 77-89). Expression levels of individual segregation-driving proteins, as well as categories such as extracellular matrix proteins, behaved consistently with known trends between the subtypes. We used machine learning (support vector machines) to extract candidate proteins with the highest segregating power. A panel of four proteins (PALD1, MME, TNFAIP8, and TBC1D4) is predicted to classify patients with low error rates. Highly ranked proteins from the support vector analysis revealed differential expression of core signaling molecules between the subtypes, elucidating aspects of their pathobiology.
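As a rough illustration of how a support vector machine can rank proteins by segregating power, the sketch below trains a linear SVM on a tiny synthetic expression matrix and orders the features by the absolute value of their weights. It is not the authors' pipeline: the abstract does not state the kernel, the ranking criterion, or the software used, so the linear model, the subgradient-descent training, the function names (`train_linear_svm`, `rank_features`, `predict`), the placeholder protein names, and all expression values are assumptions made for this example only.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by full-batch subgradient descent on the hinge loss.

    X: list of samples, each a list of feature values (e.g. log ratios).
    y: list of class labels, +1 or -1 (e.g. the two subtypes).
    Returns the weight vector w and the bias b.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin: hinge loss is active
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def rank_features(w, names):
    """Order feature names by decreasing absolute SVM weight."""
    return [name for _, name in sorted(zip((abs(v) for v in w), names), reverse=True)]


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Synthetic data (invented for illustration): six samples, three placeholder proteins.
# Only protein_A differs systematically between the two classes.
NAMES = ["protein_A", "protein_B", "protein_C"]
X = [
    [2.0, 0.1, -0.2],
    [1.8, -0.3, 0.1],
    [2.2, 0.2, 0.3],
    [-2.0, 0.2, -0.1],
    [-1.9, -0.1, 0.2],
    [-2.1, 0.1, -0.3],
]
Y = [1, 1, 1, -1, -1, -1]

if __name__ == "__main__":
    w, b = train_linear_svm(X, Y)
    print(rank_features(w, NAMES))
```

In this toy setting the weight on `protein_A` dominates, so it ranks first. With real data, the ranking would normally be evaluated under cross-validation so that the selected panel is not fitted to the same patients used to estimate its error rate.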

[Indexed for MEDLINE]
Free PMC Article
