msImpute: Estimation of Missing Peptide Intensity Data in Label-Free Quantitative Mass Spectrometry

Mol Cell Proteomics. 2023 Aug;22(8):100558. doi: 10.1016/j.mcpro.2023.100558. Epub 2023 Apr 25.

Abstract

Mass spectrometry (MS) enables high-throughput identification and quantification of proteins in complex biological samples and can provide insights into the global function of biological systems. Label-free quantification is cost-effective and suitable for the analysis of human samples. Despite rapid developments in label-free data acquisition workflows, the number of proteins quantified across samples can be limited by technical and biological variability. This variation can result in missing values, which in turn can challenge downstream data analysis tasks. General purpose or gene expression-specific imputation algorithms are widely used to improve data completeness. Here, we propose an imputation algorithm designed for label-free MS data that is aware of the type of missingness affecting the data. On published datasets acquired by data-dependent and data-independent acquisition workflows with variable degrees of biological complexity, we demonstrate that the proposed missing value estimation procedure by barycenter computation competes closely with state-of-the-art imputation algorithms in differential abundance tasks, outperforms them in the accuracy of variance estimates of the peptide abundance measurements, and better controls the false discovery rate in label-free MS experiments. The barycenter estimation procedure is implemented in the msImpute software package and is available from the Bioconductor repository.
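
The abstract names barycenter computation but does not describe it. As a rough illustration of the general idea only, and not the msImpute implementation, the Python sketch below computes the 2-Wasserstein barycenter of two one-dimensional Gaussian estimates, which is itself Gaussian with mean and standard deviation equal to the weighted averages of the inputs. The function name, the two candidate estimates (one standing in for a missing-at-random model, one for a left-censored model), their numeric values, and the equal weights are all hypothetical assumptions chosen for the example.

```python
import numpy as np


def gaussian_w2_barycenter(means, sds, weights):
    """2-Wasserstein barycenter of 1-D Gaussians (hypothetical helper).

    For 1-D Gaussians the barycenter is again Gaussian; its mean and
    standard deviation are the weighted averages of the input means
    and standard deviations.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1
    return float(np.dot(w, means)), float(np.dot(w, sds))


# Hypothetical log-intensity estimates for one missing peptide value
# (assumed numbers, not taken from the paper):
mar_mean, mar_sd = 20.0, 1.0    # estimate under a missing-at-random model
mnar_mean, mnar_sd = 16.0, 0.5  # estimate under a left-censored model

bary_mean, bary_sd = gaussian_w2_barycenter(
    [mar_mean, mnar_mean], [mar_sd, mnar_sd], [0.5, 0.5]
)
print(bary_mean, bary_sd)
```

With equal weights the combined estimate sits midway between the two inputs in both location and spread, which shows in miniature how a barycenter can retain variance information that a single point estimate would discard.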

Keywords: barycenter computation; estimation of variance; imputation; label-free quantification; missing data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Humans
  • Mass Spectrometry / methods
  • Peptides* / analysis
  • Proteins

Substances

  • Peptides
  • Proteins