Establishing a consensus for the hallmarks of cancer based on gene ontology and pathway annotations

BMC Bioinformatics. 2021 Apr 6;22(1):178. doi: 10.1186/s12859-021-04105-8.

Abstract

Background: The hallmarks of cancer provide a highly cited and well-used conceptual framework for describing the processes involved in cancer cell development and tumourigenesis. However, methods for translating these high-level concepts into data-level associations between hallmarks and genes (for high-throughput analysis) vary widely between studies. Examining the different strategies used to associate and map cancer hallmarks reveals significant differences, but also consensus.

Results: Here we present the results of a comparative analysis of cancer hallmark mapping strategies, based on Gene Ontology and biological pathway annotation, from different studies. By analysing the semantic similarity between annotations, and the resulting gene set overlap, we identify emerging consensus knowledge. In addition, we analyse the differences between hallmark and gene set associations using Weighted Gene Co-expression Network Analysis and enrichment analysis.

Conclusions: Reaching a community-wide consensus on how to identify cancer hallmark activity from research data would enable more systematic data integration and comparison between studies. These results highlight the current state of the consensus and offer a starting point for further convergence. In addition, we show how a lack of consensus can lead to large differences in the biological interpretation of downstream analyses and discuss the challenges of annotating changing and accumulating biological data, using intermediate knowledge resources that are also changing over time.

Keywords: Co-expression network; Gene ontology; Semantic similarity; The hallmarks of cancer.

MeSH terms

  • Consensus
  • Gene Ontology*
  • Humans
  • Molecular Sequence Annotation
  • Neoplasms* / diagnosis
  • Neoplasms* / genetics
  • Semantics*