Methylation profiling by high throughput sequencing
The field of cancer genomics has been empowered by increasingly sophisticated inference tools to distinguish driver mutations from the vastly greater number of passenger mutations. Similarly, promoter DNA hypermethylation has been shown to drive cancer through inactivation of tumor suppressor genes (TSGs), but growing malignant populations also accrue pervasive stochastic epigenetic changes in DNA methylation (DNAme), which likely carry little functional impact. However, the development of DNAme inference methods has lagged behind, providing limited ability to robustly differentiate driver DNAme changes (epidrivers) from stochastic, passenger DNAme changes. To address this challenge, we developed MethSig, a statistical inference framework that accounts for the varying stochastic hypermethylation rates across the genome and between samples. MethSig estimates the expected background DNAme changes, thereby allowing the identification of epigenetically disrupted loci, where observed hypermethylation significantly exceeds expectation, potentially reflecting positive selection. We applied MethSig to reduced representation bisulfite sequencing (RRBS) data from 407 chronic lymphocytic leukemia (CLL) samples, including 304 CLLs collected in a prospective clinical trial. MethSig yielded well-calibrated quantile-quantile (Q-Q) plots, in contrast to commonly used statistical methods, and reproducible inference of epidrivers across independent cohorts. MethSig also provided robust inference in additional cancer types, including multiple myeloma (MM, n = 44) and ductal carcinoma in situ (DCIS, n = 24). To further validate MethSig’s inferences, we knocked out selected CLL candidate epidrivers (DUSP22, RPRM, or SASH1) in CLL cells using CRISPR/Cas9; the knockout cells showed superior fitness under ibrutinib and fludarabine treatment compared with controls. Candidate epidrivers are enriched in known TSGs, and in genes hypermethylated or inactivated across cancer types.
Notably, a greater number of epidrivers was closely associated with adverse outcome in CLL in a multivariable model accounting for additional adverse prognostic markers. Application of MethSig to CLL that relapsed after chemoimmunotherapy further identified relapse-specific epidrivers, enriched in TP53 targets as well as in the DNA damage pathway. Collectively, MethSig represents a novel framework for robust identification of epidrivers to chart the role of aberrant DNAme in cancer.
Using RRBS, we profiled DNA methylation patterns of CLL tumor samples.
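
The summary above judges calibration by Q-Q plots of p-values. As a minimal sketch of what such a check involves (not MethSig's own code), the Python below pairs sorted observed p-values with the quantiles expected under a uniform null. The function names `qq_points` and `max_deviation`, the `(rank - 0.5) / n` plotting position, and the toy p-values are all assumptions made for illustration; they do not come from the record.

```python
import math


def qq_points(pvalues):
    """Return (expected, observed) pairs of -log10 p-values for a Q-Q plot.

    Expected values assume p-values are uniform on (0, 1] under the null,
    using the plotting position (rank - 0.5) / n (an illustrative choice).
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    n = len(pvalues)
    points = []
    # Sort ascending so the smallest p-value gets rank 1.
    for rank, p in enumerate(sorted(pvalues), start=1):
        if not 0.0 < p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        expected_p = (rank - 0.5) / n
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points


def max_deviation(points):
    """Largest vertical distance of any point from the diagonal y = x."""
    return max(abs(observed - expected) for expected, observed in points)


# Toy data (hypothetical): one set that follows the null exactly,
# and one set that is systematically too small (inflated significance).
calibrated = [(i - 0.5) / 10 for i in range(1, 11)]
inflated = [p ** 3 for p in calibrated]

calibrated_dev = max_deviation(qq_points(calibrated))
inflated_dev = max_deviation(qq_points(inflated))
```

In this sketch a well-calibrated test keeps most points near the diagonal, with only a few loci departing in the upper tail, whereas a test that ignores background variation tends to drift away from the diagonal across the whole range.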