Cancer Cells Enter an Adaptive Persistence to Survive Radiotherapy and Repopulate Tumor

Abstract Repopulation of residual tumor cells impedes curative radiotherapy, yet the mechanism is not fully understood. It is recently appreciated that cancer cells adopt a transient persistence to survive the stress of chemo‐ or targeted therapy and facilitate eventual relapse. Here, it is shown that cancer cells likewise enter a “radiation‐tolerant persister” (RTP) state to evade radiation pressure in vitro and in vivo. RTP cells are characterized by enlarged cell size with complex karyotype, activated type I interferon pathway and two gene patterns represented by CST3 and SNCG. RTP cells have the potential to regenerate progenies via viral budding‐like division, and type I interferon‐mediated antiviral signaling impaired progeny production. Depleting CST3 or SNCG does not attenuate the formation of RTP cells, but can suppress RTP cells budding with impaired tumor repopulation. Interestingly, progeny cells produced by RTP cells actively lose their aberrant chromosomal fragments and gradually recover back to a chromosomal constitution similar to their unirradiated parental cells. Collectively, this study reveals a novel mechanism of tumor repopulation, i.e., cancer cell populations employ a reversible radiation‐persistence by poly‐ and de‐polyploidization to survive radiotherapy and repopulate the tumor, providing a new therapeutic concept to improve outcome of patients receiving radiotherapy.

[N], untreated cells; [P], pre-budding RTP cells post 8Gy; [B], budding RTP cells of day 9 post 8Gy; [S], budded progeny cells of day 9 post 8Gy. *p < 0.05, **p < 0.01, ***p < 0.001; n.s., not significant.    *p < 0.05, **p < 0.01, ***p < 0.001; n.s., not significant  For each pair of TSP with expression of gene A>B, the true score gets one point. When the cumulative true score is more than 7, prognosis of this patient is considered "sensitive"; otherwise, this patient is assigned to "resistant". Thereinto, 24 pairs derived from locoregionally advanced nasopharyngeal cancer patients with or without metastasis after radical radiotherapy; 8 pairs were from glioblastoma PDXs with acquired radio-resistance and untreated radio-sensitive status. Paired t test.

Supplemental materials and methods
Colony formation assay: Clonogenic assay was performed using the published instructions. [1] Briefly, cells were irradiated with different doses of X-rays and then seeded at different numbers (200 for 0Gy, 400 for 2Gy, 2500 for 4Gy, 5000 for 6Gy, 50,000 for 8Gy and 100,000 for 10Gy) in triplicate in 6-well plates. For colony formation of giant cells, 50,000 or 10,000 cells were seeded. When colonies were clearly visible after 12-14 days, cells were fixed and stained with 0.5% crystal violet. Colonies of more than 50 cells were enumerated and survival fractions were then calculated.
For the soft agar assay, tumor cells were suspended in a soft agar upper layer (0.35% agarose), which were overlaid on the lower layer (0.6% agarose) in 6-well plates. The colonies were then stained with 0.005% crystal violet and counted with Image J software. To evaluate the contribution of PACC in clonogenic formation after irradiation, clonogenic survival assay (at a density of 30,000 cells per well) and soft agar assay (5000 cells per well) were applied simultaneously for both giant and small cells separated at day 6. EdU cell proliferation assay: EdU Cell Proliferation Assay (EdU-647, Sigma-Aldrich) was used per manufacturer's instructions. In short, cells were pulsed for 2 hours before harvesting by directly adding EdU solution to the cultured complete media. After cell fixation using 4% paraformaldehyde and permeabilization with 0.1% Triton X-100, cells were subjected to EdU click reaction. Combined Edu staining and IF was performed by subsequently blocked with 5% BSA in PBS and incubation of appropriate primary and secondary antibodies as described above. Images were acquired by fluorescence microscope (Leica). with an X-gal staining solution. For co-staining with other markers, a commercial SPiDER-β-Gal kit (Dojindo) was used for fluorescent imaging according to the manufacturer's instruction.
Briefly, after the incubation of primary antibody, SPiDER-β-Gal solution together with the secondary antibody was added to fixed cells. Images with bright field or fluorescence were collected by microscope (Leica) and counted with ImageJ software.

Functional enrichment analysis, including GO (Gene ontology) analysis and KEGG (Kyoto
Encyclopedia of Genes and Genomes) analysis, was performed with ClusterProfiler. Fisher's exact test was applied to identify the significant pathway categories and FDR was used to correct the p-values. Gene Set Enrichment Analysis (GSEA) was performed using GSEA software (v4.0.3) (www.broadinstitute.org/gsea) using Hallmark Gene Sets from MSigDB (Broad Institute). [9] The Monocle2 package (v2.18.0) was used for pseudotime analysis indicating the evolvement from different stages. [10] Analysis of chromosome copy number alteration was inferred using the R package inferCNV (v1.0.4).
For 10x single-cell RNA-sequencing, raw sequencing data were filtrated and aligned to the reference genome GRCh38 using CellRanger pipeline (10x Genomics, v3.1.0). For downstream processing, Seurat package (v.2.3) [11] (https://satijalab.org/seurat) was used for quality control, cellular filtration, data normalization, clustering and DEG analysis. Only cells that expressed at least 200 genes and had less than 10% of mitochondrial unique molecular identifiers (UMI) counts were kept for further processing. After removing low-quality data, the filtered matrix was performed log2 standardization. To reduce the high-dimensional transcriptome, principal component analysis (PCA) was performed on the highly varying genes.
The top 20 principle components were then chosen for further tSNE (t-Distributed Stochastic Neighbor Embedding) clustering and visualization using the RunTSNE and TSNEPlot function of Seurat. To characterize these clusters, DEG between consecutive clusters were identified with Limma and thresholds for differential expression were fold-change≥ 1.5 and FDR≤0.05. [12] GO and KEGG pathway analysis were carried out using the ClusterProfiler R package. Cell cycle phases were computed based on gene expression profile of cell cycle using Cyclone function. [13] For single-cell pseudotime trajectories, Monocle 2 was performed with the normalized expression data. [10] For quality control of bulk RNA-seq data, we firstly applied fastp filtering the adaptor sequence and removed the low quality reads. [2] Clean data were then mapped to the reference genome (GRCh38) by HISAT2 (v 2.0) with default parameters. [14] Gene expression levels of the transcripts were estimated by HTSeq (v 0.6.1). RPKM/FPKM (Reads/ Fragments Per Kilobase Million Reads) was used to standardize the matrix data. DEseq2 was sequentially used to filter the DEG and significant DEG was defined as i), log2FC > 0.585 or < -0.585; ii), FDR < 0.05. [8] To evaluate the enrichment scores for each sample, single-sample GSEA (ssGSEA) was implemented using GSVA (v 1.30.0) package with ssGSEA method. [15] The molecular signatures used (including stress [16] , senescence [17] and stemness [18] related gene set and EpiHR [19] signature) were listed in Table.6. The stem cell index [20] , including CBC and RSC index was calculated using the R package ISCindex (https://github.com/gnvalbuena/ ISCindex).

Quantitative real-time polymerase chain reaction (qRT-PCR)
Tumor contents of FFPE tissue sections were evaluated and validated by a pathologist. Tumor RNA was then extracted from the FFPE sections using FFPE RNA Extraction Kit (AmoyDx), quantified and qualified using NanoDrop Spectrophotometer (Thermo Scientific) and reverse transcription into cDNA was performed with the PrimeScript™ RT Master Mix Kit (Takara). qPCR was performed with TB Green® Premix Ex Taq™ Kit (Takara) following the manufacturer's instructions. The primers used were listed in Table 2. Relative gene expression was analyzed using the comparative method (2 − △△ CT ) and results were obtained at three independent experiments.

Public datasets used:
The treatment details and transcript profiles of residual tumor models used in Figure S2I were obtained from the publicly available GEO database GSE162285. [21] For each model, we examined their ssGSEA scores with transcript expression (residual versus untreated baseline) enriched in molecular profile or gene sets extracted from literature. [16a, 22] The rectal cancer transcriptomes pre-and post-neoadjuvant chemo-radiation used in Figure7G-H were downloaded from GEO with accession numbers GSE94104 and GSE190826.
Integration of array and RNA-seq datasets were performed using Rank-In (http://www.baddcao.net/rank-in/index).
Datasets of oncotherapy-treated patients including transcript matrix and clinical annotations, were acquired from The Cancer Genome Atlas Program (TCGA) database with the R package TCGAbiolinks (v 2.13.3). Based on their courses of treatment, the TCGA cases covering 29 cancer types were divided into the following 3 groups: radiotherapy only (368 patients), chemotherapy (592 patients) and concurrent chemo-radiation (207 patients). Among patients received radiotherapy only, according to their clinical response status, they were classified as sensitive to radiation, including clinical complete response (CR), partial response (PR) and stable disease, or assigned to resistance to radiation with radiographic progressive or recurrent diseases.
The treatment details and transcriptome of nasopharyngeal cancer patients and glioblastoma PDXs in Figure S6K were obtained from dataset GSE103611 [23] and GSE206225 [24] .

K-Top-scoring Pair classifier (KTSP):
The KTSP algorithm provides a rank-based approach for classification, which has been developed successfully for predicting treatment response in several cancer types. [25] In this study, we firstly identified 48 DEG of budding RTP cells relative to untreated baseline grounding on the SMART-seq results. To simplify the 48-budding DEG in prognostic prediction, we enrolled 368 patients only received radiotherapy as training set to record the relative ordering of each pair of genes in the budding DEG. Based on clinical response of these patients, 11 top-scoring gene pairs were built for prognostic estimation. Each pair of genes "votes" for poor prognosis and the final score is the sum of all votes among all 11 pairs. If the final score is less than 7, the patient is then assigned to sensitive group; otherwise, the patient is predicted to be resistant to treatment. Kaplan-Meier model and log rank test were then used to evaluate the effect of 11TSPs classifier on survival prediction in all three TCGA cohorts. A multivariate Cox proportional hazards regression model was constructed to evaluate whether the 11TSPs classifier was an independent prognostic factor. Logistic regression models using R package glmnet was performed to assess the efficiency of 11TSPs classifier in prognostic prediction.