The SARS-CoV-2 Transcriptome and the Dynamics of the S Gene Furin Cleavage Site in Primary Human Airway Epithelia

ABSTRACT The spike (S) polypeptide of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) consists of the S1 and S2 subunits and is processed by cellular proteases at the S1/S2 boundary that contains a furin cleavage site (FCS), 682RRAR↓S686. Various deletions surrounding the FCS have been identified in patients. When SARS-CoV-2 propagated in Vero cells, it acquired deletions surrounding the FCS. We studied the viral transcriptome in Vero cell-derived SARS-CoV-2-infected primary human airway epithelia (HAE) cultured at an air-liquid interface (ALI) with an emphasis on the viral genome stability of the FCS. While we found overall the viral transcriptome is similar to that generated from infected Vero cells, we identified a high percentage of mutated viral genome and transcripts in HAE-ALI. Two highly frequent deletions were found at the FCS region: a 12 amino acid deletion (678TNSPRRAR↓SVAS689) that contains the underlined FCS and a 5 amino acid deletion (675QTQTN679) that is two amino acids upstream of the FCS. Further studies on the dynamics of the FCS deletions in apically released virions from 11 infected HAE-ALI cultures of both healthy and lung disease donors revealed that the selective pressure for the FCS maintains the FCS stably in 9 HAE-ALI cultures but with 2 exceptions, in which the FCS deletions are retained at a high rate of >40% after infection of ≥13 days. Our study presents evidence for the role of unique properties of human airway epithelia in the dynamics of the FCS region during infection of human airways, which is likely donor dependent.

adjacent amino acids upstream or downstream of the FCS. The loss of the FCS has been identified in progeny virions replicated in Vero cells (17,(32)(33)(34)(35). The mutant viruses were stable, quickly took over the wild-type (WT) virus, and became the dominant population during passaging. Of note, various deletions surrounding the FCS have been identified in patients. This raises the question of how the FCS region deletions are selected in human airways.
In this study, we used transcriptome sequencing (RNA-seq) to analyze the viral transcriptome of SARS-CoV-2 in the infected human airway epithelia (HAE) cultured at an air-liquid interface (HAE-ALI), which mimics natural viral infection of human airways (36,37). While the viral transcriptome overall recapitulated that in Vero cells, we discovered that there is a selective pressure in HAE-ALI to suppress the deletions at the S1/S2 boundary and that this pressure appears individual donor dependent. We identified two FCS region deletions that are strikingly amplified in two HAE-ALI cultures after 2 to 3 weeks of infection, whereas these deletions were suppressed in nine other HAE-ALI cultures.

RESULTS
The SARS-CoV-2 transcriptome in SARS-CoV-2-infected HAE-ALI cultures. HAE-ALI B2-20 cultures were infected with SARS-CoV-2 at a multiplicity of infection (MOI) of 0.2 or 2 or mock infected. At 4 days postinfection (dpi), immunofluorescence assay for the SARS-CoV-2 N protein expression revealed effective SARS-CoV-2 infection in these cultures, with ;10% and ;30% of cells positive in the infections at MOIs of 0.2 and 2, respectively ( Fig. 1). This result was similar to our previous observation (37).
Total RNA samples were extracted from infected HAE-ALI cultures at 4 dpi and subjected to reverse transcription, followed by DNA nanoball sequencing. An average total reads of 18.27% and 26.54% were mapped to the SARS-CoV-2 reference genome (Wuhan-Hu-1 isolate; GenBank accession no. MN908947) in the groups of MOI 0.2 and MOI 2, respectively (Table 1). No significant difference was observed in the total reads in the two groups. Notably, the RNA-seq data obtained from SARS-CoV-2-infected Vero-E6 cells had up to 70% of the reads mapped to the viral genome (15), which was likely due to the high infectivity of Vero-E6 cells and that not all the cell types in HAE- . At 4 days postinfection, a piece of the insert membrane was fixed in 4% paraformaldehyde in PBS at 4°C overnight and subjected to direct immunofluorescence analysis. The membranes were stained with anti-SARS-CoV-2 N protein (NP). Images were taken on a Leica TCS SPE confocal microscope under 40Â, which was controlled by Leica Application Suite X software. The nuclei were stained with DAPI (49,6-diamidino-2-phenylindole). Bar, 20 mm.
SARS-CoV-2 Transcriptome and the Dynamics of the FCS ® ALI are permissive to the infection (37). Also, for the whole viral genome coverage, in contrast to the observation in SARS-CoV-2-infected Vero-E6 cells (15), we did not observe an obvious 59-leader peak in the infected HAE cells (Fig. 2). Instead, we observed ;2-fold higher reads in the 39 end than that in the 59-end viral genome.
We further analyzed the viral sgRNA expression in infected HAE cells. Junction-spanning reads covering the 59 leader and different sgRNAs were counted and analyzed (see Data Set S1 in the supplemental material). Different sgRNAs were abundantly expressed in infected HAE cells. As the negative-strand intermediates are only ;1% as abundant as their positive-sense counterparts (22,38), this indicates most of the identified sgRNAs were 1sgRNAs. N-protein-encoding RNA was the most abundantly expressed viral transcript and accounted for 23.11% and 16.93% of total junction-spanning reads in the groups of MOI 0.2 and MOI 2, respectively, followed by ORF3a, ORF7a, M, ORF8, S, E, ORF6 coding RNAs (Fig. 3). The junction-spanning reads associated with ORF7b and ORF9a/b were identified at a level of 0.01% or less of the total junction-spanning reads and were identified in only part of all the six samples in two MOI groups (Data Set S1). In SARS-CoV-2-infected HAE cells, S RNA transcript was expressed at a ratio of ;2% of total junction-spanning reads in both groups (Fig. 3), compared to that of ;8% in Vero cells. We detected relatively higher level of ORF3a (;8%) transcript in SARS-CoV-2-infected HAE cells (Fig. 3), in contrast to 5.22% in infected Vero cells (15).
Interestingly, in all identified spanning-junction reads, only ;50% correlated with the canonical sgRNA transcripts in both MOI infection groups. The other half junctionspanning reads represent either reads covering 59-leader sequence but with unex- pected 39 sites located in the middle of annotated ORFs or reads covering between different ORFs or inside an ORF without 59-leader sequence (Data Set S1). It is important to note that a lot of these noncanonical junction-spanning patterns were supported by only one read from the RNA-seq data, indicating that these noncanonical transcripts may arise from erroneous replicase activity. Identification of deletions surrounding the furin cleavage site at S1/S2. Among the ;50% noncanonical junction-spanning reads, we identified a high abundant 36-bp deletion, mut-del1, located at nucleotide (nt) 23,594 to 23,629 spanning the FCS (Fig. 4A) that encodes amino acids (aa) 678 TNSPRRAR;SVAS 689 ( Fig. 4B, ";" indicates cleavage). It displayed at frequencies of 21.04% and 14.79% of total junction-spanning reads in MOI 0.2 and MOI 2 groups, respectively ( Table 2). Another 15-bp deletion, mut-del2, located at nt 23,583 to 23,597 and encoding aa 675 QTQTN 679 (Fig. 4A), just two amino acids ahead of the FCS, was also identified. It accounted for 0.42% and 15.11% of the total junction-spanning reads in MOI 0.2 and MOI 2 groups, respectively ( Table 2). The ratio of mut-del1 is only slightly lower than the N sgRNA and nearly 10 times higher than the S sgRNA transcript, indicating a high fraction of this mutation comes from the viral genome (1gRNA).
To further reveal the ratio of the two deletions in total viral genome, the junctionspanning reads associated with the two deletions were normalized with the average reads covering the same deletions. The results showed that 26.31% and 6.67% of the Six total RNA samples were extracted from SARS-CoV-2-infected HAE-ALI cultures (at MOIs of 0.2 and 2, respectively) and subjected to whole RNA-seq. Three repeats in each MOI group were merged. Junction-spanning reads were identified using STAR (2.7.3a), and the transcript abundance, as shown as a percentage under HAE-ALI/MOI of 0.2 or 2, was estimated by counting the reads that span the junction of the corresponding RNA transcript. The left is the diagrammed subgenomic RNAs. The canonical junction-spanning reads related to each sgRNA were calculated, and the ratios are shown on right. The abundances of the subgenomic transcripts identified in Vero cells in a previous study (15) are listed for comparison. SARS-CoV-2 Transcriptome and the Dynamics of the FCS ® viral reads related to this region contain the mut-del1 deletion, while 0.37% and 8.17% of this region contain mut-del2 deletion in MOI 0.2 and MOI 2 groups, respectively ( Table 2). It should be noted that the total reads used for normalization include reads of both viral gRNA and sgRNAs. Thus, here we were unable to distinguish the origin of these two deletions from the viral genome and viral RNA transcripts in these total cellular transcriptome data.
Except for these two highly abundant mut-del1 and mut-del2 deletions, we also observed a 21-bp FCS deletion at nt 23,595 to 23,615, encoding aa 678 TNSPRRA 684 , but only in the MOI 2 group with 1.13% of the total junction-spanning reads, and a 39-bp deletion at the N terminus of the S protein (nt 21,743 to 21,781 encoding aa 61 NVTWFHAIHVSGT 73 ) with 0.27% and 0.60% of the total junction-spanning reads in MOI 0.2 and MOI 2 groups, respectively.
In addition to these deletions in the S gene, we identified about 50 different in- Zou et al.
® frame or frameshift deletions in the M encoding region that appeared in all six samples of both MOI groups, and there were even more deletions in the M coding region that appeared in only a part of the six RNA samples (Data Set S1). Although the ratio of single deletion was low, the 50 deletion patterns that appeared in all six RNA samples had the ratios of 2.39% and 3.18% in MOI 0.2 and MOI 2 groups, respectively, which is similar or even higher than the identified canonical junction-spanning reads related to M sgRNAs (Fig. 3). Notably, most of these identified deletion patterns of the M gene also appeared in SARS-CoV-2-infected Vero cells (15). Whether these deletions produce functional M protein or affect the function of M protein warrants further studies. In SARS-CoV-2-infected Vero-E6 cells, a high ratio of 27-bp deletion in E gene (nt 26,257 to 26,283) was identified (15), which, however, was not found in infected HAE cells.  (37). The apical virus release kinetics of the HAE-ALI KC19 is shown in Fig. 5A. Viral RNA was prepared either for RNA-seq or for PCR amplicon-seq of a 384nt sequence covering the FCS. Notably, mut-del1 was not significantly detected (,0.1%) in all the apically released viruses collected at .13 dpi (Table 3, Bx-20). Nevertheless, for viruses collected from HAE-ALI KC19 , the mut-del2 was detected at a high level (20.75% RNA-seq and 20.98% RNA-seq ) at 4 dpi and 13 dpi, respectively, which reached a close level of 41.79% PCR-seq at 21 dpi. Although the viruses derived from HAE-ALI B3-20 , HAE-ALI L209 , and HAE-ALI B4-20 contain a high level (23.17%, 30.33%, and 8.3%, respectively) of mut-del2 at 3 dpi, it decreased to a level of ,2% at $17 dpi (Table 3, Bx-20). HAE-ALI B9-20 never produced significant mut-del2 (,0.1%).

Dynamics of the FCS region deletions in virions apically released from SARS-CoV-2-infected HAE-ALI cultures derived from various donors. To further investigate the FCS region deletions during SARS-CoV-2 infection of HAE cells, we infected
Notably, SARS-CoV-2 isolate USA-WA1/2020 P0 stock provided by BEI was already passaged four times in Vero cells, and it was reported that there appeared significant heterogeneity at the S1/S2 boundary in Vero cell-propagated virus (32). To verify this, we sequenced the viral RNA of passage 0 (P0) (the originally received vial) and P1 (passaged once in Vero-E6 cells) virus stocks. The results showed that there was no detectable mut-del1 in the P0 stock but a high rate of 21.69% RNA-seq and 40.47 PCR-seq in the P1 stock. However, while there was ;2% of mut-del2 in the P0 stock, it slightly increased to only ;5% in the P1 stock (Table 3, P0 and P1). These results confirmed that even though there was no or a low level of mut-del1 and mut-del2 in the P0 stock, there was a high level of mut-del1 and a low level of mut-del2 in the P1 stock, which we used for infection of all HAE-ALI cultures.
Taken together, the results demonstrated that the mut-del1 appears at a very low rate in all the HAE-ALI produced viruses at late time points of infection ($17 dpi), which was confirmed by both RNA-seq and PCR amplicon-seq. Although the inoculum had the mut-del1 detected at a high frequency rate of 21.69% RNA-seq (40.47% PCR-seq ), this deletion was obviously suppressed when the virus propagated in the HAE-ALI cultures. In addition, except for the viruses produced from HAE-ALI KC19 , mut-del2 also    (Fig. 5B), indicating that the virus replicated in these two HAE-ALI cultures at a similar level. The sequencing results of the apically released viruses from infected HAE-ALI B15-20 showed that mut-del1 was detected at a rate of 31.87% at 3 dpi and an increased level of 54.22% at 13 dpi (Table 4, B15-20/mut-del1). However, mut-del2 was barely detectable at a low rate (0.18%) at 3 dpi, and this rate remained very low at 0.69% at 13 dpi (Table 4, B15-20/ mut-del2). For the viruses apically released from infected HAE-ALI B16-20 cultures, a stark difference was that the mut-del1 was suppressed during the infection period, whereas mut-del2 was also detected at very low levels (,1%) at both 3 and 13 dpi (Table 4, B16-20). The mut-del1 was detected at a rate of 6.78% at 3 dpi and nearly disappeared (at a rate of 0.08%) by the end of infection (13 dpi). The suppression of mut-del1 in HAE-ALI B16-20 cultures was similar to what was observed in the previously tested five ALI cultures ( Table 3). Now that we have tested seven HAE-ALI cultures from seven healthy donors, we next looked into the FCS deletions in four infected HAE-ALI cultures derived from bronchiolar epithelial cells isolated from four patients who had chronic obstructive pulmonary disease (COPD). While we did not observe significant differences in the titers of apically released viruses over the course of 21 days (Fig. 5A), we found that the FCS was largely retained at a rate of .99% in HAE-ALI N707 , HAE-ALI UC24 , and HAE-ALI L184 cultures at 3 dpi and in HAE-ALI N703 culture at 21 dpi (Table 5).
Together with the detection rates of the viruses produced from the infected HAE KC19 -ALI cultures, the above results suggested that the probability of FCS region deletions is dependent on the HAE-ALI cultures made from airway epithelial cells of different donors. In contrast to the HAE-ALI KC19 that produced a high rate of mut-del2, HAE-ALI B15-20 tended to generate the viruses that have a high rate of the mut-del1 deletion.

DISCUSSION
In this study, we analyzed the transcriptome of SARS-CoV-2 in polarized human bronchial airway epithelia, an in vitro model mimicking the SARS-CoV-2 infection in human lower airways (36,37). We found that the transcriptome in HAE-ALI reflects more closely the viral transcriptome in the airways of COVID-19 patients, providing further support for HAE-ALI as a physiologically relevant in vitro culture model to study SARS-CoV-2. Neither RNA-seq data of clinical SARS-CoV-2-positive nasopharyngeal specimens nor RNA-seq of SARS-CoV-2-infected HAE-ALI showed the 59-leader sequence read peak (39,40). In SARS-CoV-2-infected Vero-E6 cells, N sgRNAs accounted for up to 69% in total viral RNA transcripts, and S sgRNAs accounted for 8% of the total junction-spanning reads (15) (Fig. 3). Nevertheless, in HAE-ALI cultures, SARS-CoV-2 still expresses the abundant N protein transcript and a relatively low level of S gene mRNA. Of note, the overall sgRNA transcripts in infected HAE-ALI were mapped to only 50% of all the canonical sgRNAs, much lower than that in Vero cells (15), which is partially due to the high deletion rate of the FCS region derived from the inoculated virus (P1 stock, Table 3). Except for those arising from erroneous replicase events, some of these noncanonical transcripts may play unknown but functional roles in the coronavirus life cycle. Among all the SARS-CoV-2 viral genes, the S gene is the most variable one, in particular the S1/S2 junctional region which features the FCS. Increasing evidence has shown that the S1/S2 FCS region is highly unstable, and various deletions and mutations have been detected or isolated in SARS-CoV-2-infected Vero cells (17,(32)(33)(34)(35). A mutant with a 30-bp deletion, encoding aa 679 NSPRRAR;SVA 688 , showed enhanced replication ability in Vero cells and had the capability to dominate the genome population during passage in Vero cells (35). A 21-bp deletion encoding aa 679 NSPRRAR 686 was detected (.10%) in low (,2 to 3) passaged isolates (33). However, detection of the original clinical specimen where the mutant was derived and SARS-CoV-2-positive clinical specimens showed no such deletions (15 to 30 bp) in the FCS region (34), indicating that the mut-del1 or mut-del1-like (containing FCS) deletions are generated during the propagation in Vero cells. Apparently, SARS-CoV-2 is under strong selection pressure in Vero cells to acquire adaptive mutations in the S protein. Nevertheless, mut-del2 ( 675 QTQTN 679 ) has been identified not only in Vero cell-passaged isolates (41) but also in 3 of 68 clinical specimens (33), indicating that mut-del2 may be clinically more important (relevant) than mut-del1.
The S protein as a part of the viral envelope facilitates viral entry into infected cells. The S1 subunit contains the receptor binding domain, and the S2 domain mediates fusion of the viral envelope with a cellular membrane (24). The infectivity of SARS-CoV-2 necessitates the activation of S protein. There are two proteolytic activation events associated with S-mediated receptor binding and membrane fusion. The first is a priming cleavage that occurs at the S1/S2 boundary, and the second is the obligatory triggering cleavage that occurs within the S29 site (Fig. 4A). The priming cleavage at the S1/S2 boundary causes the conformation changes of the S1 subunit for receptor binding and of the S2 subunit for conversion of a fusion competent form, by enabling the S protein to better bind receptors or expose the hidden S29 cleavage site. The cleavage at S29 triggers the fusion of the viral envelope with the host cell membrane (24). Cleavage by furin at the S1/S2 site is required for subsequent transmembrane serine protease 2 (TMPRSS2)-mediated cleavage at the S29 site during viral entry into lung cells (27). However, a cathepsin B/L-dependent auxiliary activation pathway is available during infection of SARS-CoV-2 infection in TMPRSS2-negative cells (35,42), which is  (41). These mutants with amino acid deletions immediately upstream of FCS, like mut-del2, or downstream ( 685 RSV 687 or 689 SQS 691 ) also showed significant defects in S protein processing (41,42). Both types of the FCS region deletions were unable to utilize the furin and TMPRSS2-mediated plasma membrane fusion entry pathway and exhibited a more limited range of cell tropism (42,44), which might be randomly selected. This is substantiated by the fact that there were no FCS region deletions detected in SARS-CoV-2 propagated in TMPRSS2-expressing cells (42).
Overall, we believe that human airway epithelial cells express ACE2 and TMPRSS2 (37,(45)(46)(47), which plays an important role during S protein priming and viral entry, and the virus entry is mediated by the membrane fusion pathway. However, from 2 out of the 11 HAE-ALI cultures tested in this study, the lack of suppression of the FCS region deletion was also found in apically released virions of the infected HAE-ALI cultures made from KC19 and B15-20 donors. Previously, we discovered that the SARS-CoV-2 infection in HAE-ALI resulted in periodic recurrent replication peaks of progeny (37). Since the cleavage at the S29 site by TMPRSS2 necessitates the priming cleavage at S1/S2, the accumulation of FCS mutations in the progeny during the infection in HAE-ALI B15-20 and HAE-ALI KC19 may limit infection of the mut-del1 or mut-del2. Since the progeny virus was a pool mixed with WT and mutants, we did not observe a significant difference in apical virus release in HAE-ALI B15-20 and HAE-ALI KC19 . In a future study, we will plaque purify the two FCS mutants and examine their infectivity in HAE-ALI derived from various donors, as well as their transmissibility and pathogenicity in animals. We speculate that epithelial cells from these two donors may express much less TMPRSS2, and therefore, the virus utilizes the TMPRSS2-independent and cathepsindependent endosomal entry pathway (42,44,48), which likely does not require the S cleavage at S1/S2 (43) and thus prefers replication of the FCS deletion mutants.
SARS-CoV-2 infection is less studied in primary airway epithelial culture derived from individuals who already have critical pulmonary disease, such as COPD. COPD is a chronic inflammatory lung disease that causes obstructed airflow in the lung (49). The severity of COVID-19 in COPD patients is in general worse (50,51). Here, we did not observe an increased replication of SARS-CoV-2 in the four COPD HAE-ALI cultures. However, the FCS deletions were strongly suppressed. In the future, it is worthwhile to look into the retention of the FCS site in HAE-ALI cultures derived from donors of different sexes, age groups, and other chronic lung diseases, as well the expression levels of the cellular proteases furin, TMPRSS2, and cathepsin in these HAE-ALI cultures.
Importantly, the deletion of QTQTN (mut-del2) diminished SARS-CoV-2 entry and infection in Vero-E6 cells (41). Furthermore, three FCS-related deletion mutants, DPRRA;, DRAR;SVAS, and DNSPRRAR;SVA, have been shown to have reduced replication in vitro and lung disease in animal models (44,52,53), strongly supporting that the FCS is a virulence-related motif. Since the DQTQTN also abolished furin cleavage (41), we speculate that the mut-del2 mutant should have reduced lung disease in animals as well. Since the FCS is a key motif related to virulence, it is important to investigate the natural occurrence rate of the FCS region deletions, possibility or limitation of their human-to-human transmission, as well as their pathogenicity. Several studies tried to screen the FCS region deletions from patients-derived SARS-CoV-2. As discussed above, screening of 27 SARS-CoV-2-positive clinical specimens, including one specimen that had FCS deletions identified after passaging in Vero-E6 cells, failed to detect any FCS deletions (34). However, one study detected the 675 QTQTN 679 -deleted mutants (mut-del2) in 3 of 68 SARS-CoV-2-positive clinical specimens (33). In another detection of 51 SARS-CoV-2-positive patient specimens, although a high rate of 52.9% and 82.4% of the positive clinical samples contained the FCS upstream motif ( 661 ECDIPIGAG 669 ) and the PRRA deletions, respectively, the mutant population was at a very low level (0.33% 61.17% for FCS upstream motif deletion and 1.12% 61.21% for PRRA deletion) (54), arguing for the infectivity and transmissibility of these mutants. Notably, we detected a high rate of mut-del2 (7.86% and 20.97%) but not mut-del1 (,0.1%) in two SARS-CoV-2-positive nasopharyngeal aspirates. Thus, it is possible that mut-del1 is an artificial deletion from the inoculated virus cultured in Vero-E6 cells. Therefore, further studies need to be carried out to understand the significance of the prevalence of these FCS deletion mutants in the disease progression of COVID-19 patients.
Along with the usages of antibody drugs and the wide inoculation of the vaccine, which target the S protein, the virus may undergo further mutations under the pressure of human immune response. Supervision and screening the mutations in the S protein gene in clinical specimens is extremely important to identify the escaped isolates which may increase or decrease infectivity and transmissibility. Apparently, the in vitro-polarized HAE model, which can facilitate long-term infection of SARS-CoV-2, is an ideal model to study S gene mutants under various conditions.

MATERIALS AND METHODS
Ethics statement. Primary human bronchial epithelial cells were isolated from the lungs of healthy human donors and COPD patients by the Cells and Tissue Core of the Center for Gene Therapy, University of Iowa, and by the Department of Internal Medicine, University of Kansas Medical Center with the approvals of the institutional review boards (IRB) of the University of Iowa and University of Kansas Medical Center, respectively.
Viruses. SARS-CoV-2 (NR-52281), isolate USA-WA1/2020 (batch no. 70034262), was obtained from BEI Resources (Manassas, VA) and designated P0 passage. The virus used for infections of HAE-ALI was propagated once in Vero-E6 cells, designated P1 passage. Viruses were titrated by plaque assays on Vero-E6 cells and stored at 280°C as previously described (37). A biosafety protocol to work on SARS-CoV-2 infection in the biosafety level 3 (BSL3) lab was approved by the Institutional Biosafety Committee of the University of Kansas Medical Center.
HAE-ALI cultures. Primary HAE-ALI cultures, lots of B2-20, B3-20, B4-20, B9-20, B15-20, and B16-20, were directly prepared from bronchial airway epithelial cells isolated from various healthy donors. They were obtained from the Cells and Tissue Core of the Center for Gene Therapy, University of Iowa and polarized in Transwell inserts (0.33 cm 2 ; Costar, Corning, Tewksbury, WA). L209, KC19, and four COPD phenotypes (N707, UC24, L184, and N703) HAE-ALI cultures were prepared from propagated (passage 2) bronchial airway cells of the two healthy L209 and KC19 donors and four COPD donors. These six lung tissues were obtained from organ donors whose lungs were rejected for transplant and recovered for research by the Life Alliance Organ Recovery Agency at the University of Miami (Miami, Florida), Life Center Northwest (Seattle, Washington), and the Midwest Transplant Network (Kansas City, Kansas). Bronchial airway epithelia cells were polarized on Transwell inserts (1.1 cm 2 ) (Costar; Corning). The HAE-ALI cultures that had transepithelial electrical resistance (TEER) of .1,000 X·cm 2 , determined with an epithelial volt-ohm meter (MilliporeSigma, Burlington, MA), were used for infections.
Virus infections. Polarized HAE-ALI cultures were infected with SARS-CoV-2 at a multiplicity of infection (MOI) of 0.2 or 2. The inoculum of 100 ml or 300 ml was apically applied to the 0.33-cm 2 or 1.1-cm 2 Transwell inserts with an incubation period of 1 h at 37°C and 5% CO 2 . After aspiration of the inoculum, the apical surface of the insert was washed with 100 ml (or 300 ml) of Dulbecco's phosphate-buffered saline (D-PBS; Corning, Tewksbury, WA) three times to maximally remove the unbound viruses. The HAE-ALI cultures were then placed back into the incubator at 37°C and 5% CO 2 . To collect the apically released progeny from infected cultures, 100 ml (or 300 ml) of D-PBS was added to the apical chamber for 30 min at 37°C and 5% CO 2 . Thereafter, the apical wash was pipetted carefully from the apical chamber.
Immunofluorescence assay. The membrane of the infected HAE-ALI was cut out and fixed with 4% paraformaldehyde in phosphate-buffered saline (PBS) at 4°C overnight. The fixed membrane was washed in PBS for 5 min three times and then split into several pieces for whole-mount immunostaining. Following permeabilization with 0.2% Triton X-100 for 15 min at room temperature, the slide was incubated with a rabbit monoclonal anti-SARS-CoV-2 nucleocapsid (NP) (catalog no. 40143-R001; SinoBiological US, Wayne, PA) at a dilution of 1:25 in PBS with 2% fetal bovine serum for 1 h at 37°C. After washing, the slide was incubated with a rhodamine-conjugated secondary antibody, followed by staining of the nuclei with DAPI (49,6-diamidino-2-phenylindole).
RNA extraction. For total RNA extraction, four Transwell inserts of HAE-ALI cultures were dissolved in 1 ml of TRIzol reagent (ThermoFisher, Waltham, MA), following manufacturer's instructions. Viral RNA was isolated from the virions in apical washes. Fifty microliters of apical wash was used for the extraction of nuclease digestion-resistant viral RNA using the Quick-RNA Viral kit (catalog no. R1035; Zymo Research, Irvine, CA), as described previously (37). The final RNA samples were dissolved in 50 ml of deionized H 2 O and quantified for concentrations using a microplate reader (Synergy H; BioTek).
RNA-seq. For viral transcriptome, total RNA was extracted from HAE-ALI cultures infected with SARS-CoV-2 at an MOI of 0.2 and 2, respectively, or mock infected at 4 dpi. After RNA quality control and reverse transcription, DNA nanoball sequencing (DNSeq) was performed at BGI Genomics (Cambridge, MA). Briefly, RNA samples were tested using an Agilent 2100 bioanalyzer (Agilent RNA 6000 Nano kit). Samples with an RNA integrity number (RIN) of $8.0 were chosen for library construction. rRNA was removed from the total RNA samples by using RNase H or Ribo-Zero method. Then, samples were fragmented in a fragmentation buffer for thermal fragmentation to 130 to 160 nucleotides (nt). First-strand cDNA was generated by First Strand Mix, then Second Strand Mix was added to synthesize the second-strand cDNA. The reaction product was purified by magnetic beads and end repaired by addition of adaptors, followed by several rounds of PCR amplification to enrich the cDNA fragments. The PCR products were then purified and subjected to library quality control on the Agilent Technologies 2100 bioanalyzer. The double-stranded PCR products were heat denatured and circularized by the splint oligonucleotide sequence. The single-strand circle DNA (ssCir DNA) was formatted as the final library. The final library was amplified with phi29 to make DNA nanoballs (DNBs), which have more than 300 copies of one molecule. The DNBs were loaded into the patterned nanoarray and 2 Â 100 paired-end reads were generated in the way of combinatorial probe-anchor synthesis (cPAS).
For RNA-seq of the viral RNAs, the apical washes were collected from infected HAE-ALI cultures at the indicated times (days postinfection [dpi]; Table 3), and viral RNA was extracted as described above. For library preparation, the stranded-RNA seq kit (Thermo Fisher) was used following the manufacturer's protocol. The rRNA depletion step was added for the library preparation. The Illumina sequencer NextSeq550 was used to generate paired-end 2 Â 150 reads at GeneGoCell Inc. (San Diego, CA).
PCR amplicon-seq. For sequencing the FCS region of the S gene, viral RNA extracted from the apical washes was reverse transcribed using avian myeloblastosis virus (AMV) (Promega, Madison, WI). A 384nt sequence covering the S gene FCS region (nt 23,487 to 23,870) was amplified by 20 cycles of PCR using the primers containing the adaptor sequences: forward, 59-ACA CTC TTT CCC TAC ACG CTC TTC CGA TCT TTT TCA AAC ACG TGC AGG C-39, and reverse, 59-GAC TGG AGT TCA GAC GTG TGC TCT TCC GAT CTT CCA GTT AAA GCA CGG TTT AAT-39. The PCR products were analyzed on 1.5% agarose and excised for purification. The purified DNA samples were quantified on a microplate reader (Synergy LX; BioTek, Winooski, VT), and 500 ng of each DNA sample (20 ng/ml) was sent for PCR amplicon-seq (AMPLICON-EZ) at GENEWIZ, Inc. (South Plainfield, NJ).
(ii) Viral RNA-seq data (GeneGoCell Inc.). Raw sequence reads (fastq files) were processed through the following steps by the Genenius NGS bioinformatics pipeline (v2.1). Low-quality reads were removed using a quality score threshold of 25 (Q25). The resulting fastq files were analyzed by FastQC v0.10.1 for quality control (QC). Reads were aligned to the reference genome Wuhan-Hu-1. The alignment results were analyzed using the proprietary GeneGoCell program for variant calling on the target sites as follows. (i) Each read pair was processed to report the variant in the read. (ii) Each variant's allele frequency (AF) was calculated based on the number of variant reads/total reads covering the region (both variant and nonvariant). (iii) Variants with $1% AF and $3 variant reads were reported in a variant calling file (vcf). The output of the bioinformatics workflow was collected and further organized/processed in Microsoft Office 365.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. DATA SET S1, XLSX file, 0.9 MB.