CRISPR-Cas9 bends and twists DNA to read its sequence

In bacterial defense and genome editing applications, the CRISPR-associated protein Cas9 searches millions of DNA base pairs to locate a 20-nucleotide, guide-RNA-complementary target sequence that abuts a protospacer-adjacent motif (PAM). Target capture requires Cas9 to unwind DNA at candidate sequences using an unknown ATP-independent mechanism. Here we show that Cas9 sharply bends and undertwists DNA upon PAM binding, thereby flipping DNA nucleotides out of the duplex and toward the guide RNA for sequence interrogation. Cryo-electron-microscopy (EM) structures of Cas9:RNA:DNA complexes trapped at different states of the interrogation pathway, together with solution conformational probing, reveal that global protein rearrangement accompanies formation of an unstacked DNA hinge. Bend-induced base flipping explains how Cas9 “reads” snippets of DNA to locate target sites within a vast excess of non-target DNA, a process crucial to both bacterial antiviral immunity and genome editing. This mechanism establishes a physical solution to the problem of complementarity-guided DNA search and shows how interrogation speed and local DNA geometry may influence genome editing efficiency.


Introduction
CRISPR-Cas9 (clustered regularly interspaced short palindromic repeats, CRISPR-associated) nucleases provide bacteria with RNA-guided adaptive immunity against viral infections 1 and serve as powerful tools for genome editing in human, plant and other eukaryotic cells 2. The basis for Cas9's utility is its DNA recognition mechanism, which involves base-pairing of one DNA strand with 20 nucleotides of the guide RNA to form an R-loop. The guide RNA's recognition sequence, or "spacer," can be chosen to match a desired DNA target, enabling programmable site-specific DNA selection and cutting 3. The search process that Cas9 uses to comb through the genome and locate rare target sites requires local unwinding to expose DNA nucleotides for RNA hybridization, but it does not rely on an external energy source such as ATP hydrolysis 4,5. This DNA interrogation process defines the accuracy and speed with which Cas9 induces genome edits, yet the mechanism remains unknown.
Molecular structures of Cas9 6 in pre- and post-DNA-bound states revealed that the protein's REC ("recognition") and NUC ("nuclease") lobes can rotate dramatically around each other, assuming an "open" conformation in the apo Cas9 structure 7 and a "closed" conformation in the Cas9:guide-RNA 8,9 and Cas9:guide-RNA:DNA R-loop 9 structures. Furthermore, single-molecule experiments established the importance of PAMs (5′-NGG-3′) for pausing at candidate targets 5,10, and R-loop formation was found to occur through directional strand invasion beginning at the PAM 5 (Fig. 1a). However, these findings did not elucidate the actions that Cas9 performs to interrogate each candidate target sequence. These actions, repeated over and over, comprise the slowest phase of Cas9's bacterial immune function and its induction of site-specific genome editing 11. Understanding the mechanism of DNA interrogation is critical to determining how Cas9 searches genomes to find bona fide targets and exclude the vast excess of non-target sequences.

Covalent cross-linking of Cas9 to DNA stabilizes the interrogation complex
Evidence that Cas9's target engagement begins with PAM binding 5 implies that during genome search, there exists a transient "interrogation state" in which Cas9:guide RNA has engaged with a PAM but not yet formed RNA:DNA base pairs (Fig. 1a). Cas9:guide-RNA complexes must repeatedly visit the interrogation state at each surveyed PAM, irrespective of the sequence of the adjacent 20-base-pair (bp) candidate complementarity region (CCR). While this state is the key to Cas9's DNA search mechanism, the interrogation complex has so far evaded structure determination due to its transience, with an estimated lifetime of <30 milliseconds in bacteria 11 .
To trap the Cas9 interrogation complex, we replaced residue Thr1337 with cysteine in S. pyogenes Cas9, the most widely used genome editing enzyme, and combined this protein with a single-guide RNA (sgRNA) 3 and a 30-bp DNA molecule functionalized with an N4-cystamine cytosine modification 12 (Fig. 1b). The DNA included a PAM but lacked any complementarity to the sgRNA spacer (Fig. 1c). Reaction of the cysteine thiol with the cystamine creates a protein:DNA disulfide cross-link on the side of the PAM distal to the site of R-loop initiation (Fig. 1b,d). The position of the cross-link was chosen based on previous high-resolution structures of the Cas9:PAM interface 9,13 (Extended Data Fig. 1a). Incubation of Cas9 T1337C with sgRNA and the modified DNA duplex resulted in a decrease in electrophoretic mobility for ~70% of the total protein mass under denaturing but non-reducing conditions (Extended Data Fig. 1b), consistent with protein:DNA cross-link formation. The cross-link did not inhibit Cas9's ability to cleave sgRNA-complementary DNA (Extended Data Fig. 1c), suggesting that the enzyme is not grossly perturbed by the introduced disulfide. More importantly, mechanistic hypotheses revealed by cross-linked complexes can be tested in non-cross-linked complexes.
We subjected the cross-linked interrogation complex to cryo-EM imaging and analysis (Extended Data Figs. 2, 3a-b). Ab initio volume reconstruction, refinement and modeling revealed two structural states of the complex. In one, the DNA lies as a linear duplex across the surface of the open form of the Cas9 ribonucleoprotein (Fig. 2a). Remarkably, in the other state, Cas9's two lobes pinch the DNA into a V shape whose helical arms meet at the site of R-loop initiation, employing a bending mode that underwinds the DNA duplex (Fig.  2b).

The linear-DNA conformation reveals a DNA scanning state of Cas9
In the "linear-DNA" conformation (Fig. 2a), the interface of the DNA with the PAM-interacting domain is similar to that seen in the crystal structure of Cas9:R-loop 9 (Extended Data Fig. 3c). However, the REC lobe of the protein is in a position radically different from that observed in all prior structures of nucleic-acid-bound Cas9 8,9,13,14, having rotated away from the NUC lobe into an open-protein conformation that resembles the apo Cas9 crystal structure 7 (Fig. 2a).
Notably, a linear piece of DNA docked into the PAM-binding cleft would result in a severe structural clash 4 in either the Cas9:R-loop 9 or the Cas9:sgRNA 8 crystal structure but only a minor one in the apo Cas9 crystal structure 7, which can be relieved by slightly tilting and bending the DNA (Extended Data Fig. 3c). We propose that the open-protein conformation, originally thought to be unique to nucleic-acid-free Cas9, can also be adopted by the sgRNA-bound protein to enable its interaction with linear DNA. Indeed, cryo-EM analysis of the Cas9:sgRNA complex revealed only particles in the open-protein state (Extended Data Fig. 4), indicating that the original crystal structure of Cas9:sgRNA 8, which was in a closed-protein state, represented only one possible conformation of the complex that happened to be captured in that crystal form. Single-molecule Förster resonance energy transfer experiments also support the ability of Cas9:sgRNA to access both closed and open conformations 15. The linear-DNA/open-protein conformation captured in our cryo-EM structure, then, may represent the conformation of Cas9 during any process for which it must accommodate a piece of linear DNA, such as during sliding 10 or initially engaging with a PAM.

The bent-DNA conformation reveals PAM-adjacent DNA unwinding by Cas9
In the "bent-DNA" Cas9 interrogation complex, the protein grips the PAM as in the linear-DNA complex. The CCR (candidate complementarity region), on the other hand, is tilted at a 50° angle to the PAM-containing helix and leans against the REC lobe, which has risen into the same "closed" position as in the Cas9:sgRNA crystal structure (Fig. 2b, 3a). Compared to the high-resolution cryo-EM density contributed by the PAM-containing duplex, CCR density is poorly resolved (Fig. 3a, Extended Data Fig. 3b), reflecting conformational heterogeneity; however, even at low resolution, its distinct helical shape (Fig. 2b, 3a) enabled construction of an atomic model that adheres to B-form DNA constraints between the PAM-distal tip and position +3 of the CCR (Fig. 3b). Connection of this B-form helix to the PAM-containing helix requires backbone distortion and helical underwinding at precisely the position from which an R-loop would be initiated if the sgRNA were complementary to the CCR 5 (Fig. 3b,c). Underwinding is accomplished through major groove compression (Fig. 3b), as observed for other protein-induced bends 16-23.
At the distorted bending vertex, a missing wedge of cryo-EM density appears across from non-target-strand nucleotides Ade(+1) and Ade(+2), suggesting that target-strand nucleotides Thy(+1) and Thy(+2) have become unpaired from their partners (Fig. 3d). The overall weakness of density for Thy(+1) and Thy(+2) suggests dramatic mobility, and the modeled conformation of those nucleotides represents a physically plausible member of a diverse conformational ensemble (which also agrees with solution experiments to be discussed shortly). Therefore, in the bent-DNA conformation, two helical arms join at an underwound hinge whose target-strand nucleotides are heterogeneously positioned. DNA disorder is consistent with the function of the Cas9 interrogation complex, which is to flip target-strand nucleotides from the DNA duplex toward the sgRNA to test base pairing potential.

Cas9:sgRNA bends DNA in non-cross-linked complexes
To determine whether unmodified Cas9 can bend DNA, we produced interrogation complexes that lacked the cross-link and tested them in a DNA cyclization assay 24 (Fig.  4a). We created a series of 160-bp double-stranded DNA substrates that all bore a "J"-shape due to the inclusion of a special A-tract sequence that forms a protein-independent 108° bend 25 . Each substrate also included two PAMs spaced by a near-integral number of B-form DNA turns (31 bp). In eleven versions of this substrate, we varied the number of base pairs between the A-tract and the proximal PAM from 21 to 31 bp, effectively rotating the Cas9 binding sites around an entire turn of a B-form DNA helix (Extended Data Fig. 5). If Cas9 bends the DNA, each additional base pair added to the variable (21-31 bp) region will turn the Cas9-induced bend by ~34° with respect to the fixed A-tract bend. The relative direction of the two bends can be discerned from each substrate's ligase-catalyzed cyclization efficiency, which should increase when the two bends point in the same direction (DNA assumes a "C" shape) and decrease when the two bends point in opposite directions (DNA assumes an "S" shape), as a function of the proximity of the DNA ends to be sealed (Fig. 4a). We measured the cyclization efficiency of each substrate in the absence and presence of Cas9 and an sgRNA lacking homology to either of the two CCR sequences. Consistent with expectations for a bend, the Cas9-dependent enhancement (or reduction) of cyclization efficiency tracked a sinusoidal shape when plotted against the A-tract/PAM spacing, reflecting phase-dependent variation in the end-to-end distance of different substrates (Fig. 4b, Extended Data Fig. 6a, b). 
Additionally, by interpreting the absolute phase of the cyclization enhancement curve (that is, the spacing value at which the peak occurs, where the two bends point in the same direction) in the context of the known direction of the A-tract bend 24,26 , we conclude that the bending direction observed in this experiment is the same as that observed in the bent-DNA cryo-EM structure (Fig. 4c, Extended Data Fig. 6b, Supplementary Information).
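The phasing logic of the cyclization assay can be illustrated with a short numerical sketch. It assumes a B-form helical repeat of 10.5 bp per turn (a value not stated in the text, chosen because it reproduces the ~34° rotation per added base pair) and an idealized sinusoidal response; the function names, the amplitude, the phase and the offset are illustrative placeholders, not fitted values from this study.

```python
import math

# Assumption: B-form helical repeat in bp per turn (not given in the text).
BP_PER_TURN = 10.5


def relative_bend_angle(spacing_bp, reference_bp=21):
    """Rotation (degrees, in [0, 360)) of the Cas9-induced bend about the
    helix axis, relative to its orientation at the shortest spacing."""
    return ((spacing_bp - reference_bp) * 360.0 / BP_PER_TURN) % 360.0


def expected_modulation(spacing_bp, amplitude=1.0, phase_deg=0.0, offset=0.0):
    """Idealized sinusoidal dependence of the cyclization enhancement on the
    A-tract/PAM spacing. amplitude, phase_deg and offset are hypothetical."""
    angle = math.radians(spacing_bp * 360.0 / BP_PER_TURN + phase_deg)
    return amplitude * math.sin(angle) + offset


# Tabulate the rotation for the eleven substrates (21 to 31 bp spacing).
for spacing in range(21, 32):
    print(spacing, round(relative_bend_angle(spacing), 1))
```

Under these assumptions, the eleven substrates sample the full range of relative bend orientations, which is why a directional bend is expected to produce one full period of modulation across the series, whereas an isotropically flexible site would not.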
Next, we wondered whether local DNA conformations observed in the cross-linked interrogation complex resemble those in the native complex. To characterize DNA distortion with single-nucleotide resolution, we measured the permanganate reactivity of individual thymines in the target DNA strand of a non-cross-linked interrogation complex (Fig. 5a). As anticipated for protein-induced base unstacking 27,28, we detected a PAM- and Cas9-dependent increase in permanganate reactivity at Thy(+1) and Thy(+2) (Fig. 5a-c, Extended Data Fig. 7a, b). The relationship between permanganate reactivity and Cas9:sgRNA concentration at these thymines suggests that the affinity of Cas9:sgRNA for this sequence is weak (10 μM), as expected for this necessarily transient interaction with off-target DNA (Fig. 5b, Extended Data Fig. 7a). Remarkably, Thy(+1) and Thy(+2) are precisely the nucleotides that appeared to be unpaired in the bent-DNA cryo-EM map of the cross-linked Cas9 interrogation complex (Fig. 3d), which shared the same DNA sequence as the permanganate substrate. These results suggest that Cas9 bends DNA through a backbone distortion that exposes target-strand nucleobases +1 and +2 to solvent and, more generally, that cryo-EM analysis of the cross-linked complex captured meaningful structural features of the native complex.

A Cas9 conformational rearrangement accompanies DNA bending
The described linear- and bent-DNA conformations present a new model for Cas9 function in which open-protein Cas9:sgRNA first associates with the PAM on linear DNA, then engages a switch to the closed-protein state to bend the DNA and expose its PAM-adjacent nucleobases for interrogation (Supplementary Video 1). Because this transition involves energetically unfavorable base unstacking, we wondered how the unstacked state is stabilized.
Remarkably, the "phosphate lock loop" (Lys1107-Ser1109), which was proposed to support R-loop nucleation by tugging on the target-strand phosphate between the PAM and nucleotide +1 13, is disordered in the linear-DNA structure but stably bound to the target strand in the bent-DNA structure (Extended Data Fig. 8a, b), highlighting this contact as a potential energetic compensator for the base unstacking penalty. In the permanganate assay, mutation of the phosphate lock loop decreased activity to the level observed without Cas9 or with a Cas9 mutant deficient in PAM recognition (which lacks the PAM-binding arginines, "xPBA") (Fig. 5c, Extended Data Fig. 7b), indicating that the loop may play a role in DNA bending. However, a negative result in this assay could be attributed either to weakened DNA bending activity or to an overall destabilization of the protein:DNA interaction 4.
Another notable structural element is a group of lysines (Lys233/Lys234/Lys253/Lys263, termed here the "helix-rolling basic patch") on REC2 (REC lobe domain 2) that contact the DNA phosphate backbone (at bp +8 to +13) in both the linear- and bent-DNA structures, an interaction that has not been observed before (Extended Data Fig. 8a, c). Mutation of these lysines attenuated anisotropy in the cyclization assay (Fig. 4b, Extended Data Fig. 6a) and abolished Cas9's permanganate sensitization activity (Fig. 5c, Extended Data Fig. 7b). Structural modeling of the linear-to-bent transition (Supplementary Video 2) suggests that the helix-rolling basic patch may couple DNA bending to inter-lobe protein rotations similar to those observed in multi-body refinements 29 of the cryo-EM images (Supplementary Videos 3-4). Consensus EM reconstructions also revealed large segments of the REC lobe and guide RNA that become ordered upon lobe closure (Supplementary Video 1, Supplementary Information), implying that Cas9 can draw upon diverse structural transitions across the complex to regulate DNA bending.

The bent-DNA state makes R-loop nucleation structurally accessible
We propose that the function of the bent-DNA conformation is to promote local base flipping that can lead to R-loop nucleation. To probe the structure of a complex that has already proceeded to the R-loop nucleation step, we employed the same cross-linking strategy with adjusted RNA and DNA sequences that allow partial R-loop formation (Fig. 1c). Cryo-EM analysis of this construct revealed nucleotides +1 to +3 of the DNA target strand hybridized to the sgRNA spacer (Fig. 6a, Extended Data Fig. 9). In contrast to the disorder that characterized this region in the bent-DNA map, all three nucleotides are well-resolved, apparently stabilized by their hybridization to the A-form sgRNA spacer. The increase in resolution extends to the non-target strand and to the more PAM-distal regions of the CCR, suggesting that the DNA becomes overall more ordered in response to R-loop nucleation. The ribonucleoprotein architecture resembles that of the bent-DNA structure except for slight tilting of REC2, which accommodates a repositioning of the CCR duplex toward the newly formed RNA:DNA base pairs (Fig. 6a). Therefore, unstacked nucleotides in the bent-DNA state can hybridize to the sgRNA spacer with minimal global structural changes, further supporting the bent-DNA structure as a gateway to R-loop nucleation.
Our structures outline a model for a poorly understood aspect of Cas9 function that is fundamental to CRISPR target search and capture (Fig. 6b). First, open-conformation Cas9:sgRNA associates with the PAM of a linear DNA target. By engaging the open-to-closed protein conformational switch, Cas9 bends and twists the DNA to locally unwind the base pairs next to the PAM. If target-strand nucleotides are unable to hybridize to the sgRNA spacer, the candidate target is released, and Cas9 proceeds to the next candidate. If the target strand is sgRNA-complementary, unwound nucleotides initiate an RNA:DNA hybrid that can expand through strand invasion to a full 20-bp R-loop, activating DNA cleavage.
Due to its energetic linkage to base flipping 30, DNA bending provides a viable mechanical solution to any biological problem that requires unrestricted access to nucleobases 31,32, including the sequence interrogation challenge faced by all DNA-targeting CRISPR systems 4,33-36. In our key structural snapshot of this process, Cas9 specifically employs a bending mode that involves underwinding (Fig. 3), which may be a topological necessity for the downstream propagation of flipping events, and which could underlie some features of Cas9's mechanical sensitivity 37-39. In contrast, certain methyltransferases that flip only one nucleotide at a time can afford to do so without gross helical distortion 40,41.
Interestingly, other proteins define the vertex of an underwound DNA bend using intimate contacts to the distorted nucleotides, either to intact base pairs in the case of transcription factors 18-20,23 or to flipped bases and their estranged partners in the case of base excision repair enzymes 16,17,21,22,42. During initial DNA interrogation, Cas9 appears to make no such contacts, instead straddling the bending vertex and relying on mechanical strain to stabilize extrahelical nucleotide conformations without restricting their motion. To catalyze RNA-programmable strand exchange, Cas9 must distort DNA independent of its sequence, likening its functional constraints to those of the filamentous recombinase RecA 43, which non-specifically destabilizes candidate DNA via longitudinal stretching 44-46.
The mechanism illustrated here reveals for the first time the individual steps that comprise the slowest phase of Cas9's genome editing function 11. The energetic tuning of binding, bending, and RNA:DNA hybrid nucleation dictates the speed of target capture: bent/unstacked states must be stable enough to promote fast transitions to RNA-hybridized states, but not so stable that Cas9 wastes undue time on off-target DNA. Thus, the energetic landscape surrounding the states identified in this work will be a crucial subject of study to understand the success of current state-of-the-art genome editors and to inform the engineering of faster ones. Finally, DNA in eukaryotic chromatin is rife with bends, due either to intrinsic structural features of the DNA sequence 47 or to interactions with looping proteins 48. Because the Cas9-induced DNA bend described here has a well-defined direction that may either match or antagonize incumbent bends, it will be important to test how local chromatin geometry affects Cas9's efficiency in both dissociating from off-target sequences and opening R-loops on real target sequences.

Nucleic acid preparation
All DNA oligonucleotides were synthesized by Integrated DNA Technologies except the cystamine-functionalized target strand, which was synthesized by TriLink Biotechnologies (with HPLC purification). DNA oligonucleotides that were not HPLC-purified by the manufacturer were PAGE-purified in house (unless a downstream preparative step involved another PAGE purification), and all DNA oligonucleotides were stored in water. Duplex DNA substrates were annealed by heating to 95°C and cooling to 25°C over the course of 40 min on a thermocycler. Guide RNAs were transcribed and purified as described previously 28, except no ribozyme was included in the transcript. Briefly, in vitro transcription reactions included PCR-assembled DNA template, 40 mM Tris-Cl (pH 7.9 at 25°C), 25 mM MgCl2, 10 mM DTT, 0.01% (v/v) Triton X-100, 2 mM spermidine, 5 mM of each NTP, and 100 μg/mL T7 RNA polymerase. Transcription was allowed to proceed for 2.5 hr at 37°C, after which RNA was purified by urea-PAGE, ethanol-precipitated, and resuspended in RNA storage buffer (0.1 mM EDTA, 2 mM sodium citrate, pH 6.4). All sgRNA molecules were annealed (80°C for 2 min, then moved directly to ice) in RNA storage buffer prior to use. For both DNA and RNA, A260 was measured on a NanoDrop (Thermo Scientific), and concentration was estimated according to extinction coefficients reported previously 49. Oligonucleotide sequences can be found in the Supplementary Information.

SDS-PAGE analysis
For non-reducing SDS-PAGE, thiol exchange was first quenched by the addition of 20 mM S-methyl methanethiosulfonate (S-MMTS). Then, 0.25 volumes of 5X non-reducing SDS-PAGE loading solution (0.0625% w/v bromophenol blue, 75 mM EDTA, 30% glycerol, 10% SDS, 250 mM Tris-Cl, pH 6.8) were added, and the sample was heated to 90°C for 5 minutes before loading of 3 pmol onto a 4-15% Mini-PROTEAN TGX Stain-Free Precast Gel (Bio-Rad), alongside PageRuler Prestained Protein Ladder (Thermo Scientific). Gels were imaged using the Stain-Free imaging protocol (5-min activation, 3-s exposure) of Bio-Rad Image Lab 5.2.1 on a Bio-Rad ChemiDoc. For reducing SDS-PAGE, no S-MMTS was added, and 5% β-mercaptoethanol (βME) was added along with the non-reducing SDS-PAGE loading solution. For radioactive SDS-PAGE analysis, a 4-20% Mini-PROTEAN TGX Precast Gel (Bio-Rad) was pre-run for 20 min at 200 V (to allow free ATP to migrate ahead of free DNA), run with radioactive sample for 15 min at 200 V, dried (80°C, 3 hours) on a gel dryer (Bio-Rad), and exposed to a phosphor screen, subsequently imaged on an Amersham Typhoon using the Amersham Typhoon Control Software 2.0.0.6 (Cytiva).

Nucleic acid radiolabeling
Standard 5′ radiolabeling was performed with T4 polynucleotide kinase (New England Biolabs) at 0.2 U/μL (manufacturer's units), 1X T4 PNK buffer (New England Biolabs), 400 nM DNA oligonucleotide, and 200 nM [γ-32P]-ATP (PerkinElmer) for 30 min at 37°C, followed by a 20-min heat-killing incubation at 65°C. Radiolabeled oligos were then buffer exchanged into water using a Microspin G-25 spin column (GE Healthcare). For 5′ radiolabeling of sgRNAs, the 5′ triphosphate was first removed by treatment with Quick CIP (New England BioLabs, manufacturer's instructions). The reaction was then supplemented with 5 mM DTT and the same concentrations of T4 polynucleotide kinase (New England BioLabs) and [γ-32P]-ATP (PerkinElmer) used for DNA radiolabeling, and the remainder of the protocol was completed as for DNA.

Radiolabeled target-strand cleavage rate measurements
DNA duplexes at 10X concentration (20 nM radiolabeled target strand, 75 μM unlabeled non-target strand) were annealed in water with 60 μM cystamine dihydrochloride (pH 7). A 75-μL reaction was assembled from 15 μL 5X Mg-free disulfide reaction buffer (250 mM Tris-Cl, pH 7.4 at 25°C, 750 mM NaCl, 5 mM EDTA, 25% glycerol, 500 μM DTT), 7.5 μL 600 μM cystamine dihydrochloride (pH 7), 37.5 μL water, 3.75 μL 80 μM Cas9, 3.75 μL 100 μM sgRNA, and 7.5 μL 10X DNA duplex. The reaction was incubated at 25°C for 2 hours, at which point the cross-linked fraction had fully equilibrated. To non-reducing or reducing reactions, 5 μL of 320 mM S-MMTS or 80 mM DTT (respectively) in 1X Mg-free disulfide reaction buffer was added. Samples were incubated at 25°C for an additional 5 min, then cooled to 16°C and allowed to equilibrate for 15 min. One aliquot was quenched into 0.25 volumes 5X non-reducing SDS-PAGE solution and subjected to SDS-PAGE analysis to assess the extent of cross-linking (for the reduced sample, no βME was added, as the DTT had already effectively reduced the sample). Another aliquot was quenched for reducing urea-PAGE analysis as timepoint 0. DNA cleavage was initiated by combining the remaining reaction volume with 0.11 volumes 60 mM MgCl2. Aliquots were taken at the indicated timepoints for reducing urea-PAGE analysis.
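As a quick arithmetic check on the reaction assembly above, the following sketch computes nominal final concentrations from the listed stock concentrations and volumes. The helper names are illustrative, and the calculation makes two simplifying assumptions: it ignores the small dilution from the later 5-μL S-MMTS/DTT addition, and it treats the Mg2+ value as nominal (it does not account for chelation by the EDTA in the buffer).

```python
def final_concentration(stock, volume_ul, total_ul):
    """Simple dilution: stock concentration scaled by the volume fraction."""
    return stock * volume_ul / total_ul


# Volumes (uL) of the components listed for the cross-linking reaction.
components_ul = {
    "5X Mg-free disulfide reaction buffer": 15.0,
    "600 uM cystamine dihydrochloride": 7.5,
    "water": 37.5,
    "80 uM Cas9": 3.75,
    "100 uM sgRNA": 3.75,
    "10X DNA duplex": 7.5,
}
total_ul = sum(components_ul.values())

cas9_um = final_concentration(80.0, 3.75, total_ul)
sgrna_um = final_concentration(100.0, 3.75, total_ul)
buffer_x = final_concentration(5.0, 15.0, total_ul)
dna_x = final_concentration(10.0, 7.5, total_ul)


def nominal_mg_mm(stock_mm=60.0, added_volumes=0.11):
    """Nominal Mg2+ (mM) after adding `added_volumes` volumes of stock
    to one volume of reaction."""
    return stock_mm * added_volumes / (1.0 + added_volumes)


print(total_ul, cas9_um, sgrna_um, buffer_x, dna_x, round(nominal_mg_mm(), 2))
```

With these inputs the component volumes sum to the stated 75 μL, the buffer and DNA duplex come out at 1X, and the nominal Mg2+ concentration after initiation is just under 6 mM.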

Fluorescence and autoradiograph data analysis
Band volumes in fluorescence images and autoradiographs were quantified in Image Lab 6.1 (Bio-Rad). For fluorescence images recorded by the ChemiDoc, Image Lab's native .scn files were used for quantification. For images recorded by the Typhoon, the Typhoon software's native .gel files (square root encoded) were used for quantification. Data were fit by the least-squares method in Prism 7 (GraphPad Software).

Cryo-EM data processing and model building
Details of cryo-EM data processing and model building can be found in the Supplementary Information.
Because piperidine treatment leads to low levels of cleavage at every nucleotide, the exhaustive single-nucleotide ladder could be used to assign band identities, which were also confirmed by the dark/light pattern (piperidine-catalyzed cleavage at thymines is less efficient than at other nucleotides in the absence of permanganate modification and more efficient in the presence of permanganate modification). Data analysis was performed as follows: let $v_i$ denote the volume of band $i$ in a lane with $n$ total bands (band 1 is the shortest cleavage fragment, band $n$ is the topmost band corresponding to the starting/uncleaved DNA oligonucleotide). The probability of cleavage at thymine $i$, $p_{\mathrm{cleave},i}$, is computed from these band volumes. The oxidation probability of thymine $i$ is defined as $p_{\mathrm{ox},i} = p_{\mathrm{cleave},i,+\mathrm{pm}} - p_{\mathrm{cleave},i,-\mathrm{pm}}$, where +pm indicates the experiment that contained 10 mM KMnO4 and −pm indicates the no-permanganate experiment. An extensive description of this type of analysis can be found in ref. 28.
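The band-volume analysis can be sketched in code. The defining expression for the cleavage probability is not legible in this text, so the sketch assumes a form commonly used for end-labeled cleavage ladders, $p_{\mathrm{cleave},i} = v_i / \sum_{j=i}^{n} v_j$, which corrects for the fact that a cleavage nearer the label hides any cleavage farther from it. This expression is an assumption and should be checked against ref. 28; the function names are illustrative.

```python
def cleavage_probabilities(volumes):
    """Per-band cleavage probabilities for one lane.

    volumes[0] is band 1 (shortest fragment); volumes[-1] is band n
    (uncleaved oligonucleotide). Assumed definition (see lead-in):
    p_i = v_i / sum(v_j for j >= i). Returns values for bands 1..n-1.
    """
    probabilities = []
    remaining = float(sum(volumes))  # sum of v_j for j >= current band
    for v in volumes[:-1]:
        probabilities.append(v / remaining if remaining > 0 else 0.0)
        remaining -= v
    return probabilities


def oxidation_probabilities(volumes_plus_pm, volumes_minus_pm):
    """p_ox,i = p_cleave,i(+pm) - p_cleave,i(-pm), band by band."""
    p_plus = cleavage_probabilities(volumes_plus_pm)
    p_minus = cleavage_probabilities(volumes_minus_pm)
    return [a - b for a, b in zip(p_plus, p_minus)]


# Hypothetical three-band lanes (arbitrary units), for illustration only.
print(oxidation_probabilities([10.0, 20.0, 70.0], [5.0, 5.0, 90.0]))
```

In practice only the entries corresponding to thymine bands would be reported, and the two lanes must list the same bands in the same order.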

Preparation of DNA cyclization substrates
Each variant DNA cyclization substrate precursor was assembled by PCR from two amplification primers (one of which contained a fluorescein-dT) and two assembly primers. Each reaction was 400 μL total (split into 4 × 100-μL aliquots) and contained 1X Q5 reaction buffer (…). PCR products were phenol-chloroform-extracted, ethanol-precipitated, and resuspended in 80 μL water. To this was added 15 μL 10X CutSmart buffer, 47.5 μL water, and 7.5 μL ClaI restriction enzyme (10,000 units/mL, New England BioLabs), and digestion was allowed to proceed overnight at 37°C. Samples were then combined with 0.25 volumes 5X native quench solution (25% glycerol, 250 μg/mL heparin, 125 mM EDTA, 1.2 mg/mL proteinase K, 0.0625% w/v bromophenol blue), incubated at 55°C for 15 minutes, and resolved on a preparative native PAGE gel (8% acrylamide:bis-acrylamide 37.5:1, 0.5X TBE) at 4°C. Fluorescent bands, made visible on a blue LED transilluminator, were cut out, and DNA was extracted, ethanol-precipitated, and resuspended in water.

Cyclization efficiency measurements
Each cyclization reaction contained the following components: 1 μL 10X T4 DNA ligase reaction buffer (New England BioLabs), 2 μL water, 1 μL 10X ligation buffer additives (400 μg/mL UltraPure BSA, 100 mM KCl, 0.1% NP-40), 2 μL 80 μM Cas9 (or protein purification size exclusion buffer), 2 μL 100 μM sgRNA (or RNA storage buffer), 1 μL 25 nM cyclization substrate, 1 μL T4 DNA ligase (400,000 units/mL, New England BioLabs) (or ligase storage buffer). All reaction components were incubated together at 20°C for 15 minutes prior to reaction initiation except for the ligase, which was incubated separately. Reactions were initiated by combining the ligase with the remainder of the components, allowed to proceed at 20°C for 30 minutes, then quenched with 2.5 μL 5X native quench solution. Samples were then incubated at 55°C for 15 minutes, resolved on an analytical native PAGE gel (8% acrylamide:bis-acrylamide 37.5:1, 0.5X TBE) at 4°C, and imaged for fluorescein on an Amersham Typhoon (Cytiva). Monomolecular cyclization efficiency (MCE) for a given lane is defined as (band volume of circular monomers)/(sum of all band volumes). Bimolecular ligation efficiency (BLE) is defined as (sum of band volumes of all linear/circular n-mers, for n≥2)/(sum of all band volumes). The non-specific degradation products indicated in Extended Data Fig. 6a were not included in the analysis.
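The two efficiency definitions above translate directly into a small calculation. The sketch below assumes that each lane's quantified bands fall into three groups (circular monomer, unreacted linear monomer, and all n-mers with n ≥ 2) and that degradation products have already been excluded; the function name and the example band volumes are hypothetical.

```python
def cyclization_metrics(circular_monomer, linear_monomer, multimers):
    """Return (MCE, BLE) for one lane.

    circular_monomer: band volume of circular monomers
    linear_monomer:   band volume of unreacted linear monomers (assumed to
                      be the only remaining band class)
    multimers:        band volumes of all linear/circular n-mers, n >= 2
    """
    multimer_total = sum(multimers)
    total = circular_monomer + linear_monomer + multimer_total
    if total <= 0:
        raise ValueError("lane has no quantifiable signal")
    mce = circular_monomer / total  # monomolecular cyclization efficiency
    ble = multimer_total / total    # bimolecular ligation efficiency
    return mce, ble


# Hypothetical lane, arbitrary units.
mce, ble = cyclization_metrics(20.0, 60.0, [15.0, 5.0])
print(mce, ble)
```

Because both quantities share the same denominator, they can be compared within a lane; comparing lanes with and without Cas9:sgRNA then gives the Cas9-dependent change in cyclization efficiency for a given substrate.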

Data availability
All data generated or analyzed during this study are included within this manuscript and its supporting information files, except for the cryo-EM data/models, which are available separately.

[Extended Data figure legends survive only as fragments. Recoverable details: curve fits were constrained with A > 0 and b > A; the average of 224° and 212° is reported in Fig. 4c; J, J-factor (defined in Kahn & Crothers, 1992); Φ, phase difference; CBS, CAP-binding site; Extended Data Fig. 9, cryo-EM analysis of Cas9:sgRNA:DNA with 3 RNA:DNA matches.]

Supplementary Material
Refer to Web version on PubMed Central for supplementary material.