Crystal structures of SARS-CoV-2 ADP-ribose phosphatase: from the apo form to ligand complexes

High-resolution crystal structures of the ADP-ribose phosphatase domain (ADRP; also known as the macrodomain) from SARS-CoV-2 with multiple ligands illustrate how the protein undergoes conformational changes to adapt to the ligand in the manner observed for homologues from other viruses.


Introduction
Over the past several months, Severe Acute Respiratory Syndrome coronavirus 2 (SARS-CoV-2) has been spreading across the world, causing a disease termed COVID-19 (Coronaviridae Study Group of the International Committee on Taxonomy of Viruses, 2020). The emergence of SARS-CoV-2 and its route of viral transmission remain a mystery, but it is believed to have a zoonotic origin, likely in bats (Tang et al., 2020). In late December 2019, several patients in Wuhan, People's Republic of China were diagnosed with severe pneumonia of an unknown aetiology (Koh et al., 2020; Ciotti et al., 2020; Münnich et al., 1988; Bogoch et al., 2020). The virus has since spread rapidly around the world, infecting millions and killing hundreds of thousands (https://coronavirus.jhu.edu/map.html). These developments forced the World Health Organization to declare the outbreak a pandemic (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports). In the absence of natural community immunity, a tested vaccine or approved drugs that would help to control the epidemic, billions of people are currently under quarantine or lockdown to minimize further transmission.
The aetiological agent of COVID-19 has been isolated and was identified as a novel coronavirus resembling SARS-CoV, which was responsible for an outbreak of disease in 2002-2003 (Wu et al., 2020). Like other coronaviruses, SARS-CoV-2 residues, but sometimes also onto serine residues. In nucleic acids, the modification is attached to a phosphoryl group at the terminal end of DNA/RNA. ARTCs only carry out MARylation and preferentially act on arginine residues. De-ADP-ribosylation requires several enzymes. The polymeric fragment of the modification is removed by poly(ADP-ribosyl)glycohydrolase (PARG), while the final ADP-ribose unit is cleaved from glutamate/aspartate residues by the macrodomain. Enzymes from the (ADP-ribosyl) hydrolase (ARH) family have specificity for serine and arginine cargos (Fontana et al., 2017; Moss et al., 1988). The released ADPr is used in recycling pathways.
Currently, six classes of macrodomains have been distinguished: MacroH2-like, ALC1-like, PARG-like, Macro2-type, SUD-M-like (also known as Mac2/Mac3) and MacroD-type (Rack et al., 2016). These categories are derived from structural similarities rather than from sequence similarity. Most viral macrodomains fall into the MacroD-like family, which encompasses the human homologues MacroD1 and MacroD2, and is associated with the removal of mono(ADP-ribosylation). In vivo experiments have shown that viral MacroD-like macrodomains can hydrolyze ADPr-1″-phosphate, but the catalytic efficiency of this process has raised doubts about its physiological implications (Egloff et al., 2006). Instead, it has been suggested that these macrodomains might play roles analogous to MacroD1 and MacroD2. Indeed, de-ADP-ribosylating activities on proteins and RNA, including the removal of the entire PAR chain, have been demonstrated for several viral macrodomains, for example those from SARS-CoV and H-CoV-229E (Li et al., 2016; Eckei et al., 2017; Munnur et al., 2019). Binding to PAR (Egloff et al., 2006) and RNA (Malet et al., 2009) has also been reported. Importantly, the wide range of affinities and activities practically prevents similarity-based assumptions about the physiological roles of these proteins.
On the physiological level, such biochemical activity means that the role of ADRP would be to counteract the function of ARTD/PARP proteins. The latter enzymes are upregulated by interferon, indicating their relevance in the innate immune response. Systematic knockdown studies of all 17 mammalian PARPs in a Mouse Hepatitis Virus model with macrodomain mutants implicated PARP12 and PARP14 in the control of virus replication (Grunewald et al., 2019). Interestingly, PARP12, which is able to auto-ADP-ribosylate, belongs to a family of zinc-finger CCCH domains that are known to bind to RNAs, including those of viral origin. The antiviral properties of the protein have been linked to its enzymatic activity, its colocalization with polyribosomes via an RNA-binding domain and its interference with translation machinery (Atasheva et al., 2014). Also, PARP10, which is known to modify RNA (Munnur et al., 2019), has been shown to inhibit viral replication (Atasheva et al., 2012, 2014). The role of ADRP in jeopardizing the immune response has also been emphasized by studies showing that viruses with mutated macrodomains replicated poorly in bone-marrow-derived macrophages, which are the primary cells involved in mounting the innate immune response (Grunewald et al., 2019). Along the same lines, viruses with deactivated macrodomains were sensitive to interferon pretreatment (Kuri et al., 2011). It has recently been proposed that de-mono-ADP-ribosylation of STAT1 by ADRP may be linked to the Cytokine Storm Syndrome that is commonly observed in severe cases of COVID-19 (Claverie, 2020).
Since the role of macrodomains in pathogenesis is essential, it appears that their inhibition may help to reduce the viral load and facilitate recovery. Therefore, these proteins might be attractive targets for the development of small-molecule antivirals, assuming that highly selective compounds could be found that discriminate between viral and human macrodomains (Virdi et al., 2020). As a step towards this goal, we have determined the crystal structure of SARS-CoV-2 ADRP in multiple states: in the apo form and in complexes with 2-(N-morpholino)ethanesulfonic acid (MES), AMP and ADPr. With the apo crystals diffracting to atomic resolution, we have developed a robust system for structure-based experiments to identify potential small-molecule inhibitors.

Gene cloning, protein expression and purification
The gene for ADRP was synthesized using a codon-optimization algorithm for Escherichia coli expression and was cloned into a pET-11a vector (Bio Basic) and transformed into the E. coli BL21(DE3) Gold strain (Stratagene). For preparative purposes, for each protein batch a 4 l culture of LB Lennox medium was grown at 37 °C (190 rev min⁻¹) in the presence of 150 µg ml⁻¹ ampicillin. Once the culture reached an OD600 of ~1.0, the temperature setting was changed to 4 °C. When the bacterial suspension had cooled to 18 °C it was supplemented with the following components at the indicated concentrations: 0.2 mM isopropyl β-D-1-thiogalactopyranoside, 0.1% glucose and 40 mM K2HPO4. The temperature was set to 18 °C for a 20 h incubation. The bacterial cells were harvested by centrifugation at 7000g and the cell pellets were collected.
We have developed two protocols for purification, differing in the buffer composition. For the first batch of protein [ADRP(b1)] HEPES-NaOH pH 8.0 was used as the primary buffering component, while subsequent purifications [ADRP(b2)] used Tris-HCl at an identical pH value, unless stated otherwise. All of the steps were the same and are described below. The cell pellets were resuspended in 12.5 ml lysis buffer [500 mM NaCl, 5%(v/v) glycerol, 50 mM HEPES (or Tris) pH 8.0, 20 mM imidazole with 10 mM β-mercaptoethanol in Tris-based purification] per litre of culture and sonicated at 120 W for 5 min (4 s on, 20 s off). The insoluble material was removed by centrifugation at 30 000g for 1 h at 4 °C. The supernatant was mixed with 3 ml Ni2+ Sepharose (GE Healthcare Life Sciences) equilibrated in lysis buffer with the imidazole concentration increased to 50 mM, and the suspension was applied onto a Flex-Column (Kimble; catalogue No. 420400-2510) connected to a Vac-Man vacuum manifold (Promega). Unbound protein was washed out by controlled suction with 160 ml lysis buffer (50 mM imidazole). The bound protein was eluted with 15 ml lysis buffer supplemented to 500 mM imidazole pH 8.0. 2 mM dithiothreitol was added and the protein was subsequently treated overnight at 4 °C with Tobacco Etch Virus (TEV) protease at a 1:20 protease:protein ratio. The protein solution was concentrated using a 10 kDa molecular-weight cutoff filter (Amicon-Millipore) and was further purified on a Superdex 200 size-exclusion column in lysis buffer in which the β-mercaptoethanol had been replaced by 1 mM tris(2-carboxyethyl)phosphine (TCEP). The fractions containing ADRP were pooled and run one more time through Ni2+ Sepharose. The flowthrough was collected and buffer-exchanged into crystallization buffer [150 mM NaCl, 20 mM HEPES pH 7.5 (or Tris pH 8.0), 1 mM TCEP] via tenfold concentration and dilution repeated three times. The protein was immediately used in crystallization trials.
The final concentration of ADRP(b1) was 22 mg ml⁻¹ and the final concentration of ADRP(b2) was 32 mg ml⁻¹.
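The buffer exchange by repeated tenfold concentration and dilution can be checked with a short calculation. The sketch below is illustrative only: it assumes ideal mixing, that each cycle dilutes the old buffer by exactly the stated factor, and that the starting buffer is the 500 mM NaCl lysis buffer; the function names are hypothetical and not part of the published protocol.

```python
def residual_fraction(fold: float, cycles: int) -> float:
    """Fraction of the original buffer solutes remaining after repeated
    concentrate-and-dilute cycles (ideal-mixing assumption)."""
    if fold <= 1 or cycles < 0:
        raise ValueError("fold must be > 1 and cycles must be >= 0")
    return fold ** (-cycles)


def final_concentration(start_mM: float, target_mM: float,
                        fold: float, cycles: int) -> float:
    """Concentration of a solute after exchange from a buffer at start_mM
    into a buffer at target_mM."""
    frac = residual_fraction(fold, cycles)
    return target_mM + (start_mM - target_mM) * frac


# Tenfold concentration and dilution repeated three times
frac = residual_fraction(10, 3)
# NaCl: 500 mM in lysis buffer, 150 mM in crystallization buffer
nacl_mM = final_concentration(500, 150, 10, 3)
print(f"residual fraction: {frac:.4f}, NaCl: {nacl_mM:.2f} mM")
```

Under these assumptions, about 0.1% of the original buffer components remain, so the NaCl concentration ends within 1 mM of the target.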

Crystallization
Crystallization screening was performed by the sitting-drop vapour-diffusion method in 96-well CrystalQuick plates (Greiner Bio-One). The plates were set up with a Mosquito liquid dispenser (TTP Labtech) utilizing 400 nl of purified protein sample, which was mixed with 400 nl of well solution and equilibrated against 135 µl of reservoir solution. ADRP(b1) was used to grow apo-form crystals and for crystallization with AMP and ADPr. The AMP complex was prepared by adding AMP (pH 6.5) to a final concentration of 12 mM. To obtain the ADPr complex, the protein was mixed with ADPr in a 1:2 molar ratio. Crystallization screening was performed using the MCSG1, MCSG4 (Anatrace), SaltRX (Hampton Research), PACT Suite (Qiagen) and Index (Hampton Research) screens. ADRP(b2) was set up at 18 mg ml⁻¹ with the Pi-minimal (Jena Bioscience), Protein Complex Suite (Qiagen) and Index (Hampton Research) screens. In all cases, the plates were incubated at 289 K. ADRP(b1) crystals grew from a condition consisting of 0.1 M CHES pH 9.5, 30%(w/v) PEG 3000, yielding the structure denoted ADRP-APO1. The complex with ADPr was obtained from 0.01 M sodium citrate, 33% PEG 6000, giving the structure labelled ADRP-ADPr. The complex with AMP was grown from 0.1 M MES pH 6.5, 30%(w/v) PEG 4000, giving the structure labelled ADRP-AMP. ADRP(b2) crystals grew from 0.1 M MES pH 6.5, 30%(w/v) PEG 4000, yielding the ADRP-MES complex, and from 30 mM sodium/potassium tartrate, 150 mM AMPD-Tris pH 9.0, 34.3%(w/v) PEG 5000 MME, giving the crystals labelled ADRP-APO2.

Data collection, structure determination and refinement
Prior to flash-cooling in liquid nitrogen, the crystals were cryoprotected in their mother liquor supplemented with either an increased concentration of PEG 3000 up to 40% (ADRP-APO1), 5% glycerol (ADRP-ADPr), 7% ethylene glycol (ADRP-AMP) or 10% ethylene glycol (ADRP-MES). The ADRP-APO2 crystals did not require cryoprotection. The X-ray diffraction experiments were carried out at 100 K on the Structural Biology Center 19-ID beamline at the Advanced Photon Source, Argonne National Laboratory. The diffraction images were recorded on a PILATUS3 X 6M detector. The data sets were processed and scaled with the HKL-3000 suite (Minor et al., 2006). Intensities were converted to structure-factor amplitudes using TRUNCATE (French & Wilson, 1978; Padilla & Yeates, 2003) from the CCP4 package. The ADRP-APO1 structure was determined by molecular replacement (MR) using MOLREP (Vagin & Teplyakov, 2010) as implemented in the HKL-3000 software package with the SARS-CoV ADRP structure (PDB entry 2acf; Saikatendu et al., 2005) as a search probe. The subsequent structures were solved by MR using the refined SARS-CoV-2 ADRP structure as a model. In all cases, the initial solution was manually adjusted using Coot (Emsley et al., 2010) and then iteratively refined using Coot, Phenix (Liebschner et al., 2019) and REFMAC (Winn et al., 2011). The final rounds of refinement were carried out in Phenix (ADRP-APO1, ADRP-ADPr and ADRP-MES) or REFMAC (ADRP-AMP and ADRP-APO2). The ADRP-APO1 and ADRP-ADPr structures were refined with TLS parameterization of anisotropic displacement parameters, while for the remaining structures a full anisotropic refinement was calculated. The same 5% of reflections were excluded throughout refinement (in both the REFMAC and Phenix refinements). The final models show nearly complete polypeptide chains.
The residues that were not modelled owing to a lack of interpretable electron density include Gly1-Glu2 and Glu170 in chains A and B for ADRP-APO1; Gly1-Glu2-Val3 and Leu169-Glu170 in chain A, and Gly1-Glu2 and Glu170 in chain B for ADRP-ADPr; Gly1-Glu2 in chain A, and Gly1-Glu2 and Glu170 in chain B for ADRP-AMP; Gly1-Glu2-Val3 and Glu170 for ADRP-MES; and Gly1-Glu2 for ADRP-APO2. The stereochemistry of the structures was checked with MolProbity (Chen et al., 2010), PROCHECK (Laskowski et al., 1993) and the Ramachandran plot, and was validated with the PDB Validation Server. The data-collection and processing statistics are given in Table 1. The atomic coordinates and structure factors have been deposited in the PDB under accession codes 6vxs, 6w02, 6w6y, 6wcf and 6wen.

Table 1. Data-processing and refinement statistics. The values in parentheses are for the highest resolution shell. For ADRP-ADPr, the values for another shell in which the completeness achieves ~90% are also given. Footnotes: Karplus & Diederichs (2012). $R = \sum_{hkl} \bigl| |F_{\rm obs}| - |F_{\rm calc}| \bigr| / \sum_{hkl} |F_{\rm obs}|$ for all reflections, where $F_{\rm obs}$ and $F_{\rm calc}$ are observed and calculated structure factors, respectively. $R_{\rm free}$ is calculated analogously for the test reflections, which were randomly selected and excluded from the refinement. As defined by MolProbity (Chen et al., 2010).
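The R and R_free definitions from the Table 1 footnote can be written out as a minimal sketch. It is not the code used by Phenix or REFMAC: it assumes that the calculated amplitudes have already been placed on the same scale as the observed ones, and the function names and toy amplitudes are invented for illustration.

```python
from typing import Sequence, Tuple


def r_factor(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of |Fobs| is zero")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_free(f_obs: Sequence[float], f_calc: Sequence[float],
                    is_free: Sequence[bool]) -> Tuple[float, float]:
    """Split reflections by the test-set flag and return (R_work, R_free)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in test], [c for _, c in test])
    return r_work, r_free


# Toy amplitudes (hypothetical); the last reflection is in the test set
f_obs = [100.0, 50.0, 25.0, 80.0]
f_calc = [90.0, 55.0, 25.0, 100.0]
flags = [False, False, False, True]
print(r_factor(f_obs, f_calc), r_work_and_free(f_obs, f_calc, flags))
```

The same test-set flags have to be carried through every refinement round, as done here for both programs, because a reflection that has been refined against even once no longer gives an unbiased R_free.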

Protein production and structure determination
We used an E. coli codon-optimized synthetic gene with a sequence corresponding to SARS-CoV-2 ADRP to produce the protein for crystallographic and biochemical studies. The protein was crystallized under several conditions, yielding five crystal structures, denoted ADRP-APO1 (apo form), ADRP-ADPr (complex with ADPr), ADRP-AMP (complex with AMP), ADRP-MES [complex with 2-(N-morpholino)ethanesulfonic acid] and ADRP-APO2 (apo form). The ADRP-APO1 structure was solved first by molecular replacement using the SARS-CoV homologue structure (PDB entry 2acf; Saikatendu et al., 2005) as a search model. All of the subsequent structures were solved by MR using the refined SARS-CoV-2 ADRP structure as a template.
ADRP-APO1 was refined to 2.01 Å resolution. The protein crystallized in space group P1, with two molecules in the unit cell. None of the polypeptides contains ligand in the catalytic pocket, but there is an N-cyclohexyl-2-aminoethanesulfonic acid (CHES) molecule bound on the surface. Like ADRP-APO1, the ADRP-ADPr structure was solved in space group P1. It was refined with reflections extending to 1.50 Å resolution, although 88% completeness was only achieved to 1.65 Å resolution. The ADPr ligand is well defined in the electron-density map in both polypeptide chains. ADRP-AMP crystallized in space group P2₁, also with two molecules in the asymmetric unit. The atomic model was refined to 1.45 Å resolution. In the ADPr-binding pocket, one of the protein molecules (chain A) binds an AMP ligand with occupancy 0.8, while the other (chain B) binds a MES molecule with occupancy 0.7. In the latter case, there is additional electron density in the position where the adenine ring binds, but its quality prevented an acceptable interpretation. The ADRP-MES crystals also belonged to space group P2₁, but with a smaller unit cell and with only one protein molecule in the asymmetric unit. These crystals diffracted to 1.07 Å resolution. Two MES molecules were identified in the structure: one in the ADPr-binding pocket and another on the protein surface. Finally, the ADRP-APO2 structure was determined in space group C2, with one protein chain in the asymmetric unit. We used reflections extending to 1.35 Å resolution in refinement. The binding pocket in ADRP-APO2 has no small molecule present, with the exception of solvent. In all structures the polypeptide chains are nearly complete, with only a few residues missing at the termini, as detailed in Section 2. The data-collection and structure-refinement statistics are given in Table 1. All of the structures have been deposited in the Protein Data Bank (PDB).
[Figure: The structure of SARS-CoV-2 ADRP. The ribbon diagram shows ADRP-APO2. The ADPr ligand molecule is shown based on superposition with the ADRP-ADPr structure.]

Substrate-binding pocket
The well defined substrate-binding pocket is created by the C-terminal edges of the central β-strands β3, β5, β6 and β7 and the surrounding fragments, primarily loop β3-α2, the N-terminus of helix α1 and a long loop connecting β6 to α5, which contains the short 3₁₀-helix η3. These elements encompass four conserved sequence motifs (Fig. 2). The first such block is present at the end of β3 and is followed by another that extends into helix α2. The third segment corresponds to the end of β5 and the last segment overlaps with helix η3.
Within the crevice, four sections can be distinguished, corresponding to adenine-binding, distal ribose-binding, diphosphate-binding and proximal ribose-binding sites, denoted here as A, R1, P1-P2 and R2, respectively. The ADRP-ADPr structure illustrates how the ligand molecule interacts with these subsites (Figs. 3 and 4). The adenine moiety is sandwiched between α2 and β7 in a mostly hydrophobic environment created by Ile23, Val49, Pro125, Val155 and Phe156. Polar contacts are facilitated by Asp22, which forms a hydrogen bond to the N6 atom via its carboxylate group, and by the main-chain amide of Ile23, which binds to the N1 atom. In addition, water-mediated contacts link the N3 atom to the main chain of Ala154 and Leu126. The A site has limited sequence conservation: only Pro125 and Asp22 are conserved among the homologues. Other hydrophobic residues are replaced by side chains with a similar chemical character. The striking exception is Phe156, which is replaced by Asn in the closest homologues from SARS-CoV and MERS-CoV. In other viral representatives it is substituted by another hydrophobic residue. The distal ribose ring only participates in water-mediated hydrogen bonds to the main-chain amide of Leu126 and the carbonyl group of Ala154 via the ring O atom and to the Asp157 main chain and side chain via the OH2′ group. The diphosphate moiety binds between two loops, β3-α2 and β6-(η3)-α5, that cover three segments with high sequence conservation, including a glycine-rich segment (Gly46-Gly47-Gly48) within the former loop. Here, the ligand forms direct hydrogen bonds to the main-chain amides of Val49, Ser128, Gly130, Ile131 and Phe132 and water-mediated contacts with Ala38, Ala39, Ala50, Val95 and Gly97. An elaborate network of water molecules also links the diphosphate to Gly47, Ala129 and Asp157.
Finally, the proximal ribose ring is stabilized in the pocket by hydrophobic interactions with Phe132 and Ile131, as well as a set of hydrogen bonds with Gly46 (OH2′), Gly48 (OH1′) and Asn40 (OH3′). All of these residues are conserved. Additional bonds to the main-chain peptides of Asn40, Lys44 and Ala50 are water-mediated. Interestingly, as described above, only a few hydrogen bonds involve protein side chains, with most such contacts utilizing main-chain atoms. This may explain why there is less pressure on amino-acid sequence preservation, since main-chain interactions can be accomplished with multiple side-chain combinations.
Similar contacts are observed in the ADRP-AMP structure (Fig. 3), in which the ligand superposes well with the AMP portion of the ADPr ligand (Fig. 5). The ADRP-MES complex, however, presents a somewhat different scenario, in which the morpholine ring takes the place of the proximal ribose and the sulfonate group substitutes for the proximal phosphate (P2). The latter group forms the hydrogen bonds observed in the ADPr complex and an additional network of solvent-facilitated contacts. The ring moiety appears to be anchored primarily by hydrophobic interactions with Phe132 and Ile131, and a hydrogen bond might potentially be present between the morpholine O atom and Asn40, although the geometry is rather unfavourable.

Ligand-induced conformational changes
While the interactions with ligands do not trigger major conformational changes in the overall structure, significant shifts are observed in the binding pocket itself. This is consistent with the differential scanning fluorimetry (DSF) measurements, which show that AMP and ADP do not affect the thermal stability of ADRP and only ADPr causes a small (2.5 °C) increase in Tm (Supplementary Fig. S1). Superpositions of the apo forms with the complexed proteins indicate several adjustments (Fig. 5). Firstly, in the A site Phe156 is brought closer to the pocket lumen when it is occupied by the nucleotide, as seen in the ADRP-ADPr and ADRP-AMP complexes. The glycine-rich β3-α2 loop shows a high degree of flexibility, with roughly the same geometry but slightly different positions in ADRP-APO1, ADRP-MES, ADRP-AMP and ADRP-APO2 (Fig. 5). In the latter structure, however, the Gly46-Gly47 peptide bond also has an alternative conformation. A significant change is observed in ADRP-ADPr, where the loop has to rearrange to make the main-chain amide N atoms accessible for interactions with the ribose OH1′ and OH2′ groups. Finally, the geometry of the β6-α5 linker, contributing to the P1-P2 and R2 sites, also adapts depending on the ligand identity. The apo and AMP-bound forms contain the η3 element within the β6-α5 linker, while in the ADPr and MES complexes this region does not conform to 3₁₀-helix geometry. The primary reason for this is the flipping of the Ala129-Gly130 peptide bond, which in the absence of phosphate 2, or its mimetic, has the carbonyl group facing the P2 site. Otherwise, with P2 occupied, the Gly130 amide group is hydrogen-bonded to the ligand, as described above. Ile131 and Phe132 are also observed in two states. With the R2 pocket empty or containing MES, Ile131 adopts the pt rotamer (p, plus, centred near +60°; t, trans, centred near 180°), while in the presence of a ribose ring it converts to the mt state (m, minus, centred near −60°) (Hintze et al., 2016). Phe132 follows a somewhat similar pattern: in the first scenario it adopts an m-10 conformation, while in the latter it adopts an m-80 conformation. These rearrangements are necessary to provide sufficient room for the ligand and proper interactions. Similar transformations in the ligand-binding pocket have been reported for other homologues (Egloff et al., 2006; Piotrowski et al., 2009; Wojdyla et al., 2009). In the ADRP-APO1 structure, while the described geometry of the β6-η3-α5 linker remains similar to that in ADRP-APO2, the entire section and the neighbouring α1 are shifted away from the binding pocket.
[Figure: Stereoview of ADPr binding in the SARS-CoV-2 ADRP binding site.]
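The p/t/m letters in rotamer names such as pt and mt can be illustrated with a small sketch that assigns each χ angle to the nearest of the three staggered centres (+60°, 180°, −60°). The bin boundaries at 0° and ±120° are a simplifying assumption made here, not the published MolProbity rotamer definitions, and the function names are hypothetical. Names that carry a numeric χ2 value, such as m-10 or m-80, are not handled.

```python
def wrap_angle(deg: float) -> float:
    """Map any angle in degrees onto the interval [-180, 180)."""
    return (deg + 180.0) % 360.0 - 180.0


def chi_bin(chi_deg: float) -> str:
    """Return 'p' (near +60), 'm' (near -60) or 't' (near 180) for a chi angle."""
    angle = wrap_angle(chi_deg)
    if 0.0 <= angle < 120.0:
        return "p"
    if -120.0 <= angle < 0.0:
        return "m"
    return "t"


def rotamer_label(chi1_deg: float, chi2_deg: float) -> str:
    """Two-letter label built from the chi1 and chi2 bins, e.g. 'pt' or 'mt'."""
    return chi_bin(chi1_deg) + chi_bin(chi2_deg)


# Hypothetical chi angles chosen to fall in the two Ile131 states named in the text
print(rotamer_label(62.0, 170.0))    # R2 pocket empty or containing MES
print(rotamer_label(-65.0, 175.0))   # ribose ring present in R2
```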

Similarity of ADPr binding between ADRP homologues
The PDB currently contains four other coronaviral ADRPs in complexes with ADPr, from SARS-CoV (PDB entry 2fav; Egloff et al., 2006), MERS-CoV (PDB entries 5hol and 5dus; Lei et al., 2018; Cho et al., 2016) and H-CoV-229E (PDB entry 3ewr; Xu et al., 2009), and also those from the animal-infecting Infectious Bronchitis Virus (IBV; PDB entry 3ewp; Piotrowski et al., 2009) and Feline Infectious Peritonitis Virus (FIPV; PDB entry 3jzt; Wojdyla et al., 2009). The SARS-CoV and MERS-CoV complexes mostly follow the pattern of interactions observed in the current structure (Fig. 6). The ligand geometry is also preserved. The elements that are distinct are located in the A and R1 sites. Most strikingly, Phe156 in the SARS-CoV-2 ADRP is replaced by Asn157 in the SARS-CoV homologue (Asn154 in MERS-CoV in PDB entry 5dus) that stacks against the adenine ring and at the same time creates water-mediated hydrogen bonds to the distal ribose. Three other sequence discrepancies with the MERS-CoV ADRP are located in this region: Ile23 is replaced by Ala21 (Ile24 in SARS-CoV), Val49 by Ile47 (Val50 in SARS-CoV) and Leu160 by Val158 (Leu161 in SARS-CoV). These changes are most likely to be responsible for a small discrepancy between the ADPr molecules bound to these structures.
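The residue equivalences listed in this paragraph can be collected into a small lookup table, which makes it easy to see which positions in the A and R1 sites actually change residue type. The data structure and function name below are illustrative; the residue names and numbers are taken from the text, with MERS-CoV numbering as in PDB entry 5dus.

```python
# SARS-CoV-2 residue -> structurally equivalent residue in each homologue
EQUIVALENTS = {
    "Phe156": {"SARS-CoV": "Asn157", "MERS-CoV": "Asn154"},
    "Ile23": {"SARS-CoV": "Ile24", "MERS-CoV": "Ala21"},
    "Val49": {"SARS-CoV": "Val50", "MERS-CoV": "Ile47"},
    "Leu160": {"SARS-CoV": "Leu161", "MERS-CoV": "Val158"},
}


def substitutions(virus: str) -> dict:
    """Return the positions where the homologue has a different residue type,
    comparing the three-letter residue codes and ignoring the numbering."""
    changed = {}
    for reference, partners in EQUIVALENTS.items():
        partner = partners[virus]
        if partner[:3] != reference[:3]:
            changed[reference] = partner
    return changed


print(substitutions("SARS-CoV"))   # only the Phe156 position differs
print(substitutions("MERS-CoV"))   # all four listed positions differ
```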
A more divergent picture is observed in the distant homologues from H-CoV-229E, IBV and FIPV (Fig. 6), mainly in the A and R1 sites, with the caveat that the distal ribose in the H-CoV-229E ADRP complex has the wrong stereochemistry. In these homologues, we observe sequence variation in the Phe156 position, which is replaced by other hydrophobic residues. The adenine ring is significantly shifted with respect to SARS-CoV-2 ADRP. The interaction between the N6 atom of adenine and the Asp22 equivalent is lost, even though the latter amino acid is conserved in the three-dimensional context (Asp20 in IBV does not overlap in the primary sequence). The distal ribose is better anchored in place: hydrogen bonds link it either to the glutamate residue (Glu156 in H-CoV-229E and Glu191 in FIPV) that substitutes for Leu160 or to the serine in the position of Val155 (Ser160 in IBV). Another notable difference is observed in the R2 site, where the equivalents of Ile131 in the H-CoV-229E and IBV proteins adopt outlier rotamers, yet the electron-density maps allow the more favourable conformations seen in our structure to be modelled. In these two models, the proximal ribose adopts an α configuration of the anomeric C atom (Fig. 6). Such a state, with partial occupancy, has also been reported for one of the SARS-CoV complexes (Egloff et al., 2006) and is linked to the alternative, apo-like conformation of Gly47-Gly48. The α configuration is most likely to illustrate the geometry of the putative substrate, as only then is the hydroxyl group exposed to the solvent, providing room for the macromolecule portion of the substrate.
[Figure 6: Comparison of SARS-CoV-2 ADRP-ADPr with homologous complexes. (a) Superposition of SARS-CoV-2 ADRP-ADPr (yellow) with homologues from MERS-CoV (PDB entry 5dus; blue) and SARS-CoV (PDB entry 2fav; purple). (b) Superposition of SARS-CoV-2 ADRP-ADPr with homologues from H-CoV-229E (PDB entry 3ewr; grey), IBV (PDB entry 3ewp; teal) and FIPV (PDB entry 3jzt; green). In (a), SARS-CoV-2 residues are labelled in black. In both panels, selected residues of homologous proteins are labelled.]
The common feature in the R2 site among all homologues is the presence of equivalents of Phe132, Asn40 and the glycine-rich loop: these elements have been shown to be crucial for ADRP activity of the SARS-CoV protein through mutational studies (Egloff et al., 2006; Li et al., 2016) and the study of macrodomains from viruses from other families (Malet et al., 2009; Li et al., 2016).

Catalytic mechanism
In the absence of potential catalytic residues that are conserved across all of the macrodomains, Jankevicius and coworkers proposed an enzymatic mechanism involving substrate-assisted catalysis, in which a water molecule that is responsible for nucleophilic attack on the anomeric C atom of the ribose is activated by the Pα group (Jankevicius et al., 2013). In the current ADRP-ADPr structure, the candidate water molecule (Wat) binds to the amide group of Ala50, the carbonyl of Ala38, the O atom of Pα and the OH1′ group of the proximal ribose ring of ADPr (Figs. 3 and 4). In the ADRP-APO2 and ADRP-MES structures, the last hydrogen bond is replaced by an interaction with the carbonyl group of Gly47, enhancing the proton-abstraction capabilities of the environment. Presumably, based on the models in which ADPr exists as an α anomer, a similar network would be likely to occur in the complex with ADP-ribosylated protein or RNA substrates, assuming no major conformational rearrangements. The water molecule is ideally located to carry out a nucleophilic attack on the anomeric C atom.

Conclusions
The large, multidomain Nsp3 includes an ADP-ribose phosphatase domain (ADRP/MacroD), which is believed to interfere with the host immune response by removing ADP-ribose from ADP-ribosylated proteins or RNA. Our study presents five atomic and high-resolution structures of SARS-CoV-2 ADRP, including the apo form and complexes with MES, AMP and ADPr. Their analysis shows that the enzyme undergoes conformational changes upon ADPr binding, which is in agreement with several previous reports showing such rearrangements. The shifts, which affect both the main chain and side chains, are observed primarily around the proximal ribose, where the protein has to make room for the sugar moiety and adjust to both configurations of the anomeric C atom. The active-site water molecule is proposed to carry out a nucleophilic attack on the anomeric C atom of the ribose. Our high-resolution studies of ADRP complexes with ligands allow accurate modelling of the active site of ADRP and will aid in the design of compounds that can inhibit the activity of this enzyme.

Related literature
The following reference is cited in the supporting information for this article: Huynh & Partch (2015).