Proc Natl Acad Sci U S A. 2010 Sep 14; 107(37): 16078–16083.
Published online 2010 Aug 30. doi:  10.1073/pnas.1007144107
PMCID: PMC2941324

Structure and inhibition of herpesvirus DNA packaging terminase nuclease domain


During viral replication, herpesviruses package their DNA into the procapsid by means of the terminase protein complex. In human cytomegalovirus (herpesvirus 5), the terminase is composed of subunits UL89 and UL56. UL89 cleaves the long DNA concatemers into unit-length genomes of appropriate length for encapsidation. We used ESPRIT, a high-throughput screening method, to identify a soluble purifiable fragment of UL89 from a library of 18,432 randomly truncated ul89 DNA constructs. The purified protein was crystallized and its three-dimensional structure was solved. This protein corresponds to the key nuclease domain of the terminase and shows an RNase H/integrase-like fold. We demonstrate that UL89-C has the capacity to process the DNA and that this function is dependent on Mn2+ ions, two of which are located at the active site pocket. We also show that the nuclease function can be inactivated by raltegravir, a recently approved anti-AIDS drug that targets the HIV integrase.

Keywords: crystal structure, human cytomegalovirus, DNA encapsidation, endonuclease inhibitor, combinatorial library

Human cytomegalovirus (HCMV) is a member of the herpes family of viruses or Herpesviridae. This group includes the human pathogens herpes simplex virus types 1 and 2, varicella zoster virus, Epstein–Barr virus, cytomegalovirus, roseolovirus, and Kaposi sarcoma-associated herpesvirus. Among these, HCMV, which belongs to the Betaherpesvirinae subfamily, is widespread throughout the human population and causes the most morbidity and mortality. HCMV infection is rarely serious for people with a competent immune system, although it persists in the host cells and propagates to other individuals. In contrast, infection or reactivation of HCMV is a major cause of life-threatening complications in immunocompromised individuals, such as organ transplant recipients and leukemia or AIDS patients, and is the most significant viral cause of birth defects in industrialized countries (1).

The HCMV genome consists of linear dsDNA of 230 kb with the highest coding capacity among Herpesviridae. HCMV, like all other herpesviruses, replicates its genomic DNA into high molecular mass head-to-tail concatemers. The newly synthesized multicopy chains of DNA are then excised into unit-length genomes and each genome is packaged singly into one viral procapsid (2). Maturation into unit-length genome molecules involves viral DNA recognition and cleavage at the site-specific pac motifs, which are redundant motifs found at both the 5′ and 3′ genomic termini (2, 3). The dsDNA endonuclease and packaging activities are performed by a protein complex, the terminase, composed of subunits UL56 and UL89 (4, 5). UL56 has been reported to recognize the pac motif (6).

After cleaving one end of the DNA, the terminase translocates the viral DNA into the procapsid, deriving energy for this process through ATP hydrolysis. The procapsid is filled with the DNA molecule and the terminase performs a second dsDNA cleavage, thereby concluding the translocation (3). UL89 has predicted ATPase activity and is most probably the molecular motor for helicase-like DNA translocation (7). In addition, UL89 binds and cleaves DNA molecules, and this activity is enhanced when UL56 is present (5). In agreement, other studies have shown that UL89 interacts specifically with the C-terminal part of UL56 (8). UL56 has the capacity to bind linearized DNA but only upon addition of UL89 is the DNA cut into smaller fragments. This observation indicates that these proteins mediate a concerted reaction of DNA recognition and cleavage (5).

Proteins homologous to UL89 are known in all herpesviruses, as are the other terminase subunits, thus indicating that the DNA packaging mechanism is highly conserved (9). The ul89 ORF includes two exons separated by a 3,902 bp intron. It encodes a two-domain 674 amino acid protein with predicted N-terminal ATPase and C-terminal nuclease activity (Fig. S1) (7). Bacteriophages translocate shorter DNA molecules into their capsids by similar packaging systems (9). Their terminases have been intensively studied, in particular those of phages T4 and RB49 (10), λ (11), SPP1 (12), P22 (13), and Sf6 (14), and the structures of the nuclease domains of the large terminase subunits gp17 of RB49 (15) and G2P of SPP1 (16) and of the full-length large terminase subunit gp17 of T4 (15) have been determined. A theoretical model for the structure of the C-terminal domain of UL89 has been proposed recently (17).

Several studies report inhibitors that prevent the formation of new virions through blockage of the termination system (18–22). However, the structural and functional characterization of herpes packaging proteins, which could assist further discovery and development of antiviral molecules, has been hindered by the difficulties in expressing enough soluble material for structural analysis. We have overcome these problems by using ESPRIT (23, 24), a combinatorial library method for defining soluble constructs through random gene truncation and expression screening. This approach yielded a single active soluble construct corresponding to the C-terminal nuclease domain of UL89, the structure of which we report herein at a resolution of 2.15 Å.


Expression of UL89 with a Library-Based Construct Screen.

The two exons of the HCMV ul89 gene were cloned as a single DNA construct and initially tested for protein expression in several Escherichia coli strains and conditions. No protein obtained from these assays was stable enough to withstand purification. Similar results were obtained when the full-length gene was expressed in insect or mammalian cells. Extensive trials with refolding protocols were also unsuccessful. A number of constructs for each putative domain were designed, based on secondary structure prediction, globularity and disorder, but none of them expressed soluble protein in any system or condition assayed.

Subsequently, to find soluble domains, we used the combinatorial library method ESPRIT (23, 24), which generates comprehensive libraries of 5′ or 3′ truncated genetic constructs of the target. Both libraries were synthesized from the ul89 gene. We then screened 9,216 clones for each library form, corresponding to an approximate four-fold oversample of all possible domain boundaries, for expression of soluble protein. The two libraries were arrayed onto the same nitrocellulose membrane, and colonies were screened for putative soluble protein expression in colony format using measurements of in vivo biotinylation efficiency of a C-terminal biotin acceptor peptide by fluorescent streptavidin hybridization (23). Although a relatively large number of clones exhibited positive signals in the 3′ truncation library, small-scale 4 mL liquid expression trials yielded only marginally soluble uninteresting fragments of less than 20 kDa in size. In contrast, the 5′ truncation library yielded several partially soluble, purifiable constructs of similar size (approximately 37 kDa), from which the 48K22 construct was selected as showing the best behavior following scale-up testing (Fig. S1). This construct was only partially soluble (estimated at 5% of total UL89 protein), but was stable through scale-up to 12 L culture volumes and yielded approximately 1 mg of purifiable monodisperse protein per liter of culture. Other similar-sized constructs identified as partially soluble in small-scale testing did not maintain solubility during subsequent scale-up steps. Subsequent DNA sequencing and mass spectrometry fingerprinting identified construct 48K22 as a C-terminal fragment of UL89 (residues 418 to 674; Fig. S1), hereafter termed UL89-C. This fragment falls inside the predicted C-terminal nuclease domain, encoded in exon 2.
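The oversampling figure quoted above can be checked with a short calculation. The sketch below is illustrative only and is not part of the ESPRIT protocol: it assumes that every nucleotide of the 674-codon coding sequence is a possible truncation end point and that clones are drawn uniformly at random, neither of which is stated in the text. The function names are hypothetical.

```python
# Values taken from the text.
PROTEIN_LENGTH_AA = 674      # length of full-length UL89
CLONES_PER_LIBRARY = 9216    # clones screened for each truncation library


def oversampling(clones: int, n_positions: int) -> float:
    """Average number of clones screened per possible truncation point."""
    return clones / n_positions


def expected_coverage(clones: int, n_positions: int) -> float:
    """Probability that a given truncation point is sampled at least once.

    ASSUMPTION: every truncation point is equally likely in the library.
    """
    return 1.0 - (1.0 - 1.0 / n_positions) ** clones


# ASSUMPTION: one possible truncation end point per coding nucleotide.
n_positions = PROTEIN_LENGTH_AA * 3

fold = oversampling(CLONES_PER_LIBRARY, n_positions)
coverage = expected_coverage(CLONES_PER_LIBRARY, n_positions)

print(f"possible truncation points: {n_positions}")
print(f"oversampling: {fold:.1f}-fold")
print(f"expected coverage of any one point: {coverage:.1%}")
```

Under these assumptions the ratio falls between four- and five-fold, in line with the approximate four-fold oversampling reported, and nearly every individual truncation point would be expected to appear at least once among the screened clones.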

Overall Structure of UL89-C.

UL89-C displays a wedged shape with dimensions 40 × 35 × 46 Å. A central eight-stranded mixed β-sheet, with parallel and antiparallel strands, is flanked by α helices on both sides (Fig. 1A). At one side, hydrophobic interactions pack α2 and α3 against the sheet. At the other side, helices α1, α4, α5, and α6 form a bunch that interacts with the β-sheet by hydrophobic interactions from one side of α5 and α6 and by hydrophilic contacts made by the α1 and α4 C-terminal ends. Two 3₁₀ helices, η1 and η2, at loops connecting β1 to β2 and α6 to β10, border one end of the β-sheet. The strand order in the central sheet is 1, 9, 4, 3, 2, 5, 6, and 10 with topology +4, -1, -1, +3x, +1x, -5x, +6 (25) (Fig. 1C and Fig. S2). At both lateral edges of the β-sheet, β1 and β10 form short strands of only three amino acids each. At one end of the β sheet, long loops surround a cleft that typically harbors the active site in proteins sharing this fold. One of these loops flanking the active site cavity folds in a twisted β-hairpin, formed by β7 and β8 (Fig. 1A).

Fig. 1.
(A) Overall structure of UL89-C in a ribbon representation. The metal ions are indicated by yellow spheres. UL56-interacting helix α4 is highlighted in blue, α helices in cyan, 3₁₀ helices in green, and β strands in magenta. ( ...

UL89-C Belongs to the RNase H-Like Superfamily.

A search for structurally similar proteins revealed that UL89-C has the characteristic fold of the RNase H-like superfamily of nucleases and polynucleotidyl transferases (26). The closest structural relatives to UL89-C are the recently reported nuclease domains of the large terminase subunits of bacteriophages, RB49 and T4 gp17 (15) (RMSD 2.6 Å for 158 equivalent Cα and 2.7 Å for 159 equivalent Cα, respectively) and SPP1 G2P (16) (RMSD 3.0 Å for 146 equivalent Cα), the Holliday junction resolvase RuvC (27) (RMSD 2.6 Å for 115 equivalent Cα), the HIV integrase (28, 29) (RMSD 2.6 Å for 78 equivalent Cα) and the avian sarcoma virus integrase (30) (RMSD 2.9 Å for 85 equivalent Cα). The crystal structures of all these proteins and other members of the superfamily display the same basic fold but vary in length and show almost no amino acid sequence identity (i.e., 7.7% identity between UL89-C and the closest structural relative, the nuclease domain of RB49 gp17, after structural alignment). The structural homology between these enzymes can be well described from the structural pattern of human RNase H1 (Hs-RNase H1) (31), which consists of a five-stranded β sheet surrounded by α helices on both sides. The order and orientation of the strands within the β-sheet are conserved: 3, 2, 1, 4, and 5, one of them being antiparallel to the other four (↑↓↑↑↑). These strands are equivalent to the UL89-C β-strands 4, 3, 2, 5, and 6, respectively, whereas helices αA, αB, and αE of Hs-RNase H1 correspond to helices α2, α3 and α6 of UL89-C. All these elements are arranged similarly in all proteins of the superfamily, except for α6 (αE in Hs-RNase H1), which runs in the opposite direction in UL89-C, RB49 gp17, SPP1 G2P, and RuvC with respect to the other members of the superfamily (Fig. S3). UL89-C (257 aa) is larger than the bacteriophage homologous proteins gp17 (206 aa) (15) and G2P (178 aa) (16). It is also larger and more complex than RNase H, integrase or resolvase nuclease domains, with the central β-sheet composed of 8 strands rather than 5, and further α helices and other secondary structure elements. Other members of the superfamily, like Tn5 transposase (32, 33) and Piwi–Argonaute (34), also have additional structural elements around the basic RNase H fold, although they are quite different from those found in UL89, thus reflecting their diverse functions, substrates and interactions with other proteins.

Active Site Cleft.

The active site is located at one end of the central β-sheet in a cleft formed by conserved residues, four of them acidic (Fig. 1B). In all structures with an RNase H-like fold, the active site is located at a topologically equivalent position, at one end of the β-sheet where two parallel β-strands (β2 and β5) separate in a fork-like manner. Asp463, Glu534, and Asp651 coordinate two metal cations (see below). Asp463 is located at the C-terminal end of β2 whereas Glu534 is present at the end of β5. Asp651 is found at the beginning of α6, the last α-helix in the structure, which lies diagonally to the two β-strands on one of the faces of the central β-sheet (Fig. 1). These three acidic amino acids are fully conserved and confer a strong electronegative character to the active site (Fig. 2 and Fig. S2). A further conserved aspartate residue, Asp650, is located close to the active site cleft (Fig. 1B) but does not interact directly with any of the metal ions. Comparison of the active site of UL89-C with other RNase H-like nucleases (Fig. S3) shows that the presence of several acidic residues coordinating metal ions is a signature of the superfamily; in particular, the central residue is always an aspartate (Asp463 in UL89-C). The other residues coordinating the metal may vary. For example, in bacteriophage SPP1 G2P (16), one of the closest structural relatives to UL89, an aspartate residue (Asp321) occupies a position equivalent to Glu534 in UL89-C and a histidine residue (His400) that of Asp651. However, in T4 and RB49 gp17 (16) the residues coordinating the metal ion are identical to those of UL89-C (Fig. S3). Moreover, in these structures an aspartate residue occupies the equivalent position of Asp650 in UL89-C, although two additional acidic residues of the active pocket present in the bacteriophage structures, Asp406 and Glu401 in RB49 gp17, are not present in UL89-C.

Fig. 2.
UL89-C surface representations. (A) Conservation surface of UL89-C based on a Risler matrix calculation. Residue color is shown on the basis of conservation score within the eight human herpesviruses: orange, more than 80% similarity; green, full conservation. ...

The Active Site Accommodates Two Cations.

In the crystal not soaked with MnCl2, one metal ion was clearly identified at the active site. This corresponds to metal B, as defined by Nowotny and Yang (35). In molecule D of this crystal, an electron density peak initially assigned as a water molecule (the strongest peak of the water list) could also correspond to another metal ion located at a second position with low occupancy. Indeed, an anomalous difference map calculated from diffraction data from a crystal soaked with MnCl2 showed two peaks at these two positions (Fig. S4). The first one is coordinated by Asp463 and Glu534 and the second by Asp463 and Asp651 (Fig. 1B). Asp463, Glu534 and Asp651 (and the closest residues Pro464, Ala465, Gly535, Asn536, and Asp650) are fully conserved among human herpesvirus terminases (Fig. S2), thereby suggesting that they are essential for cation coordination and thus for catalysis. Indeed, Mg2+ or Mn2+ cations are required for the functioning of these enzymes and a two-metal catalysis has been proposed for their enzymatic mechanism (35, 36).

In Vitro Nuclease Assays and Mutants.

An in vitro assay demonstrated that UL89-C has the capacity to degrade linear and circular DNA and this function is strongly activated by Mn2+ (Fig. 3 A and B). In the presence of this cation, UL89-C converts supercoiled circular plasmid DNA to nicked open circular DNA, subsequently to linear DNA and finally to completely degraded DNA (Fig. S5). Similarly, UL89-C also degrades linear DNA (Fig. 3 A and C). Similar behavior was previously described for the UL89 full-length protein (5). The reaction performed in the same conditions but in the presence of Mg2+ instead of Mn2+ converted only supercoiled circular plasmid to nicked open circular DNA. With Ca2+, the DNA degradation was even less efficient (Fig. 3 A and B). To verify that the residues of the structurally inferred active site were truly involved in the nuclease activity of the protein, we designed a set of single and double mutants and tested their activity. The single mutants D463A, D651A, and the double mutant D463A/E534A showed only residual activity (Fig. 3 C and D). These results confirmed that UL89-C harbors the nuclease activity critical for the function of the full-length protein.

Fig. 3.
UL89-C in vitro nuclease assay. (A) Effect of divalent cations on UL89-C linear dsDNA nuclease activity. Lane 1: linear (digested with HindIII) pUC18 plasmid in the absence of the nuclease. Lane 2: nuclease reaction in the absence of divalent cations. ...

Inactivation by Raltegravir.

The structural similarity between the herpesvirus terminase nuclease domain and the HIV integrase prompted us to test the inhibitory properties of integrase inhibitors on UL89-C. One of these integrase inhibitors, raltegravir (MK0518), was approved by the FDA in 2007 for the treatment of AIDS (37). Raltegravir turned out to be a strong inhibitor of the nuclease activity of UL89-C (Fig. 4). A recent structure of the prototype foamy virus integrase in complex with DNA and the inhibitor shows that raltegravir binds at the active site, directly coordinating the metal ions (38). Presumably, it would bind in a similar way to UL89-C. In contrast to raltegravir, another integrase inhibitor, elvitegravir (GS9137), had no inhibitory effect on UL89-C under similar conditions.

Fig. 4.
Inhibition of UL89-C nuclease activity by raltegravir. Lane 1: Circular pUC18 plasmid in the absence of the nuclease. Lane 2: Linear (digested with HindIII) pUC18 plasmid in the absence of the nuclease. Lane 3: Nuclease assay with the UL89-C WT protein. ...


A Powerful Construct Screening Technique to Obtain UL89-C.

UL89, like other herpesvirus DNA packaging proteins, is scarcely expressed in a soluble, purifiable form. Even using insect or mammalian eukaryotic expression systems, we and others were unable to purify a soluble form of this protein in sufficient amounts for crystallographic or even limited proteolysis studies. The ESPRIT (23, 24) analysis reported here permitted the oversampling of all possible domain boundaries as hexahistidine tag fusion positions. However, even this approach resulted in a very low number of soluble expression constructs. This finding is indicative of the challenging nature of UL89. The resulting construct encoding the UL89 C-terminal domain expressed protein that was partially soluble, but the purifiable material was monodisperse and well-behaving through subsequent concentration and crystallization steps. The identification of this otherwise obscure expression-compatible construct is illustrative of the power of this technique to find rare soluble forms of difficult proteins, and this approach appears particularly effective for viral proteins with uncertain domain boundaries (39).

DNA Binding.

UL89-C cleaves dsDNA in vitro (Fig. 3), as reported previously for the full-length protein (5). This domain should bear the structural determinants for DNA binding. An electrostatic surface calculation indicates that a number of positively charged residues, located in different loops, surround the active site cleft (Fig. 2B). From this calculation, the shape of the surface, and superpositions with Bacillus halodurans (40) and human RNase H structures in complex with a DNA/RNA hybrid (31) and Tn5 transposase in complex with DNA (33), we manually built a model for dsDNA bound to UL89-C (Fig. S6). In this model, the loops Lβ2-β3 and Lβ5-α3 fit into the major groove of the DNA, whereas positively charged side chains appear in close proximity to the phosphates. The sugar-phosphate backbone enters the active site but does not get close enough to the metal ion positions. A distortion (bend) from the regular straight B-DNA used in the model would be necessary for the scissile phosphate to reach the metal ions without clashes of the DNA with the active site surrounding loops. In the bacteriophage large terminase structures, some of these loops (i.e., β2-β9 and β5-α3) are shorter or less protruding, resulting in a slightly less deep active site. However, Smits et al. (16) reported that a protruding β-hairpin in the SPP1 G2P structure would clash with the DNA and suggested that the conformation of this loop changes upon DNA binding. This loop corresponds to loop α5-α6 in UL89-C, which was disordered and not visible in our structure, in agreement with its proposed flexibility.

In RNase H, the equivalent loops to UL89 Lβ2-β3 and Lβ5-α3 contact the minor groove of the RNA/DNA hybrid, instead of the major groove of the dsDNA, as in our DNA-UL89-C docking model. This observation is not contradictory because the conformation of the RNA/DNA hybrid is a mixture between A and B forms, where the minor groove is wider and the bases are accessible. However, the viral dsDNA is most likely in the B-conformation (albeit probably distorted) where the minor groove is too narrow to permit the entrance of loops Lβ2-β3 and Lβ5-α3. Therefore, interaction with the bases, if present, would be performed through the major groove. Indeed, loop Lβ5-α3 is longer and more protruding than its RNase H equivalent. The shallow RNA/DNA hybrid minor groove cannot accommodate this loop, but a deeper B-DNA major groove would fit (Fig. S6).

UL89-C Within the Terminase Complex.

In the crystal structure, UL89-C shows four protein molecules in the asymmetric unit, A, B, C, and D. Molecules A and B interact with each other about a local two-fold axis, as do molecules C and D (Fig. S7). The interaction surface is at the edge of the central β-sheet so that the sheet extends from one protein to its neighbor. Although UL89 dimers have been detected by cross-linking and gel filtration of the full-length protein (8), UL89-C eluted as a monomer in the size-exclusion chromatography. Thus, with the data available, it is unclear whether the dimer observed in the crystal structure has any physiological relevance or whether it is due to crystal packing. Furthermore, phage and herpesvirus terminases are believed to form toroidal structures and assemble as such against the 12-fold portal protein (5, 10), for which a dimer like that observed in the crystal structure of UL89-C would not fit. It has been demonstrated by cryo-EM that the phage T4 gp17, a homolog of UL89, forms pentamers (15). In the present structure there is no evidence of ring formation and it is likely that the oligomerization determinants for such an arrangement are outside the UL89-C domain.

UL89 interacts with the UL56 subunit of the terminase. On the basis of results from deletion experiments, the amino acids of UL89 proposed to be involved in this interaction span residues 580–600 (8). This segment corresponds to the exposed helix α4 (Fig. 1 and Fig. S2) and is thus suitable for interaction with UL89 partners. The segment includes three residues that are fully conserved among human herpesviruses, namely Lys583, Ala586, and Asn595. This observation suggests a similar interaction scheme within the family. Furthermore, helix α4 has no counterpart in RNase H or integrases, which are enzymes that do not interact with any protein equivalent to UL56.

UL89 as a Drug Target.

Viral DNA encapsidation machinery has no counterpart in the mammalian cell, thus implying that the proteins involved in this process represent promising selective targets for antiviral therapy. Several studies have reported that inhibitors of DNA packaging in herpesviruses specifically target UL89 and UL56, although the binding sites of the proteins have not been elucidated (18–22). Our study demonstrates that the UL89 C-terminal domain of HCMV and the equivalent domains in all herpesviruses bear the essential nuclease function of the terminase for DNA packaging (Fig. 3). We reveal the three-dimensional structure of this domain in detail and describe the essential residues for the nuclease function, which we demonstrate can be inhibited by raltegravir, an HIV integrase inhibitor approved by the FDA for AIDS treatment in October 2007 (37). This study therefore opens a way for the design of further optimized inhibitors against UL89-C that may be useful for the development of unique antiherpes drugs.

Materials and Methods

Identification of the UL89-C Soluble Construct from a Complete 5′ and 3′ Gene Truncation Library.

The ul89 gene from the HHV5 Towne strain comprises two exons; these were amplified, cloned separately, and subsequently ligated together. The library was constructed as described (23). Briefly, for the 5′ deletion library, the gene was cloned into a pET9a-derived vector out of frame with a tobacco etch virus cleavable N-terminal hexahistidine tag (MGHHHHHHDYDIPTTENLYFQG) and in frame with a short linker and C-terminal biotin acceptor peptide (SNNGSGGGLNDIFEAQKIEWHE). The presence of AatII and AscI sites between the hexahistidine tag encoding DNA and the ul89 gene permitted unidirectional truncation of the 5′ end of the gene using exonuclease III. Hexahistidine tag fusions of the truncated gene were generated following recircularization of the plasmid with T4 DNA ligase. Following transformation, the plasmid library was harvested from the E. coli cloning strain (Omnimax T1; Invitrogen) and used to transform BL21-CodonPlus-RIL (Stratagene). Robotic processing of the library to identify putative soluble expression constructs was done as described (23, 24). Briefly, 18,432 colonies comprising 9,216 each for the 5′ and 3′ deletion libraries were picked robotically into microtiter plates of TB broth and grown overnight. These were gridded robotically onto nitrocellulose membranes over LB agar to grow dense colony arrays, which were then induced with IPTG. Colonies were lysed in situ and hybridized with Alexa488 Streptavidin (Invitrogen). A fluorimager was used to identify colonies expressing biotinylated proteins. The 96 most intense positive clones of each library were isolated from the library and grown as 4 mL liquid expression cultures. Nickel affinity purifications were performed robotically and purified proteins were assessed by SDS-PAGE.

Purification of Wild-Type and Mutant UL89-C Proteins.

E. coli Rosetta cells were transformed with plasmids pHAR-UL89C, pHAR-UL89C-D463A, pHAR-UL89C-D651A, and pHAR-UL89C-D463A-E534A, in which the biotin acceptor peptide had been suppressed by introduction of a stop codon in the natural position. Cells were grown at 37 °C to an OD600 of 0.5, and protein expression was induced by addition of IPTG to a final concentration of 0.1 mM with further incubation for 72 h at 16 °C. Cells were harvested at 5000 g, resuspended in binding buffer (50 mM Tris pH 8, 200 mM NaCl, 20 mM imidazole, and 200 μL of DNase I at 2 mg/mL) and sonicated. Insoluble material was sedimented by centrifugation (20000 g, 4 °C, 25 min), and the supernatant was passed through a 0.45 μm filter. Affinity purification was performed with a 5 mL HisTrap HP column (GE Healthcare). The elution was performed with 20 bed volumes of a linear gradient using a buffer comprising 50 mM Tris pH 8, 200 mM NaCl, and 500 mM imidazole. The fractions were analyzed by SDS-PAGE and pooled. A second chromatographic step was carried out on a Mono Q column (GE Healthcare) using binding buffer consisting of 30 mM Tris pH 8 and 50 mM NaCl. Elution was achieved using a linear salt gradient of 50 mM to 1 M NaCl. Fractions were analyzed by SDS-PAGE and those that showed higher purity were pooled, concentrated, and subjected to size-exclusion chromatography using a Superdex 75 10/300 GL column (GE Healthcare). The protein eluted at 10.5 mL, corresponding to 31 kDa. Wild-type and mutant proteins were expressed and purified with similar efficiencies.

Crystallization and Heavy-Atom Derivatization.

Protein UL89-C was crystallized by mixing 2 μL of protein solution containing 10 mg/mL of UL89-C, 30 mM Tris buffer (pH 8), 50 mM NaCl and 5 mM EDTA with 2 μL of precipitant solution containing 10% (w/v) polyethylene glycol 8000, 150 mM calcium acetate hydrate and 100 mM Mes (pH 6), using the sitting drop vapor diffusion method. Crystals were flash-cooled in 12% polyethylene glycol 400 as cryoprotectant. To prepare Mn2+-derivatised crystals, native crystals were soaked for 1 h in the crystallization solution enriched with 50 mM MnCl2. To prepare Hg heavy-atom derivatives, native protein crystals were soaked for 24 h in the crystallization solution enriched with 0.5 mM ethylmercurithiosalicylic acid sodium salt.

Structure Solution and Refinement.

A native dataset was collected at the ESRF ID14-2 beamline to a resolution of 2.15 Å. Crystals belonged to space group P2₁2₁2₁ with cell dimensions a = 82.8 Å, b = 87.9 Å, c = 189.4 Å and α = β = γ = 90°. A dataset from a native crystal soaked with Mn2+ was collected at ESRF ID29; the crystals belonged to the same space group with similar cell dimensions. In addition, data for an Hg-derivative were collected at ESRF BM16, at a wavelength of 1.00726 Å (Hg-bound absorption edge). Native and derivative diffraction data were processed using XDS (41), and then scaled, reduced and merged with XSCALE (41) (Table S1). Phases were obtained by single isomorphous replacement anomalous scattering (SIRAS). SHARP (42) was used to determine the positions of 5 Hg atoms using data to 3.5 Å, and phasing the data to 2.15 Å. The resulting map was of insufficient quality for automatic tracing and most of the polypeptide chain had to be built manually using Coot (43). The crystals contained four UL89-C molecules per asymmetric unit. Atomic positions and their associated B-factors were refined with Refmac5 (44) using noncrystallographic symmetry restraints. The model was improved by alternating cycles of automatic refinement and interactive model building (Fig. S8). The final refinement cycles included TLS refinement. The Mn2+-soaked crystal structure showed two strong electron density peaks at the active site corresponding to the metal ions (Fig. S4). These ions were included and the structure refined with Refmac5. The quality of the stereochemistry of the two structures was assessed with Procheck (45) (Table S1).
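As a plausibility check on four molecules per asymmetric unit, the Matthews coefficient can be estimated from the cell dimensions given above. This is a rough sketch, not a calculation reported by the authors. It assumes a mass of about 31 kDa per chain (the size-exclusion estimate from the purification section, used here in place of a sequence-derived mass), uses the four symmetry operators of an orthorhombic P2(1)2(1)2(1) cell, and applies the common approximation that the solvent fraction is 1 - 1.23/V_M. The function names are hypothetical.

```python
# Unit cell edges in angstroms (from the text); orthorhombic, so V = a * b * c.
CELL_A, CELL_B, CELL_C = 82.8, 87.9, 189.4
SYMMETRY_OPERATORS = 4      # general positions in space group P2(1)2(1)2(1)
MOLECULES_PER_ASU = 4       # from the text
CHAIN_MASS_DA = 31_000.0    # ASSUMPTION: size-exclusion estimate, not sequence mass


def matthews_coefficient(a, b, c, n_sym, n_mol, mass_da):
    """Crystal volume per unit of protein mass, in cubic angstroms per dalton."""
    cell_volume = a * b * c  # valid only when all cell angles are 90 degrees
    return cell_volume / (n_sym * n_mol * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent content from the Matthews coefficient."""
    return 1.0 - 1.23 / v_m


v_m = matthews_coefficient(
    CELL_A, CELL_B, CELL_C, SYMMETRY_OPERATORS, MOLECULES_PER_ASU, CHAIN_MASS_DA
)
solvent = solvent_fraction(v_m)

print(f"Matthews coefficient: {v_m:.2f} cubic angstroms per Da")
print(f"approximate solvent fraction: {solvent:.2f}")
```

With the assumed chain mass the coefficient comes out near 2.8 Å³/Da, with slightly more than half of the cell occupied by solvent, which is within the range commonly seen for protein crystals. A different chain mass would shift both numbers proportionally.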

In Vitro Nuclease Assay.

Purified wild-type and mutant UL89-C domains (final concentration 2 μM) were incubated with 200 ng of circular and linear (digested with HindIII) pUC18 plasmid (2,686 bp) in a reaction containing 30 mM Tris pH 8 and 50 mM NaCl for 1 h at 37 °C. The effect of several metal ions was studied by adding 3 mM (final concentration) MgCl2, CaCl2, or MnCl2. The reaction was terminated by adding EDTA to a final concentration of 30 mM. The samples were analyzed by agarose gel electrophoresis with ethidium bromide staining. For the inhibitory assay, a range of concentrations of raltegravir (Chemietek) was added to the reaction. A stock solution of 5 mM raltegravir was prepared in 50% DMSO and was further diluted with 30 mM Tris pH 8, 50 mM NaCl to obtain the final concentration.
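The quantities in this assay can be put on a common footing with a little arithmetic. The sketch below is illustrative and makes two assumptions that are not in the text: an average mass of 650 g/mol per base pair of dsDNA (a common approximation), and a hypothetical raltegravir concentration of 1 mM chosen only to show how DMSO carries over on dilution of the stock. The function names are hypothetical.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA; ASSUMPTION (common approximation)


def dsdna_picomoles(mass_ng, length_bp, bp_mass=AVG_BP_MASS):
    """Convert a mass of double-stranded DNA (ng) into picomoles of molecules."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * bp_mass)
    return moles * 1e12


def diluted_dmso_percent(final_mm, stock_mm=5.0, stock_dmso_percent=50.0):
    """DMSO percentage after diluting the stock to final_mm with DMSO-free buffer."""
    if not 0.0 <= final_mm <= stock_mm:
        raise ValueError("final concentration must lie between 0 and the stock")
    return stock_dmso_percent * final_mm / stock_mm


# 200 ng of the 2,686 bp pUC18 plasmid, as stated in the text.
plasmid_pmol = dsdna_picomoles(200.0, 2686)

# Molar excess of EDTA (30 mM) over the added divalent cation (3 mM), from the text.
edta_excess = 30.0 / 3.0

# HYPOTHETICAL inhibitor concentration, used only to illustrate the dilution.
dmso_at_1mm = diluted_dmso_percent(1.0)

print(f"plasmid per reaction: {plasmid_pmol:.3f} pmol")
print(f"EDTA excess over metal: {edta_excess:.0f}-fold")
print(f"DMSO at 1 mM raltegravir: {dmso_at_1mm:.0f}%")
```

Because the DMSO content scales linearly with the dilution of the stock, a solvent-only control matched to the highest inhibitor concentration is the usual way to separate solvent effects from inhibition. Whether such a control was run is not stated in this section.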


Search for folding relatives was performed with MATRAS (46). Structural alignments and RMSD calculations were performed with SSM (47). Fig 1 and Figs. S3, S4, S6, S7, and S8 were drawn with Pymol (48). Fig. 2 was generated with GRASP (49) and Pymol.
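For readers unfamiliar with the RMSD values quoted for equivalent Cα atoms, the quantity itself is simple to compute once two structures have been superposed and their residues paired. The sketch below shows only that final step. It does not reproduce the SSM algorithm, which also finds the residue pairing and the optimal superposition, and the toy coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired, already superposed coordinates.

    Each argument is a sequence of (x, y, z) tuples in angstroms; the i-th
    entries of the two sequences are treated as equivalent atoms.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy example (invented coordinates): every atom is displaced by 5 angstroms.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
toy_rmsd = rmsd(toy_a, toy_b)
print(f"RMSD: {toy_rmsd:.1f} angstroms")
```

An RMSD is meaningful only together with the number of atom pairs it covers, which is why each value in the structural comparison is reported alongside its count of equivalent Cα atoms.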



We thank Ms. Jenny Colom for help during protein purification, Dr. Jordi Bernués for help during nuclease assays, Ms. Zuzanna Kaczmarska for elvitegravir inhibition assays and Prof. David Stuart for hosting M.N. to perform expression tests in eukaryotic systems. This study was supported by Ministerio de Ciencia e Innovación Grant BFU2008-02372/BMC (M.C.), Generalitat de Catalunya Grant 2009 SGR 1309 (M.C.), and the European Commission (Spine2-Complexes LSHG-CT-2006-031220). Synchrotron data collection was supported by the European Synchrotron Radiation Facility and the European Union. Crystallization screening and preliminary X-ray analysis were performed at the Automated Crystallography Platform, Barcelona.


The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

Data deposition: The atomic coordinates for the UL89-C protein structures have been deposited in the Protein Data Bank, www.pdb.org (PDB ID codes 3N4P and 3N4Q).

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1007144107/-/DCSupplemental.


1. Mocarski ES, Shenk T, Pass RF. In: Fields Virology. Knipe DM, Howley PM, editors. Philadelphia: Lippincott Williams & Wilkins; 2007. pp. 2701–2772.
2. McVoy MA, Nixon DE, Hur JK, Adler SP. The ends on herpesvirus DNA replicative concatemers contain pac2 cis cleavage/packaging elements and their formation is controlled by terminal cis sequences. J Virol. 2000;74:1587–1592.
3. Baines JD, Weller SK. In: Viral Genome Packaging Machines: Genetics, Structure, and Mechanism. Catalano CE, editor. New York: Springer; 2005. pp. 135–150.
4. Bogner E. Human cytomegalovirus terminase as a target for antiviral chemotherapy. Rev Med Virol. 2002;12:115–127.
5. Scheffczik H, Savva CG, Holzenburg A, Kolesnikova L, Bogner E. The terminase subunits pUL56 and pUL89 of human cytomegalovirus are DNA-metabolizing proteins with toroidal structure. Nucleic Acids Res. 2002;30:1695–1703.
6. Bogner E, Radsak K, Stinski MF. The gene product of human cytomegalovirus open reading frame UL56 binds the pac motif and has specific nuclease activity. J Virol. 1998;72:2259–2264.
7. Champier G, et al. New functional domains of human cytomegalovirus pUL89 predicted by sequence analysis and three-dimensional modelling of the catalytic site DEXDc. Antivir Ther. 2007;12:217–232.
8. Thoma C, et al. Identification of the interaction domain of the small terminase subunit pUL89 with the large subunit pUL56 of human cytomegalovirus. Biochemistry. 2006;45:8855–8863.
9. Przech AJ, Yu D, Weller SK. Point mutations in exon I of the herpes simplex virus putative terminase subunit, UL15, indicate that the most conserved residues are essential for cleavage and packaging. J Virol. 2003;77:9613–9621.
10. Sun S, Kondabagil K, Gentz PM, Rossmann MG, Rao VB. The structure of the ATPase that powers DNA packaging into bacteriophage T4 procapsids. Mol Cell. 2007;25:943–949.
11. Ortega ME, Gaussier H, Catalano CE. The DNA maturation domain of gpA, the DNA packaging motor protein of bacteriophage lambda, contains an ATPase site associated with endonuclease activity. J Mol Biol. 2007;373:851–865.
12. Camacho AG, Gual A, Lurz R, Tavares P, Alonso JC. Bacillus subtilis bacteriophage SPP1 DNA packaging motor requires terminase and portal proteins. J Biol Chem. 2003;278:23251–23259.
13. Nemecek D, et al. Subunit conformations and assembly states of a DNA-translocating motor: The terminase of bacteriophage P22. J Mol Biol. 2007;374:817–836.
14. Zhao H, et al. Crystal structure of the DNA-recognition component of the bacterial virus Sf6 genome-packaging machine. Proc Natl Acad Sci USA. 2010;107:1971–1976.
15. Sun S, et al. The structure of the phage T4 DNA packaging motor suggests a mechanism dependent on electrostatic forces. Cell. 2008;135:1251–1262.
16. Smits C, et al. Structural basis for the nuclease activity of a bacteriophage large terminase. EMBO Rep. 2009;10:592–598.
17. Couvreux A, et al. Insight into the structure of the pUL89 C-terminal domain of the human cytomegalovirus terminase complex. Proteins. 2010;78:1520–1530.
18. Underwood MR, et al. Inhibition of human cytomegalovirus DNA maturation by a benzimidazole ribonucleoside is mediated through the UL89 gene product. J Virol. 1998;72:717–725.
19. Buerger I, et al. A novel nonnucleoside inhibitor specifically targets cytomegalovirus DNA maturation via the UL89 and UL56 gene products. J Virol. 2001;75:9077–9086.
20. North TW, Sequar G, Townsend LB, Drach JC, Barry PA. Rhesus cytomegalovirus is similar to human cytomegalovirus in susceptibility to benzimidazole nucleosides. Antimicrob Agents Chemother. 2004;48:2760–2765.
21. Dittmer A, Drach JC, Townsend LB, Fischer A, Bogner E. Interaction of the putative human cytomegalovirus portal protein pUL104 with the large terminase subunit pUL56 and its inhibition by benzimidazole-d-ribonucleosides. J Virol. 2005;79:14660–14667.
22. Hwang JS, et al. Identification of acetylated, tetrahalogenated benzimidazole d-ribonucleosides with enhanced activity against human cytomegalovirus. J Virol. 2007;81:11604–11611.
23. Tarendeau F, et al. Structure and nuclear import function of the C-terminal domain of influenza virus polymerase PB2 subunit. Nat Struct Mol Biol. 2007;14:229–233.
24. Yumerefendi H, Tarendeau F, Mas PJ, Hart DJ. ESPRIT: An automated, library-based method for mapping and soluble expression of protein domains from challenging targets. J Struct Biol. 2010 doi: 10.1016/j.physletb.2003.10.071.
25. Richardson JS. Handedness of crossover connections in beta sheets. Proc Natl Acad Sci USA. 1976;73:2619–2623.
26. Yang W, Steitz T. Recombining the structures of HIV integrase, RuvC, and RNase H. Structure. 1995;15:131–134.
27. Ariyoshi M, et al. Atomic structure of the RuvC resolvase: A holliday junction-specific endonuclease from E. coli. Cell. 1994;78:1063–1072.
28. Dyda F, et al. Crystal structure of the catalytic domain of HIV-1 integrase: Similarity to other polynucleotidyl transferases. Science. 1994;266:1981–1986.
29. Maignan S, Guilloteau JP, Zhou-Liu Q, Clément-Mella C, Mikol V. Crystal structures of the catalytic domain of HIV-1 integrase free and complexed with its metal cofactor: High level of similarity of the active site with other viral integrases. J Mol Biol. 1998;282:359–368.
30. Lubkowski J, et al. Structure of the catalytic domain of avian sarcoma virus integrase with a bound HIV-1 integrase-targeted inhibitor. Proc Natl Acad Sci USA. 1998;95:4831–4836.
31. Nowotny M, et al. Structure of human RNase H1 complexed with an RNA/DNA hybrid: Insight into HIV reverse transcription. Mol Cell. 2007;28:264–276.
32. Rice P, Mizuuchi K. Structure of the bacteriophage Mu transposase core: A common structural motif for DNA transposition and retroviral integration. Cell. 1995;82:209–220.
33. Davies DR, Goryshin IY, Reznikoff WS, Rayment I. Three-dimensional structure of the Tn5 synaptic complex transposition intermediate. Science. 2000;289:77–85.
34. Song JJ, Smith SK, Hannon GJ, Joshua-Tor L. Crystal structure of Argonaute and its implications for RISC slicer activity. Science. 2004;305:1434–1437.
35. Nowotny M, Yang W. Stepwise analyses of metal ions in RNase H catalysis from substrate destabilization to product release. EMBO J. 2006;25:1924–1933.
36. Yang W. An equivalent metal ion in one- and two-metal-ion catalysis. Nat Struct Mol Biol. 2008;15:1228–1231.
37. Summa V, et al. Discovery of raltegravir, a potent, selective orally bioavailable HIV-integrase inhibitor for the treatment of HIV–AIDS infection. J Med Chem. 2008;51:5843–5855.
38. Hare S, Gupta SS, Valkov E, Engelman A, Cherepanov P. Retroviral intasome assembly and inhibition of DNA strand transfer. Nature. 2010;464:232–236.
39. Ruigrok RW, Crépin T, Hart DJ, Cusack S. Towards an atomic resolution understanding of the influenza virus replication machinery. Curr Opin Struct Biol. 2010;20:104–113.
40. Nowotny M, Gaidamakov SA, Crouch RJ, Yang W. Crystal structures of RNase H bound to an RNA/DNA hybrid: Substrate specificity and metal-dependent catalysis. Cell. 2005;121:1005–1016.
41. Kabsch W. Automatic indexing of rotation diffraction patterns. J Appl Crystallogr. 1988;21:67–72.
42. Vonrhein C, Blanc E, Roversi P, Bricogne G. Automated structure solution with autoSHARP. Methods Mol Biol. 2007;364:215–230.
43. Emsley P, Cowtan K. Coot: Model-building tools for molecular graphics. Acta Crystallogr D. 2004;60:2126–2132.
44. Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallogr D. 1997;53:240–255.
45. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: A program to check the stereochemical quality of protein structures. J Appl Crystallogr. 1993;26:283–291.
46. Kawabata T. MATRAS: A program for protein 3D structure comparison. Nucleic Acids Res. 2003;31:3367–3369.
47. Krissinel E, Henrick K. Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr D. 2004;60:2256–2268.
48. DeLano WL. The PyMOL Molecular Graphics System. DeLano Scientific LLC Palo Alto CA USA. 2008. http://www.pymol.org.
49. Nicholls A, Sharp KA, Honig B. Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins. 1991;11:281–296.

Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences