EMBO Rep. 2009 Jun; 10(6): 592–598.
Published online 2009 May 15. doi: 10.1038/embor.2009.53
PMCID: PMC2685612
Scientific Report

Structural basis for the nuclease activity of a bacteriophage large terminase


The DNA-packaging motor in tailed bacteriophages requires nuclease activity to ensure that the genome is packaged correctly. This nuclease activity is tightly regulated, as the enzyme is inactive for the duration of DNA translocation. Here, we report the X-ray structure of the large terminase nuclease domain from bacteriophage SPP1. Similarity with the RNase H family endonucleases allowed interactions with the DNA to be predicted. A structure-based alignment with the distantly related T4 gp17 terminase shows the conservation of an extended β-sheet and an auxiliary β-hairpin that are not found in other RNase H family proteins. The model with DNA suggests that the β-hairpin partly blocks the active site, and in vivo activity assays show that the nuclease domain is not functional in the absence of the ATPase domain. We propose that the nuclease activity is regulated by movement of the β-hairpin, which alters active site access and the orientation of catalytically essential residues.

Keywords: bacteriophage, X-ray crystallography, terminase, DNA translocation


In tailed bacteriophages, the genomic double-stranded DNA is packaged into a preformed capsid. In general, DNA encapsidation is driven by ATP hydrolysis and requires the portal protein, the large and small terminase proteins and a cis-acting packaging initiation (pac) site (Black, 1989; Catalano et al, 1995; Fujisawa & Morita, 1997; Alonso et al, 2006; Rao & Feiss, 2008). DNA is translocated through the portal protein, while the large and small terminase proteins, G2P and G1P, respectively, in the Bacillus subtilis phage SPP1, coordinately recognize and cleave the pac site, and initiate DNA encapsidation. The large terminase consists of two domains, an amino-terminal ATPase domain that provides energy for translocation and a carboxy-terminal nuclease domain (Chai et al, 1995; Gual et al, 2000; Camacho et al, 2003).

The viral DNA is replicated as concatamers and the nuclease activity associated with the C-terminus of the large terminase is required to ensure that genome-sized DNA is packaged into the capsid. Two different nuclease events occur. The initial cut of the DNA is specific and occurs after the small terminase has been recruited to the pac site (Deichelbohrer et al, 1982; Bravo et al, 1990; Camacho et al, 2003). This specific cleavage occurs once per concatamer and packaging into the first viral shell starts from the end of the DNA concatamer generated by this cut. In phages such as SPP1, P22 and T4, the second cut occurs in a sequence-independent manner when a genome greater than a unit-length has been packaged and the virus particle is full (Black, 1989; Fujisawa & Morita, 1997; Catalano, 2000; Alonso et al, 2006).

Recently, the structures of full-length T4 gp17 and the gp17 nuclease domain from the related phage RB49 have been determined (Alam et al, 2008; Sun et al, 2008). The gp17 nuclease domains from both phages were shown to be RNase H family nucleases and were most similar to the RuvC Holliday junction endonuclease. Three acidic residues in the active site of gp17 were shown to be essential for catalysis (Alam et al, 2008). Significantly, a model for DNA translocation was proposed that involved translation of the nuclease domain (Sun et al, 2008). In this model, the translocating DNA was bound to a second site opposite the active site of the nuclease. Interestingly, both SPP1 G2P and T4 gp17 exist as monomers in isolation (Gual et al, 2000; Sun et al, 2008). However, in the active packaging motor, gp17 has been shown to form a pentameric assembly (Sun et al, 2008).

It is at present not known how the nuclease activity of G2P is regulated. Two models of headful cleavage regulation have been proposed. In one model, the nuclease is always active and the headful cut occurs owing to the reduced speed of DNA translocation when the capsid is full (Cue & Feiss, 1997; Alam et al, 2008). Alternatively, the nuclease activity could be regulated through conformational changes induced by interaction with the ATPase domain of G2P or other components of the translocation motor (Bravo et al, 1990; Camacho et al, 2003; Sun et al, 2008).

Here, we present the crystal structure of the nuclease domain from the large terminase of SPP1. The core of the nuclease domain has the same fold as RNase H family endonucleases and is most similar to the recently determined structure of gp17 from the phage T4. A structure-based alignment of the G2P and gp17 nucleases permits an accurate analysis of functionally important regions. Extension of the central β-sheet and a C-terminal β-hairpin, not present in other members of the RNase H family, appear to be conserved features among large terminases. Structural observations indicate that this β-hairpin would obstruct interaction with DNA. We also show that the nuclease domain of G2P is not active in isolation, suggesting that interactions with the ATPase domain are essential for activity. We propose that the regulation of headful cleavage activity is achieved by conformational changes involving movement of the β-hairpin.

Results and Discussion

The structure of the G2P nuclease domain

The C-terminal nuclease domain of G2P from bacteriophage SPP1 (residues 232–422, subsequently referred to as G2PΔN; Fig 1A) was purified to homogeneity and crystallized. The structure was determined from the anomalous scattering of the single ordered selenomethionine and contains a single molecule in the asymmetric unit. The final model, refined at 1.9 Å resolution (supplementary Table S1 online), contains residues 237–414 of G2P.

Figure 1
The structure of the SPP1 large terminase G2P is similar to gp17 despite poor sequence similarity. (A) Ribbon diagram of the nuclease domain. The central curved β-sheet (blue) is surrounded by a three-helix bundle on the concave face (yellow) ...

The nuclease domain of G2P contains a central seven-stranded β-sheet sandwiched between two clusters of α-helices (Fig 1A). The central β-sheet (β1–β7) curves around the C-terminal helix α6, which is one of the three helices on the concave face of the central β-sheet. In addition, a single helical turn is present in the loop between the first and fifth strands of the central β-sheet, and a two-stranded antiparallel β-hairpin projects from the three-helix cluster. The surface of G2P is mostly negatively charged (Fig 1B). In particular, the active site groove is largely acidic, although sporadic basic residues are located on the periphery of the groove. These few positively charged residues are not conserved in closely related phages (supplementary Fig S1 online), suggesting that DNA binds to the active site mostly through shape-complementarity rather than charge–charge interactions.

Comparison with the T4 gp17 nuclease structure

The sequences of the SPP1 G2P and the T4 gp17 large terminases are poorly conserved (overall sequence identity of 12%). Despite the significant evolutionary distance between the two proteins (Fig 1C,D), the folds of the nuclease domains of the two proteins are identical, suggesting that the mechanisms of catalysis and regulation of nuclease activity will be similar. The structure-based sequence alignment (Fig 1D) allows accurate identification of conserved and dissimilar features. It is remarkable that the β-hairpin (β8 and β9) and the extended β-sheet, which are not present in the RNase H-like endonucleases, are observed in the structure of the T4 gp17 (Alam et al, 2008; Sun et al, 2008). Their conserved nature indicates the importance of these additional structural elements for function. The conserved β-hairpin occupies different positions in the G2P and T4 gp17 proteins, and is disordered in the structure of RB49 gp17, suggesting that it is flexible. Its location at the interface with the ATPase domain in the structure of the T4 gp17 (Sun et al, 2008) indicates that its conformation depends on interactions between domains.
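
As an illustration of how a figure such as the 12% sequence identity quoted above is derived from a pairwise alignment, the sketch below counts identical residues over the columns in which both sequences have a residue. The function name and the toy sequences are hypothetical (they are not the G2P or gp17 sequences), and the choice of denominator (aligned, non-gap columns) is an assumption, as the text does not state which convention was used.

```python
def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity over columns where both sequences have a residue.

    Both inputs are rows of the same pairwise alignment, with '-' for gaps.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns without a gap in either sequence.
    aligned = [(a, b) for a, b in zip(aln_a, aln_b) if a != "-" and b != "-"]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


# Toy alignment (hypothetical sequences, for illustration only).
toy_a = "MKD-LLAEHG"
toy_b = "MRDPLI-EYG"
print(percent_identity(toy_a, toy_b))  # 5 identical of 8 aligned columns: 62.5
```

Other conventions divide by the length of the shorter sequence or by the full alignment length, which gives lower values for gapped alignments.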

On the basis of the T4 gp17 structure, a model for DNA translocation by the DNA packaging motor was proposed in which translation of the nuclease domain drove DNA packaging (Sun et al, 2008). During DNA translocation, the nuclease domain was proposed to interact with DNA through electrostatic interactions at a second DNA-binding site located at a surface opposite to the nuclease active site. Assuming conservation of the translocation mechanism, the positively charged residues should be conserved in other nucleases. Six residues were predicted to be important for the interaction with DNA, but only one of these is present in the G2P nuclease (Fig 1D). No solvent-exposed basic residues located near this binding site are conserved in phages closely related to SPP1 (supplementary Fig S1 online). We also note that there are no strictly conserved positively charged residues at the surface of the nuclease domain (supplementary Fig S1 online). These observations are inconsistent with the translocation model proposed earlier (Sun et al, 2008).

Conserved acidic residues are required for catalysis in RNase H family proteins. These residues coordinate a pair of Mg2+ ions spaced approximately 4.0 Å apart that are proposed to activate the attacking nucleophile and stabilize the transition state (Steitz & Steitz, 1993; Yang et al, 2006; Nowotny et al, 2007). Although four acidic residues coordinate the two Mg2+ ions in most RNase H family members, this arrangement is not strictly required for coordinating the two ions (Fig 2A; Yang et al, 2006). Mutation analyses have established acidic residues essential for activity in other RNase H family proteins. Three of these acidic residues are identically situated in previously determined structures of RNase H family proteins and coincide with acidic residues in G2P (Asp 266, Asp 321 and Asp 403; Fig 2A). Acidic residues have also been identified in the large terminases of T4 and T5 that are essential for catalysis (Ponchon et al, 2006; Alam et al, 2008). The sequence-based alignments suggested conservation of three acidic residues (Alam et al, 2008). However, the structure-based alignment presented here shows that the position of Asp 542 in T4 gp17 is occupied by His 400 in the structure of G2P (Figs 1D, 2A) and not by Asp 396 as predicted previously (Alam et al, 2008). His 400 is remarkably conserved (supplementary Fig S1 online) and is often a histidine or acidic residue in terminase sequences, strengthening the suggestion that it might be important for catalysis.

Figure 2
Four residues form the nuclease active site in G2P. (A) The active sites of G2P, T4 gp17, RB49 gp17, RuvC, RNase H and the DNA polymerase I 3′–5′ exonuclease are shown with catalytically important residues highlighted. Residue ...

Mutation analysis of the nuclease active site residues

To establish which residues are required for catalysis in G2P, mutants were made of His 400 and all the acidic residues near the active site (Fig 2B,C). Conservative mutations of Asp 266, Asp 321 and Asp 403 disrupted nuclease activity, suggesting that these residues are essential for nuclease activity (Fig 2C; supplementary Fig S2 online). Mutation of other acidic residues partly disrupted nuclease activity, suggesting that the local environment is important for catalysis. Mutation of His 400 in G2P to alanine significantly impaired the in vivo nuclease activity (Fig 2C).

Location of the catalytic metal-binding sites

Mg2+ ions were not added during crystallization and were not observed in the G2PΔN crystal structure. However, the structural conservation suggests the involvement of Mg2+ ions in DNA cleavage. Therefore, crystals of G2PΔN were soaked with 10 mM concentrations of Mg2+ or Mn2+ in an attempt to observe the metal ligands in the active site. A significant peak (19.7σ in the anomalous difference map) was observed in the active site after soaking with Mn2+ (supplementary Fig S3 online). A smaller second peak (6.1σ in the anomalous difference map) was observed at a distance of 3.9 Å from the first peak. It has previously been observed for RNase H that the metal-binding sites do not attain full affinity until a nucleic acid is bound and similar principles might be applicable in the case of G2P (Nowotny et al, 2007). The two identified metal sites appear to be equivalent to the observed and predicted sites in gp17 (Alam et al, 2008).
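
The comparison between the 3.9 Å separation of the two anomalous peaks and the spacing of approximately 4.0 Å expected for a two-metal-ion site can be expressed as a simple distance check. The sketch below is illustrative only: the peak coordinates are invented placeholders (not values from the deposited structures), and the function names and the 0.5 Å tolerance are assumptions made for this example.

```python
import math


def site_separation(site_a, site_b):
    """Euclidean distance (in Å) between two peak positions given as (x, y, z)."""
    return math.dist(site_a, site_b)


def compatible_with_two_metal_site(separation, expected=4.0, tolerance=0.5):
    """True if the separation lies within `tolerance` Å of the expected spacing.

    The default tolerance is an arbitrary choice for this sketch.
    """
    return abs(separation - expected) <= tolerance


# Placeholder coordinates in Å (hypothetical, chosen to lie 3.9 Å apart).
peak_1 = (10.0, 20.0, 30.0)
peak_2 = (11.5, 23.6, 30.0)

separation = site_separation(peak_1, peak_2)
print(f"{separation:.1f} A", compatible_with_two_metal_site(separation))
# prints: 3.9 A True
```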

The manganese present after soaking is coordinated by His 400 and Asp 403 in G2P. This is consistent with the significant reduction in nuclease activity observed after mutation of His 400. To investigate further the metal ion coordination, G2PΔN and other RNase H family proteins containing either one or two bound metal ions were superimposed onto RNase H (supplementary Fig S4 online). Despite variation in the structures, the common strands of the β-sheet aligned well (supplementary Fig S4B online). The metal ions from other RNase H family members clustered together near the equivalent of Asp 266 in G2P; approximately the location of the second small peak observed in the Mn2+ anomalous map (supplementary Fig S4B online). By contrast, the positions of Mn2+ in G2P and the Mg2+ in RB49 gp17 were similar to each other but translated with respect to other RNase H family proteins by approximately 5 Å. This suggests that the orientation of the active site has changed in relation to the remainder of the protein. This change might be related to the strict regulation of nuclease activity in terminases.
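
To make the comparison of metal positions across superimposed structures concrete, the sketch below shows one common way of carrying out such a comparison: a least-squares (Kabsch) fit of equivalent Cα atoms, followed by applying the same transformation to a metal ion so that its position can be compared in the reference frame. This is not necessarily the procedure used here; the function names are hypothetical, the coordinates are synthetic, and NumPy is assumed to be available.

```python
import numpy as np


def kabsch(mobile, target):
    """Rotation R and translation t that best map `mobile` onto `target`.

    Both inputs are (N, 3) arrays of equivalent atoms in the same order.
    The fitted coordinates are mobile @ R.T + t.
    """
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mob_c, tgt_c = mobile.mean(axis=0), target.mean(axis=0)
    cov = (mobile - mob_c).T @ (target - tgt_c)   # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(cov)
    # Guard against an improper rotation (reflection).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = tgt_c - rotation @ mob_c
    return rotation, translation


def transform(rotation, translation, xyz):
    """Apply a fitted transformation to one point or an (N, 3) array."""
    return np.asarray(xyz, dtype=float) @ rotation.T + translation


# Synthetic test case: the "mobile" structure is the reference rotated by
# 40 degrees about z and shifted, so the fit should recover it exactly.
rng = np.random.default_rng(0)
reference_ca = rng.normal(size=(20, 3)) * 10.0
theta = np.deg2rad(40.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
shift = np.array([5.0, -3.0, 12.0])
mobile_ca = reference_ca @ rot_z.T + shift

metal_reference = np.array([1.0, 2.0, 3.0])        # placeholder metal site
metal_mobile = metal_reference @ rot_z.T + shift    # same site, mobile frame

R, t = kabsch(mobile_ca, reference_ca)
fitted = transform(R, t, mobile_ca)
rmsd = float(np.sqrt(((fitted - reference_ca) ** 2).sum(axis=1).mean()))
metal_offset = float(np.linalg.norm(transform(R, t, metal_mobile) - metal_reference))
print(f"RMSD {rmsd:.3f} A, metal offset {metal_offset:.3f} A")
```

With real structures, the offset between transformed metal positions is the quantity of interest: an offset of several ångströms between sites in well-superimposed cores corresponds to the kind of shift described above.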

Implications for the regulation of nuclease activity

A model of G2PΔN with DNA bound in the catalytic active site was generated to better understand the interaction between G2P and DNA. G2PΔN was structurally aligned with RNase H nucleic acid complexes and a B-DNA model was aligned with the nucleic acid from the complex structure (Nowotny et al, 2007). This alignment was manually optimized to improve the contacts with G2P (Fig 3A), while maintaining the orientation of DNA. Specifically, the ridge adjacent to the catalytic residues, particularly Tyr 269, was positioned in the major groove. The basic residues near the groove are in close proximity to the DNA phosphate backbone in the model.

Figure 3
The ATPase domain is required for the nuclease activity of G2P. (A) A model of DNA bound to G2PΔN was generated on the basis of the alignment of G2PΔN with an RNase H–nucleic acid complex. The G2PΔN ribbon with bound DNA ...

Unexpectedly, the protruding C-terminal β-hairpin is predicted to clash with an extended linear nucleic acid, suggesting that the conformation of this loop must change for nucleic acid binding to occur. This β-hairpin is also present in the gp17 structures (Sun et al, 2008), indicating that it is a conserved feature among terminases. Three lines of evidence suggest that the β-hairpin is flexible. The B-factors observed in the β-hairpin are the highest observed in the structure (supplementary Fig S5A online) and the largest differences in Cα positions are observed at the β-hairpin in a comparison between the native and Mn2+ structures (supplementary Fig S5B online). Indeed, the β-hairpin adopts a different conformation in the T4 gp17 structure to that observed in G2P (Fig 1C) and is disordered in the RB49 gp17 structure (Sun et al, 2008). Additionally, we performed normal mode analysis calculations to investigate possible conformational changes in the G2PΔN. These showed low-energy modes that involve significant movements of the β-hairpin (supplementary Movie 1 online). This flexibility of the β-hairpin should permit conformational changes that allow the active site to bind DNA. Interestingly, His 400 and Asp 403 in G2P are adjacent to the β-hairpin. Small conformational changes can induce large changes in catalytic rate in RNase H family proteins because the Mg2+ ions require precise coordination (Yang et al, 2006). Therefore, it is possible that conformational changes involving the β-hairpin could influence the nuclease active site in addition to altering active site accessibility.
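
For readers unfamiliar with normal mode analysis, the sketch below illustrates the general idea with a minimal Cα elastic network model: residues within a cutoff distance are joined by identical springs, and the low-frequency eigenvectors of the resulting Hessian describe collective motions. It is a simplified illustration, not a reimplementation of the server used in this study. The function names, the 10 Å cutoff, the unit spring constant and the toy helical coordinates are all assumptions made for this example, and NumPy is assumed to be available.

```python
import numpy as np


def elastic_network_modes(coords, cutoff=10.0, gamma=1.0):
    """Eigenvalues and eigenvectors of a C-alpha elastic network Hessian.

    coords: (N, 3) array of positions in Å. Pairs closer than `cutoff`
    are connected by springs of force constant `gamma`.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            r = coords[j] - coords[i]
            d2 = float(r @ r)
            if d2 > cutoff ** 2:
                continue
            block = -gamma * np.outer(r, r) / d2
            hessian[3 * i:3 * i + 3, 3 * j:3 * j + 3] = block
            hessian[3 * j:3 * j + 3, 3 * i:3 * i + 3] = block
            hessian[3 * i:3 * i + 3, 3 * i:3 * i + 3] -= block
            hessian[3 * j:3 * j + 3, 3 * j:3 * j + 3] -= block
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
    return np.linalg.eigh(hessian)


def mode_mobility(evecs, mode_index):
    """Squared displacement of each residue in one mode (sums to 1)."""
    vec = evecs[:, mode_index].reshape(-1, 3)
    return (vec ** 2).sum(axis=1)


# Toy coordinates: 20 points on an idealized helix (radius 2.3 Å,
# rise 1.5 Å and 100 degrees per residue). Not taken from any structure.
idx = np.arange(20)
angle = np.deg2rad(100.0) * idx
toy_ca = np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * idx])

evals, evecs = elastic_network_modes(toy_ca)
# The first six modes are rigid-body translations and rotations (eigenvalue ~0);
# index 6 is the lowest-frequency internal mode.
mobility = mode_mobility(evecs, 6)
print("most mobile residue index:", int(np.argmax(mobility)))
```

Applied to real Cα coordinates, a per-residue mobility profile of this kind is what would indicate whether a region such as the β-hairpin moves substantially in the low-energy modes.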

In vivo nuclease assays were performed to determine the activity of full-length G2P and G2PΔN (Fig 3B). Full-length G2P was found to be active and to degrade plasmid DNA whether wild type or expressed with a C-terminal fusion. Unexpectedly, the G2P nuclease domain expressed in isolation was not active. No activity of G2PΔN was observed when expressed with an N-terminal His tag (the construct used to obtain the crystal structure) or without. This suggests that the ATPase domain of G2P affects the conformation of the nuclease domain to enable hydrolysis of phosphodiester bonds. In agreement, the ATPase domain of T4 gp17 also enhances the nuclease activity as deletion of the ATPase domain reduces nuclease activity 13-fold (Alam et al, 2008).

The terminase nuclease is predicted to cleave viral DNA in two different environments. During initiation of packaging, the pac site is bound to several G1P oligomers (Chai et al, 1995; Gual & Alonso, 1998). G2P is then recruited to the G1P:pac complex where it cleaves the bent pac site (Gual et al, 2000). Once the capsid is full, the DNA is cleaved by G2P in a sequence-independent manner. It is assumed that the DNA substrate for this cut is not bent. In this case, the nuclease could be activated by changes in the conformation of the β-hairpin. Therefore, it is possible that the terminase nuclease cleaves optimally either when the substrate DNA is bent or when activity is induced by conformational changes associated with capsid filling. In the T4 gp17 structure, the β-hairpin is located at the junction of the three domains, which is an ideal position to sense conformational changes. Indeed, extrapolation of our DNA-bound model to the full-length large terminase structure observed in T4 gp17 suggests that significant conformational changes must occur to enable cleavage of linear DNA.


The structural data reported here show that the core of the G2P large terminase nuclease domain has an RNase H fold and is similar to the structures of T4 and RB49 gp17 despite low sequence identity. A structure-based alignment shows conserved features present only in terminases. The conservation of crucial residues required for catalysis suggests that the nuclease domain in G2P will cleave DNA using a similar mechanism to other RNase H family proteins, although a different position of the metal ions might be related to the tight control of nuclease activity. A combination of structural observations with nuclease assays and normal mode analysis calculations indicates that conformational changes involving the β-hairpin, which is not present in other RNase H endonucleases, might be required for nuclease activity.


Cloning, expression and purification of G2PΔN and mutants. The complementary DNA encoding either full-length G2P (Refseq accession NC_004166, locus SPP1p003) or residues 232–422 (G2PΔN) was cloned into either pET22b or pET28a (Novagen, Madison, WI, USA) expression vectors between the NdeI and XhoI (full-length G2P with a C-terminal His tag) or NdeI and HindIII (full-length untagged protein and N-terminal His-tagged G2PΔN) restriction sites. The construct that expressed untagged G2PΔN was created by subcloning G2PΔN from pET28 into the NdeI and HindIII restriction sites in pET22b. Point mutations were introduced by QuikChange mutagenesis (Stratagene, La Jolla, CA, USA). G2PΔN was expressed in B834 (DE3) cells overnight at 16°C. Selenomethionine G2PΔN was prepared as described by Ramakrishnan et al (1993). Proteins were purified by Ni2+ affinity chromatography followed by size-exclusion chromatography on a Superdex 75 column (GE Healthcare Life Sciences, Uppsala, Sweden) equilibrated in 20 mM sodium phosphate buffer pH 7.5, 1 M NaCl, 5% glycerol and 5 mM DTT.

Crystallization, data collection and structure solution. Crystals were obtained by sitting drop vapour diffusion in which 150 nl of selenomethionine-labelled G2PΔN at 30 mg/ml was mixed with 150 nl of reservoir solution containing 80 mM Tris pH 8.5 and 2.3 M Li2SO4. Crystals grew after approximately 2 months and were frozen directly from the drop. The X-ray data were collected at the European Synchrotron Radiation Facility (ESRF) beamline BM14 and the structure was determined as described in the supplementary information online. Normal mode analysis was performed using the ElNemo server (Suhre & Sanejouand, 2004). Coordinates and structure factors have been deposited with the Protein Data Bank under accession codes 2WBN and 2WC9.

Endonuclease activity assays. Nuclease activity was observed by degradation of plasmid DNA in Escherichia coli as described (Kanamaru et al, 2004). Briefly, each G2P construct was transformed into E. coli BL21 Rosetta pLysS cells. Luria–Bertani cultures were inoculated from overnight cultures and grown at 37°C. Expression was induced by the addition of 1 mM isopropylthiogalactoside at an A600 of approximately 0.8 for 2 h at 37°C. Samples of each culture were taken before induction and after expression to prepare plasmid DNA (4 ml samples) and check protein expression (1 ml samples). The expression of each construct was verified by SDS–polyacrylamide gel electrophoresis. Plasmid DNA was prepared using a commercial miniprep kit (Qiagen, Hilden, Germany) and analysed by 1% agarose gel electrophoresis.

Supplementary information is available at EMBO reports online (http://www.emboreports.org)

Supplementary Material

Supplementary Mov S1


We thank Robert Byrne and Andrey Lebedev for help during structure determination and Martin Walsh for help during the data collection at the UK beamline BM14 (ESRF, Grenoble), which is supported by the UK Biotechnology and Biological Sciences Research Council, Engineering and Physical Sciences Research Council and Medical Research Council. This study was supported by a Wellcome Trust fellowship to A.A.A. and by grants CSD2007-00010 from Ministerio de Educación y Ciencias and S-0505/-MAT0283 from Comunidad de Madrid to J.C.A.


The authors declare that they have no conflict of interest.


  • Alam TI, Draper B, Kondabagil K, Rentas FJ, Ghosh-Kumar M, Sun S, Rossmann MG, Rao VB (2008) The headful packaging nuclease of bacteriophage T4. Mol Microbiol 69: 1180–1190
  • Alonso JC, Tavares P, Lurz R, Trautner TA (2006) Bacteriophage SPP1, 2nd edn. Oxford, UK: Oxford University Press
  • Black LW (1989) DNA packaging in dsDNA bacteriophages. Annu Rev Microbiol 43: 267–292
  • Bravo A, Alonso JC, Trautner TA (1990) Functional analysis of the Bacillus subtilis bacteriophage SPP1 pac site. Nucleic Acids Res 18: 2881–2886
  • Camacho AG, Gual A, Lurz R, Tavares P, Alonso JC (2003) Bacillus subtilis bacteriophage SPP1 DNA packaging motor requires terminase and portal proteins. J Biol Chem 278: 23251–23259
  • Catalano CE (2000) The terminase enzyme from bacteriophage lambda: a DNA-packaging machine. Cell Mol Life Sci 57: 128–148
  • Catalano CE, Cue D, Feiss M (1995) Virus DNA packaging: the strategy used by phage λ. Mol Microbiol 16: 1075–1086
  • Chai S, Lurz R, Alonso JC (1995) The small subunit of the terminase enzyme of Bacillus subtilis bacteriophage SPP1 forms a specialized nucleoprotein complex with the packaging initiation region. J Mol Biol 252: 386–398
  • Cue D, Feiss M (1997) Genetic evidence that recognition of cosQ, the signal for termination of phage λ DNA packaging, depends on the extent of head filling. Genetics 147: 7–17
  • Deichelbohrer I, Messer W, Trautner TA (1982) Genome of Bacillus subtilis bacteriophage SPP1: structure and nucleotide sequence of pac, the origin of DNA packaging. J Virol 42: 83–90
  • Fujisawa H, Morita M (1997) Phage DNA packaging. Genes Cells 2: 537–545
  • Gual A, Alonso JC (1998) Characterization of the small subunit of the terminase enzyme of the Bacillus subtilis bacteriophage SPP1. Virology 242: 279–287
  • Gual A, Camacho AG, Alonso JC (2000) Functional analysis of the terminase large subunit, G2P, of Bacillus subtilis bacteriophage SPP1. J Biol Chem 275: 35311–35319
  • Kanamaru S, Kondabagil K, Rossmann MG, Rao VB (2004) The functional domains of bacteriophage T4 terminase. J Biol Chem 279: 40795–40801
  • Krissinel E, Henrick K (2004) Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions. Acta Crystallogr D Biol Crystallogr 60: 2256–2268
  • Nowotny M, Gaidamakov SA, Ghirlando R, Cerritelli SM, Crouch RJ, Yang W (2007) Structure of human RNase H1 complexed with an RNA/DNA hybrid: insight into HIV reverse transcription. Mol Cell 28: 264–276
  • Ponchon L, Boulanger P, Labesse G, Letellier L (2006) The endonuclease domain of bacteriophage terminases belongs to the resolvase/integrase/ribonuclease H superfamily: a bioinformatics analysis validated by a functional study on bacteriophage T5. J Biol Chem 281: 5829–5836
  • Ramakrishnan V, Finch JT, Graziano V, Lee PL, Sweet RM (1993) Crystal structure of globular domain of histone H5 and its implications for nucleosome binding. Nature 362: 219–223
  • Rao VB, Feiss M (2008) The bacteriophage DNA packaging motor. Annu Rev Genet 42: 647–681
  • Steitz TA, Steitz JA (1993) A general two-metal-ion mechanism for catalytic RNA. Proc Natl Acad Sci USA 90: 6498–6502
  • Suhre K, Sanejouand YH (2004) ElNemo: a normal mode web server for protein movement analysis and the generation of templates for molecular replacement. Nucleic Acids Res 32: W610–W614
  • Sun S, Kondabagil K, Draper B, Alam TI, Bowman VD, Zhang Z, Hegde S, Fokine A, Rossmann MG, Rao VB (2008) The structure of the phage T4 DNA packaging motor suggests a mechanism dependent on electrostatic forces. Cell 135: 1251–1262
  • Yang W, Lee JY, Nowotny M (2006) Making and breaking nucleic acids: two-Mg2+-ion catalysis and substrate specificity. Mol Cell 22: 5–13

Articles from EMBO Reports are provided here courtesy of The European Molecular Biology Organization