J Bacteriol. Feb 2007; 189(4): 1473–1477.
Published online Sep 29, 2006. doi:  10.1128/JB.01227-06
PMCID: PMC1797351

Complete Genome of Acute Rheumatic Fever-Associated Serotype M5 Streptococcus pyogenes Strain Manfredo


Comparisons of the 1.84-Mb genome of serotype M5 Streptococcus pyogenes strain Manfredo with previously sequenced genomes emphasized the role of prophages in diversification of S. pyogenes and the close relationship between strain Manfredo and MGAS8232, another acute rheumatic fever-associated strain.

Streptococcus pyogenes (also known as group A Streptococcus) is responsible for diverse diseases in humans, including pharyngitis, toxic shock syndrome, impetigo, and scarlet fever, and the postinfection sequela acute rheumatic fever (ARF) (7). We sequenced the genome of a serotype M5 strain of S. pyogenes, strain Manfredo, which was isolated from an ARF patient in 1952 in the United States (19). The genomes of 11 other S. pyogenes strains have been sequenced, including strains that were associated with various clinical conditions and representatives of the following eight serotypes: M1 (10, 22), M2 (4), M3 (5, 14), M4 (4), M6 (3), M12 (4), M18 (20), and M28 (11). Based on multilocus sequence typing (MLST), which has been used to investigate the genetic relationships of S. pyogenes (8), the 12 sequenced S. pyogenes strains can be placed into nine sequence types (STs) that are distributed throughout the S. pyogenes population (Fig. 1). Strain Manfredo (ST99) is most closely related to the serotype M6 strain MGAS10394 (ST382) (3) and the serotype M18 strain MGAS8232 (ST42) (20). In each case, three of the seven MLST alleles are identical to those of Manfredo (www.mlst.net), and only two of these alleles are present in all three strains, suggesting that although these strains are clearly related, they are not clonal.
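The allele-sharing comparison described above can be illustrated with a minimal sketch. The locus names are those of the S. pyogenes MLST scheme (8); the allele numbers are hypothetical placeholders, not values from www.mlst.net, chosen only to reproduce the pattern described (three shared alleles per pair with Manfredo, two shared by all three strains).

```python
# Seven housekeeping loci of the S. pyogenes MLST scheme.
LOCI = ("gki", "gtr", "murI", "mutS", "recP", "xpt", "yqiL")

# HYPOTHETICAL allele numbers, for illustration only (not database values).
PROFILES = {
    "Manfredo":  (1, 2, 3, 4, 5, 6, 7),
    "MGAS10394": (1, 2, 3, 14, 15, 16, 17),
    "MGAS8232":  (1, 2, 23, 4, 25, 26, 27),
}


def shared_loci(*strains):
    """Return the loci at which all named strains carry the same allele."""
    profiles = [PROFILES[s] for s in strains]
    return [
        locus
        for i, locus in enumerate(LOCI)
        if len({p[i] for p in profiles}) == 1
    ]


print(shared_loci("Manfredo", "MGAS10394"))
print(shared_loci("Manfredo", "MGAS8232"))
print(shared_loci("Manfredo", "MGAS10394", "MGAS8232"))
```

Strains sharing all seven alleles would belong to the same ST; partial sharing, as here, indicates relatedness without clonality.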

FIG. 1.
Phylogenetic diversity of the sequenced S. pyogenes strains: unrooted neighbor-joining tree constructed using concatenated sequences of the seven loci used in MLST for a representative selection of STs from the S. pyogenes MLST database (www.mlst.net ...

The Manfredo sequence was assembled, finished, and annotated as described previously (12, 15), using Artemis to collate data and facilitate annotation (18). The genome consists of a single 1,841,271-bp circular chromosome, which contains 1,819 protein-coding sequences (CDSs). Approximately 14% of these CDSs (254 CDSs) are contained in prophages. A comparison of all the other S. pyogenes sequenced strains except serotype M3 strain SSI-1 showed that the genomes are colinear (14), with multiple prophage insertions throughout the chromosomes (2). In the case of Manfredo and SSI-1 there is a large central inversion (~1.3 Mb) (Fig. 2), which probably resulted from reciprocal recombination between rrn-comX regions that are similar distances from the terminus of replication (14). Notably, this large inversion is visible in the comparison of Manfredo with MGAS10394 and MGAS8232 (Fig. 2), which are more closely related to Manfredo than SSI-1 is (Fig. 1). This suggests that the inversion occurred independently in Manfredo and SSI-1. There is an additional smaller (~200-kb) rearrangement near the terminus of strain SSI-1 (compared to MGAS315) (Fig. 2), due to reciprocal recombination between prophages across the replication axis (14). This smaller inversion was not found in the Manfredo chromosome, which lacks prophages inserted at equivalent sites (Fig. 2). It seems clear that intrachromosomal recombination is an important mechanism contributing to the evolution of both S. pyogenes genomes and prophages and has the potential to generate novel recombinant prophages with alternative cargos (14).
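The effect of an inversion between two inverted repeats can be sketched with a toy model, assuming the chromosome is represented as an ordered list of (block, strand) pairs. The block names and coordinates below are hypothetical and do not correspond to annotated features; the sketch shows only that the intervening segment is reversed in order and flipped in strand, which is why such a region appears as a crossed pattern in a pairwise comparison.

```python
def invert(chromosome, start, end):
    """Return a copy in which chromosome[start:end] is reversed in order
    and every block in that segment has its strand flipped."""
    segment = [(name, -strand) for name, strand in reversed(chromosome[start:end])]
    return chromosome[:start] + segment + chromosome[end:]


# HYPOTHETICAL layout: two repeats in opposite orientation flank blocks B-D.
chromosome = [
    ("A", 1),
    ("repeat_1", 1),
    ("B", 1),
    ("C", 1),
    ("D", 1),
    ("repeat_2", -1),
    ("E", 1),
]

inverted = invert(chromosome, 2, 5)
print([name for name, _ in inverted])
```

Applying the same inversion twice restores the original order, consistent with reciprocal recombination between the same pair of repeats being reversible.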

FIG. 2.
Comparison of the genome structures of S. pyogenes: pairwise comparisons of the S. pyogenes MGAS315, SSI-1, Manfredo, MGAS10394, and MGAS8232 chromosomes displayed using the Artemis Comparison Tool (ACT) (6). The sequences were aligned using the predicted ...

The Manfredo genome contains five prophages (φMan.1, φMan.2, φMan.3, φMan.4, and φMan.5) (Fig. 2) that exhibit mosaic relationships with other S. pyogenes prophages. To illustrate the relationships of the prophages, for each of the sequenced S. pyogenes strains all of the resident prophages were concatenated (joined end to end) and compared using DOTTER (21), which displays nucleotide similarity as a dot matrix plot (Fig. 3). In Fig. 3, individual colored bars on the x and y axes represent the concatenated prophage sequences for each of the strains indicated, and vertical and horizontal lines indicate the junctions between individual prophages on the x and y axes, respectively. Diagonal lines indicate sequence similarity; self-matching of the concatenated sequences generated the continuous central diagonal line. Lines on either side of the continuous central diagonal line indicate regions where there is extended sequence identity in the forward (parallel) and reverse (perpendicular) orientations for intersecting sequences. Of particular note is the fact that the distribution of prophages and the overall genetic relatedness between the strains are clearly not congruent. Divergent prophages are inserted at exactly the same sites in closely related strains, while identical sites in divergent strains can be occupied by highly conserved prophages. For example, φMan.5 and φ10394.8 are clearly distinct prophages (Fig. 3) occupying the same sites in the closely related serotype M5 strain Manfredo and serotype M6 strain MGAS10394 (Fig. 1), while φMan.4 and φ10750.1 are similar prophages (Fig. 3) at the same chromosomal location in distantly related serotype M5 strain Manfredo and serotype M4 strain MGAS10750 (Fig. 3).
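The principle of the dot matrix comparison can be sketched as follows. This is not DOTTER's algorithm, which scores sliding windows with dynamic threshold control (21); it is a simplified exact k-mer match, with function names and the word size chosen here for illustration. Forward matches fall on diagonals parallel to the main diagonal, and reverse-complement matches fall on perpendicular diagonals.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def concatenate(prophages):
    """Join sequences end to end; also return the junction offsets."""
    joined, junctions, offset = "", [], 0
    for seq in prophages:
        joined += seq
        offset += len(seq)
        junctions.append(offset)
    return joined, junctions[:-1]  # the final offset is the end, not a junction


def kmer_index(seq, k):
    """Map every k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def dot_matrix(x, y, k=4):
    """Return (forward, reverse) sets of (i, j) dots, i along x and j along y."""
    index = kmer_index(x, k)
    forward, reverse = set(), set()
    for j in range(len(y) - k + 1):
        word = y[j:j + k]
        forward.update((i, j) for i in index.get(word, ()))
        reverse.update((i, j) for i in index.get(reverse_complement(word), ()))
    return forward, reverse


# Toy sequences (hypothetical, far shorter than real prophages).
x, x_junctions = concatenate(["ACGGTCA", "TTGACCA"])
forward, reverse = dot_matrix(x, x, k=4)
print(x_junctions, len(forward), len(reverse))
```

Plotting the `forward` and `reverse` points, with lines drawn at the junction offsets, would give a picture of the same kind as the figure, in which the self-comparison produces the continuous central diagonal.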

FIG. 3.
Comparative nucleotide sequence analysis of S. pyogenes prophages: dot matrix showing the relatedness of the nucleotide sequences of prophages generated with DOTTER (21). The prophages used in the comparison (in order) were joined end to end and were ...

Four of the five Manfredo prophages encode putative virulence factors that are identical or very similar (>98% amino acid identity) to proteins encoded by CDSs present in prophages in other S. pyogenes strains. These factors are as follows: for φMan.1, streptococcal phage DNase SpyM50534; for φMan.2, streptococcal phage DNase SpyM50691; for φMan.3, exotoxin H SpyM51021 and exotoxin I (pseudogene) SpyM501024; and for φMan.4, streptococcal phage DNase SpyM501263 and exotoxin C SpyM501264. The fifth prophage, φMan.5, does not contain any CDSs with similarity to CDSs encoding known virulence factors and appears to be a satellite phage. The potential for prophage recombination events to generate novel combinations of virulence genes is highlighted by the fact that mosaic structures are evident in the regions carrying the virulence determinants. For example, prophages φMan.2 and φ8232.4, which are inserted at the same attachment site in their respective genomes, exhibit extended similarity to each other but also include clearly divergent sequences (Fig. 3). The sequences at the left ends of these prophages are divergent and encode streptococcal phage DNases with less than 30% amino acid identity. Notably, the streptococcal phage DNase of φ8232.4 is almost identical (99.624% amino acid identity) to the streptococcal phage DNase encoded by φMan.1, a prophage that displays far less sequence conservation with φ8232.4 than φMan.2 displays (Fig. 3).
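Percent amino acid identity values such as those quoted above depend on the alignment and on the denominator used. A minimal sketch, assuming the two protein sequences have already been aligned (gaps written as "-") and that identity is counted over all columns in which at least one sequence has a residue, is given below; other conventions exist, and the sequences shown are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns that are not gap-only."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    # Gap-only columns were removed, so a == b implies two identical residues.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# HYPOTHETICAL aligned fragments, for illustration only.
print(percent_identity("MKT-YLA", "MKTAYLA"))
```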

Excluding genes associated with prophages, ~68% of the CDSs in Manfredo have orthologs in all of the other sequenced strains. Therefore, taking into account the contribution that prophages make, between 14 and 18% of the genome is composed of CDSs that are not conserved in one or more of the sequenced strains. Some of the CDSs in this variable component may contribute to the clinical differences between the strains. For the 12 sequenced strains, there is very strong evidence linking both serotype M5 strain Manfredo and serotype M18 strain MGAS8232 with ARF (7, 20), and the genetic relationship between these two strains is interesting (Fig. 1). However, reciprocal FASTA (16) analysis identified only a single CDS that was conserved in Manfredo and MGAS8232 but absent in the other S. pyogenes strains. This CDS encodes a surface-anchored protein (SpyM50104) that is a shortened allelic variant of a fibronectin-binding protein commonly found in the variable FCT (9) region of S. pyogenes genomes. The fibronectin-binding protein variants of Manfredo and MGAS8232 appear to be truncated compared to the other variants, but they retain functionally important N-terminal signal and C-terminal sortase processing motifs (VPXTG) that predict that they are likely to be expressed on the cell surface. However, while the conservation of this locus in strains that have been very clearly associated with ARF may be worthy of further investigation, it must be emphasized that there is no evidence at this stage that the locus influences the pathology of these strains. It must also be emphasized that it is by no means clear that these are the only two representatives of the 12 sequenced strains that have the capacity to cause ARF, since establishing epidemiological relationships between ARF and individual S. pyogenes strains is notoriously difficult (13).
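The logic of a reciprocal best-hit search for CDSs restricted to a pair of strains can be sketched as follows. The sketch assumes that best-hit tables have already been produced by a similarity search such as FASTA (16); the table layout, function names, and CDS identifiers are hypothetical, and no score or identity thresholds are modeled.

```python
def reciprocal_best_hits(a_to_b, b_to_a):
    """Keep pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    return {a: b for a, b in a_to_b.items() if b_to_a.get(b) == a}


def restricted_to(orthologs_by_strain, partner, others):
    """Reference CDSs with an ortholog in `partner` and in none of `others`.

    orthologs_by_strain maps strain name -> {reference CDS: ortholog}.
    """
    in_partner = set(orthologs_by_strain[partner])
    in_others = set().union(*(orthologs_by_strain[s] for s in others))
    return in_partner - in_others


# HYPOTHETICAL best-hit tables; identifiers are placeholders.
ref_to_p = {"ref_1": "p_1", "ref_2": "p_2", "ref_3": "p_9"}
p_to_ref = {"p_1": "ref_1", "p_2": "ref_2", "p_9": "ref_7"}
ref_to_q = {"ref_1": "q_1"}
q_to_ref = {"q_1": "ref_1"}

orthologs = {
    "partner": reciprocal_best_hits(ref_to_p, p_to_ref),
    "other": reciprocal_best_hits(ref_to_q, q_to_ref),
}
print(restricted_to(orthologs, "partner", ["other"]))
```

In this toy example `ref_3` is discarded because its best hit is not reciprocal, and `ref_1` is discarded because it also has an ortholog in the other strain, leaving `ref_2` as the only CDS restricted to the pair.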
Alternatively, it is possible that subtle differences in the conserved components of the genome may be more important in distinguishing ARF isolates. Investigating the distribution and functional effects of single-nucleotide polymorphisms (SNPs) in ARF strains and related non-ARF strains may hold the key to unraveling the causes of this complex disease. To this end, 14,962 SNPs were identified in the comparison of ARF strains (Manfredo and MGAS8232), in contrast to the 15,460 SNPs identified in a comparison of similarly related strains (Manfredo and MGAS10394). To pinpoint the functionally significant SNPs among the genetic noise, wider population studies are required.
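Counting SNPs between two strains reduces, in its simplest form, to listing the aligned positions at which both genomes carry a base and the bases differ. The sketch below assumes a precomputed pairwise alignment with gaps written as "-"; it is an illustration of the principle rather than the procedure used to obtain the counts above, and the sequences are hypothetical.

```python
def find_snps(aligned_ref, aligned_query):
    """Return (column, ref_base, query_base) for each single-base difference.

    Columns are 1-based alignment columns; gaps and ambiguous bases are skipped.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    snps = []
    for column, (r, q) in enumerate(zip(aligned_ref, aligned_query), start=1):
        if r != q and r in "ACGT" and q in "ACGT":
            snps.append((column, r, q))
    return snps


# HYPOTHETICAL aligned fragments, for illustration only.
snps = find_snps("ACGTAC-T", "ACCTAGAT")
print(len(snps), snps)
```

Column 7 is an insertion or deletion rather than a substitution and is therefore not counted.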

Nucleotide sequence accession number.

The sequence and annotation of the Manfredo genome have been deposited in the EMBL database under accession number AM295007.


We acknowledge the support of the Wellcome Trust Sanger Institute core sequencing and informatics groups.

This work was supported by the Wellcome Trust through its Beowulf Genomics initiative.


Published ahead of print on 29 September 2006.


1. Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman. 1990. Basic local alignment search tool. J. Mol. Biol. 215:403-410.
2. Banks, D. J., S. B. Beres, and J. M. Musser. 2002. The fundamental contribution of phages to GAS evolution, genome diversification and strain emergence. Trends Microbiol. 10:515-521.
3. Banks, D. J., S. F. Porcella, K. D. Barbian, S. B. Beres, L. E. Philips, J. M. Voyich, F. R. DeLeo, J. M. Martin, G. A. Somerville, and J. M. Musser. 2004. Progress toward characterization of the group A Streptococcus metagenome: complete genome sequence of a macrolide-resistant serotype M6 strain. J. Infect. Dis. 190:727-738.
4. Beres, S. B., E. W. Richter, M. J. Nagiec, P. Sumby, S. F. Porcella, F. R. DeLeo, and J. M. Musser. 2006. Molecular genetic anatomy of inter- and intraserotype variation in the human bacterial pathogen group A Streptococcus. Proc. Natl. Acad. Sci. USA 103:7059-7064.
5. Beres, S. B., G. L. Sylva, K. D. Barbian, B. F. Lei, J. S. Hoff, N. D. Mammarella, M. Y. Liu, J. C. Smoot, S. F. Porcella, L. D. Parkins, D. S. Campbell, T. M. Smith, J. K. McCormick, D. Y. M. Leung, P. M. Schlievert, and J. M. Musser. 2002. Genome sequence of a serotype M3 strain of group A Streptococcus: phage-encoded toxins, the high-virulence phenotype, and clone emergence. Proc. Natl. Acad. Sci. USA 99:10078-10083.
6. Carver, T. J., K. Rutherford, M. Berriman, M. A. Rajandream, B. Barrell, and J. Parkhill. 2005. ACT: the Artemis comparison tool. Bioinformatics 21:3422-3423.
7. Cunningham, M. W. 2000. Pathogenesis of group A streptococcal infections. Clin. Microbiol. Rev. 13:470-511.
8. Enright, M. C., B. G. Spratt, A. Kalia, J. H. Cross, and D. E. Bessen. 2001. Multilocus sequence typing of Streptococcus pyogenes and the relationships between emm type and clone. Infect. Immun. 69:2416-2427.
9. Felsenstein, J. 1989. PHYLIP - Phylogeny Inference Package (version 3.2). Cladistics 5:164-166.
10. Ferretti, J. J., W. M. McShan, D. Ajdic, D. J. Savic, G. Savic, K. Lyon, C. Primeaux, S. Sezate, A. N. Suvorov, S. Kenton, H. S. Lai, S. P. Lin, Y. D. Qian, H. G. Jia, F. Z. Najar, Q. Ren, H. Zhu, L. Song, J. White, X. L. Yuan, S. W. Clifton, B. A. Roe, and R. McLaughlin. 2001. Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc. Natl. Acad. Sci. USA 98:4658-4663.
11. Green, N. M., S. Zhang, S. F. Porcella, M. J. Nagiec, K. D. Barbian, S. B. Beres, R. B. LeFebvre, and J. M. Musser. 2005. Genome sequence of a serotype M28 strain of group A Streptococcus: potential new insights into puerperal sepsis and bacterial disease specificity. J. Infect. Dis. 192:760-770.
12. Holden, M. T. G., E. J. Feil, J. A. Lindsay, S. J. Peacock, N. P. J. Day, M. C. Enright, T. J. Foster, C. E. Moore, L. Hurst, R. Atkin, A. Barron, N. Bason, S. D. Bentley, C. Chillingworth, T. Chillingworth, C. Churcher, L. Clark, C. Corton, A. Cronin, J. Doggett, L. Dowd, T. Feltwell, Z. Hance, B. Harris, H. Hauser, S. Holroyd, K. Jagels, K. D. James, N. Lennard, A. Line, R. Mayes, S. Moule, K. Mungall, D. Ormond, M. A. Quail, E. Rabbinowitsch, K. Rutherford, M. Sanders, S. Sharp, M. Simmonds, K. Stevens, S. Whitehead, B. G. Barrell, B. G. Spratt, and J. Parkhill. 2004. Complete genomes of two clinical Staphylococcus aureus strains: evidence for the rapid evolution of virulence and drug resistance. Proc. Natl. Acad. Sci. USA 101:9786-9791.
13. Kehoe, M. A., V. Kapur, A. M. Whatmore, and J. M. Musser. 1996. Horizontal gene transfer among group A streptococci: implications for pathogenesis and epidemiology. Trends Microbiol. 4:436-443.
14. Nakagawa, I., K. Kurokawa, A. Yamashita, M. Nakata, Y. Tomiyasu, N. Okahashi, S. Kawabata, K. Yamazaki, T. Shiba, T. Yasunaga, H. Hayashi, M. Hattori, and S. Hamada. 2003. Genome sequence of an M3 strain of Streptococcus pyogenes reveals a large-scale genomic rearrangement in invasive strains and new insights into phage evolution. Genome Res. 13:1042-1055.
15. Parkhill, J., M. Achtman, K. D. James, S. D. Bentley, C. Churcher, S. R. Klee, G. Morelli, D. Basham, D. Brown, T. Chillingworth, R. M. Davies, P. Davis, K. Devlin, T. Feltwell, N. Hamlin, S. Holroyd, K. Jagels, S. Leather, S. Moule, K. Mungall, M. A. Quail, M. A. Rajandream, K. M. Rutherford, M. Simmonds, J. Skelton, S. Whitehead, B. G. Spratt, and B. G. Barrell. 2000. Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491. Nature 404:502-506.
16. Pearson, W. R., and D. J. Lipman. 1988. Improved tools for biological sequence comparison. Proc. Natl. Acad. Sci. USA 85:2444-2448.
17. Perriere, G., and M. Gouy. 1996. WWW-query: an on-line retrieval system for biological sequence banks. Biochimie 78:364-369.
18. Rutherford, K., J. Parkhill, J. Crook, T. Horsnell, P. Rice, M. A. Rajandream, and B. Barrell. 2000. Artemis: sequence visualization and annotation. Bioinformatics 16:944-945.
19. Siegel, A., E. Johnson, and G. Stollerman. 1961. Controlled studies of streptococcal pharyngitis in a pediatric population. 1. Factors related to the attack rate of rheumatic fever. N. Engl. J. Med. 265:559-566.
20. Smoot, J. C., K. D. Barbian, J. J. Van Gompel, L. M. Smoot, M. S. Chaussee, G. L. Sylva, D. E. Sturdevant, S. M. Ricklefs, S. F. Porcella, L. D. Parkins, S. B. Beres, D. S. Campbell, T. M. Smith, Q. Zhang, V. Kapur, J. A. Daly, L. G. Veasy, and J. M. Musser. 2002. Genome sequence and comparative microarray analysis of serotype M18 group A Streptococcus strains associated with acute rheumatic fever outbreaks. Proc. Natl. Acad. Sci. USA 99:4668-4673.
21. Sonnhammer, E. L., and R. Durbin. 1995. A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene 167:GC1-10.
22. Sumby, P., S. F. Porcella, A. G. Madrigal, K. D. Barbian, K. Virtaneva, S. M. Ricklefs, D. E. Sturdevant, M. R. Graham, J. Vuopio-Varkila, N. P. Hoe, and J. M. Musser. 2005. Evolutionary origin and emergence of a highly successful clone of serotype M1 group A Streptococcus involved multiple horizontal gene transfer events. J. Infect. Dis. 192:771-782.
23. Thompson, J. D., T. J. Gibson, F. Plewniak, F. Jeanmougin, and D. G. Higgins. 1997. The CLUSTAL_X Windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25:4876-4882.

Articles from Journal of Bacteriology are provided here courtesy of American Society for Microbiology (ASM)