Proc Natl Acad Sci U S A. Feb 12, 2008; 105(6): 2123–2127.
Published online Feb 5, 2008. doi:  10.1073/pnas.0710333105
PMCID: PMC2538888

Superinfection as a driver of genomic diversification in antigenically variant pathogens


A new pathogen strain can penetrate an immune host population only if it can escape immunity generated against the original strain. This model is best understood with influenza viruses, in which genetic drift creates antigenically distinct strains that can spread through host populations despite the presence of immunity against previous strains. Whether this selection model for new strains applies to complex pathogens responsible for endemic persistent infections, such as anaplasmosis, relapsing fever, and sleeping sickness, remains untested. These complex pathogens undergo rapid within-host antigenic variation by using sets of chromosomally encoded variants. Consequently, immunity is developed against a large repertoire of variants, dramatically changing the scope of genetic change needed for a new strain to evade existing immunity and establish coexisting infection, termed strain superinfection. Here, we show that the diversity in the alleles encoding antigenic variants between strains of a highly antigenically variant pathogen was equal to the diversity within strains, reflecting equivalent selection for variants to overcome immunity at the host population level as within an individual host. This diversity among strains resulted in expression of nonoverlapping variants that allowed a new strain to evade immunity and establish superinfection. Furthermore, we demonstrated that a single distinct allele allows strain superinfection. These results indicate that there is strong selective pressure to increase the diversity of the variant repertoire beyond what is needed for persistence within an individual host and provide an explanation, competition at the host population level, for the large genomic commitment to variant gene families in persistent pathogens.

Keywords: antigenic variation, genomic diversity, population immunity

Pathogens that establish persistent infection in individual hosts increase their success for onward transmission to a susceptible population. Antigenic variation allows evasion of immune clearance even within a fully immunocompetent individual host and is a common strategy of pathogens ranging from small genome RNA viruses to eukaryotic parasites. Unlike the RNA viruses that use a high mutation rate and large viral burst size to create new variants de novo (1), bacterial and protozoal pathogens, including those responsible for relapsing fever, sleeping sickness, and syphilis, generate new variants by regulatory or recombinatorial mechanisms in which preexisting coding sequences are differentially activated or translocated into active expression sites (2, 3). As the host develops immunity to each variant, the pathogen generates additional novel variants using this genomic store, allowing persistence but, at the same time, inducing immunity against a broad repertoire of antigens. Within endemic regions this process results in both a high prevalence of infection and widespread population immunity. This epidemiologic pattern is exemplified by vector-borne pathogens in tropical regions, and we hypothesize that this population immunity provides a strong selective pressure for diversification of the variant-encoding alleles and emergence of new strains.

We chose to test this hypothesis by using Anaplasma marginale, the most globally prevalent vector-borne pathogen of livestock (4). A. marginale is naturally transmitted among wild and domestic ruminants and, in tropical, endemic regions, prevalence exceeds 70% (5, 6). Newborn animals are infected within the first few months and, through sequential generation of antigenic variants, remain infected for life (7, 8). Antigenic variants are generated by gene conversion in which an intact hypervariable region (HVR) or an HVR oligonucleotide segment is recombined from nonexpressed chromosomal loci into the major surface protein-2 (Msp2) expression site (9, 10). This mechanism is very similar to that used by African trypanosomes in which the complete (or nearly complete) variable surface glycoprotein (VSG) coding sequences or oligonucleotide segments are recombined into an expression site to generate a new VSG variant (11). Importantly, A. marginale has a relatively small genome (1/20th the size of Trypanosoma brucei), facilitating the sequencing of the complete repertoire of chromosomal msp2 donor alleles in each of multiple strains (9, 12, 13), data required to test the hypothesis but not yet available for other highly antigenically variable vector-borne pathogens such as Babesia, Plasmodium, and Trypanosoma (14–17). In this article, we report the genomic-level allelic diversity among multiple strains of A. marginale, test whether allelic diversity between strains is linked to ability to infect an already persistently infected host, and propose a model in which host population immunity is a driving selective pressure for genomic diversification in highly antigenically variant pathogens.


Allelic Diversity Encoded Within and Among A. marginale Strains.

We first compared the diversity among the HVRs encoded by the full set of genomic msp2 donor alleles within each of five A. marginale strains, St. Maries, Florida, 6DE, B, and EMΦ (12, 13). Within-strain HVR sequence diversity encoded by the donor alleles was similar (Fig. 1 Left), consistent with the capacity of each strain to generate sufficient variation for long-term within-host persistent infection using either recombination of a complete unique HVR or an HVR segment into the expression site (9, 10, 18). Strikingly, HVR diversity between strains (Fig. 1 Right) was similar to within-strain diversity (no significant difference, P > 0.2, when mean, maximal, or minimal levels of diversity were compared by using the Mann–Whitney U test). This high degree of interstrain msp2 allelic diversity occurs despite overall genomewide synteny in gene order and content between strains and is illustrated by the markedly greater diversity in msp2 alleles among strains as compared with proteins encoded by groEL, groES, atpA, recA, and gltA (diversity in msp2 alleles was significantly greater than allelic diversity in these five loci, P < 0.0006). Although identical msp2 alleles were shared between certain strain pairs, each strain contained at least one unique msp2 allele in all pairwise comparisons, whereas the EMΦ strain had no identical alleles with any of the examined strains [supporting information (SI) Fig. 6]. Diversity was not associated with geographic distance of isolation: the degree of msp2 allelic diversity among 6DE, B, and EMΦ, each of which was isolated from the same population of animals, was as great as between these three strains and the St. Maries and Florida strains that were from distinct isolations (Fig. 1 Right). This high level of interstrain diversity was also observed when locus-specific alleles were compared by using the complete genome sequences of the St. Maries and Florida strains (SI Fig. 6).

Fig. 1.
Diversity in A. marginale msp2 allelic repertoire. The diversity among the complete genomic set of alleles within individual strains (Left) or between strain pairs (Right) was determined by alignment of the encoded HVRs. The range is indicated by the ...

Superinfection by Strains with Nonoverlapping Variant Repertoires.

To test whether encoding a distinct set of allelic variants allowed a strain to superinfect an already persistently infected host, we first established infection in four calves with the St. Maries strain of A. marginale and tracked variant expression by repeated sampling during >12 months of persistent infection. The A. marginale variant population became progressively more complex and had a “complexity score” (18) of >2.4 at 12 months postinfection, reflecting gene conversion using oligonucleotide segments from different msp2 pseudogene alleles to create variant mosaics, and confirming exposure to a broad array of antigenic variants (18). The calves were then challenged by tick transmission of the EMΦ strain. There are no identical msp2 alleles shared between these two strains; the most closely related alleles are 76% identical (SI Fig. 6). Using strain-specific PCR, the presence of the transmitted EMΦ strain, and the established St. Maries strain, was detected by 2 weeks postchallenge in all animals (Fig. 2). Tracking of infection over a 4-month period using quantitative strain-specific PCR revealed continued independent replication of both strains in all animals (mean bacteremia levels of 10^(6.7±0.4) and 10^(6.2±0.3) for the St. Maries and EMΦ strains, respectively). To confirm this observation of strain superinfection, we then repeated the experiment with initial infection of four animals by using the EMΦ strain, establishment of persistent infection for 12 months with development of complex Msp2 variant mosaics, and tick-borne challenge by using the St. Maries strain. Superinfection was detected in all animals within 2 weeks after tick transmission (Fig. 2), and independent replication of both strains continued in all animals over the 4-month observation period (mean bacteremia levels of 10^(6.1±0.4) and 10^(5.4±0.4) for the St. Maries and EMΦ strains, respectively).
In both sets of experiments, tick transmission to naïve control animals resulted in infection with only the single transmitted strain. These results confirm that strains with distinct msp2 allelic repertoires can establish superinfection in long-term persistently infected hosts.

Fig. 2.
Superinfection by strains with nonoverlapping genomic allelic repertoires. The established strain and, after tick transmission, the challenge strain were detected by strain-specific PCR (26) in each animal (nos. 1–4). (Upper) EMΦ strain ...

Superinfection by Strains with a Single Diverse Allele.

Accepting the hypothesis that strains expressing nonoverlapping repertoires of msp2 alleles are capable of superinfection raised the question as to what level of diversity was sufficient to allow superinfection. To test whether a single highly diverse allele was sufficient, we first established infection in four calves with the 6DE strain, and, after 12 months of persistent infection with variant tracking to ensure exposure to a broad array of variants, challenged these animals by tick transmission of the St. Maries strain. All four animals became superinfected with the St. Maries strain, with detection within 2 weeks by strain-specific PCR (Fig. 3A), and both strains maintained infection in all animals. As only the St. Maries 9H1 allele is highly diverse when compared with the alleles in the 6DE strain (63% identity to the most closely related 6DE allele; SI Fig. 6), only this allele should confer the ability to evade the preexisting immunity developed against the full repertoire of 6DE variants. To test whether this 9H1 allele was indeed used to establish superinfection, we amplified and sequenced 168 individual St. Maries strain msp2 variants at 14 days after transmission. All 168 of the sequenced variants contained the 9H1 HVR sequence in the expression site (Fig. 3B). This usage of the St. Maries 9H1 allele at the time of superinfection was significantly (P = 0.002; χ² test of likelihood ratio; α = 0.05) greater than its predicted random expression based on the full genomic complement of alleles. In contrast, the <30% usage of 9H1 in St. Maries strain superinfection of EMΦ animals (for which there would be no specific selection for the expression of the 9H1 allele) and in primary infection of a naïve animal was not significantly different from random (P = 0.37 and 0.11, respectively) (Fig. 3B).
This pattern of preferential usage of unique msp2 alleles was confirmed in the reverse experiment in which the 6DE strain established and maintained superinfection in animals persistently infected with the St. Maries strain (Fig. 4A). There was significantly (P = 0.013; χ² goodness of fit test based on a predicted 1:1 ratio of identical versus nonidentical alleles; α = 0.05) greater usage of the unique 6DE alleles, 35 and 42, as compared with the shared alleles upon superinfection (Fig. 4B). In contrast, there was no evidence of selection for usage of unique alleles in primary 6DE strain infection of a naïve animal (P = 0.69).

Fig. 3.
Superinfection associated with expression of a single unique allele. (A) The established 6DE strain and, after tick transmission, the challenge St. Maries strain were detected by PCR in each of four animals (nos. 1–4). C, naïve control ...
Fig. 4.
Superinfection associated with expression of unique alleles. (A) The established St. Maries strain and, after tick transmission, the challenge 6DE strain were detected by PCR in each of four animals (nos. 1–4). C, naïve control calf; M, ...


Selective pressure limited to the need to generate sufficient antigenic diversity for within-host antigenic variation would predict convergence to an optimal set of alleles, balancing the need for immune escape with retention of surface protein function. Our studies examining the complete genomic repertoire of variant alleles among strains of the bacterium A. marginale did not support this prediction but revealed essentially the same level of diversity among strains as within a strain. A similarly high level of interstrain diversity in the alleles encoding antigenic variants has recently been inferred for the var genes in the malarial parasite Plasmodium falciparum (19). Although the complete repertoire of var alleles was not analyzed (because of the complexity associated with ≈60 alleles per genome), targeted sampling of the var dblα domain revealed extensive diversity among strains.

One explanation for the high degree of interstrain diversity in alleles encoding antigenic variants is that there have been multiple evolutionary pathways to generate sufficient within strain diversity to allow persistent infection in individuals and continued vector-borne transmission. However, the very similar level of allelic diversity among strains as within strains argues that the interstrain diversity is not random but rather reflects an essentially equal selective pressure. The need for a strain not only to be acquired by a vector (selective pressure for within host persistence), but to be successfully transmitted and establish infection in new mammalian hosts within an endemic region that has a high level of population immunity against existing strains could provide this equal selective pressure. This model is illustrated in Fig. 5 where two strains, A and B, which differ in their allelic repertoire are introduced into a population of individual hosts persistently infected with the existing endemic strain. Strain A, which encodes the same allelic variant repertoire as the endemic strain, is not able to establish infection in the population because of the presence of existing immunity against a broad array of the shared antigenic variants. This strain A model is supported by studies within endemic regions where A. marginale strains encoding the same allelic msp2 repertoire do not superinfect (20, 21). In contrast, strain B, which encodes a different set of alleles, can evade preexisting immunity and establish infection within the population as the proportion of strain B-susceptible animals is large. The data presented here using the St. Maries and EMΦ strains support this strain B penetrance model and are consistent with the detection of two strains encoding distinct msp2 alleles within individual animals in populations with high prevalence of infection (13, 20).

Fig. 5.
Model for pathogen strain penetrance into a persistently infected host population as a selection pressure for allelic diversification. Each circle represents an individual host within an endemic population. Strain A (red circles), introduced as either ...

How the allelic repertoire diversifies to generate new strains and on what time scale is not well understood for any of the antigenically complex pathogens. However, the presence of identical alleles at two different chromosomal loci within a strain (observed for both the St. Maries and Florida strains) (12) suggests that gene duplication may provide the template for additional interallelic recombination (22), alone or combined with mutation, to generate diverse alleles with consequent selection at the level of successful superinfection. The ability of a strain with a single diverse allele to superinfect long-term persistently infected hosts, represented by the experiments using the St. Maries and 6DE strains, supports the idea that duplication followed by divergence would confer a significant selective advantage. For A. marginale, diversification of a single allele may be sufficient because each allele is used both in simple recombination and, through segmental gene conversion, in complex mosaics (10, 18), whereas more extensive diversification may be needed to provide selective advantage in other pathogens, such as P. falciparum, that encode variants predominantly by using complete gene sequences (15, 19). Thus this process may lead not only to diversification but also to expansion of the overall genomic commitment to encoding antigenic variants. The latter event has been proposed to explain the presence of >1,500 vsg alleles in T. brucei, a repertoire far in excess of that needed for within-host antigenic variation but which may confer an advantage at the host population level (23, 24). Strong selective pressure for genetic diversification and perhaps genomic expansion as well clearly underscores the challenge for vaccine development to protect against diverse strains within endemic regions and, in addition, may underlie emergence of new strains with distinct transmission and virulence phenotypes.

Materials and Methods

Pathogen Strains and msp2 Allelic Repertoire.

Diversity was determined by using the HVR amino acid sequences encoded by the complete repertoire of msp2 alleles in each A. marginale strain. The alleles were identified in the complete genome sequences of the St. Maries (GenBank accession no. CP000030) (12) and Florida strains (GenBank accession no. EU113268-75). The msp2 alleles in the 6DE (GenBank accession nos. AY928483, AY928497, AY928489, AY928505, AY928507, AY928511), EMΦ (GenBank accession nos. AY928484, AY928485, AY928490, AY928498, AY928506), and B strains (GenBank accession nos. AY928486, AY928491, AY928494, AY928499, AY928502, AY928508) were identified by targeted allelic sequencing (13). To determine the intrastrain diversity of the donor msp2 alleles, the trimmed HVR amino acid sequences within each strain were aligned by using the alignX module of Vector NTI and then adjusted by hand for the best fit. Diversity was reported as a percentage: 100 – percent amino acid identity. Interstrain diversity was similarly determined by using all pairwise comparisons of alleles. The percent identities within a strain and between each strain pair are presented in SI Fig. 6, and the means and ranges for diversity are reported in Fig. 1. To establish baseline diversity among strains, the amino acid sequences encoded by five housekeeping genes (groEL, groES, atpA, recA, and gltA) among five different strains were compared.
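The diversity measure described above (100 minus percent amino acid identity, taken over all pairwise comparisons within a strain or between two strains) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequences are taken to be already aligned to equal length with "-" as the gap character, identity is computed over columns that are not gaps in both sequences (the exact convention used by Vector NTI is not given here), and the function names and toy sequences are hypothetical, not real HVRs.

```python
from itertools import combinations, product

def percent_identity(a, b):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: columns that are gaps ('-') in both sequences are ignored;
    a gap opposite a residue counts as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

def diversity(a, b):
    """Diversity as defined in the text: 100 - percent identity."""
    return 100.0 - percent_identity(a, b)

def intrastrain_diversity(alleles):
    """All pairwise diversities among the alleles of one strain."""
    return [diversity(a, b) for a, b in combinations(alleles, 2)]

def interstrain_diversity(alleles_1, alleles_2):
    """All pairwise diversities between the alleles of two strains."""
    return [diversity(a, b) for a, b in product(alleles_1, alleles_2)]

def summarize(values):
    """Mean and range, the quantities reported per strain or strain pair."""
    return {"mean": sum(values) / len(values),
            "min": min(values),
            "max": max(values)}

# Toy aligned sequences (hypothetical, for illustration only).
strain_x = ["ACDE", "ACDF", "GCDE"]
strain_y = ["ACDE", "GHIK"]

within_x = intrastrain_diversity(strain_x)
between_xy = interstrain_diversity(strain_x, strain_y)
print(summarize(within_x))
print(summarize(between_xy))
```

The same two lists (within-strain and between-strain values) are what a rank-based comparison such as the Mann–Whitney U test would take as input.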

Establishment of Primary Infection.

A total of 16 age- and gender-matched Holstein calves, confirmed to be A. marginale negative by the Msp-5 CI-ELISA (25), were each infected by tick transmission using infected Dermacentor andersoni males. Calves were infected by tick feeding using 6DE (n = 4), EMΦ (n = 4), or St. Maries (n = 8) strain-infected Reynolds Creek D. andersoni. The level of A. marginale bacteremia during persistent infection was determined by quantitative real-time PCR assay for the single copy gene msp5 (26). The expression of complex mosaic variants during persistent infection was confirmed by sequencing of the expression site as described (18). The complexity of the HVR was quantified as described (18) with 0 representing an HVR derived from recombination of a single donor allele, 1 representing a segmental conversion from a second donor allele, 2 representing segmental conversion from three donor alleles, etc. The mean complexity score in all calves progressed to >2.4 at 12 months of infection before the challenge by tick transmission of the second A. marginale strain.
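Read literally, the scoring scheme above assigns an expressed HVR a score equal to the number of distinct donor alleles that contributed to it, minus one. A minimal Python sketch of that reading follows; the representation of a variant as a list of donor-allele labels (one per mapped segment), the function names, and the labels "A", "B", "C" are assumptions made for illustration, and the original scoring procedure in ref. 18 may differ in detail.

```python
def complexity_score(donor_alleles):
    """Score one expressed HVR from the donor allele of each mapped segment.

    0 = derived from a single donor allele,
    1 = segments from two donor alleles,
    2 = segments from three donor alleles, and so on.
    """
    distinct = set(donor_alleles)
    if not distinct:
        raise ValueError("at least one donor allele is required")
    return len(distinct) - 1

def mean_complexity(variants):
    """Mean score over a sample of sequenced expression-site variants."""
    scores = [complexity_score(v) for v in variants]
    return sum(scores) / len(scores)

# Hypothetical sample: each inner list gives the donor allele per segment.
sample = [
    ["A"],            # simple variant: whole HVR from one allele
    ["A", "B", "A"],  # mosaic with a segment from a second allele
    ["A", "B", "C"],  # mosaic drawing on three alleles
]
print([complexity_score(v) for v in sample])
print(mean_complexity(sample))
```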


A cohort of 10 D. andersoni males from the Reynolds Creek stock, infected with the challenge EMΦ strain, were allowed to transmission feed (26) for 7 days on each of four individual Holstein calves persistently infected with the St. Maries strain of A. marginale (representing EMΦ to St. Maries challenge). Similarly, 10 D. andersoni males infected with the challenge St. Maries strain were allowed to transmission feed on each of four individual calves persistently infected with the EMΦ strain (representing St. Maries to EMΦ challenge). The same procedure was used in both St. Maries to 6DE challenge in four calves persistently infected with the 6DE strain, and 6DE to St. Maries challenge in four calves persistently infected with the St. Maries strain. Cohorts of these infected ticks were identically fed on uninfected, naïve calves at the same time to confirm tick transmission of each strain to cattle. For discrimination between the St. Maries and 6DE strains, the msp1α sequence was amplified by using previously reported strain-common primers (26). The St. Maries and EMΦ strains were differentiated by using strain-specific msp1α primers (St. Maries, forward 5′-CAGCAGAGTATGTGTCCTCC-3′ and reverse, 5′-CATTGGAGCGCATCTCTTGC-3′; EMΦ, forward 5′-TGTTAGCAGAGTGTGTGTCCG-3′ and reverse, 5′-GCCTGACCGCTTTGAGATGA-3′). The amplicons were size-separated by electrophoresis in a 1% agarose gel and visualized after staining with ethidium bromide. All amplicons were sequenced to confirm strain identity. Strain-specific quantification was done by using the same amplification primers combined with unique Taqman probes: EMΦ strain, 5′(TET)-CCAGCTGATAGCTCGTCAGCGA-3′; St. Maries strain, 5′(FAM)-TCAGCTGATAGCTCGTTAGCGG-3′. Amplification reactions consisted of an initial denaturation step of 95°C for 10 min, followed by 55 cycles of 95°C for 15 s, 55°C for 15 s, and 72°C for 7 min. Standard curves were generated by using amplification of 10^2 to 10^7 copies of full-length msp1α from each strain cloned into PCR-4 TOPO (Invitrogen).
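Quantification against a standard curve of this kind is conventionally done by regressing the threshold cycle (Ct) on log10 of the known copy number and then inverting the fitted line for unknown samples. The sketch below assumes that conventional approach; the text does not give the fitting procedure, and the Ct values, function names, and the unknown sample are hypothetical (the Ct values were constructed to lie on a perfect line and are not measured data).

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number for a sample."""
    return 10 ** ((ct - intercept) / slope)

# Standards spanning 10^2 to 10^7 copies, as in the text.
standard_copies = [1e2, 1e3, 1e4, 1e5, 1e6, 1e7]
# Hypothetical Ct values lying exactly on Ct = 40 - 3.3 * log10(copies).
standard_ct = [33.4, 30.1, 26.8, 23.5, 20.2, 16.9]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)
unknown_ct = 25.15  # hypothetical sample
estimate = copies_from_ct(unknown_ct, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, "
      f"log10 copies={math.log10(estimate):.2f}")
```

Because each strain has its own probe and its own cloned standard, a separate curve would be fitted per strain under this scheme.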

Allelic Usage in Superinfection.

The expressed allele was identified by amplification and sequencing of the single msp2 expression site (18, 27). The expression site HVR amplicons were cloned into the PCR-4 TOPO vector and transformed into TOP10 Escherichia coli cells. The presence of inserts was confirmed by EcoRI digestion, and a minimum of 90 clones were sequenced to assure detection of ≥5% of the superinfecting strain population with 99% confidence. The superinfecting St. Maries and 6DE variants were mapped to each of the msp2 pseudogene repertoires encoded by both strains as described (18). The strain of origin for identical alleles within the expression site was confirmed based on two polymorphisms (positions 523 and 547 relative to the msp2 ATG) in 6DE relative to St. Maries.
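The figure of 90 clones is consistent with a simple binomial sampling argument: if a variant makes up a fraction f of the population, the chance of missing it in n independently sampled clones is (1 − f)^n, so detection with confidence c requires n ≥ ln(1 − c) / ln(1 − f). The text does not state how the number was derived, so the following Python sketch is an assumed reconstruction, with hypothetical function names, rather than the authors' calculation.

```python
import math

def detection_probability(n_clones, frequency):
    """Probability of seeing a variant at least once in n_clones,
    assuming clones are independent draws from the population."""
    return 1.0 - (1.0 - frequency) ** n_clones

def min_clones(frequency, confidence):
    """Smallest n with detection_probability(n, frequency) >= confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - frequency))

n = min_clones(frequency=0.05, confidence=0.99)
print(n)                               # number of clones required
print(detection_probability(n, 0.05))  # achieved confidence with n clones
```

For f = 0.05 and c = 0.99, ln(0.01) / ln(0.95) ≈ 89.8, which rounds up to the 90 clones cited.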



We thank Ralph Horn, Bev Hunter, James Allison, Melissa Flatt, Duayne Chandler, and Alicia Ewing for technical assistance. This work was supported by National Institutes of Health Grant AI44005, U.S. Department of Agriculture-Cooperative State Research, Education, and Extension Service-National Research Initiative Grant 2004-35600-14175, Wellcome Trust Grant GR075800M, and U.S. Department of Agriculture-Agricultural Research Service Grant 5348-32000-027-00D/-01S.


The authors declare no conflict of interest.

This article is a PNAS Direct Submission.

Data deposition: The sequences reported in this paper have been deposited in the GenBank database (accession nos. EU113268-75).

This article contains supporting information online at www.pnas.org/cgi/content/full/0710333105/DC1.


1. Johnson WE, Desrosiers RC. Annu Rev Med. 2002;53:499–518.
2. Barbour AG, Restrepo BI. Emerg Infect Dis. 2000;6:449–457.
3. Centurion-Lara A, LaFond RE, Hevner K, Godornes C, Molini BJ, Van Voorhis WC, Lukehart SA. Mol Microbiol. 2004;52:1579–1596.
4. Palmer GH, Brown WC, Rurangirwa FR. Microbes Infect. 2000;2:167–176.
5. Guglielmone AA. Vet Parasitol. 1995;57:109–119.
6. Kocan KM, de la Fuente J, Guglielmone AA, Melendez RD. Clin Microbiol Rev. 2003;16:698–712.
7. French DM, Brown WC, Palmer GH. Infect Immun. 1999;67:5834–5840.
8. French DM, McElwain TF, McGuire TC, Palmer GH. Infect Immun. 1998;66:1200–1207.
9. Brayton KA, Knowles DP, McGuire TC, Palmer GH. Proc Natl Acad Sci USA. 2001;98:4130–4135.
10. Brayton KA, Palmer GH, Lundgren A, Yi J, Barbet AF. Mol Microbiol. 2002;43:1151–1159.
11. Taylor JE, Rudenko G. Trends Genet. 2006;22:614–620.
12. Brayton KA, Kappmeyer LS, Herndon DR, Dark MJ, Tibbals DL, Palmer GH, McGuire TC, Knowles DP Jr. Proc Natl Acad Sci USA. 2005;102:844–849.
13. Rodriguez JL, Palmer GH, Knowles DP Jr, Brayton KA. Gene. 2005;361:127–132.
14. Berriman M, Ghedin E, Hertz-Fowler C, Blandin G, Renauld H, Bartholomeu DC, Lennard NJ, Caler E, Hamlin NE, Haas B, et al. Science. 2005;309:416–422.
15. Gardner MJ, Hall N, Fung E, White O, Berriman M, Hyman RW, Carlton JM, Pain A, Nelson KE, Bowman S, et al. Nature. 2002;419:498–511.
16. Al-Khedery B, Allred DR. Mol Microbiol. 2006;59:402–414.
17. Brayton KA, Lau AO, Herndon DR, Hannick L, Kappmeyer LS, Berens SJ, Bidwell SL, Brown WC, Crabtree J, Fadrosh D, et al. PLoS Pathog. 2007;3:1401–1413.
18. Futse JE, Brayton KA, Knowles DP Jr, Palmer GH. Mol Microbiol. 2005;57:212–221.
19. Barry AE, Leliwa-Sytek A, Tavul L, Imrie H, Migot-Nabias F, Brown SM, McVean GA, Day KP. PLoS Pathog. 2007;3:e34.
20. Palmer GH, Knowles DP Jr, Rodriguez JL, Gnad DP, Hollis LC, Marston T, Brayton KA. J Clin Microbiol. 2004;42:5381–5384.
21. Palmer GH, Rurangirwa FR, McElwain TF. J Clin Microbiol. 2001;39:631–635.
22. Rich SM, Sawyer SA, Barbour AG. Proc Natl Acad Sci USA. 2001;98:15038–15043.
23. Marcello L, Barry JD. J Eukaryotic Microbiol. 2007;54:14–17.
24. Barry JD, Marcello L, Morrison LJ, Read AF, Lythgoe K, Jones N, Carrington M, Blandin G, Bohme U, Caler E, et al. Biochem Soc Trans. 2005;33:986–989.
25. Torioni de Echaide S, Knowles DP Jr, McGuire TC, Palmer GH, Suarez CE, McElwain TF. J Clin Microbiol. 1998;36:777–782.
26. Futse JE, Ueti MW, Knowles DP Jr, Palmer GH. J Clin Microbiol. 2003;41:3829–3834.
27. Brayton KA, Meeus PF, Barbet AF, Palmer GH. Infect Immun. 2003;71:6627–6632.
