Polymorphism at the defensin gene in the Anopheles gambiae complex: testing different selection hypotheses

1 Organisation de Coordination pour la lutte Contre les Endémies en Afrique Centrale (OCEAC), BP 288, Yaoundé, Cameroon
2 Institut de Recherche pour le Développement (IRD) - UR016, BP 1857, Yaoundé, Cameroon
3 Division of Parasitic Diseases, Centers for Disease Control and Prevention, Entomology Branch, MS F-22, 4770 Buford Highway, Chamblee, GA 30341, USA
4 Center for Tropical Disease Research and Training, Dept of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556, USA
5 Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), 12735 Twinbrook Parkway, Room 2W13A, Rockville MD 20852, USA
*Corresponding author’s address: Frederic Simard, OCEAC, P.O. Box 288, Yaounde, Cameroon, Phone: +237 223 2232; Fax: +237 220 1854; E-mail: simard/at/ird.fr

The publisher's final edited version of this article is available at Infect Genet Evol.

Abstract

Genetic variation in defensin, a gene encoding a major effector molecule of the insect immune response, was analyzed within and between populations of three members of the Anopheles gambiae complex. The species selected included the two anthropophilic species, An. gambiae and An. arabiensis, and the most zoophilic species of the complex, An. quadriannulatus. The first species was represented by four populations spanning its extreme genetic and geographical ranges, whereas each of the other two species was represented by a single population.
We found (i) reduced overall polymorphism in the mature peptide region and in the total coding region, together with specific reductions in rare and moderately frequent mutations (sites) in the coding region compared with non-coding regions, (ii) a markedly reduced rate of nonsynonymous diversity compared with synonymous variation in the mature peptide and a virtually identical mature peptide across the three species, and (iii) increased divergence between species in the mature peptide together with reduced differentiation between populations of An. gambiae in the same DNA region. These patterns suggest strong purifying selection on the mature peptide and probably the whole coding region. Because An. quadriannulatus is not exposed to human pathogens, the identical mature peptide and similar pattern of polymorphism across species imply that human pathogens played no role as selective agents on this peptide.

Keywords: Anopheles gambiae, Africa, malaria, vector, arthropod, immunity, Defensin, evolution, polymorphism, selection

INTRODUCTION

The completion of the sequencing of the genome of An. gambiae (Holt et al., 2002), together with the successful germ line transformation of this mosquito (Grossman et al., 2001), and the identification of key molecules and genes affecting susceptibility of mosquitoes to Plasmodium under laboratory conditions (Barillas-Mury et al., 1996; Blandin et al., 2004; Osta et al., 2004a,b) provide strong support for malaria control via the introduction and spread of refractory genes into vector populations (Collins et al., 2000). Immune-response molecules are particularly promising as determinants of vector susceptibility and have been the focus of a number of recent studies (Christophides et al., 2002; Dimopoulos, 2003; Osta et al., 2004a,b; Meister et al., 2005). Some evidence suggests that Anopheles susceptibility to Plasmodium depends on the specific genotype of the vector and the parasite (Tahar et al., 2002; Lambrechts et al., 2005).
Such finely tuned host-pathogen relationships are expected to leave their signature on the molecular make-up of the genes involved. However, the molecular evolution of genes encoding immune response molecules of arthropod disease vectors has received little attention, despite providing unique and valuable insights into the susceptibility to pathogens in other host species (Schlenke & Begun, 2003). Here, we describe and analyze polymorphism in the defensin gene within and between populations of the An. gambiae complex to address the following questions. Can selection be detected on this gene? If so, what mode of selection? And finally, do human pathogens mediate selection on defensin?

Defensin, a member of the cysteine-rich immune peptides, is a primary effector molecule produced by mosquitoes in response to infection with various pathogens (Richman et al., 1996). Defensin is synthesized mainly in the fat body of both larvae and adults and secreted into the haemolymph. It is expressed constitutively at low rates in adults and larvae, but following an infection challenge expression increases dramatically (Richman et al., 1997; Dimopoulos et al., 1998; Eggleston et al., 2000). Sporozoites of Plasmodium gallinaceum (and oocysts to a lesser extent) are killed by defensin, but the relevance of this in vitro study to natural defense needs to be determined (Shahabuddin et al., 1998). Silencing of defensin in An. gambiae demonstrated that it is required for antimicrobial defense against Gram-positive bacteria (Blandin et al., 2002).

Defensin is encoded by a single copy gene located at division 41 on the third chromosome of An. gambiae (Vizioli et al., 2001). It comprises two exons separated by a short intron. The 102 amino acid (aa) pre-pro-defensin includes a 25 aa signal peptide and a 77 aa segment that is cleaved to produce a 40 aa mature defensin. The signal and pro-peptide sequences of An. gambiae share little similarity with those from other insects, but the mature peptide is conserved and all insect defensins contain six cysteine residues. The promoter region is rich in sequence motifs similar to transcription regulatory elements of insect and mammalian immune response genes. These include binding sites for nuclear factor kappa B, GATA factors, nuclear factor interleukin 6 and interferon consensus elements among others (Eggleston et al., 2000). Induction of defensin transcription can be mediated by Gambif1 (Barillas-Mury et al., 1996), a member of the Rel protein group that mediates transcriptional regulation of the immune response of Drosophila and other insects.

We chose populations from the highly anthropophilic species, An. gambiae, the moderately anthropophilic species, An. arabiensis, and the highly zoophilic species, An. quadriannulatus, representing high, moderate and no exposure to human pathogens, respectively (Hadis et al., 1997; Lemasson et al., 1997; Mouchet et al., 2004). The populations of An. gambiae and An. arabiensis are major vectors of Plasmodium falciparum with typical salivary gland infection rates of 3–9% (Hay et al., 2000; Mouchet et al., 2004 and references therein), but only the An. gambiae populations from Nigeria and eastern Kenya also transmit Wuchereria bancrofti, the causative agent of lymphatic filariasis (LF). The four An. gambiae populations include members of the M (Senegal) and S (West and East Kenya and Nigeria) molecular forms and span the maximal genetic distance measured among An. gambiae populations across the continent (Lehmann et al., 2003). Our study thus may help identify the potential and the limitations of such an approach for understanding the evolutionary forces that determine susceptibility to pathogens.
MATERIALS AND METHODS

Samples and collection methods

Mosquitoes were collected between 1994 and 1999 from Asembo Bay (1994, hereafter referred to as western Kenya), Jego (1996, hereafter referred to as eastern Kenya), Gwamlar in central Nigeria (1999), and Barkedji in Senegal (1995). Aliquots of An. quadriannulatus DNA were kindly provided by F. H. Collins, from specimens collected in 1986 in a rural area of southern Zimbabwe (see Collins et al., 1988 for more details). Indoor-resting adult mosquitoes (mostly females) were collected by pyrethrum-spray or aspiration in eastern Kenya and Nigeria. In Senegal, blood-seeking mosquitoes were collected by human-baited night catches. In western Kenya, blood-fed and blood-seeking females were collected at dawn by aspiration from net traps hung over the beds of sleeping volunteers. At each site, mosquitoes were collected within one week from houses less than 5 km apart. Further details on sampling methods and sites of collection are given by Lehmann et al. (2003).

Anophelines were identified as members of the An. gambiae complex using morphological keys (Gillies & De Meillon, 1968; Gillies & Coetzee, 1987). Species identification was carried out using PCR (Scott et al., 1993). The PCR-RFLP assay (Favia et al., 1997) was used to determine An. gambiae molecular form. An. gambiae specimens from Kenya and Nigeria were all of the S form, while those from Senegal were of the M form. All An. arabiensis specimens included in this study were collected in Asembo Bay (western Kenya).

DNA extraction and sequencing

Genomic DNA was extracted from whole mosquitoes as previously described (Lehmann et al., 1996; 2003) and suspended in 100 μl of TE.
PCR reactions were carried out using 2 μl of template DNA (from an aliquot of whole-mosquito extracts diluted 1:20 in distilled water) in 50 μl reaction containing 5 units Taq polymerase (Boehringer Mannheim or Gibco BRL) in manufacturer’s buffer, 1.5 mM MgCl2, 200 μM each dNTP (PE Applied Biosystems) and 50 pmol each forward and reverse primers. Primers were designed based on the complete An. gambiae defensin sequence (Eggleston et al., 2000; GenBank Accession number: AF063402). A 1.4 Kb region (position 1524–2972 in the published sequence) encompassing the whole transcribed region of defensin as well as 5′ and 3′ non-transcribed flanking regions, was amplified with forward primer Df1524L (5′ GCG GGG TGA ATG TTA TCT CT 3′) and reverse primer Df2972R (5′ ACA ATA AAA GGA ACG CAA GC 3′). Cycling conditions for amplification included denaturation at 94°C for 5 minutes, followed by 35 cycles at 94°C for 30 seconds, 52°C for 30 seconds and 72°C for 1 minute, with a final extension step at 72°C for 5 minutes. PCR products were examined on a 1% agarose gel, and cloned using the pGem T-vector kit (Promega). Individual transformed colonies (white) were selected. The size of the DNA insert was screened by PCR using pUC/M13 forward and reverse primers. In most cases, a single appropriately sized insert was chosen at random, and sequenced in both directions after purification with the Wizard PCR Purification Kit (Promega). In addition to the previous forward and reverse primers, internal nested primers Df2217L (5′ CGG TGC CAA TCT CAA TAC CCT TT 3′) and Df2280R (5′ GAC AAC GGG AAA AAG GGA TG 3′) were used as sequencing primers. Cycle sequencing was performed using PE BigDye Terminator Ready Reaction Kit according to manufacturer’s recommendations (PE Applied Biosystems). Sequencing reaction products were analyzed on an ABI 377 automated sequencer (PE Applied Biosystems). Sequences were checked for accuracy on both strands using Sequence Navigator (PE Applied Biosystems). 
Multiple alignment was performed with the Pileup program of GCG (Genetics Computer Group, 1999) using default options, and was adjusted by eye. DNA sequences have been deposited in GenBank under accession numbers DQ211988–DQ212056. To avoid sampling bias, a single allele (haplotype sequence) was arbitrarily selected from each specimen for the analysis.

PCR error

Because of multiple insertions/deletions (indels) in defensin, direct sequencing was not possible. Sequences were determined from 2–4 independent clones of the same allele, to identify errors resulting from mis-incorporation of nucleotides by Taq polymerase during the PCR amplification. We estimated the PCR error rate to be 0.001 per bp, in accordance with published records (Kwiatowski et al., 1991). High variation between alleles allowed us to distinguish different alleles from different clones of the same allele. Although we used statistics that are less sensitive to the effect of PCR errors (e.g., nucleotide diversity instead of the number of segregating sites and theta, if derived based on the latter), the polymorphism reported here is biased upwards because of PCR errors. Nevertheless, our inference is unbiased because instead of relying on the absolute values of polymorphism, we compared polymorphism between different functional regions of the gene that have the same probability of including a PCR error once differences in sequence length were accommodated (below).

Data analysis

Nucleotide diversity (π) was estimated using DnaSp 4.0 (Rozas & Rozas, 1999). A more complete summary of polymorphism was obtained from the site frequency spectra (Tajima, 1989; Braverman et al., 1995), which describe the frequency of sites that are invariant (f=0), singleton (f=1), and polymorphic (f=2, 3, … n/2), where f is the frequency of the rare nucleotide at this site/position and n is the number of sequences.
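As an illustration of the two summaries just defined, the following is a minimal sketch (not the DnaSp implementation) that computes nucleotide diversity as the mean number of pairwise differences per site and tabulates a folded site frequency spectrum. It assumes aligned, gap-free sequences of equal length; the function names and the toy alignment are hypothetical, and taking f as the count of the rarest nucleotide at a site is an assumption for sites with more than two states.

```python
from collections import Counter
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for aligned, gap-free sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions over every pair of sequences.
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


def folded_sfs(seqs):
    """Map f (count of the rarest nucleotide at a site) to the number of sites.

    f = 0 marks invariant sites, f = 1 singletons, f >= 2 other polymorphic sites.
    """
    spectrum = Counter()
    for column in zip(*seqs):  # iterate over alignment columns
        counts = Counter(column)
        f = 0 if len(counts) == 1 else min(counts.values())
        spectrum[f] += 1
    return dict(spectrum)


# Hypothetical toy alignment of four sequences, four sites each.
toy = ["ACGT", "ACGA", "ATGA", "ACGT"]
pi = nucleotide_diversity(toy)
sfs = folded_sfs(toy)
```

Keeping the invariant class (f=0) in the spectrum is what lets regions of different length be compared on the same footing, as the text notes.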
These spectra distinguish between rare (e.g., singletons) and common mutations (sites where the rarest nucleotide was observed 4–7 times, which is the maximum possible frequency given 9–14 sequences per population). Most neutral mutations are lost or require a very long time to become common in a population and this time is expected to be shorter for positively selected mutations and longer for deleterious mutations. Hence, rare mutations represent a greater fraction of new and mildly deleterious mutations, whereas common ones represent a greater fraction of ancient and neutral mutations. Furthermore, the site frequency spectrum is especially suited to compare polymorphism in regions of the gene without bias due to PCR errors, because it accounts for sequence length variation. We compared and tested nucleotide diversity of synonymous and nonsynonymous sites using bootstrapping in MEGA 3.0 (Kumar et al., 2004). Differentiation between populations was assessed by sequence-based F statistics analogous to Wright F statistics (Wright, 1978), calculated according to Hudson et al. 1992 and tested for significance by permutation in DnaSp 4.0 (Rozas & Rozas 1999). Calculations not available in DnaSp and MEGA were carried out using programs written by TL in SAS (SAS Institute Inc., 1990). Selection inference is primarily based on comparisons between different functional regions of the gene (defined below) to avoid confounding the effect of population demography and PCR errors that affect all regions of the gene equally. Similarly to a comparison of synonymous and non-synonymous mutations, this approach is conservative because polymorphism in shorter DNA fragments is subject to higher sampling variation, reducing the power to detect differences between regions. Physical linkage between adjacent regions may further reduce the differences between them even if selection operated on only one region. 
The advantage of this approach, however, is that significant differences represent robust evidence for selection.

RESULTS

Genetic diversity

Within-population polymorphism in the defensin gene region was moderate to high (Table 1). The lowest polymorphism (in the whole gene) was observed in An. arabiensis (π=0.015) and the highest in the western Kenyan population of An. gambiae (π=0.028). Examination of intra-population variation using a sliding window revealed over tenfold differences across various segments of the gene, and considerable, albeit lesser, differences between species and populations (Fig. 1).
If selection has not shaped variation in defensin, a similar pattern of polymorphism is expected across its functional regions. These regions were defined a priori and included: (i) mature peptide (120 bp), (ii) signal peptide and the clipped pro-peptide (186 bp), (iii) total transcribed but non-coding regions including the intron (385 bp, excluding all gaps), and (iv) the flanking non-transcribed regions (600 bp, excluding all gaps and missing data). Nucleotide diversity in the coding region was lower than that of the non-coding region in all populations (Table 1, P<0.016, binomial test).

Comparing the site frequency spectra between different functional regions provided a more comprehensive test of that variation. Frequency spectra were grouped into ‘rare alleles’ (singletons), ‘moderate alleles’ (sites where the rare nucleotide was observed two or three times), and ‘common alleles’ (sites where the rare nucleotide was observed four or more times). Invariant sites were included to accommodate total length variation between regions. Within-population heterogeneity was detected in four out of six populations (Table 1). An overall within-population test was performed after finding no evidence for between-population heterogeneity (heterogeneity χ²=56.5, df=45, P>0.1). Marked deviations from homogeneity were concentrated in the coding region, showing a deficiency of rare and moderate alleles in both the mature peptide and the signal/pro-peptide segments, but no significant deficiency at the high frequency sites. Additionally, an excess of rare alleles (singletons) and common alleles was observed in the transcribed-non-coding region (Table 1). The heterogeneity in magnitude and pattern of polymorphism across functional regions and the reduced polymorphism in the coding region demonstrate selection on defensin.
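The binomial result quoted above (P<0.016) can be reproduced with a short calculation. The sketch below assumes a one-sided sign test in which each of the six populations independently has a probability of 0.5 of showing lower diversity in the coding region under the null hypothesis; the text does not state the exact form of the test, so this reading and the function name are assumptions.

```python
from math import comb


def sign_test_p(successes, trials, p=0.5):
    """One-sided binomial tail P(X >= successes) with success probability p."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# All six populations show lower coding than non-coding diversity.
p_value = sign_test_p(6, 6)  # equals 0.5 ** 6
```

Under these assumptions the tail probability is 0.5 raised to the sixth power, 0.015625, which agrees with the reported bound.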
The expected signature of balancing (diversifying) selection would be increased polymorphism in the coding region while that of directional selection would involve reduced frequency of common alleles in that region; neither pattern was observed. The results, however, are consistent with purifying selection eliminating rare mutations from the coding regions.

Synonymous vs nonsynonymous substitutions

In the coding region, within-population nucleotide diversity at synonymous sites was 4-fold higher than that of nonsynonymous sites and this difference was significant in five collections based on the two-sided test of neutral evolution (Table 2). This difference was most extreme in the mature peptide, where no replacement mutations were observed in four populations and the overall difference in synonymous vs. nonsynonymous diversity (across populations) was 100-fold (Table 2). The short length and limited variation in this region resulted in low statistical power, but the overall test (across collections) was significant (Table 2). At the protein level, except for two singleton peptides, one single mature peptide was observed across the three species and six geographically distinct populations. Together with the high silent polymorphism across the gene and moderate synonymous diversity in the coding region, this extremely low diversity at the protein level in the mature peptide provides strong evidence that purifying selection is the main mode of selection operating on defensin.
Species divergence and population differentiation

Divergence between species was highly significant (P<0.001) and Fst values exceeded 0.3 across all functional regions (Table 3). The mature peptide showed higher divergence between species than any other functional region (P<0.05, Ryan-Einot-Gabriel-Welsch Multiple Range Test following significant ‘Region’ effect in ANOVA, not shown). Moreover, it was the only region showing no differentiation among An. gambiae populations, whilst the magnitude of differentiation across the other functional regions was similar to each other (Table 3) and to the average of nine microsatellite loci (Fst=0.063, P<0.001; Lehmann et al., 2003). Heterogeneity in differentiation across functional regions is further evidence for selection. Increased inter-specific divergence and reduced differentiation among distant An. gambiae populations indicate strong purifying selection operating on the mature peptide in all species, thereby reducing within-species variation in this region and independently “fixing” a few neutral mutations between species. This higher rate of fixation results in a high fraction of between-species variation relative to a low fraction of within-species variation, yielding high Fst.
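To make the last point concrete, the following minimal sketch computes a sequence-based Fst in the spirit of Hudson et al. (1992), as one minus the ratio of the mean within-population to the mean between-population number of pairwise differences. It is an illustration under stated assumptions (two populations, aligned gap-free sequences, unweighted average of the two within-population means), not the DnaSp procedure used in the study; the function names and toy sequences are hypothetical.

```python
from itertools import combinations, product


def mean_pairwise_diff(pairs):
    """Average number of differing sites over an iterable of sequence pairs."""
    pairs = list(pairs)
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


def hudson_fst(pop1, pop2):
    """Fst = 1 - Hw / Hb for two populations of aligned sequences."""
    # Hw: mean within-population differences, averaged over the two populations.
    h_within = (
        mean_pairwise_diff(combinations(pop1, 2))
        + mean_pairwise_diff(combinations(pop2, 2))
    ) / 2
    # Hb: mean differences between sequences drawn from different populations.
    h_between = mean_pairwise_diff(product(pop1, pop2))
    return 1 - h_within / h_between


# Hypothetical toy data: little variation within, many fixed differences between.
species_a = ["AAAA", "AAAT"]
species_b = ["TTTT", "TTTA"]
fst = hudson_fst(species_a, species_b)
```

In the toy data, low within-group variation combined with several fixed differences between groups drives the ratio Hw/Hb down and Fst up, which is the pattern the text describes for the mature peptide between species. A significance test would additionally require permuting sequences between populations, which is omitted here.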
DISCUSSION

Within- and between-population variation in defensin provides clear evidence that selection has shaped polymorphism in this gene. Purifying selection on the mature peptide explains (i) the overall reduced polymorphism in the mature peptide and the total coding region and the specific reductions in rare and moderately frequent alleles compared with non-coding regions, (ii) the markedly reduced rate of nonsynonymous diversity compared with synonymous diversity and the identity of the mature peptide across three species, and (iii) the increased divergence between species in the mature peptide together with the reduced differentiation between populations of An. gambiae in the same region. These patterns of variation were exhibited by all species and populations despite marked differences in their exposure to human pathogens due to their different preferences for feeding on human hosts (Hadis et al., 1997; Lemasson et al., 1997; Dekker & Takken, 1998; Mouchet et al., 2004).

High larval mortality (>95%) was measured under natural conditions and attributed to infections with nematodes and fungi (Jenkins, 1964; Service, 1973). These larval pathogens may be widespread throughout the species’ range, explaining the identity of the mature defensin across the species and populations. However, it is likely that defensin is an effector molecule of broad target spectrum, so that it remains effective despite changes in pathogen composition affecting different populations. Importantly, variation in susceptibility to pathogens (including human pathogens) among individual mosquitoes of the An. gambiae complex (Niare et al., 2002; Lambrechts et al., 2005) is not related to peptide variation in the mature defensin. The pattern of selection on defensin is inconsistent with positive and with balancing (diversifying) selection.
Accordingly, the evolutionary dynamics between pathogens that mediated selection on defensin and mosquitoes in these populations are incompatible with arms race dynamics. Consistent with purifying selection, the dynamics of the interactions may be governed by the cost of further increased anti-microbial effect on fitness as part of the functional constraints limiting variation of defensin, albeit not exclusively so.

Acknowledgments

The authors are grateful to Ananias Escalante, Jose Ribeiro, Randy Dejong, Franck Prugnolle, Adam Richman and Norio Kobayashi for useful discussions and comments. FS received financial support from the American Society for Microbiology postdoctoral fellowship program. This study was supported by UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases Grant A990476 to TL.

Footnotes

Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.