Science. Author manuscript; available in PMC 2022 Jan 21.
Published in final edited form as:
PMCID: PMC8782153
NIHMSID: NIHMS1770124
PMID: 23258410

Comparative analysis of bat genomes provides insight into the evolution of flight and immunity

Abstract

Bats are the only mammals capable of sustained flight and are notorious reservoir hosts for some of the world’s most highly pathogenic viruses, including Nipah, Hendra, Ebola and SARS. To identify genetic changes associated with the development of bat-specific traits, we performed whole genome sequencing and comparative analyses of two distantly related bat species, fruit bat Pteropus alecto and insectivorous Myotis davidii. We discovered an unexpected concentration of positively selected genes in the DNA damage checkpoint and NF-κB pathways that may be related to the origin of flight, as well as expansion and contraction of important gene families. Comparison of bat genomes with other mammalian species has provided new insights into bat biology and evolution.

Bats belong to the order Chiroptera within the mammalian clade Laurasiatheria (1). While consensus has not been reached on the exact arrangement of groups within Laurasiatheria, a recent study placed Chiroptera as a sister taxon to Cetartiodactyla (whales + even-toed ungulates such as cattle, sheep and pigs) (2). The Black flying fox (Pteropus alecto) and David’s Myotis (Myotis davidii) represent the Yinpterochiroptera and Yangochiroptera suborders, respectively, and display a diverse range of phenotypes (Fig. 1). Captive colonies, immortalized cell lines and bat-specific reagents have been developed for these two species; however, genomic data are currently unavailable.

Fig. 1. Comparison of bat biological traits

P. alecto and M. davidii represent two distinct Chiropteran suborders and demonstrate diverse evolutionary adaptations. PNG: Papua New Guinea.

The most conspicuous feature of bats, distinguishing them from all other mammalian species, is the capacity for sustained flight. Positive selection in the oxidative phosphorylation (OXPHOS) pathway suggests increased metabolic capacity played a key role in its evolution (3), yet the byproducts of oxidative metabolism (such as reactive oxygen species, ROS) can produce harmful side effects including DNA damage (4). We hypothesize that genetic changes during the evolution of flight in bats likely included adaptations to limit collateral damage caused by byproducts of elevated metabolic rate. Another phenomenon that has sparked intense interest in recent years is the discovery that bats maintain and disseminate numerous deadly viruses (5). In this context, we further hypothesize that the long-term coexistence of bats and viruses must have imposed strong selective pressures on the bat genome, and the genes most likely to reflect this are those directly related to the first line of antiviral defense - the innate immune system.

We performed high-throughput whole-genome sequencing of individual wild-caught specimens of P. alecto and M. davidii using the Illumina HiSeq platform (6). High-quality reads at more than 100x coverage were obtained for both species, resulting in high-quality assemblies (Tables S1-3, Fig. S1). The two bat genomes, at approximately 2 Gb, were smaller than those of other mammals (7) (Fig. S2), while the number of genes we identified was similar to that of other mammals (21,392 and 21,705 in P. alecto and M. davidii, respectively) (Fig. S3). Both species displayed a high degree of heterozygosity at the whole-genome level (0.45% and 0.28% in P. alecto and M. davidii, respectively) (Tables S4-5), while repetitive content accounted for slightly less than a third of each genome (Tables S6-7). We identified a novel endogenous viral element derived from Saimiriine herpesvirus 2 that has expanded to 126 copies in P. alecto (Table S8, Fig. S4). Gene family expansion and contraction analysis (Tables S9-12) revealed significant expansion (p<0.05) of 71 gene families in M. davidii compared to only 13 in P. alecto, which may be related to a recent wave of DNA transposon activity (8).
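To give a sense of scale for the heterozygosity rates quoted above, the following minimal sketch converts each percentage into an approximate number of heterozygous sites. It assumes a rounded genome size of exactly 2 Gb for both species (the text says only "approximately 2 Gb"), so the results are order-of-magnitude illustrations rather than reported values; the function name is hypothetical.

```python
# Assumption: both genomes are treated as exactly 2 Gb, rounding the
# "approximately 2 Gb" stated in the text.
GENOME_SIZE_BP = 2_000_000_000

# Genome-wide heterozygosity rates quoted in the text, in percent.
HETEROZYGOSITY_PERCENT = {"P. alecto": 0.45, "M. davidii": 0.28}


def expected_het_sites(percent, genome_size_bp=GENOME_SIZE_BP):
    """Approximate number of heterozygous sites for a given rate in percent."""
    return round(percent / 100 * genome_size_bp)


for species, percent in HETEROZYGOSITY_PERCENT.items():
    sites = expected_het_sites(percent)
    print(f"{species}: about {sites / 1e6:.1f} million heterozygous sites")
```

Under this assumption the two rates correspond to roughly 9 million and 5.6 million heterozygous sites, respectively.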

We screened all nuclear-encoded bat genes to identify those for which a single orthologous copy was unambiguously present in both bat species as well as human, rhesus macaque, mouse, rat, dog, cat, cattle and horse. From this screen, 2,492 genes were used to perform maximum likelihood and Bayesian phylogenomic analysis (Figs. 2, S5-7). All phylogenetically informative signals, including concatenated nucleotides and amino acids, strongly supported bats as a member of Pegasoferae (Chiroptera + Perissodactyla + Carnivora) (9), with the bat lineage diverging from the Equus (horse) lineage approximately 88 million years ago (MYA), supported by findings at the transcript level (10). Interestingly, phylogenetic reconstruction with mitochondrial DNA sequences resulted in bats occupying an outlying position in Laurasiatheria (Fig. S8). The incongruence between nuclear and mitochondrial trees likely reflects rapid evolution of the mitochondrial genome of the bat ancestor during the evolution of flight (3).
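The single-copy screen described above can be illustrated with a short sketch. This is not the authors' pipeline: the input layout (a mapping from a gene family ID to a list of species and gene pairs), the function name and the toy data are all assumptions made for illustration. Only the list of ten species comes from the text.

```python
from collections import Counter

# The ten species screened in the text (two bats plus eight other mammals).
SPECIES = ["P. alecto", "M. davidii", "human", "rhesus macaque", "mouse",
           "rat", "dog", "cat", "cattle", "horse"]


def single_copy_families(families):
    """Return IDs of families with exactly one gene in every screened species.

    `families` maps a family ID to a list of (species, gene_id) pairs.
    This layout is hypothetical and chosen only for the example.
    """
    kept = []
    for family_id, members in families.items():
        counts = Counter(species for species, _ in members)
        # Require exactly one copy per species and no unlisted species.
        if len(counts) == len(SPECIES) and all(counts[s] == 1 for s in SPECIES):
            kept.append(family_id)
    return kept


# Hypothetical toy input: fam1 is single-copy everywhere, fam2 carries a
# duplicate in M. davidii, and fam3 is missing from P. alecto.
toy = {
    "fam1": [(s, f"{s}_g1") for s in SPECIES],
    "fam2": [(s, f"{s}_g2") for s in SPECIES] + [("M. davidii", "dup")],
    "fam3": [(s, f"{s}_g3") for s in SPECIES if s != "P. alecto"],
}
print(single_copy_families(toy))
```

Only families passing this kind of filter would be eligible for concatenation into a phylogenomic alignment; any duplicated or missing gene excludes the family.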

Fig. 2. Phylogenomic analysis

Maximum likelihood phylogenomic analysis of 2,492 genes from M. davidii, P. alecto and eight mammalian species. Divergence time estimates in blue, gene family expansion events in green, and gene family contraction events in red. MRCA: most recent common ancestor.

To identify mechanisms that facilitated the origin of flight in bats, we surveyed genes involved in detection and repair of genetic damage. A high proportion of genes in the DNA damage checkpoint/DNA repair pathway were found to be under positive selection in the bat ancestor, including ATM, DNA-PKc (PRKDC), RAD50, KU80, and MDM2 (Fig. 3A, Table 1). We propose that these changes may be directly related to minimizing or repairing the negative effects of ROS generated as a consequence of flight. Additionally in this pathway, TP53 (p53) and BRCA2 were shown to be under positive selection in M. davidii, while LIG4 was under positive selection in P. alecto (Table 1). Bat-specific mutations in a nuclear localization signal in p53 and a nuclear export signal in MDM2 (Figs. 3B, S9) may affect subcellular localization and function in both species (11, 12). Other candidate flight-related genes under positive selection in the bat ancestor included COL3A1, involved in skin elasticity, and CACNA2D1, which has a role in muscle contraction (Table S13).

Fig. 3. Accelerated evolution in the DNA damage checkpoint in bats

(A) Positive selection in the DNA damage checkpoint/DNA repair pathway. Genes under positive selection in the bat ancestor are highlighted in orange. Genes under positive selection in M. davidii only (p53, BRCA2) or P. alecto only (LIG4) are highlighted in blue. (B) Mutations unique to bats were detected in the functionally relevant regions of the p53 nuclear localization signal (NLS) and MDM2 nuclear export signal (NES) (black highlight).

Table 1.

DNA damage checkpoint and innate immune genes under positive selection in the bat lineages

Lineage | Symbol | Gene | ω0 (average) | ω1 (other) | ω2 (target) | p-value
Ancestor | TLR7 | toll-like receptor 7 | 0.2821 | 0.2670 | 2.7778 | 3.54E-07
Ancestor | ATM | ataxia telangiectasia mutated | 0.20096 | 0.19595 | 0.7163 | 1.34E-05
Ancestor | MDM2 | Mdm2 p53 binding protein homolog (mouse) | 0.13358 | 0.12615 | 0.81085 | 4.05E-04
Ancestor | NLRP3 | NLR family, pyrin domain containing 3 | 0.1788 | 0.1714 | 1.1884 | 1.93E-04
Ancestor | MAP3K7 | mitogen-activated protein kinase kinase kinase 7 | 0.0216 | 0.0194 | 0.4786 | 8.93E-03
Ancestor | RAD50 | RAD50 homolog | 0.09657 | 0.09343 | 0.28882 | 7.95E-03
Ancestor | PRKDC | protein kinase, DNA-activated, catalytic polypeptide | 0.23036 | 0.22768 | 0.45155 | 6.80E-03
Ancestor | KU80 | X-ray repair complementing defective repair in Chinese hamster cells 5 | 0.31145 | 0.30436 | 0.91747 | 3.75E-02
Ancestor | c-REL | v-rel reticuloendotheliosis viral oncogene homolog (avian) | 0.2495 | 0.2403 | 1.5717 | 1.11E-02
P. alecto | TBK1 | TANK-binding kinase 1 | 0.0643 | 0.0522 | 0.2930 | 1.29E-09
P. alecto | LIG4 | ligase IV, DNA, ATP-dependent | 0.12033 | 0.11376 | 0.24797 | 8.91E-04
P. alecto | IL18 | interleukin 18 (interferon-gamma-inducing factor) | 0.5298 | 0.4532 | 1.7647 | 2.66E-04
P. alecto | IFNG | interferon, gamma | 0.5010 | 0.4527 | 1.3282 | 4.89E-03
P. alecto | ISG15 | ISG15 ubiquitin-like modifier | 0.2069 | 0.1909 | 0.4387 | 2.63E-02
P. alecto | DDX58 | DEAD (Asp-Glu-Ala-Asp) box polypeptide 58 | 0.3040 | 0.2923 | 0.4661 | 1.23E-02
M. davidii | IFNAR1 | interferon (alpha, beta and omega) receptor 1 | 0.4954 | 0.4723 | 1.0924 | 7.00E-03
M. davidii | TP53 | tumor protein p53 | 0.25623 | 0.23933 | 0.48123 | 7.00E-03
M. davidii | BRCA2 | breast cancer 2, early onset | 0.49002 | 0.47732 | 0.64213 | 1.31E-03
M. davidii | IRAK4 | interleukin-1 receptor-associated kinase 4 | 0.1670 | 0.1583 | 0.3531 | 1.96E-02

The rate ratio ω of non-synonymous to synonymous substitutions (dN/dS) was calculated using multiple protein alignments of P. alecto and M. davidii sequences with orthologous sequences from human, rhesus macaque, mouse, rat, dog, cattle and horse. ω0 is the average ratio in all branches, ω1 is the average ratio in non-bat branches, and ω2 is the ratio in the bat branch. A low p-value indicates that the ω2 model fits the data better than the ω1 model.
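The table note describes comparing two models of ω, but the exact test is not spelled out in this text. A common way to compare nested branch models is a likelihood ratio test, and the sketch below shows that calculation under two stated assumptions: that the models differ by a single free parameter (one degree of freedom), and that the log-likelihood values used here are hypothetical, since the source does not report them. The function name is likewise illustrative.

```python
import math


def lrt_p_value(lnl_null, lnl_alt):
    """Likelihood ratio test between two nested models, assuming 1 df.

    lnl_null: log-likelihood of the simpler model (no separate bat-branch ω).
    lnl_alt:  log-likelihood of the model with a separate bat-branch ω.
    Returns the test statistic 2 * (lnl_alt - lnl_null) and its p-value.
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # For one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods, not taken from the paper.
stat, p = lrt_p_value(lnl_null=-10250.0, lnl_alt=-10240.0)
print(f"2*dlnL = {stat:.1f}, p = {p:.2e}")
```

A larger improvement in log-likelihood from allowing a separate bat-branch ω yields a smaller p-value, which matches the reading given in the table note. Note that ω2 > 1 (as for TLR7, NLRP3, c-REL, IL18, IFNG and IFNAR1 in Table 1) is the conventional signature of positive selection, whereas an elevated ω2 that remains below 1 indicates accelerated evolution relative to the other branches.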

We next examined genes of the innate immune system (Table 1). Positively selected genes in the bat ancestor included c-REL, a member of the NF-κB family of transcription factors, which also contained amino acid changes potentially affecting IκB binding (Fig. S10). In addition to diverse roles in innate and adaptive immunity (13), c-REL plays a role in the DNA damage response by activating ATM (14) and CLSPN (15), while ATM is also an upstream regulator of NF-κB (16). The DNA damage response plays an important role in host defense and is a known target for virus interaction (17), raising the possibility that changes in DNA damage response mechanisms during selection for flight could have influenced the bat immune system.

Intriguingly, both P. alecto and M. davidii have lost the entire locus containing the PYHIN gene family, including AIM2 and IFI16, both of which are involved in sensing microbial DNA and the formation of inflammasomes (Fig. S11). The association between PYHIN genes and cell cycle regulation in other species (18) hints that loss of the PYHIN family in bats may be connected to changes in the DNA damage pathway, since at least one PYHIN gene is present in all other major groups of eutherian mammals (19). NLRP3, triggered by both viral infection and ROS in other mammals (20), plays an analogous role to AIM2 in inflammasome assembly and was also under positive selection in the bat ancestor (Table 1).

Natural killer (NK) cells provide a first line of defense against viruses and tumors and express two families of NK cell receptors: killer-cell immunoglobulin-like receptors (KIRs), encoded by genes in the leukocyte receptor complex (LRC), and killer cell lectin-like receptors (KLRs, also known as Ly49 receptors), encoded within the natural killer gene complex (NKC). KLRs and KIRs were entirely absent in P. alecto and reduced to a single Ly49 pseudogene in M. davidii (Table S14). KIR-like receptors identified in other species (21) were also absent from both P. alecto and M. davidii genomes, a finding supported by transcript analysis in P. alecto (10). This likely indicates that bat NK cells use a novel class of receptors to recognize classical MHC class I molecules. Furthermore, additional LRC members of the immunoglobulin superfamily (including SIGLECs, LILRs, CEACAMs and LAIRs) have undergone considerable gene duplication in M. davidii and other mammals, yet have almost completely failed to expand in P. alecto (Fig. S12). As the genes encoded within the LRC bind a variety of ligands and play multiple roles in immune regulation, these observations have diverse implications for differences in immune function between P. alecto and M. davidii and between bats and other mammals.

We identified seven complete and two partial copies of the digestive enzyme RNASE4 in M. davidii (Table S15), while P. alecto RNASE4 has acquired a frame-shift mutation resulting in loss of catalytic residues (Fig. S13). We also identified critical amino acid changes in M. davidii RNASE4 genes (relative to the mammalian consensus) that suggest diversification of substrate specificity (Fig. S13). With a proven role in host defense against RNA viruses (22), RNASE4 expansion in M. davidii may have implications for virus resistance, but may also reflect the insectivorous diet of M. davidii, which contrasts with that of P. alecto, which consumes predominantly fruit, flowers and nectar.

M. davidii also differs from P. alecto in aspects including hibernation and echolocation (Fig. 1). Bile salt-stimulated lipase (BSSL), capable of hydrolyzing triglycerides into monoglycerides and subsequently releasing digestible free fatty acids, has been specifically expanded in M. davidii compared to P. alecto and other mammals (Fig. S14). In addition, we observed six candidate genes related to hibernation showing positive selection in M. davidii and three other hibernating species, relative to non-hibernators (Table S16). Seven echolocation-related genes, including new candidates WNT8A and FOS (a subunit of the AP-1 transcription factor), had significantly higher dN/dS in the echolocating M. davidii branch relative to non-echolocating branches (Table S17). Of note, the third exon in M. davidii FOXP2 had even greater variation from the mammalian consensus than two previously identified variable sites (Fig. S15), suggesting that a specific transcript variant is involved in echolocation (23).

In summary, comparative analysis of P. alecto and M. davidii genomes has provided insight into the phylogenetic placement of bats, and has revealed evidence of genetic changes that may have contributed to their evolution. Gene duplication events played a particularly prominent role in the evolution of Myotis bats and may have contributed to their speciation. Concentration of positively selected genes in the DNA damage checkpoint pathway in bats may indicate an important step in the evolution of flight, while evidence of change in components shared by the DNA damage pathway and the innate immune system raises the interesting possibility that flight-induced adaptations have had inadvertent effects on bat immune function and possibly also life expectancy (24). The data generated by this study will help to address major gaps in our understanding of bat biology and provide new directions for future research.

Acknowledgments:

We thank H. Field, C. Smith and M. Yu for helping source genomic DNA, K. Itahana and J.J. Boomsma for constructive discussion and M. Cowled for graphics assistance. We acknowledge financial support from the China National Genebank at Shenzhen, CSIRO (OCE Science Leaders Award, Julius Award), The Australian Research Council (FT110100234), State Key Program for Basic Research (2011CB504701), National Natural Science Foundation of China (81290341), and the Defense Threat Reduction Agency of USA. The views expressed in this article are those of the author and do not necessarily reflect the official policy or position of the Department of the Navy, Department of Defense, nor the U.S. Government. KBL and KGF are contractors for the U.S. Government. This work was prepared as part of their official duties. Title 17 U.S.C. §105 provides that ‘Copyright protection under this title is not available for any work of the United States Government.’ Title 17 U.S.C. §101 defines a U.S. Government work as a work prepared by a military service member or employee of the U.S. government as part of that person’s official duties. P. alecto and M. davidii genomes have been deposited at DDBJ/EMBL/GenBank under the accession numbers ALWS01000000 and ALWT01000000. Short read data have been deposited into the Short Read Archive under accession numbers SRA056924 and SRA056925. Raw transcriptome data has been deposited in Gene Expression Omnibus as GSE39933. Tree files and alignments have been submitted to TreeBASE under Study Accession URL: http://purl.org/phylo/treebase/phylows/study/TB2:S13654. We also wish to thank the editor and two anonymous reviewers for their helpful comments and suggestions.

Footnotes

The authors declare no competing financial interests.

References:

1. Teeling EC et al. Science 307, 580 (Jan 28, 2005).
2. Nery MJ, Gonzalez DJ, Hoffmann FG, Opazo JC, Mol. Phylogenet. Evol 10, 210 (2012).
3. Shen YY et al. Proc. Natl. Acad. Sci. U. S. A 107, 8666 (May 11, 2010).
4. Barzilai A, Rotman G, Shiloh Y, DNA Repair 1, 3 (Jan 22, 2002).
5. Wang LF, Walker PJ, Poon LL, Current Opinion in Virology 1, 649 (2011).
6. Materials and methods are available as supplementary material on Science Online.
7. Smith JDL, Gregory TR, Biol. Lett 5, 347 (Jun 23, 2009).
8. Pritham EJ, Feschotte C, Proc. Natl. Acad. Sci. U. S. A 104, 1895 (Feb 6, 2007).
9. Nishihara H, Hasegawa M, Okada N, Proc. Natl. Acad. Sci. U. S. A 103, 9929 (Jun 27, 2006).
10. Papenfuss AT et al. BMC Genomics 13, 261 (2012).
11. O’Keefe K, Li HP, Zhang YP, Mol. Cell. Biol 23, 6396 (Sep, 2003).
12. Roth J, Dobbelstein M, Freedman DA, Shenk T, Levine AJ, EMBO J. 17, 554 (Jan 15, 1998).
13. Gilmore TD, Gerondakis S, Genes Cancer 2, 695 (2011).
14. De Siervi A et al. Cell Cycle 8, 2093 (Jul 1, 2009).
15. Kenneth NS, Mudie S, Rocha S, EMBO J. 29, 2966 (Sep 1, 2010).
16. Brzoska K, Szumiel I, Mutagenesis 24, 1 (Jan, 2009).
17. Turnell AS, Grand RJ, J. Gen. Virol 93, 2076 (2012).
18. Schattgen SA, Fitzgerald KA, Immunol. Rev 243, 109 (Sep, 2011).
19. Cridland JA et al. BMC Evol. Biol 7, 140 (2012).
20. Tschopp J, Schroder K, Nature Reviews Immunology 10, 210 (Mar, 2010).
21. Sambrook JG, Beck S, Curr. Opin. Immunol 19, 553 (Oct, 2007).
22. Cocchi F et al. Proc. Natl. Acad. Sci. U. S. A 109, 5411 (Apr 3, 2012).
23. Li G, Wang JH, Rossiter SJ, Jones G, Zhang SY, PLoS ONE 2, (Sep 19, 2007).
24. Brunet-Rossinni AK, Austad SN, Biogerontology 5, 211 (Aug, 2004).
25. Li RQ et al. Genome Res. 20, 265 (Feb, 2010).
26. Li H, Durbin R, Bioinformatics 25, 1754 (Jul 15, 2009).
27. Li H et al. Bioinformatics 25, 2078 (Aug 15, 2009).
28. Benson G, Nucleic Acids Res. 27, 573 (Jan 15, 1999).
29. Chen N, Using RepeatMasker to identify repetitive elements in genomic sequences. Current Protocols in Bioinformatics (John Wiley & Sons, Inc., 2004).
30. Abrusan G, Grundmann N, DeMester L, Makalowski W, Bioinformatics 25, 1329 (May 15, 2009).
31. Kent WJ, Genome Res. 12, 656 (Apr, 2002).
32. Birney E, Clamp M, Durbin R, Genome Res. 14, 988 (May, 2004).
33. Katzourakis A, Gifford RJ, PLoS Genet. 6, (Nov, 2010).
34. Edgar RC, Nucleic Acids Res. 32, 1792 (Mar, 2004).
35. Stamatakis A, Bioinformatics 22, 2688 (Nov 1, 2006).
36. Stanke M, Waack S, Bioinformatics 19, Ii215 (Sep, 2003).
37. Salamov AA, Solovyev VV, Genome Res. 10, 516 (Apr, 2000).
38. Trapnell C, Pachter L, Salzberg SL, Bioinformatics 25, 1105 (May 1, 2009).
39. Li H et al. Nucleic Acids Res. 34, D572 (Jan 1, 2006).
40. Kim EB et al. Nature 479, 223 (Nov 10, 2011).
41. Li RQ et al. Nature 463, 311 (Jan 21, 2010).
42. Hahn MW, De Bie T, Stajich JE, Nguyen C, Cristianini N, Genome Res. 15, 1153 (Aug, 2005).
43. Stamatakis A, Ludwig T, Meier H, Bioinformatics 21, 456 (Feb 15, 2005).
44. Yang ZH, Rannala B, Mol. Biol. Evol 23, 212 (Jan, 2006).
45. Yang ZH, Mol. Biol. Evol 24, 1586 (Aug, 2007).
46. Yang ZH, Comput. Appl. Biosci 13, 555 (Oct, 1997).
47. Zhao HB, Yang JR, Xu HL, Zhang JZ, Mol. Biol. Evol 27, 2669 (Dec, 2010).
48. Andrews MT, Squire TL, Bowen CM, Rollins MB, Proc. Natl. Acad. Sci. U. S. A 95, 8392 (Jul 7, 1998).
49. Holmes RS, Cox LA, Cholesterol 2011, 1 (2011).
50. Kelley J, Walter L, Trowsdale J, PLoS Genet. 1, 129 (Aug, 2005).
51. Paulson JC, Macauley MS, Kawasaki N, Ann. N. Y. Acad. Sci 1253, 37 (2012).
52. Zebhauser R et al. Genomics 86, 566 (Nov, 2005).