NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Institute of Medicine (US) Forum on Microbial Threats. Microbial Evolution and Co-Adaptation: A Tribute to the Life and Scientific Legacies of Joshua Lederberg: Workshop Summary. Washington (DC): National Academies Press (US); 2009.


5 Infectious Disease Emergence: Past, Present, and Future


Emerging infections, as defined by Stephen Morse of Columbia University in his contribution to this chapter, are infections that are rapidly increasing in incidence or geographic range, including such previously unrecognized diseases as HIV/AIDS, severe acute respiratory syndrome (SARS), Ebola hemorrhagic fever, and Nipah virus encephalitis. Among his many contributions to efforts to recognize and address the threat of emerging infections, Lederberg co-chaired the committees that produced two landmark Institute of Medicine (IOM) reports, Emerging Infections: Microbial Threats to Health in the United States (IOM, 1992) and Microbial Threats to Health (IOM, 2003), which provided a crucial framework for understanding the drivers of infectious disease emergence (Box WO-3 and Figure WO-13). As the papers in this chapter demonstrate, this framework continues to guide research to elucidate the origins of emerging infectious threats, to inform the analysis of recent patterns of disease emergence, and to identify risks for future disease emergence events so as to enable early detection and response in the event of an outbreak, and perhaps even predict its occurrence.

In the chapter’s first paper, Morse describes two distinct stages in the emergence of infectious diseases: the introduction of a new infection to a host population, and the establishment within and dissemination from this population. He considers the vast and largely uncharacterized “zoonotic pool” of possible human pathogens and the increasing opportunities for infection presented by ecological upheaval and globalization. Using hantavirus pulmonary syndrome and H5N1 influenza as examples, Morse demonstrates how zoonotic pathogens gain access to human populations. While many zoonotic pathogens periodically infect humans, few become adept at transmitting or propagating themselves, Morse observes. Human activity, however, is making this transition increasingly easy by creating efficient pathways for pathogen transmission around the globe. “We know what is responsible for emerging infections, and should be able to prevent them,” he concludes, through global surveillance, diagnostics, research, and above all, the political will to make them happen.

The authors of the chapter’s second paper, workshop presenter Mark Woolhouse and Eleanor Gaunt of the University of Edinburgh, draw several general conclusions about the ecological origins of novel human pathogens based on their analysis of human pathogen species discovered since 1980. Using a rigorous, formal methodology, Woolhouse and Gaunt produced and refined a catalog of the nearly 1,400 recognized human pathogen species. A subset of 87 species have been recognized since 1980—and are currently thought to be “novel” pathogens. The authors note four attributes of these novel pathogens that they expect will describe most future emergent microbes: a preponderance of RNA viruses; pathogens with nonhuman animal reservoirs; pathogens with a broad host range; and pathogens with some (perhaps initially limited) potential for human-human transmission.

Like Morse, Woolhouse and Gaunt consider the challenges faced by novel pathogens to become established in a new host population and achieve efficient transmission, conceptualizing Morse’s observation that “many are called but few are chosen” in graphic form, as a pyramid. It depicts the approximately 1,400 pathogens capable of infecting humans, of which 500 are capable of human-to-human transmission, and among which fewer than 150 have the potential to cause epidemic or endemic disease; evolution—over a range of time scales—drives pathogens up the pyramid. The paper concludes with a discussion of the public health implications of the pyramid model, which suggests that ongoing global ecological change will continue to produce novel infectious diseases at or near the current rate of three per year.
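As a rough illustration of how steeply the pyramid narrows, the counts quoted above can be turned into proportions of the base. The Python sketch below is not taken from Woolhouse and Gaunt's paper: the level labels and the function name are illustrative assumptions, and the figures are the approximate ones given in the text, with 150 treated as an upper bound because the text says "fewer than 150."

```python
# Approximate counts quoted in the text; the labels are illustrative,
# not the authors' own terminology.
PYRAMID_LEVELS = [
    ("able to infect humans", 1400),
    ("capable of human-to-human transmission", 500),
    ("potential for epidemic or endemic disease", 150),  # upper bound ("fewer than 150")
]


def share_of_base(levels):
    """Return each level's count as a fraction of the base (first) level."""
    base = levels[0][1]
    return [(label, count / base) for label, count in levels]


for label, share in share_of_base(PYRAMID_LEVELS):
    print(f"{label}: {share:.0%} of recognized human pathogens")
```

On these rounded figures, a little over a third of recognized human pathogens can pass from person to person, and at most about a tenth have epidemic or endemic potential.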

In contrast to other contributors to this chapter, who focus on what, why, and where infectious diseases emerge, Jonathan Eisen, of the University of California, Davis, considers how new functions and processes evolve to generate novel pathogens. Eisen investigates the origin of microbial novelty by integrating evolutionary analyses with studies of genome sequences, a field he terms “phylogenomics.” In his essay, he illustrates the results of such analyses in a series of “phylogenomic tales” that describe the use of phylogenomics to predict the function of uncharacterized genes in a variety of organisms, and to elucidate the genetic basis of a complex symbiotic relationship involving three species.

Knowledge of microbial genomes, and the functions they encode, is severely limited, Eisen observes. Among 40 phyla of bacteria, for example, most of the available genomic sequences were from only three phyla; sequencing of Archaea and Eukaryote genomes has proceeded in a similarly sporadic manner. To fill these gaps in our knowledge of the “tree of life,” his group has begun an initiative called the Genomic Encyclopedia of Bacteria and Archaea. Eisen describes this effort and advocates the further integration of information on microbial phylogeny, genetic sequence, and gene function with biogeographical data, in order to produce a “field guide to microbes.”

The chapter’s final paper, by Peter Daszak of the Consortium for Conservation Medicine, Wildlife Trust, makes the leap from knowing how infectious diseases emerge to predicting where, and under what circumstances, an emergent disease event is likely to occur. Daszak presents several examples of his group’s efforts to build predictive approaches to infectious disease emergence based on a thorough understanding of the underlying ecology. These include constructing a model to predict relative risks for Nipah virus reemergence in Malaysia, where a 1999 outbreak devastated a thriving pig farming industry; identifying likely sources by which West Nile virus could spread to Hawaii, the Galapagos, and Barbados; and determining likely reservoirs of H5N1 influenza for specific geographic locations worldwide.

Daszak’s group constructed a database of emerging infectious disease “events” first reported in human populations between 1940 and 2004, which they have used to examine correspondences between events and ecological variables, such as human population density and wildlife diversity, in a geographical context. These analyses have revealed “hotspots” for infectious disease emergence. Daszak discusses the implications of hotspot location for global infectious disease surveillance, and describes how he and coworkers have used their knowledge of hotspots to target surveillance for Nipah virus in India, and also to discover a virus with zoonotic potential in Bangladesh.


Stephen S. Morse, Ph.D. 1

Columbia University

We have all learned about the importance of infectious diseases throughout history, including the Plague of Justinian (541–542), the first known pandemic on record (McNeill, 1976), and the Black Death in the fourteenth century. Stanley Falkow, who is included in this volume, has extensively studied Yersinia pestis, the responsible organism, and given us important insights into its pathogenesis. Another devastating disease that was once much feared is smallpox, which is said to have killed more people than all the wars in history. The eradication of smallpox was therefore a triumph of public health. Ironically, smallpox has the unique property of being the only species to date that human beings have intentionally driven to extinction. While we have unintentionally driven so many species to extinction, it is nice to know we can actually intentionally do some good. Cholera was, of course, a very big concern in the nineteenth century and remains a concern today, especially in places like Bangladesh, as Gerald Keusch of Boston University and a member of the Forum can affirm.

The 1918 influenza pandemic is one of our paradigms of a nightmare emerging infectious disease event. It may very well have been the greatest natural disaster in the early days of the twentieth century. The “official” mortality estimates keep rising as investigators keep finding data from further away, in developing countries and more remote places. But that pandemic is thought to have accounted for about 50 million or more deaths, depending on how you want to count it, and is obviously a matter of great concern.

Despite that, we have had years of complacency about infectious diseases, partly for reasons already discussed—the antibiotic era, immunizations, improved public health measures—all of which have led to the fact that we now live longer and tend to die later of chronic diseases. Unfortunately, this has not been true everywhere. It has not been true in many developing countries. Infectious diseases remain the major causes of morbidity and mortality in much of the world.

But in this paper, I would like to concentrate on emerging infections, the ones that are not previously recognized and that seem to appear suddenly and almost mysteriously—if you will, The Andromeda Strain (Crichton, 1969). Figure WO-7 graphically shows a number of examples. Of course, there are also forgotten infections that reappear. We sometimes call those “reemerging infections.” I tend to think of most of the “reemerging” infections as reminding us that many infectious diseases in our highly mechanized modern societies, with the standard of living we enjoy, have been pushed to the margins, but have never been entirely eliminated. So when public health measures are relaxed or are abandoned because of lack of money or complacency—complacency being a very big problem—you then see forgotten infections reappearing. An example is diphtheria in the former Soviet Union and Eastern Europe in the early 1990s when those countries no longer had the money to maintain their immunization programs. It reminds us that many of these diseases may be forgotten, but they are not gone.

HIV/AIDS is, of course, the infection that got our attention initially and made it possible at least to think about shaking ourselves out of the growing complacency about infectious diseases. HIV infection and AIDS, starting from obscurity, rose to become a leading cause of death in the United States by 1993 (Figure 5-1). There are recent reports dating HIV to the early twentieth century, but it didn’t appear to take off until mid-century. You can find a molecular example of HIV in Zaire in 1969, but that is almost a one-off, and then there were reports of a few cases in the 1970s in Africa, if anyone had been paying attention. Then suddenly, in the early 1980s, it appeared in the United States and took off like the proverbial rocket to overtake all other causes of death in healthy young people. Of course, this is the same age group killed in the 1918 flu, but also the very people we generally expect to have the best survival rate. They have survived childhood and we expect that they ought to be fine. As shown in Figure 5-1, all the other causes of death were unchanged during that period.

FIGURE 5-1. Leading causes of death in young adults, United States, 1987–2005. Red line: Rise of HIV infection to become leading cause of death. SOURCE: CDC (2008).

HIV was therefore quite a surprise. When you think about it, this does seem rather like The Andromeda Strain. We had thousands of years of experience with infections, some of them historically recorded in some detail. Some of these are still unidentified, and we still argue about what they were. But a disease that actually kills by undermining the immune system directly was a novel mechanism of pathogenesis. How often does one find a new mechanism of pathogenesis in an infectious disease, considering the thousands of years of experience that we have had? I think it was quite remarkable.

Since its peak (around 1995), the HIV/AIDS death rate in the young adult population in the United States has dropped (Figure 5-1), thanks largely to the fact that a few effective drugs were finally developed, including in particular the protease inhibitors. As a result, the trend reached a plateau and has recently been going down. HIV/AIDS is now a treatable disease, with many lives saved among those who can afford the medication. But it also worries me that this fortunate situation may not last very long. Inevitably, antiviral resistance has already been identified in some patients. Another concern is that some of the younger people have now become quite complacent about this disease, not knowing the devastation that many of us witnessed in the 1980s, before it could be effectively treated. We are seeing young people now regarding this with less seriousness than they should.

So there we are, facing complacency again. If there is a bottom line to the theme of the Forum on Microbial Threats, it is that we cannot afford to be complacent anymore.

What are emerging infections? I always like informally to define emerging infections as those that would knock a really important story off the front page of the newspaper, whether the runaway bride or the Texas polygamy case, at least for a day or two. However, I do have a more formal definition: those infections that are rapidly increasing in incidence or geographic range. In some cases, these are novel, previously unrecognized diseases. But, as I am going to show you, many of them are not The Andromeda Strain. They do not come from space. Actually, in many cases, they have already existed in nature. Very often, anthropogenic causes—often as unintended consequences of things we do—are important in the emergence of these infections.

There are many examples. You can pick your favorite: Ebola in 1976; hantavirus pulmonary syndrome, which I will discuss briefly in a moment; Nipah, which Peter Daszak addressed at the workshop (and his group has done some excellent work on this); SARS; and, of course, influenza, which still continues to surprise us.

You could think of the many events shown in Figure 5-2 as “a thousand points of light” (at least, those of you who are old enough to remember the first President Bush could). But these are really a lot of little fires all over the world, most of which we did not spot in time before they became big brush fires or even wildfires. That includes many examples, such as West Nile virus entering the United States in 1999, the enteropathogenic Escherichia coli (made famous by the “Jack in the Box” case2), and a number of others, including SARS, of course.

FIGURE 5-2. Global examples of emerging and reemerging infectious diseases, some of which are discussed in the main text. Red represents newly emerging diseases; blue, reemerging or resurging diseases; black, a “deliberately emerging” disease.

I have divided the process of disease emergence into two steps, for analysis: (1) what I call introduction, where these “Andromeda-like” infections are coming from; and (2) establishment and dissemination, which (fortunately for us) is much harder for most of these agents to achieve. The basic lesson there is that many may be called, but few are chosen.

In this two-step process, as you all know, the opportunities are increasing thanks to ecological changes and globalization, which gives the microbes great opportunities to travel along with us, and to travel very quickly. Even medical technologies have played an inadvertent role in helping to disseminate emerging infections.

I will spend most of my time talking about what seems to be the most mysterious step—and I hope we can demystify it a bit here—and that is the introduction of a “new” infection. What we now know is that many of these infections, exotic as they may seem, are often zoonotic. Some of them do not do very much, and may cause no infection at all; while others may cause a truly dramatic infection, like Ebola.

So that zoonotic pool, if I may use that term, is not fully chlorinated, and it is a rich source of potential emerging pathogens. There is so much biodiversity out there, including a tremendous biodiversity of microbes. Some of that biodiversity—we do not know how much, even now—is still untapped.

Changes in the environment may increase the frequency of contact with a natural host carrying an infection, and therefore increase our chances of encountering microorganisms previously unknown to humans. Of course, the role of food animals, as well as wildlife (one of the subjects of Peter Daszak’s contribution to this volume), has come very much to the fore in recent years.

There are a number of examples associated with activities like agriculture, food-handling practices, and, for the vector biologists, of course, changes in water ecosystems. Table 5-1 lists just some of these cases. The basic point is that there are a number of ecological changes, many of them anthropogenic, which provide new opportunities for pathogens to emerge and gain access to human populations. Think of these pathogens as microbial explorers of a sort, discovering new niches (us) and exploring new territory.

TABLE 5-1. New Opportunities for Pathogens: Ecological Changes.

It is important not to overlook the very important role of evolution as well. One role is obviously what evolution has already been doing for a long time, leading to the biodiversity of pathogens that we see existing in nature. It is remarkable, when you think about how great that biodiversity is. We don’t even know how many viruses human beings are subject to, even how many inhabit us at this very moment. But when I think about just the herpesviruses, which are pretty well studied, that number could be very large indeed. There are eight known human herpesviruses, and at least six of them—you might argue, even seven of them, except for Human herpesvirus 8, the one that causes Kaposi’s sarcoma—are ubiquitous in the human population. They can be found all over the world. Several of them are present at very high prevalence in the human population.

That just gives you an idea of some of that great biodiversity. As it happens, these herpesviruses are all specialized for humans. There are, of course, herpesviruses of other species. So a lot of coevolution between host and pathogen goes on as well.

Of course, there is adaptation to new hosts and environments through natural selection. We see this with influenza most notably, but with many other examples—the coronaviruses, like SARS—as well. Of course, antimicrobial resistance has been mentioned so many times. If anyone needs to be convinced about the role of evolution in the world, I think this is a pretty good demonstration—one of the rare examples in which you can do in vitro exactly the same thing as what happens in the real world, just on a different scale.

There are many case studies. I’ll briefly discuss a few, just to illustrate some key points.

Hantavirus pulmonary syndrome was ironically one of the first things to happen suddenly in the United States after the original Institute of Medicine Emerging Infections report came out in October 1992. Hantavirus pulmonary syndrome suddenly appeared in the southwestern United States in the following spring and summer.

My friend Richard Preston wrote a book called The Hot Zone. He has a very philosophical chapter at the end where he talks about the “revenge of the rainforest.” I think it is a good thought, in that we should be kinder to our environment, for many good reasons. The rainforests are great sources of biodiversity and, to a great extent, that biodiversity was largely unexplored.

But an emerging infection can occur anywhere. Even the southwestern United States, which looks so dry, arid, and inhospitable to life, has its share, different from the rainforest, but just as significant.

Jim Hughes, who is a Forum member and was the director of the National Center for Infectious Diseases (NCID) at the Centers for Disease Control and Prevention (CDC) at the time of the outbreak, knows this story firsthand. Starting in the late spring and then going through the summer of 1993, people started appearing at emergency departments and clinics with respiratory distress. Many of them were hospitalized. I believe the case fatality rate at that time was about 60 percent, even with treatment. It is a little lower now, but it is still hovering near 40 to 50 percent.

The health departments did the usual investigations: There is a pocket of plague in that area, so the local health departments tested for that. Another possibility could be influenza out of season. These, and other likely possibilities, were ruled out. The state health departments then called in CDC, which did a number of tests and identified, perhaps surprisingly, a hantavirus as the most likely culprit. This was tested first by serology; later, shedding of virus was tested by polymerase chain reaction (PCR). Of course, when you think of hantavirus, you usually think of rodents, with a few minor exceptions. So a number of rodent species trapped near patients’ homes were tested. The most frequent rodent was apparently also the most frequently infected: Peromyscus maniculatus, the deer mouse. This is a very successful and prolific rodent that is essentially the major wild rodent in this entire area. Ruth Berkelman likes to refer to this as your typical hardworking single mom, as shown in the illustration (Figure 5-3).

FIGURE 5-3. A deer mouse (Peromyscus maniculatus), natural host for the Sin Nombre (hantavirus pulmonary syndrome) virus, with her young. SOURCE: Image courtesy of Bet Zimmerman, www.sialis.org.

Of course, once a test was developed and people started looking for the virus, they were able to find it in a great number of other places, including serum and tissue samples saved from earlier cases of acute respiratory distress whose etiology was unknown but odd. There were even some cases outside the geographic range of Peromyscus maniculatus, which turned out to be caused by hantaviruses that were natural infections of other rodent species.

This point is illustrated in Figure 5-4 (I thank C. J. Peters, then at CDC, for the illustration). Before 1993, the United States had one known hantavirus not associated with human disease (Prospect Hill virus), as well as a hantavirus of rats, Seoul virus, and related variants that could be found in port cities; neither was associated with serious acute disease in the United States. After 1993, we had to add another: the virus that causes hantavirus pulmonary syndrome. Then, when people started looking for hantaviruses, there was no shortage of previously unrecognized cases. In Figure 5-4, the virus names in bold have been associated with human disease, while many others have not. So throughout North and South America, suddenly there was a whole rash of hantaviruses that nobody knew existed.

FIGURE 5-4. Hantaviruses of the Americas. Viruses associated with human disease are shown in bold. SOURCE: Adapted from Peters (1998) with permission from ASM Press and Jim Mills.

That is evolution at work. We do not know how long ago this diversification occurred. It could have been as long as 2 million years ago, when some of the rodent species separated, but I would defer to the mammalogists on that issue.

As with HIV, at first we think it is an orphan, but, of course, it has its relatives; we just hadn’t found them yet.

What about the respiratory viruses? We have been thinking about that question a great deal lately. Some of our most serious historical examples—influenza, measles, smallpox, and many others—have been respiratory viruses. Right now pandemic influenza and H5N1 avian flu are very much on our minds.

Figure WO-11 was one of Josh Lederberg’s favorite slides. It shows U.S. mortality rates. You can see an enormous peak in 1918, coinciding with the 1918 flu pandemic. It was a big event, and even in the United States it killed at least a half-million people, most of them young and previously healthy.

Several pandemics have been documented. The 1918 pandemic was by far the worst. The Asian flu of 1957—as it happened, I lived through the next two twentieth-century influenza pandemics—was not much fun, to put it mildly, but nothing was ever like 1918. I wasn’t there to experience that one, thankfully. Later, the pandemic Hong Kong flu of 1968 appeared but was relatively mild compared with 1918 and even 1957.

There have been some other events along the way: the reappearance in 1977 of H1N1, and the famous swine flu scare in 1976, which, in fact, Harvey Fineberg, now president of the IOM, wrote about when he was at Harvard (“the epidemic that never was,” as he and his coauthor Richard Neustadt dubbed it [Neustadt and Fineberg, 1983]).

Figure 5-5 shows a ward filled with patients suffering from influenza during the 1918 pandemic. These are soldiers who were about to go overseas to fight in World War I. The photo shows graphically the impact that a disease like the 1918 flu had. CDC has since recalculated the case fatality rates adjusted to today’s population, just extrapolating what the expected deaths would be. With today’s population, a 1918-like pandemic would be expected to cause almost 2 million deaths in the United States alone. If it were like the 1957 or 1968 pandemic, a much milder pandemic, it might be fewer than 100,000 deaths. In any case, it is not something to take lightly.

FIGURE 5-5. Influenza pandemic 1918 at Camp Funston, Kansas. SOURCE: Image NCP 1603 courtesy of the National Museum of Health and Medicine, Armed Forces Institute of Pathology, Washington, DC.

In pandemic influenza viruses, the novel or new genes tend to come from avian influenza viruses that then reassort, often with mammalian influenza genes (or at times the virus may possibly go directly from avian to human, although that seems to be a relatively rare event).

We hear a great deal recently about the H5N1 avian flu in humans, and about the next pandemic. These two terms, “pandemic” and “avian flu,” are really not synonymous, although nonscientists sometimes mistakenly use them that way. Rob Webster and Virginia Hinshaw discovered some years ago that the waterfowl of the world appear to be the natural reservoirs for influenza viruses. Therefore, there is certainly open territory for influenza virus dissemination along any of the Old World flyways for bird migration.

As a result of all of those movements of birds, both migratory fowl and domestic poultry, we have seen a number of outbreaks of H5N1 avian influenza, starting in Asia, but extending into Europe and Africa as well. There have been some human cases, mostly (although not all) occupational, and with a high case fatality rate. Fortunately, there have been only a few instances of human-to-human transmission so far, all apparently quite limited. Obviously, everyone is watching this closely, just in case there is a change in the ability of the virus to transmit from person to person. If this virus were able to infect people readily and transmit itself, let’s say, as well as ordinary seasonal influenza does, then it could well be the next pandemic.

I am not putting money on H5N1, however. The next pandemic is going to happen, but so far nobody in this field can predict exactly when and where, and which influenza strain will be responsible. The only people who claim they can, at least as of now, are either charlatans or great risk takers. It is safer to bet on horse races.

Let me now move on briefly to that second step in emergence—establishment and dissemination. Luckily for us, this is much harder for a newly-minted pathogen. So many infections that can get into human beings from time to time may not have a good way of transmitting or propagating themselves. We have given them some help in this regard—think about HIV, for example, spreading in the blood supply or through contaminated injection equipment—and provided some highways for what I like to call “microbial traffic”: pathogens moving into new areas or new populations. Of course, environmental changes can be important here as well.

It used to take a long time to get around the world, but now you can do it, if you make all your connections, in 24 to 48 hours. If you do not make all your connections, as happens to most of us, then you spend time in a usually crowded airport, where you have even more opportunity to infect others.

Consider SARS, for example. By the way, ironically, Hong Kong decided to embark on a new promotional campaign just before SARS started. The slogan was “Hong Kong will take your breath away.” I do not know what inspired them to come up with it just then. Maybe they are better prognosticators than we are when it comes to the flu and other respiratory diseases. They certainly have had much more direct experience.

The consequences of SARS on global travel were enormous. The usually bustling Hong Kong airport was deserted. At least the few who did arrive there did not have to worry about waiting for their luggage. And the hotel rooms were cheaper, especially at the Hotel Metropole, which, we now know, was the site of the “Big Bang” of SARS.

The spread of SARS was a remarkable event, when you think about it. One infected individual—a physician, in fact—from south China treated a patient who had an unusual pneumonia. Clinicians usually assume community-acquired pneumonia is not very transmissible—a major mistake here, as this turned out to be, unfortunately, an exception. He then went to Hong Kong, where he stayed at the Hotel Metropole, a popular business hotel, and became sick. He believed he had the same disease that had killed the patient he had treated earlier. He went to the hospital, told his healthcare providers about his odd patient, and warned them to be careful. Apparently they did not pay much attention. There were 99 healthcare workers infected in Hong Kong alone.

At the same time, another dozen people were infected in the Hotel Metropole by this index patient. This is what was responsible for the dissemination of SARS essentially worldwide. Of course, everyone likes to say that it was an interesting coincidence that he stayed in Room 911. There no longer is a Room 911 at the Hotel Metropole, by the way. This is a little bit like the first Legionnaires’ outbreak and what it did to that hotel’s image, but that is another story.

We had a few near-misses with SARS. The man who went to Vietnam was actually a New York businessperson who did not go back to New York. One doctor from Singapore did go to New York, but did not get sick until he was on his way home and was put into isolation in Germany.

Just to put in a small plug for one of my favorite causes (of course, this is completely biased): ProMED-mail, the listserv for reporting and discussion of emerging infections. There was a little item that appeared there in February 2003, just questioning whether something odd was going on in China, with reports of deaths. The next day China admitted to having 305 cases of SARS.

Yi Guan, as he first reported at an IOM Forum meeting on SARS in October 2003, actually was able to find earlier cases, going back at least to November 2002. There were several different cases in perhaps five cities in southern China, but they were not reported or recognized at the time. He did a survey and found that animal slaughterers and wild animal handlers had a much greater chance of becoming seropositive. Why? Because the ultimate link to humans was another cute little animal, Paguma larvata, the palm civet, which is actually a prized food animal in south China, particularly during the winter. It is very expensive. The civets became infected, it would appear, in the live animal markets, probably from contact with bats (according to work by Peter Daszak and colleagues). Wild-caught and farmed civets—yes, they do farm them—that were tested were all negative for the SARS coronavirus.

Then, of course, SARS came to Canada, as we well know, and wreaked havoc. Those of you who know Don Low, as many do, know that he was right at the front line there; I remember that when I saw him at one of our Forum meetings just after the crisis was over, he was exhausted.

By the end of all this, there were about 8,000 cases, most of them in the original area, but a few in other widely scattered places, with over 700 deaths, or about a 10 percent case fatality rate. Not a trivial disease.

This also was the first time the World Health Organization (WHO) had really acted aggressively, which got the Canadians very annoyed, since WHO issued a travel advisory recommending that travelers avoid Toronto. But WHO acted very effectively and was able, I think, to solve some of the scientific and disease-control problems rather quickly.

There is probably a parallel story with HIV origins. We do not know how it entered the human population. It may very well have been through a mechanism similar to that of SARS. It came from chimpanzees, most likely, and humans may have become infected by preparing or handling infected nonhuman primates for the “bushmeat” trade.

Hospitals also provide opportunities for emerging infections. Transmission of infections by contaminated injection equipment is well known. Most of the Ebola cases arose this way.

In summary, there are some recognizable factors responsible for precipitating or enabling emergence, such as ecological factors or globalized travel and trade. This was the framework, which I had originally developed, that we used in the Emerging Infections (IOM, 1992) report. These factors have since been augmented and embellished in the new version of the IOM Emerging Infections report, titled Microbial Threats to Health, published in 2003 (Box WO-3; IOM, 2003). So there are even more of them now, but I think they are recognizable. We know what is responsible for emerging infections and should be able to prevent them.

What are we going to do about this? One thing we can do is improve disease surveillance. I will put in another plug for ProMED here. There is a sort of backhanded compliment, I guess, from a recent popular book about John Snow and cholera, The Ghost Map, by Steven Johnson (2006). On page 219, he states: “The popular ProMED-mail e-list offers a daily update on all the known disease outbreaks flaring up around the world, which surely makes it the most terrifying news source known to man.”

The reality is that we need better early-warning systems and more effective disease control, implemented without delay. If we had let SARS go the way we had let AIDS go, probably very few of us would be here to talk about it, especially the physicians.

To summarize, these are my central themes:

  • There are factors responsible for the emergence of infectious disease.
  • Often, interspecies transfer is responsible or facilitates emergence.
  • Things we do (anthropogenic changes) often increase the risk of transmission by altering the environment and interposing ourselves into an environment containing pathogens unfamiliar to humans.
  • We can manage those risks in some ways using our wits.
  • You might ask, what should we be doing to make the world safer? Effective global surveillance is one, as are better diagnostics, political will to respond to these events, and research to help understand the ecology and pathogenesis of these “new” infections and to help develop effective preventive or therapeutic measures.

I am sure the other contributors to this chapter will have additional suggestions and insights into the problem and about how we might begin to make the world safer. We must get serious about this. Our future as a species may well depend on it someday.


Mark Woolhouse, Ph.D. 4

University of Edinburgh

Eleanor Gaunt, B.Sc. 4

University of Edinburgh

A systematic literature survey suggests that there are 1399 species of human pathogen. Of these, 87 were first reported in humans in the years since 1980. The new species are disproportionately viruses, have a global distribution, and are mostly associated with animal reservoirs. Their emergence is often driven by ecological changes, especially in how human populations interact with animal reservoirs. Here, we review the process of pathogen emergence over both ecological and evolutionary time scales by reference to the “pathogen pyramid.” We also consider the public health implications of the continuing emergence of new pathogens, focusing on the importance of international surveillance.


In this review, we will be particularly concerned with species of pathogen that have recently been reported to be associated with an infectious disease in humans for the first time. As discussed more fully below, not all such pathogens (possibly very few of them) will be truly “new,” at least in the sense that the pathogen has only recently discovered us rather than we have only recently discovered the pathogen. This focus on novel pathogens differs somewhat from the more general topic of “emerging infectious diseases,” which is often taken to include previously rare diseases that are now on the increase, and sometimes diseases once considered to be in decline but which are now resurgent (the so-called “re-emerging” diseases). However, our focus does fairly reflect one of the major public health concerns of the early 21st century, the possible emergence of new pathogen species and novel variants (OSI 2006).

At first glance, a preoccupation with yet-to-emerge disease problems may seem extravagant, given the massive and all too immediate health burdens imposed by malaria, tuberculosis, measles, and other familiar examples. An obvious counterargument is the relatively recent advent of HIV-1, unrecognized less than a generation ago and yet now one of the world’s biggest killers. As we shall discuss, the great majority of novel pathogens have not caused public health problems on anything like this scale. However, AIDS (reinforced by knowledge of other plagues occurring throughout human history; see Diamond 2002) reminds us that the possibility that they could do so is real. In the early stages of the emergence of a new disease, it is a possibility that all too often cannot easily be dismissed, as current concerns about H5N1 influenza A virus attest. A second reason for concern is that outbreaks of new diseases, and the public reaction to them, can cause economic and political shocks far greater than might be anticipated. The 2003 SARS epidemic, for example, resulted in fewer than 1000 deaths but cost the global economy many billions of dollars (King et al. 2006). Variant CJD, which has caused just over 100 deaths mostly confined to the UK, has had a global economic impact of a similar magnitude. Moreover, a better understanding of the natural history of the emergence of new infectious diseases should inform our ability to combat them and, as the 2003 SARS epidemic illustrated, rapid, coordinated intervention can be highly effective.

Pathogen Diversity

Surveys of Pathogen Species

Although the existence of pathogens has been recognized for centuries, the first comprehensive list of human pathogen species was not published until 2001 (Taylor, Latham, and Woolhouse 2001). This list was generated from a comprehensive review of the secondary literature available at the time (see Taylor, Latham, and Woolhouse 2001 for full details). Each entry was a distinct species known to be infectious to and capable of causing disease in humans under natural transmission conditions. Species only known to cause infection through deliberate laboratory exposure were excluded. Species only known to cause disease in immunocompromised patients and species only associated with a single human case of infection (e.g., Zika virus) were included. Ectoparasites such as ticks and leeches were not included. The 2001 list included species names that appeared in either (1) a textbook published within the previous 10 years, or (2) standard web-based taxonomy browsers (see below), or (3) an ISI Web of Science citation index search covering the preceding 10 years. In subsequent work (e.g., Woolhouse and Gowtage-Sequeira 2005) NCBI taxonomies were used throughout (www.ncbi.nlm.nih.gov/Taxonomy/).

This methodology has the advantage that it is (or, at least, aspires to be) systematic, transparent, and reproducible by other researchers. However, it does have its limitations, and two of these in particular are worth highlighting. First, the criterion “capable of causing disease” has been variously interpreted and not all textbook reports of disease-causing organisms can be confirmed from the primary literature. Second, some taxonomies have been revised since 2001, altering which pathogen variants are regarded as “species.” Further revisions can reasonably be anticipated. More fundamentally, using the species as the unit of analysis ignores a wealth of important and interesting variation that occurs within species in traits such as virulence factors, antigenicity, host specificity, or antibiotic resistance. Moreover, what is meant by “species” may differ from one group to another; some pathogens have complex subspecific taxonomies (e.g., Salmonella enterica, Listeria monocytogenes, human rhinoviruses, Candiru virus complex, Trypanosoma brucei complex), making direct comparisons of different “species” potentially problematic. With these caveats noted, however, a survey of recognized species represents a natural starting point for investigations of the diversity of human pathogens.

Surveys of New Pathogen Species

A subset of human pathogen species of special interest here is those that have only recently been discovered. In this context, “recently” is taken (arbitrarily) as meaning from 1980 onwards and “discovered” means recognized as causing infection and disease in humans. Thus there are several possible reasons for a pathogen to appear in the list of “new” species.

  1. Both the pathogen and the disease it causes did not occur before 1980.
  2. The disease was already recognized but the pathogen was not identified as the etiological agent before 1980.
  3. The pathogen was already recognized but had not been associated with human disease before 1980.
  4. Neither the pathogen nor the disease it causes were recognized or reported before 1980, but they did occur.
  5. What was considered to be a single pathogen before 1980 was subsequently recognized as comprising two or more species.

Strictly speaking, only the first of these possibilities constitutes an “emerging” infectious disease as defined earlier. In practice, however, most post-1980 pathogens probably fall into categories (2) to (5). For example, phylogenetic evidence has demonstrated clearly that the evolutionary origins of the human immunodeficiency viruses pre-date their discoveries in the 1980s by at least several decades (van Heuverswyn et al. 2006).

To provide a more complete picture of new pathogens, the list of species described above was supplemented in early 2007 by searching the WHO, CDC, and ProMED web sites and the primary literature.

Results of Pathogen Surveys

Based on the above methodologies an updated version of the previously reported surveys generates a list of 1399 species of human pathogen. The most diverse group is the bacteria (over 500 species) with fungi, helminths and viruses making up most of the remainder (Table 5-2).

TABLE 5-2. Numbers of Pathogen Species by Taxonomic Category.


Of these 1399 species of human pathogen, 87 have been discovered from 1980 onwards (Table 5-3). The composition of the subset of new species is very different from the full list. New species are dominated by viruses, and there are relatively few bacteria, fungi or helminths (Table 5-2). Within these broad categories certain taxa stand out: human retroviruses were not reported until 1980; most of the new fungi are microsporidia; and almost half the new bacteria are rickettsia. Although the over-representation of viruses is highly statistically significant (odds ratio (OR) = 18.0, P < 0.001), it is not clear that (excluding retroviruses) particular kinds of viruses have special status. Single-stranded RNA viruses make up the largest subset of new species (45 species) but are only marginally over-represented. Similarly, bunyaviruses are the largest single family but are also only marginally over-represented in the list of new viruses.

TABLE 5-3. Dates of First Reports of Human Infection with Novel Pathogen Species.


In summary, since 1980 new human pathogen species have been discovered at an average rate of over 3 per year. Almost 75% of these have been virus species even though viruses still represent a small fraction (less than 14%) of all recognized human pathogen species.
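The arithmetic behind these summary figures can be checked with a short sketch. It assumes a survey window running from 1980 through 2006 (27 years), on the grounds that the list was updated in early 2007; that window and all variable names are illustrative assumptions. The virus figures are only the bounds implied by the percentages quoted above, not the entries of Table 5-2.

```python
# Back-of-envelope check of the summary figures quoted in the text.
TOTAL_SPECIES = 1399              # all recognized human pathogen species (from the text)
NEW_SPECIES = 87                  # species first reported from 1980 onwards (from the text)
SURVEY_YEARS = 2006 - 1980 + 1    # assumption: window covers 1980 through 2006 inclusive


def discovery_rate(new_species: int, years: int) -> float:
    """Average number of newly reported species per year."""
    return new_species / years


rate = discovery_rate(NEW_SPECIES, SURVEY_YEARS)

# Upper bounds implied by "almost 75%" of new species and "less than 14%" of all species.
max_new_viruses = 0.75 * NEW_SPECIES
max_all_viruses = 0.14 * TOTAL_SPECIES

print(f"discovery rate: {rate:.2f} species per year")
print(f"new virus species: just under {max_new_viruses:.0f}")
print(f"all virus species: fewer than {max_all_viruses:.0f}")
```

Under the assumed 27-year window the rate comes out a little above 3 per year, which is consistent with the figure given in the text; a different choice of end date would shift it slightly.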

Geographic Origins of Novel Pathogens

For those pathogen species discovered in the post-1980 period, the geographic location of the first reported human case(s) can often be determined from the primary literature, at least to within specific countries and often to specific regions or municipalities. However, this is not possible for all new pathogen species. For example, although the early history of HIV-1 has been exhaustively investigated, the exact origin of the first reported human case remains unclear (Barre-Sinoussi et al. 1983). Similarly, the only reported human case of European bat lyssavirus 2 could have resulted from exposure in Finland, Switzerland, or Malaysia (Lumio et al. 1986). Moreover, some new human pathogens were already endemic or ubiquitous in the human population when they were first discovered; examples include human metapneumovirus and human bocavirus. For those pathogens which were discovered previously but were only recently associated with human disease (such as commensals which have become pathogenic in patients immunosuppressed due to infection with HIV), the geographic origin is taken as the location in which the patient became sick (if the patient was not reported as having recent travel history).

Figure 5-6 shows a map of the points of origin of the first human cases of disease caused by 51 of the 87 pathogen species discovered since 1980. Data of this kind must be interpreted cautiously, not least because of likely ascertainment bias (variable likelihood of detection and identification of novel pathogens) in different parts of the world. Nonetheless, Figure 5-6 does make the important point that the emergence of new pathogens shows a truly global pattern, with multiple incidents being reported from every continent except Antarctica (with other gaps apparent in, for example, the Middle East and central Asia). There is no striking tendency for new pathogens to be more likely to be reported from tropical rather than temperate regions, or from less developed regions, or from more densely populated regions.

FIGURE 5-6. World map indicating points of origin of the first reported human cases of disease caused by 51 novel pathogen species since 1980. Locations are identified to municipality or region (occasionally country), jiggled as necessary to avoid overlap.

Process of Pathogen Emergence

Reservoirs of Infection

Relatively few human pathogens are known solely as human pathogens. The remainder also occur in other contexts: as commensals; or free-living in the wider environment; or as infections of hosts other than humans.

Overall, probably no more than 50 to 100 species are specialist human pathogens. These range from major killers such as Plasmodium falciparum, mumps virus, Treponema pallidum, smallpox and HIV-1 to those causing more minor problems such as the human adenoviruses and rhinoviruses.

Hundreds of species which can cause human disease occur naturally as “commensals” found on the skin, on mucosal surfaces, or in the gut. They are normally benign but are sometimes pathogenic, for example if introduced into the blood system via a wound or in association with AIDS or other immunosuppressive conditions. Examples include the streptococci and Candida spp.

Several hundred human pathogen species have environmental reservoirs; these are referred to as “sapronoses.” Examples include Bacillus anthracis, Legionella pneumophila, and Cryptococcus neoformans. Here, we do not take sapronotic to include pathogens which are transmitted via the fecal-oral route or via a free-living stage of a complex parasite life cycle. Most sapronoses are bacteria or fungi, plus some protozoa, and cause sporadic infections of humans. Few are highly transmissible (directly or indirectly) between humans, an important exception being Vibrio cholerae. Some human pathogens (e.g., Listeria spp.) are both sapronotic and zoonotic.

Many more pathogens—over 800 species—are capable of infecting animal hosts other than humans. These range from species where humans are largely incidental hosts—such as rabies or Bartonella henselae—to species in which the main reservoir (sensu Haydon et al. 2002) is the human population and animals may be largely incidental hosts, that is, the so-called “reverse zoonoses” such as Schistosoma haematobium, rubella virus, Mycobacterium tuberculosis, or Necator americanus. We refer to all of these as “zoonotic,” following the World Health Organization’s definition of zoonoses as “diseases or infections which are naturally transmitted between vertebrate animals and humans.” In contrast to some other authors (e.g., Hubalek 2003) we do not consider pathogens with invertebrate reservoirs, and especially pathogens which are transmitted by arthropod vectors, as zoonotic. Note that the WHO definition does not include human pathogen species which recently evolved from animal pathogens, such as HIV-1. Nor does it include pathogens with complex life cycles where vertebrate animals are involved only as intermediate hosts with humans as the sole definitive host. It does, however, include reverse zoonoses.

Few of the 87 new human pathogen species in Table 5-3 are commensals or sapronoses. The great majority—around 80%—are associated with nonhuman vertebrate reservoirs (e.g., SARS coronavirus, vCJD agent and Borrelia burgdorferi) and most of the remainder appear to be long-standing human pathogens which have only recently been identified (e.g., Hepatitis G virus). Even some of the nonzoonotic pathogens, notably HIV-1 and HIV-2, are recently evolved from pathogens of nonhuman vertebrates (Keele et al. 2006). Compared with human pathogen species reported before 1980 the new species are statistically significantly more likely to be associated (or, at least, are more likely to be known to be associated) with a nonhuman animal reservoir (OR = 2.75, P < 0.001).

The reservoirs of the new, zoonotic human pathogens are mainly mammals, although a small number are associated with birds (Figure 5-7). However, the reservoirs include a wide range of mammal groups with ungulates, carnivores, and rodents most frequently involved, but also bats, primates, marsupials and occasionally other taxa (Figure 5-7). These observations must be interpreted with some caution because our knowledge of the host range of many pathogens is still incomplete. Nevertheless, the data available give the impression that taxonomic relatedness is less important than ecological opportunity as a determinant of the reservoirs of novel human pathogens. Homo sapiens as a species is classified within primates and, beyond that, the most closely related major groups are the rodents and lagomorphs. Ungulates, carnivores, and bats are more distant relatives. One related observation is that emerging human pathogens are especially likely to have a broad host range which includes more than one of these groups (Woolhouse and Gowtage-Sequeira 2005).

FIGURE 5-7. Counts of recently discovered human pathogen species (see Table 5-3) associated with various categories of non-human animal reservoirs. Some pathogen species are associated with more than one category of reservoir. These data should be regarded as no ...

Drivers of Pathogen Emergence

As discussed earlier, not all the pathogens in the list of new species should be regarded as truly emerging; some have only recently been identified as the causative agents of established infectious diseases. However, for 30 or more of the new species the literature suggests various drivers deemed to be associated with their emergence at the present time. These drivers can be considered within a framework originally suggested by the Institute of Medicine (IOM 2003), noting that this framework was devised with reference to all emerging and re-emerging infectious diseases, not just newly discovered pathogen species.

The most commonly cited drivers fall within the following IOM categories: economic development and land use; human demographics and behavior; international travel and commerce; changing ecosystems; human susceptibility; and hospitals. Economic development and land use, and especially changes in economic development and land use, are associated with the emergence of pathogens such as Nipah virus and Borrelia burgdorferi through activities such as intensification of farming and forest encroachment respectively. Human demographics and behavior, and especially changes in human demographics and behavior, are associated with the emergence of pathogens such as HIV-1 and Hepatitis C virus through activities such as sexual activity and intravenous drug use. International travel and trade are increasing as part of the process of globalization and are associated with the emergence of pathogens such as SARS coronavirus. Changing ecosystems covers unintended consequences of human activities such as desertification, pollution, and climate change and is associated with the emergence of pathogens such as the hantaviruses. Broadly speaking, the set of drivers listed so far are all “ecological” in nature; they are to do with the ways that humans interact with their wider environment (especially with other vertebrate animals both domestic and wild), providing opportunities for pathogens to infect humans, and with the ways that humans interact with each other, providing opportunities for pathogens to spread within human populations. A particular concern—implicit but not highlighted in the IOM’s list—is increasing use of “exotic” animal species, whether as food, farm animals or pets, and the trade that accompanies this.

The other most commonly cited drivers are to do with human population health. Human susceptibility is particularly important in the context of coinfections associated with AIDS (e.g., several species of microsporidia) but also covers the effects of malnutrition and other immunosuppressive conditions. The hospitals category covers iatrogenic transmission (e.g., vCJD), and xenotransplantation (e.g., baboon cytomegalovirus), as well as nosocomial infections (e.g., Ebola viruses and Rotavirus C).

Other categories listed by the IOM—such as “intent to harm”—have not been or are not commonly cited as associated with the emergence of novel human pathogen species. Among these is the category “microbial adaptation and change,” an observation that we expand on below.

Transmission and Disease

The 87 new species of human pathogen are associated with public health problems of hugely variable magnitudes. At one extreme is HIV-1, which has killed an estimated 25 million people since it was first reported in 1983, with 40 million more currently infected (UNAIDS 2007). HIV-1 has a high transmission potential within many human populations (combining transmission mainly by sexual contact or by needle-sharing associated with intravenous drug use with an infectious period of several years) and is highly pathogenic (with a case fatality rate close to 100% in the absence of treatment). At the other extreme, Menangle virus is known to have infected only two farm workers, in whom it may have caused a mild febrile illness (Chant et al. 1998). Menangle virus does not appear to be highly infectious to or transmissible between humans and has not so far been associated with severe disease. In the following section we consider the kinds of epidemiological and biological differences that underlie the vast difference in public health impacts between pathogens such as HIV-1 and pathogens such as Menangle virus.

Pathogen Pyramid

A useful aid to conceptualizing the process of pathogen emergence is the pathogen pyramid. The concept of the pathogen pyramid was first put forward by Wolfe et al. (2004) and developed further in Wolfe, Dunavan, and Diamond (2007). A very similar framework but with a more formal mathematical underpinning was adopted by Woolhouse, Haydon, and Antia (2005). The pyramid we use here has four levels corresponding to exposure, infection, transmission, and epidemic spread (Figure 5-8). Wolfe, Dunavan, and Diamond (2007) subdivided epidemic spread into (in their terminology): Stages 4a, b, and c, infectious diseases that exist in animals but with different balances of animal-to-human and human-to-human spread (where Stage 4c corresponds to reverse zoonoses as defined above); and Stage 5, pathogens exclusive to humans (corresponding to specialist human pathogens as defined above).

FIGURE 5-8. The pathogen pyramid (adapted from Wolfe, Dunavan, and Diamond 2007). Each level represents a different degree of interaction between pathogens and humans, ranging from exposure through to epidemic spread. Some pathogens are able to progress from one ...

Level 1: Exposure

The first stage of the emergence of a new pathogen is the exposure of humans to that pathogen. Exposure requires “contact” between humans and the pathogen reservoir (which may be animal or environmental; exposure to commensals is implicit). The nature of “contact” is determined by the mode of transmission of the pathogen, e.g., animal bite, contamination of food with fecal material, blood-feeding by arthropod vectors or exposure to aerosols. The only barrier to exposure is insufficient overlap between habitats occupied by humans and habitats occupied by the pathogen. Changes in human ecology, particularly patterns of land use and interactions with animal reservoirs, are likely to change our exposure to potential new pathogens, as are changes in the ecology of the pathogens, their reservoirs or their vectors, e.g., as a result of climate change or other kinds of environmental change.

We do not know how many potential human pathogen species there are which we have not yet been exposed to, but we do know that human pathogens make up only a fraction of the known biodiversity of viruses, bacteria, fungi, protozoa and helminths, which in turn probably makes up only a fraction of the biodiversity which exists (Dykhuizen 1998).

Level 2: Infection

The second stage of pathogen emergence is reached if the pathogen proves capable of infecting humans, possibly causing disease. As reviewed above, we know of 1399 species that have reached this stage. Others may have done so but have yet to be identified. Others may do so in the future but, to date, we have had no or insufficient exposure to them. Clearly, there will often be significant biological barriers (referred to as species barriers) preventing organisms infecting other kinds of host from infecting humans. We do not, for example, share any pathogens with plants, very few with invertebrates, and only a small number with cold-blooded vertebrates (e.g., Salmonella spp. in reptiles and amphibians, Mermin et al. 2004; helminth infections from fish, Chai et al. 2005). In contrast, we share many more of our pathogens with birds, and we share more than half with other species of mammal.

Indeed, the species barrier (at least between humans and other mammals) may not be as profound as is sometimes implied. According to Cleaveland et al. (2001) over 500 different species of pathogen are known to occur in domestic livestock and as many as 40% of these are zoonotic. The same authors report for domestic carnivores (dogs and cats) that almost 400 pathogen species are known, of which almost 70% are zoonotic. These data imply that, given the opportunities for exposure to pathogens that proximity to domestic animals must surely provide, many pathogens, perhaps even a majority, are capable of crossing the species barrier and infecting humans.

As suggested by the IOM (2003) report, an important contributor to the ability of a new pathogen to infect humans is variation in human susceptibility. In some cases this variation might have a genetic basis; for example, apparently pre-existing genetic variation in human susceptibility to HIV (Arien, Vanham, and Arts 2007). More commonly, phenotypic variation in the human population will be important, particularly factors which compromise the human immune system. The most striking examples come from the wide range of opportunistic infections associated with the immunosuppressive effects of HIV infection; these include several pathogen species, such as the microsporidia Brachiola algerae and Enterocytozoon bieneusi which were first recognized in AIDS patients.

Level 3: Transmission

The third stage of pathogen emergence is reached if a pathogen that can infect humans also proves capable of transmission from one human to another. Transmission in this context need not be direct (e.g., by aerosol spread or sexual contact); it might be indirect (e.g., via contamination of food) or via an arthropod vector. The requirement is simply that an infection of one human leads ultimately to an infection of another.

In most cases the barriers preventing transmission will be biological, often reflecting tissue tropisms within the human host since pathogens normally need to access the gut, upper respiratory tract, urogenital tract or (especially for vector-borne infections) blood in order to be able to exit the body. However, sometimes such barriers can be overcome by changes in human behavior. The two best examples concern prion diseases. Kuru is only transmitted through cannibalism, which is extremely rare in most human societies. vCJD is not transmissible between humans except iatrogenically as a result of surgical procedures or blood transfusions.

Again, these barriers to human-to-human transmission are far from insuperable. Although information is lacking for many pathogen species (Taylor, Latham, and Woolhouse 2001), the literature suggests that a substantial minority—at least 500 species, over one third of the total, and possibly many more—are transmissible between humans.

Level 4: Epidemic Spread

The fourth and, in our version, final level of the pathogen pyramid is reached if a pathogen is sufficiently transmissible within the human population to cause major epidemics or pandemics and/or to become endemic, without the involvement of the original reservoir. This represents a quantitative rather than qualitative distinction and it can be made more formally precise by reference to the concept of the basic reproduction number, R0. R0 can be defined as the average number of secondary cases of infection produced when a primary case is introduced into a large population of previously unexposed hosts (adapted from Anderson and May 1991). The distinction between Level 3 and Level 4 pathogens can be expressed in terms of R0. If R0 is less than one then, on average, a single primary case will fail to replace itself and, although there may be chains of transmission, these will be self-limiting; this corresponds to Level 3. On the other hand, if R0 is greater than one then, on average, a single primary case will produce more than one secondary case and, at least initially, there will be an exponential increase in the number of cases and ultimately a major epidemic is possible; this corresponds to Level 4. (A proviso is that, even if R0 > 1, stochastic extinction of the infection chain is quite possible, especially in the early stages of the epidemic when numbers of cases are low; see May, Gupta, and McLean 2001.)
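The threshold at R0 = 1 and the proviso about stochastic extinction can be illustrated with a minimal sketch. It treats a transmission chain as a simple branching process and assumes that the number of secondary cases per case is Poisson-distributed with mean R0; that distributional choice, the function names, and the example R0 values are assumptions made for illustration and are not taken from the text.

```python
import math


def expected_chain_size(r0: float) -> float:
    """Mean total number of cases (including the primary case) when R0 < 1.

    Successive generations contain on average 1, R0, R0**2, ... cases, and the
    geometric series sums to 1 / (1 - R0). The sum is finite only for R0 < 1.
    """
    if not 0 <= r0 < 1:
        raise ValueError("mean chain size is finite only for 0 <= R0 < 1")
    return 1.0 / (1.0 - r0)


def extinction_probability(r0: float, iterations: int = 10_000) -> float:
    """Probability that a chain started by a single case dies out by chance.

    Assumes Poisson-distributed secondary cases (an illustrative assumption).
    The probability q is the smallest root of q = exp(R0 * (q - 1)), found here
    by fixed-point iteration starting from q = 0.
    """
    q = 0.0
    for _ in range(iterations):
        q = math.exp(r0 * (q - 1.0))
    return q


# Hypothetical R0 values on either side of the threshold.
for r0 in (0.5, 0.9, 1.5, 2.0):
    level = "Level 3" if r0 < 1 else "Level 4"
    q = extinction_probability(r0)
    print(f"R0 = {r0}: {level}, chance the chain dies out = {q:.3f}")
```

In this sketch every chain with R0 below one eventually dies out, while for R0 above one a chain started by a single case can still go extinct with appreciable probability, which is the proviso noted above.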

The barriers between Level 3 and Level 4 are both biological and epidemiological. The biological barriers are to do with pathogen infectivity, host susceptibility, the infectiousness of the infected host and for how long the host is infectious (whether this is terminated by recovery or death). The epidemiological barriers are to do with the rate and pattern of contacts between infectious and susceptible hosts. Here again, the nature of a “contact” reflects the mode of transmission of the pathogen (see above). The rate and pattern of contacts can increase, and hence R0 can increase, independently of the pathogen, as a result of shifts in host demography or behavior. In the context of human hosts such shifts could constitute changes in factors such as population density (e.g., urbanization), living conditions, water supply and sanitation, patterns of travel and migration, or sexual behavior and intravenous drug use, depending on the specific pathogen involved. These might be augmented by changes in host susceptibility due to the kinds of factors listed earlier. Clearly, for the same pathogen R0 can vary considerably from one human population to another. Similarly, different strains of the same pathogen species may have very different R0 values in humans, e.g., different subtypes of influenza A virus.
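How a change in contact patterns alone can move a pathogen across the threshold can be sketched with a common textbook simplification in which R0 is the product of the contact rate, the probability of transmission per contact, and the duration of infectiousness. This decomposition and all of the numbers below are assumptions made for illustration; they are not estimates for any pathogen discussed here.

```python
def basic_reproduction_number(contact_rate, transmission_prob, infectious_days):
    """R0 as contacts per day x transmission probability per contact x days infectious.

    ASSUMPTION: a homogeneous-mixing simplification; real contact patterns
    are structured and this product is only a first approximation.
    """
    return contact_rate * transmission_prob * infectious_days

# Hypothetical values: the same pathogen in two settings that differ
# only in how many contacts an infectious person makes per day.
low_density = basic_reproduction_number(contact_rate=4, transmission_prob=0.05, infectious_days=4)
high_density = basic_reproduction_number(contact_rate=10, transmission_prob=0.05, infectious_days=4)

print(f"low-density setting:  R0 = {low_density:.2f}")
print(f"high-density setting: R0 = {high_density:.2f}")
```

In this toy comparison, the biological terms are held fixed and only the contact rate differs, yet R0 moves from below one to above one.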

In principle, this barrier might seem quite fragile; the kinds of changes in host demography and behavior alluded to above are certainly occurring. In practice, it is not clear how many species of human pathogen have reached Level 4, since we have estimates of R0 values within human populations for only a handful of them. Based on earlier studies (Taylor, Latham, and Woolhouse 2001; Woolhouse and Gowtage-Sequeira 2005), a plausible estimate is that 100 to 150 pathogen species are capable of causing major outbreaks within human populations, with half to two-thirds of these being specialist human pathogens and the remainder also occurring in animal reservoirs or the wider environment. This implies considerable attrition between Levels 3 and 4 of the pathogen pyramid.

Status of New Pathogens

We can now consider where the 87 new human pathogen species fit within the pathogen pyramid. It is immediately clear that the majority of them are at Level 2; they can infect humans but are rarely if at all transmitted between humans. Examples include Borrelia burgdorferi, vCJD agent, most of the hantaviruses and Ehrlichia spp. At the other extreme, although there are a number that appear to be at Level 4, most of these are pathogens which are probably long established in human populations but have only recently been recognized, such as human metapneumovirus or hepatitis C virus. Only a very small number are likely to be recent additions to the repertoire of Level 4 human pathogens, namely HIV-1, HIV-2 and, arguably before its spread was contained, SARS coronavirus. In between, at Level 3, there is a significant minority of new pathogens that are somewhat transmissible between humans but which have so far been restricted to relatively minor outbreaks. These include Andes virus, human torovirus and some Encephalitozoon spp. For these species the value of the basic reproduction number R0 is of particular interest, especially if it lies close to one, the threshold for potential epidemic spread. R0 can be estimated from data on the distribution of outbreak sizes as follows.

The quantitative analysis of outbreak data used to estimate R0 is based upon a methodology developed by Jansen et al. (2003) for measles case data from the UK (to monitor the effect of changes in childhood vaccination coverage). Here, we apply the technique (see also Matthews and Woolhouse 2005) to data on human outbreaks of Andes virus (see Figure 5-9 for details). Andes virus is an emerging South American hantavirus and there are concerns that, unusually for hantaviruses, it can be transmitted directly between humans (Wells et al. 1997). Most reports of Andes virus represent sporadic cases (i.e., outbreaks of size 1) but clusters of cases also occur, ranging in size from 2 to 20 (Figure 5-9). This pattern—many small outbreaks and a few larger ones—is typical of a wide range of infectious diseases (Woolhouse, Taylor, and Haydon 2001). The best estimate of R0 based on these data lies in the range 0.22 to 0.37. This is well below one and in reality is likely to be an over-estimate since at least some of the clusters of cases may reflect exposure to a common source rather than, as is assumed in the analysis, person-to-person spread. However, the analysis does suggest that occasional larger outbreaks will occur (the R0 estimates are consistent with up to 1 in 200 outbreaks being of size 10 or more) without necessarily implying that there has been a major change in Andes virus epidemiology. This same approach can be applied to other “Level 3” pathogens to determine how close they are to reaching Level 4 of the pyramid (cf. Jansen et al. 2003).
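The link between a distribution of outbreak sizes and R0 can be sketched as follows. This is a simplified illustration and not necessarily the model fitted in Figure 5-9. It assumes that every cluster starts from a single introduction, that all spread within a cluster is person-to-person, and that secondary cases are Poisson distributed, in which case outbreak sizes follow a Borel distribution. The counts used below are hypothetical and are not the Andes virus data.

```python
import math

def borel_pmf(n, r0):
    """P(an outbreak has exactly n cases) for Poisson offspring with mean 0 < r0 < 1."""
    log_p = (n - 1) * math.log(r0 * n) - r0 * n - math.lgamma(n + 1)
    return math.exp(log_p)

def estimate_r0(size_counts):
    """Maximum-likelihood R0 from a dict {outbreak size: number of outbreaks}.

    Under the Borel model the estimate reduces to
    1 - (number of outbreaks) / (total number of cases).
    """
    outbreaks = sum(size_counts.values())
    cases = sum(size * k for size, k in size_counts.items())
    return 1.0 - outbreaks / cases

def prob_size_at_least(k, r0):
    """P(an outbreak reaches k or more cases) under the same model."""
    return 1.0 - sum(borel_pmf(n, r0) for n in range(1, k))

# HYPOTHETICAL counts for illustration only (not the data in Figure 5-9):
# many sporadic cases and a few larger clusters.
counts = {1: 70, 2: 15, 3: 8, 5: 4, 10: 2, 20: 1}

r0_hat = estimate_r0(counts)
print(f"estimated R0: {r0_hat:.3f}")
print(f"P(size >= 10): {prob_size_at_least(10, r0_hat):.4f}")
```

Because clusters caused by a shared source are counted here as transmission, an estimate of this kind is biased upward, which is the same caveat raised in the text.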

FIGURE 5-9. Analysis of Andes virus outbreaks. Frequencies of outbreaks of different sizes (grey bars) are compared with the fit of a statistical model to the data (open bars). Outbreak data are taken from Wells et al. (1997) and Lazaro et al. (2007).

Evolution and Emergence

So far we have examined the emergence of new species of human pathogens over time scales of a few decades. However, the origins of many human pathogens are considerably more ancient, extending back over time scales of thousands to millions of years. This process has been reviewed by, among others, Weiss (2001), Diamond (2002), and Wolfe, Dunavan, and Diamond (2007). Of particular interest here are examples of pathogens which have emerged in human populations as a result of successfully crossing the species barrier from an animal reservoir and reaching Level 4 status. Any analysis must be prefaced by the observation that we have good evidence for the origins of only a small minority of pathogens, plausible hypotheses (usually based on the epidemiologies of related species) for some of the remainder, and no information at all for the majority. Wolfe, Dunavan, and Diamond (2007) have proposed that this lack is addressed by a research program they term an “origins initiative”. That said, 16 examples of putative species jumps are listed in Table 5-4. Inspection of this list suggests two tentative observations. First, although a variety of different kinds of pathogen are listed including several species of bacteria and protozoa, the majority are viruses. Second, a variety of different animal reservoirs are involved: primates, ungulates, rodents and birds. Wolfe, Dunavan, and Diamond (2007) point out that primates are much better represented in this list than might be expected given their much more modest role as reservoirs of modern zoonoses. This may reflect both the much greater ecological overlap between humans and other primates in the distant past and the notion that pathogens of our closest relatives are more likely to be epidemiologically successful in humans. The latter idea is supported by the observation that two of the most recent examples of successful species jumps—HIV-1 and HIV-2—have primate origins (Keele et al. 2006). 
Similarly, several human pathogens with much deeper evolutionary origins, perhaps even pre-dating Homo sapiens as a distinct species, are also most closely related to modern primate pathogens. Examples include the hepatitis B and G viruses (Simmonds 2001). It is worth noting that species jumps can occur in both directions. For example, it is thought that Mycobacterium bovis—predominantly a cattle pathogen—evolved from the human pathogen M. tuberculosis (Brosch et al. 2002).

TABLE 5-4. List of Human Pathogens Which Have Successfully Crossed the Species Barrier and Proved Capable of Epidemic Spread and, in Some Cases, Endemic Persistence in Human Populations. The Original Hosts Have Been Identified with Varying Degrees of Certainty.

HIV-1 and HIV-2 illustrate that the evolution of new species of pathogen is an ongoing process. Both are sufficiently divergent from their closest relatives (SIVcpz and SIVsmg, respectively) in terms of both their genome sequences and their biologies to be regarded as distinct species. This divergence has probably occurred within the last 100 years. In a nonhuman context, over even shorter time scales we have seen the evolution of another new species of pathogen, canine parvovirus (CPV), associated with a cat virus, feline panleukopenia virus (FPV), jumping into dogs (Parrish and Kawaoka 2005). CPV has spread to dog populations around the world in only a few years.

All of these examples concern RNA viruses, and RNA viruses differ from pathogens with DNA genomes in having far higher nucleotide substitution rates and so the potential for rapid adaptation to new host species (Holmes and Rambaut 2004). The importance of this kind of genetic lability has been explored by Antia et al. (2003) using simple mathematical models. These authors suggested that the potential for successful adaptation (which they defined as becoming sufficiently transmissible that R0 in humans became greater than one) is sensitive both to the size of initial outbreaks (determined mainly by the initial R0 value) and, especially, to the rate of genetic change and the genetic distance to be traveled. As discussed earlier, the initial R0 value is a function not only of pathogen biology but also of features of human demography and behavior which promote transmission and thus the kinds of changes in these mentioned above have the potential to increase the likelihood of the evolution of new human pathogens.

The successful adaptation of a nonhuman pathogen to humans is itself a highly stochastic process. This is illustrated by the early evolution of the human immunodeficiency viruses (see Van Heuverswyn et al. 2006). There is phylogenetic evidence for numerous introductions of SIVs into human populations; most of these failed to become established (Arien et al. 2007) and only HIV-1 M subtype C has become truly pandemic.

This pattern raises the question of where, in practice, the relevant genetic changes that allow a pathogen to successfully invade a human population occur. Antia et al.’s analysis focuses on the process of adaptation within the human population. However, it may be that genetic change within the original reservoir (whether animal or environmental) is also critical for producing variants which are capable of infecting humans in the first place. With a handful of exceptions, such as the simian immunodeficiency viruses, we typically have very little information on the genetic and functional diversity of human pathogens or their immediate ancestors in nonhuman reservoirs.

This is a potentially important topic for future research but a reasonable working hypothesis, supported by our knowledge of the origins of HIV, is that genetic variation in nonhuman pathogen populations does occasionally and incidentally produce human infective variants, and this explains why so many novel human pathogens are RNA viruses (Woolhouse, Taylor, and Haydon 2001). This idea is further supported by the observation that RNA viruses tend to have broader host ranges than DNA viruses (Cleaveland, Laurenson, and Taylor 2001; Woolhouse, Taylor, and Haydon 2001), implying that they can more easily adapt to new host species.

The implication of the preceding discussion is that pathogen evolution is not only an important driver of progression up the pathogen pyramid over long time scales but that, especially for RNA viruses, this process may be relevant over much shorter time scales as well. In addition, we note that evolution is clearly a key driver of the emergence of new variants of existing human pathogen species, with potentially significant epidemiological consequences. This is evident in the generation of antibiotic resistant bacteria and chloroquine resistant malaria, as well as variants expressing novel virulence factors (e.g., E. coli O157) or with distinct pathogenicities (e.g., H5N1 influenza A).

Finally, we note that an important feature of new pathogens is that they have not been previously subject to evolutionary constraints on their virulence (i.e., the degree of harm they do to the host) in the new host (Ebert 1998). Moreover, the new host may make only a small contribution to the epidemiology of the pathogen (Level 3 of the pathogen pyramid), or even none at all [if] it is an epidemiological “dead end” in the sense that although infection can occur there is no onward transmission of infection (Level 2). In such cases evolutionary constraints on pathogen virulence may be weakened or absent (Woolhouse, Taylor and Haydon 2001). Putting these observations together it is unsurprising that many new human pathogens (e.g., Nipah and Ebola viruses, some hantaviruses, SARS coronavirus, and HIV-1) are very virulent, as indicated by their high case-fatality rates.

Public Health Implications

Future Emergence Events

It seems likely that the kinds of ecological changes that have been associated with pathogen emergence in the recent past (see IOM 2003) will continue to occur in the immediate future, e.g., continued deforestation for agriculture, intensification of livestock production, globalization, bush meat trade, urbanization, and so on. In that case, we can reasonably anticipate the reporting of yet more new species of human pathogen (currently happening at a rate of over 3 per year—Table 5-3) in the immediate future as well.

The survey of new pathogen species reported since 1980 suggests the kinds of pathogens that are most likely to emerge in the future. Four characteristics are expected to be particularly important:

  1. RNA virus (most new pathogens are RNA viruses);
  2. Nonhuman animal reservoir (most new human pathogens are associated with or originate from other kinds of host, usually other species of mammal);
  3. Broad host range (pathogens that are already capable of exploiting a range of different hosts species are more likely to have the potential to infect us);
  4. Some, perhaps initially limited, potential for transmission between humans (in which case, evolution of the pathogen and/or changes in the human population that increase the pathogen’s transmission potential could lead to a marked increase in the size of outbreaks).

The above criteria are certainly not intended as absolute predictors of pathogen emergence; a good historical counterexample is syphilis (new to the Old World in the late 15th century, its origins remain disputed but it is a bacterium not associated with nonhuman reservoirs—Weiss, 2001). Even so, it is helpful to have some indication of what kinds of new pathogen we are most likely to encounter.


The first line of defense against any emerging pathogen is its rapid detection and identification. Recent practical experience with BSE and SARS demonstrates that rapid detection and identification leading to the rapid introduction of preventive measures can prove highly effective in combating outbreaks of novel diseases (Wilesmith 1994; Stohr 2003). Moreover, computer simulation studies motivated by concerns about the possible emergence of pandemic influenza suggest that only if a new strain is detected in the very earliest stages and interventions are put in place extremely promptly is there any realistic prospect of curtailing an epidemic (Ferguson et al. 2006).

Surveillance for novel pathogens, however, does present some particular challenges. Initially, this is likely to depend on clinical observation, such as the reporting of clusters of cases of disease with unusual symptoms. Internet surveillance for reports of unusual disease outbreaks is also possible and, in the longer term, generic diagnostic tools—for example, lab-on-a-chip tests for all known human viruses—should become available (OSI 2006).

The map of reports of new pathogen species (Figure 5-6) argues strongly that surveillance needs to be global, especially considering the unprecedented rates of international travel and trade that can allow new infectious diseases, such as SARS, to spread around the world over time scales of days or weeks. Pathogen emergence is an international problem.


Another key lesson from surveying novel pathogens is the importance of animal reservoirs in the emergence of new infectious diseases. One implication of this is that surveillance in reservoir populations is likely to be an effective tool for monitoring risks to humans (Cleaveland, Meslin, and Breiman 2007). On top of this, it may often be the case that most scientific knowledge of the basic biology of an unusual human pathogen lies, at least initially, with the veterinary community rather than the medical community. Palmarini (2007) lists a number of examples of this: infectious cancers, retroviruses, lentiviruses, transmissible spongiform encephalopathies, rotaviruses, and papilloma viruses. To this list could be added coronaviruses and ehrlichiosis. More generally, it is now widely recognized that humans share the majority of their pathogens with other animals (Taylor, Latham, and Woolhouse 2001).

Together, these observations underline the importance of close linkages between medical and veterinary researchers, resonating with the “one medicine” concept originally put forward by Schwabe (1969) and seeming especially appropriate in the context of emerging infectious diseases.

However, understanding the process of emergence requires much more than an understanding of the basic biology of the host-pathogen interaction, important though this undoubtedly is. A theme of this review has been the importance of ecological factors for the emergence of new pathogens. But we have used “ecological” to cover a very wide range of environmental, agricultural, entomological, demographic, behavioral, cultural, economic, and sociological drivers of pathogen emergence. In specific contexts these could include the bush meat trade (associated with the emergence of HIV and SARS), livestock feed production (associated with BSE/vCJD) or changes in pig farming practices (associated with Nipah virus). These examples emphasize that disease emergence is a multi-disciplinary problem and needs to be understood at a number of scientific levels. Collaborations need to be developed not just between the human and animal health branches of the biomedical research community but also with researchers covering a much wider range of disciplines.


The pathogen pyramid provides a useful conceptual framework for thinking about the process of the emergence of a new species of human pathogen. However, it is immediately clear that at each level of the pyramid there are some important gaps in our knowledge.

First, we still have very little idea of the diversity of pathogens to which humans are being or could be exposed. Systematic surveys across a range of possible sources of new pathogens (notably other mammal species) using techniques such as shotgun sequencing are possible in principle, and would provide this information.

Establishing a priori which pathogens are capable of infecting humans is even more challenging. A first step would be to identify the cell receptors used by the 189 recognized species of human virus. At present, we have this information for only around half of the virus species.

Estimating the transmission potential of a new pathogen within the human population can only be achieved by closely monitoring initial outbreaks. Analysis of such data can provide some early warning of crucial epidemiological changes (as illustrated by the analysis of measles data mentioned above). Real time analysis of epidemic data can also provide timely estimates of the transmission potential (see Lipsitch et al. 2003 for application to the SARS epidemic) which can help inform control efforts. On the other hand, for many of the rarer human pathogens we do not currently know whether or not they are transmissible between humans (Woolhouse 2002).

It is extremely likely that we will encounter new species of human pathogen in the near future. We urgently need the scientific and logistic capacity to rapidly detect and evaluate the threat that new pathogens present and to intervene quickly and effectively wherever necessary. Experience of SARS provides some encouragement that, given adequate resources, efforts to combat emerging pathogens can be successful, but further challenges lie ahead.


Jonathan A. Eisen, Ph.D. 5

University of California, Davis


One of the unifying goals of this workshop, as well as of the Forum on Microbial Threats, has been to promote the study of microbes, not only to enhance our understanding of their present roles in the world but also, we hope, to predict their future changes (e.g., the emergence of new infectious diseases). This was, of course, one of the life missions of Joshua Lederberg, who helped create the Forum and who this workshop is honoring.

Studies of evolution are central to these goals because “nothing in biology makes sense except in the light of evolution” (Dobzhansky, 1973). Evolutionary studies help us understand the past and interpret the present, and from a combination of those two we have some possibility of being able to predict the future. Since Lederberg was also keen on evolutionary studies (Lederberg, 1997, 1998), it is appropriate for a workshop in his honor to focus on Microbial Evolution and Co-Adaptation.

I would like to note that I feel a personal connection to Joshua Lederberg, as I received much of my microbiology training from Ann Ganesan who had been a Ph.D. student in his lab. However, anyone, with or without a specific connection to Lederberg, can learn a great deal about him and his work through a wonderful website made available by the National Library of Medicine.6 There you can find many of his scientific papers, as well as his letters, scientific articles he wrote, articles he read, columns he wrote for The Washington Post on science policy, and more. In addition, I would like to point out that Lederberg was an ardent supporter of “open access” to scientific publications. He was on the board of PubMed when PubMed Central was created. PubMed Central is a centralized archive of freely available, full-text versions of scientific publications. Although not all of his papers are in PubMed Central,7 most are available at the National Library of Medicine site.

In this paper, I am focusing on one key aspect of evolution: the origin of novelty, or how new forms, functions, processes, and properties originate. In addition, I consider some of the factors that influence the likelihood that novelty will originate—something generally referred to as evolvability. Why do some organisms invent new functions readily while others are “novelty challenged”? I note that I focus here on work from my lab and am not attempting to review the entire field.

I have been interested in the origin of novelty and in evolvability, particularly as they occur in microbes, since I was introduced to microbes as an undergraduate through studies of hydrothermal vent ecosystems. Actually, I had written a paper on this back in high school, but it was not until college that I truly focused on the topic. A bit later, in 1995, my career (and that of most other microbiologists and evolutionary biologists) was changed forever with the publication of the first complete genome of any free-living organism (Fleischmann et al., 1995). It was then that I shifted my research to the integration of evolutionary analyses with studies of genome sequences. For better or for worse, I coined the term for this field: phylogenomics (Eisen et al., 1997). Note that the way I use this term is a bit different from the way some others in the community use it. Many people use the term phylogenomics to refer to the use of genome-scale data (e.g., genome sequences) for phylogenetic studies. With that introduction, I will now relate some “phylogenomic tales” as examples of how phylogenomic analysis can help us understand the origin of novelty. This will also demonstrate the usefulness of this approach for understanding the past, interpreting the present, and, maybe, predicting the future.

Phylogenomics and Novelty I: Predicting Gene Functions Using Evolutionary Trees

Throughout this workshop, we have seen many examples of genome sequencing leading to wonderful insights about the microbial world. Indeed, it can be said that genome sequence data have sparked a renaissance in microbiology. It is important to realize, however, that much of this renaissance rests on one particular step in the analysis—the prediction of gene function based on gene sequence. This step is critical because typically one generates the genome sequence of a particular organism, most of whose genes will not have been studied experimentally. Prediction of gene function adds value to the genome sequence data because such predictions can guide further computational and experimental studies of the organism. My first phylogenomic tale illustrates how, in the course of a genome-sequencing project, the evolutionary analysis of a particular gene can enable us to make more accurate predictions about the function of that gene in a particular organism and, in some cases, can also provide insight into the evolutionary processes in that organism, as well.

This is the story of one such organism, Helicobacter pylori, a bacterium that dwells in the stomachs of humans and some other mammals. For many years, these stomach dwellers were generally ignored. However, thanks in large part to the work of people like Barry Marshall, it is now known that H. pylori is a causative agent of stomach ulcers as well as gastric cancers (Marshall, 2002). Due to its medical importance as well as its novel ability to tolerate very high acidity, this species was one of the first targeted for genome sequencing. In 1997, the genome of one strain was published (Tomb et al., 1997).

At that time, as a Ph.D. student at Stanford, I was relentlessly badgering everyone I knew, attempting to convince them that evolutionary analysis could help in the prediction of gene function. I had become convinced of this myself through analysis of the trickle of genome sequence data for humans, yeast, and other organisms that had already begun to flow before the first complete genome was published. Back in 1995, I had even published a paper showing the benefits of evolutionary reconstructions in studies of one family of proteins, the SNF2 family (Eisen et al., 1995). Although the benefits were clear to me, others were not so sure. Fortunately, at the time I was teaching a class with Rick Myers, a professor in the genetics department and the head of the Stanford Human Genome Center. He had been asked to write a “News and Views” piece for Nature Medicine commenting on the recent papers reporting the sequencing and analysis of the genomes of H. pylori (Tomb et al., 1997) and Escherichia coli K12 (Blattner et al., 1997). Also, since he was one of the people I had been badgering, he suggested I try to come up with an example of where the inclusion of evolutionary analysis could have benefited their work.

Luckily for me, there was a claim in the H. pylori paper that was a perfect candidate for evolutionary analysis. The authors reported (Tomb et al., 1997):

The ability of H. pylori to perform mismatch repair is suggested by the presence of methyl transferases, mutS and uvrD. However, orthologues of MutH and MutL were not identified.

This was right up my alley because I was working on the evolution of DNA mismatch repair at the time.

A DNA mismatch can be created when the wrong base is put into a newly synthesized strand by the enzyme carrying out DNA replication (i.e., DNA polymerase). Thus, a mismatch indicates a replication error. Mismatch repair is a process whereby, immediately following DNA replication, repair enzymes scan for mismatches between the template and newly synthesized DNA strands. When the mismatch repair machinery finds one, it removes a section of DNA containing the mismatch from the newly synthesized strand. That section is then resynthesized using the original (and presumably accurate) template strand as a guide. Mismatch repair is vital. It greatly reduces the mutation rate by correcting many of the replication errors made by DNA polymerase.

It was because of my knowledge of the evolution of mismatch repair that the report in the H. pylori paper caught my attention. I knew that every time a mismatch repair system had been found in an organism, regardless of whether that organism was from the bacteria, mammals, plants, yeast, or a variety of other groups, and regardless of whether it was found by genetics, by biochemistry, or even by targeted cloning, the pattern was the same. The system always required at least one member of the MutS family of proteins and one member of the MutL family. Yet, according to the paper, H. pylori did not encode a MutL homolog. So I decided to look at this in more detail.

My first step was to recheck the genome sequence analysis. I did this using the Basic Local Alignment Search Tool (BLAST), which compares a given DNA, RNA, or protein sequence with the sequences in a database, determines whether similar sequences are present and, if so, generates a list of the closest matches. First, I took all known MutL-like proteins and searched them against the H. pylori genome data and found, as Tomb et al. did, that there were no close matches. Given that they had determined the complete genome of this strain, the absence of a BLAST match suggested there was indeed no MutL encoded in the genome. I note that this “determining the absence” of something from a genome is one of the key benefits of determining complete genome sequences (Fraser et al., 2002).

Then I did some BLAST searches with MutS-like proteins as “queries” and found, as the authors had reported, that there was one, and only one, protein encoded in the genome that was similar to proteins in the MutS family. So I took this protein and then used it to search against all known sequences from other organisms, to see to what it was most similar. This in essence was mimicking the searches done in the analysis of the genome, and the result seemed quite convincing (Table 5-5). All of the proteins that were most similar to the H. pylori protein were described in the database as “Mismatch repair protein MutS” or something similar. This description of the related proteins, also known as their annotation, was clearly what led the authors to conclude that this protein was involved in mismatch repair. This left me with a conundrum. There was no MutL protein encoded in the genome, yet there was, apparently, a MutS protein. Many possible explanations came to mind, all of which were interesting. H. pylori might have been the first species to be found with a mismatch repair system that did not require a MutL homolog. Or it might have recently lost its MutL, as had been seen in many strains of E. coli and Salmonella (LeClerc et al., 1996). Alternatively, perhaps the MutS-like protein was not a normal MutS involved in mismatch repair, but rather was used for a different function in this organism.

TABLE 5-5. BLAST Search Results as They Were Seen in 1997 Using the MutS-like Protein from Helicobacter pylori as a Query.

Although these, along with yet other explanations, seemed plausible, one observation suggested to me that the latter explanation—that the MutS-like protein was doing something else—might be the correct one. In the list of BLAST matches, I had noticed that members of the MutS family that I knew to have documented roles in mismatch repair were not high on the list, indicating the H. pylori protein was not as similar to these as it was to some other MutS-like proteins that might have a novel function (Table 5-5). In addition, I knew from my prior work (Eisen et al., 1995), and from the work of others (e.g., Tatusov et al., 2000), that BLAST scores were not a reliable indicator of evolutionary relatedness. So my next step was to investigate the evolutionary history of the MutS proteins, including the new one from H. pylori. I did this by generating a multiple sequence alignment of all available MutS sequences and inferring an evolutionary tree from that alignment. The evolutionary tree revealed that there were two subfamilies of MutS homologs in bacteria, one containing the “normal” MutS-like proteins known to be involved in mismatch repair and the other containing the H. pylori protein along with a few others. None of the proteins in this second subfamily had ever been studied experimentally and all were only distantly related to the “normal” MutS subfamily. Given this finding along with the observation that H. pylori lacked a MutL homolog, we wrote in our Nature Medicine article (Eisen et al., 1997) that it was premature to predict that mismatch repair would be found in this species. I followed this up with a more comprehensive evolutionary study (Eisen, 1998b) that came to the same conclusion.

I would like to point out that this was not simply an esoteric exercise. Mismatch repair has great significance due to its role in modifying the mutation rate. Without mismatch repair, an organism’s mutation rate usually goes way up (and in addition the rate of acquisition of DNA from other organisms also tends to go up). This has important implications for the evolution of virulence, pathogenicity, and drug resistance. Many papers published since this initial analysis have confirmed that H. pylori actually does have a high baseline mutation rate (Kang et al., 2006). In fact, the entire group of epsilon proteobacteria (of which H. pylori is a member) does not have a normal MutS homolog. Thus, the question arises: Do all of these organisms have high mutation rates? Or have they evolved some compensatory process that reduces mutation rate even without mismatch repair? At least from current data, it seems that many members of this group do have somewhat elevated mutation rates. For example, when the Sanger Center was sequencing the genome of a close relative of H. pylori, Campylobacter jejuni (which also does not encode a normal MutS homolog), even in the few generations required to grow up a sample of this strain for sequencing, many mutations were acquired (Parkhill et al., 2000). This suggests that the mutation rate for this strain is quite high. Awareness of this dynamic is vitally important when designing therapeutics to target organisms that lack mismatch repair. This example illustrates how evolutionary analysis of a gene found in a genome can not only tell us something about the biology of that organism, but can also help us to predict its evolvability.

This H. pylori story is but one of many that demonstrate the value of including evolutionary analysis when predicting gene function. In this regard, I must point out that I am far from unique in holding this view. For example, while I was working on the use of phylogenetic trees, multiple groups were showing how classifying proteins into families and subfamilies was critical for predicting function (Sonnhammer et al., 1997; Tatusov et al., 2000). My approach to this functional prediction was somewhat different from these subfamily- or ortholog-focused approaches in that I have argued that one needs to actively use the tree itself by using an approach known as character state reconstruction (Figure 5-10). Character state reconstruction is a commonly used method in phylogenetics whereby one can infer for particular traits (also known as characters) the history of change between different forms of those traits (also known as states). Normally, character state reconstruction is used to infer information about ancestral nodes in a tree (e.g., the common ancestor of two extant organisms), but it can also be used to infer the likely state of modern organisms. It is relatively straightforward to use these methods to infer information about protein function by treating each protein much as you would treat different organisms. Importantly, not only can one infer likely functions for proteins using this approach, but this has a benefit over subfamily classification approaches in that it is less likely to make incorrect predictions of function (such as when function changes rapidly; Eisen, 1998a). It is worth noting that this adaptation of character state reconstruction methods for predicting the functions of uncharacterized genes is analogous to predicting the biology of a species based on the position of that organism in the tree of life. Such predictions tend to work better for gene function, in large part because organism-level biology can change much more rapidly than the function of specific genes.
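To make the idea of character state reconstruction concrete, here is a minimal sketch that uses Fitch parsimony, one simple reconstruction method, to predict the function of an uncharacterized protein from its position in a gene tree. The choice of Fitch parsimony, the binary tree shape, the protein names, and the function labels are all assumptions made for illustration; this is not the procedure used in the studies cited above.

```python
# Sketch: Fitch parsimony on a gene tree. Leaves are protein names (strings);
# internal nodes are 2-tuples of subtrees. All names below are hypothetical.

def fitch_sets(node, known, all_states, sets):
    """Bottom-up pass: compute the set of candidate states for every node."""
    if isinstance(node, str):
        state = known.get(node)
        # An uncharacterized protein is treated as compatible with any state.
        sets[node] = {state} if state is not None else set(all_states)
    else:
        left, right = node
        a = fitch_sets(left, known, all_states, sets)
        b = fitch_sets(right, known, all_states, sets)
        # Intersection if the children agree on something, otherwise union.
        sets[node] = (a & b) or (a | b)
    return sets[node]


def assign(node, parent_state, sets, out):
    """Top-down pass: keep the parent's state whenever the node's set allows it."""
    options = sets[node]
    state = parent_state if parent_state in options else sorted(options)[0]
    if isinstance(node, str):
        out[node] = state
    else:
        for child in node:
            assign(child, state, sets, out)


def predict_functions(tree, known):
    """Return a predicted state for every leaf whose function is None."""
    all_states = {s for s in known.values() if s is not None}
    sets, out = {}, {}
    fitch_sets(tree, known, all_states, sets)
    assign(tree, None, sets, out)
    return {leaf: out[leaf] for leaf, s in known.items() if s is None}


# Hypothetical example: two subfamilies, with the query in the second one.
tree = (("repA", "repB"), (("altC", "altD"), "query"))
known = {"repA": "mismatch repair", "repB": "mismatch repair",
         "altC": "other function", "altD": "other function", "query": None}
prediction = predict_functions(tree, known)
```

In this toy tree the query protein groups with the second subfamily, so the prediction follows that subfamily rather than whichever characterized protein happens to score best in a similarity search.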

FIGURE 5-10. Phylogenetic prediction of gene function. Outline of a phylogenomic methodology. In this method, information about the evolutionary relationships among genes is used to predict the functions of uncharacterized genes (see text for details).

Regardless of whether one uses my character state-based approach or one of the subfamily-based approaches, it is clear that adding information about the evolutionary history of a gene can help predict its functions. Last, and perhaps most important, methods that make use of evolutionary information have been automated (Brown et al., 2007; Haft et al., 2001; Tatusov et al., 2000; Zmasek and Eddy, 2002) and thus can be employed more readily and on larger genomic data sets.

Phylogenomics and Novelty II: Recent Evolution

The methods for predicting function outlined above focus on making use of known information about some genes to predict the functions of uncharacterized genes. These methods do not work well, or even at all, if completely novel functions have arisen in an organism over short evolutionary time scales. Fortunately, over the last few years researchers have developed suites of methods to scan through genomic data for evidence of recent evolutionary diversification. Thus, my second phylogenomic tale relates to how knowledge about the origin of novelty helps us both carry out and interpret these scans.

The key to leveraging information about recent evolutionary events is to first get an understanding of how new functions arise on short time scales. Fortunately, we know a decent amount about this and have heard a great many new insights at this meeting. Examples include clustered regularly interspaced short palindromic repeat (CRISPR) loci, which appear to be an immune-system analog in bacteria and archaea that provides immunity from phage; the rapid loss of genes that are not under strong positive selection; the use of contingency loci to rapidly change the sequence of a protein; and so forth. In fact, many of these phenomena have been either discovered or characterized in detail through comparative genomic analysis of closely related organisms.

Based upon this we can design a relatively simple process for taking a genome and identifying recent events in its history: sequence the genome and the genomes of some close relatives; compare the genomes to each other (including documentation of gene order conservation, gene gain and loss, gene duplication, and generation of simple polymorphisms); and then catalog the variation into different classes that correspond to different mechanisms of novelty generation. For example, polymorphisms in protein coding regions can be classified into synonymous (do not change protein sequence) and non-synonymous (change amino acid sequence) and then the pattern of synonymous versus non-synonymous substitutions can be used to screen a genome for the selective pressure different genes are under. Similarly, one can build evolutionary trees of all genes in a genome and look for those with longer branches in one lineage over another as evidence for an acceleration of evolutionary rate (Pollard et al., 2006). This sort of logic can be applied to just about any type of recent evolutionary event in genomes. Here I go into a bit more detail about how one can use this approach focusing on recent gene duplication events.
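The synonymous versus non-synonymous classification described above can be sketched in a few lines. The code below assumes two already aligned, in-frame coding sequences of equal length with no gaps, and uses the standard genetic code. It counts differing codons rather than individual sites, so it is a simplified illustration and not a full estimate of substitution rates; the sequences are made up.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_substitutions(ref, alt):
    """Count codons that differ synonymously and non-synonymously.

    ref and alt are aligned, in-frame, gap-free DNA strings of equal length.
    Returns (synonymous_codons, nonsynonymous_codons).
    """
    if len(ref) != len(alt) or len(ref) % 3:
        raise ValueError("sequences must be aligned, equal-length, and in frame")
    syn = nonsyn = 0
    for i in range(0, len(ref), 3):
        r, a = ref[i:i + 3].upper(), alt[i:i + 3].upper()
        if r == a:
            continue
        if CODON_TABLE[r] == CODON_TABLE[a]:
            syn += 1      # protein sequence unchanged
        else:
            nonsyn += 1   # amino acid (or stop) changed
    return syn, nonsyn


# Made-up example: GCT -> GCC keeps alanine; AAA -> AGA changes Lys to Arg.
counts = classify_substitutions("ATGGCTAAA", "ATGGCCAGA")
```

A gene with an unusually high share of non-synonymous changes relative to synonymous ones would be a candidate for closer study, subject to proper normalization by the number of synonymous and non-synonymous sites.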

We know from the classic work of Ohta (2000) and others that gene duplication followed by subsequent divergence of the duplicates is a very important mechanism for the generation of novelty in virtually all organisms. Thus, to identify those genes within a lineage that are most likely to have recently diversified functions, we can turn this around and look for recent duplications. We did this by scanning complete genomes, looking for gene families that are expanded in one lineage compared to related lineages. As far as I know, we were the first to use this method when we applied it to the Deinococcus radiodurans genome (White et al., 1999). Subsequently, this general approach has been used in the analysis of many genomes and developed into a robust tool for characterizing them (Jordan et al., 2001).
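A rough first-pass version of such a scan can be written as a comparison of gene family sizes across genomes. The sketch below is a count-based simplification: it does not build trees or confirm that the extra copies arose by duplication within the lineage. The thresholds, family names, genome names, and counts are all hypothetical.

```python
def expanded_families(counts, target, min_ratio=2.0, min_extra=3):
    """Flag gene families that are larger in `target` than in any relative.

    counts: {family: {genome: number_of_members}}
    Returns (family, n_in_target, largest_n_in_relatives) tuples, sorted by
    the size of the excess. Thresholds are arbitrary illustration values.
    """
    hits = []
    for family, per_genome in counts.items():
        n_target = per_genome.get(target, 0)
        others = [n for genome, n in per_genome.items() if genome != target]
        if not others:
            continue  # nothing to compare against
        baseline = max(others)
        if (n_target - baseline >= min_extra
                and n_target >= min_ratio * max(baseline, 1)):
            hits.append((family, n_target, baseline))
    return sorted(hits, key=lambda h: h[1] - h[2], reverse=True)


# Hypothetical family sizes in a target genome and two relatives.
family_counts = {
    "family_A": {"target": 40, "relative_1": 5, "relative_2": 6},
    "family_B": {"target": 1, "relative_1": 1, "relative_2": 1},
    "family_C": {"target": 12, "relative_1": 10, "relative_2": 11},
}
candidates = expanded_families(family_counts, "target")
```

Families flagged this way would still need a phylogenetic tree to show that the additional members are the product of duplications on the branch leading to the target genome.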

The work I am going to describe here involves analysis of the genome of Vibrio cholerae. John Heidelberg had led a project to sequence this genome at The Institute for Genomic Research (TIGR) and asked me for help in carrying out some analyses (Heidelberg et al., 2000). One thing I did was to scan the genome for gene families that had undergone lineage-specific duplications (i.e., duplications that occurred since the organism last shared a common ancestor with any other organism for which we also had the complete genome sequence available). This was done “function blind”—meaning we simply analyzed the raw sequence data and not the known or predicted functions of genes. We found something very striking. In one gene family, the number of genes in this species was much greater than that in other related species. More importantly, the “extra” genes in V. cholerae were apparently the result of multiple rounds of gene duplication that occurred in the evolutionary branch leading up to this species (i.e., since it diverged from other lineages for which genomes were available). This family encoded the methyl-accepting chemotaxis proteins (MCPs) which were predicted to be involved in sensing and responding to chemical gradients in the environment (Figure 5-11).

FIGURE 5-11. Phylogenetic tree of methyl-accepting chemotactic protein (MCP) homologs in completed genomes. Homologs of MCPs were identified by FASTA3 searches of all available complete genomes. Amino acid sequences of the proteins were aligned using CLUSTALW.

Given the known biology of V. cholerae as an aquatic microbe, it seemed even more likely that this protein family might indeed have experienced recent evolutionary adaptations. Of course, not all duplications are related to evolutionary diversification, but with a genome encoding more than 4,000 proteins, identifying a candidate subset to pursue with more careful informatics and with experimental studies was definitely helpful.

Phylogenomics III: Uncharacterized Genes

Both of the approaches described above predict the function of particular genes by making use of experimental information about homologs of those genes. Unfortunately, this does not always work well, for many reasons. For instance, much of the time a gene of interest will have homologs in other species but none of those homologs have been studied experimentally. Such genes, known as “conserved hypothetical” genes, pose a significant challenge for function prediction. Fortunately, over the last 10 years, many new methods have been developed that are particularly useful for characterizing their functions (see Marcotte, 2000, for review). Since these methods make use of other types of experimental information (such as coexpression patterns, protein-protein interaction networks) or computational analysis (including chromosomal location, shared promoter sequences, protein domain patterns), they are generally known as “nonhomology” methods. I’m going to introduce you to one of them, my favorite: phylogenetic profiling (Pellegrini et al., 1999).

In phylogenetic profiling, we first determine the distribution of genes of interest across many species. Genes with similar patterns of distribution are then grouped together. The underlying idea here is that often several genes interact in some way, for example, all being subunits of a complex protein or being involved in carrying out a particular process such as methanogenesis. For one gene to be functional, all must be present in an organism. Such genes would thus tend to be found in groupings that have similar patterns of distribution across species. It is important to point out that when interpreting these profiles, one must take into account two key processes in the evolution of microbial genomes. First, unless genes are used or are under strong selection to be maintained, they tend to disappear. Second, microbes don’t just inherit genes vertically within a lineage; they also acquire genes from other organisms by horizontal gene transfer. Significantly, when genes that work together are acquired horizontally, they tend to all get added or deleted simultaneously, or nearly simultaneously, with the result that when we compare genomes, we see that all members of such a group are either present or absent.

Here’s how one actually carries out phylogenetic profiling. You start with a set of genes in which you are interested, perhaps all the genes in the complete genome sequence of “your” organism. You then compare them against each complete genome sequence in a genome database by asking a simple yes-or-no question: For each gene in your organism, is there a homolog in the other genome? After you have done this for every gene in your genome, you create a profile for each gene by plotting its presence or absence across all the species. With such profiles in hand, one can then identify genes with similar profiles. One way to do this is to simply cluster genes by their profiles and look for tight clusters of genes with highly similar distribution patterns. An example of this is shown in Figure 5-12 in which each row corresponds to a gene and each column represents one species. Conceptually, this is analogous to microarray clustering of gene expression patterns. In fact, microarray clustering software is often used for analyzing phylogenetic profiles.
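The profiling steps above can be sketched as follows. The `has_homolog` callback is a hypothetical stand-in for whatever similarity search and cutoff one chooses, and the gene and genome names are invented. Grouping by identical profiles is the simplest possible clustering; a Hamming distance is included for looser grouping.

```python
from collections import defaultdict


def build_profiles(query_genes, genomes, has_homolog):
    """Return {gene: tuple of 0/1}, one entry per genome, in genome order."""
    return {gene: tuple(int(has_homolog(gene, genome)) for genome in genomes)
            for gene in query_genes}


def group_by_profile(profiles):
    """Group genes that share exactly the same presence/absence pattern."""
    groups = defaultdict(list)
    for gene, profile in profiles.items():
        groups[profile].append(gene)
    return {p: sorted(genes) for p, genes in groups.items() if len(genes) > 1}


def hamming(p, q):
    """Number of genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


# Hypothetical data: which (gene, genome) pairs have a detectable homolog.
hits = {("geneA", "sp1"), ("geneA", "sp3"),
        ("geneB", "sp1"), ("geneB", "sp3"),
        ("geneC", "sp1"), ("geneC", "sp2"), ("geneC", "sp3")}
genomes = ["sp1", "sp2", "sp3"]
profiles = build_profiles(["geneA", "geneB", "geneC"], genomes,
                          lambda gene, genome: (gene, genome) in hits)
clusters = group_by_profile(profiles)
```

Here geneA and geneB share the profile (1, 0, 1) and would be grouped together, while geneC, present everywhere, carries little information about which process it belongs to.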

FIGURE 5-12. Phylogenetic profile analysis of sporulation in Carboxydothermus hydrogenoformans. For each protein encoded by the C. hydrogenoformans genome, a profile was created of the presence or absence of orthologs of that protein.

Once you have such groupings of genes with similar cross-species distribution patterns, you can then use them to aid in predicting gene functions. For example, we used phylogenetic profiling to analyze the genome of the bacterium Carboxydothermus hydrogenoformans (Wu et al., 2005). There we found a very tight cluster of genes shared among many sporulating species (e.g., Bacillus subtilis) but absent from species that did not sporulate—even if closely related. Many of the gene families in this cluster were known to be involved in sporulation in other species. Based on this information, we predicted that C. hydrogenoformans had the ability to sporulate, and indeed we subsequently confirmed this experimentally. Moreover, our analysis revealed that there were also many other gene families of unknown function that were shared by sporulating species and absent from nonsporulating ones. Such genes were likely candidates for carrying out novel sporulation-associated activities. A bit of confirmation came just as we were finishing our paper. Richard Losick at Harvard published a set of studies on sporulation in B. subtilis that identified a few new sporulation genes (Eichenberger et al., 2004; Silvaggi et al., 2004), and many of our candidates were in their list of novel sporulation genes. Perhaps most interestingly, many of our candidates were still not identified as likely sporulation genes and likely represent novel sporulation-associated functions yet to be characterized.

I note that the approach of phylogenetic profiling can be strengthened by modifying the basic yes-or-no question. Instead of asking if there is a homolog of your gene present in another species, you want to ask if there is an ortholog, thus using some evolutionary information to improve your clustering (Eisen and Wu, 2002). With either method, phylogenetic profiling is a powerful tool for finding sets of genes that function in related processes or in a pathway. Although it does not characterize their biochemical activity well, it can provide insight into the process in which they participate (e.g., sporulation) and thus guide experimental studies. As we sequence more and more genomes, this method will become more and more informative.

Phylogenomics IV: Acquisition of Function from Others

There are two basic strategies by which organisms evolve new functions. One option is through modification of their own genome (e.g., mutation, gene duplication, domain swapping, invention of new genes), but these processes can sometimes be quite slow. In many cases, it is much easier instead to acquire the function from another organism that already has it. How is this done? By acquisition or affiliation. In other words, they can acquire the requisite genes via sex or lateral gene transfer, or they can gain access to the products of those genes through some type of affiliation with organisms that have those functions. Such affiliations include long-term symbioses. Symbioses are categorized as being parasitic (when one partner obviously benefits and the other is harmed), commensal (where one benefits and the other is unaffected), or mutualistic (where both benefit), but often we do not actually know the full extent of mutual impacts.

I am going to give an example of function acquisition by symbiosis, and demonstrate how genomic studies, combined with an understanding of the biology and evolution of the symbiosis, can aid in functional predictions. One partner in this symbiosis is the glassy-winged sharpshooter. This insect, like other sharpshooters, is an obligate xylem feeder that makes its living by feeding on the fluids in the xylem portion of the circulatory system of the host plant. This particular species has received special attention because it is a vector for Pierce’s disease, a nasty problem in grape vineyards. The disease agent is a bacterium, Xylella fastidiosa, that infects the xylem and can be transmitted between plants by the sharpshooters, much as bloodborne pathogens are transmitted between animal hosts (see Chatterjee et al., 2008, for a review).

Obligate sap-feeding insects face a serious challenge. As part of their defenses, many plants make their sap less useful to sap-feeding insects by removing some nutrients that are essential for animals. For example, the essential amino acids (which no animal can synthesize and which animals thus require in their diet) tend to be present in very low concentration in phloem sap. To counter this, many obligate phloem-feeding insects have bacterial symbionts living inside specialized cells in their gut. The insect provides the bacteria with sugars from sap, and the bacteria, in turn, make amino acids for their hosts. Xylem sap, which moves from the roots to the rest of the plant, tends to be even more nutrient-poor than phloem sap, and obligate xylem feeders also have bacterial symbionts living inside specialized cells in the gut (see Moran et al., 2008, for a review on heritable symbionts).

When we started our project, obligate phloem-feeding insects, such as aphids, had already been studied extensively, but much less was known about the obligate xylem-feeders. We (especially our collaborator, Nancy Moran) thought there might be a different twist to the story for the bacterial symbionts living in xylem-feeding hosts. At that time, all the species of sharpshooters examined had been found to host Baumannia cicadellinicola, a close relative of the symbionts that make amino acids for the aphids (Moran et al., 2003). Our first step was to apply shotgun genome sequencing methods to the DNA obtained from endosymbiont-containing tissue dissected from this sharpshooter. Using this approach we were able to determine the complete genome of B. cicadellinicola. Examination of the genome revealed many very interesting things (Wu et al., 2006).

First, we found that this organism had many of the hallmarks typical of intracellular symbionts: a small genome, low G+C content, and high evolutionary rates. As an aside, the high evolutionary rates often seen in intracellular symbionts are thought to be due, in large part, to the small effective population sizes for intracellular organisms. However, we found significant variation in the rate of evolution among endosymbionts, with the highest rates tending to be found in those that lack homologs of mismatch repair genes (Figure 5-13). Adaptation to an intracellular existence is typically accompanied by marked reduction in genome size, probably due to random forces. When important DNA repair genes are lost, mutation rates may go up—an evolutionarily significant consequence of their small genome size. Furthermore, whereas free-living species would have opportunities to reacquire the repair genes from other organisms, this is very unlikely for intracellular ones, isolated as they are. Thus another evolutionary consequence of an intracellular existence is reduced evolvability by means of lateral gene transfer.

FIGURE 5-13. There is significant variation in the rate of evolution among endosymbionts, with the highest rates tending to be found in those that lack homologs of mismatch repair genes. SOURCE: Adapted from Wu et al. (2006).

Secondly, further examination of the genome and prediction of gene functions revealed pathways for synthesizing diverse vitamins and cofactors, suggesting that this symbiont was also helping its xylem-feeding host to deal with a very nutrient-poor diet. Based on our prior knowledge of these types of symbioses, we expected to find pathways for the synthesis of the essential amino acids required by the sharpshooter—but we could not find any. In thinking about the mechanisms for the evolution of novelty, it seemed unlikely to us that this host, the glassy-winged sharpshooter, would have evolved the ability to synthesize essential amino acids, given that this capability has never been found, as far as I know, in any animal species. Nevertheless, the observations were that the sharpshooter eats only xylem sap, xylem does not contain the essential amino acids, and the genes for essential amino acid synthesis pathways were not present in the genome of either the sharpshooter or its Baumannia endosymbiont. We were vexed.

There were three possibilities we considered that could reasonably explain this conundrum. One was that the sharpshooter was acquiring amino acids from other food sources. This seemed unlikely as sharpshooters are generally considered to be obligate xylem sap feeders. A second possibility was that the glassy-winged sharpshooter was getting the essential amino acids from the xylem sap. Though we could not rule this out, it seemed unlikely because there should be strong selection on the plants to keep essential amino acids out of the xylem sap and because xylem generally was not known to have such amino acids. A third possibility was that another organism in the sharpshooter system was making amino acids. This seemed to be the most likely possibility especially since our collaborator Nancy Moran had just recently shown that there was a second type of bacterial symbiont living inside the guts of all sharpshooters (Moran et al., 2005). We had not paid much attention to this second type of symbiont since the Baumannia symbionts were so closely related to the Buchnera symbionts of aphids that provided all the nutritional supplements needed by their host to feed on phloem sap (and since these new symbionts were from a completely different phylum of bacteria).

Fortunately, we had a quick, though somewhat dirty, way to test for the possibility that another organism in the system was making essential amino acids. To sequence the Baumannia genome we did not use a pure culture since these symbionts had never been grown in the lab. Instead, we had done a “metagenomics” project in which Nancy Moran’s lab had dissected hundreds of sharpshooters and removed as carefully as possible the tissue that was known to contain the Baumannia symbionts. We then extracted DNA from this material and used it for whole-genome shotgun sequencing during which we sheared the DNA into moderately small pieces, cloned these pieces into a plasmid library, and then sequenced the ends of these plasmid clones. From these data we were able to generate a good assembly of the Baumannia genome, which we then finished with PCR and primer walking methods. The key for us was that not all of the sequence reads that we obtained were found to map to the Baumannia genome. Some came from other organisms in the sample. So the first thing we did was to look in these other data for genes that might be involved in synthesizing essential amino acids—and we immediately found a few.

So the next question was: From what organism did these genes originate? We knew there should be host DNA in the sample (although we thought it was unlikely that the host would be synthesizing essential amino acids since no animals are known to do so) and that there might also be DNA from the second symbiont as well as from other resident microbes. So what we needed to do was to sort the DNA sequence reads according to the organism from which each came. This sorting is commonly known as “binning” in metagenomic studies. We tried every binning method in use at the time including genome assembly, analysis of DNA base composition and word frequencies, examination of depth of coverage, and others. Unfortunately none of them worked well, most likely because we had very little coverage of the genomes from these other organisms. This is where phylogenomic approaches came in handy.

We decided to try to sort the sequence reads by phylogenetic analyses. We took all the reads, identified all possible proteins or protein fragments that they could encode, determined which of these had apparent homologs in sequence databases, and built phylogenetic trees for those that did. We then sorted the phylogenetic trees according to which organism’s genes appeared in the tree as the nearest neighbor of the protein or fragment.

Overall the trees showed only a few major patterns. In some (Figure 5-14A) the nearest neighbor was something from an animal. Thus, we concluded that the sequence reads in this “bin” likely corresponded to fragments of the host genome. In other trees, the nearest neighbor was a Wolbachia or some close relative (Figure 5-14B). Since Wolbachia (a type of bacteria related to Rickettsia) are common intracellular parasites of insects, we concluded that these reads came from Wolbachia that infected at least some of the insects that Nancy had dissected. Then there was a large collection of reads for which the trees showed a grouping with species in the Bacteroidetes phylum (Figure 5-14C). Because the second symbiont, Sulcia, was in this phylum, we concluded these were likely from the second symbiont.
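Once each read has a tree, the sorting step itself is simple bookkeeping. The sketch below assumes the tree building has already been done elsewhere and that, for every read, we have recorded the taxonomic group of its nearest neighbor in the tree. The read identifiers, group labels, and bin names are hypothetical and chosen only to mirror the three patterns described above.

```python
from collections import defaultdict


def bin_reads(nearest_neighbor, taxon_to_bin):
    """Sort reads into bins by the taxonomic group of their nearest neighbor.

    nearest_neighbor: {read_id: taxonomic group of the closest tree neighbor}
    taxon_to_bin:     {taxonomic group: bin label}
    Reads whose neighbor matches no known group go to "unassigned".
    """
    bins = defaultdict(list)
    for read, taxon in nearest_neighbor.items():
        bins[taxon_to_bin.get(taxon, "unassigned")].append(read)
    return {label: sorted(reads) for label, reads in bins.items()}


# Hypothetical mapping from neighbor group to the inferred source organism.
taxon_to_bin = {
    "animal": "host",
    "Wolbachia": "Wolbachia",
    "Bacteroidetes": "second symbiont",
}
# Hypothetical nearest-neighbor results for five reads.
nearest = {
    "read_001": "animal",
    "read_002": "Bacteroidetes",
    "read_003": "Wolbachia",
    "read_004": "Bacteroidetes",
    "read_005": "Cyanobacteria",
}
bins = bin_reads(nearest, taxon_to_bin)
```

With the bins in hand, one can then ask which bin holds the reads that carry a gene of interest, for example by looking up the bin label of each read annotated with that gene.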

FIGURE 5-14. Phylogenetic trees of putative proteins encoded by single sequence reads of DNA isolated from symbiont-containing tissue of the glassy-winged sharpshooter. Trees were constructed by aligning putative proteins encoded by the reads to homologs.

We then asked: Of the potential essential amino acid synthesizing genes we had identified in some of the reads, to which of the bins did they belong? The answer was clear as day: all belonged to the Sulcia bin. We thus concluded that it was likely that the second symbiont was the provider of essential amino acids for the host and so we spent another year or so trying to finish the genome of this symbiont. Though we did not quite finish the genome, from the 130 or so kilo-base pairs (kbp) of DNA we mapped to this organism, we found that it encoded in essence all the essential amino acid synthesis pathways (Wu et al., 2006); this was later confirmed by the complete genome (McCutcheon and Moran, 2007). What we had discovered was a dual symbiosis in which one symbiont (Baumannia) makes vitamins and cofactors and the other (Sulcia) makes essential amino acids, and together they supplement the nutrient-poor diet of the glassy-winged sharpshooter, contributing to this organism’s annoying ability to spread Pierce’s disease. Most importantly for this article, we would not have been able to sort out the data from the different organisms (and thus would not have discovered the dual symbiosis) without phylogenetic analysis of the metagenomic data.

Phylogenomics V: Knowing What We Do Not Know

As has often been heard at this workshop, Lederberg was very fond of emphasizing that we need to know what we do not know. In that spirit, I want to discuss how knowing what we do not know can help with functional predictions. One aspect of what we do not know that influences our ability to make useful functional predictions is that genome-sequencing projects are highly biased in terms of what types of organisms have been sequenced. For example, I and many others noticed a few years ago (Eisen, 2000; Hugenholtz, 2002) that most of the genomes of bacteria were coming from just three of the 40+ phyla of bacteria (Figure 5-15). The same trend was seen in Archaea and microbial eukaryotes. So based on this we applied for, and in 2002 received, a grant from the “Assembling the Tree of Life” program at the National Science Foundation (NSF) to sequence the first genomes from representatives of 8 phyla of bacteria. We have now finished this project and are in the process of writing up a series of papers on our findings. Yet even from the initial analyses, what was abundantly clear was that a single genome from these phyla was simply not enough. Each phylum represents something on the order of 1 billion to 2 billion years of evolution, and a lot happens in that time in bacteria. So a single genome cannot do justice to the diversity of genes and features of each phylum.

FIGURE 5-15. The diversity of Bacterial and Archaeal species for which complete genomes are available is still poor. (a) Phylogenetic tree, based on rRNA sequences, of representatives of many major Bacterial and Archaeal lineages.

Based on this, in collaboration with the Joint Genome Institute (a Department of Energy [DOE]-funded genome center), we have started a new initiative to really fill in the genomes from across the tree of life. This Genomic Encyclopedia of Bacteria and Archaea (GEBA)8 is just getting started, with 100 genomes being sequenced from across the tree in the first year. Already the results are quite convincing that sampling from across the tree leads to enormous benefits. For example, the phylogenetic profiling method outlined above works best when you have sampling of diverse genomes from different phylogenetic groups. Adding these GEBA genomes to the mix makes phylogenetic profiling work much better.

Sampling from across the tree will take some effort because there are many, many major groups of Bacteria, Archaea, and microbial Eukaryotes, and many of these do not have any cultured representatives. However, the benefits will likely be enormous.9

It is important to point out however that just having genome sequences from across the tree is not sufficient. Functional information from diverse organisms is also critical. I give one example here of why. When I first went to TIGR, Owen White was in charge of a project to sequence the genome of the bacterium Deinococcus radiodurans, the most radiation-resistant organism known. This was very exciting to me since I did my Ph.D. research in part on the evolution of radiation resistance. So I volunteered to help Owen analyze the genome. Since there was some experimental evidence that active DNA repair processes contributed to the resistance of this organism, I spent some time looking for likely DNA repair genes in the genome, making use of the phylogenomic approaches I had been advocating. Indeed we were able to find many genes that appeared likely to be involved in DNA repair processes.

The problem was that the list we came up with was very similar to the list that we could make for nonradiation-resistant organisms such as E. coli and B. subtilis. However, a little thinking about what we did not know helps explain this. Imagine if sometime in the recent history of D. radiodurans a novel DNA repair gene evolved. The method we were using to look for DNA repair genes would not have found this since we were looking primarily for homologs of genes that were shown in other species to be involved in DNA repair processes. Even using novel methods such as phylogenetic profiling would not necessarily help if the new genes in this species were not connected in any way to known DNA repair pathways. The problem here is that most experimental studies of repair genes in bacteria were done in two phyla and we would have a hard time identifying novel repair genes if they had been invented anywhere else in the bacterial tree of life.

This is a general lesson for all functional predictions. Such predictions rely upon some functional characterizations done in some organism. The more these functional studies are done across the tree of life, the better our functional predictions will become. Similarly, functional predictions rely in part upon comparative genome analysis; thus, the more genomes we have from across the tree, the better functional predictions we will get. Knowing what we do not know is therefore critical in guiding experimental and sequencing studies to get the most out of the diversity of organisms.

Summary: A Call for a Field Guide to the Microbes

Overall, what I have tried to do here is present examples of how evolutionary and genome analysis can be integrated into “phylogenomic” studies. I have focused on predicting functions of genes, but the benefits of phylogenomics extend to all aspects of the biology of microbes.

I should emphasize that this is not some radical or overly novel concept, as the integration of phylogeny and function is well known to be critical for understanding the diversity of life. What I have tried to show here is that this is as true for genomics and microbes as it is for physiology, behavior, genes, ecosystems, and other arenas in which evolution has been shown to be a powerful tool. I should note that there is a third piece of information that is useful in addition to integrating phylogeny and function: the biogeographical patterns of the distributions of organisms. For microbes, figuring out the distribution patterns of organisms and the rules determining these patterns is one of the final frontiers. If we are able to integrate phylogeny, function and genomes, and biogeography, we will have something for microbes that is known to be useful in many other organisms: a field guide. A field guide to microbes would no doubt be useful in many arenas, and I am certain it would be a book that Josh Lederberg would have carried with him wherever he went.


I thank Merry Youle for assistance in the editing of this manuscript and Martin Wu and Dongying Wu for help in many of the genome analyses reported here. In addition, I thank the many people at TIGR and UC Davis who provided help in generating and analyzing the genome and metagenomes under discussion. This work was supported by the Defense Advanced Research Projects Agency under grant HR0011-05-1-0057.


Peter Daszak, Ph.D.10

Consortium for Conservation Medicine, Wildlife Trust

In this essay, I examine the process of zoonotic disease emergence and pose the question: Can we use past trends to predict future patterns and better control this public health threat? Stephen Morse, in this chapter, wrote about the ecological and demographic factors that drive disease emergence. This process is essentially evolutionary, with these “drivers” forcing selection of novel, or already present, pathogen strains that are better suited to transmission within our changed populations or environment. Here, I will provide examples of how these drivers can be analyzed and their influence on pathogen transmission measured. I will then demonstrate how our group has used these analyses to make predictions of different steps in the process of disease emergence. First, however, I will briefly review what history has to teach us about disease emergence.

Historical Trends in Disease Emergence

As we look into history, we can see that the process of disease emergence has occurred in a series of earlier phases. For example, many pathogens now endemic in humans appear to have evolved from an ancestor that moved into our populations at around the time we first domesticated animals, between 10,000 and 15,000 B.C.E. Measles virus, for example, is phylogenetically most closely related to the ungulate pathogens rinderpest virus and peste des petits ruminants virus.

Epidemiological studies of host-parasite dynamics demonstrate that infectious diseases cannot become endemic until host populations reach a certain threshold density. In the case of measles, repeated introductions of the virus to the South Pacific Islands did not result in establishment of this disease except in regions where human population densities approached 500,000 (Black, 1966). The first zoonotic pathogens apparently emerged in humans between 200 B.C.E. and 300 C.E., as networks of communities in the Fertile Crescent11 reached threshold densities for diseases such as measles and smallpox (Dobson and Carper, 1996). This early phase of human infectious disease emergence may have been preceded by another such period during the Pleistocene Era,12 when the human population contracted during glaciation, then expanded and migrated afterward. Later, global transportation networks, exploration, conquest, and trade made possible the emergence and spread of the Black Death (plague) in Europe in the fourteenth century, and brought the Conquistadors and smallpox to the Aztecs in the fifteenth and sixteenth centuries. With the Industrial Revolution came pandemic influenza, and with air travel and other features of globalization came HIV/AIDS and our current phase of emerging diseases.

As these grand episodes of disease emergence suggest, one simple predictor of infectious disease emergence is globalization: As we move into new regions, we expand endemic pathogen geographic ranges and provide a new source of susceptible hosts for others. As Figure 5-16 shows, the rate of globalization has accelerated to the point where we are connected as never before via globalized travel and trade networks (Hufnagel et al., 2004).

FIGURE 5-16. The rate of globalization has accelerated to the point where we are connected as never before via globalized travel and trade networks. SOURCE: Hufnagel et al. (2004).

In these cases, infectious agents are exploiting increased human population density and increased contact rates between people. As globalization progresses and human populations become increasingly dense and interconnected, we can already make one simple prediction: The rate of disease emergence will increase correspondingly. Furthermore, as our global patterns of trade and economy rely on this connectivity, we can also predict that the economic cost of these outbreaks will increase, as illustrated in Figure 5-17.

FIGURE 5-17. Economic impacts of selected emerging infectious diseases. SOURCE: Figure courtesy of BioEra.

But there is another important pattern in these historical trends: The first emergence of many now-common endemic diseases (e.g., measles, smallpox) was a zoonotic event explained by ecology. Increasing contact with recently domesticated wildlife led to spill-over of their pathogens and evolution into human-adapted strains. This process has been repeated throughout recent history (e.g., plague), culminating in the recent phase of new zoonoses, including HIV-1 and -2, Ebola, severe acute respiratory syndrome (SARS), and others.

We can break down this process of zoonotic disease emergence into three discrete phases, each with different dynamics and drivers. In the first stage (“preemergence”), human changes to the environment drive wildlife into new regions, or introduce livestock, leading to spill-over among animal populations and outbreaks of novel wildlife diseases. In the second stage, wildlife pathogens “spill over” to human populations causing single cases, small clusters of cases, or localized outbreaks, as has occurred with Nipah virus or Ebola virus. In the final stage, “pandemic spread,” pathogens either adapt to, or are already adapted to, human-to-human transmission, and move rapidly across nations. This pandemic stage is likely rarely reached, but can lead to severe impact due to high mortality (e.g., HIV/AIDS) or economic impact (e.g., SARS).

As we examine these three phases, we see that they are driven by different causal factors, or drivers, but that in almost all cases these are anthropogenic13 and are measurable. For example, the rise in bushmeat hunting (the original cause of HIV emergence [Peeters et al., 2002]) has been analyzed and measured in Southeast Asia and Africa, and the landscape changes underlying Lyme disease emergence in the United States have also been measured and analyzed. Below, I discuss how analyses of the ecological or demographic drivers of disease can be used to measure and predict the future risk of disease emergence across these different phases.

Predicting International (Pandemic) Spread—West Nile Virus, Avian Influenza, and Nipah Virus

Of the three steps in disease emergence, the simplest to predict is international or pandemic spread. This is because the drivers responsible for it are simple and well-defined: human travel and trade patterns. Our group has used this approach to predict future patterns of West Nile virus and avian influenza spread. For West Nile virus (WNV), which has moved across the United States and into South America, we examined the most likely pathways of introduction to Hawaii, the Galapagos, and Barbados.

We examined the only plausible pathways for WNV spread: carriage by people, wind-transported mosquitoes, mosquitoes hitching a ride on boats or planes, animals (pets or poultry), or migratory birds. We conducted simple calculations of the likely number of “bird days” of risk via these pathways, incorporating data on the average number of mosquitoes per plane; the numbers of migratory birds that pass through Hawaii, the Galapagos, and Barbados from West Nile-infected locations and their reservoir competence; and the number of infected people who travel to these locations. We determined that West Nile is likeliest to be transmitted to all three locations by mosquitoes on airplanes (Figure 5-18; Kilpatrick et al., 2004, 2006b), a risk two to three orders of magnitude higher than that of any other pathway. Because most mosquitoes on planes are found in the cargo hold, simple measures to eradicate them (i.e., residual insecticide use in cargo holds) would likely reduce the risk of WNV introduction significantly.

FIGURE 5-18. WNV is most likely to be spread by airplane. SOURCE: Adapted from Kilpatrick et al. (2004, .

We used a similar approach to examine whether H5N1 avian influenza was more likely to have spread into and across Europe via migratory birds or via poultry. We used data from the Food and Agriculture Organization (FAO) on country-to-country poultry trade, and from the Royal Society for the Protection of Birds (RSPB) and Smithsonian on migratory birds. We considered the numbers of birds that migrate along the major flyways, the dimensions of their summer breeding areas, the timing of their migration, and whether their migration routes pass over H5N1-infected locations (Kilpatrick et al., 2006a). We examined each spreading event of H5N1 from 1998 until the present to determine whether it was more likely to have been caused by the poultry trade or by migratory birds. Our analyses suggest that the virus spread initially within Southeast Asia due to the poultry trade, but once it moved out of the region, it was rapidly spread via migratory pathways to and within Europe.

For countries where H5N1 had not yet been reported, we then proceeded to calculate the risk of introduction associated with trade in poultry and wild birds, and with bird migration (expressed as infectious bird days, as shown in Figure 5-19; Kilpatrick et al., 2006a).

FIGURE 5-19. Predicted risk of H5N1 avian influenza introduction from countries that have had H5N1 outbreaks (in blue). Risk was estimated as the number of infectious bird days (number of infected birds × days shedding virus) caused by trade (presented as (more...)
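The risk measure named in the caption, infectious bird days (number of infected birds × days shedding virus), lends itself to a short worked example. The sketch below compares pathways on that measure and expresses the gap in orders of magnitude. Every number in it is a hypothetical placeholder chosen for illustration, not a value from Kilpatrick et al., and the function name is likewise an assumption.

```python
import math


def infectious_bird_days(infected_birds, days_shedding):
    """Risk measure from the text: infected birds x days shedding virus."""
    return infected_birds * days_shedding


# Hypothetical inputs per pathway: (infected birds arriving, days shedding virus).
# These values are invented for illustration only.
pathways = {
    "poultry trade": (1000, 4),
    "wild bird trade": (10, 4),
    "migratory birds": (5, 4),
}

risk = {name: infectious_bird_days(*inputs) for name, inputs in pathways.items()}
ranked = sorted(risk, key=risk.get, reverse=True)

top, bottom = ranked[0], ranked[-1]
gap = math.log10(risk[top] / risk[bottom])

for name in ranked:
    print(f"{name}: {risk[name]} infectious bird days")
print(f"{top} exceeds {bottom} by about {gap:.1f} orders of magnitude")
```

Putting every pathway on one common unit is what allows a statement such as "two to three orders of magnitude" to be made across otherwise dissimilar routes of introduction.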

Although the United States does not trade in poultry with any country that has reported H5N1, its neighbors do; thus, it is not hard to imagine that the virus could travel easily into the United States across its borders. Our analysis also showed much lower (two to three orders of magnitude) risk of H5N1 introduction to the United States through migratory birds. These results contrasted with the publicly stated position of the U.S. Department of Agriculture (USDA) and other U.S. government entities that migratory birds moving through the Siberia-Alaskan flyway would be the most likely cause of H5N1 introduction to the United States.

A large outbreak of encephalitis in Malaysia in 1998–1999 was traced to a novel bat-borne paramyxovirus, Nipah virus (NiV). There was no human-to-human transmission of NiV in Malaysia, but the virus caused the death of over 100 people, with a fatality rate of around 40 percent. The virus is carried by fruit bats (Pteropus spp.) and, in Malaysia and Singapore, infected pigs and people in close contact with infected pigs (Figure 5-20).

FIGURE 5-20. Species chain for Nipah virus in Malaysia. This emerging virus is carried by fruit bats across the Old World tropics. It emerged in people in Malaysia after spilling over from bats to pigs, which act as amplifier hosts. SOURCE: Wildlife Trust Inc. (bat); (more...)

To analyze what caused this outbreak, we investigated the first cases of NiV infection in pigs and humans at the “index” farm—a large (30,000-head) intensively managed export farm in Ipoh, Malaysia. The human NiV infections reflect the transmission dynamics of NiV in the pig population and showed an interesting pattern—a long period (>18 months) of slow, smoldering infection followed by a large epidemic spike. After five years of field studies, virology, experimental work, and mathematical modeling, we have been able to piece together the factors that led to this pattern (Daszak et al., 2006). First, we have shown that NiV is able to survive for prolonged periods on fruit juice at room temperature (Fogarty et al., 2008), supporting the notion that bats visiting the index farm to feed on fruit trees adjacent to pigsties were the origin of infection.

In Malaysia, we captured wild bats, tested them for NiV antibodies, and isolated the virus. We identified evidence of NiV in every colony we examined in Malaysia and reported changes in seroprevalence in young bats, suggesting that the virus circulates endemically in the Malaysian bat population. We found one colony around a kilometer from the index farm and obtained anecdotal evidence of bats visiting the site. These findings suggest that the virus could have been introduced repeatedly to the pigs at the index farm. We also tracked the bats and found that they fly large distances and frequently cross between Malaysia and Indonesia, migrating to follow the fruiting and flowering of orchard crops. Our data from these field studies were able to refute a previously published hypothesis that El Niño/Southern Oscillation (ENSO)-driven forest fires in Sumatra caused bats to move into Malaysia during the 1990s and introduce the virus there for the first time.

We were able to obtain excellent demographic data on Malaysian pig farms, thanks to the legal requirements for export farms in Malaysia. As a result, birth and death rates among the pigs were carefully tracked. We used these data to develop a mathematical model of infection dynamics that simulates precisely the structured system of the pig farm where NiV first emerged. This model allowed us to “recreate” the original outbreak as a computer simulation and analyze why it persisted for almost two years prior to the large-scale outbreak. Our analyses showed that the emergence of NiV in humans appears to be closely linked to the intensification of pig farming in Malaysia. It is unusual that a wildlife pathogen is enhanced by intensive livestock production and then becomes a human problem. This system, however, involves both pigs and mangoes, and the production of both commodities in Malaysia rose sharply in the two decades preceding the 1999 outbreak, as farmers used pig manure to fertilize fruit trees. When pig populations declined as a result of the 1999 Nipah outbreak, mango production declined as well. We, therefore, used our modeling approach to predict the likelihood of future NiV emergence and spread should the Malaysian government decide to reconstitute these large export farms in this or other regions.
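To give a sense of what a model of infection dynamics on a managed farm can look like, here is a minimal sketch of a textbook SIR (susceptible, infected, recovered) model with herd turnover. It is not the model of Daszak et al. (2006), which is not specified in this essay. The function name, the compartment structure, and every parameter value other than the 30,000-head herd size are assumptions made for illustration.

```python
def simulate_farm(days, herd_size=30000, beta=0.3, recovery_rate=0.1,
                  turnover_rate=0.005, initial_infected=1, dt=0.1):
    """Euler integration of an SIR model with constant herd size.

    Pigs leave every compartment at `turnover_rate` per day and are replaced
    by susceptible piglets, so additions balance removals.
    Returns one (day, susceptible, infected, recovered) tuple per day.
    All default rates are hypothetical.
    """
    s = float(herd_size - initial_infected)
    i = float(initial_infected)
    r = 0.0
    steps_per_day = round(1 / dt)
    history = [(0, s, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / herd_size
            ds = turnover_rate * herd_size - new_infections - turnover_rate * s
            di = new_infections - (recovery_rate + turnover_rate) * i
            dr = recovery_rate * i - turnover_rate * r
            s, i, r = s + ds * dt, i + di * dt, r + dr * dt
        history.append((day, s, i, r))
    return history


with_turnover = simulate_farm(700)
no_turnover = simulate_farm(700, turnover_rate=0.0)

print(f"Infected pigs at day 700 with turnover:    {with_turnover[-1][2]:.0f}")
print(f"Infected pigs at day 700 without turnover: {no_turnover[-1][2]:.0f}")
```

In this generic model, a steady supply of new susceptible animals lets infection persist, whereas a closed herd sees the infection burn out. This is a standard property of such models and is shown only to illustrate how farm demographic data (birth and removal rates) enter the calculation. It is not a statement of what the authors' analysis found.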

Can We Predict Spillover of the Next Emerging Zoonosis?


Here, I believe, is the “Holy Grail” for emerging disease research—to develop a valid strategy to predict the next emerging zoonosis—the next HIV or SARS. To do this will require a fusion of evolution, ecology, virology, and microbiology. This is clearly a challenge, and it has been proposed that it will be difficult if not impossible to overcome. For example, Murphy (1998) commented, “In general, there is no way to predict when or where the next important new zoonotic pathogen will emerge or what its ultimate importance might be.”

Before I address the challenge, I first need to acknowledge that Murphy raises an important point in his article: we have a very poor knowledge of the true diversity of microbes able to emerge as new zoonoses in the future, a diversity described by Morse (1993) as the “zoonotic pool.” We can assemble crude estimates of this diversity. Consider that there are approximately 50,000 vertebrate species (see also paper by Morse in this chapter) and that zoonotic viruses emerging in humans tend to be vertebrate viruses. If we estimate that each vertebrate species carries 20 endemic, unknown viruses (almost certainly a gross underestimate), then there is a global diversity of 1 million viruses (bats alone would carry about 20,000 unknown viruses). With only approximately 2,000 different species of viruses identified, we can crudely say that we underestimate the zoonotic pool by at least 99.8 percent!
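The back-of-the-envelope estimate above can be reproduced in a few lines. The species count for bats is not stated in the text; the value of 1,000 used below is an assumption implied by the figure of 20,000 bat viruses at 20 viruses per species. The calculation also treats all 2,000 identified viruses as belonging to the vertebrate pool, which is a simplification.

```python
VERTEBRATE_SPECIES = 50_000    # approximate count given in the text
VIRUSES_PER_SPECIES = 20       # the text's deliberately low guess
KNOWN_VIRUS_SPECIES = 2_000    # approximate count given in the text
BAT_SPECIES = 1_000            # assumption: implied by 20,000 / 20

estimated_pool = VERTEBRATE_SPECIES * VIRUSES_PER_SPECIES
bat_viruses = BAT_SPECIES * VIRUSES_PER_SPECIES
known_fraction = KNOWN_VIRUS_SPECIES / estimated_pool
unknown_percent = 100 * (1 - known_fraction)

print(f"Estimated viral diversity: {estimated_pool:,}")
print(f"Estimated bat viruses:     {bat_viruses:,}")
print(f"Share still unknown:       {unknown_percent:.1f}%")
```

Because the per-species figure of 20 is a guess, the result is best read as an order-of-magnitude statement: raising that guess only increases the unknown share.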

This is one of two obstacles we must overcome if we are going to predict the next zoonosis: We do not know the real size or global distribution of the zoonotic pathogen pool. In addition, we must account for surveillance bias, because people are looking harder for emergent diseases in some places than in others, and that determines where previous zoonoses have been identified.

Emerging Disease Database

With the above challenges in mind, we constructed a database (based on an earlier published list of emerging infectious diseases [EIDs]; Taylor et al., 2001) of EID “events,” defined as the original case or case cluster representing a given infectious disease (including drug-resistant strains) emerging in human populations for the first time, that occurred between 1940 and 2004 (Jones et al., 2008). These EID events include newly evolved strains of pathogens (e.g., drug-resistant and multiply drug-resistant strains), pathogens that have recently entered human populations for the first time, and pathogens that have probably been present in humans historically, but which have recently increased in incidence.

The geographic origin of emergence for each of these diseases is shown in Figure 5-21. Surveillance bias is evident in this figure: Europe and the United States, which support the most comprehensive disease surveillance efforts, detect the greatest numbers of EID events. We corrected for that by geographically plotting the coordinates of every author—about 17,000 of them—of every paper published in the Journal of Infectious Diseases (JID) for the last 20 years, and used this information in our analyses.

FIGURE 5-21. Global richness map of the geographic origins of EID events from 1940 to 2004. The map is derived for EID events caused by all pathogen types. Circles represent one degree grid cells, and the area of the circle is proportional to the number of events (more...)

We were able to use this database to address some key questions in emerging disease biology. First, we asked whether EIDs are really on the rise (Jones et al., 2008). Decade by decade, from the 1940s to the 1990s, the number of EID events has increased significantly, even after accounting for the increasing numbers of scientists over this period. This has another implication: It is reasonable to expect that this trend will continue in the future. We also found that a majority of EID events were associated with drug-resistant microbes. Second, we were able to examine whether zoonoses such as HIV/AIDS, which are the most high-profile EIDs, are truly the most significant threat. We found that zoonoses emerging from wildlife (i.e., HIV, SARS, Ebola and Nipah viruses) are indeed rising significantly over time and, during the 1990s, represented the dominant type of emerging disease.

Testing Hypotheses

We used our database approach to examine two simple questions: (1) Is disease emergence an “anthropogenic” process (i.e., are human changes to demography, the environment, and other factors the key drivers of EIDs)? (2) Can we obtain a more accurate map of the emerging disease “hotspots”—the regions most likely to cause the next new emerging disease?

To test these theories, we first found a way around the dilemma of not knowing where the diversity of pathogens resides by assuming that each mammalian species harbors a similar number of host-specific pathogens. If this is true, then the global distribution of wildlife diversity approximates the potential zoonotic pathogen diversity. In our analysis, we used a global dataset on mammalian host richness.

We then used a simple multiple logistic regression to assess the correlation between the historical risk of an EID and some key factors thought responsible for disease emergence, correcting for reporter bias with the dataset on JID authors. We addressed the first hypothesis by testing global human population density against EID risk and showed that this is a significant predictor of risk for each group of pathogens. This specifically shows that the risk of a disease emerging (not spreading) is dependent on human population density (i.e., those regions with dense human populations and presumably lots of human-driven changes are most likely to lead to a new EID).
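As an illustration of the general technique, the sketch below fits a multiple logistic regression to synthetic grid-cell data with three standardized predictors: human population density, mammalian host richness, and reporting effort (standing in for the JID author counts). It is not the analysis of Jones et al. (2008). The data are randomly generated, the "true" coefficients are invented, and the fitting routine is a plain gradient ascent written out so that the example needs only the standard library.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, learning_rate=0.5, epochs=500):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    rows: list of feature lists (no intercept column).
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(epochs):
        gradient = [0.0] * (n_features + 1)
        for features, label in zip(rows, labels):
            x = [1.0] + features
            error = label - sigmoid(sum(w * xi for w, xi in zip(weights, x)))
            for j, xj in enumerate(x):
                gradient[j] += error * xj
        weights = [w + learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights


# Synthetic grid cells. Coefficients are invented for illustration:
# intercept, population density, mammal richness, reporting effort.
random.seed(0)
TRUE_WEIGHTS = [-1.0, 1.5, 0.8, 1.0]
rows, labels = [], []
for _ in range(500):
    features = [random.gauss(0, 1) for _ in range(3)]
    p_event = sigmoid(TRUE_WEIGHTS[0]
                      + sum(w * x for w, x in zip(TRUE_WEIGHTS[1:], features)))
    rows.append(features)
    labels.append(1 if random.random() < p_event else 0)

fitted = fit_logistic(rows, labels)
names = ["intercept", "population density", "mammal richness", "reporting effort"]
for name, weight in zip(names, fitted):
    print(f"{name}: {weight:+.2f}")
```

Including reporting effort as a predictor means the population density coefficient describes the association with EID events after differences in how hard people are looking have been accounted for, which is the sense in which such a regression corrects for reporter bias.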

By plotting out our risk measures globally, we were able to produce the first ever global distribution maps of emerging disease risk, corrected for reporter bias, and based on correlated trends in EIDs. These predictive maps of EID “hotspots” show different global distribution patterns when we sorted EID events according to their origins (e.g., zoonotic diseases from wildlife; vector-borne pathogens; drug-resistant pathogens; Jones et al., 2008). For EIDs of wildlife origin (the high-profile zoonoses), these hotspots are primarily tropical areas where wildlife diversity is highest, and particularly where human density is also high, as occurs in southern Brazil, northern India and Bangladesh, and Southeast Asia (Figure 5-22). However, Europe and the United States also have significant potential for zoonotic disease emergence, due to continued, high-level environmental changes.

FIGURE 5-22. Global distribution of the relative risk of an EID event. Maps are derived for EID events caused by (a) zoonotic pathogens from wildlife; (b) zoonotic pathogens from nonwildlife; (c) drug-resistant pathogens; and (d) vector-borne pathogens. Green corresponds (more...)

However, perhaps one of the key findings of our analysis is that if we plot out the geographic distribution of all 17,000 JID authors, we find that the global effort for infectious disease research has largely focused on regions where the next EID is least likely to emerge. Indeed, few EID hotspots, located primarily in developing countries, are under thorough surveillance for infectious pathogens (Jones et al., 2008). We therefore concluded that global efforts to detect emerging infections should be slightly refocused to the Tropics if we are to intervene rapidly in this process of emergence.

Using Predictive Approaches: “Smart Surveillance”

Can we use this hotspot approach to increase our capacity for preventing the next EID? If we return to Nipah virus, we see that this emerging pathogen fits into the high-profile group of zoonoses that are lethal to humans and have emerged from wildlife in tropical regions. During the last decade, antibodies to this pathogen have been reported in bats across Southeast Asia, South Asia, Madagascar, China, and even continental Africa. But this knowledge has been gleaned through different groups working independently and often serendipitously. There has been no focused, global surveillance for viruses related to NiV in bats.

One of the highest-risk regions on the wildlife zoonotic disease hotspot map (Figure 5-22) is Bangladesh, where the human population has been subject to a series of repeated outbreaks of NiV with higher case-fatality rates than in Malaysia (average around 70 percent), evidence of foodborne infection, and evidence of up to five chains of human-to-human transmission. Bangladesh has the densest population of any country on Earth that is not an urban city-state: 2,595 people per square mile, as compared with a global average of 128 people per square mile (the United States has 80 people per square mile; http://www.worldatlas.com, 2006). The country also has surprisingly high wildlife diversity, given its population. Thus, it appears that in Bangladesh, Nipah virus is closer to stage three, or pandemic emergence. This raises important questions: Why were there no programs to identify NiV in Bangladesh once the virus was discovered in Malaysia? What other regions globally might harbor spillover of NiV or related viruses? What other zoonotic pathogens might be lurking in the South Asia hotspot within bats or other wildlife hosts?

I propose that a more efficient strategy to address future emerging diseases is to combine rigorous analyses of the fine-scale ecological and demographic changes within hotspot regions (the risk factors) with state-of-the-art molecular approaches to viral discovery. This will give us a more accurate predictive model for future disease emergence, and better definition of the size and diversity of the zoonotic pool. Techniques such as pyrosequencing and mass tag polymerase chain reaction (PCR) will rapidly decrease the expense and logistical challenges involved in identifying new viral groups, and if applied to key groups of wildlife species (those most often responsible for disease emergence in the past) within hotspot regions, will provide the most cost-effective way to proactively address the EID challenge. This model for virus-hunting in the future is, of course, still somewhat crude. It is impossible, for example, to determine the future ability of a novel virus to jump hosts successfully to humans, and its likely pathogenicity. However, by focusing first on viral groups known to be pathogenic, and by targeting viral discovery within these clades, significant progress can be made toward dealing with the EID threat.


The work described in this chapter was carried out by a large number of collaborators, including members of the Henipavirus Ecology Research Group (HERG),14 especially Jon Epstein (Consortium for Conservation Medicine) and Juliet Pulliam (Fogarty International Center). The work on West Nile virus and avian influenza was led by A. Marm Kilpatrick (Consortium for Conservation Medicine, University of California Santa Cruz) and the hotspots analyses were conducted in collaboration with Kate Jones (Institute of Zoology) and Marc A. Levy (Center for International Earth Science Information Network, Columbia). This work was supported in part by a National Institutes of Health/National Science Foundation “Ecology of Infectious Diseases” award from the John E. Fogarty International Center R01-TW00824, by core funding to the Consortium for Conservation Medicine from the V. Kann Rasmussen Foundation and is published in collaboration with the Australian Biosecurity Cooperative Research Center for Emerging Infectious Diseases (AB-CRC).


    Overview References

    1. IOM (Institute of Medicine). Emerging infections: microbial threats to health in the United States. Washington, DC: National Academy Press; 1992. [PubMed: 25121245]
    2. IOM (Institute of Medicine). Microbial threats to health: emergence, detection, and response. Washington, DC: The National Academies Press; 2003. [PubMed: 25057653]

    Morse References

    1. CDC (Centers for Disease Control and Prevention). HIV mortality (through 2005). 2008. [accessed February 10, 2009]. http://www​.cdc.gov/hiv​/topics/surveillance​/resources/slides/mortality/index.htm .
    2. Crichton M. The andromeda strain. New York: Knopf; 1969.
    3. IOM (Institute of Medicine). Emerging infections: microbial threats to health in the United States. Washington, DC: National Academy Press; 1992. [PubMed: 25121245]
    4. IOM (Institute of Medicine). Microbial threats to health: emergence, detection, and response. Washington, DC: The National Academies Press; 2003. [PubMed: 25057653]
    5. Johnson S. The ghost map. New York: Riverhead Books/Penguin; 2006.
    6. McNeill WH. Plagues and peoples. New York: Bantam; Doubleday Dell Publishing Group, Inc: 1976.
    7. Morens D, Folkers GK, Fauci AS. The challenge of emerging and reemerging infectious diseases. Nature. 2004;430(6996):242–249. [PMC free article: PMC7094993] [PubMed: 15241422]
    8. Neustadt RE, Fineberg HV. The epidemic that never was: policy-making and the swine flu scare. New York: Vintage Books; 1983.
    9. Peters CJ. Hantavirus pulmonary syndrome in the Americas Chapter 2. In: Scheld WM, Craig WA, Hughes JM, editors. Emerging infections. Vol. 2. Washington, DC: ASM Press; 1998.

    Woolhouse and Gaunt References

    1. Anderson RM, May RM. Infectious Disease of Humans Dynamics and Control. Oxford Scientific Press; Oxford, UK: 1991.
    2. Antia R, Regoes RR, Koella JC, Bergstrom CT. The role of evolution in the emergence of infectious diseases. Nature. 2003;426:658–661. [PMC free article: PMC7095141] [PubMed: 14668863]
    3. Arien KK, Vanham G, Arts EJ. Is HIV-1 evolving to a less virulent form in humans. Nat Rev Microbiol. 2007;5:141–151. [PMC free article: PMC7097722] [PubMed: 17203103]
    4. Barre-Sinoussi F, Chermann JC, Rey F, Nugeyre MT, Chamaret S, Gruest J, Dauguet C, Axler-Blin C, Vezinet-Brun F, Rouzioux C, Rozenbaum W, Montagnier L. Isolation of a T-lymphotropic retrovirus from a patient at risk for Acquired Immune Deficiency Syndrome (AIDS). Science. 1983;220:868–871. [PubMed: 6189183]
    5. Brosch R, Gordon SV, Marmiesse M, Brodin P, Buchrieser C, Eiglmeier K, Garnier T, Gutierrez C, Hewinson G, Kremer K, Parsons LM, Pym AS, Samper S, van Soolingen D, Cole ST. A new evolutionary scenario for the Mycobacterium tuberculosis complex. Proc. Natl. Acad. Sci. USA. 2002;99(6):3684–3689. [PMC free article: PMC122584] [PubMed: 11891304]
    6. Chai JY, Murrell KD, Lymbery AJ. Fish-borne parasitic zoonoses: status and issues. Int J Parasitol. 2005;35:1233–1254. [PubMed: 16143336]
    7. Chant K, Chan R, Dwyer DE, Kirkland P. Probable human infection with a newly described virus in the family Paramyxoviridae. Emerg Inf Dis. 1998;4:273–275. [PMC free article: PMC2640130] [PubMed: 9621198]
    8. Cleaveland S, Laurenson MK, Taylor LH. Diseases of humans and their domestic mammals: pathogen characteristics, host range and the risk of emergence. Philos Trans R Soc Lond B Biol Sci. 2001;356:991–999. [PMC free article: PMC1088494] [PubMed: 11516377]
    9. Cleaveland S, Meslin FX, Breiman R. Dogs can play useful roles as sentinel hosts for disease. Nature. 2006;440:605. [PubMed: 16572146]
    10. Diamond J. Evolution, consequences and future of plant and animal domestication. Nature. 2002;418:700–707. [PubMed: 12167878]
    11. Dykhuizen DE. Santa Rosalia revisited: why are there so many species of bacteria? Ant. v. Leeuwenhoek. 1998;73:25–33. [PubMed: 9602276]
    12. Ebert D. Experimental evolution of parasites. Science. 1998;282:1432–1436. [PubMed: 9822369]
    13. Ferguson NM, Cummings DA, Fraser C, Cajka JC, Cooley PC, Burke DS. Strategies for mitigating an influenza pandemic. Nature. 2006;442:448–452. [PMC free article: PMC7095311] [PubMed: 16642006]
    14. Haydon DT, Cleaveland S, Taylor LH, Laurenson MK. Identifying reservoirs of infection: a conceptual and practical challenge. Emerg Infect Dis. 2002;8:1468–1473. [PMC free article: PMC2738515] [PubMed: 12498665]
    15. Holmes EC, Rambaut A. Virus evolution and the emergence of SARS coronavirus. Phil Trans R Soc B Biol Sci. 2004;359:1059–1065. [PMC free article: PMC1693395] [PubMed: 15306390]
    16. Hubalek Z. Emerging human infectious diseases: anthroponoses, zoonoses, and sapronoses. Emerg Inf Dis. 2003;9:403–404. [PMC free article: PMC2958532] [PubMed: 12643844]
    17. IOM (Institute of Medicine). Microbial threats to health: emergence, detection, and response. National Academy Press; Washington, DC, USA: 2003. [PubMed: 25057653]
    18. Jansen VAA, Stollenwerk N, Jensen HJ, Ramsay ME, Edmunds WJ, Rhodes CJ. Measles outbreaks in a population with declining vaccine uptake. Science. 2003;301:804. [PubMed: 12907792]
    19. Keele BF, Van Heuverswyn F, Li Y, Bailes E, Takehisa J, Santiago ML, Bibollet-Ruche F, Chen Y, Wain LV, Liegeois F, Loul S, Mpoudi Ngole E, Bienvenue Y, Delaporte E, Brookfield JFY, Sharp PM, Shaw GM, Peeters M, Hahn BH. Chimpanzee reservoirs of pandemic and nonpandemic HIV-1. Science. 2006;313:523–526. [PMC free article: PMC2442710] [PubMed: 16728595]
    20. King DA, Peckham C, Waage JK, Brownlie J, Woolhouse MEJ. Infectious diseases: preparing for the future. Science. 2006;313:1392–1393. [PubMed: 16959992]
    21. Lázaro ME, Cantoni GE, Calanni LM, Resa AJ, Herrero ER, Iacono MA, Enria DA, González Cappa SM. Clusters of hantavirus infection, southern Argentina. Emerg Inf Dis. 2007;13:104–110. [PMC free article: PMC2725835] [PubMed: 17370522]
    22. Lipsitch M, Cohen T, Cooper B, Robins JM, Ma S, James L, Gopalakrishna G, Chew SK, Tan CC, Samore MH, Fisman D, Murray M. Transmission dynamics and control of severe acute respiratory syndrome. Science. 2003;300:1966–1970. [PMC free article: PMC2760158] [PubMed: 12766207]
    23. Lumio J, Hillbom M, Roine R, Ketonen L, Haltia M, Valle M, Neuvonen E, Lahdevirta J. Human rabies of bat origin in Europe. Lancet. 1986;1:378. [PubMed: 2868310]
    24. Matthews L, Woolhouse MEJ. New approaches to quantifying the spread of infection. Nat Rev Microbiol. 2005;3(7):529–536. [PMC free article: PMC7096817] [PubMed: 15995653]
    25. May RM, Gupta S, McLean AR. Infectious disease dynamics: What characterizes a successful invader? Phil Trans R Soc B Biol Sci. 2001;356:901–910. [PMC free article: PMC1088483] [PubMed: 11405937]
    26. Mermin J, Hutwagner L, Vugia D, Shallow S, Daily P, Bender J, Koehler J, Marcus R, Angulo FJ. Emerging Infections Program FoodNet Working Group. Reptiles, amphibians and human Salmonella infection: a population-based, case-control study. Clin Inf Dis. 2004;38:S253–261. [PubMed: 15095197]
    27. OSI (Office of Science and Innovation). Foresight. Infectious Diseases: Preparing for the Future. Office of Science and Innovation; London, UK: 2006.
    28. Palmarini M. A veterinary twist on pathogen biology. PLoS Path. 2007;3:e12. [PMC free article: PMC1803002] [PubMed: 17319740]
    29. Parrish CR, Kawaoka Y. The origins of new pandemic viruses: the acquisition of new host ranges by canine parvovirus and influenza A viruses. Ann Rev Microbiol. 2005;59:553–586. [PubMed: 16153179]
    30. Simmonds P. Reconstructing the origins of human hepatitis viruses. Phil Trans R Soc B Biol Sci. 2001;356:1013–1026. [PMC free article: PMC1088496] [PubMed: 11516379]
    31. Stohr K. A multicentre collaboration to investigate the cause of severe acute respiratory syndrome. Lancet. 2003;361:1730–1733. [PMC free article: PMC7119328] [PubMed: 12767752]
    32. Taylor LH, Latham SM, Woolhouse MEJ. Risk factors for human disease emergence. Philos Trans R Soc Lond B Biol Sci. 2001;356:983–989. [PMC free article: PMC1088493] [PubMed: 11516376]
    33. UNAIDS. AIDS Epidemic Update December 2006. UNAIDS/WHO; Geneva, Switzerland: 2007.
    34. Van Heuverswyn F, Li Y, Neel C, Bailes E, Keele BF, Liu W, Loul S, Butel C, Liegeois F, Bienvenue Y, Mpoudi Ngolle E, Sharp PM, Shaw GM, Delaporte E, Hahn BH, Peeters M. Human immunodeficiency viruses: SIV infection in wild gorillas. Nature. 2006;444:164. [PubMed: 17093443]
    35. Weiss RA. Animal origins of human infectious disease. Philos Trans R Soc Lond B Biol Sci. 2001;356:957–977. [PMC free article: PMC1088492] [PubMed: 11405946]
    36. Wells RM, Sosa Estani S, Yadon ZE, Enria D, Padula P, Pini N, Mills JN, Peters CJ, Segura EL. the Hantavirus Pulmonary Syndrome Study Group for Patagonia. An unusual hantavirus outbreak in southern Argentina: person-to-person transmission. Emerg Infect Dis. 1997;3:171–174. [PMC free article: PMC2627608] [PubMed: 9204298]
    37. Wilesmith JW. An epidemiologist’s view of bovine spongiform encephalopathy. Philos Trans R Soc Lond B Biol Sci. 1994;343:357–361. [PubMed: 8041802]
    38. Wolfe ND, Dunavan CP, Diamond J. Origins of major human infectious diseases. Nature. 2007;447:279–283. [PMC free article: PMC7095142] [PubMed: 17507975]
    39. Wolfe ND, Switzer WM, Carr JK, Bhullar VB, Shanmugam V, Tamoufe U, Prosser AT, Torimiro JN, Wright A, Mpoudi-Ngole E, McCutchan FE, Birx DL, Folks TM, Burke DS, Heneine W. Naturally acquired simian retrovirus infections in central African hunters. Lancet. 2004;363:932–937. [PubMed: 15043960]
    40. Woolhouse MEJ. Population biology of emerging and re-emerging pathogens. Trends Microbiol. 2002;10:S3–7. [PubMed: 12377561]
    41. Woolhouse MEJ, Gowtage-Sequeria S. Host range and emerging and reemerging pathogens. Emerg Infect Dis. 2005;11:1842–1847. [PMC free article: PMC3367654] [PubMed: 16485468]
    42. Woolhouse MEJ, Haydon DT, Antia R. Emerging pathogens: the epidemiology and evolution of species jumps. Trends in Ecology and Evolution. 2005;20:238–244. [PMC free article: PMC7119200] [PubMed: 16701375]
    43. Woolhouse MEJ, Taylor LH, Haydon DT. Population biology of multihost pathogens. Science. 2001b;292:1109–1112. [PubMed: 11352066]

    Eisen References

    1. Blattner FR, Plunkett 3rd G, Bloch CA, Perna NT, Burland V, Riley M, Collado-Vides J, Glasner JD, Rode CK, Mayhew GF, Gregor J, Davis NW, Kirkpatrick HA, Goeden MA, Rose DJ, Mau B, Shao Y. The complete genome sequence of Escherichia coli K-12. Science. 1997;277(5331):1453–1474. [PubMed: 9278503]
    2. Brown DP, Krishnamurthy N, Sjölander K. Automated protein subfamily identification and classification. PLoS Computational Biology. 2007;3(8):e160. [PMC free article: PMC1950344] [PubMed: 17708678]
    3. Chatterjee S, Almeida RP, Lindow S. Living in two worlds: the plant and insect lifestyles of Xylella fastidiosa. Annual Review of Phytopathology. 2008;46:243–271. [PubMed: 18422428]
    4. Dobzhansky T. Nothing in biology makes sense except in the light of evolution. The American Biology Teacher. 1973;35(March):125–129.
    5. Eichenberger P, Fujita M, Jensen ST, Conlon EM, Rudner DZ, Wang ST, Ferguson C, Haga K, Sato T, Liu JS, Losick R. The program of gene transcription for a single differentiating cell type during sporulation in Bacillus subtilis. PLoS Biology. 2004;2(10):e328. [PMC free article: PMC517825] [PubMed: 15383836]
    6. Eisen JA. Phylogenomics: improving functional predictions for uncharacterized genes by evolutionary analysis. Genome Research. 1998a;8(3):163–167. [PubMed: 9521918]
    7. Eisen JA. A phylogenomic study of the MutS family of proteins. Nucleic Acids Research. 1998b;26(18):4291–4300. [PMC free article: PMC147835] [PubMed: 9722651]
    8. Eisen JA. Assessing evolutionary relationships among microbes from whole-genome analysis. Current Opinion in Microbiology. 2000;3(5):475–480. [PubMed: 11050445]
    9. Eisen JA, Hanawalt PC. A phylogenomic study of DNA repair genes, proteins, and processes. Mutation Research. 1999;435(3):171–213. [PMC free article: PMC3158673] [PubMed: 10606811]
    10. Eisen JA, Wu M. Phylogenetic analysis and gene functional predictions: phylogenomics in action. Theoretical Population Biology. 2002;61(4):481–487. [PubMed: 12167367]
    11. Eisen JA, Sweder KS, Hanawalt PC. Evolution of the SNF2 family of proteins: subfamilies with distinct sequences and functions. Nucleic Acids Research. 1995;23(14):2715–2723. [PMC free article: PMC307096] [PubMed: 7651832]
    12. Eisen JA, Kaiser D, Myers RM. Gastrogenomic delights: a movable feast. Nature Medicine. 1997;3(10):1076–1078. [PMC free article: PMC3155951] [PubMed: 9334711]
    13. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM, McKenney K, Sutton G, FitzHugh W, Fields C, Gocayne JD, Scott J, Shirley R, Liu L, Glodek A, Kelley JM, Weidman JF, Phillips CA, Spriggs T, Hedblom E, Cotton MD, Utterback TR, Hanna MC, Nguyen DT, Saudek DM, Brandon RC, Fine LD, Fritchman JL, Fuhrmann JR, Geoghagen NSM, Gnehm CL, McDonald LA, Small KV, Fraser CM, Smith HO, Venter JC. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995;269(5223):496–512. [PubMed: 7542800]
    14. Fraser CM, Eisen JA, Nelson KE, Paulsen IT, Salzberg SL. The value of complete microbial genome sequencing (you get what you pay for). Journal of Bacteriology. 2002;184(23):6403–6405. [PMC free article: PMC135419] [PubMed: 12426324]
    15. Haft DH, Loftus BJ, Richardson DL, Yang F, Eisen JA, Paulsen IT, White O. TIGRFAMs: a protein family resource for the functional identification of proteins. Nucleic Acids Research. 2001;29(1):41–43. [PMC free article: PMC29844] [PubMed: 11125044]
    16. Heidelberg JF, Eisen JA, Nelson WC, Clayton RA, Gwinn ML, Dodson RJ, Haft DH, Hickey EK, Peterson JD, Umayam L, Gill SR, Nelson KE, Read TD, Tettelin H, Richardson D, Ermolaeva MD, Vamathevan J, Bass S, Qin H, Dragoi I, Sellers P, McDonald L, Utterback T, Fleishmann RD, Nierman WC, White O, Salzberg SL, Smith HO, Colwell RR, Mekalanos JJ, Venter JC, Fraser CM. DNA sequence of both chromosomes of the cholera pathogen Vibrio cholerae. Nature. 2000;406(6795):477–483. [PMC free article: PMC8288016] [PubMed: 10952301]
    17. Hugenholtz P. Exploring prokaryotic diversity in the genomic era. Genome Biology. 2002;3(2):REVIEWS0003. [PMC free article: PMC139013] [PubMed: 11864374]
    18. Jordan IK, Makarova KS, Spouge JL, Wolf YI, Koonin EV. Lineage-specific gene expansions in bacterial and archaeal genomes. Genome Research. 2001;11(4):555–565. [PMC free article: PMC311027] [PubMed: 11282971]
    19. Kang JM, Iovine NM, Blaser MJ. A paradigm for direct stress-induced mutation in prokaryotes. FASEB Journal. 2006;20(14):2476–2485. [PubMed: 17142797]
    20. LeClerc JE, Li B, Payne WL, Cebula TA. High mutation frequencies among Escherichia coli and Salmonella pathogens. Science. 1996;274(5290):1208–1211. [PubMed: 8895473]
    21. Lederberg J. Infectious disease as an evolutionary paradigm. Emerging Infectious Diseases. 1997;3(4):417. [PMC free article: PMC2640075] [PubMed: 9366592]
    22. Lederberg J. Emerging infections: an evolutionary perspective. Emerging Infectious Diseases. 1998;4(3):366. [PMC free article: PMC2640283] [PubMed: 9716947]
    23. Marcotte EM. Computational genetics: finding protein function by nonhomology methods. Current Opinion in Structural Biology. 2000;10(3):359–365. [PubMed: 10851184]
    24. Marshall B. Helicobacter pylori: 20 years on. Clinical Medicine. 2002;2(2):147–152. [PMC free article: PMC4952378] [PubMed: 11991099]
    25. McCutcheon JP, Moran NA. Parallel genomic evolution and metabolic interdependence in an ancient symbiosis. Proceedings of the National Academy of Sciences. 2007;104(49):19392–19397. [PMC free article: PMC2148300] [PubMed: 18048332]
    26. Moran NA, Dale C, Dunbar H, Smith WA, Ochman H. Intracellular symbionts of sharpshooters (Insecta: Hemiptera: Cicadellinae) form a distinct clade with a small genome. Environmental Microbiology. 2003;5(2):116–126. [PubMed: 12558594]
    27. Moran NA, Tran P, Gerardo NM. Symbiosis and insect diversification: an ancient symbiont of sap-feeding insects from the bacterial phylum Bacteroidetes. Applied and Environmental Microbiology. 2005;71(12):8802–8810. [PMC free article: PMC1317441] [PubMed: 16332876]
    28. Moran NA, McCutcheon JP, Nakabachi A. Genomics and evolution of heritable bacterial symbionts. Annual Review of Genetics. 2008;42:165–190. [PubMed: 18983256]
    29. Ohta T. Evolution of gene families. Gene. 2000;259(1–2):45–52. [PubMed: 11163960]
    30. Parkhill J, Wren BW, Mungall K, Ketley JM, Churcher C, Basham D, Chillingworth T, Davies RM, Feltwell T, Holroyd S, Jagels K, Karlyshev AV, Moule S, Pallen MJ, Penn CW, Quail MA, Rajandream MA, Rutherford KM, van Vliet AH, Whitehead S, Barrell BG. The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences. Nature. 2000;403(6770):665–668. [PubMed: 10688204]
    31. Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg D, Yeates TO. Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proceedings of the National Academy of Sciences. 1999;96(8):4285–4288. [PMC free article: PMC16324] [PubMed: 10200254]
    32. Pollard KS, Salama SR, King B, Kern AD, Dreszer T, Katzman S, Siepel A, Pedersen JS, Bejerano G, Baertsch R, Rosenbloom KR, Kent J, Haussler D. Forces shaping the fastest evolving regions in the human genome. PLoS Genetics. 2006;2(10):e168. [PMC free article: PMC1599772] [PubMed: 17040131]
    33. Silvaggi JM, Popham DL, Driks A, Eichenberger P, Losick R. Unmasking novel sporulation genes in Bacillus subtilis. Journal of Bacteriology. 2004;186(23):8089–8095. [PMC free article: PMC529092] [PubMed: 15547282]
    34. Sonnhammer EL, Eddy SR, Durbin R. Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins. 1997;28(3):405–420. [PubMed: 9223186]
    35. Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Research. 2000;28(1):33–36. [PMC free article: PMC102395] [PubMed: 10592175]
    36. Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA, Nelson K, Quackenbush J, Zhou L, Kirkness EF, Peterson S, Loftus B, Richardson D, Dodson R, Khalak HG, Glodek A, McKenney K, Fitzegerald LM, Lee N, Adams MD, Hickey EK, Berg DE, Gocayne JD, Utterback TR, Peterson JD, Kelley JM, Cotton MD, Weidman JM, Fujii C, Bowman C, Watthey L, Wallin E, Hayes WS, Borodovsky M, Karp PD, Smith HO, Fraser CM, Venter JC. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature. 1997;388(6642):539–547. [PubMed: 9252185]
    37. White O, Eisen JA, Heidelberg JF, Hickey EK, Peterson JD, Dodson RJ, Haft DH, Gwinn ML, Nelson WC, Richardson DL, Moffat KS, Qin H, Jiang L, Pamphile W, Crosby M, Shen M, Vamathevan JJ, Lam P, McDonald L, Utterback T, Zalewski C, Makarova KS, Aravind L, Daly MJ, Minton KW, Fleischmann RD, Ketchum KA, Nelson KE, Salzberg S, Smith HO, Venter JC, Fraser CM. Genome sequence of the radioresistant bacterium Deinococcus radiodurans R1. Science. 1999;286(5444):1571–1577. [PMC free article: PMC4147723] [PubMed: 10567266]
    38. Wu D, Daugherty SC, Van Aken SE, Pai GH, Watkins KL, Khouri H, Tallon LJ, Zaborsky JM, Dunbar HE, Tran PL, Moran NA, Eisen JA. Metabolic complementarity and genomics of the dual bacterial symbiosis of sharpshooters. PLoS Biology. 2006;4(6):e188. [PMC free article: PMC1472245] [PubMed: 16729848]
    39. Wu M, Ren Q, Durkin AS, Daugherty SC, Brinkac LM, Dodson RJ, Madupu R, Sullivan SA, Kolonay JF, Haft DH, Nelson WC, Tallon LJ, Jones KM, Ulrich LE, Gonzalez JM, Zhulin IB, Robb FT, Eisen JA. Life in hot carbon monoxide: the complete genome sequence of Carboxydothermus hydrogenoformans Z-2901. PLoS Genetics. 2005;1(5):e65. [PMC free article: PMC1287953] [PubMed: 16311624]
    40. Zmasek CM, Eddy SR. RIO: analyzing proteomes by automated phylogenomics using resampled inference of orthologs. BMC Bioinformatics. 2002;3:14. [PMC free article: PMC116988] [PubMed: 12028595]

    Daszak References

    1. Black FL. Measles endemicity in insular populations: critical community size and its evolutionary implication. Journal of Theoretical Biology. 1966;11(2):207–211. [PubMed: 5965486]
    2. Daszak P, Plowright R, Epstein JH, Pulliam J, Abdul Rahman S, Field HE, Jamaluddin A, Sharifah SH, Smith CS, Olival KJ, Luby S, Halpin K, Hyatt AD, Cunningham AA. Henipavirus Ecology Research Group (HERG). The emergence of Nipah and Hendra virus: pathogen dynamics across a wildlife-livestock-human continuum. In: Collinge RS, editor. Disease ecology: community structure and pathogen dynamics. Oxford, UK: Oxford University Press; 2006. pp. 186–201.
    3. Dobson AP, Carper ER. Infectious diseases and human population history. Bioscience. 1996;46(2):115–126.
    4. Epstein JH, Prakash V, Smith CS, Daszak P, McLaughlin AB, Meehan G, Field HE, Cunningham AA. Henipavirus infection in fruit bats (Pteropus giganteus), India. Emerging Infectious Diseases. 2008;14(8):1309–1311. [PMC free article: PMC2600370] [PubMed: 18680665]
    5. Fogarty R, Halpin K, Hyatt AD, Daszak P, Mungall BA. Henipavirus susceptibility to environmental variables. Virus Research. 2008;132(1–2):140–144. [PMC free article: PMC3610175] [PubMed: 18166242]
    6. Hufnagel L, Brockmann D, Geisel T. Forecast and control of epidemics in a globalized world. Proceedings of the National Academy of Sciences. 2004;101(42):15124–15129. [PMC free article: PMC524041] [PubMed: 15477600]
    7. Jones KE, Patel NG, Levy MA, Storeygard A, Balk D, Gittleman JL, Daszak P. Global trends in emerging infectious diseases. Nature. 2008;451(7181):990–993. [PMC free article: PMC5960580] [PubMed: 18288193]
    8. Kilpatrick AM, Gluzberg Y, Burgett J, Daszak P. Quantitative risk assessment of the pathways by which West Nile virus could reach Hawaii. Ecohealth. 2004;1(2):205–209.
    9. Kilpatrick AM, Chmura AA, Gibbons DW, Fleischer RC, Marra PP, Daszak P. Predicting the global spread of H5N1 avian influenza. Proceedings of the National Academy of Sciences. 2006a;103(51):19368–19373. [PMC free article: PMC1748232] [PubMed: 17158217]
    10. Kilpatrick AM, Daszak P, Goodman SJ, Rogg H, Kramer LD, Cedeno V, Cunningham AA. Predicting pathogen introduction: West Nile virus spread to Galapagos. Conservation Biology. 2006b;20(4):1224–1231. [PubMed: 16922238]
    11. Morse SS. Examining the origins of emerging viruses. In: Morse SS, editor. Emerging viruses. New York: Oxford University Press; 1993.
    12. Murphy FA. Emerging zoonoses. Emerging Infectious Diseases. 1998;4(3):429–435. [PMC free article: PMC2640289] [PubMed: 9716965]
    13. Peeters M, Courgnaud V, Abela B, Auzel P, Pourrut X, Bibollet-Ruche F, Loul S, Liegeois F, Butel C, Koulagna D, Mpoudi-Ngole E, Shaw GM, Hahn BH, Delaporte E. Risk to human health from a plethora of simian immunodeficiency viruses in primate bushmeat. Emerging Infectious Diseases. 2002;8(5):451–457. [PMC free article: PMC2732488] [PubMed: 11996677]
    14. Taylor LH, Latham SM, Woolhouse MEJ. Risk factors for human disease emergence. Philosophical Transactions of the Royal Society of London. Series B, Biological sciences. 2001;356(1411):983–989. [PMC free article: PMC1088493] [PubMed: 11516376]
    15. WorldAtlas.com. Countries of the world. 2006. [accessed December 18, 2008]. http://www.worldatlas.com/aatlas/populations/ctydensityh.htm.



Professor of epidemiology and founding director of the Center for Public Health Preparedness at the Mailman School of Public Health.


In 1993, four children died and hundreds became ill after eating undercooked hamburger patties contaminated by E. coli bacteria at Jack in the Box restaurants (see http://www.about-ecoli.com/ecoli_outbreaks/view/jack-in-the-box-e-coli-outbreak).


This article is reprinted with permission from Critical Reviews in Microbiology 33:231–242 (2007).


Centre for Infectious Diseases, University of Edinburgh, Edinburgh, United Kingdom.


University of California, Davis Genome Center; Department of Medical Microbiology and Immunology and Section of Evolution and Ecology, Davis, CA 95616; E-mail: jaeisen@ucdavis.edu; Website: http://phylogenomics.blogspot.com.


I note it will be good to sample from across viral diversity, although since there is no phylogenetic tree linking all viruses it is unclear exactly how to do this sampling.


Executive director.


Fertile Crescent: a historic region of the Middle East. A well-watered and fertile area, it arcs across the northern part of the Syrian desert. It is flanked on the west by the Mediterranean and on the east by the Euphrates and Tigris rivers, and includes all or parts of Israel, the West Bank, Jordan, Lebanon, Syria, and Iraq. From antiquity this region was the site of settlements and the scene of bloody raids and invasions (see http://www.encyclopedia.com/doc/1E1-FertileC.html).


Geological era dating from 1.8 million to 10,000 years ago.


Caused by humans.

Copyright © 2009, National Academy of Sciences.
Bookshelf ID: NBK45714

