6 The Institutional Landscape for Metagenomics: New Science, New Challenges

The Scientific Community

In new fields, such as metagenomics, the scientific societies are logical foci to build grassroots interest, implement knowledge exchange, and facilitate planning. The societies should organize events to build knowledge and foster communication in their fields and especially among other relevant disciplines. They can provide leadership to build consensus, communicate potential benefits to the public, and facilitate the establishment of leadership groups to foster the coordination and development of metagenomics on a broad scale, both nationally and internationally. One of the successes of the Arabidopsis genome project (see below) is that it was organized by the scientific community working in concert with the funding agencies. Metagenomics would be well served by using a similar organizational model.

Funding Agencies

In the United States, 12 federal agencies are members of the Microbe Project, an interagency working group formed in August 2000 under the aegis of the Subcommittee on Biotechnology of the National Science and Technology Council Committee on Science. The mission of the Microbe Project is “to maximize the opportunities offered by genome-enabled microbial science to benefit science and society, through coordinated interagency efforts to promote research, infrastructure development, education and outreach.”1 The 12 members of the Microbe Project are the Department of Agriculture, the Department of Defense, the Department of Energy, the Department of Homeland Security, the Department of the Interior (US Geological Survey), the Environmental Protection Agency, the Food and Drug Administration, the National Aeronautics and Space Administration, the National Institutes of Health, the National Institute of Standards and Technology, the National Oceanic and Atmospheric Administration, and the National Science Foundation. These 12 agencies, along with the Central Intelligence Agency and the Federal Bureau of Investigation, are the federal agencies that, because of their missions or responsibilities, have benefited from genome-enabled microbiology and would be expected to benefit further from the advances of metagenomics. The Microbe Project’s mission makes it ideally suited to convene the necessary working groups to advise on the specifics of the infrastructure needed to enable the science of metagenomics and to articulate a plan that coordinates responsibilities and funding to maximize efficiencies and capture the expected synergies in this new field. Several of the agencies have already funded metagenomics projects, some of which have become models that reveal the promise of the field.
Each of the 14 agencies mentioned has its own missions and interests, but much synergy is to be gained by pooling common infrastructure needs, and this is a strong motivator for a well-coordinated effort at the federal level. The Microbe Project should coordinate its work with the scientific societies to involve the scientific community in the development of the field.

Other organizations are and will be interested in funding metagenomics, including foundations with national or international interests, the private sector, and some state agencies. Large projects can have several partners that contribute on large or small scales for targeted components or for general core funding. It is possible to imagine metagenomics projects supported by several countries, funding agencies, and private foundations. Mechanisms will have to be worked out to ensure proper representation and credit without hindering the general goal of work for the public good.

International Coordination

The large-scale nature of metagenomics and the international interest in the field suggest that there will be interest in and value to be derived from international coordination from the beginning. Some metagenomics projects are under way in the European Community, Canada, China, Brazil, Singapore, South Korea, and Japan. Many of the projects have interest in similar but not identical habitats or focal questions. All, however, could benefit from some common infrastructure—most notably metagenomics databases and new analysis tools but also new sampling strategies and data standards, to name the most obvious.

As a first step in addressing how human metagenomics studies might be approached on an international scale, a panel of 75 participants (scientists, physicians, industry representatives, and administrators from funding agencies) from Asia, the Americas, and Europe met in Paris in October 2005 to discuss the feasibility of sequencing the human intestinal metagenome, its importance for human health and industry, possible technical approaches, and possible funding scenarios.2 The meeting generated a framework for an International Human Gut Metagenome Initiative, including recommendations to generate reference genome sequence data from approximately 1000 gut bacterial species that can be cultured, to develop techniques for sequencing microorganisms that cannot be cultured, and to classify genes of the microbial community based on metagenomic sequencing. Since this meeting in the fall of 2005, a trans-institute NIH committee has been assembled to discuss in more detail its participation in an international human metagenome project. The recent call for proposals under the European Union 7th Framework Programme includes the characterization and variability of the microbial communities in the human body as one of its areas of focus.

International coordination would help to ensure greater efficiency and less duplication of effort, but it should not restrict creativity or the national interests of any country. Besides helping to plan and develop common infrastructure, international coordination would ensure wide communication of ongoing projects and results so that new projects were not undertaken without knowledge of the global landscape. Furthermore, if a few major metagenomics projects are to be undertaken comprehensively and in great depth, they will be more successful if the breadth and resources of the international science and engineering communities are exploited.

The initiation of international coordination is best left to the interested scientific communities—particularly interested scientists and their societies—in communication with national funding agencies. As noted above, the organizational model of the Arabidopsis project is useful.


Metagenomics will draw on expertise from many disciplines and individuals:

  • Those with knowledge of microbiology, including microbial genetics, biochemistry, physiology, pathology, systematics, ecology, and evolution.
  • Other biologists, including molecular and cellular biologists and those with knowledge of host organisms, such as humans and other mammals, plants, insects, and microbial hosts with important roles in nature or of economic importance.
  • Those with knowledge of the environment, including soil and atmosphere scientists, geologists, oceanographers, hydrologists, and ecosystem scientists.
  • Computational scientists, including those with knowledge of statistics, computer science, data mining and visualization, database development, modeling, and applied mathematics.
  • Those with expertise in scaling information to large ecosystems, and in evaluating the effects of global change and its interface with policy.
  • Engineers, physical scientists, and chemists whose skills and insights are potentially field-transforming in their contribution to new methods, chemistry, devices and applications (within and beyond metagenomics), and the understanding of complexity, networks, and system structure.

Metagenomics as defined here is much more than DNA sequences and engages all the “omics” and a broader, microbial-community-based systems biology. To reach the understanding that metagenomics will make possible, new education and training programs will be needed. Experts in a broad array of fields must be integrated into metagenomics projects and provided with appropriate cross-disciplinary knowledge so that their specific expertise can be used to full effect and their contributions disseminated to the wider community.

As mentioned in Chapters 4 and 5, metagenomics probably will require proportionally more contributions from computational and bioinformatics scientists than any other field of biology. Hence, it is imperative that this workforce requirement be addressed immediately. It is not easy to identify computational scientists or biologists who have both the interest in and the talent for the kind of cross-training that metagenomics projects will require. We recommend establishing several types of training programs to encourage scientists to develop the needed skills. Several mechanisms have been successful in providing cross-discipline training: interdisciplinary training to augment traditional graduate programs, summer courses patterned after the Cold Spring Harbor or Marine Biological Laboratory summer courses, and post-doctoral fellowship programs in which fellows undertake training in new disciplines. Support for faculty to attend metagenomics workshops or to spend sabbaticals in metagenomics research laboratories or facilities would also be beneficial in expanding appropriate training environments.

As described earlier, although metagenomics has similarities to genomics as currently practiced, it also has important differences in the types of data and in questions to be asked, so it is important to recognize that the components and expectations of current genomics training programs will not suffice for metagenomics.


Data Release

The rapid release of sequence information has been an important and sometimes contentious issue in genome-sequencing projects. Proponents of rapid release of data cite the relatively long timeframe of sequencing projects and the ability to derive important information even from incomplete data. Opponents of rapid release emphasize the need of those doing the sequencing to have time to analyze and publish the results of their own work before others have publication opportunities. Intellectual-property issues also arise; rapid release of information into the public domain may bar the opportunity to obtain some types of intellectual-property rights.

Data release was a contentious matter in the early days of large-scale sequencing projects. Two meetings, one in Bermuda in February 1996, and one in Fort Lauderdale, FL, in January 2003, grappled with the issues and published recommendations to the community,3 which were adopted by the major funding agencies, including NIH.4 At Fort Lauderdale, projects that were funded as community resources were specifically defined: “A ‘community resource project’ is a research project specifically devised and implemented to create a set of data, reagents or other material whose primary utility will be as a resource for the broad scientific community.” Data from such projects should be released immediately for free and unrestricted use by the community. Obligations, however, were imposed on the users of such data, with respect to recognizing the data providers’ legitimate interests in publishing and analyzing the data, and in acknowledging the data providers as the source of the data. The Committee’s view is that these policies have served the community well and should be explicitly adhered to by metagenomics researchers.

The Fort Lauderdale Agreement recognizes that these policies may not necessarily be appropriate for projects funded by grants to individual investigators, where providing a community resource is not the primary goal. Recognizing that most of the major funding agencies now have data access policies in place,5 we express the view that even single-investigator projects should be expected to practice release after a specified time period (e.g., 6 months).

Intellectual Property

Many companies in several markets tap into the value found in natural resources, and metagenomics constitutes a new way to access natural resources. Advances in DNA and expression technologies provide opportunities to overcome supply issues that in the past limited the value of natural products. The more advanced the technologies become, the more value will be derived and the less destructive sampling of biological materials will become.

The pharmaceutical market is especially large and hence illustrates the potential for intellectual property. Global revenues of this market in 2004 were over US$500 billion; sales in North America, Europe, and Japan made up about 80% of the total. It is estimated that 62% of oncology drugs are derived in some way from natural products (Newman et al. 2003).

Global revenues in industrial, agricultural and healthcare biotechnology in 2004 were $54.6 billion; the United States dominated with 78% of global revenue. Products in this market include enzymes for the textile, detergent, food and feed, and personal-care industries. Many small companies are in the enzyme market, but it is dominated by such large companies as Novozymes and Danisco, which have programs to identify new products by sampling microbes in the environment.

Key patents in metagenomics may affect the ability of researchers to practice some methods of metagenomics. US patents have been issued that claim methods of isolating DNA directly from a mixed population of organisms. Some who use metagenomics methods without a license may be found to infringe these patents. The uncertainty of the situation poses additional risks to any who seek to commercialize findings arising from metagenomics studies. Patent issues are also associated with bioprospecting (collecting biological material) outside the United States. The Convention on Biological Diversity (see next section) requirement for benefit-sharing poses some threats to the intellectual-property rights of those who wish to commercialize findings from metagenomics studies of samples from outside the United States. Patent issues associated with the convention also may influence full disclosure and information release. New patent legislation in and outside the United States may require statements of the country of origin for compositions derived from biological sources.

The ownership of genetic resources outside national jurisdictions is uncertain. Collection of samples in areas outside national borders—for example, in deep-sea vents beyond national jurisdictions—is unregulated by international policies. International organizations increasingly recognize the need for such policies (Arico and Salpin 2005).

Metagenomics and the Convention on Biological Diversity

The collection of samples within national borders is now guided by the Convention on Biological Diversity (CBD). Metagenomics studies rely on sample collection, and it will be important that researchers comply with the CBD to prevent charges of “biopiracy.”

The CBD is an international treaty that was adopted at the 1992 United Nations Conference on Environment and Development in Rio de Janeiro, Brazil. The three stated goals of the CBD are the conservation of biological diversity, the sustainable use of biological components, and the fair and equitable sharing of benefits arising from use of genetic resources (US Senate Committee on Foreign Relations 1994). The main points of the treaty establish the sovereign rights of states to their own natural resources and make access to the biological resources subject to national rules and legislations. The treaty imposes expectations of access by prior informed consent and of nations’ fair and equitable sharing in the benefits of commercial exploitation of their biological resources. Each party nation is expected to establish legislation and policies regarding access and benefit-sharing (ABS). Adequate protection of intellectual property rights is granted, but the treaty also creates expectations that developing nations will be granted access to technology arising from the use of their biological resources, and this can lead to challenges to the rights typically granted by patents. In response to perceived commercial risks and uncertainties, the Biotechnology Industry Organization has created guidelines for bioprospecting.6 The 1992 Rio “Earth Summit” resulted in over 150 governments signing the CBD, including the United States. More than 187 countries (not including the United States) later ratified the agreement, providing global support and acceptance of the treaty. (In 1994, the US Senate Committee on Foreign Relations approved the treaty, but the full Senate failed to ratify it.)

The UN, which administers the treaty, continues to discuss issues raised by the CBD. A CBD working group on ABS created a December 2005 report that assessed the impact of the CBD policies on the commercial use of biodiversity. The report highlights the importance of the diversity of microbes and various pharmaceutical and biotechnology companies’ interest in it. The December 2005 CBD report points out that recent metagenomics studies pose “a host of new questions and challenges with regard to access and benefit-sharing, in particular relating to the sovereignty of microbes and the difficulties of ascribing ownership” (Laird and Wynberg 2005). The report confirms that the CBD has changed practices in corporations seeking to exploit biological diversity: “larger or socially responsible companies do not generally consider genetic resources freely available.” The difficulties in negotiations between commercial entities and nations are highlighted in the report. Tensions surround the differences between value expectations made by diversity providers and companies’ valuations based on commercialization costs. The report also describes regulatory confusion and uncertain policies that hinder the commercial exploitation of biological resources (Laird and Wynberg 2005).

Differences in perspectives regarding intellectual-property protection and rights are often at the center of discussions of ABS. Some nations are increasingly trying to introduce “disclosure of origin” as part of the patent-application process. In the short term, metagenomics has the potential to tap into substantial microbial diversity without venturing abroad. However, as research expands, the unresolved issues raised by the CBD will probably influence metagenomics research. It may be prudent for funding agencies to establish formal sections of proposals in which investigators need to specify how they will comply with the CBD when sampling outside US borders. This would increase awareness of CBD issues and help to protect the agencies against international disputes related to funded research. It should be remembered by all parties that there can be considerable mutual benefit to science and education in well-structured, collaborative, international metagenomics projects.


Metagenomics projects will require the sequencing of DNA arising from unknown organisms with unknown potential for causing human, plant, and animal disease. In that respect, metagenomics projects are much like traditional microbiology, tapping into the unknown microbial diversity in various environments. However, in contrast with traditional microbiology, the end result is DNA clones or DNA sequences, rather than living microbes. Metagenomics projects would thus appear to have fewer biosafety issues than traditional microbiology.

Biological safety level (BSL) guidelines for undertaking traditional microbiological and recombinant DNA studies are relatively clear. Because of the potential for cloning genes that might have human health consequences, it would be prudent to undertake metagenomics studies in a BSL2 safety environment whenever pathogenic organisms might be present in significant numbers. There may be some circumstances in which metagenomics studies would be best performed under additional safety standards, such as cloning from an environment that might harbor virulent pathogens, but those circumstances are expected to be uncommon. If the source DNA is considered according to the probability of recovering virulence genes, existing biosafety guidelines appear to be suitable for metagenomics projects.


Metagenomics is the kind of accessible and expansive science that can capture the public’s imagination. Metagenomics research provides a special opportunity to teach microbiology to the public and train a new generation of scientists to be sophisticated and effective scientific communicators who can bring the thrill of discovery to the public.