BOX 2: Lessons Learned from the Human Genome Project: Comments from Francis Collins

A high-level planning process with broad input from the scientific community is crucial to setting ambitious but achievable goals.

A focus on completeness is important, even though this is extremely difficult when dealing with proteins. Completeness is what distinguishes proteomics from the study of individual proteins and from the fields of biochemistry and physiology. Without completeness as a goal of proteomics, much of the same research would be duplicated at a later time.

Technology must be developed and validated before attempting to scale up. Technology development includes the range of activities from proof of principle, to pilot projects, to scaling up, to high-throughput. The Human Genome Project sequenced model organisms and generated the necessary infrastructure prior to actually sequencing the human genome, which did not start until six years into the project and was initiated first with pilot projects.

Public availability of data and resources is absolutely critical if the benefits to the scientific community are going to be realized. The rapid release of pre-publication data was a key to the success of the Human Genome Project.

Interdisciplinary research needs to be fostered, including the participation of experts in automation, chemistry, and bioinformatics.

International participation and coordination are essential to bring the best minds to the problem, to avoid duplication, and to share costs.

Centralized databases that allow for integration and visualization of the data are an essential resource and are needed to transfer all these data into the hands of those who want to use them. They are expensive and need to be nurtured.

Public-private partnerships should be sought whenever feasible, especially for the generation of pre-competitive data sets. (Successful examples include the single nucleotide polymorphism consortium and mouse genomic sequencing.) Characteristics of successful public-private partnerships include a compelling scientific opportunity, pre-competitive data sets, simultaneous availability of data to all users, production facilities already in place, firm milestones and deliverables, affordability, and well-defined endpoints.


Defining the Mandate of Proteomics in the Post-Genomics Era: Workshop Report.
National Research Council (US).
Washington (DC): National Academies Press (US); 2002.
Copyright © 2002, National Academy of Sciences.
