NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Strachan T, Read AP. Human Molecular Genetics. 2nd edition. New York: Wiley-Liss; 1999.


Chapter 13. Genome projects

The great discoveries cataloged in our history have mostly been achieved by exploring our environment, whether it be Eratosthenes calculating the Earth's circumference, Columbus sailing across the Atlantic, Galileo peering at the moons of Jupiter, or Darwin studying finches in the Galapagos. After many centuries, we have built up an approximate understanding of our external universe, but the universe within us has only very recently been the subject of serious study. The application of microscopy to the study of cells and subcellular structures provided one major route into this world, to be followed by pioneering advances in biochemistry and then molecular biology. Now, as we enter the next millennium, we are on the threshold of a truly momentous achievement that will have enormous implications for the future. For the first time, we will know our genetic endowment - the sequence of our DNA. Then our voyage into the universe within really will have begun. Sequencing our DNA will be just the beginning of a huge effort to understand exactly how this sequence can specify a person, and how the DNA of other organisms is related to us and to their biologies.

Because of the scale of the effort, the endeavor to sequence our DNA represents biology's first ‘big project’. The Human Genome Project is a truly international effort, the primary aim of which is to deliver the complete nucleotide sequence of our DNA. The human mtDNA sequence was established in 1981 (Section 7.1.1), so the genome in question is the nuclear genome. The project has been paralleled by many other genome projects which seek to sequence the DNA of a variety of model organisms. Some genome projects have already been completed; others, including the Human Genome Project, are nearing completion. Added to this are ancillary studies. Some are investigating the extent of sequence variation within the human genome. Others are examining ethical and legal implications of the new knowledge that will be obtained and its likely impact on society.

13.1. The history, organization, goals and value of the Human Genome Project

13.1.1. The Human Genome Project was conceived out of the need for a large-scale project to develop new mutation detection methods

A workshop held in Alta, Utah, in December 1984 was a major catalyst in the development of the Human Genome Project. Sponsored partially by the US Department of Energy (DOE), the workshop was intended to evaluate the current state of mutation detection and characterization and to project future directions for technologies to address current technical limitations. The growing roles of novel DNA technologies were discussed, notably the emerging gene cloning and sequencing technologies. Although such technologies had been in operation for about a decade, the efforts of individual laboratories to try to clone and characterize one gene at a time were considered to be wasteful of scientists' time and research resources. Because of the perceived technical obstacles, a principal conclusion was that methods were incapable of measuring mutations with sufficient sensitivity unless an enormously large, complex and expensive program was undertaken. A subsequent report on Technologies for Detecting Heritable Mutations in Human Beings sparked the idea for a dedicated human genome project by the DOE, and in March 1986 it sponsored an international meeting in Santa Fe, New Mexico, to assess the desirability and feasibility of ordering and sequencing DNA clones representing the entire human genome. Virtually all participants concluded that such a project was feasible and would be an outstanding achievement in biology.

After extensive discussions with the US scientific community, the DOE responded to the Santa Fe meeting by issuing a Report on the Human Genome Initiative in the spring of 1987. Three major objectives were to be implemented: the generation of refined physical maps of human chromosomes, the development of support technologies and facilities for human genome research, and expansion of communication networks and of computational and database capacities. As implementation of this program began with a small number of pilot projects, other US organizations initiated their own studies of policy and strategy. In 1988, two additional widely circulated reports from the US Office of Technology Assessment and National Research Council appeared, and the US National Institutes of Health (NIH) set up an Office of Human Genome Research (later renamed the National Center for Human Genome Research) to coordinate NIH genome activities in cooperation with other US organizations. In the same year, the US Congress officially gave approval to a 15-year US human genome project commencing in 1991. The required funding was estimated to be about $3 billion.

13.1.2. Organization and goals of the Human Genome Project

Organization of the Human Genome Project

The US Human Genome Project remains the major contributor to international research in this area, but several other countries quickly developed their own Human Genome Projects. Centers in the UK and France have made major contributions and large programs are also underway in some other countries, notably Germany and Japan. In order to coordinate the different national efforts, the Human Genome Organization (HUGO) was established in 1988 with a remit of facilitating exchange of research resources, encouraging public debate and advising on the implications of human genome research (McKusick, 1989). Currently, much of the effort that is going into the Human Genome Project is concentrated in a few very large genome centers, but interacting with them is a worldwide network of small laboratories, mostly attempting to map and identify disease genes (Figure 13.1).

Figure 13.1. Major scientific strategies and approaches being used in the Human Genome Project. The major scientific thrust of the Human Genome Project begins with the isolation of human genomic and cDNA clones (by cell-based cloning or PCR-based cloning). These are ...

Communication between the network of genome centers and interacting laboratories is very largely electronic. This has evolved because of the need to manage and store the huge amount of data being produced. The data are entered into large electronic databases which, at least for publicly funded mapping and sequencing efforts, are freely accessible through the Internet. Analyses can then be conducted from remote computer terminals throughout the world. Depending on the source of input data, there are two types of database:

  • Central repositories for storing globally produced mapping and sequence data (Table 13.1). Universal DNA and protein sequence databases were established decades before the onset of the genome projects, but specialized species-specific mapping databases were established comparatively recently. For example, the Genome Database (GDB) is a mapping database which is aimed specifically at storing human data (note that there is a specific nomenclature for naming DNA segments and genes in different species - see Further reading and Box 13.1 for the human nomenclature).
  • Databases for storing locally produced data. In order to improve their own efficiency, the big genome mapping and sequencing centers have stored data produced in their own laboratories in dedicated databases. Unlike data input, data access is freely available through the Web from publicly funded genome centers.

Table 13.1. Examples of databases relevant to the human genome project which store globally produced data.

Box 13.1. Human gene and DNA segment nomenclature. The nomenclature used is decided by the HUGO nomenclature committee. Genes and pseudogenes are allocated symbols of usually two to six characters; a final P indicates a pseudogene. For anonymous DNA sequences, ...


The major rationale of the Human Genome Project is to acquire fundamental information concerning our genetic make-up which will further our basic scientific understanding of human genetics and of the role of various genes in health and disease. As a first step towards achieving this, high-resolution genetic maps were constructed and used as a framework for constructing high-resolution physical maps, culminating in the ultimate physical map - the complete sequence of the human genome. The first major task, high-resolution genetic maps, was achieved by 1994. By the time of writing (May 1999), large clone contigs had been assembled for much of the genome and large-scale DNA sequencing was very much underway. In short, rapid progress has been made and the goals that had been set for 1998 have been achieved or surpassed. Recently, goals for the next 5 years have been established with an anticipated completion date for human genome sequencing by 2003 (Table 13.2; see Collins et al., 1998).

Table 13.2. Goals for the US Human Genome Project (1998-2003) referenced against previous goals and achievements.

13.1.3. The medical and scientific benefits of the Human Genome Project are expected to be enormous

For many human biologists and geneticists, the Human Genome Project represents an exciting, historic mission. Since its outset, the project has been especially justified by the expected medical benefits of knowing the structure of each human gene. Inevitably, this information will provide more comprehensive prenatal and presymptomatic diagnoses of disorders in individuals judged to be at risk of carrying a disease gene. The information on gene structure will also be used to explore how individual genes function and how they are regulated. Such information will provide sorely needed explanations for biological processes in humans. It would also be expected to provide a framework for developing new therapies for diseases, in addition to simple gene therapy approaches. More importantly, as mutation screening techniques develop, an expected benefit would be to alter radically the current approach to medical care, from one of treating advanced disease to preventing disease based on the identification of individual risk (Cantor, 1998).

Exciting though such possibilities are, there may be unexpected difficulties in understanding precisely and comprehensively how some genes function and are regulated (cautionary precedents are the lack of progress in predicting protein structure from the amino acid sequence, and the imperfect understanding of the precise ways in which the regulation of globin gene expression is coordinated, decades after the relevant sequences have been obtained). In addition, the single-gene disorders which should be the easiest targets for developing novel therapies are very rare; the most common disorders are multifactorial. Hence, although the data collected in the Human Genome Project will inevitably be of medical value, some of the most important medical applications may take some time to be developed.

13.2. Genetic and physical mapping of the human genome

13.2.1. The first comprehensive human genetic map was published only in 1987, but within 7 years very high resolution maps were achieved, mostly using microsatellite markers

Human genetic maps could not easily be constructed using classical genetic mapping

Classical genetic maps for experimental organisms such as Drosophila and mouse are based on genes. They have been available for decades, and have been refined continuously. They are constructed by crossing different mutants in order to determine whether the two gene loci are linked or not. For much of this period, human geneticists were envious spectators, because the idea of constructing a human genetic map was generally considered unattainable. Unlike the experimental organisms, the human genetic map was never going to be based on genes because the frequency of mating between two individuals suffering from different genetic disorders is extremely small.

The only way forward for a human genetic map was to base it on polymorphic markers which were not necessarily related to disease or to genes. As long as the markers showed mendelian segregation and were polymorphic enough so that recombinants could be scored in a reasonable percentage of meioses, a human genetic map could be obtained. The problem here was that, until recently, suitably polymorphic markers were just not available. Classical human genetic markers consisted of protein polymorphisms, notably blood group and serum protein markers, which are both rare and not very informative (see Box 11.1). By 1981, only very partial human linkage maps had been obtained, and then only in the case of a few chromosomes.

The identification of DNA-based polymorphisms transformed human genetic mapping

Unlike classical markers, DNA-based polymorphisms were not simply confined to the 3% of the DNA that was expressed (genes): they were also available in noncoding DNA. Since the latter was not so strongly conserved in evolution, changes in the DNA were comparatively frequent. The realization that DNA polymorphisms could be abundant called for a radical revision of thinking, and the early 1980s saw serious discussion of the possibility of constructing a complete human genetic map for the first time (see Botstein et al., 1980).

The first comprehensive human genetic map was based on RFLPs

The desirability of a complete linkage map of the human genome was clear. In addition to providing a framework for studying the nature of recombination in humans, it would permit rapid gene localization, assist gene cloning, and facilitate genetic diagnosis. Almost inevitably, the realization that a comprehensive human genetic map was now attainable sparked serious efforts to construct one. In 1987, after a huge effort, the first such map was published based on the use of 403 polymorphic loci, including 393 RFLP markers (Donis-Keller et al., 1987). Although this achievement was important, there remained some serious drawbacks with the map: the average spacing between the markers (>10 cM) was still considerable, and, more significantly, RFLP markers are not very informative and are difficult to type (see Box 11.1).

High-resolution human genetic maps were obtained largely through the use of microsatellite markers

Hypervariable minisatellite polymorphisms are highly polymorphic, but their applicability to genome-wide maps is limited because they are mostly restricted to chromosomal regions near the telomeres. Microsatellite markers (also described as short tandem repeat polymorphisms, or STRPs) have the advantage of being abundant, dispersed throughout the genome, highly informative and easy to type (see Box 11.1). By focusing on this type of marker, researchers at the Généthon laboratory in France were quickly able to provide a second-generation linkage map of the human genome (Weissenbach et al., 1992). Subsequently, maps have been produced with ever increasing numbers of genetic markers, especially microsatellite markers, and ever increasing resolution. Within a further two years, a genetic map with 1 cM resolution had been achieved (Murray et al., 1994). After this time the major effort switched to the construction of high-resolution physical maps.

13.2.2. Different physical maps of the human genome have been constructed, but clone contig maps are the primary templates for DNA sequencing

The variety of physical maps of human DNA

Like the genetic map, a physical map of the human genome will consist of 24 maps, one for each chromosome. The different genetic maps of the human genome that have been assembled so far all represent the same concept - sets of linked polymorphic markers (linkage groups) corresponding to different chromosomes. By contrast, a variety of different types of physical map are possible (Table 13.3, Figure 13.2). The first physical map of the human genome was obtained about 30 years ago when cytogenetic banding techniques were used not only to distinguish the different chromosomes, but also to provide discrimination of different subchromosomal regions (see the human karyogram in Figure 2.17). Although the resolution is coarse (an average sized chromosome band in a 550-band preparation contains ~6 Mb of DNA), it has been very useful as a framework for ordering the locations of human DNA sequences by chromosome in situ hybridization techniques.

Table 13.3. Different types of physical map can be used to map the human nuclear genome.

Figure 13.2. Several types of physical map are being constructed for human chromosomes. The figure shows integration of several physical maps for the long arm of human chromosome 21. Next to the standard cytogenetic map on the left are the positions of chromosome ...

Other maps have been obtained by mapping natural chromosome breakpoints (using translocation and deletion hybrids; see Section 10.1.2), or by mapping artificial chromosome breakpoints using radiation hybrids (Section 10.1.3), but the resolution achieved can be quite limited. Such maps have, however, been useful frameworks for mapping genes (transcript maps; see below). Large-scale restriction maps have also been generated, such as the NotI restriction map of 21q (Ichikawa et al., 1993; Figure 13.2). However, the most important maps are clone contig maps because these are the immediate templates for DNA sequencing.

YAC clone contig maps: a first-generation physical map of the human genome

A complete clone contig map of a chromosome would comprise all the DNA without any gaps (contig originated as a shortened form of the word contiguous; Section 10.3). Because of their large inserts, yeast artificial chromosome (YAC) clones have been particularly useful in generating first-generation physical maps of human chromosomes. Different methods of identifying overlaps between clones have been used, but STS markers (both polymorphic and nonpolymorphic), which had previously been mapped to the chromosome of interest, have been particularly useful. Significant contig maps for individual human chromosomes were first reported in 1992 for chromosome 21 and the Y chromosome and, subsequently, a first-generation clone contig map of the human genome was reported by workers at the CEPH lab in Paris (Cohen et al., 1993). An updated YAC contig map, covering perhaps 75% of the human genome and consisting of 225 contigs with an average size of 10 Mb, was subsequently published by the same group (Chumakov et al., 1995). While these physical maps were recognized to be far from complete, this was an outstanding achievement and provided a good framework for the scientific community to build upon in order to produce future detailed maps of all the chromosomes. Complementing this approach, good STS-based physical maps of the human genome have been developed, such as the one constructed at the Whitehead Institute in Massachusetts (Hudson et al., 1995). These have been achieved in part by mapping STSs against panels of whole-genome radiation hybrids.
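
The use of shared STS markers to detect clone overlaps can be illustrated with a minimal sketch in Python. It assumes the simplest possible rule, namely that any two clones positive for the same STS overlap, and groups clones into contigs accordingly. The clone and STS names are hypothetical, and the function build_contigs is an illustration only, not the procedure used by any of the mapping groups cited above.

```python
from collections import defaultdict


def build_contigs(clone_sts):
    """Group clones into contigs; clones sharing at least one STS are joined."""
    # Each clone starts as the representative of its own contig.
    parent = {clone: clone for clone in clone_sts}

    def find(clone):
        # Follow parent links to the contig representative, shortening the path.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    # Index clones by the STSs for which they tested positive.
    sts_to_clones = defaultdict(list)
    for clone, sts_set in clone_sts.items():
        for sts in sts_set:
            sts_to_clones[sts].append(clone)

    # Clones positive for the same STS are assumed to overlap, so merge them.
    for clones in sts_to_clones.values():
        first = find(clones[0])
        for other in clones[1:]:
            parent[find(other)] = first

    contigs = defaultdict(set)
    for clone in clone_sts:
        contigs[find(clone)].add(clone)
    return sorted(sorted(members) for members in contigs.values())


# Hypothetical STS typing results for six YAC clones.
typing_results = {
    "yA": {"sts1", "sts2"},
    "yB": {"sts2", "sts3"},
    "yC": {"sts3"},
    "yD": {"sts7", "sts8"},
    "yE": {"sts8"},
    "yF": {"sts9"},
}

print(build_contigs(typing_results))
# [['yA', 'yB', 'yC'], ['yD', 'yE'], ['yF']]
```

In this toy data set the six clones fall into three contigs. Under this rule a chimeric clone carrying STSs from two unrelated regions would falsely merge two contigs, so a real map would need independent confirmation of each overlap.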

BAC/PAC clone contig maps: the major templates for DNA sequencing

The utility of YAC contig maps is limited because YAC inserts are often not faithful representations of the original starting DNA; many YAC clones are chimeric or have internal deletions (see Section 10.3.5). As a result, second-generation clone contig maps have relied on bacterial artificial chromosomes (BACs) and P1 artificial chromosomes (PACs). Although the insert sizes of these clones (typically 70–250 kb) are much smaller than that of YACs, this disadvantage is more than outweighed by their greater stability, making them more faithful representations of the original DNA. Recently, the large genome centers have focused greatly on constructing large BAC contigs as a prelude to large-scale DNA sequencing.

13.2.3. An early priority in the Human Genome Project was the construction of gene (transcript) maps

Coding-DNA sequencing or whole-genome sequencing?

At the outset of the Human Genome Project there was much debate over whether to go for an all-out assault (indiscriminate sequencing of all 3 billion bases), or whether to focus initially just on the coding-DNA sequences. The average coding DNA of a human gene is about 1.7 kb, but human genes occur, on average, once every 40–50 kb of DNA. As a result, coding DNA accounts for a mere 3% of the human genome (unlike Saccharomyces cerevisiae and Caenorhabditis elegans, where the gene density is much higher - 1 per 2 kb and 1 per 5 kb, respectively). To obtain coding-DNA sequences, the easiest approach would be to make a range of human cDNA libraries, then sequence cDNA clones at random.
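
The arithmetic behind these figures can be checked with a short Python sketch. It uses only the numbers quoted above (3 billion bases, 1.7 kb of coding DNA per gene, one gene every 40-50 kb); the function names are hypothetical, and the implied gene count is a derived illustration rather than a figure stated in the text.

```python
GENOME_SIZE_KB = 3_000_000   # 3 billion bases, as quoted in the text
CODING_KB_PER_GENE = 1.7     # average coding DNA per human gene
GENE_SPACING_KB = (40, 50)   # one gene every 40-50 kb on average


def coding_fraction(coding_kb, spacing_kb):
    """Fraction of the genome that is coding if one gene occurs every spacing_kb."""
    return coding_kb / spacing_kb


def implied_gene_count(genome_kb, spacing_kb):
    """Number of genes implied by an average gene spacing (a derived estimate)."""
    return genome_kb / spacing_kb


for spacing in GENE_SPACING_KB:
    fraction = coding_fraction(CODING_KB_PER_GENE, spacing)
    genes = implied_gene_count(GENOME_SIZE_KB, spacing)
    print(f"spacing {spacing} kb: coding fraction {fraction:.1%}, about {genes:,.0f} genes")
```

On these assumptions the coding fraction comes out between roughly 3% and 4%, in line with the figure quoted above, and the same spacing implies a total of some 60 000 to 75 000 genes.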

The priority of coding-DNA sequencing was dependent on two arguments: (i) coding DNA contains the information content of the genome and so is by far the most interesting and medically relevant part; and (ii) it is such a small percentage of the genome that it can be achieved very quickly and cheaply, when compared with efforts to sequence the entire genome. Supporters of whole-genome sequencing emphasized that finding all genes could be difficult (some genes may not be well-represented in available cDNA libraries if they are very restricted in expression, or expressed transiently during early development). In addition, at least some of the noncoding DNA is functionally important, e.g. in the case of regulatory elements and sequences that are important for chromosome function.

The first comprehensive human gene map was based on short sequence tags from cDNA clones

The coding sequence priority prevailed and the first reasonably comprehensive human gene maps were constructed. This involved essentially three steps:

  • Random cDNA sequencing. Initially this meant sequencing short (~ 300 bp) sequences at the 3′ ends of cDNA clones from a variety of human cDNA libraries. These short sequences were described as expressed sequence tags (ESTs) because they permitted a simple and rapid PCR assay for a specific expressed sequence (gene) (Adams et al., 1991). In this sense, therefore, an EST is simply the gene equivalent of an STS (a term used to describe any type of sequence, but often noncoding DNA, which is specific for a particular locus). Because the 3′ UTR of almost all human genes exceeds 300 bp, the 3′ ESTs typically did not contain coding sequence.
  • Mapping ESTs to specific chromosomes. 3′ UTR sequences are not as frequently interrupted by introns as coding DNA. This means that it is usually easy to design PCR primers from an EST that will amplify the specific sequence in a genomic DNA sample. Because 3′ UTR sequences are not very well conserved during evolution, it is also possible to screen human-rodent somatic cell hybrids for the presence of a human EST (the orthologous rodent sequences are usually so diverged that they do not amplify). By using a panel of human monochromosomal somatic cell hybrids (Section 10.1.1), an EST can be mapped to a specific human chromosome.
  • Mapping ESTs to subchromosomal locations. A huge effort has been mounted at some centers to establish integrated STS-based and EST-based maps, such as those produced by the Whitehead Institute. This has involved using PCR primers that are specific for an EST (or STS) to type YACs and other clones within clone contig maps that have been produced for the relevant chromosome and/or typing of a panel of whole genome radiation hybrids. Two such panels have been used in particular (Section 10.1.3): the Genebridge 4 panel (average fragment size 25 Mb), and for higher resolution, the Stanford G3 panel (average fragment size 2.4 Mb).
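
The idea behind typing an EST against a radiation hybrid panel can be sketched as follows. Markers that lie close together tend to be retained or lost together across the hybrids, so an EST can be placed near the framework marker whose retention pattern it matches most closely. This Python sketch uses a simple concordance score as a stand-in; real radiation hybrid mapping relies on statistical models of breakage and retention, and the marker names and retention patterns below are invented for illustration.

```python
def concordance(pattern_a, pattern_b):
    """Fraction of hybrids in which two markers are both present or both absent."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same hybrid panel")
    matches = sum(a == b for a, b in zip(pattern_a, pattern_b))
    return matches / len(pattern_a)


def nearest_framework_marker(est_pattern, framework):
    """Return (marker name, concordance) for the best-matching framework marker."""
    return max(
        ((name, concordance(est_pattern, pattern)) for name, pattern in framework.items()),
        key=lambda item: item[1],
    )


# Hypothetical PCR results on a 10-hybrid panel: "1" = amplified, "0" = not amplified.
framework_markers = {
    "STS_A": "1100101000",
    "STS_B": "1100111000",
    "STS_C": "0011000111",
}
est_pattern = "1100111001"

print(nearest_framework_marker(est_pattern, framework_markers))
# ('STS_B', 0.9)
```

Here the hypothetical EST agrees with STS_B in nine of ten hybrids, so it would be placed closest to that marker. A real panel contains many more hybrids, and a panel with smaller average fragment size separates closely spaced markers more effectively.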

Using the above approaches, the number of human genes that were placed on the physical map increased exponentially (Figure 13.3). The latest human gene map, published in October 1998, was achieved by a radiation hybrid mapping consortium led by the Sanger Centre, UK, together with various other centers, notably Stanford Human Genome Center, the Généthon lab in Paris, the Whitehead Institute and the Wellcome Trust Centre for Human Genetics at Oxford, UK. In all, map positions for over 30 000 human genes were reported (Deloukas et al., 1998; electronic reference 1), representing possibly 30–40% of the total human gene catalog.

Figure 13.3. Progress in human gene mapping. Data were reproduced from the GeneMap'99 website (electronic reference 1).

In many cases, there is little or no coding sequence for the mapped genes and considerable effort is being devoted to sequencing large inserts of human cDNA clones in various laboratories throughout the world. Different research programs are investigating gene expression in specific tissues or in specific states. For example, the Cancer Genome Anatomy Project (electronic reference 2), a program devised at the US National Cancer Institute, is devoted to studying expression of genes in various human tumor cells, including sequencing of large insert cDNA clones from cDNA libraries made from human tumor cells and large-scale expression profiling using microarrays (Section 20.2.2).

13.2.4. Accelerated sequencing efforts mean that the ultimate physical map, the complete nucleotide sequence of the human genome, should be delivered by the year 2003

At the outset of the Human Genome Project, DNA sequencing was expensive and not very efficient. It was anticipated, however, that technological developments would lead to considerable reductions in costs and much more efficient sequencing. The sequencing of the human genome at that time seemed an immense challenge because there was so little experience in sequencing large genomes. All that has changed, and some very large genomes have already been sequenced (Figure 13.4). There have been no significant changes in the basic sequencing technology; the dideoxy sequencing approach invented by Fred Sanger and his colleagues at Cambridge, UK, more than 20 years ago is still used. Instead, efficiency gains have been made through the use of automated fluorescence-based systems and capillary gel electrophoresis.

Figure 13.4. Landmarks in genome sequencing. Viral genome abbreviations are as follows: SV40, simian virus 40; HPV, human papilloma virus; λ, lambda phage; EBV, Epstein Barr virus.

While the first few years of the Human Genome Project were devoted to producing high-resolution genetic and physical maps, large-scale human genome sequencing is now very much underway and 10% of the human genome had been sequenced by May 1999 (Figure 13.5; electronic reference 3). Funded largely by the Wellcome Trust, the greatest single contributor has been the Sanger Centre at Hinxton, UK (Figure 13.6). By May 1999 the Sanger Centre had contributed over 100 Mb of finished human sequence (out of a global total of 300 Mb), and had also achieved a further 65 Mb of unfinished sequence (sequences which have not yet been compiled into large contigs). In order to avoid wasteful duplication of effort, the HUGO-sponsored Human Genome Sequencing Index identifies priority chromosomes or subchromosomal regions targeted by individual sequencing centers (electronic reference 4). Currently, chromosome 22 is set to be the first human chromosome to be completed (Figure 13.7).

Figure 13.5. Progress in human genome sequencing. See also the Genome Monitoring Table maintained at the European Bioinformatics Institute (EBI; electronic reference 4).

Figure 13.6. Large-scale DNA sequencing at the Sanger Centre. The Sanger Centre at Hinxton, UK, is the largest single contributor to human genome sequencing. Data can be accessed at

Figure 13.7. Status of human chromosome 22 sequencing in May 1999. The figure represents the map status for chromosome 22 accessed from the Sanger Centre's website during May 1999. Because of heterochromatic regions on 22p, the major effort ...

Partly in response to competition from the private sector (Box 13.2), the UK Wellcome Trust and the US National Human Genome Research Institute have collaborated to bring forward the timescale for completion of the Human Genome Project. The aim is to produce a working draft, comprising about 90% of the human genome, by the year 2000. The Sanger Centre is expected to produce 33% of the working draft and the three major American genome sequencing centers (Washington University School of Medicine at St. Louis, Baylor College of Medicine and the Whitehead/Massachusetts Institute of Technology) are expected to achieve 60% between them. Other centers, notably in France, Germany and Japan, are also committed to sequencing specific subchromosomal regions (electronic reference 4). After completion of the working draft, the full genome sequence is expected to be achieved around 2002-2003.

Box 13.2. Co-operation, competition and controversy in the genome projects. Because of their scale, the genome projects are major undertakings. In many cases, there have been laudable examples of cooperation: sharing of resources between different centers and labs, ...

13.2.5. Programs looking at human genome sequence variation aim to understand human evolution and to facilitate identification of disease genes

At the outset, the Human Genome Project was conceived as a project to obtain the nucleotide sequence of a collection of cloned human DNA fragments collectively amounting to one or a very few haploid genomes. What it did not consider was the genetic diversity of humans. Information on human genetic diversity has subsequently been considered desirable in three major contexts:

  • Human evolution. The information should be of help in anthropological and historical research in tracking human origins, prehistoric population movements and social structure.
  • Identification of common disease genes and factors which confer susceptibility to or protection from disease. Common diseases are multifactorial, and it can be frustratingly difficult to identify the underlying genes. Genetic differences between human populations can also make some populations more susceptible to particular diseases while others are comparatively protected.
  • Forensic anthropology. The accuracy of DNA fingerprinting, a widely used tool in forensic science, is dependent, in part, on knowing how the DNA markers detected in fingerprinting vary from one population to the next.

The Human Genome Diversity Project

The idea of a global effort to study human genome sequence diversity, the Human Genome Diversity Project, was proposed by Cavalli-Sforza et al. (1991). The emphasis was predominantly focused on the need to collect DNA samples from a large number of ethnic groups. However, although supported by HUGO, this project has been in considerable difficulty (Harding and Sajantila, 1998). From the outset it has been dogged by a conspicuous lack of funding. As the primary aim of this project is simply to find markers for ethnic groups and to trace the origins of human migrations and ancestral lineages, the case for large-scale funding has not been persuasive. The project has also been beset by controversy. In some cases, researchers have visited isolated human populations on the margins of survival, obtained samples quickly without taking time to explain their significance and then left, with little or no subsequent communication. While proponents refer to the need to safeguard our cultural heritage, critics have witheringly used terms such as helicopter genetics to refer to the insensitive ‘quickly-in, quickly-out’ approach often used to obtain samples.

Single nucleotide polymorphism (SNP) maps

Human genetic maps based on microsatellite markers, although extremely valuable, have some limitations. In particular, although such markers are found all over the genome their density is limited to about one per 30 kb. In addition, typing of microsatellite markers is not so amenable to automation on a very large scale. By contrast, single nucleotide polymorphisms (SNPs) are very frequent (about 1 per kb) and typing is easily automated because they have only two alleles (see Section 11.2.3). As a result, they have been envisaged to have potentially powerful applications in association studies to identify genes underlying polygenic disease (Collins et al., 1997; Schafer and Hawkins, 1998). The first steps toward establishing a third-generation SNP-based genetic map have recently been described by Wang et al. (1998). Partly in response to initiatives from the private sector (see Box 13.2), the US Human Genome Project and the UK Wellcome Trust have committed funds for the construction of a map containing 100 000 SNPs by 2003. However, the utility of SNPs remains unproven and some data are emerging which have dampened the initial huge optimism (Pennisi, 1998).
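The marker densities quoted above can be turned into approximate genome-wide counts. The sketch below is only a back-of-envelope calculation; it assumes a 3000 Mb genome (the round figure used for the mouse genome later in this chapter) and perfectly even marker spacing, neither of which holds exactly.

```python
GENOME_SIZE_KB = 3_000_000  # assumed round figure: 3000 Mb

def marker_count(spacing_kb, genome_kb=GENOME_SIZE_KB):
    """Approximate number of markers at one marker per `spacing_kb`."""
    return genome_kb / spacing_kb

def average_spacing_kb(n_markers, genome_kb=GENOME_SIZE_KB):
    """Average distance between markers if they were evenly spread."""
    return genome_kb / n_markers

microsatellites = marker_count(30)   # about one microsatellite per 30 kb
snps = marker_count(1)               # about one SNP per kb
planned_map_spacing = average_spacing_kb(100_000)  # the planned 100 000-SNP map

print(f"Microsatellites: ~{microsatellites:,.0f}")
print(f"SNPs:            ~{snps:,.0f}")
print(f"100 000-SNP map: one marker per ~{planned_map_spacing:.0f} kb")
```

On these round numbers a map of 100 000 SNPs would average one marker per 30 kb, close to the density ceiling quoted for microsatellites, so at that stage the gain would come mainly from ease of automated typing; the higher density would be realized only as more SNPs were added.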

13.3. Model organism and other genome projects

Mapping the human genome is not the only scientific focus of the Human Genome Project; at its outset, the value of sequencing genomes of model organisms was recognized. Such organisms include a variety of species, some of which have been particularly amenable to genetic analysis (Box 13.3). In part, the sequencing of smaller genomes was also considered as a pilot for large-scale sequencing of the human genome. By 1999 the genomes of about 100 organisms were being sequenced or had already been sequenced (electronic reference 5).

Box 13.3. Model organisms for which genome projects are considered particularly relevant to the Human Genome Project. Escherichia coli: this bacterium has been the most extensively studied, both biochemically and genetically, and was therefore an early priority for the Human Genome Project …

13.3.1. The genomes of many prokaryotic organisms have been sequenced, including well-studied experimental models and disease-associated organisms

The diversity of prokaryotic genome sequencing projects

Prokaryotic genomes are typically small (often only one or a few Mb) and are therefore amenable to comparatively rapid sequencing. By 1999 the genomes of a total of 75 different prokaryotes were being sequenced or had already been sequenced (electronic reference 5). The first to be completed (in 1995) was the 1.83 Mb genome of Haemophilus influenzae. This truly was a landmark because it was the first time that the complete genome sequence of a free-living organism had been obtained (Tang et al., 1997). Subsequently, a variety of other firsts were achieved: the genome of the smallest autonomous self-replicating entity (Mycoplasma genitalium in 1995), the first archaeal genome (Methanococcus jannaschii in 1996), and then the important achievement of the complete sequence of the 4.6 Mb E. coli genome, obtained by an American group led by Fred Blattner and, independently, by a Japanese group led by Hirotada Mori and Takashi Horiuchi (see Pennisi, 1997).

The list of prokaryotic organisms whose genomes have been sequenced reveals different priorities. In some cases, the driving force was to understand evolutionary relationships between different organisms, as in the case of the archaeal genomes (Olsen and Woese, 1997), and in the case of Mycoplasma genitalium it was to understand what constitutes a minimal genome. In other cases, as in E. coli and B. subtilis, the priority was simply to further basic research using organisms that had been well studied in the laboratory. For many researchers, the big prize has been E. coli, the bacterium that has been the most intensively studied, biochemically and genetically. Surprisingly, given the huge amount of prior investigation, almost 40% of the initially identified 4288 genes have no known function and are now the subject of intensive investigation. For many other organisms, however, the primary motivation for genome sequencing has been their medical relevance.

Disease-related prokaryotic genome projects

The UK Wellcome Trust has been a major supporter of genome projects for microbial pathogens and its Beowulf Genomics program has established a large number of such programs at the Sanger Centre (electronic reference 6). Various other organizations have supported similar programs. In some cases, prokaryotes have been selected for genome sequencing because of their known associations with chronic diseases (Danesh et al., 1997). They include Helicobacter pylori (associated with peptic ulcers) and Chlamydia pneumoniae (associated with respiratory disease and also with coronary heart disease). Other completed projects have yielded the genomes of prokaryotes known to be causative agents of disease (Table 13.4), such as Mycobacterium tuberculosis (Cole et al., 1998), Treponema pallidum (the causative agent of syphilis) and Rickettsia prowazekii (the causative agent of typhus, also of interest because it is thought to be closely related to the prokaryotic precursor of mitochondria - see Section 14.1.1). In addition to providing a more complete understanding of these organisms, the new information can be expected to lead to more sensitive diagnostic tools and new targets for developing drugs and vaccines.

Table 13.4. Germ wars: examples of pathogenic microorganisms for which genome projects have been developed.

13.3.2. The S. cerevisiae genome project was the first of several protist genome projects to be completed and provided for the first time the entire DNA sequence of a eukaryote

The Saccharomyces cerevisiae genome project

The budding yeast S. cerevisiae is a single-celled eukaryote (protist) which has been the subject of intensive genetic analyses. Its 16 chromosomes were sequenced by European and American consortia and the complete sequence was reported by Goffeau et al. (1996). This represented another milestone in biology: the first complete sequence for a eukaryotic cell. The data indicate that yeast genes are closely clustered, being spaced, on average, once every 2 kb. Of the 6340 genes, about 7% specify a mature RNA species. Although it has been one of the most intensely studied organisms, 60% of its genes had no experimentally determined function. However, a sizeable fraction of yeast genes have an identifiable mammalian homolog, and in only about 25% of yeast genes is there no clue whatsoever to their function (Botstein et al., 1997). The successful conclusion of this project has now opened up large-scale functional analyses (see Section 13.4.1).

Other protist genome projects

The other protist genome projects include organisms that have been well studied and are amenable to biochemical and genetic analyses, including the fission yeast Schizosaccharomyces pombe and the molds Aspergillus nidulans and Neurospora crassa. In addition, various organizations, including the Wellcome Trust and the World Health Organization, have been involved in supporting genome projects for various animal protists (protozoans) involved in human parasitic infections (see Table 13.4). In most cases the genome sizes are substantial. For example, the Plasmodium falciparum genome is about 30 Mb in size and contains 14 chromosomes, some of which had been sequenced at the time of writing (Gardner et al., 1998).

13.3.3. The Caenorhabditis elegans genome project was the first of several animal genome projects to be completed

The Caenorhabditis elegans genome project

In addition to the numerous microbial genome projects, a variety of animal and plant genome projects are underway, of which the C. elegans project was the first to be completed. Because of its large genome size (nearly 100 Mb), the C. elegans genome project was viewed as the major pilot model for large-scale sequencing of the human genome. From the research point of view, this organism, although a simple one only about 1 mm long, was an important model of development and was also useful for modeling other processes relevant to human cells (see Box 13.3). A consortium of the Sanger Centre and Washington University Sequencing Center in St Louis reported the essentially finished sequence at the end of 1998 (C. elegans Sequencing Consortium, 1998).

The genome project identified a total of 19 099 polypeptide-encoding genes and over 1000 RNA-encoding genes, giving an average spacing of one gene every 5 kb. A surprisingly high number of the genes appear to occur as part of an operon, where individual genes are transcribed as part of a large multigenic RNA transcript. Comparison with published sequences from elsewhere reveals that about one in three C. elegans genes shows similarities to previously known genes, and 12 000 of the polypeptide-encoding genes are of unknown function. Now major efforts are being made to investigate specific gene function, and large-scale chemical mutagenesis programs are seeking to produce large numbers of mutant phenotypes.
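The average gene spacings quoted in this section follow from dividing genome size by gene number, as the short sketch below shows. The E. coli and C. elegans figures are those given in the text (taking "nearly 100 Mb" as 100 Mb and "over 1000" RNA genes as 1000); the yeast genome size of 12 Mb is an assumed round value that is not stated in this section.

```python
# (genome size in kb, gene count). Values come from this chapter except the
# yeast genome size, which is an ASSUMED round figure of 12 Mb.
genomes = {
    "E. coli": (4_600, 4_288),
    "S. cerevisiae": (12_000, 6_340),         # genome size is an assumption
    "C. elegans": (100_000, 19_099 + 1_000),  # polypeptide + RNA genes
}

def kb_per_gene(size_kb, n_genes):
    """Average spacing between genes, ignoring gene length and overlap."""
    return size_kb / n_genes

spacing = {name: kb_per_gene(*values) for name, values in genomes.items()}
for name, kb in spacing.items():
    print(f"{name}: about one gene every {kb:.1f} kb")
```

These are crude averages: they say nothing about how unevenly genes are distributed, and they shift with every revision of the gene count.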

Other animal genome projects

Other projects include the following:

  • The D. melanogaster genome project. This was established largely as a collaboration between the University of California at Berkeley laboratory and a consortium of European laboratories (Rubin, 1998). Of the 165 Mb genome, about 125 Mb is euchromatic. Initially, the target for sequencing the 125 Mb euchromatic component was the year 2001, but collaboration with industrial partners may mean that it is finished much earlier than this (see Box 13.2). Transposable P element insertion is being employed to carry out insertional mutagenesis on a large scale (Spradling et al., 1995).
  • The mouse genome project. Because of various features (see Box 13.3), the mouse provides the model genome which is most relevant to the Human Genome Project and is expected to have about the same number of genes. Good genetic maps exist. By May 1999 more than 10 000 genes, of which about 7 000 had been mapped, had been entered into the mouse genome database maintained at the Jackson Laboratories, Bar Harbor (electronic reference 7), and large mouse EST sequencing programs were under way. A working draft of the sequence of the 3000 Mb genome is expected by 2005.

13.4. Life in the post-genome (sequencing) era

Once the sequence of the human genome is known, what difference will it make? Certainly, there will be a huge boost to basic research as we grapple with the fundamental biological question of how our genome is interpreted to specify a person. In the so-called post-genome era, accurate genetic testing will become widely available, not just for genetic disorders, but also in terms of genetic susceptibility to a variety of different conditions, including infectious diseases. But there may be a downside in terms of discrimination against individuals. Improved treatments can also be expected. The much vaunted gene therapy approaches may prove technically difficult, but the new information will undoubtedly assist the development of novel therapies. The following two sections are selective and merely illustrate some of the implications of knowledge of our genome.

13.4.1. Comparative and whole-genome analyses permit large-scale studies of DNA organization and evolution and of gene expression and function

The Human Genome Project had not reached its half-way point before serious consideration was given to what the research priorities should be in the post-genome era. Certainly, the sequencing of whole genomes will provide revolutionary approaches to biomedical research. For the first time, there are opportunities to compare whole genomes and the newly developed field of bioinformatics is set to take off (Gershon et al., 1997; Smith, 1998). Genome-wide analyses of gene expression and function will become a major area of investigation.

Comparative genomics

Comparative genomics involves analysis of two or more genomes to identify the extent of similarity of various features, or large-scale screening of a genome to identify sequences present in another genome. The examples below are merely meant to be illustrative of some of the applications.

  • Evolutionary relationships. One early application involved comparison of archaeal genomes with eubacterial and eukaryotic genomes to infer evolutionary relatedness (see Section 14.1.1). Once eukaryotic genomes were sequenced, they too could be compared. For example, sequencing of the C. elegans genome permitted an evaluation of how its genes compared with those of a simple eukaryote (S. cerevisiae) and a bacterium (E. coli) (Figure 13.8; C. elegans Sequencing Consortium, 1998).
  • Identification of regulatory elements. We have very limited information about regulatory elements in complex eukaryotic genomes. By referring to databases of known regulatory element sequences, computer programs can inspect new genomic sequences for the presence of regulatory elements, but the efficiency is very low. An alternative approach is to use high evolutionary conservation as a screen for regulatory elements within otherwise poorly conserved noncoding DNA. Large-scale sequencing of orthologous chromosomal regions in reasonably distantly related species should lead to identification of common regulatory elements, in addition to common exons. One example of this approach involves sequencing the genome of Caenorhabditis briggsae and comparing it with that of C. elegans. Although the genomes of these two nematodes are essentially colinear, they diverged from a common ancestor about 80 million years ago. As a result, noncoding sequences other than regulatory regions have undergone complete sequence divergence. Equivalent large-scale comparative sequencing in orthologous human and mouse chromosomal regions is being conducted for the same purpose and pufferfish comparisons have also been important (Clark, 1999).
  • Gene identification. Electronic screening of EST databases can identify homologs of biologically interesting genes in other species. For example, systematic screening of the dbEST database of ESTs has revealed many potentially interesting human homologs of Drosophila genes known to be loci for mutant phenotypes (Banfi et al., 1996).
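The idea of using evolutionary conservation as a screen for regulatory elements can be sketched as a sliding-window identity scan over two already aligned orthologous sequences. This is a toy illustration only: the function names, window size, threshold and sequences are all hypothetical, and real comparisons would first require a proper alignment and far longer sequences.

```python
def window_identity(seq_a, seq_b, window=10, step=1):
    """Percent identity in sliding windows over two pre-aligned sequences.

    Both sequences must have equal length; '-' marks an alignment gap and
    always counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(0, len(seq_a) - window + 1, step):
        pairs = zip(seq_a[start:start + window], seq_b[start:start + window])
        matches = sum(a == b and a != "-" for a, b in pairs)
        scores.append((start, 100.0 * matches / window))
    return scores

def conserved_windows(seq_a, seq_b, window=10, threshold=90.0):
    """Start positions of windows at or above the identity threshold."""
    return [s for s, pct in window_identity(seq_a, seq_b, window) if pct >= threshold]

# Invented 30-nt example: a fully conserved 10-nt block (positions 10-19)
# flanked by sequence that has diverged completely.
species_1 = "ACGTACGTAC" + "TTGACAGCTA" + "GGCCTTAAGG"
species_2 = "CATGCATGCA" + "TTGACAGCTA" + "CCGGAATTCC"

print(conserved_windows(species_1, species_2))
```

Windows that overlap the conserved block score highly while the diverged flanks score zero, which is the signal such a screen looks for; whether a conserved block is regulatory, or an unrecognized exon, still has to be established experimentally.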

Figure 13.8. Percentage of matching proteins in three fully sequenced organisms. The gene number given for the three organisms is the number of polypeptide-encoding genes. As expected, the percentage of the 18 891 different C. elegans polypeptides which found a match …

Functional genomics

Functional genomics refers to large-scale or global investigations of gene function. For example, the way in which a cell responds to a particular signal or environmental stimulus can be monitored by simultaneously analysing the expression patterns of every single gene. Already, microarrays have been devised to track the expression of virtually all 6200 yeast genes (Figure 13.9; Wodicka et al., 1997). Strenuous efforts are now being mounted to investigate the function of each of the 6000 or so polypeptide-encoding yeast genes (Winzeler and Davis, 1997).
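A genome-wide expression comparison of this kind usually reduces, for each gene, to a ratio of signal between two conditions. The following minimal sketch assumes hypothetical, already normalized intensity values and an arbitrary two-fold cutoff; the gene labels and numbers are invented, and it does not represent the analysis used in the cited microarray studies.

```python
from math import log2

def log2_fold_changes(control, treated, floor=1.0):
    """log2(treated / control) per gene.

    Signals below `floor` are raised to it, which avoids division by zero
    and unstable ratios for very weakly expressed genes.
    """
    return {
        gene: log2(max(treated[gene], floor) / max(control[gene], floor))
        for gene in control
    }

def responsive_genes(changes, cutoff=1.0):
    """Genes changed at least 2**cutoff-fold in either direction."""
    return sorted(g for g, lfc in changes.items() if abs(lfc) >= cutoff)

# Invented intensities for four hypothetical genes, before and after a stimulus.
control = {"gene_A": 200.0, "gene_B": 50.0, "gene_C": 800.0, "gene_D": 120.0}
treated = {"gene_A": 800.0, "gene_B": 55.0, "gene_C": 100.0, "gene_D": 130.0}

changes = log2_fold_changes(control, treated)
print(responsive_genes(changes))
```

Working on a log2 scale treats induction and repression symmetrically: a four-fold increase and a four-fold decrease give values of equal size and opposite sign.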

Figure 13.9. Genome-wide expression screening in Saccharomyces cerevisiae. The mRNA levels for practically all (about 6100) yeast genes can be monitored simultaneously using the GeneChip® Ye 6100 set from Affymetrix. Each gene is analyzed with about 20 pairs …

Similar types of approaches will also permit extensive investigation of human gene function and dissection of complex regulatory pathways. Once all human genes are known, we can know all the products. We are accustomed to large-scale DNA analyses (which bred the term genome); now as we explore RNA and polypeptide expression products on a global scale, new terms are being coined: the transcriptome (the total collection of RNA transcripts in a cell) and the proteome (the total collection of polypeptides/proteins expressed in a cell). The new science of proteomics is devoted to the study of global changes in protein expression and the systematic study of protein-protein interactions (Blackstock and Weir, 1999; Dove, 1999). The former can be tracked by using high-throughput expression assays (such as 2D gel electrophoresis and mass spectrometry); the latter by large-scale application of methods such as two-hybrid screening (Section 20.4.1). For example, two-hybrid screening is now being used to systematically screen the yeast proteome (Lecrenier et al., 1998).

13.4.2. Without proper safeguards, the Human Genome Project could lead to discrimination against carriers of disease genes and to a resurgence of eugenics

Any major scientific advance carries with it the fear of exploitation. The Human Genome Project is no exception, and the perceived benefits of the project can also have a downside. For example, once we know all the human genes and can detect large numbers of disease-associated mutations, there will be enormous benefit in targeting prevention of disease to those individuals who can be shown to carry disease genes. However, the same information can also be used by insurance companies to discriminate against such individuals. There is the very real prospect of insurance companies insisting on genetic screening tests for the presence of genes that confer susceptibility to common disorders, such as diabetes, cardiovascular disease, cancers and various mental disorders. Perfectly healthy individuals who happen to be identified as carrying such disease-associated alleles may then be refused life or medical insurance. Clearly, such discrimination is practised on a small scale at the moment; what is alarming to many is the prospect of discrimination against a very large percentage of the individuals in our society. It is also important to preserve people's right not to know. A fundamental ethical principle in all genetic counseling and genetic testing is that genetic information should be generated only in response to an explicit request from a fully informed adult patient.

Another troublesome area is the question of biological determinism and whether comprehensive knowledge of human genes could foster a revival of eugenics, the application of selective breeding or other genetic techniques to ‘improve’ human qualities (Garver and Garver, 1994). In the past, negative eugenic movements in the US and Germany severely discriminated against individuals who were adjudged to be inferior in some way, notably by forcing them to be sterilized. The possibility also exists of a preoccupation with genetic enhancement to positively select for heritable qualities that are judged to be desirable (see Section 22.6). In recognition of the above problems, the US Human Genome Project has devoted considerable resources to support research into the ethical, legal and social impact of the project.

Further reading

  1. Borsani G, Ballabio A, Banfi S. A practical guide to orient yourself in the labyrinth of genome databases. Hum. Mol. Genet. (1998);7:1641–1648. [PubMed: 9735386]
  2. Database issue of Nucleic Acids Research. Nucleic Acids Res. (1999);17:3441–3665.
  3. Guyer M S, Collins F S. The Human Genome Project and the future of medicine. Am. J. Dis. Child. (1993);147:1145–1152. [PubMed: 8237907]
  4. Wilkie T (1993) Perilous Knowledge: the Human Genome Project and its Implications. Faber and Faber, New York.
  5. US Department of Health and Human Services and US Department of Energy (1990) Understanding our genetic inheritance. The US Human Genome Project: the first five years FY 1991-1995.
  6. US National Institute of Health and Department of Energy (1993) Genetic Information and Health Insurance. US NIH-DOE Working Group on Ethical and Social Implications of Human Genome Research.

Electronic references

  1. GeneMap'99 at http://www​
  2. The Cancer Genome Anatomy Project at http://www​
  3. The Human Genome Sequencing Index at http://www​
  4. The Genome Monitoring Table maintained at the European Bioinformatics Institute at http://www​
  5. Terry Gaasterland's running list of genome projects at http://www-fp.mcs.anl.gov/~gaasterland/genomes.html
  6. Beowulf Genomics at http://www​
  7. The Mouse Genome Database at http://www​

References


  1. Adams M D, Kelley J M, Gocayne J D. Complementary DNA sequencing: expressed sequence tags and Human Genome Project. Science. (1991);252:1651–1656. [PubMed: 2047873]
  2. Andrade M A, Sander C, Valencia A. Updated catalogue of homologues to human disease-related proteins in the yeast genome. FEBS Lett. (1998);426:7–16. [PubMed: 9598968]
  3. Banfi S, Borsani G, Rossi E. et al. Identification and mapping of human cDNAs homologous to Drosophila mutant genes through EST database searching. Nature Genet. (1996);13:167–174. [PubMed: 8640222]
  4. Blackstock W P, Weir M P. Proteomics: quantitative and physical mapping of cellular proteins. Trends Biotechnol. (1999);17:121–127. [PubMed: 10189717]
  5. Botstein D, White R L, Skolnick M, Davis R W. Construction of a genetic linkage map in man using restriction fragment length polymorphism. Am. J. Hum. Genet. (1980);32:314–331. [PMC free article: PMC1686077] [PubMed: 6247908]
  6. Botstein D, Chervitz S A, Cherry J M. Yeast as a model organism. Science. (1997);277:1259–1260. [PMC free article: PMC3039837] [PubMed: 9297238]
  7. C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. (1998);282:2012–2017. [PubMed: 9851916]
  8. Cantor C R. How will the human genome project improve our quality of life? Nat. Biotechnol. (1998);16:212–213. [PubMed: 9527989]
  9. Cavalli-Sforza L L, Wilson A C, Cantor C R, Cook-Deegan R M, King M C. Call for a worldwide survey of human genetic diversity: a vanishing opportunity for the Human Genome Project. Genomics. (1991);11:490–491. [PubMed: 1769670]
  10. Chumakov I, Rigavit P, Guillou S. et al. Continuum of overlapping clones spanning the entire human chromosome 21q. Nature. (1992);359:380–387. [PubMed: 1406950]
  11. Clark M S. Comparative genomics: the key to understanding the Human Genome Project. BioEssays. (1999);21:121–130. [PubMed: 10193186]
  12. Cohen D, Chumakov I, Weissenbach J. A first generation physical map of the human genome. Nature. (1993);366:698–701. [PubMed: 8259213]
  13. Cole S T, Brosch R, Parkhill J. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature. (1998);393:537–544. [PubMed: 9634230]
  14. Collins F, Guyer M S, Chakravarti A. Variations on a theme: cataloging human DNA sequence variation. Science. (1997);278:1580–1581. [PubMed: 9411782]
  15. Collins F, Patrinos A, Jordan E. et al. New goals for the US Human Genome Project: 1998-2003. Science. (1998);282:682–689. [PubMed: 9784121]
  16. Danesh J, Newton R, Beral V. A human germ project? Nature. (1997);389:21–24. [PubMed: 9288960]
  17. Deloukas P, Schuler G D, Gyapay G. A physical map of 30 000 human genes. Science. (1998);282:744–746. [PubMed: 9784132]
  18. Donis-Keller H, Green P, Helms C. A genetic linkage map of the human genome. Cell. (1987);51:319–337. [PubMed: 3664638]
  19. Dove A. Proteomics: translating genomics into products? Nat. Biotechnol. (1999);17:233–236. [PubMed: 10096288]
  20. Fire A, Xu S, Montgomery M, Kostas S A, Driver S E, Mello C C. Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature. (1998);391:806–811. [PubMed: 9486653]
  21. Gardner M J, Tettelin H, Carucci D J. Chromosome 2 sequence of the human malaria parasite Plasmodium falciparum. Science. (1998);282:1126–1128. [PubMed: 9804551]
  22. Garver K L, Garver B. The Human Genome Project and eugenic concerns. Am. J. Hum. Genet. (1994);54:148–158. [PMC free article: PMC1918077] [PubMed: 8279465]
  23. Gershon D, Sobral B W, Horton B, Wickware P, Gavaghan H, Strobl M. Bioinformatics in a post-genomics age. Nature. (1997);389:417–422. [PubMed: 9324648]
  24. Goffeau A, Barrell B G, Bussey H. Life with 6000 genes. Science. (1996);274:546–567. [PubMed: 8849441]
  25. Harding R M, Sajantila A. Human genome diversity - a Project? Nature Genet. (1998);18:307–308. [PubMed: 9537409]
  26. Hudson T J, Stein L D, Gerety S S. et al. An STS-based map of the human genome. Science. (1995);270:1945–1954. [PubMed: 8533086]
  27. Ichikawa H, Hosoda F, Arai Y, Shimizu K, Ohira M, Ohki M. A NotI restriction map of the entire long arm of human chromosome 21. Nature Genet. (1993);4:361–365. [PubMed: 8401583]
  28. Knoppers B M. Status, sale and patenting of human genetic material: an international survey. Nature Genet. (1999);22:23–25. [PubMed: 10319857]
  29. Lecrenier N, Foury F, Goffeau A. Two-hybrid systematic screening of the yeast proteome. BioEssays. (1998);20:1–6. [PubMed: 9504041]
  30. McKusick V. HUGO news. The Human Genome Organization: history, purposes, and membership. Genomics. (1989);5:385–387. [PubMed: 2676842]
  31. Meisler M H. The role of the laboratory mouse in the human genome project. Am. J. Hum. Genet. (1996);59:764–771. [PMC free article: PMC1914805] [PubMed: 8808590]
  32. Murray J C, Buetow K H, Weber J L. A comprehensive human linkage map with centimorgan density. Science. (1994);265:2049–2054. [PubMed: 8091227]
  33. Olsen G J, Woese C R. Archaeal genomics: an overview. Cell. (1997);89:991–994. [PubMed: 9215619]
  34. Pennisi E. Laboratory workhorse decoded. Science. (1997);277:1432–1434. [PubMed: 9304210]
  35. Pennisi E. A closer look at SNPs suggests difficulties. Science. (1998);281:1787–1789. [PubMed: 9776677]
  36. Perrimon N. New advances in Drosophila provide opportunities to study gene functions. Proc. Natl Acad. Sci. USA. (1998);95:9716–9717. [PMC free article: PMC33881] [PubMed: 9707540]
  37. Roush W. A zebrafish genome project? Science. (1997);275:923. [PubMed: 9053993]
  38. Rubin G M. The Drosophila genome project: a progress report. Trends Genet. (1998);14:340–343. [PubMed: 9769725]
  39. Schafer A J, Hawkins J R. DNA variation and the future of human genetics. Nat. Biotechnol. (1998);16:33–39. [PubMed: 9447590]
  40. Smith T F. Functional genomics - bioinformatics is ready for the challenge. Trends Genet. (1998);14:291–293. [PubMed: 9676532]
  41. Spradling A C, Stern D M, Kiss I, Roote J, Laverty J, Rubin G M. Gene disruptions using P transposable elements: an integral component of the Drosophila genome project. Proc. Natl Acad. Sci. USA. (1995);92:10824–10830. [PMC free article: PMC40524] [PubMed: 7479892]
  42. St John M A, Xu T. Understanding human cancer in a fly? Am. J. Hum. Genet. (1997);61:1006–1010. [PMC free article: PMC1716045] [PubMed: 9345112]
  43. Tang C M, Hood D W, Moxon E R. Haemophilus influence: the impact of whole genome sequencing on microbiology. Trends Genet. (1997);13:399–404. [PubMed: 9351341]
  44. Thomas S M, Davies A R W, Birtwistle N J, Crowther S M, Burke J F. Ownership of the human genome. Nature. (1996);380:387–388. [PubMed: 8602236]
  45. Venter J C, Smith H O, Hood L. A new strategy for genome sequencing. Nature. (1996);381:364–365. [PubMed: 8632789]
  46. Wang D G, Fan J B, Siao C J. Large-scale identification, mapping and genotyping of single nucleotide polymorphisms in the human genome. Science. (1998);280:1077–1082. [PubMed: 9582121]
  47. Weissenbach J, Gyapay G, Dib C, Vignal A, Morissette J, Millasseau P, Vaysseix G, Lathrop M. A second generation linkage map of the human genome. Nature. (1992);359:794–801. [PubMed: 1436057]
  48. Winzeler E A, Davis R W. Functional analysis of the yeast genome. Curr. Opin. Genet. Dev. (1997);7:771–776. [PubMed: 9468786]
  49. Wodicka L, Dong H, Mittmann M, Ho M -H, Lockhart D J. Genome-wide expression monitoring in Saccharomyces cerevisiae. Nat. Biotechnol. (1997);15:1359–1367. [PubMed: 9415887]
Copyright © 1999, Garland Science.
Bookshelf ID: NBK7562