NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

National Research Council (US) Committee on Advances in Collecting and Utilizing Biological Indicators and Genetic Information in Social Science Surveys; Weinstein M, Vaupel JW, Wachter KW, editors. Biosocial Surveys. Washington (DC): National Academies Press (US); 2008.


9 Are Genes Good Markers of Biological Traits?


The 20th century has been called “the century of the gene” (Keller, 2000), an era of unprecedented progress in the understanding of inheritance. The 20th century began with the rediscovery of Mendel's laws, witnessed the discovery of the structure and nature of DNA, and concluded with the launching of the human genome project. The result has been an accumulation of genetic information far beyond the comprehension of any single scientist. Nongeneticists are confronted almost daily with new facts about genes that seem to be of enormous relevance to their lives. What is one to make of these discoveries? How should one think about them?

I recently had to deal with this problem as a biologist writing about development in relation to evolution. Genetics is the foundation of modern evolutionary biology, so I had no doubt about the importance of genes. But I wanted to explain the evolution of phenotypes: the observable behavioral, physiological, and morphological characteristics of organisms. My main expertise is in the social behavior of insects, which made me keenly aware of the adaptive flexibility of behavior and the hormonally mediated links between gene expression and morphological diversification. I was especially aware that the dramatic differences between sterile social-insect “workers” and egg-laying “queens” do not depend on genetic differences between individuals, but instead spring from differences in their social environments, in particular the dominance relations among adults, and their diets as larvae. From such a background it is clear that phenotypic traits are not determined by genes alone. How then should one think about the formative role of genes? In this essay I discuss some results of my own struggles with this problem that may help other nongeneticists think about genes. The companion volume to this one, Cells and Surveys (National Research Council, 2001), contains a chapter (Wallace, 2001) on genetic markers in population surveys of human traits whose language could serve as a model of meticulous accuracy in the discussion of genetic data. The present chapter is intended as a kind of reader's guide for how to relate genes to phenotypic traits in general, in order to better interpret research results and public discussions that attempt to relate genes to particular human characteristics. For a more thorough discussion, see West-Eberhard (2003, Part I).


To understand how any complex apparatus works, it helps to know how it was put together. For traits of organisms, this means understanding how they develop and how they have evolved. Sometimes evolution is depicted as a process of random genetic mutation and selection. Advances in the molecular genetics of gene expression and development, as well as in phylogenetic methods that permit more accurate histories of organismic change, support a different view: novel traits originate via the developmental reorganization of ancestral phenotypes, not just by a series of random mutations and their cumulative new effects. That is, the traits one observes have been assembled via the reorganization of older traits, with old genes used in new combinations. Furthermore, developmental reorganization can be initiated by environmental factors, as well as by mutations. In keeping with the universally acknowledged importance of environment in development, environmental induction can play an important role in the reorganizational origins of novel traits (for a summary and extensive documentation see West-Eberhard, 2003, Chapters 9-18, on evolution by developmental reorganization; Chapters 6, 20, 26 on the role of environmental factors).

These findings are relevant to the search for genetic markers—genetic loci whose different alleles correlate strongly with, and can therefore be used to predict, variation in human traits. First, due to change by reorganization of gene expression, related organisms or populations can have markedly distinctive characteristics, or “phenotypes,” without having a large number, or any, distinctive new genes or genetic alleles (alternative DNA sequences at the same chromosomal locus). This is illustrated by the small genetic distance between humans and chimpanzees despite considerable differences in their behavioral and morphological phenotypes (King and Wilson, 1975). Second, the reuse of the same genes in different contexts means that a gene found to be crucial for variation in a trait of interest—a disease phenotype or a demographic property like longevity or fertility—may prove to be expressed commonly in other contexts or may have different effects at different life stages (see, e.g., Ewbank, 2000, on the ApoE gene). Research on markers needs to take into account life stage and other contextual and environmental variables in the search for reliable predictors of particular traits. The role of the environment in the induction of genetically complex, reorganized phenotypes, as when fetal undernutrition affects the expression of obesity, diabetes, coronary heart disease, and hypertension in large numbers of adults (Osmond and Barker, 2000), is a reminder that human populations can contain appreciable frequencies of complex, well-defined phenotypic variants that do not correspond to genetic variants.

An evolutionary perspective also adds to the arsenal of techniques that can be brought to bear in the search for genetic causes. Evolutionary biologists routinely compare different species and populations to illuminate the functions and causes of particular traits by ascertaining their correlations with other traits and with the conditions of life. Comparative study can facilitate the selection of organisms that are particularly likely to throw light on particular questions of interest to demographers and social scientists, and can pinpoint traits of unsuspected value for testing certain ideas.

For example, a recent survey of a very wide range of species revealed a correlation between rapid embryonic development and rapid aging among different species of birds and mammals (Ricklefs, 2006). A phylogenetic study of parrots revealed peculiarities of the mitochondrial genome that may be associated with the unusual longevity of parrots among birds (Eberhard, Wright, and Bermingham, 2001; Wright and Eberhard, in press). In addition, evolutionary biologists are practiced in the analysis of the population genetics of genetically complex (polygenic) traits, like most of those of interest to epidemiologists and demographers (Sing, Haviland, Templeton, Zerba, and Reilly, 1992; Sing, Haviland, Templeton, and Reilly, 1995). The disciplined practice of wondering about the functional, adaptive significance of particular kinds of genes can also lead to new insights with practical results. For example, an evolutionary hypothesis regarding the significance of parentally imprinted genes (genes whose expression depends on the parent of origin) has led to a focus on these particular genes as possibly being involved in the development of autism and related disorders (Badcock and Crespi, 2006).

The past belief in the importance of mutation for the origin of novel traits has helped to perpetuate misunderstanding of the role of single genes in the evolution and development of complex traits. The notion of one-gene, one-phenotype is now widely acknowledged by biologists to be mistaken, but it is still an intuitively attractive idea that is reinforced by textbook exercises on Mendelian genetics and by research on bacterial genomes and their molecular phenotypes (e.g., Ptashne, 1992). Modern research on development and evolution in multicellular organisms helps pave the way for a more realistic view of the role of genes in the production of complex phenotypes in humans and other organisms. The question of single-gene control is further explored in the next two sections.


A “trait” is simply a somewhat discrete characteristic of an organism. It could be an aspect of morphology, a physiological state, a behavior, a molecule, or a disease, but the implication is that it is a product of development that is qualitatively distinct relative to other aspects of the organism. Some authors use the term “module” to describe a discrete trait. In operational terms, a discrete or modular trait can be defined as a product of a separate developmental pathway. But it is more accurate to say that a trait is “somewhat discrete” rather than “discrete,” or that it is “modular” rather than “a module” because no trait is completely independent of all other traits in an integrated individual organism. In addition to the discrete on-off qualitative traits of organisms, there are other traits, such as body size or longevity, that are “quantitative traits”—features that are described in terms of their numerically measurable (quantifiable) values (e.g., weight, mass, or life span). Discrete, qualitative traits have dimensions (for example, the length of a bone, the duration of a behavior) that can be measured as quantitatively variable traits.

Examples of discrete traits are differentiated tissues like skin, bone, or blood; a differentiated sex (male or female); a behavior like courtship, laughter, or aggressive attack; or a disease like schizophrenia or the flu. Each of these complex traits involves the expression of a specific set of genes or the use of a specific set of gene products. In the development of an individual organism, a discrete trait is manifested or “expressed” when a threshold for its production is passed. Particular genes and their products are not always “on” or in use. An outburst of laughter, for example, presumably has some threshold of perception or sensitivity that is passed when it is triggered, and then there is another threshold for bringing it to an end. The same is true for physiological states and for the growth and differentiation of morphological traits. The timing of the “on” and the “off” determines the value of the quantitative dimension of a discrete trait, so one can measure the duration of laughter, the strength of a physiological response like a muscle contraction, the length of a bone, or the time of onset and duration of a disease.

To illustrate the genetic structure of a complex trait, consider a relatively simple and distinctive behavioral trait like laughter. How can one characterize the genetic underpinnings of such a trait? A laughter gene (or, more accurately, a gene that influences the ability of an individual to laugh) could be a gene that affects the structure of the vocal cords, the form of the facial muscles that participate in the expression of mirth, or the muscles of the diaphragm that enable bursts of air to produce the characteristic sounds of laughter. These sets of genes, those that are expressed or whose products are used when a trait is manifested by an individual organism, are modifiers of form. Another set of genes, the modifiers of regulation, influences whether or not, and when, the trait is expressed, or turned on and off. The modifiers of regulation might include genes that affect the sensory acuity (of vision, audition, and touch) that enables perception of laughter-provoking stimuli and the genes that affect the central processing of those stimuli in the brain.

Sometimes the modifiers of regulation are described as acting “upstream” of the threshold point that turns the trait on; the modifiers of form are described as being expressed “downstream” of the switch (Figure 9-1). The threshold, or switch point, is a decision point, the point at which the expression of a trait is said to be determined. Clearly, both the modification of regulation and the modification of form can be highly polygenic, since all of these contributing systems are themselves genetically complex.

FIGURE 9-1. The development of a phenotypic trait. Development can be visualized as a series of branching pathways, each modular trait being the end point of a developmental pathway or branch. Trait determination occurs when a new branch is formed at a decision or …

The genes themselves do not determine the expression of a trait like laughter—or any other trait. The genetically influenced laughter apparatus requires some environmental factor—a joke, a comical event, or a siege of tickling—to be turned on. The expression of the trait is jointly modified by genes (e.g., those influencing the level of the threshold) and the environment. The environment further influences the structure that responds, for no gene can act alone to produce a structure (e.g., muscle, nerve, and bone) without stimuli and materials that are environmental in origin. Environmental factors that influence regulation and form can be as specific and precise in their properties as are the genomic instructions themselves, as illustrated by the precision of day-length cues that trigger hibernation and diapause in the winter physiologies of organisms, and the dietary elements, such as specific vitamins and amino acids, that are required for normal human development.
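The joint dependence on genes and environment described above can be made concrete with a toy threshold model. The sketch below is only an illustration, not a model taken from this chapter: the names `Individual`, `liability`, and `is_expressed`, the additive combination of effects, and all numeric values are assumptions chosen for simplicity.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Individual:
    # Hypothetical small contributions of several modifiers of regulation
    # (the "upstream" polygenes). Values are illustrative only.
    regulatory_effects: List[float]
    # Hypothetical level that must be reached for the trait to switch on.
    threshold: float = 1.0


def liability(ind: Individual, environmental_input: float) -> float:
    """Add genetic and environmental contributions (an assumed additive rule)."""
    return sum(ind.regulatory_effects) + environmental_input


def is_expressed(ind: Individual, environmental_input: float) -> bool:
    """The trait is 'on' only when the combined input reaches the threshold."""
    return liability(ind, environmental_input) >= ind.threshold


person = Individual(regulatory_effects=[0.125, 0.25, 0.125])
print(is_expressed(person, environmental_input=0.0))  # no joke, no tickle
print(is_expressed(person, environmental_input=0.5))  # stimulus present
```

In this sketch the same genotype yields different outcomes depending on the environmental input, which is the point of the paragraph: neither term alone settles whether the threshold is passed.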

The intertwined genomic and environmental influences on trait expression are further complicated by the fact that previous episodes of gene expression themselves add to the effective environment: genes act within cells, and gene products become part of the internal environment for subsequent gene expression. Previous gene expression and interaction with the environment also contribute to the physical and social environments of any subsequent episode of gene expression. In spite of this intricate history of genetic influence, it would be inaccurate to think that the genome can run everything, for no gene can act to produce a developmental signal or a protein product without materials, energy, and often cues ultimately derived from outside the organism.

The complexity of trait structure is sometimes revealed by partial expression of the trait. In the full expression of laughter, for example, a single input (such as a tickle) affects a threshold of trait determination (the initiation of laughter) that coordinates a compound response, with various, otherwise somewhat independent, aspects of the phenotype, such as respiration, facial expression, and vocal output brought into play in synchrony. At a low level of stimulation, some of these elements of the compound response might be omitted, if the threshold for all is not passed. In such a compound or mosaic trait, several different semi-independent components, each influenced by a separate set of underlying genes, are brought into play. Similarly, an infection may induce a “typical” full set of disease symptoms, but some may be absent at sub- or near-threshold expressions of the disease.

Variation in quantitative traits likewise typically depends on a multitude of genes and on environmental factors. Body size, for example, may be influenced by metabolic efficiency, appetite, foraging ability, disease resistance, and the ability to learn how to obtain food—all genetically complex aspects of the individual phenotype. And obviously, body size is affected by such environmental factors as food availability, the energy demands of fluctuating physical conditions, and the incidence of predation or disease.

This is a grossly simplified characterization of the genetic underpinnings of trait development and variation,1 but it is a useful guide for relating genes to observable phenotypic traits. Complex off-on traits, as well as quantitative traits, are universally polygenic in both regulation and form, as well as universally environmentally sensitive in their expression. In addition, discrete trait expression involves two major sets of genes: those upstream of the trait, whose expression or products influence whether or not and when the trait is produced; and those downstream of the trait, whose expression or whose products modify the form of the trait itself, rendering it distinctive relative to other traits of the same organism.


In spite of the genetic complexity of most traits of interest to biologists, medical researchers, and social scientists, it has long been common for theoretical geneticists to refer to genes “for” particular traits, like coat color or altruism. It is also common for biologists to think in terms of a gene for a trait in classroom explanations of how Mendelian genetics and trait evolution work. So language that refers to a gene “for” a complex trait is by no means new or unusual among biologists. But this linguistic convention has taken on a new significance in the present age, when genes associated with particular traits can be biochemically characterized, assigned names, and precisely located on particular chromosomes. What was formerly an abstract manner of speaking has become a real assertion about the genetic determination of traits.

For some years I have been making a collection of genes “for” human traits announced in scientific journals and the popular press. The collection began in the early 1990s with the discovery of “the obesity gene.” It now includes, among others, announcements of genes for language, intelligence, alcoholism, bipolar illness, deafness, schizophrenia, asthma, longevity, and maternal solicitude. There is also a warrior gene, a “gay” (sexual orientation) gene, and even a graceful jaw gene responsible for a human facial trait not present in other primates.

What is the empirical basis for these announcements? Usually the gene-for-trait claim is based on the discovery of a mutation or a genetic knockout that dramatically affects the probability of expressing the trait. What this means in developmental terms is that one of the many modifiers of regulation of the trait, for example, one of the polygenes that affects its threshold of expression, has been altered and acquires a large, “major gene” effect on trait expression. That is, the mutation's effect makes one of the polygenes predominate in trait determination so that it alone can control the switch for expression of the trait. This is a useful research tool because it renders one of the modifiers of regulation identifiable, whereas in its normal unmutated state its effect might have been so small as to be indistinguishable from those of the many other modifiers of regulation. The mutation's large effect gives the impression of single-gene control of the trait. In fact, the expression or nonexpression of the phenotype in individuals of the population at large is subject to polygenic and environmental influence.
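A toy calculation may help show why a mutation of large effect gives the impression of single-gene control. The sketch below assumes, purely for illustration, that each modifier of regulation contributes an additive effect to the threshold; the function name `locus_shares` and all numbers are hypothetical.

```python
from typing import List


def locus_shares(effects: List[float]) -> List[float]:
    """Return each locus's fraction of the summed absolute effect."""
    total = sum(abs(e) for e in effects)
    if total == 0:
        raise ValueError("at least one locus must have a nonzero effect")
    return [abs(e) / total for e in effects]


# Ten modifiers of regulation with equal, small effects (illustrative values).
normal = [1.0] * 10
# The same ten loci after a hypothetical mutation enlarges the first effect.
mutant = [41.0] + [1.0] * 9

print(max(locus_shares(normal)))  # no single locus stands out
print(locus_shares(mutant)[0])    # the mutated locus now dominates
```

The other nine loci are still present and still contribute in the mutant case; they are simply swamped, which is why the mutated locus becomes identifiable.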

Mutations of large effect are commonly responsible for familial, inherited diseases, giving the false impression of a genetically simple phenotype devoid of environmental effects. Bipolar (manic-depressive) illness is an example of a “genetic” disease inherited within families. Patterns of inheritance within affected populations suggest that the genome of some ancestor suffered a mutation, whose bearers among descendants have an increased probability of manifesting the disease.

If the model of polygenic genetic architecture I have just outlined is correct, several predictions are possible. First, mutations at different genetic loci should be associated with the disease in different families or populations. Second, there should be evidence of environmental influence on incidence of the disease. And third, there should be phenotypic (physiological, morphological, behavioral) evidence of the complexity of mechanisms underlying the disease.

All of these predictions hold for bipolar illness. The disease has been linked to regions on at least eight different chromosomes (Blackwood, Visscher, and Muir, 2001; Kelsoe et al., 2001), and mutations at different chromosomal locations are important in different populations. For example, genetic markers are located on the X chromosome in a Finnish population (Pekkarinen, Terwilliger, Bredbacka, Lönnqvist, and Peltonen, 1995); chromosome 18 in Costa Rican families (Freimer et al., 1996); chromosome 22 in a general North American population (Kelsoe et al., 2001) and (at a different locus) in northern European families of Caucasian ancestry (Barrett et al., 2003); and chromosome 12 in a Danish population and (with some chromosomal overlap) in an isolated population on the North Atlantic Faroe Islands, a region colonized by Scandinavians (Degn et al., 2001).

Environmental influence on the expression of bipolar illness is suggested by the fact that occurrence of the depressive phase correlates with day length (incidence is higher in winter) and by the fact that 20 percent of identical twins of affected individuals do not show the disease (Gershon, 1990). So the genetic mutation can be present without being expressed.

The mosaic or compound nature of the bipolar disease phenotype is indicated by clinical variation in disease symptoms of this and related disorders, which has been referred to as the “bipolar spectrum” (Gershon, 1990, p. 380). In bipolar type I disorder, patients have both manic and depressive phases; in bipolar type II disorder, they show hypomania but not a full manic phase; and twin studies have indicated that there is some overlap in vulnerability for unipolar (depressive) and bipolar (manic-depressive) illness. There is also some evidence from studies of genetic relatives for an association between bipolar disease and other illnesses, such as schizoaffective disorder and cyclothymic personality disorder (Gershon, 1990).

Finally, the complexity of mechanisms underlying bipolar disease is evident from various studies. For example, research on signal transduction mechanisms in the brain suggests that bipolar disease may be due to the interaction of many kinds of abnormalities in these systems (Bezchlibnyk and Young, 2002), a conclusion supported by the fact that lithium, a substance long known empirically to bring relief to bipolar patients, appears to act by stabilizing neuronal activities, including signaling activities of several kinds and at multiple levels of influence on neural plasticity (Jope, 1999).

Mutation studies can aid in the genetic dissection of the causes of complex phenotypes. Studies of the mutations and brain tissues in bipolar disease, for example, have identified altered function and levels of particular effector molecules, such as protein kinase A (PKA) and protein kinase C (PKC) (Bezchlibnyk and Young, 2002), and of G protein receptor kinase 3 (GRK3) (Barrett et al., 2003), as being involved in the signal transduction abnormalities associated with the disease.

In sum, single-gene-locus studies are clearly valuable because they permit the identification of genes and pathways that influence the development of complex traits. But they do not tell us that a particular genetic locus “controls” development of the trait. With the exception of a very few, rare diseases that appear to be truly under single-locus control (e.g., cystic fibrosis, Huntington's disease (Huntington's chorea), phenylketonuria, and Smith-Lemli-Opitz syndrome), mutant loci associated with inherited disease are unreliable global markers because disease genotype may vary with the population of origin.

Inaccurate statements about the “genetic determination” or the “environmental determination” of traits are easily fixed by simply using the word “influenced” rather than “determined.” Similarly, a gene “for” a particular trait is more accurately described as a gene that “influences” a particular trait. The single word “influence” turns a misleading headline into an accurate one. This is a linguistic mutation whose spread should be encouraged under strong selective pressure from biologists and social scientists alike.


Three of the genes in my collection of genes “for” traits are genes for quantitatively variable traits: obesity, longevity, and intelligence. The polygenic nature and environmental sensitivity of variation in these traits are so obvious as to need no evidence beyond common experience. The control of the development of a polygenic quantitative trait can be overwhelmed by a mutation of major effect, just as the determination of a discrete trait can be. So the same reservations regarding single-gene markers outlined in the previous section apply to single-gene markers of quantitative traits as well. Here I focus on longevity genes because they are “demo-genes” (Ewbank, 2000), that is, genes that affect a parameter of interest to demographers.

Evolutionary theory has a branch that deals with the evolution of longevity and senescence. Although this is not my specialty, one idea strikes me as particularly relevant to the search for a developmental-genetic basis of longevity. Williams (1957) has suggested that a postreproductive acceleration of senescence is expected due to a kind of antagonistic pleiotropy—the accumulation of negative pleiotropic postreproductive effects of genes that have positive effects on survival and reproduction at younger (reproductive and prereproductive) stages. Natural selection can only affect traits (and genes) expressed prior to the end of reproduction; postreproductive expression does not usually affect relative reproductive success (selection) unless it somehow affects the reproductive success of cocarriers of the same genetic alleles (e.g., relatives). This is one argument used to explain the “grandmother effect”—the long postmenopausal survival and activity of human females, who often make a substantial postreproductive contribution to rearing their grandchildren (Hawkes, O'Connell, Jones, Alvarez, and Charnov, 1998). Antagonistic pleiotropy predicts that the diseases of aging are especially likely to be associated with early benefits (Williams and Nesse, 1991) and that some genes that contribute to senescence in the elderly may show little allelic variation (polymorphism) within populations, being under strong positive selection during earlier life stages.

Reading the literature on the search for longevity genes, one senses a conviction that there must be some underlying general basis for a long life span if only one could find it. It is a search older than science itself, dating back to the medieval alchemists' search for the elixir that would prolong life. Longevity genes remain the quantitative-trait genes of widest public interest. Not everyone has severely limited intelligence or an obesity problem, but everyone has a limited life span.

Notwithstanding our deep-seated desires, commonsense reasoning and findings to date provide little hope for success in the discovery of major-effect longevity genes. Any of the enormous number of genes that contribute to survival could be considered longevity genes. Among the candidates that have been proposed are several heat-shock protein loci, which are given special attention because they are widespread among organisms, including humans, and are general-purpose responses to several kinds of systemic stress, such as heat stress, oxidants, and starvation. Some heat-shock proteins have been shown to prolong life in transgenic lines of fruit flies subjected to these stresses (Wang, Kazemi-Esfarjani, and Benzer, 2004). I can see no reason to grant higher status to these genes as longevity genes than genes that affect, say, wing development, which likewise influences fruit fly survival in multiple contexts (e.g., location of food sources under starvation, movement into the shade under heat stress, flight to escape predators). Still, the Wang et al. (2004) study represents an advance in molecular research on longevity because, instead of ignoring the environment, it experimentally manipulated the environment to investigate the longevity effects of particular genes.

Ewbank (2000) lists four criteria that a longevity gene must satisfy to be considered a demographically useful marker able to predict variation in life span: (1) association with the most common causes of death; (2) multiple alleles (genetic polymorphism) of the gene, associated with substantial variation in mortality; (3) large variation in frequencies of these alleles across populations; and (4) correlations of alleles with environmental or behavioral characteristics considered to be associated with mortality rates. Ewbank reports that only one human gene is known to satisfy all of these criteria, the apolipoprotein E (ApoE) gene. The ApoE gene (1) is a major risk factor for ischemic heart disease and Alzheimer disease; (2) has three common alleles (ApoE-2, ApoE-3, ApoE-4), with two genotypes (ApoE-3/ApoE-4 and ApoE-4/ApoE-4) associated with increased risk of both diseases and two others (ApoE-2/ApoE-2 and ApoE-2/ApoE-3) associated with decreased risk of both diseases, risks that vary with age and genetic background (Ewbank, 2000, pp. 73-74) and with environmental circumstances (fat content of diet; Ewbank, 2000, pp. 78-79); (3) shows geographic variation in the frequencies of the three alleles (for example, the ApoE-4 allele is unusually common in Africa, where its frequency is 0.20); and (4) is associated with genotype-specific mortality rates that are affected by diet, an environmental and behavioral trait of interest. Nonetheless, cautions Ewbank, “it is not likely that any single genotype will explain much of the heterogeneity of mortality under age 80…. To put this in perspective, I estimate that the APOE e4/4 genotype is associated with a relative risk of death at age 80 of about 2 relative to the most common genotype, e3/3. Less than 5 percent of the population has the e4/4 genotype” (Ewbank, 2000, p. 83).
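Ewbank's caution can be checked with a rough back-of-the-envelope calculation. The sketch below uses Levin's formula for the population attributable fraction, a standard epidemiological measure that is not part of Ewbank's own analysis. It assumes, as a simplification, that everyone without the e4/4 genotype shares the baseline (e3/3) risk, and it takes 5 percent as an upper bound on genotype frequency; the function name is hypothetical.

```python
def attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Levin's formula: share of deaths attributable to the risk genotype.

    PAF = p * (RR - 1) / (1 + p * (RR - 1))
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if relative_risk < 0.0:
        raise ValueError("relative risk cannot be negative")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Figures quoted from Ewbank (2000, p. 83): relative risk about 2,
# genotype frequency below 5 percent (0.05 is used here as an upper bound).
paf = attributable_fraction(prevalence=0.05, relative_risk=2.0)
print(round(paf, 3))
```

Under these simplifying assumptions, the genotype accounts for fewer than 5 percent of deaths at that age, which is in keeping with Ewbank's statement that no single genotype is likely to explain much of the heterogeneity of mortality.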

This very reserved conclusion regarding the one gene that fits minimal characteristics of demographic utility suggests that particular genes are not promising markers of longevity. Perhaps there will prove to be genes or patterns of gene expression that do affect general energy or stamina over the long term. If so, parrots may be better model organisms than nematodes or fruit flies in the search for genomic characteristics associated with longevity. For studying phenotypic rather than genetic correlates of longevity with the needed large samples, insects have proven more suitable than either birds or mice (Carey, 2003).



Following Ewbank's (2000) lead, we can list the criteria to be satisfied by a dependable single-locus genetic marker of a phenotypic trait:

  1. The phenotypic trait whose future occurrence is to be predicted by the marker has to be uniformly and operationally well defined, for example, as a specific measurable value of a quantitative trait or a set of consistently associated distinctive characteristics of a qualitative, discrete trait. Intelligence, for example, is an insufficiently well-defined trait to serve as the basis for a biomarker search. Some particular aspect of what we call intelligence, as measured by some well-researched standardized test, would need to be used.
  2. To be of use as a predictor, the marker has to be detectable before the trait appears during individual development. A particular DNA sequence might satisfy this criterion, for genotype does not change during development. A protein specified by a particular DNA sequence is a less dependable marker, for reasons discussed below.
  3. There must be multiple alleles (genetic polymorphism) of the marker gene, highly correlated with measurable variation in the expression of the trait. For qualitative (discrete) traits, this means variation in whether or not the trait is expressed, not variation in some feature of the trait once it is present.
  4. The confidence in the marker is increased if it can be shown that its expression is involved in developmental mechanisms or pathways known to be involved in the development of the trait versus the nondevelopment of a discrete trait or lesser development of a quantitative trait.
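The four criteria above can be written down as a simple checklist. The sketch below is one hypothetical way to record them; the class name, field names, and the decision to treat criterion 4 as supporting evidence rather than a requirement are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MarkerCandidate:
    trait_operationally_defined: bool      # criterion 1
    detectable_before_trait_appears: bool  # criterion 2
    alleles_correlate_with_trait: bool     # criterion 3
    # Criterion 4 is worded as raising confidence, so it is kept separate.
    involved_in_known_pathway: bool


def unmet_requirements(candidate: MarkerCandidate) -> List[int]:
    """Return the numbers of criteria 1 to 3 that the candidate fails."""
    checks = [
        (1, candidate.trait_operationally_defined),
        (2, candidate.detectable_before_trait_appears),
        (3, candidate.alleles_correlate_with_trait),
    ]
    return [number for number, satisfied in checks if not satisfied]


# Hypothetical candidate: a DNA sequence proposed as a marker of "intelligence"
# without any standardized measure of the trait.
vague_trait = MarkerCandidate(
    trait_operationally_defined=False,
    detectable_before_trait_appears=True,
    alleles_correlate_with_trait=True,
    involved_in_known_pathway=False,
)
print(unmet_requirements(vague_trait))
```

A candidate passes this screen only when the returned list is empty; the pathway field can then be used to rank the candidates that remain.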

Potential Traps

Given the structure of development in relation to gene expression and the polygenic nature of complex traits, there are several kinds of errors likely to appear in discussions of genetic markers and their evaluation by researchers.

Type 1 error.

The assumption that a gene (a genetic allele) commonly expressed in bearers of a trait and not expressed in nonbearers of the trait is a major cause of the trait. This is erroneous for genes expressed downstream of discrete-trait determination: such modifiers of form affect the characteristics of the trait, but they do not affect the likelihood of its expression during development. A type 1 error is akin to taking a symptom (such as fever) to be a cause of a disease.

Type 2 error.

Unwarranted extrapolation between populations. A mutation that causes a highly heritable, familial trait may prove a useful genetic marker in the descendants of the mutant individual. But other populations with the same trait may have a different mutant cause, as in bipolar illness (see above). The same kind of variation in genetic underpinnings could occur in different populations for any polygenic trait.

Type 3 error.

Use of a protein consistently expressed in individuals bearing the trait. Proteins are sometimes used as gene identifiers, because the specific form of a protein reflects the expression of a particular genetic allele. But the presence of a protein means that the gene has already been expressed, so to serve as a predictor of the future development of the trait, it would have to be the product of a modifier of regulation known to affect the probability of developing the trait, and upstream of trait determination, not a modifier of form (which would not affect probability of trait expression). Consider a disease phenotype, for example. A protein that is consistently a symptom of the disease could not serve as a predictive marker, even though it would be present in all carriers of the disease and might be absent in all noncarriers. Proteins that are expressed very early in trait development may sometimes be useful predictive markers.

Type 4 error.

Use of a genetic allele that produces a protein associated with a trait. For example, having discovered that a particular protein is consistently symptomatic of a disease and not present in nonbearers of the disease, one might erroneously suppose that the presence of the allele responsible for that protein in a genotyped young individual would be a good predictor of the later development of the trait. It is important to realize, however, that genes can be present without ever being expressed, as in the unaffected identical twins of bipolar patients. The presence of the allele would be a necessary, but not a sufficient, condition for expression of the trait.
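The gap between a necessary and a sufficient condition can be made concrete with a small calculation. The sketch below uses invented counts for a hypothetical genotyped cohort (none of the numbers come from the studies cited here) and assumes the simplest possible summary: the fraction of trait bearers who carry the allele, and the fraction of allele carriers who develop the trait.

```python
from fractions import Fraction


def predictive_summary(carriers_with_trait, carriers_without_trait,
                       noncarriers_with_trait, noncarriers_without_trait):
    """Summarize a 2x2 table of allele carriage versus trait expression.

    Returns (sensitivity, positive_predictive_value):
      sensitivity = P(allele present | trait expressed)
      positive_predictive_value = P(trait expressed | allele present)
    """
    with_trait = carriers_with_trait + noncarriers_with_trait
    carriers = carriers_with_trait + carriers_without_trait
    sensitivity = Fraction(carriers_with_trait, with_trait)
    positive_predictive_value = Fraction(carriers_with_trait, carriers)
    return sensitivity, positive_predictive_value


# Hypothetical cohort of 1,000 people: every affected person carries the
# allele (it is necessary), but most carriers never express the trait
# (it is not sufficient).
sens, ppv = predictive_summary(20, 180, 0, 800)
print(f"P(allele | trait) = {float(sens):.2f}")  # 1.00
print(f"P(trait | allele) = {float(ppv):.2f}")   # 0.10
```

In this invented cohort the allele is found in every bearer of the trait, yet only one carrier in ten goes on to express it, which is why presence of the allele alone would be a poor predictor.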

These kinds of errors are especially likely to appear in pharmaceutical frenzies: races to find genetic markers that enable prediction of disease and allow the identification of molecular processes to be attacked by drugs. Facile discussions of “drug biomarkers” based on proteins and genes belie the complexity of traits and the stringent criteria required of dependable markers. An article on cancer research in Science (Kaiser, 2004) cited an experienced cancer researcher as arguing that, although only a handful of biomarkers are widely used, the sequencing of the human genome and the debut of new, automated mass spectrometry machines for detecting proteins leave the field ripe for new breakthroughs in the search for biomarkers. Although the map of the human genome can help locate identified genes, by itself it provides no information on genetic variation between individuals and populations of the kind crucial to biomarker research, for each segment of the map is (necessarily, for technical reasons) based on the genome of a single individual (Marshall, 1996). Likewise, masses of protein data generated by automated machines would need to be accompanied by masses of correlated data on polymorphic genotypes, clinically well-defined phenotypes, life stages, and geographic locations of samples to be useful in biomarker research.

The pessimistic view presented here regarding single-gene markers need not apply to the search for genetic markers in general. But it does suggest that the search should be redirected, away from single genes and perhaps toward “collective markers”—sets of genetic alleles whose summed expression raises the probability of expression of a trait of interest by a specified amount. By first carefully focusing on upstream genetic modifiers of regulation, which are better predictive markers than are modifiers of form, it may be possible to construct collective markers that reflect the polygenic nature of complex traits (see, e.g., Sing et al., 1992, 1995).
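One way to picture a collective marker is as a score that sums small contributions from several alleles and converts the total into a probability of trait expression. The sketch below is only an illustration under strong assumptions that are not taken from the text or from Sing et al.: the effects are assumed to add on a log-odds scale, and the allele names, effect sizes, and baseline probability are all hypothetical.

```python
import math


def collective_marker_probability(baseline_probability, allele_effects, genotype):
    """Probability of trait expression under an assumed additive log-odds model.

    baseline_probability: probability of the trait with none of the alleles
    allele_effects: dict of allele name -> log-odds increment per copy (hypothetical)
    genotype: dict of allele name -> number of copies carried (0, 1, or 2)
    """
    log_odds = math.log(baseline_probability / (1 - baseline_probability))
    for allele, effect in allele_effects.items():
        # Alleles missing from the genotype contribute nothing to the sum.
        log_odds += effect * genotype.get(allele, 0)
    return 1 / (1 + math.exp(-log_odds))


# Hypothetical upstream regulatory alleles, each with a small effect.
effects = {"reg_A": 0.4, "reg_B": 0.3, "reg_C": 0.2}

p_none = collective_marker_probability(0.05, effects, {})
p_one = collective_marker_probability(0.05, effects, {"reg_A": 1})
p_many = collective_marker_probability(0.05, effects, {"reg_A": 2, "reg_B": 1, "reg_C": 2})
print(p_none, p_one, p_many)
```

Under these assumptions no single allele shifts the probability much, but the summed score separates genotypes more clearly, which is the intuition behind looking for sets of alleles rather than single genes.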


Genes for traits have become great publicity devices for scientists anxious to claim applied significance and obtain increased funding for their research. Good advertising demands nice clean language, so “the obesity gene” or “the gene for alcoholism” readily displaces the more accurate “a gene that influences obesity” or “enhances the likelihood of addiction to alcohol.” The public could easily be educated, in classrooms and in the press, to exercise common sense about genetic explanations and to realize that many genes must affect traits like obesity, intelligence, and alcoholism, whose causes are obviously complex. Yet people seem to prefer simple explanations and the promise of simple solutions to complex problems. All too often wishful ignorance prevails, and “genes for traits” have become modern elixirs that can turn wishful ignorance to commercial gold.

My favorite marker-icon is still the obesity gene, a lucrative leader in the gene-age parade that would be delightful in its absurdity had it not been so effectively misleading. Not surprisingly, given the complexity of obesity as a quantitative trait, there are now at least 10 known “obesity genes,” some of them with several allelic variants (Perusse et al., 2005). The first of the several obesity genes was hastily patented and brought millions of dollars for obesity research to the university where it was discovered. Even as I was finishing this chapter, a new gene for obesity was announced in Science (Herbert et al., 2006) and in the Washington Post (April 17, 2006, p. A6). The search for obesity genes is potentially endless, given the great complexity of the trait and the variation in genetic composition of different populations for alleles that influence obesity (e.g., see Ewbank, 2000, and above, on the ApoE gene).

I have heard scientists defend obesity-gene language as “only a manner of speaking, for everyone knows that there is more to obesity than this one gene.” It is quite common for scientists, especially the many who try to be accurate in interviews, to blame the gene-for-trait language on the press, but this seems unfair if eminent scientists engage in the same brand of genetic hyperbole in the scientific and public media. Nobel laureate David Baltimore (2001), for example, during the euphoria of human genome announcements, wrote in Nature that “Analysis of Single Nucleotide Polymorphisms will provide us with the power to uncover the genetic basis of our individual capabilities such as mathematical ability, memory, physical coordination, and even, perhaps, creativity”; and James Watson, another Nobel laureate, proclaimed (Associated Press, 2000) that “Now we have the instruction book for human life.…”

The human genome project was the most expensive genetic research project in history, a funding triumph as well as a scientific one. It is difficult to escape the impression that one reason for genetic hyperbole is the desire of scientists to bring public attention to their research and thereby increase their access to funding, space in prestigious journals, and career success. But marketing is not the only explanation for the exaggerated status of genes, for there is also genuine ignorance and misunderstanding among scientists of the role of single genes in the development and evolution of complex traits.

In addition to neglect of the developmental role of the environment and of the complexity of the genetic architecture of traits that I have discussed in this article, neglected features of evolutionary genetics also illuminate some aspects of human genetics that otherwise seem mysterious. For example, just as trait-specific genes are hard to find, so are human-specific genes, when the human genome is compared with that of other primates (Ridley, 1999). Both facts are explicable by the observation, already mentioned, that ancestral phenotypes, and the underlying genes, have commonly been recombined during evolution in new coexpressed sets to produce novel phenotypic traits. Due to such “developmental” or phenotypic recombination (West-Eberhard, 2003), complexly distinctive human characteristics, such as many aspects of language, could have evolved, via reorganized gene expression, with few or even no new genes. Patterns like these require attention to whole organisms and to comparisons among related species that are increasingly rare in biology and in the training of modern biologists.

Although the social sciences naturally look to biology for guidance in understanding genes, in fact the social sciences may be able to inject some sense into the genetic interpretations of biologists. Social scientists are experts in a model organism—Homo sapiens—that is unquestionably the best studied vertebrate. They are fully aware of environmental influence on phenotypes and of individuals as integrated wholes. The present century promises to be an exciting era for cross-disciplinary research in which the relatively holistic interests of social scientists and whole-organism biologists will converge with those of geneticists. Genetics is moving toward genomic studies that are increasingly concerned with gene expression and therefore with development and the phenotype. This will mean increased attention to the conditions and mechanisms of development, including hormone action, the nervous system, and behavior. Ultimately, and unavoidably, understanding gene action will have to address variation in environments, including the social milieu. This will have profound consequences for understanding the properties of human populations that are the subjects of this volume.


Acknowledgments

I thank James R. Carey, William G. Eberhard, and an anonymous reviewer for helpful suggestions. Tim Wright permitted access to an unpublished manuscript, and Neal G. Smith and Lynne C. Hartshorn provided many essential references.


References

  1. Associated Press. Scientists announce DNA mapping. 2000 June 26.
  2. Badcock C, Crespi B. Imbalanced genomic imprinting in brain development: An evolutionary basis for the aetiology of autism. Journal of Evolutionary Biology. 2006;19(4):1007–1032. [PubMed: 16780503]
  3. Baltimore D. Our genome unveiled. Nature. 2001;409:814–816. [PubMed: 11236992]
  4. Barrett TB, Hanger RI, Kennedy JL, Sadovnick AD, Remick RA, Keck PE, McElroy SL, Alexander M, Shaw SH, Kelsoe JR. Evidence that a single nucleotide polymorphism in the promoter of the G protein receptor kinase 3 gene is associated with bipolar disorder. Molecular Psychiatry. 2003;8(5):546–557. [PubMed: 12808434]
  5. Bezchlibnyk Y, Young LT. The neurobiology of bipolar disorder: Focus on signal transduction pathways and the regulation of gene expression. Canadian Journal of Psychiatry. 2002;47(2):135–148. [PubMed: 11926075]
  6. Blackwood DHR, Visscher PM, Muir WJ. Genetic studies of bipolar affective disorder in large families. British Journal of Psychiatry. 2001;178:134–136. [PubMed: 11388952]
  7. Carey JR. Longevity, the biology and demography of life span. Princeton, NJ: Princeton University Press; 2003.
  8. Degn B, Lundorf MD, Wang A, Vang M, Mors O, Kruse TA, Ewald H. Further evidence for a bipolar risk gene on chromosome 12q24 suggested by investigation of haplotype sharing and allelic association in patients from the Faroe Islands. Molecular Psychiatry. 2001;6(4):450–455. [PubMed: 11443532]
  9. Eberhard JR, Wright TF, Bermingham E. Duplication and concerted evolution of the mitochondrial control region in the parrot genus Amazona. Molecular Biology and Evolution. 2001;18:1330–1342. [PubMed: 11420371]
  10. Ewbank D. National Research Council; Commission on Behavioral and Social Sciences and Education. Demography in the age of genomics: A first look at the prospects. In: Committee on Population; Finch CE, Vaupel JW, Kinsella K, editors. Cells and surveys: Should biological measures be included in social science research? Washington, DC: National Academy Press; 2000. pp. 64–109.
  11. Freimer NB, Reus VI, Escamilla M, McInnes A, Spesny M, Leon P, Service S, Smith L, Silva S, Rojas E, Gallegos A, Meza L, Fournier E, Baharloo S, Blankenship K, Tyler D, Batki S, Vinogradov S, Weissenbach J, Barondes S, Sandkuijl LA. Genetic mapping using haplotypes, association, and linkage methods suggests a locus for severe bipolar disorder (BPI) at 18q22-q23. Nature Genetics. 1996;12:436–441. [PubMed: 8630501]
  12. Gerhart J, Kirschner M. Cells, embryos, and evolution: Toward a cellular and developmental understanding of phenotypic variation and evolutionary adaptability. Malden, MA: Blackwell; 1997.
  13. Gershon ES. Genetics. In: Goodwin FK, Jamison KR, editors. Manic-depressive illness. New York: Oxford University Press; 1990. pp. 373–401.
  14. Hawkes K, O'Connell JF, Jones NGB, Alvarez H, Charnov EL. Grandmothering, menopause, and the evolution of life history traits. Proceedings of the National Academy of Sciences, USA. 1998;95(3):1336–1339. [PMC free article: PMC18762] [PubMed: 9448332]
  15. Herbert A, Gerry NP, McQueen MB, Heid IM, Pfeufer A, Illig T, Wichmann HE, Meitinger T, Hunter D, Hu FB, Colditz G, Hinney A, Hebebrand J, Koberwitz K, Zhu X, Cooper R, Ardlie K, Lyon H, Hirschhorn JN, Laird NM, Lenburg ME, Lange C, Christman MF. A common genetic variant is associated with adult and childhood obesity. Science. 2006;312(5771):279–283. [PubMed: 16614226]
  16. Jope RS. Anti-bipolar therapy: Mechanism of action of lithium. Molecular Psychiatry. 1999;4(2):117–128. [PubMed: 10208444]
  17. Kaiser J. NCI hears a pitch for biomarker studies. Science. 2004;306:1119. [PubMed: 15539577]
  18. Keller EF. The century of the gene. Cambridge, MA: Harvard University Press; 2000.
  19. Kelsoe JR, Spence MA, Loetscher E, Foguet M, Sadovnick AD, Remick RA, Flodman P, Khristich J, Mroczkowski-Parker Z, Brown JL, Masser D, Ungerleider S, Rapaport MH, Wishart WL, Luebbert H. A genome survey indicates a possible susceptibility locus for bipolar disorder on chromosome 22. Proceedings of the National Academy of Sciences, USA. 2001;98(2):585–590. [PMC free article: PMC14631] [PubMed: 11149935]
  20. King MC, Wilson AC. Evolution at two levels: Molecular similarities and biological differences between humans and chimpanzees. Science. 1975;188:107–116. [PubMed: 1090005]
  21. Kirschner M, Gerhart J. The plausibility of life. New Haven, CT: Yale University Press; 2005.
  22. Marshall E. Whose genome is it anyway? Science. 1996;273:1788–1789. [PubMed: 8815540]
  23. National Research Council; Commission on Behavioral and Social Sciences and Education. Cells and surveys: Should biological measures be included in social science research? Committee on Population; Finch CE, Vaupel JW, Kinsella K, editors. Washington, DC: National Academy Press; 2001. [PubMed: 23166967]
  24. Osmond C, Barker DJP. Fetal, infant, and childhood growth are predictors of coronary heart disease, diabetes, and hypertension in adult men and women. Environmental Health Perspectives Supplements. 2000;108(S3):545–553. [PMC free article: PMC1637808] [PubMed: 10852853]
  25. Pekkarinen P, Terwilliger J, Bredbacka PE, Lönnqvist J, Peltonen L. Evidence of a predisposing locus to bipolar disorder on Xq24-q27.1 in an extended Finnish pedigree. Genome Research. 1995;5:105–115. [PubMed: 9132265]
  26. Perusse L, Rankinen T, Zuberi A, Chagnon Y, Weisnagel SJ, Argyropoulos G, Walts B, Snyder EE, Bouchard C. The human obesity gene map: The 2004 update. Obesity Research. 2005;13(3):381–490. [PubMed: 15833932]
  27. Ptashne M. A genetic switch. 2nd ed. Cambridge, MA: Cell Press and Blackwell Scientific; 1992.
  28. Ricklefs RE. Embryo development and ageing in birds and mammals. Proceedings Royal Society London B. 2006:1–6. [PMC free article: PMC1635478] [PubMed: 16846916]
  29. Ridley M. Genome. London, England: Fourth Estate; 1999.
  30. Sing CF, Haviland MB, Templeton AR, Zerba KE, Reilly SL. Biological complexity and strategies for finding DNA variations responsible for inter-individual variation in risk of a common chronic disease, coronary artery disease. Annals of Medicine. 1992;24:539–547. [PubMed: 1485951]
  31. Sing CF, Haviland MB, Templeton AR, Reilly SL. Alternative genetic strategies for predicting risk of atherosclerosis. In: Woodford FP, Davignon J, Sniderman A, editors. Proceedings of the 10th International Symposium on Atherosclerosis Montreal. New York: Elsevier; 1995. pp. 638–644.
  32. Wallace RB. National Research Council; Commission on Behavioral and Social Sciences and Education. Applying genetic study designs to social and behavioral population surveys. In: Committee on Population; Finch CE, Vaupel JW, Kinsella K, editors. Cells and surveys: Should biological measures be included in social science research? Washington, DC: National Academy Press; 2001. pp. 229–249. [PubMed: 23166967]
  33. Wang HD, Kazemi-Esfarjani P, Benzer S. Multiple-stress analysis for isolation of Drosophila longevity genes. Proceedings of the National Academy of Sciences, USA. 2004;101(34):12610–12615. [PMC free article: PMC515105] [PubMed: 15308776]
  34. West-Eberhard MJ. Developmental plasticity and evolution. New York: Oxford University Press; 2003.
  35. Williams GC. Pleiotropy, natural selection, and the evolution of senescence. Evolution. 1957;11:398–411.
  36. Williams GC, Nesse RM. The dawn of Darwinian medicine. Quarterly Review of Biology. 1991;66(1):1–22. [PubMed: 2052670]
  37. Wright TF, Lackey LB, Schirtzinger EE, Gonzalez LA, Eberhard JR. Mitochondrial control-region duplications and longevity in parrots. in press.



For a discussion of developmental processes actually affected by modifiers of regulation and form, of alternative terms like “regulatory genes” and “structural genes,” and a more thorough discussion of modularity and of genetics in relation to the development and evolution of novel traits, see West-Eberhard (2003). Gerhart and Kirschner (1997) and Kirschner and Gerhart (2005) discuss the nonrandom nature of novel traits derived from preexisting responses of cells.

Copyright © 2008, National Academy of Sciences.
Bookshelf ID: NBK62426

