Use of twins to study environmental effects.

Extrapolation from pharmacogenetic studies would indicate that there is a great deal of genetic variability in the response of humans to noxious environmental agents. Co-twins provide the most closely matched genetic controls possible and, in addition, are matched perfectly for age and often have shared very similar environments since before birth. The efficiency of co-twin control studies can be further increased by the use of sequential analysis, so that in studies of the effects of environmental agents on human populations, twins can give answers that would otherwise require many more unrelated subjects. Use of twins in epidemiological studies is now in its infancy, but investigators should carefully consider the use of this powerful experimental tool and begin to identify twins in population surveys.


Introduction
Human experimentation is faced with ever-increasing costs, more competition for available resources, and the ever-present need to protect the experimental subjects from unnecessary risks. To counter these forces, extreme care must be taken to select the most powerful and efficient experimental designs for human epidemiological research. This paper explores the co-twin control model for use in epidemiological studies as an extremely efficient method of selecting controls that are matched genetically and environmentally as well as being the same age. Coupling paired comparisons of twins with sequential analysis is proposed as the most efficient method of studying low-dose effects of environmental agents on human populations.

Background
Twins have been used in human experimentation for over one [...] The MZ within-pair mean square, for example, contains no genetic variance because MZ twins are genetically identical. The DZ within-pair mean square contains one-half of the population additive genetic variance. The difference between these mean squares (MWDZ - MWMZ) is, therefore, an estimate of one-half of the population genetic variance. An area in which twin studies have been used almost exclusively to estimate genetic variance is the field of pharmacogenetics. Propping (1) reviewed studies of 11 drugs, all studied in twins and all showing a large amount of genetic variability in the rate of metabolism. Brewer (2) coined the term ecogenetics, generalizing that genetic variation is relevant not only to drug action but also to the human response to any kind of environmental agent. Because most of the environmental agents of concern have been implicated in teratogenesis, mutagenesis, carcinogenesis or the genesis of other diseases, it is not feasible in many instances to do experimental studies of these substances in man. Therefore, observational or epidemiological studies are of prime importance.
Twins have just begun to be used in epidemiological studies. Leadership in this field has been taken by the National Heart, Lung and Blood Institute's Epidemiology Branch and the collaborative study of heart disease risk factors (3) by using the National Academy of Sciences National Research Council panel of World War II veteran twins (4).
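The mean-square comparison described earlier in this section can be illustrated with a minimal Python sketch. It assumes that each pair is given as a tuple of two measurements of the same trait and that the within-pair mean square takes the usual one-way analysis of variance form for pairs (sum of squared within-pair differences divided by twice the number of pairs). The function names are hypothetical and are not taken from any of the cited studies.

```python
def within_pair_mean_square(pairs):
    """Within-pair mean square for a list of (twin_1, twin_2) measurements.

    Assumed form: sum of squared within-pair differences / (2 * number of pairs).
    """
    if not pairs:
        raise ValueError("need at least one twin pair")
    return sum((a - b) ** 2 for a, b in pairs) / (2 * len(pairs))


def genetic_variance_estimate(mz_pairs, dz_pairs):
    """Estimate of population genetic variance from MZ and DZ within-pair mean squares.

    Follows the statement that (MWDZ - MWMZ) estimates one-half of the
    population genetic variance, so the difference is doubled.
    """
    mw_mz = within_pair_mean_square(mz_pairs)
    mw_dz = within_pair_mean_square(dz_pairs)
    return 2.0 * (mw_dz - mw_mz)
```

With small samples the difference between the two mean squares can be negative by chance; the sketch returns that value unchanged rather than truncating it at zero.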

Use of Twins in Epidemiological Studies
For most traits of importance in medicine, there is marked variability among individuals. In the simplest terms, this variability can be ascribed to genetic factors, environmental factors or an interaction between genes and environment. This inherent variability often obscures the effects of environmental influences by overwhelming them with background noise or experimental error, and it makes the choice of appropriate controls critical. Table 1 shows how co-twin control studies can be used to control much of this experimental error. The MZ within-pair mean square, for example, contains no genetic variance, which precludes the possibility of genetic-environmental variation, and it contains only a fraction of the population environmental variance, a fraction inversely related to the environmental covariance of MZ twins. Because the twins share the same environment throughout gestation and often very similar environments after birth, this covariance often makes up a considerable portion of environmental variance, making MZ twins often "as alike as two peas in a pod." Dizygotic twins, while not genetically identical, may also be used in co-twin control studies and, while less concordant than MZ twins, are in most cases much more alike than the age- and sex-matched controls often used in epidemiological studies. Dizygotic twins are also useful in studying the associations between genetic markers and disease states or risk factors (6).
There are several experimental situations for which co-twin control studies are suited. Historically, the most commonly used co-twin control study is a clinical trial in which the two members of twin pairs are assigned randomly to two treatments or a treatment and a control. Twins are especially suited for long-term clinical trials where environmental trends are important and for studies in which carryover effects preclude the use of a crossover design. A second potential use for twins is in population surveys in which exposure history to an environmental factor is to be correlated with a disease endpoint. Twins can be ascertained as part of any population survey and then an appropriate sample chosen based upon exposure history with the co-twin serving as a built-in control.
In any proposed co-twin control study the added expense of obtaining twins must be compared to the expected advantage of using twins. It is often relatively inexpensive to ascertain twins, who account for about 1% of the births in this country and, therefore, about one in fifty individuals. In addition, a recent review (7) revealed 21 active twin panels in the United States and Canada, and resources for zygosity diagnosis are available in virtually every state through clinical genetics units (8). The simple procedure of adding the question, "Are you a twin?" to population surveys would often provide a wealth of responses from which to select appropriate twin pairs for study.
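The arithmetic behind the "one in fifty" figure can be sketched as follows. This is a rough illustration only: it assumes that "1% of the births" means 1% of deliveries are twin deliveries, that all other deliveries are singletons, and that twins and singletons survive and respond to surveys at the same rate. The function names are hypothetical.

```python
def twin_individual_fraction(twin_delivery_rate):
    """Fraction of individuals who are twins, given the fraction of deliveries that are twin deliveries.

    Each twin delivery contributes two individuals and each singleton delivery one,
    so the fraction is 2p / (2p + (1 - p)) = 2p / (1 + p).
    """
    if not 0.0 <= twin_delivery_rate <= 1.0:
        raise ValueError("twin_delivery_rate must lie between 0 and 1")
    return 2.0 * twin_delivery_rate / (1.0 + twin_delivery_rate)


def expected_twin_respondents(n_respondents, twin_delivery_rate=0.01):
    """Expected number of survey respondents who are twins (assumed default rate: 1%)."""
    return n_respondents * twin_individual_fraction(twin_delivery_rate)
```

Under these assumptions a 1% twin delivery rate gives a fraction of 2/101, which is close to the one in fifty quoted above. The number of complete pairs available for study would be smaller, since both members of a pair must be located and willing to participate.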
The number of twin pairs needed to test a hypothesis can be calculated given an estimate of within twin-pair variability for the trait in question and the size of the difference to be detected. A method is available for estimating the number of twin pairs needed by using uniformity trials, in which data from twin pairs not subjected to treatment are used to estimate the experimental errors of a completely randomized clinical trial and of a co-twin control study (9). The within-pair variation (MWMZ) is an estimate of the co-twin control study experimental error. The among- and within-pair mean squares are used to obtain an estimate of experimental error for single-born individuals in the same population [(MAMZ - MWMZ)/2]. In this paper a nomogram is presented that allows the investigator to calculate the relative numbers of unrelated individuals and twins needed. This nomogram takes into consideration the relative sizes of the experimental errors and the relative degrees of freedom, because a co-twin control study, like any paired comparison, has fewer degrees of freedom than a completely randomized design. A comparison of the relative efficiencies of matched pair studies with co-twin control studies is simpler because the degrees of freedom are comparable; the only additional information needed is an estimate of how much variability is controlled by the matched pairs as compared to twins, plus an estimate of the difficulty of obtaining matched controls versus twins. It has recently been proposed that a further increase in efficiency can be obtained by coupling co-twin control studies with sequential analysis (10). Sequential analysis was developed in the early 1940s and its use in clinical trials has been reviewed by Armitage (11). In sequential analysis the results are analyzed continuously, and with each analysis a decision is made whether to continue or discontinue the trial.
This method has extreme appeal for human studies, which are often conducted serially as appropriate subjects are ascertained. Figure 1 shows a model of a sequential analysis after Schneiderman and Armitage (12). This model is for normally distributed measurements with known variability and with no prior expectation of which treatment will result in the greater values for the variable being studied. The number of twin pairs to be studied (N) is first estimated, and then twin pairs are studied serially with one member being assigned at random to each of two treatments. The sum of the differences is plotted (treatment A minus treatment B), and if this sum crosses the upper boundary, treatment A is judged to cause a higher value than treatment B. The reverse is true if the lower boundary is crossed. The upper and lower boundaries are generally set at α levels of 0.05 or 0.01. Crossing the middle boundary, generally set at a much lower α value than the upper or lower boundaries, indicates that continuation of the experiment gives little possibility of rejecting the null hypothesis. Therefore, crossing any of the boundaries results in termination of the experiment. It is not unrealistic to expect that, on the average, sequential analysis would reduce the number of experimental subjects needed by one half. In general, sequential analysis will add maximum efficiency in short-term studies or in longer studies in which experimental subjects are ascertained over a relatively long period of time. There is a series of sequential models well suited to various experimental conditions. If the study involves, for example, a population survey of environmental exposure and its relationship to another quantitative variable, the appropriate statistic could be the correlation between twin differences for the environmental influence and the twin differences for the trait being studied. In addition, sequential designs are well suited for use with qualitative traits that are either present or absent.
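The two steps described above, estimating the number of pairs and then monitoring the cumulative sum of within-pair differences, can be sketched in Python. This is a simplified illustration under stated assumptions, not the design of Figure 1. The sample-size function uses a normal approximation and ignores the degrees-of-freedom correction that a nomogram would supply. The monitoring function substitutes Wald-type straight-line boundaries (two one-sided sequential probability ratio tests combined, optionally truncated at a maximum number of pairs) for the closed boundaries of Schneiderman and Armitage (12). The names `delta` (the difference to be detected), `sigma_d` (the known standard deviation of a within-pair difference), `alpha`, `beta`, `power` and `max_pairs` are assumptions introduced for the example.

```python
import math
from statistics import NormalDist


def units_per_arm(error_variance, delta, alpha=0.05, power=0.80):
    """Approximate units per treatment needed to detect a mean difference `delta`.

    Co-twin control study: pass the within-pair mean square; result = twin pairs.
    Completely randomized design: pass the single-individual error variance;
    result = individuals per group. Normal approximation, two-sided test.
    """
    if error_variance <= 0 or delta == 0:
        raise ValueError("error_variance must be positive and delta nonzero")
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    return math.ceil(2 * error_variance * z ** 2 / delta ** 2)


def sequential_twin_test(differences, delta, sigma_d, alpha=0.05, beta=0.05,
                         max_pairs=None):
    """Monitor the running sum of within-pair differences (treatment A minus B).

    Returns (decision, pairs_used). Boundaries are straight lines of the
    Wald type, an assumption standing in for the closed design in the text.
    """
    if delta <= 0 or sigma_d <= 0:
        raise ValueError("delta and sigma_d must be positive")
    slope = delta / 2.0
    scale = sigma_d ** 2 / delta
    outer = scale * math.log((1 - beta) / (alpha / 2))  # upper/lower intercept
    inner = scale * math.log(beta / (1 - alpha / 2))    # middle intercept (negative)
    total, n = 0.0, 0
    for n, d in enumerate(differences, start=1):
        total += d
        if total >= outer + slope * n:
            return "A higher", n
        if total <= -(outer + slope * n):
            return "B higher", n
        # The middle region is empty until inner + slope * n becomes positive.
        if abs(total) <= inner + slope * n:
            return "no difference", n
        if max_pairs is not None and n >= max_pairs:
            break
    return "undecided", n
```

Because both designs share the same approximate formula, the ratio of `units_per_arm` for unrelated individuals to that for twin pairs is roughly the ratio of the two error variances. If the within-pair mean square is used as the error variance in the first function, the matching value of `sigma_d` for the second is the square root of twice that mean square, since a within-pair difference has twice the within-pair variance.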