Biomarkers in epidemiology: scientific issues and ethical implications.

The current generation of biologic markers has three characteristics that differentiate it from previous ones. These include the ability to detect xenobiotics at concentrations at the cellular and molecular level, to detect earlier biologic changes presumptive of disease or disease risk, and to identify a detailed continuum of events between an exposure and resultant disease. If biomarkers are to enhance cancer epidemiology, they must be valid, reliable, and practical. When these characteristics have not been previously demonstrated, pilot studies should be conducted prior to the primary study. Interdisciplinary communication and collaboration are required so that useful markers are selected and that collection and handling, assay, and interpretation are appropriate. Many biomarkers have been developed in the laboratory but lack validation for field use. Validation of a marker for use in a population requires attention to issues of background prevalence, sample size, natural history, persistence, variability, confounding factors, and predictive value. Additionally, practical features such as subject preparation, access to specimens, specimen storage, and costs must be clarified. Ultimately, the use of biologic markers in epidemiologic studies will depend on how well the markers reduce misclassification, provide for better interpretation of exposure-disease associations, and increase opportunities for prevention. Validation studies and general research using biomarkers also have clinical, ethical, and legal implications. These range from communicating uncertainty about the meaning of a marker to the kinds of societal response that result when groups or individuals are identified as having an "abnormal" marker frequency.

Conceptual and Methodologic Issues
The scientific literature on biomarkers has been characterized more by attention to issues surrounding the development of assays than by the methodology for their use in epidemiologic research or by their ethical and legal impact. Such emphasis on the analytical is natural in view of the stage of development of markers; however, if markers are to be useful in cancer epidemiology and in human risk assessment, issues related to epidemiological and field studies must be addressed. To be useful in cancer epidemiology, applications of biomarkers should reduce misclassification of exposures and disease, enhance detection of exposure-disease associations, or increase opportunities for intervention. Biomarkers have two particularly useful characteristics: analytical sensitivity and the ability to represent steps in a heuristic continuum between an exogenous exposure and a resultant disease. Biomarkers have been shown to be highly sensitive indicators. For example, it is possible to detect xenobiotic-DNA adduct binding at the level of 1 in 10^15 adducted nucleotides (1). Markers have also been shown to detect cancer earlier than clinical diagnosis. For example, a combination of DNA hyperploidy and the M344 antibody allows detection of low-grade bladder cancers before they are morphologically apparent (2). The generic model of a continuum of events between xenobiotic exposure and disease is now well known. It is illustrated in Figure 1 for exposure to ethylene oxide, a model that has been used in risk assessment (3-5).
National Institute for Occupational Safety and Health, 4676 Columbia Parkway, Cincinnati, OH 45226.
If biomarkers are to be useful in epidemiologic research, they must also be shown to be valid, reliable, and practical. These characteristics have been widely discussed (1,6-9). For example, hemoglobin adducts meet these criteria. They are valid because they have been shown to occur in the same proportion as the increase in cancer risk; they are reliable because repeated measurements are consistent; and they are practical because they can be obtained in a blood specimen. Assays for hydroxyethyl adducts to hemoglobin are highly sensitive (in the analytic sense). Exposures to ethylene oxide as low as 0.05 ppm have been reported to produce hemoglobin adducts (10), and the relationship with exposure is linear. Hydroxyethyl adducts are not repaired like DNA adducts and so represent exposure over the previous four months [the life of the red cell (11)]. The specificity of hemoglobin adducts is a more complex issue. Workers with no occupational exposure to ethylene oxide also have hydroxyethyl adducts, since other endogenous and exogenous sources of ethylene oxide, such as smoking and exposure to sources of ethylene, can produce these adducts (12). A marker such as hydroxyethyl adducts may thus not be exclusive for occupational exposure to ethylene oxide, but it will integrate the effects of diverse sources and routes of exposure and therefore encompass all ethylene oxide exposures, both occupational and nonoccupational. The lack of complete specificity of hydroxyethyl hemoglobin adducts as indicators of ethylene oxide exposure is therefore not a serious limitation. The levels of these adducts in nonoccupationally exposed people are generally much lower than in those occupationally exposed. Adducts are members of a class of biomarkers of exposure. For markers of effect, the picture is less clear, since few have been validated for disease outcomes such as cancer.
Even with one of the most promising examples, the p53 tumor-suppressor gene, only 50-75% of cancer cases contain this mutation (13); for other oncogenes and tumor suppressor genes, the percentage is lower. The temporal characteristics of these cancer markers are unclear; their role and timing in the natural history of cancer have not yet been defined, nor has their predictive value been determined. This is also true for most intermediate or surrogate markers, including cytogenetic markers in lymphocytes such as sister chromatid exchanges, chromosomal micronuclei, hprt gene mutations, and the oncogenic (oncogenes, suppressor genes, and growth factors) markers. For markers of susceptibility, such as debrisoquine or acetylation phenotype polymorphism, an increasing record of validity is developing. Caporaso and colleagues (14) have shown that individuals who are extensive metabolizers of debrisoquine have a greater risk of lung cancer than those who are poor or intermediate metabolizers (odds ratio = 6.1; 95% confidence interval = 2.2-17.7). With regard to the acetylation phenotype, Vineis et al. (15) have shown that slow acetylators develop more 4-aminobiphenyl-hemoglobin adducts than fast acetylators. Whether the acetylation phenotype is a risk factor for cancer has not been corroborated, despite widely cited references (16,17) to a higher frequency of slow acetylators among bladder cancer cases. We plan studies in which incident bladder cancer cases and controls from exposed and nonexposed populations will be compared.
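For readers less familiar with how a figure such as the odds ratio and confidence interval quoted above for debrisoquine metabolizers is obtained, the following is a minimal sketch of the calculation from a 2x2 case-control table. It uses the common log-based (Woolf) interval, which is an assumption here; the source does not state which method reference 14 used. The counts are hypothetical and are not taken from that study.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and log-based (Woolf) confidence interval for a 2x2 table.

    a: cases with the marker      b: cases without the marker
    c: controls with the marker   d: controls without the marker
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not data from reference 14).
odds_ratio, lower, upper = odds_ratio_ci(a=40, b=10, c=20, d=30)
print(f"OR = {odds_ratio:.1f}, 95% CI = {lower:.1f}-{upper:.1f}")
```

With small cell counts the interval becomes wide, which is one reason sample size appears among the validation issues for susceptibility markers.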
Until the validity, reliability, and practicality of a marker have been demonstrated, pilot studies are useful. Perera (1) and Everson (18), among others, have demonstrated a strategy and approaches for such pilot studies: start with known high-dose groups such as chemotherapy patients, proceed to highly exposed occupational groups, then study occupational and environmental groups with lower exposure. The goal of these studies should be to determine the characteristics of markers that are prerequisites for their use in large population studies. These characteristics include a dose-response relationship, persistence, inter- and intra-person variation, correlation between markers, and correlation with clinical response. For example, Perera et al. (19) studied cancer patients treated with cisplatinum-based chemotherapy and found post-treatment differences in a battery of biologic markers, including increased binding of hemoglobin and plasma protein to cisplatinum and increased levels of sister chromatid exchange.
A hallmark of the early studies utilizing biological markers is extensive interdisciplinary and often interinstitutional collaboration. It is likely that this trend will continue. Despite superb examples of such collaboration, a range of nagging questions will accompany this type of research: is the project directed by the laboratory or the field component, to what extent will resources be allocated for quality control in the laboratory and in the field, where should the data be published, and who is the first author?
The key to answering these questions is the ability to foster interdisciplinary communication. For epidemiologists, this may mean augmented training to understand the rudimentary concepts and terminology of molecular biology, genetics, and pathology, as well as the practical aspects of the use of laboratory methods as research tools. Laboratory scientists must learn the importance of design and training in statistical and epidemiological methods in population studies. Collaborative interdisciplinary research may be encouraged by the new journal Cancer Epidemiology, Biomarkers and Prevention.
In many of the early studies using biomarkers, particularly genetic and molecular markers, adequate attention was not given to subject selection, control of confounding, or choice of statistical analyses. Subjects often appear to have been selected with no appreciation of the impact of bias or attention to confounding factors, sample size, power, or other design features. Granted, many of the early studies were conducted to see if an assay "worked" or how it performed under a range of conditions. In most of these cases, investigators had the good sense not to include statistical analyses, since they were generally not appropriate. Other studies, however, included practically no discussion of statistical design features or evaluation of the underlying assumptions for statistical tests.
In these studies, attention is often paid to only one aspect of the validity of a marker, a term that has different meanings to laboratory scientists and to population scientists. To the laboratory scientist, validity generally means the ability of a test to respond in the presence of a marker and not to respond in its absence. To the epidemiologist, validity pertains to predictive value. Ultimately, from an epidemiologic viewpoint, a marker will be valid and useful if it reduces misclassification, provides for better interpretation of exposure-disease associations, or is useful for prevention. These objectives are discussed in the following sections.
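The gap between the laboratory and epidemiologic senses of validity can be made concrete with a short calculation. The sketch below assumes that laboratory validity is summarized by sensitivity and specificity and applies Bayes' rule to obtain predictive values at different background prevalences. All numbers are hypothetical and chosen only for illustration.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive value via Bayes' rule.

    sensitivity: P(test positive | condition present)
    specificity: P(test negative | condition absent)
    prevalence:  P(condition present) in the population tested
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical assay: 95% sensitive and 95% specific in the laboratory.
for prevalence in (0.5, 0.05, 0.005):
    ppv, npv = predictive_values(0.95, 0.95, prevalence)
    print(f"prevalence={prevalence:<6} PPV={ppv:.3f} NPV={npv:.3f}")
```

The same assay characteristics yield a much lower positive predictive value as the condition becomes rarer, which is why background prevalence matters when a marker moves from the laboratory to a population study.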

Reduce Misclassification
Exposure classification is one of the weakest aspects of epidemiology. Droz et al. (20) compared exposure classification by air monitoring with exposure classification by concomitant biologic (urine) monitoring and found extensive disagreement. While both air and biologic monitoring are surrogate measures of biologically effective dose, biologic monitoring generally allows for assessment of exposures by all routes and for longer exposure periods and encompasses individual metabolic characteristics. These biomarkers, however, have their limitations. Chief among these is the biologic half-life. This is illustrated in Figure 2, which shows the extent of exposure history that can be represented by a biologic marker (20). Factors that influence the dose of a xenobiotic must be considered. For example, Droz (21) demonstrated how the measured dose of organic solvents in workers was influenced by the following variables: interday and intraday fluctuation of exposure, repetition of exposure, physical workload, body build, and metabolism. Individuals classified as having the same exposure by air measurement may still have a different dose owing to the influence of these variables.
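The limitation imposed by biologic half-life can be sketched numerically. The example below assumes simple first-order (exponential) elimination, which is a simplification: some markers, such as hemoglobin adducts, are governed by cell lifespan instead. The half-lives and the 5% detectability cutoff are hypothetical values, not figures from reference 20.

```python
import math


def fraction_remaining(days_since_exposure, half_life_days):
    """Fraction of a marker's signal left after a given time,
    assuming first-order elimination (an illustrative assumption)."""
    return 0.5 ** (days_since_exposure / half_life_days)


def detection_window(half_life_days, detectable_fraction=0.05):
    """Days until the signal from a single exposure falls to the
    given fraction of its initial value."""
    return half_life_days * math.log2(1 / detectable_fraction)


# Hypothetical half-lives spanning hours to months.
for label, half_life in (("5 hours", 5 / 24), ("2 days", 2), ("30 days", 30)):
    window = detection_window(half_life)
    print(f"half-life {label:>8}: signal above 5% for about {window:.1f} days")
```

Under these assumptions a short-lived marker reflects only the most recent shift, whereas a long-lived one averages over weeks or months of exposure history.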
The use of biomarkers has been proposed to reduce misclassification of exposure; however, care must be taken not to introduce a new misclassification with the biomarker. For example, individual differences in cell kinetics and DNA repair may affect the reported level of a marker.
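The cost of misclassification, whether it comes from air monitoring or from an imperfect biomarker, can be illustrated with the standard result that nondifferential misclassification of a binary exposure biases the odds ratio toward the null. The sketch below assumes the same sensitivity and specificity of exposure classification in cases and controls. The true odds ratio, exposure prevalence, and error rates are hypothetical.

```python
def observed_odds_ratio(true_or, exposure_prev_controls, sensitivity, specificity):
    """Odds ratio expected after nondifferential exposure misclassification.

    true_or:                odds ratio under perfect classification
    exposure_prev_controls: true proportion of controls exposed
    sensitivity:            P(classified exposed | truly exposed)
    specificity:            P(classified unexposed | truly unexposed)
    """
    p0 = exposure_prev_controls
    # True exposure prevalence among cases implied by the true odds ratio.
    odds_cases = true_or * p0 / (1 - p0)
    p1 = odds_cases / (1 + odds_cases)

    def classified_exposed(p):
        # Apparent exposure prevalence after classification errors.
        return sensitivity * p + (1 - specificity) * (1 - p)

    q1 = classified_exposed(p1)
    q0 = classified_exposed(p0)
    return (q1 / (1 - q1)) / (q0 / (1 - q0))


# Hypothetical scenario: true OR of 3.0, 30% of controls truly exposed.
for se, sp in ((1.0, 1.0), (0.9, 0.9), (0.8, 0.8)):
    print(f"sens={se}, spec={sp}: observed OR = "
          f"{observed_odds_ratio(3.0, 0.3, se, sp):.2f}")
```

A biomarker improves on an existing exposure measure only if its own classification errors, including those arising from individual differences in cell kinetics and DNA repair, are smaller than the errors it replaces.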

Provide Better Interpretation of Exposure-Disease Associations
The determination of exposure-disease associations without knowing the mechanism is as old as epidemiology itself. As exposures to xenobiotics are controlled to lower levels and as epidemiologists strive to disentangle the effects of multiple exposures and various host factors, understanding of mechanisms will be more important. The promise of biomarker studies is that separate exposures can be discerned and a risk assigned to each. For example, the use of micronucleus formation as an intermediate end point in epidemiologic studies can be enhanced by use of an antikinetochore antibody assay that can discriminate between aneugens and clastogens (22). Genotoxic agents are quite often specific in the effects they produce. For example, radiation induces primary chromosomal breaks and therefore produces kinetochore-negative micronuclei (22), whereas a mixture of benzene metabolites induces an increase mainly in kinetochore-positive micronuclei (23).

Use for Intervention
The best strategy for cancer intervention programs is to build them on a strong foundation of laboratory and epidemiologic research. Validated biologic markers that have been identified in epidemiologic studies as risk factors for a particular cancer may be the focus of primary or secondary prevention programs. For example, identification of slow acetylators among workers employed in industries where aromatic amines are used may provide a rationale for the frequency of screening for bladder cancer; however, if the relative risk for bladder cancer among slow acetylators is of the order of 1.5-2.0, and since approximately 50% of the population has this polymorphism, at least one-third of bladder cancer cases would be missed if a screening program were directed mainly at slow acetylators. An additional risk factor such as an exposure marker would reduce this oversight. Hence, by stratifying a work group on the basis of acetylation phenotype and arylamine-hemoglobin adduct levels, resources could be targeted to the workers at greatest risk (24,25). Prior to such use, however, the ethical and legal implications of distinguishing people on the basis of biologic markers need consideration.
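The "at least one-third" figure above can be checked with a short calculation. The sketch below reads it as the share of bladder cancer cases expected to arise among fast acetylators, that is, outside the targeted group; this reading is an assumption about the authors' intent. The prevalence and relative risks are the values quoted in the text, and baseline risk cancels out of the ratio.

```python
def fraction_of_cases_outside_target(prevalence, relative_risk):
    """Share of all cases arising in people WITHOUT the susceptibility marker.

    prevalence:    proportion of the population with the marker
    relative_risk: risk in marker carriers relative to non-carriers
    """
    # Case counts are proportional to group size times relative risk;
    # the non-carrier group has relative risk 1 by definition.
    cases_in_target = prevalence * relative_risk
    cases_outside = (1 - prevalence) * 1.0
    return cases_outside / (cases_in_target + cases_outside)


# Values quoted in the text: about 50% slow acetylators, relative risk 1.5-2.0.
for rr in (1.5, 2.0):
    missed = fraction_of_cases_outside_target(0.5, rr)
    print(f"RR={rr}: {missed:.0%} of cases occur among fast acetylators")
```

Because the relative risk is modest and the polymorphism is common, a large minority of cases falls outside the targeted group, which supports combining the susceptibility marker with an exposure marker.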

Implications
The use of biologic markers imposes new clinical, ethical, and legal obligations upon researchers. These include scrutinizing the conditions involved in subject recruitment, specimen collection, and specimen access; reporting results; dealing with outliers; considering the effects of labeling subjects "abnormal"; and safeguarding privacy and confidentiality.

Subject Recruitment and Specimen Collection
The methods used for obtaining subjects or their specimens can raise ethical issues. The dangers include giving an implied or false sense of benefit when none is expected; misrepresenting the risk of harm in informed consent documents; or using any of various forms of coercion, ranging from making the subject fear incurring the displeasure of their physicians to the implicit or explicit indication that failure to participate will have implications for job security.
Attention must also be given to excluding potential subjects who may have a negative physiological reaction to the study procedure. For example, in a study of the debrisoquine phenotype using dextromethorphan, we had to address the concern of our Human Subjects Review Board about why we were not excluding subjects with cardiac arrhythmia or hypertension.

Access to Banked Specimens
A potentially controversial issue is the use of specimens for purposes for which they were not collected or by researchers not identified on the consent form. With the increasingly common practice of banking specimens and the fast pace of assay development and marker research, it is likely that there will be pressure to apply new assays to banked specimens without going back to the subjects for permission. This problem can be alleviated in part by using broad language on consent forms, although this may not be supported by institutional review boards. A second approach would be to have each new use for specimens assessed by a review panel, the members of which would serve as representatives of the subjects.

Communicating Results to Subjects
The whole issue of reporting results to study subjects is one that laboratory and field scientists have found difficult. Some argue that the findings are purely the results of research and are uninterpretable on an individual basis. Others take a more paternalistic attitude and decide that there is no good reason to convey the results because they have no implications for health.
When a researcher attempts to communicate results to study subjects, a number of issues must be considered. First, most current biomarker research has no clinical value, yet study subjects generally want to know whether a study indicates that they are "all right." Second, many biomarker studies produce results of uncertain meaning to the investigator. How should this uncertainty be conveyed to research subjects? Currently, there is a paucity of data on ranges of normal levels for most biomarkers. Indeed, one of the objectives of contemporary research is to establish such ranges. Until that is done, it will be difficult to convey the full sense of what findings mean. One of the best ways to interpret results for subjects is to provide their individual results in comparison to those of the rest of the group being studied, although care must be taken with this approach since many factors influence a biomarker measurement. Other, more convoluted scenarios can be envisioned. For example, biomarkers that were purely research variables at the start of a study may be determined at some later time to indicate significant risk or clinical complications. What is our responsibility to alert subjects to these untoward findings?

Dealing with Subjects with Outlying Results and Labeling Subjects "Abnormal"
One reason for communicating the results of marker assays to subjects is that those with highly abnormal results can have appropriate medical follow-up. This is sensible in the context of medical tests but may not be generally feasible for research assays. Still, when there is some potential that a test is indicative of risk, a plan may need to be developed for dealing with subjects with outlying results. This might include repeating the assay, counseling, or recommending a diagnostic evaluation.
Persons with results in the tails of the statistical distribution may be labeled as "abnormal." This could lead to prejudicial responses from employers, insurers, lenders, and other social institutions that consider health-related matters in their deliberations.

Safeguarding Privacy and Confidentiality
Data collected in biologic marker studies, especially data that indicate risk, susceptibility, or potential early changes, may be used inappropriately. Thus, subjects of studies involving markers should be able to expect that their privacy will be maintained and their results kept confidential. This is especially true in relation to occupational opportunities and insurability. Employers may be able to prevent disease by excluding susceptible people from potentially harmful jobs, and insurers may be able to save money by refusing to insure such people or insuring them at a higher rate. Using markers that have not been validated to make such decisions may put an unfair burden on study subjects (25). There are many correlations (from cross-sectional studies) between genetic markers and disease, but very few markers have been validated with regard to predicting disease under exposure conditions (25,26). This issue initially arose in the context of genetic screening in the workplace. Since many cancer markers have a genetic component (i.e., they are phenotypic or genotypic expressions), the similarity between biomarker research and genetic screening may be quite close. Both require anticipatory vigilance against possible untoward or nefarious use of results. Even on the basis of validated markers, the advisability of discriminating against people with abnormal findings is questionable, and such discrimination should not be used in lieu of environmental control.
If biologic markers are to be useful in cancer epidemiology, attention must be paid not only to their use in studies but also to the societal impact of their use. This may require new forms of activity, such as marker registries, broader application of disability laws, and extensive follow-up testing. These activities typify extreme consequences. The most probable impacts on cancer epidemiology are the requirements to be open and communicative with subjects before, during, and after the study and to ensure confidentiality of study findings. This approach should lead to continued productive research using biologic markers in cancer epidemiology.