Velentgas P, Dreyer NA, Nourjah P, et al., editors. Developing a Protocol for Observational Comparative Effectiveness Research: A User's Guide. Rockville (MD): Agency for Healthcare Research and Quality (US); 2013 Jan.

Chapter 4. Exposure Definition and Measurement

, PharmD, PhD and , PhD.

Characterization of exposure is a central issue in the analysis of observational data; however, no “one size fits all” solution exists for exposure measurement. In this chapter, we discuss potential exposure measurement approaches for observational comparative effectiveness research (CER). First, it is helpful to lay out a theoretical link between the exposure and the event/outcome of interest that draws from the study's conceptual framework. For interventions that target health and well-being, the physiological or psychological basis for the mechanism of action, whether known or hypothesized, should guide the development of the exposure definition. When possible, an operational definition of exposure that has evidence of validity with estimates of sensitivity, specificity, and positive predictive value should be used. Other important factors to consider when defining exposure are the timeframe (induction and latent periods), changes in exposure status or exposure to other therapies, and consistency and accuracy of exposure measurement. The frequency, format, and intensity of the exposure are other important considerations for the measurement of exposure in CER studies; they apply to medications (e.g., dose) as well as to health service interventions that may require multiple sessions, visits, or interactions. This chapter also discusses methods for avoiding nondifferential and differential measurement error, which can introduce bias, and describes the importance of determining the likelihood of bias and effects on study results. We conclude with a checklist of key considerations for the characterization and operationalization of exposure in CER protocols.


In epidemiology, the term “exposure” can be broadly applied to any factor that may be associated with an outcome of interest. When using observational data sources, researchers often rely on readily available (existing) data elements to identify whether individuals have been exposed to a factor of interest. One of the key considerations in study design is how to determine and then characterize exposure to a factor, given knowledge of the strengths and limitations of the data elements available in existing observational data.

The term “exposure” can be applied to the primary explanatory variable of interest and to other variables that may be associated with the outcome, such as confounders or effect modifiers, which also must be addressed in the analysis of the primary outcome. For example, in a study of the comparative effectiveness of proton pump inhibitors and antibiotic treatment of H. pylori for the prevention of recurrent gastrointestinal (GI) bleeding, the primary exposures of interest are proton pump inhibitors and the antibiotics for H. pylori. However, it would also be important to measure exposure to aspirin and nonsteroidal anti-inflammatory drugs (NSAIDs), which would increase the risk of GI bleeding independent of treatment status. Similarly, in a comparative evaluation of cognitive behavioral therapy (CBT) for treatment of depression compared with no CBT, it would be important to measure not only the exposure to CBT (e.g., number and/or type of therapy sessions), but also exposure to other factors such as antidepressant medication.

Each intervention (e.g., medication, surgery, patient education program) requires a unique and thoughtful approach to exposure ascertainment. While it may only be necessary to identify if and when an intervention occurred to assign individuals to the appropriate comparison group for one-time interventions such as surgery or vaccine administration, for pharmacologic and other more sustained interventions such as educational interventions, it will often be important to consider the intensity of the exposure by incorporating the dose, frequency, and duration. In addition, for pharmacologic and behavioral interventions the mode of delivery or the context in which the intervention takes place may also be important factors for determining exposure. For example, to evaluate the comparative effectiveness of a multivisit behavioral intervention for weight loss compared with a single-visit program, it is important to consider the total number of visits to ascertain exposure.

The data elements available in a dataset may dictate how exposure is measured. Unlike randomized clinical trials, in which mechanisms exist to ensure exposure and to capture relevant characteristics of exposure, observational comparative effectiveness studies often have to rely on proxy indicators for the intervention of interest. In clinical trials of medications, drug levels may be monitored, pill counts may be performed, and medications may be dispensed in limited days' supply around routine study visits to facilitate medication use. When relying on observational data, however, exposure ascertainment is often based on medication dispensing records, and only in rare cases will drug levels be available to corroborate medication exposure (e.g., international normalized ratio [INR] values might be available from medical records for studies of anticoagulants).

No “one size fits all” solution exists for exposure measurement. Researchers who seek to address similar clinical questions for the same chronic condition may use different approaches to measuring exposure to the treatments of interest.1-5 For example, in evaluating the association between use of inhaled corticosteroids (ICS) and fracture risk in patients with chronic obstructive pulmonary disease (COPD), the period used to define exposure to ICS ranged from ever having used ICS to use during the entire study period to use in the last 365 days to use in the last 30 days. In addition, exposure was characterized dichotomously (e.g., ever/never) or categorically, based on the amount of exposure during the measurement time periods. These examples show that methods for measuring exposure, even for addressing the same clinical question, can vary. Thus, the intent of this chapter is to identify important issues to consider in the determination of exposure and describe the strengths and limitations of different options that are available given the nature of the research question.

Conceptual Considerations for Exposure Measurement

Linking Exposure Measurement to Study Question

A study's conceptual basis should serve as the foundation for developing an operational definition of exposure. That is, if the objective of the study is to examine the impact of chronic use of a new medication on patient outcomes, then the measurement of exposure should match this goal. Specifically, the definition of exposure should capture the long-term use of the medication and not simply focus on a single-use event. The exposure measurement could include alternative measures that capture single-use events; however, the exposure measurement should be able to distinguish short-term use from long-term use so that the primary study question can be adequately addressed.

Examining the Exposure/Outcome Relationship

The known properties of the intervention of interest also should guide the development of exposure measures. It is helpful to lay out a theoretical and biological link between the exposure and the event/outcome of interest that draws from the study's conceptual framework. The biological mechanism of action, whether known or hypothesized, should guide the development of the exposure definition. If the primary exposure of interest in the analysis is a medication, it may be relevant to briefly describe how the pharmacology, the pharmacodynamics (the effects of medication on the body), and the pharmacokinetics (the process of drug absorption, distribution, metabolism, and excretion from the body) informed the exposure definition. For example, in a comparison of bisphosphonates for the prevention of osteoporotic fractures, the exposure definition would need to be tailored to the specific bisphosphonate due to differences in the pharmacokinetics of the various medications. The definition of exposure for ibandronate, a bisphosphonate indicated for osteoporosis that is administered once per month and has a very long half-life, would likely need to be different from the definition of exposure for alendronate, a treatment alternative that is administered orally daily or weekly. When operationalizing exposure to these two medications, examining medication use in the last week would be insufficient for identifying current use of ibandronate but sufficient for identifying current use of alendronate. Analogous scenarios can be envisioned for nonpharmacological interventions. For example, in a study examining a multivisit educational intervention for weight loss, the effect of the intervention would not be expected until individuals participated in at least one (or some) of the sessions. Therefore, it would not be appropriate to create an exposure definition based on registration in the program unless subject participation could be verified.

Examples of Exposure/Outcome Relationships

As noted above, it is helpful to lay out a theoretical and biological link between the exposure and the event/outcome of interest that draws from a conceptual framework. Several examples of exposure and event relationships are displayed in Figure 4.1. These panels show how an exposure might be associated with an increased likelihood of a benefit or harm.

Figure 4.1. Examples of exposure(s) and risk/benefit associations.

The first column (A–C) shows multiple exposures over time where the timing of the exposure is not consistent and stops midway through the observation period. Panel A shows a scenario in which there is a “threshold effect,” where the benefit (or risk) associated with the exposure increases after a specific amount of exposure and the level of benefit/risk is maintained from that point forward. In defining exposure under this scenario, it would be important to define the cumulative amount of exposure. For example, if evaluating the comparative effectiveness of antibiotics for the treatment of acute infection, there may be a threshold of exposure above which the medication is considered effective treatment. In this case, the exposure measurement should measure the cumulative exposure to the medication over the observation timeframe and define individuals as exposed when the threshold is surpassed (if the exposure variable is dichotomized).

This situation contrasts with that in panel B, in which the association between the exposure and the effect decreases rapidly after the exposure is removed. This type of association could be encountered when evaluating the comparative effectiveness of antihypertensive medications for blood pressure control. In this case, there may be (a) some minimum amount of exposure necessary for the medication to begin to have an effect and (b) an association between the frequency of administration and effectiveness. When the exposure is removed, however, blood pressure may no longer be controlled and effectiveness decreases rapidly. In operationalizing this exposure-event association it would be necessary to measure the amount of exposure, the frequency with which it occurred, and when exposure ended.

In panel C, there is an increase in the likelihood of the outcome with each exposure that diminishes after the exposure is removed. This may represent an educational weight loss intervention. In this example, continued exposure improves the effectiveness of the intervention, but when the intervention is removed, there is a slow regain of weight. As in panel B, it is important to consider both the timing and the amount of exposure for the weight loss intervention. Because the effectiveness diminishes slowly only after the exposure is removed, it is important to consider a longer exposure window than when effectiveness diminishes rapidly.

The second column shows scenarios where the exposure of interest occurs at a single point in time, such as a surgical procedure or vaccination. The relationship in panel D shows an immediate and sustained effect following exposure. This could represent a surgical procedure and is a situation in which the measurement of exposure is straightforward as long as the event can be accurately identified, as exposure status would not vary across the observation period. Measurement of exposure in panels E and F is more complex. In panel E, the exposure is a single event in time with an immediate effect that diminishes over time. An example of this could be a percutaneous coronary intervention (PCI) where the time scale on the x-axis is measured in years. There is an immediate effect from the exposure (intervention) of opening the coronary arteries that contributes to a reduced risk of acute myocardial infarction (AMI). However, the effectiveness of the PCI decreases over time, with the risk of AMI returning to what it was prior to the intervention. In this example, it is clearly important to identify and measure the intervals at which the risk is modified by PCI. After a sufficient amount of time has passed from the initial PCI, it may not be appropriate to consider the individual exposed. At the very least, the amount of time that has passed postexposure should be considered when creating the operational definition of exposure. Panel F represents a scenario where the effect from a single exposure is not immediate but happens relatively rapidly and then is sustained. Such a situation could be imagined in a comparative effectiveness study of a vaccination. The benefits of the vaccination may not be realized until there has been an appropriate immunological response from the individual, and the exposure definition should be created based on the expected timing of the response, consistent with clinical pharmacological studies of the vaccine.

The final column of Figure 4.1 represents scenarios in which there are multiple exposures over time with different exposure-risk/benefit relationships. In each of these examples, it is important to consider the cumulative amount of exposure when developing the exposure definition. In panel G, the depicted relationship shows a dose-response in which the risk or benefit increases at a slower rate after a threshold of exposure is reached. An example of this could be a behavioral intervention that includes personal counseling for lifestyle modifications to improve hypertension management. There may be a minimum number of sessions needed before the intervention has any effect and, after a threshold is reached, the incremental effectiveness of a single session (exposure) is diminished. In measuring exposure in this example, it would be important to determine the number of sessions that an individual participated in, especially if multiple exposure categories are being created. Panel H shows a linear increase in the risk/benefit associated with exposure. This example may be best illustrated by a comparative safety evaluation of the impact of oral corticosteroids on fracture risk. Continued exposure to oral corticosteroids may continue to increase the risk of fracture associated with their use. In this example, it would be necessary to characterize cumulative exposure when creating exposure definitions, as there will be a difference in the risk of those exposed to “a little” in comparison to those exposed to “a lot.” The final scenario is panel I, which shows a large change in risk/benefit upon initial exposure and then an increase in the risk/benefit at a slower rate with each subsequent exposure. For panel I, it would be most important to determine if the exposure occurred (as this is associated with the largest change in risk/benefit), and then quantify the amount of exposure.

Induction and Latent Periods

In creating exposure definitions, it is also important to consider the induction and latent periods associated with the exposure and outcome of interest.6 The induction period is the time from when the causal effects of the exposure have been completed to the start of the event or outcome. During the induction period, additional exposures will not influence the likelihood of an event or outcome because all of the exposure necessary to cause the event or outcome has been completed. For example, additional exposure to the vaccine for mumps during childhood will not increase or decrease the likelihood of getting mumps once the initial exposure to the vaccine has occurred.

The latent period is the time from when the outcome starts to when the outcome is identified. In other words, it is the period between when the disease or outcome begins and when the outcome is identified or diagnosed. Similar to the induction period, exposures during the latent period will not influence the outcome. Practically, it may be very difficult to distinguish between latent and induction periods, and it may be particularly difficult to identify the beginning of the latent period. However, both periods should be considered and ultimately not included in the measurement of exposure. In practical terms, it is sufficient to consider the induction and latent period as a single time period over which exposures will not have an effect on the outcome. A timeline depicting multiple exposures, the induction period, the latent period, and the outcome of interest is shown in Figure 4.2.

Figure 4.2. Timeline of exposure, induction period, latent period, and outcome. Adapted with permission from White E, Armstrong BK, Saracci R. Principles of exposure measurement in epidemiology. 2nd edition, New York: Oxford University Press Inc.; 2008.

As an example of the incorporation of both the induction and latent periods in exposure measurement, consider the evaluation of the comparative effectiveness of a cholesterol-lowering medication for the prevention of myocardial infarction. First, the induction period for the medication could be lengthy if the effectiveness is achieved through lowering cholesterol to prevent atherosclerosis. Second, there is likely a very small latent period from disease onset to identification/diagnosis. That is, the time from when the myocardial infarction starts to when it is identified will be relatively short. Any medication use that occurs during the induction and latent periods should not be included in the operational definition of exposure. For this example, it would be inappropriate to consider an individual exposed to the medication of interest if they had a single dose of the medication the day prior to the event, as this would not have contributed to any risk reduction for the event. Because of the short latent period, it would be unlikely that exposures occurred during that timeframe. Exposure should be measured during a time period when the use of lipid-lowering medications is expected to have an effect on the outcome. Therefore, the exposure definition should encompass a timeframe where the benefit of lipid-lowering medications is expected, and this should be justified based on what is known about the link between atherosclerosis and myocardial infarction and the known biological action of lipid lowering medications.
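The exclusion of exposures that fall within the induction and latent periods can be sketched in code. The following is a minimal illustration, not a validated algorithm: the function name, the 180-day combined lag, the 3-year window, and the dates are all hypothetical values chosen for demonstration and would need clinical justification in a real protocol.

```python
from datetime import date, timedelta

def exposures_in_window(exposure_dates, event_date, lag_days, window_days):
    """Return exposures inside the etiologically relevant window.

    The window ends `lag_days` before the event (the combined induction and
    latent period, treated as a single lag) and reaches back `window_days`
    from that point. Exposures inside the lag are ignored.
    """
    window_end = event_date - timedelta(days=lag_days)
    window_start = window_end - timedelta(days=window_days)
    return [d for d in exposure_dates if window_start <= d <= window_end]

# Hypothetical dispensing dates for one patient and a hypothetical event date.
dispensings = [date(2010, 1, 15), date(2011, 6, 1), date(2012, 12, 30)]
event = date(2012, 12, 31)

# Assumed values: 180-day lag, 3-year exposure window.
relevant = exposures_in_window(dispensings, event, lag_days=180, window_days=3 * 365)
exposed = len(relevant) > 0
```

In this sketch the dispensing on the day before the event is dropped, which mirrors the point that a single dose the day prior to a myocardial infarction should not count as exposure.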

Changes in Exposure Status

Another relevant consideration when developing exposure measurement relates to changes in exposure status, particularly if patients switch between active exposures when two or more are being investigated. While medication or exposure switching may be more relevant for design and/or analysis chapters in this guidance, it is also important to consider how it might relate to exposure measurement. One of the important factors associated with medication switching when creating exposure definitions is to determine if “spillover” effects might persist from the medication that was discontinued. If this is true, it would be necessary to extend the measurement of exposure beyond the point when the switch occurred. Similarly, depending upon the effects of the intervention that was started, it is important to consider its biological effects when developing the exposure definition following a switch. Importantly, these issues do not apply only to medications; “spillover” effects can also be observed with behavioral or other interventions where the effect extends beyond the last observed contact.
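One way to represent a spillover effect after a switch is to extend the exposure interval of the discontinued treatment by a carryover period. The sketch below assumes a simple two-treatment switch; the function names, the 30-day carryover, and the dates are hypothetical, and any delay before the newly started treatment takes effect is deliberately not modeled.

```python
from datetime import date, timedelta

def exposure_intervals_after_switch(start_first, switch_date, end_second, carryover_days):
    """Build half-open exposure intervals [start, end) for two treatments.

    The first treatment's interval is extended past the switch date by
    `carryover_days` to allow for a persisting (spillover) effect.
    """
    first = (start_first, switch_date + timedelta(days=carryover_days))
    second = (switch_date, end_second)
    return {"first": first, "second": second}

def exposures_on(day, intervals):
    """List the treatments a patient is considered exposed to on a given day."""
    return sorted(name for name, (start, end) in intervals.items() if start <= day < end)

# Hypothetical patient: switch on 1 April 2012, assumed 30-day carryover.
intervals = exposure_intervals_after_switch(
    start_first=date(2012, 1, 1),
    switch_date=date(2012, 4, 1),
    end_second=date(2012, 10, 1),
    carryover_days=30,
)
during_carryover = exposures_on(date(2012, 4, 15), intervals)
after_carryover = exposures_on(date(2012, 6, 1), intervals)
```

During the carryover period the patient is counted as exposed to both treatments; how such overlapping person-time is handled is a design and analysis decision.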

Data Sources

Exposure Measurement Using Existing Electronic Data

The ability to measure exposures based on available data is also an important consideration when creating an operational definition of exposure. Is there a consistent and accurate way to identify the exposure in the dataset? If the exposure of interest is a surgical procedure, for example, is there a single code that is used to identify that procedure or is it necessary to expand the identification beyond a single code? If using more than one code, do the codes only identify the procedure of interest or is there variability in the procedures identified? For medications, the data likely reflect prescriptions or medication orders (EHR) or pharmacy dispensings (PBM or health insurer administrative claims) but not actual use. Is it necessary to know whether a given medication was taken by the patient on a particular day or time of day?

To illustrate these issues, consider the case in which the primary intervention of interest is colonoscopy. Depending on the source of the data, colonoscopies may be identified with a CPT code (e.g., CPT 45355 Colonoscopy, rigid or flexible, transabdominal via colostomy, single or multiple), an HCPCS code (e.g., G0105 Colorectal cancer screening; colonoscopy on individual at high risk), or an ICD-9 procedure code (e.g., 45.23 Colonoscopy). To accurately identify this procedure, it is necessary to consider more than one type of procedure code when classifying exposure. All of these may reliably identify exposure to the procedure, but use of only one may be insufficient to identify the event. This may be influenced by the source of the data and the purpose of the data. For example, one set of codes from the list may be useful if using hospital billing data, while another may be useful for physician claims data. When making this decision, it is important for the investigators to balance the selection of the codes and the accurate identification of the exposure or intervention; creating a code list that is too broad will introduce exposure misclassification. Overall, it will be important to provide evidence on the most accurate and valid mechanism for the identification of the exposure or intervention across the datasets being used in the analysis. Researchers should therefore cite any previous validation studies or perhaps conduct a small validation study on the algorithm proposed for the exposure measurement to justify decisions regarding exposure identification. Issues in selection of a data source are covered in detail in chapter 8 (Data Sources).
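A code list spanning several coding systems might be organized as follows. This is a sketch only: it contains just the three example codes named above, the claim records and field names are hypothetical, and a real study would need a complete list for each coding system, ideally supported by a validation study.

```python
# Example codes taken from the text; NOT a complete colonoscopy code list.
COLONOSCOPY_CODES = {
    "CPT": {"45355"},
    "HCPCS": {"G0105"},
    "ICD9_PROC": {"45.23"},
}

def is_colonoscopy(code_system, code):
    """True if the code appears on the list for its coding system."""
    return code.strip().upper() in COLONOSCOPY_CODES.get(code_system, set())

# Hypothetical claim records; the field names are assumptions for illustration.
claims = [
    {"patient_id": "A", "system": "CPT", "code": "45355"},
    {"patient_id": "B", "system": "HCPCS", "code": "g0105"},
    {"patient_id": "C", "system": "ICD9_PROC", "code": "45.25"},  # not on the list
    {"patient_id": "D", "system": "ICD9_PROC", "code": "45.23"},
]

exposed_patients = sorted(
    {c["patient_id"] for c in claims if is_colonoscopy(c["system"], c["code"])}
)
```

Keeping the list keyed by coding system makes it easier to see which codes apply to which data source (e.g., hospital billing versus physician claims) and to report the list in the protocol.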

Exposure Measurement via Prospective Data Collection

In addition to using existing data sources, it may be feasible or necessary to prospectively collect exposure information, in some circumstances from patients or physicians, for use in an observational comparative effectiveness study. Abstraction of (paper) medical records is a type of prospective data collection that draws on existing medical records that have not been compiled in a research-ready format.

The validity and accuracy of self-reported exposure information may depend on the type of exposure information being collected (e.g., medication use versus history of a surgical procedure), or on whether the information is focused on past exposures or is prospectively collected contemporary exposure information. The characteristics of the exposure and the patient population are likely to influence the validity of the information that is collected. The recall of information on a surgical procedure may be much more accurate than the recall of the use of medications. For example, women may be able to accurately recall having had a hysterectomy or tubal sterilization,7 while their ability to recall prior use of NSAIDs may be quite inaccurate.8 In these examples, the accuracy of recall for hysterectomy was 96 percent, while only 57 percent of those who had a dispensing record for an NSAID reported use of an NSAID, a disparity that shows the potential for exposure misclassification when using self-reported recall for medication use. In the medication example, factors associated with better recall were more recent use of a medication and repeated use of a medication. As with other sources of data for exposure measurement, use of this type of data should be supported by evidence of its validity.
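Evidence of validity is commonly summarized as sensitivity, specificity, and positive predictive value from a two-by-two comparison against a reference source. The sketch below treats dispensing records as the reference, which is itself an assumption. The counts are hypothetical: they are chosen only so that the sensitivity matches the 57 percent figure cited above, and the other values do not come from the cited studies.

```python
def validity_measures(tp, fp, fn, tn):
    """Compute validity measures for a measured exposure against a reference.

    tp: reported use, reference shows use      fp: reported use, reference shows none
    fn: no reported use, reference shows use   tn: no reported use, reference shows none
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }

# Hypothetical counts (self-report versus dispensing record as the reference).
counts = {"tp": 57, "fn": 43, "fp": 5, "tn": 95}
measures = validity_measures(**counts)
```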

Creating an Exposure Definition

Time Window

A key component in defining exposure is the time period during which exposure is defined, often referred to as the time window of exposure. The exposure time window should reflect the period during which the exposure is having its effects relevant to the outcome of interest.6 In defining the exposure time window, it is necessary to consider the induction and latent periods. As noted in the statin example above, the exposure time window to evaluate the effectiveness of statins for preventing AMIs should be over the time period that statins can have their impact on cardiovascular events, which would be over the preceding several years rather than, for instance, over the 2 weeks immediately preceding an event.

There is no gold standard for defining the exposure time window, but the period selected should be justified based on the biologic and clinical pathways between the intervention/exposure and the outcome. At the same time, practical limitations of the study data should be acknowledged when defining the exposure time window. For example, lifetime exposure to a medication may be the ideal definition for an exposure in some circumstances but most existing datasets will not contain this information. It then becomes necessary to justify a more pragmatic approach to defining exposure given the length of followup on individuals available in the dataset. A variety of approaches to defining exposure time windows have been used in both cohort and case-control studies. As highlighted in the introductory section of this chapter, investigators have selected different exposure time windows even when examining the same clinical question. In most of these examples, the choice of the exposure time window is not clearly justified. Ideally, this choice should be related back to the conceptual framework and biological plausibility of the question being addressed. However, as noted above, there are pragmatic limitations to the ability to measure exposure, and in the case where selection of the exposure time window is arbitrary or limited by data, sensitivity analyses should be performed in order to evaluate the robustness of the results to the time window.
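A sensitivity analysis over time windows starts with classifying the same patients under several alternative windows. The sketch below uses the ever, last-365-days, and last-30-days windows mentioned in the introductory ICS example; the function name and the dates are hypothetical.

```python
from datetime import date

def exposed_within(dispensing_dates, index_date, lookback_days=None):
    """True if any dispensing falls on or before the index date and, when
    `lookback_days` is given, no more than that many days before it."""
    for d in dispensing_dates:
        if d > index_date:
            continue  # dispensings after the index date never count
        if lookback_days is None or (index_date - d).days <= lookback_days:
            return True
    return False

# Hypothetical dispensing history and index (event) date for one patient.
dispensings = [date(2011, 3, 1), date(2011, 11, 20)]
index_date = date(2012, 1, 31)

# Alternative exposure time windows; None means "ever exposed".
windows = {"ever": None, "last_365_days": 365, "last_30_days": 30}
status = {
    name: exposed_within(dispensings, index_date, days)
    for name, days in windows.items()
}
```

Here the same patient is exposed under two definitions and unexposed under the third; repeating the analysis under each definition shows how robust the results are to the choice of window.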

Unit of Analysis

When creating a definition for an exposure measurement, it is necessary to consider the unit of analysis for the study and the measurement precision possible within the constraints of the data. The nature of the intervention largely dictates the appropriate unit of analysis. If the intervention of interest does not vary with time, the unit of measurement can be defined at the patient level because exposure status can be accurately classified for the duration of the analysis. This may be the case for surgical procedures or other interventions that occur at a single point in time and that have a persistent effect (panel D in Figure 4.1). For other interventions or exposures, units of analysis may be more appropriately defined in terms of person-time, as the exposure status of individuals may vary over the course of the study period. This is a common approach for defining exposure in studies of medication treatment outcomes, as medication regimens often involve addition or discontinuation of medications, suboptimal adherence, dosage changes, or other factors that may cause changes in exposure to the intervention of interest.
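Person-time exposure is often derived by converting dispensing records into continuous exposed episodes. The following sketch shows one simple way to do this; the 14-day grace period, the function names, and the fill records are hypothetical, and early refills are not carried forward, which is a simplification.

```python
from datetime import date, timedelta

def exposed_episodes(dispensings, grace_days=0):
    """Merge (fill_date, days_supply) records into exposed episodes.

    A fill that occurs no later than `grace_days` after the previous supply
    ran out extends the current episode; otherwise a new episode starts.
    Episodes are half-open: the end date is the first unexposed day.
    """
    episodes = []
    for fill_date, days_supply in sorted(dispensings):
        supply_end = fill_date + timedelta(days=days_supply)
        if episodes and fill_date <= episodes[-1][1] + timedelta(days=grace_days):
            episodes[-1][1] = max(episodes[-1][1], supply_end)
        else:
            episodes.append([fill_date, supply_end])
    return [(start, end) for start, end in episodes]

def exposed_days(episodes):
    """Total exposed person-days, counting bridged gaps as exposed."""
    return sum((end - start).days for start, end in episodes)

# Hypothetical fills: two close together, then a long gap before the third.
fills = [(date(2012, 1, 1), 30), (date(2012, 2, 5), 30), (date(2012, 6, 1), 30)]
episodes = exposed_episodes(fills, grace_days=14)
total_exposed_days = exposed_days(episodes)
```

The choice of grace period changes how much person-time is classified as exposed, so it should be justified and, where arbitrary, varied in sensitivity analyses.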

Measurement Scale

The scale of the exposure measure should be operationalized in a manner that makes the most use of the information available. The more precisely an exposure is measured, the less measurement error. In many observational CER studies, the intervention of interest can be measured as a dichotomous variable (i.e., exposed or not exposed). For example, an individual either had or did not have a surgical procedure.

For other types of exposures/interventions in observational CER, it may be desirable to measure exposure as a continuous covariate, particularly when there is a dose-response relationship (e.g., panel H of Figure 4.1). However, the ability to operationalize exposure as a continuous variable may be limited by the availability of the exposure data and uncertainty surrounding its accuracy. Under cases of nondifferential misclassification in a continuous exposure variable, the degree of bias toward the null hypothesis is impacted by the precision of the exposure measurement, not by the bias in the exposure measure.9 Therefore, if the accuracy of the classification can be improved by using an alternative approach to scaling (e.g., measuring exposure as a categorical variable), it is possible to introduce less bias towards the null than is associated with the continuous measure. For example, if an individual was dispensed three separate prescriptions, each with a 30-day medication supply, she may not have taken the entire 90-day supply, but it is likely that she took more than a 60-day supply. In this case, an ordinal scaling of exposure measure for the number of doses of a medication may be preferable when it may not be possible to accurately identify the actual number of doses taken.
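The ordinal scaling described in the example above can be sketched as a simple mapping from total days' supply dispensed to a category. The cut points (30 and 60 days) and the function name are assumptions for illustration, not recommended thresholds.

```python
def supply_category(total_days_supply):
    """Map total days' supply dispensed to an ordinal exposure category.

    0: none, 1: up to 30 days, 2: 31 to 60 days, 3: more than 60 days.
    The cut points are hypothetical.
    """
    if total_days_supply <= 0:
        return 0
    if total_days_supply <= 30:
        return 1
    if total_days_supply <= 60:
        return 2
    return 3

# The patient from the example: three dispensings of 30 days' supply each.
fills_days_supply = [30, 30, 30]
category = supply_category(sum(fills_days_supply))
```

The patient falls into the "more than 60 days" category whether she took 70 or 90 days of medication, so the category is less sensitive to uncertainty about the exact number of doses taken.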

Dosage and Dose-Response

The concept of dose is an important consideration for the measurement of exposure in observational comparative effectiveness studies. Indeed, as shown in each of the event and exposure relationships depicted in the first column of Figure 4.1, the cumulative dose, or total amount of exposure over a specified time period, is often optimal for adequately defining exposure. To calculate cumulative dose, three elements of exposure are necessary: (1) the frequency of exposure, (2) the amount/dose of each exposure occurrence, and (3) the duration of exposure. Importantly, the concept of dose is applicable not only to medications but also to health services interventions that require multiple sessions, visits, or interactions. With respect to medications, it may be possible to obtain all the information necessary to calculate cumulative exposure to a specific prescribed medication from pharmacy claims data, where such data are typically collected for billing purposes. Information on the dose of each dispensed medication in the United States is available through the National Drug Code (NDC) for the product. Upon extracting information on the strength of each dose from the NDC code, dose strength can be combined with quantity dispensed and days' supply to determine the amount of each exposure event and the frequency of the exposure. When using data outside of the United States, the World Health Organization's Anatomical Therapeutic Chemical (ATC) Classification System may be used to measure exposure based on defined daily doses (DDDs), which are the assumed average maintenance doses per day for a drug used based on its main indication in adults (http://www.whocc.no/ddd/definition_and_general_considera/). Cumulative dose exposure definitions can be used to explore a dose-response relationship between the exposure and the event. Cumulative dose can also be used to determine if there is a threshold effect.
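Combining dose strength, quantity dispensed, and days' supply into a cumulative dose can be sketched as below. The record layout and the values are hypothetical; in practice the strength would be looked up from the NDC for each dispensed product, and that lookup is not shown.

```python
def cumulative_dose_mg(dispensings):
    """Sum of strength (mg per unit) times quantity dispensed across fills."""
    return sum(d["strength_mg"] * d["quantity"] for d in dispensings)

def average_daily_dose_mg(dispensings):
    """Cumulative dose divided by the total days' supply dispensed."""
    total_days = sum(d["days_supply"] for d in dispensings)
    return cumulative_dose_mg(dispensings) / total_days if total_days else 0.0

# Hypothetical dispensing records; field names are assumptions.
records = [
    {"strength_mg": 5.0, "quantity": 60, "days_supply": 30},
    {"strength_mg": 10.0, "quantity": 30, "days_supply": 30},
]

total_mg = cumulative_dose_mg(records)
daily_mg = average_daily_dose_mg(records)
```

The resulting cumulative dose can be kept continuous to explore a dose-response relationship or compared against a cut point to examine a threshold effect.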

While cumulative exposure may be an important concept in many comparative effectiveness studies of medications, it may not be as relevant in other studies. There may be medications where use is so intermittent that it is not possible or relevant to capture cumulative exposure. This is also the case with one-time interventions like surgical procedures, where the concept of dose has less meaning.

Modes of administration and different dosage forms can present complexities in operationalizing a definition of exposure when using administrative data. For example, a study using observational data to examine the effectiveness of hydrocortisone as a treatment for inflammatory bowel disease (IBD) would seek to identify only those prescriptions for hydrocortisone that were used for IBD treatment. This could be accomplished by focusing only on specific dosage forms that would be used in the treatment of IBD, to avoid misclassification of exposure to other forms of hydrocortisone. Therefore, the definition of exposure needs to be specific to the exposure of interest and avoid misclassification due to the availability of other dosage forms or routes of administration. Conversely, it may be necessary to create a wider definition that looks across multiple dosage forms if the question of interest is focused on a systemic effect of a medication that could be delivered in multiple forms.

Similarly, behavioral factors may modify the observed association. These include medication adherence, which may be considered in the definition of exposure. Several observational studies of medications have required a specific level of adherence before categorizing an individual as exposed. For example, a study may require that an individual use at least 75 percent of their prescribed medication on a regular basis before they are considered exposed. This is most frequently operationalized by calculating the medication possession ratio and determining whether it crosses a threshold before categorizing an individual as exposed; again, the approach should be linked to the hypothesized mechanism of effect. More detailed descriptions of approaches to analyzing medication compliance and persistence using retrospective databases are available.10 Currently, there is no gold standard that indicates how much of a given medication must be used before it has an effect, so the choice of a threshold should be supported by a rationale for the level selected. In addition, while a measure of adherence can be used as a measure of the amount of exposure or the dose, it is also important to consider differences between adherent and nonadherent patients. That is, patients who are adherent to their treatment regimens may be systematically different from those who are nonadherent, and these differences may affect the outcomes being measured, independent of the exposure measurement. These factors should be considered when deciding whether to incorporate adherence as part of the exposure measure.
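A threshold-based definition of this kind might be sketched as below. The medication possession ratio is computed here as total days' supply divided by the number of days in the observation period, which is one common formulation; capping the ratio at 1.0 and the function names are assumptions made for this example, and the 0.75 threshold simply mirrors the 75 percent figure used in the text.

```python
def medication_possession_ratio(days_supply_per_fill, period_days):
    """Days' supply dispensed divided by days in the observation period."""
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    ratio = sum(days_supply_per_fill) / period_days
    # Assumption: early refills are not allowed to push the ratio above 1.
    return min(ratio, 1.0)


def is_exposed(days_supply_per_fill, period_days, threshold=0.75):
    """Classify as exposed when the ratio meets or exceeds the threshold."""
    return medication_possession_ratio(days_supply_per_fill, period_days) >= threshold


# Three 30-day fills over a 120-day period give a ratio of 0.75 (exposed);
# two fills over the same period give 0.50 (not exposed).
adherent = is_exposed([30, 30, 30], period_days=120)
nonadherent = is_exposed([30, 30], period_days=120)
```

Because the threshold is arbitrary, repeating the analysis over a range of thresholds is one way to show how sensitive the exposure classification is to that choice.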

Precision of Exposure Measure

The source of the data being used for the analysis can limit the ability to precisely characterize exposure. For instance, EMR data may provide only information on medication orders or active drug lists, which would not allow for accurate classification of exposure on a daily basis. Attempting to do so would likely introduce high levels of exposure misclassification. The use of administrative claims data that provide information on medication dispensing may provide a more accurate estimate of the use of medications on a more routine basis. However, this data source will only reflect the dispensing of medications and not actual medication use. Multiple dispensings may provide greater assurance that the individual is being routinely exposed to the medication but cannot guarantee the patient has taken the medication. A more accurate measure of medication use would be information on medication assays. However, only a select number of medications have routine labs drawn to ascertain levels, and this does not present a practical solution in most observational CER projects. Thus, while dispensing data may provide a more accurate measurement on a more routine basis than other sources of data, assumptions about actual use are still inherent in the use of these data to determine exposure status. Investigators should understand the benefits and limitations associated with the data source being used, and should ensure that the exposure can be measured with sufficient precision to answer the research question of interest.

Exposure to Multiple Therapies

A complexity in observational CER is the lack of control over other medications used by individuals in the study, and the fact that exposure to other medications is unlikely to be randomly distributed among the exposed and unexposed groups. Therefore, when characterizing the primary exposure of interest, it is also important to consider the influence of other exposures on the outcome. Multiplicative or additive effects may be possible. For example, it may be important to consider the joint antihypertensive effects of various classes of antihypertensive medications in a comparative effectiveness study, as these medications will frequently be used in combination.

Issues of Bias

Measurement Error

In observational CER studies, both nondifferential and differential measurement error can introduce bias. Differential misclassification occurs when the error in the exposure measurement is dependent on the event of interest. This measurement error can result in biased estimates either away from or towards the null, making the observed association look stronger or weaker than the true underlying association. Differential measurement error can even lead to observed associations that are in the opposite direction of the true underlying association. Nondifferential measurement error occurs when errors in the measurement of exposure are proportionally the same in both the group that does and the group that does not experience the outcome of interest. For the most part, this type of measurement error will bias the results toward the null hypothesis, causing an underestimate of the true effect of the association.
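The pull toward the null under nondifferential error can be illustrated with expected counts in a simple cohort. In this sketch, which is an illustration and not a method taken from the chapter, a dichotomous exposure is measured with the same sensitivity and specificity in those who do and do not experience the outcome; the cohort sizes, risks, sensitivity, and specificity are all hypothetical.

```python
def observed_risk_ratio(n_exposed, n_unexposed, risk_exposed, risk_unexposed,
                        sensitivity, specificity):
    """Expected risk ratio after nondifferential exposure misclassification."""
    # True (expected) counts by exposure and outcome status.
    cases_e = n_exposed * risk_exposed
    cases_u = n_unexposed * risk_unexposed
    noncases_e = n_exposed - cases_e
    noncases_u = n_unexposed - cases_u

    def reclassify(true_exposed, true_unexposed):
        # The same sensitivity and specificity apply regardless of outcome,
        # which is what makes the error nondifferential.
        obs_exposed = sensitivity * true_exposed + (1 - specificity) * true_unexposed
        obs_unexposed = (1 - sensitivity) * true_exposed + specificity * true_unexposed
        return obs_exposed, obs_unexposed

    obs_cases_e, obs_cases_u = reclassify(cases_e, cases_u)
    obs_noncases_e, obs_noncases_u = reclassify(noncases_e, noncases_u)
    risk_e = obs_cases_e / (obs_cases_e + obs_noncases_e)
    risk_u = obs_cases_u / (obs_cases_u + obs_noncases_u)
    return risk_e / risk_u


# Perfect measurement recovers the true risk ratio of 2.0.
true_rr = observed_risk_ratio(1000, 1000, 0.20, 0.10, 1.0, 1.0)
# Sensitivity 0.8 and specificity 0.9 attenuate it to roughly 1.6.
biased_rr = observed_risk_ratio(1000, 1000, 0.20, 0.10, 0.8, 0.9)
```

Making the sensitivity or specificity depend on outcome status in `reclassify` would turn this into a differential error, in which case the observed ratio could move in either direction.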

The goal of any measurement of exposure is to minimize the amount of misclassification that occurs as part of the study design. For dichotomous measures, investigators should attempt to maximize the sensitivity and specificity of the measure to minimize the amount of misclassification. One source of misclassification in observational studies results from the failure to account for changes in exposure to medication during the observational period. Such a situation would support a person-time unit of analysis. In cohort studies, exposure status may be determined at a single point in time; this may not be reflective of use of the medication over the study period. There may be frequent changes to medication regimens during followup; simply classifying patients as exposed or not exposed at the onset of the study period can lead to a high degree of misclassification that is nondifferential.11 This may be true for exposures that occur intermittently and those that occur on a more frequent basis but are associated with high rates of nonadherence.
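A person-time approach can be sketched by classifying each day of followup from dispensing records rather than fixing exposure status at baseline. The example below rests on simplifying assumptions that a real study would need to revisit: use is assumed to start on the fill date, every day of supply counts as an exposed day, and no grace period or carryover of overlapping supply is applied. The function names and dates are hypothetical.

```python
from datetime import date, timedelta


def exposed_days(fills, start, end):
    """Set of days in [start, end) covered by dispensed supply.

    fills is a list of (fill_date, days_supply) tuples.
    """
    covered = set()
    for fill_date, days_supply in fills:
        for offset in range(days_supply):
            day = fill_date + timedelta(days=offset)
            if start <= day < end:
                covered.add(day)
    return covered


def person_time(fills, start, end):
    """Split followup into exposed and unexposed person-days."""
    total = (end - start).days
    exposed = len(exposed_days(fills, start, end))
    return {"exposed_days": exposed, "unexposed_days": total - exposed}


start = date(2012, 1, 1)
end = start + timedelta(days=180)
fills = [(date(2012, 1, 1), 30), (date(2012, 2, 15), 30)]

# A baseline-only classification would count all 180 days as exposed;
# the person-time classification counts 60 exposed and 120 unexposed days.
split = person_time(fills, start, end)
```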

The potential influence on misclassification of choices made in operationalizing the exposure definition should be considered by the investigators when designing the study. For example, what is the potential for misclassification of exposure with a given choice of the exposure time window? Will selecting a relatively short exposure time window produce a high degree of misclassification of exposure that would potentially lead to a biased effect estimate? Investigators should consider the practical limitations of the data and the influence that these limitations might have on the measurement error. There are many other potential sources of misclassification when measuring exposure, including: (1) measurement of exposure during induction or latent periods, (2) failure to incorporate the sustained effects of the medication or other intervention when creating an exposure definition, and (3) use of health care services not captured in the data source. To expand upon the latter issue, data from health systems like insurance companies often lack the ability to capture out-of-system health care utilization. Many administrative claims databases also do not capture in-hospital medication use. Such exposures will not be recorded in the data source and may lead to misclassification known as immeasurable time bias, which occurs when exposure during a period such as hospitalization cannot be measured, and is not accounted for in the analysis of study data.12

Over-the-counter (OTC) medications present a scenario in which misclassification is particularly problematic. Measurements based on administrative or EMR data will underestimate the use of OTC products and lead to misclassification of exposure to those medications. The inability to measure exposure during the observation period can also be problematic if the available data do not fully capture all sources of exposure. The use of OTC medication as an exposure is but one example of not being able to accurately capture all exposures, but this can occur in other circumstances. For example, hospital billing data will usually not include detailed information on the medications used during the inpatient stay, which can lead to misclassification of exposure during a hospitalization. So while the individual is using health care that is captured by the data source, there is insufficient detail to accurately capture exposure. Therefore, investigators should determine if there are periods of time in which the exposure status of individuals cannot be ascertained in the data being used in the analysis, and should evaluate the potential impact on exposure measurement.

A specific type of measurement bias for exposures that has received a lot of attention in recent literature is immortal time bias.13 This bias occurs when person-time is inappropriately assigned to an exposure category. A common example of immortal time bias occurs when exposure is defined based on the requirement of two dispensings of a medication. The time period between those two dispensings represents an immortal period, in which events among exposed individuals (e.g., death) would not be attributed to exposure because the individuals exposed to only one dispensing have not qualified as exposed according to the definition. Clearly, this introduces a bias into the observed association and is remedied by correctly classifying person-time from the beginning of the exposure period (i.e., the first dispensing in this example). For time-based, event-based, and exposure-based cohort definitions, the bias in the rate ratio that arises from the immortal time increases with duration of immortal time.13
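The two-dispensing example can be made concrete with a toy cohort. The sketch below is illustrative only: the `Patient` record, its fields, and the four patients are invented, and followup for every patient is counted in days from the first dispensing. It compares the event rate among the exposed when exposure requires a second dispensing with the rate when person-time and events are classified from the first dispensing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    second_fill_day: Optional[int]  # days after first dispensing; None if no second fill
    end_day: int                    # day of event or censoring, from first dispensing
    event: bool


def rate_two_fill_definition(patients):
    """Exposed = patients with a second dispensing, followed from the first one."""
    exposed = [p for p in patients if p.second_fill_day is not None]
    events = sum(p.event for p in exposed)
    person_days = sum(p.end_day for p in exposed)
    # Days between the two fills, during which no event could have occurred.
    immortal_days = sum(p.second_fill_day for p in exposed)
    return events / person_days, immortal_days


def rate_first_fill_definition(patients):
    """Exposed person-time and events counted from the first dispensing."""
    events = sum(p.event for p in patients)
    person_days = sum(p.end_day for p in patients)
    return events / person_days


cohort = [
    Patient(second_fill_day=30, end_day=200, event=True),
    Patient(second_fill_day=30, end_day=365, event=False),
    Patient(second_fill_day=None, end_day=20, event=True),   # event before a second fill
    Patient(second_fill_day=None, end_day=365, event=False),
]

biased_rate, immortal_days = rate_two_fill_definition(cohort)  # 1 event / 565 days
corrected_rate = rate_first_fill_definition(cohort)            # 2 events / 950 days
```

In this toy cohort the two-dispensing definition includes 60 event-free days in the exposed denominator and leaves out the early event, so the exposed rate appears lower than it does when person-time is classified from the first dispensing.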


Conclusion

In this chapter, we have introduced many issues to consider in creating definitions for exposure when conducting CER using observational data. The operationalization of exposure should be guided by the clinical pathways/conceptual framework that motivate a CER question, knowledge of the characteristics of the exposure/intervention and outcome of interest, awareness of the level of detail on exposure in a dataset and of options for characterizing exposure, and deliberation over approaches to limit the potential for bias and measurement error. Below, we have created recommendations in the form of a checklist that encompasses many of the key considerations raised in this chapter to guide the operationalization of exposure.

Checklist: Guidance and key considerations for exposure determination and characterization in CER protocols

| Guidance | Key Considerations | Check |
|---|---|---|
| Propose a definition of exposure that is consistent with the clinical/conceptual basis for the research question. | Consider the physiological effects of the exposure/intervention when creating an operational definition of exposure. | |
| Determine the most suitable scale for the measurement of exposure. | | |
| Provide a rationale for exposure time window choice. | For medications, consider factors such as dose, duration of treatment, pharmacodynamic/pharmacokinetic properties such as half-life, and known or hypothesized biological mechanisms associated with the medication of interest. | |
| Describe the proposed data source(s) and explain how they are adequate and appropriate for defining exposure. | | |
| Provide evidence of the validity of the operational definition of exposure with estimates of sensitivity, specificity, and positive predictive value, when possible. | If there are no validation studies to define the exposure of interest, utilize measures and definitions that have been most commonly reported in the literature to facilitate comparison of results. Alternative definitions could be developed and used in addition to a “commonly used” definition for exposure, particularly if there are reasons to suspect there may be more accurate definitions available. | |
| Support choice for unit of analysis for exposure measurement, e.g., person-months of exposure, and discuss the tradeoffs for alternative units of measurement. | | |
| Address issues of differential and nondifferential bias related to exposure measurement and propose strategies for reducing error and bias, where possible. | | |


References

1. Hubbard R, Tattersfield A, Smith C, et al. Use of inhaled corticosteroids and the risk of fracture. Chest. 2006;130:1082–8. [PubMed: 17035441]
2. Johannes CB, Schneider GA, Dube TJ, et al. The risk of nonvertebral fracture related to inhaled corticosteroid exposure among adults with chronic respiratory disease. Chest. 2005;127:89–97. [PubMed: 15653967]
3. Lee TA, Weiss KB. Fracture risk associated with inhaled corticosteroid use in chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2004;169:855–9. [PubMed: 14711795]
4. Miller DP, Watkins SE, Sampson T, et al. Long-term use of fluticasone propionate/salmeterol fixed-dose combination and incidence of nonvertebral fractures among patients with COPD in the UK General Practice Research Database. Phys Sportsmed. 2010;38:19–27. [PubMed: 21150138]
5. Vestergaard P, Rejnmark L, Mosekilde L. Fracture risk in patients with chronic lung diseases treated with bronchodilator drugs and inhaled and oral corticosteroids. Chest. 2007;132:1599–607. [PubMed: 17890464]
6. Rothman KJ. Induction and Latent Periods. Am J Epidemiol. 1981;114(2):253–9. [PubMed: 7304560]
7. Green A, Purdie D, Green L, et al. Validity of self-reported hysterectomy and tubal sterilisation. The Survey of Women's Health Study Group. Aust N Z J Public Health. 1997;21:337–40. [PubMed: 9270164]
8. West SL, Savitz DA, Koch G, et al. Recall accuracy for prescription medications: self-report compared with database information. Am J Epidemiol. 1995;142:1103–12. [PubMed: 7485055]
9. White E, Armstrong BK, Saracci R. Principles of exposure measurement in epidemiology. 2nd ed. New York: Oxford University Press Inc.; 2008.
10. Peterson AM, Nau DP, Cramer JA, et al. A checklist for medication compliance and persistence studies using retrospective databases. Value in Health. 2007;10(1):3–12. [PubMed: 17261111]
11. Ray WA, Thapa PB, Gideon P. Misclassification of current benzodiazepine exposure by use of a single baseline measurement and its effects upon studies of injuries. Pharmacoepidemiol Drug Saf. 2002;11:663–9. [PubMed: 12512242]
12. Suissa S. Immeasurable time bias in observational studies of drug effects on mortality. Am J Epidemiol. 2008;168(3):329–35. [PubMed: 18515793]
13. Suissa S. Immortal time bias in pharmacoepidemiology. Am J Epidemiol. 2008;167(4):492–9. [PubMed: 18056625]

Copyright © 2013, Agency for Healthcare Research and Quality.

Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide is copyrighted by the Agency for Healthcare Research and Quality (AHRQ). The product and its contents may be used and incorporated into other materials on the following three conditions: (1) the contents are not changed in any way (including covers and front matter), (2) no fee is charged by the reproducer of the product or its contents for its use, and (3) the user obtains permission from the copyright holders identified therein for materials noted as copyrighted by others. The product may not be sold for profit or incorporated into any profitmaking venture without the expressed written permission of AHRQ.

Bookshelf ID: NBK126191

