NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Chou R, Fu R, Carson S, et al. Empirical Evaluation of the Association Between Methodological Shortcomings and Estimates of Adverse Events. Rockville (MD): Agency for Healthcare Research and Quality (US); 2006 Oct. (Technical Reviews, No. 13.)

Empirical Evaluation of the Association Between Methodological Shortcomings and Estimates of Adverse Events.

We empirically evaluated the association between perceived methodological shortcomings and estimates of serious adverse events associated with clinical interventions. Our main results are based on a large set of studies of carotid endarterectomy (CEA) for symptomatic stenosis. Although we also analyzed studies of CEA for asymptomatic stenosis and randomized controlled trials of rofecoxib for arthritis, those results are mainly hypothesis-generating because of the low rate of serious adverse events and the small number of studies included in those data sets.

We found that certain pre-defined quality criteria predicted differences in pooled rates of stroke or death in randomized controlled trials, cohort studies, and uncontrolled surgical series of CEA for symptomatic stenosis, and remained predictive after controlling for other methodological and clinical variables through multiple regression or subgroup analyses. We are not aware of any other studies that have empirically tested a large number of quality criteria designed to measure shortcomings in the measurement or reporting of adverse events against actual estimates of harms.

We also found that it may be feasible to develop empirically validated quality rating instruments for assessing the validity of studies reporting harms. To our knowledge, this is the first quality rating instrument that has been developed using a data set that included both randomized trials and observational studies. We found that a summary quality rating instrument with four methodological criteria predicted adverse events associated with CEA for symptomatic stenosis as well as instruments with more criteria. The summary quality rating instrument predicted adverse events better than any individual criterion. We also found a dose-response relationship: the more criteria met, the higher the estimate of adverse events.

Each of the four criteria included in the final quality rating instrument may measure different aspects of adverse event assessment. Biased selection (criterion 1), for example, could systematically affect adverse event rates if patients who underwent the procedure but were excluded from analysis were more likely to have an adverse event. Patients lost to follow-up (criterion 3) would not be assessed for adverse events for the full duration of the study, and could also be at higher risk for adverse events, which could lead to attrition bias. Pre-specifying adverse events (criterion 4) suggests increased attention paid to adverse event assessment, and was highly associated with (but slightly more predictive than) two other criteria that may measure a similar characteristic: adequate description of ascertainment technique (criterion 5) and independent assessment (criterion 6). Adequate duration of follow-up (criterion 8) is important because studies that only evaluated patients until discharge from the hospital could miss complications that occurred within 30 days but after discharge.
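
As a rough illustration of how a four-criterion summary rating of this kind could be tallied, the sketch below counts how many of criteria 1, 3, 4, and 8 a study meets. The `StudyAssessment` class, its field names, and the example values are hypothetical. The operational definitions and thresholds actually used for each criterion are not reproduced here, so this should be read as a sketch under those assumptions rather than as the instrument itself.

```python
from dataclasses import dataclass


@dataclass
class StudyAssessment:
    """Hypothetical record; each flag is True when the study MEETS the criterion."""
    unbiased_selection: bool           # criterion 1
    low_loss_to_follow_up: bool        # criterion 3
    adverse_events_prespecified: bool  # criterion 4
    adequate_follow_up_duration: bool  # criterion 8


def summary_score(study: StudyAssessment) -> int:
    """Return the number of the four criteria met, from 0 to 4."""
    return sum([
        study.unbiased_selection,
        study.low_loss_to_follow_up,
        study.adverse_events_prespecified,
        study.adequate_follow_up_duration,
    ])


# Made-up example: a study that meets every criterion except criterion 3.
example_study = StudyAssessment(
    unbiased_selection=True,
    low_loss_to_follow_up=False,
    adverse_events_prespecified=True,
    adequate_follow_up_duration=True,
)
example_score = summary_score(example_study)
print(f"criteria met: {example_score} of 4")
```

A simple count treats the four criteria as equally weighted, which is an assumption of this sketch. Studies could then be grouped by score to examine whether pooled adverse event rates rise as more criteria are met.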

Some of the pre-defined criteria that were excluded from the final quality rating instrument may not be associated with predictable biases in rates of adverse events, but could still reflect important aspects of adverse event assessment. For example, although inadequate description of population (criterion 2) would make it difficult to assess external validity, it could be associated with patients at either higher or lower risk for complications. Similarly, inadequate description of ascertainment technique (criterion 5) could be associated with systematic over- or under-reporting of adverse events, depending on the ascertainment technique used.

Three other findings from analyses of this set of studies deserve mention. First, high Journal Impact Factor,27 author category,24 and proportion of text devoted to reporting adverse event results appeared to be proxies for quality of adverse event assessment. Reporting bias could confound the association between author category and lower complication rates, as surgeons may be more apt to report good results. Second, adverse events were more frequent in randomized controlled trials than in observational studies. This could be because the effects of patient and surgeon selection40 in randomized trials are offset by generally better adverse event assessment, or because observational studies are more likely than randomized controlled trials to go unpublished if study findings are unfavorable. Among observational studies, higher quality ratings predicted higher rates of adverse events, a factor that should be taken into account when comparing the results of randomized controlled trials and observational studies.41-44 Third, population-based studies were associated with higher rates of complications than non-population-based studies, even when quality criteria were also considered. This could be because population-based studies are more representative of the entire population and different surgeons than other observational studies. Alternatively, population-based studies could be more effective in obtaining complete outcomes data through large databases.45

We were unable to replicate the associations between quality ratings and estimates of harms in a smaller data set of studies of CEA for asymptomatic stenosis. One possible explanation for these results is that these analyses had less power to detect differences related to quality because of the lower rate of complications, less variance in rates of complications between studies, and substantially fewer studies to analyze. In addition, factors pertaining to external validity (such as patient selection or factors related to the delivery of the intervention) could be an important source of variation in this set of studies but more difficult to adequately control for because of the smaller data set. An important finding is that even for the same intervention (CEA), the same quality criteria may not consistently predict adverse events when applied to studies evaluating different populations or settings.

For studies of rofecoxib, our findings were generally similar to a recent meta-analysis by Juni et al—namely, that the presence of an independent, external endpoint committee blinded to treatment allocation was the strongest predictor for a higher risk of myocardial infarction.2 On the other hand, blinded outcomes assessment (not necessarily as part of an independent review committee) was not associated with higher estimates of risk. Appropriate allocation concealment, another factor commonly used to assess internal validity of clinical trials, did not predict reported risk of myocardial infarction in an earlier meta-analysis of rofecoxib trials.2 These findings support the hypothesis that unique considerations in adverse event reporting may require a distinct set of criteria separate from those used to assess internal validity.

Our findings have several limitations. First, the only outcomes assessed were major adverse drug events and post-surgical complications. Applicability of the results to assessment of minor side effects and complications is unknown. Second, as in other studies, our assessment of methodological shortcomings primarily relied on information available in published reports. However, even though poor reporting and poor quality are often associated, they are not synonymous.46 Inadequate reporting of adverse event methods can lead to misclassification, or assumptions that studies are methodologically deficient even when they were designed, conducted, and analyzed properly.17 This is illustrated by the fact that the use of unpublished information to determine the presence of an external endpoint committee in trials of rofecoxib resulted in better predictions of myocardial infarction risk than determinations based on published reports alone. Appropriate methods for detecting unsuspected adverse events (such as myocardial infarction in earlier trials of rofecoxib) could be particularly susceptible to poor reporting. Third, important aspects for rating the quality of adverse event assessment may be difficult to measure using quality rating criteria. The use of accurate and precise ascertainment techniques, for example, is likely to be an important factor,9 but difficult to define objectively. One possibility could be to distinguish between studies that used active techniques to identify adverse events versus those that used more passive methods. One recent study found that using a checklist to identify 53 possible adverse events resulted in identification of 20-fold more events compared to using open-ended questions.47 However, the validity of using different methods for assessing adverse events has not yet been assessed. Fourth, publication bias could have distorted our conclusions, if either high-quality studies with lower estimates of harms or low-quality studies with higher estimates of harms were less likely to be published.

Quality criteria could also have differential predictive ability depending on whether relative or absolute measures are used to quantify harms. Relative measures such as odds ratios or relative risks are particularly important when comparing harms from different interventions. Absolute rates of adverse events, on the other hand, are helpful for quantifying the balance of harms and benefits associated with a particular intervention.48 Methodological factors that affect estimates of one measure of harms may not affect the other. A factor that leads to systematic under-counting of adverse events and therefore affects the absolute rate, for example, might not significantly change the odds ratio if the bias affects both treatment groups similarly.
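
The last point can be illustrated with a small numerical sketch. The counts and the assumption that exactly half of all events are missed in each arm are made up for illustration and are not taken from any study in this review. The function name `harm_measures` is likewise hypothetical.

```python
def harm_measures(events_a, n_a, events_b, n_b):
    """Return absolute event rates, relative risk, and odds ratio for two arms."""
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    odds_a = events_a / (n_a - events_a)
    odds_b = events_b / (n_b - events_b)
    return {
        "rate_a": rate_a,
        "rate_b": rate_b,
        "relative_risk": rate_a / rate_b,
        "odds_ratio": odds_a / odds_b,
    }


# Hypothetical counts with complete ascertainment: 60/1000 vs 30/1000 events.
complete = harm_measures(60, 1000, 30, 1000)

# Same hypothetical study if ascertainment misses half the events in BOTH arms.
undercounted = harm_measures(30, 1000, 15, 1000)

for label, m in (("complete", complete), ("under-counted", undercounted)):
    print(
        f"{label}: rates {m['rate_a']:.3f} vs {m['rate_b']:.3f}, "
        f"RR {m['relative_risk']:.2f}, OR {m['odds_ratio']:.2f}"
    )
```

Under these assumptions the absolute rates are halved, the relative risk is unchanged, and the odds ratio moves only slightly (from about 2.06 to about 2.03). If the under-counting differed between arms, the relative measures would be distorted as well.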

Most importantly, analyses need to be performed to determine whether methodological shortcomings are associated with lower estimates of harms for other surgical or drug interventions and outcomes. Like studies evaluating the effects of methodological shortcomings on estimates of efficacy from clinical trials,49,50 we found that developing a generic summary instrument for rating quality of adverse event assessment is problematic because different aspects of quality were more important for one set of studies than for another. Specifically, major complications associated with CEA for symptomatic stenosis and rofecoxib for arthritis were predicted by different quality rating criteria, and no quality rating criteria predicted adverse events in studies of CEA for asymptomatic stenosis. It is therefore important for systematic reviewers to evaluate individual quality criteria when judging the quality of adverse event estimates, rather than relying on generic summary scales.

A key lesson from analyzing studies of harms is that, in addition to doing a better job of looking for adverse events and measuring them reliably, researchers also need to adequately evaluate and report factors that may influence complication rates.13 Readers should carefully assess potential sources of bias as well as other sources of variation (such as differences in populations and interventions51) when interpreting results of studies reporting harms. Future studies in this area are needed and should investigate data sets large enough to detect differences in adverse event rates, include studies utilizing both randomized and non-randomized designs, evaluate associations using absolute as well as relative event rates, and carefully examine the association between individual and summarized quality criteria and differential estimates of harms.
