Appendix D. Quality assessment methods for drug class reviews for the Drug Effectiveness Review Project


Study quality is objectively assessed using predetermined criteria for internal validity, based on a combination of criteria from the US Preventive Services Task Force and the National Health Service Centre for Reviews and Dissemination. This appendix lists the questions posed for each included study in order to assess study quality. These quality-assessment questions differ for systematic reviews, controlled trials, and nonrandomized studies.

Regardless of design, all included studies are assessed for quality and assigned a rating of “good,” “fair,” or “poor.” Studies with fatal flaws are rated poor quality. A fatal flaw is failure to meet a combination of criteria that may indicate the presence of bias; an example would be an inadequate procedure for randomization or allocation concealment combined with important differences in prognostic factors at baseline. Studies that meet all criteria are rated good quality, and the remainder are rated fair quality. Because the fair-quality category is broad, studies with this rating vary in their strengths and weaknesses: the results of some fair-quality studies are likely to be valid, while others are only probably valid. A poor-quality trial is not valid; its results are at least as likely to reflect flaws in the study design as a true difference between the compared drugs.
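The rating rule above can be expressed compactly. The sketch below is illustrative only: `criteria` is a hypothetical mapping from criterion name to whether it was met, and `fatal_flaw` stands for a bias-indicating combination such as inadequate allocation concealment plus baseline imbalance.

```python
def rate_study(criteria, fatal_flaw):
    """Apply the good/fair/poor rule described in the text.

    criteria: dict mapping criterion name -> True (met) / False (not met).
    fatal_flaw: True if a bias-indicating combination of failures is present.
    """
    if fatal_flaw:
        return "poor"          # fatal flaws always yield a poor rating
    if all(criteria.values()):
        return "good"          # all criteria met
    return "fair"              # everything else falls into the broad fair category

# Hypothetical examples (criterion names are invented for illustration):
print(rate_study({"randomization": True, "blinding": True}, fatal_flaw=False))   # good
print(rate_study({"randomization": True, "blinding": False}, fatal_flaw=False))  # fair
print(rate_study({"randomization": False, "blinding": False}, fatal_flaw=True))  # poor
```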

Systematic Reviews

  1. Does the review report a clear review question and inclusion/exclusion criteria that relate to the primary studies?
    A good-quality review should focus on a well-defined question or set of questions. Ideally, these questions are reflected in the inclusion/exclusion criteria, which guide the decision to include or exclude specific primary studies. The criteria should relate to the key components of study design: indications (patient populations), interventions (drugs), and outcomes of interest. In addition, details should be reported relating to the process of decision-making, such as how many reviewers were involved, whether the studies were examined independently, and how disagreements between reviewers were resolved.
  2. Is there evidence of a substantial effort to search for all relevant research?
    If details of electronic database searches and other identification strategies are given, the answer to this question is usually yes. Ideally, search terms, dates, and language restrictions should be presented. In addition, descriptions of hand searching, attempts to identify unpublished material, and any contact with authors, industry, and research institutes should be provided. The appropriateness of the database(s) searched by the authors should also be considered. For example, if only Medline was searched for a review of proton pump inhibitors, then it is unlikely that all relevant studies were located.
  3. Is the validity of included studies adequately assessed?
    A systematic assessment of the quality of primary studies should include an explanation of the criteria used (for example, how randomization was done, whether outcome assessment was blinded, whether analysis was on an intention-to-treat basis). Authors may use a published checklist or scale or one that they have designed specifically for their review. Again, the process relating to the assessment should be explained (how many reviewers were involved, whether the assessment was independent, and how discrepancies between reviewers were resolved).
  4. Is sufficient detail of the individual studies presented?
    The review should demonstrate that the studies included are suitable to answer the question posed and that a judgment on the appropriateness of the authors’ conclusions can be made. If a paper includes a table giving information on the design and results of the individual studies or includes a narrative description of the studies within the text, this criterion is usually fulfilled. If relevant, the tables or text should include information on study design, sample sizes, patient characteristics, interventions, settings, outcome measures, follow-up periods, drop-out rates (withdrawals), effectiveness results, and adverse events.
  5. Are the primary studies summarized appropriately?
    The authors should attempt to synthesize the results from individual studies. In all cases, there should be a narrative summary of results, which may or may not be accompanied by a quantitative summary (meta-analysis). For reviews that provide a meta-analysis, heterogeneity between studies should be assessed using statistical techniques. If heterogeneity is present, the possible reasons (including chance) should be investigated. In addition, the individual studies should be weighted in some way (for example, according to sample size or inverse of the variance) so that studies that are considered to provide the most reliable data have greater impact on the summary statistic.
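The inverse-variance weighting and heterogeneity assessment mentioned above can be sketched for a fixed-effect meta-analysis. All effect sizes and standard errors below are invented for illustration; in practice the effects would be, for example, log odds ratios from the individual trials.

```python
import math

# Hypothetical per-study effect estimates and their standard errors
# (illustrative numbers only, not from any real review).
effects = [0.20, 0.35, 0.10, 0.28]
std_errors = [0.10, 0.15, 0.08, 0.12]

# Inverse-variance weights: more precise studies carry more weight,
# so they have greater impact on the summary statistic.
weights = [1.0 / se ** 2 for se in std_errors]

# Pooled (fixed-effect) summary estimate and its standard error.
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
pooled_se = math.sqrt(1.0 / sum(weights))

# Cochran's Q statistic for heterogeneity, compared against a
# chi-squared distribution with (number of studies - 1) degrees of freedom.
q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))

print(round(pooled, 3), round(pooled_se, 3), round(q, 3))
```

A large Q relative to its degrees of freedom suggests heterogeneity, in which case possible explanations (including chance) should be explored before relying on the pooled estimate.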

Controlled Trials

Assessment of internal validity

  1. Was the assignment to treatment groups really random?
    • Adequate approaches to sequence generation:
      • Computer-generated random numbers
      • Random-numbers table
    • Inferior approaches to sequence generation:
      • Use of alternation, case record number, birth date, or day of week
    • Not reported
  2. Was the treatment allocation concealed?
    • Adequate approaches to concealment of randomization:
      • Centralized or pharmacy-controlled randomization
      • Serially numbered identical containers
      • On-site computer-based system with a randomization sequence that is not readable until allocation
    • Inferior approaches to concealment of randomization:
      • Use of alternation, case record number, birth date, or day of week
      • Open random-numbers list
      • Serially numbered envelopes (Even sealed opaque envelopes can be subject to manipulation.)
    • Not reported
  3. Were the groups similar at baseline in terms of prognostic factors?
  4. Were the eligibility criteria specified?
  5. Were outcome assessors blinded to the treatment allocation?
  6. Was the care provider blinded?
  7. Was the patient kept unaware of the treatment received?
  8. Did the article include an intention-to-treat analysis or provide the data needed to calculate it (number assigned to each group, number of subjects who finished in each group, and their results)?
  9. Did the study maintain comparable groups?
  10. Did the article report attrition, crossovers, adherence, and contamination?
  11. Is there important differential loss to follow-up or overall high loss to follow-up? (Give numbers for each group.)
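As an illustration of question 8 above, the sketch below shows how the data an article should report (number assigned, number who finished, and their results) allow an intention-to-treat rate to be calculated. All counts are invented, and the convention shown, counting subjects lost to follow-up as non-responders, is one common conservative choice, not the only one.

```python
# Hypothetical counts for one trial arm (illustrative only).
assigned = 100    # number randomized to the arm
finished = 85     # number who completed follow-up
responders = 51   # responders among the completers

# A per-protocol (completers-only) rate uses only those who finished;
# the intention-to-treat rate keeps every randomized subject in the
# denominator, treating dropouts as non-responders.
per_protocol_rate = responders / finished
itt_rate = responders / assigned

print(per_protocol_rate, itt_rate)  # 0.6 vs 0.51
```

The gap between the two rates widens as attrition grows, which is one reason questions 9-11 ask whether the groups stayed comparable and how many subjects were lost.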

Assessment of external validity (applicability)

  1. How similar is the population to the population to which the intervention would be applied?
  2. How many patients were recruited?
  3. What were the exclusion criteria for recruitment? (Give numbers excluded at each step.)
  4. What was the funding source and role of funder in the study?
  5. Did the control group receive the standard of care?
  6. What was the length of follow-up? (Give numbers at each stage of attrition.)

Nonrandomized Studies

Assessment of internal validity

  1. Was the selection of patients for inclusion unbiased? In other words, was any group of patients systematically excluded?
  2. Is there important differential loss to follow-up or overall high loss to follow-up? (Give numbers in each group.)
  3. Were the investigated events specified and defined?
  4. Was there a clear description of the techniques used to identify the events?
  5. Was there unbiased and accurate ascertainment of events (independent ascertainers and validation of ascertainment technique)?
  6. Were potential confounding variables and risk factors identified and examined using acceptable statistical techniques?
  7. Did the duration of follow-up correlate with reasonable timing for investigated events? (Does it meet the stated threshold?)

Assessment of external validity

  1. Was the description of the population adequate?
  2. How similar is the population to the population to which the intervention would be applied?
  3. How many patients were recruited?
  4. What were the exclusion criteria for recruitment? (Give numbers excluded at each step.)
  5. What was the funding source and role of funder in the study?


References

  1. Centre for Reviews and Dissemination. Undertaking systematic reviews of research on effectiveness: CRD’s guidance for those carrying out or commissioning reviews. CRD Report Number 4. 2nd ed. York, UK: University of York; 2001.
  2. Harris RP, Helfand M, Woolf SH, et al. Current methods of the US Preventive Services Task Force: a review of the process. Am J Prev Med. 2001;20(3 Suppl):21–35. [PubMed: 11306229]