As in the 2007 Evidence Report, Closing the Quality Gap: A Critical Analysis of Quality Improvement Strategies. Volume 6: Prevention of Healthcare-Associated Infections,3 the current report focuses on QI strategies for implementing preventive interventions for the following healthcare-associated infections (HAI): CLABSI, VAP, SSI, and CAUTI. The scope of the current report has been expanded from the previous report. Both hospital and nonhospital health care settings, such as ambulatory surgery centers, freestanding dialysis centers, and long-term care facilities are included. Also, information recommended for consideration in the recent RAND report for AHRQ on Assessing the Evidence for Context-Sensitive Effectiveness and Safety of Patient Safety Practices20 is included in the evaluation.

Analytic Framework

The analytic framework depicts the potential impact of the implementation of QI strategies on reducing the following HAI: CLABSI, VAP, SSI, and CAUTI (Figure 1). Key Question 1 shows the link between QI strategies and health outcomes, such as decreased infection rates, decreased complications and mortality, as well as unintended consequences. Key Question 1a shows the link between QI strategies and process outcomes; that is, adherence to preventive interventions. There are economic implications from both the process outcomes and the health outcomes, as depicted by Key Question 1b. Characteristics of the QI strategies, such as type of strategy, duration of the implementation, and setting, determine the effect of the QI strategies on the outcomes (Key Question 1c). The link labeled Key Question 2 marks the interaction between the implementation of QI strategies and contextual factors of the organization. For example, institutions with an existing patient safety infrastructure may have fewer barriers to implementing QI strategies than other institutions.

Figure 1 is a depiction of the analytic framework, i.e., a graph that depicts key questions one and two. KQ1 shows the link between QI strategies and health outcomes, such as decreased infection rates, decreased complications and mortality, as well as other unintended consequences. KQ1a shows the link between QI strategies and process outcomes, the adherence to preventive measures. There are economic implications from both the process outcomes and the health outcomes, as depicted by KQ1b. Characteristics of the QI strategies such as type of strategy, duration of the implementation, and setting, determine the effect of the QI strategies on the outcomes (KQ1c). Link KQ2 marks the interaction between the implementation of QI strategies and contextual factors of the organization.

Figure 1

Analytical framework for systematic review on quality improvement strategies to reduce healthcare-associated infections. Abbreviations: CAUTI = catheter-associated urinary tract infection; CLABSI = central line–associated bloodstream infection.

Literature Search Strategy

The same search strategy used in the prior report3 (Appendix A) was rerun on MEDLINE®, CINAHL®, and Embase®. Duplicate records were deleted. The search covered the time period from January 2006, when the search in the last report ended, to April 2011. The search was updated in January 2012 while the draft report was available for public comment, and relevant articles were added. Additional efforts were made to identify articles on interventions in nonhospital settings, which are likely to be reported less frequently. The members of the Technical Expert Panel (TEP) were queried, and they provided recommendations of experts on these additional settings. Articles authored by these recommended experts were retrieved. A search on relevant studies in nursing homes was conducted in July 2011. We also screened the bibliographies of included articles to identify additional references. Web sites of entities involved in efforts to reduce HAI, such as the Institute for Healthcare Improvement, were scanned to ensure that no relevant peer-reviewed publications were missed and to identify descriptions of implementation strategies for which outcomes have been published in the peer-reviewed literature.

Article Selection

Titles and abstracts from the literature search citations were placed in a Microsoft Access® database for the first round of screening. Three trained reviewers conducted the screening. Each title and abstract was screened and marked as either: (1) retrieve for full-text review, (2) do not retrieve for full-text review, or (3) uncertain. Studies were marked for retrieval for full-text review if the citation reported the outcomes of an intervention for (a) any one of the four specified HAI, or (b) a combination of HAI that included at least one of the four. The reasons for excluding an article were noted. Articles deemed uncertain for full-text review were screened by a second reviewer. If both reviewers were uncertain, the article was retrieved for full-text review. To ensure the quality of this first round of screening, an investigator not involved with the screening reviewed a random sample of 114 titles and abstracts that were marked “do not retrieve.” The investigator agreed with all the exclusions. The project lead reviewed another 101 abstracts marked “do not retrieve,” one-third from each of the three reviewers, and agreed with all of the exclusions.

The full-text articles were retrieved and a similar process was followed to select the final group of articles for inclusion and abstraction in the report. Articles were included if the study described an implementation strategy to increase adherence with one or more of the preventive interventions listed above, with the intent of reducing one or more of the four types of infections covered in this report. A listing of studies excluded at the full-text level and reasons for exclusion can be found in Appendix B. Evidence tables of abstracted data can be found in Appendix C.

Inclusion and Exclusion Criteria

The same selection criteria were used for this report as for the 2007 report,3 with the addition of a criterion related to the setting. Specifically, included studies were required to:

  • Report the effect of a QI strategy on the incidence of HAI (CLABSI, VAP, SSI, or CAUTI), or report the effect of a QI strategy on adherence to evidence-based prevention interventions.
    • The specific prevention interventions used to reduce infections were selected from recommendations with a grade of 1A or 1B in the HICPAC guidelines (see www), analogous to the approach used in the 2007 report, or with a grade of A-I or A-II in the SHEA/IDSA Compendium of Strategies to Prevent Healthcare-Associated Infections in Acute Care Hospitals.16-19 The list of preventive interventions was reviewed and amended by the TEP. The compiled list of infection-specific preventive interventions can be found in Table 2 as well as Appendix D.
    • If the study did not describe a QI strategy and focused on the effect of prevention interventions only, such as comparing antibiotic choice to prevent SSI or comparing antiseptic cleansers for skin preparation prior to surgery, the study was excluded.
  • Use either an experimental design with a control group or a quasi-experimental design.
    • Quasi-experimental studies must have a clearly defined baseline and post-intervention time period.
    • Interrupted time series designs, by definition, must report more than one time point of data before and after the intervention.
    • Studies that reported only postintervention data were excluded.
  • Report on one of the following settings: hospitals, outpatient surgical centers, freestanding dialysis centers, and long-term care facilities.
  • To be included, studies that report related outcomes, such as costs, health services utilization, patient or provider satisfaction with care, or unanticipated consequences of an intervention, must also report infection rates or adherence with preventive interventions.
  • Conduct a statistical analysis comparing baseline and postintervention infection rates or adherence rates.
    • If a study reported baseline and postintervention infection rates or adherence rates, but did not perform a statistical analysis to compare the rates, the study was excluded.
  • Have a combined baseline and postintervention patient sample size ≥100.
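
The criteria above can be read as a screening checklist. The following is a minimal sketch of that reading, not the reviewers' actual tooling: the `Study` record, its field names, and the design labels are hypothetical, and the sketch assumes that the baseline and postintervention requirement applies to the quasi-experimental designs, as the list states.

```python
from dataclasses import dataclass

# Settings named in the inclusion criteria.
ELIGIBLE_SETTINGS = {
    "hospital",
    "outpatient surgical center",
    "freestanding dialysis center",
    "long-term care facility",
}

# Hypothetical design labels used only in this sketch.
QUASI_EXPERIMENTAL = {"before-after", "time series"}
KNOWN_DESIGNS = QUASI_EXPERIMENTAL | {"controlled"}


@dataclass
class Study:
    """Hypothetical screening record; field names are illustrative only."""
    describes_qi_strategy: bool
    reports_infection_or_adherence: bool
    design: str                      # "controlled", "before-after", "time series"
    points_before: int               # data points reported before the intervention
    points_after: int                # data points reported after the intervention
    setting: str
    compares_rates_statistically: bool
    sample_size: int                 # combined baseline and postintervention patients


def exclusion_reasons(study):
    """Return the list of criteria a study fails (empty list = include)."""
    reasons = []
    if not study.describes_qi_strategy:
        reasons.append("no QI strategy described")
    if not study.reports_infection_or_adherence:
        reasons.append("no infection or adherence outcome")
    if study.design not in KNOWN_DESIGNS:
        reasons.append("ineligible study design")
    if study.design in QUASI_EXPERIMENTAL and (
        study.points_before < 1 or study.points_after < 1
    ):
        reasons.append("no defined baseline and postintervention period")
    if study.design == "time series" and (
        study.points_before < 2 or study.points_after < 2
    ):
        reasons.append("time series needs more than one point before and after")
    if study.setting not in ELIGIBLE_SETTINGS:
        reasons.append("setting not covered")
    if not study.compares_rates_statistically:
        reasons.append("no statistical comparison of rates")
    if study.sample_size < 100:
        reasons.append("combined sample size under 100")
    return reasons


# Two made-up studies: one meets every criterion, one fails two of them.
eligible = Study(True, True, "before-after", 1, 1, "hospital", True, 450)
too_small = Study(True, True, "before-after", 1, 1, "hospital", False, 80)
```

Returning the reasons rather than a single yes/no mirrors the practice, described under Article Selection, of noting why each article was excluded.
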
Table 2. Included preventive interventions for healthcare-associated infections.


Inclusion of Articles From the 2007 Report

Articles included in the 2007 report were screened by a single reviewer using the inclusion criteria of the current report. Articles with only two-group tests, whether controlled or uncontrolled, were excluded. Those selected for inclusion were reviewed by a second reviewer. Both the first and second reviewers assessed study quality; any discrepancies were resolved through consensus or use of a third reviewer. Selected elements relating to study design, implementation, and results were abstracted from each article.

Data Abstraction and Data Management

Many of the data elements to be abstracted were qualitative, so an extensive training process was conducted to increase consistency among abstractors. A list of the data abstraction elements can be found in Appendix E. Five sample articles from the included articles list were independently abstracted by each abstractor. A meeting was held to discuss any differences and to agree on common strategies. A second meeting was held several weeks after abstraction began to agree on what to include in fields where there was ambiguity and to add or delete fields as needed. Abstractors then corrected the previously completed abstractions. When new abstractors were added, they abstracted the same five articles and were informed about the common strategies.

Following the training process, reviewers abstracted articles selected for inclusion in the review; a second reviewer conducted a fact check on the abstracted items, using a clean copy of the article. Discrepancies were discussed by the abstractor and the fact checker; any unresolved issues were decided through consultation with a third reviewer. Quality appraisals for each article were conducted independently by two reviewers; discrepancies were resolved by discussion, or by the inclusion of a third reviewer, when necessary.

The authors of the RAND report suggested elements that should be considered. These elements were adapted for this review and can be found in the following data elements list.20

The following data elements were abstracted from the included articles:

  • Study description
    • Study design
    • Health care setting and clinical setting
    • Population size
    • Population demographic and clinical characteristics
    • Statistical analyses performed
  • Context, adapted from RAND report20
    • Theory or logic model behind the patient safety practice
    • Structural organizational characteristics (such as size, location, financial status, existing quality and safety infrastructure)
    • External factors (such as regulatory requirements or incentive systems)
    • Patient safety culture, teamwork, and leadership at the level of the unit
    • Availability of implementation and management tools (such as staff education and training, use of internal audit and feedback, presence of internal or external individuals responsible for implementation)
    • Description of interveners, intervenees, and their roles in the implementation process
  • QI Strategy
    • Type of QI strategy
      • Clinician education
      • Patient education
      • Audit and feedback
      • Clinician reminder systems
      • Organizational change
      • Financial or regulatory incentives for patients or clinicians
      • A combination of the above
    • Preventive intervention
      • See Table 2 in Inclusion and Exclusion Criteria section
    • Length of intervention, length of followup
    • Target of QI strategy (all clinical staff, physicians, nurses, respiratory therapists, other ancillary staff, patients, other)
    • Method of allocation into intervention and control groups
  • Outcome measures
    • Baseline and postintervention infection rates
    • Baseline and postintervention adherence to preventive interventions
    • Infection related complications, mortality
    • Costs, cost-effectiveness, return on investment
    • Unanticipated complications

However, for the update search, only data abstraction fields that were involved in the synthesis of the report were abstracted.a This was done for efficiency, because the update search added more material than expected.

Individual Study Quality Assessment

Challenges in Evaluating Quality Improvement Efforts

Evaluating the impact of QI efforts is challenging. Most clinical QI interventions occur at the group level (e.g., hospital, intensive care unit). Therefore, an individual level randomized controlled trial, the generally preferred research design for other clinical trials, is not recommended.33 For example, if the intervention aims to increase adherence to recommended strategies to reduce HAI, the clinical staff who adopt the recommended practices may apply them to most patients, not simply to those randomized to the intervention. Cluster randomized trials, which randomize the site or group rather than the individual, are the strongest design for evaluations of QI efforts,34 if they are designed and implemented well.

Most studies of QI strategies are effectiveness studies, rather than efficacy studies. The interventions are implemented in a “real world” clinical setting, rather than the highly controlled designs typical of efficacy studies. The setting for the QI study may have already implemented other QI strategies. The specific interventions often vary from study to study, and the way in which they are interpreted may differ by setting and, in some cases, by health care provider. Although the definitions of the outcomes—for example, infections—are largely standardized, the actual measurement may vary from one setting to another. Adherence to preventive interventions may provide supportive evidence, but may be measured differently or focus on distinct preventive interventions. These differences do not negate the value of evaluating the impact of QI interventions. Rather, they highlight the need to interpret the results with careful consideration of all these issues. Furthermore, a group of studies with similar results provides stronger evidence than a single study.

There are a number of factors that may confound the results of quasi-experimental studies, examples of which are listed below. More extensive discussions can be found in a series of articles addressing efforts to reduce HAI35,36 or in the revision37 of the classic text on quasi-experimental design by Cook and Campbell.38

  • Unlike most clinical trials, QI studies often do not follow the same patients over time. The patients included in the baseline group may be different from those in the postintervention group a year or two later. Therefore, any differences between the groups of patients that may increase or decrease the risk of infection should be taken into account. In a simple before-after study, this can be done using regression analysis. In a cluster randomized controlled trial, the expectation is that randomization will allocate these factors evenly between groups. But with smaller sample sizes, this may not always occur.
  • Infection (or adherence) rates may have been changing before the intervention was undertaken. For example, given the increased attention to HAI after the publication of seminal reports by the Institute of Medicine, such as To Err Is Human39 and Crossing the Quality Chasm,40 infection rates may have been falling over time in the institution(s) where the study is conducted. A simple before-after study, even if differences in patient characteristics are accounted for using regression, may mistake the underlying trend in infection rates that preceded the intervention for an effect of the intervention (Figure 2).
  • For example, if the baseline point A is simply compared with postintervention point B, it appears that the intervention has been effective in reducing the infection rate. However, if several data points are measured before the intervention and several after, the secular trend before the intervention can be determined. In the example in Figure 2, if the before and after data points are on line ab, then the intervention does not appear to have had an impact on infection rates. Rather, infection rates were declining before the intervention and continued to fall at the same rate (i.e., slope of the line) afterward. On the other hand, if the data points before the intervention are on line cd, then the infection rate was not declining before the intervention. After the intervention, the whole line falls down to line ef, so infection rates did decline after the intervention, with a onetime drop equivalent to d – e. Many other combinations are possible. The point is that without having multiple data points to discern the trend and position of the line before or after the intervention, one cannot tell whether the rate of infection declined solely as a result of the intervention. A simple before-after study design compares point A with point B and cannot control for any secular trend that may confound the interpretation of the decline. Interrupted time series, with at least three data points before and three data points after the intervention, permit differentiation between these two scenarios: line ab versus lines cd and ef.
  • Another potential factor is regression to the mean. For example, if an intervention is undertaken because infection rates have spiked, a decrease in infections after the intervention may be due to regression to the mean. The baseline outcomes may represent an unusually high infection rate that would have declined to a more typical level even without the intervention.
  • The assumption of independence of observations underlying many statistical approaches is violated in most of these study designs. First, outcomes for patients within a given site are unlikely to be independent, because of the common context within a site and the fact that the patients may be cared for by the same providers, among other possible factors. This issue may be addressed by using a site level research design, such as the cluster randomized, controlled trial, or using statistical techniques that account for the interdependence of observations from the same site. Second, when rates from the same site are measured over time, as in interrupted time series, the data points for each site are also related and may be more similar the shorter the time that has elapsed between measurements. This phenomenon is called autocorrelation and may be tested for (e.g., using the Durbin-Watson test) and appropriately addressed once detected or may simply be taken into account in the original choice of statistical approach (e.g., autoregressive integrated moving average [ARIMA] model).
  • Another possibility is that some external factor caused infection rates (or adherence rates) to change around the same time as the intervention was implemented. To detect this situation, the changes before and after the intervention need to be compared with changes over the same time period in a comparable setting that did not have the intervention. In other words, adding a contemporaneous control group can be helpful in identifying this situation.
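
The autocorrelation check mentioned in the list above can be illustrated with the Durbin-Watson statistic, which is the sum of squared differences between successive residuals divided by the sum of squared residuals. This is a minimal sketch with made-up residuals, not an analysis from any included study; the function name is our own.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for a sequence of regression residuals.

    Values near 2 suggest little first-order autocorrelation, values
    toward 0 suggest positive autocorrelation, and values toward 4
    suggest negative autocorrelation.
    """
    numerator = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals from a trend line fitted to monthly infection rates.
# Neighboring residuals are similar: positive autocorrelation.
smooth = [1.0, 0.8, 0.5, 0.1, -0.3, -0.6, -0.8, -0.7]
# Residuals flip sign every period: negative autocorrelation.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

dw_smooth = durbin_watson(smooth)
dw_alternating = durbin_watson(alternating)
```

In this sketch the smooth series gives a statistic well below 2 and the alternating series one well above 2. A formal test would compare the statistic against tabulated critical values, which are not reproduced here.
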
Figure 2 presents an illustration of potential confounding by temporal trend. The figure shows an x axis for time versus a y axis for infection rate. The intervention is represented by a dotted vertical line about halfway along the x axis. There is a diagonal line falling from left to right named line ab. There are two horizontal lines, line cd, which goes from the y axis to the intervention line, and line ef, which is parallel to line cd but lower on the y axis. Line ef begins at the intervention line and continues to the right hand side of the graph. Point A is at the intersection of line ab and line cd. Point B is at the intersection of line ab and line ef. In other words, segment AB is on line ab and could represent continuation of the previous trend. Alternatively, point A is on line cd and point B is on line ef, so the movement from point A to point B could represent a onetime drop from line cd to line ef (lines cd and ef are parallel, and the slope is the same).

Figure 2

Illustration of potential confounding by secular trend.
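
The two scenarios in Figure 2 can be sketched numerically. The example below is an illustration under stated assumptions, not the method of any included study: all rates are made up, the function names are our own, and each period is fitted with its own straight line, a simple form of segmented regression that ignores autocorrelation.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def before_after_difference(pre, post):
    """Naive comparison: mean postintervention rate minus mean baseline rate."""
    return sum(post) / len(post) - sum(pre) / len(pre)


def level_change(pre, post):
    """Change in level at the intervention, allowing for the baseline trend.

    Time points are numbered 0, 1, 2, ... across both periods. The baseline
    line is extrapolated to the first postintervention time point and
    compared with the postintervention line at that same point.
    """
    t_pre = list(range(len(pre)))
    t_post = list(range(len(pre), len(pre) + len(post)))
    a_pre, b_pre = fit_line(t_pre, pre)
    a_post, b_post = fit_line(t_post, post)
    t0 = t_post[0]
    return (a_post + b_post * t0) - (a_pre + b_pre * t0)


# Scenario "line ab": rates fall steadily, with no change at the intervention.
trend_pre, trend_post = [10.0, 9.0, 8.0, 7.0], [6.0, 5.0, 4.0, 3.0]
# Scenario "lines cd and ef": flat baseline, then a one-time drop.
step_pre, step_post = [10.0, 10.0, 10.0, 10.0], [6.0, 6.0, 6.0, 6.0]

naive_trend = before_after_difference(trend_pre, trend_post)
naive_step = before_after_difference(step_pre, step_post)
its_trend = level_change(trend_pre, trend_post)
its_step = level_change(step_pre, step_post)
```

With these made-up numbers the naive before-after difference is identical in the two scenarios, whereas the level change is zero for the steady decline and equals the full drop for the step. That is the distinction between line ab and lines cd and ef described in the text.
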

The strongest evidence of causality possible with these types of studies is when both adherence and infection rates are reported. One may then observe a potentially causal link between implementing an intervention using specific QI strategies, an increase in adherence rates to the preventive interventions, and a decline in infection rates. When only adherence is measured, one can infer that the infection rate should decline if the adherence rate rises. This is especially true when there is strong evidence linking the use of the preventive intervention and infection rates, but one cannot rule out the potential effect of intervening factors. Similarly, if infection rates decline after an intervention one might assume that the intervention was effective, but there are other possible factors that cannot be ruled out.

One potentially complicating factor is that measures of adherence may be far more common than infections. What if the adherence rate shows statistically significant improvement, while the change in infection rates is nonsignificant? This result could be due to the weakness of the link between the preventive intervention underlying the adherence rate and the infection rate; to insufficient power to detect a statistically significant change in infections (infections occurring relatively rarely); or to other confounding factors (e.g., a rise in infection rates due to increased prevalence of infectious agents or to other changes in the system of care).
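
The power argument can be made concrete with a rough calculation. The sketch below uses a normal approximation for a two-sided comparison of two proportions at a significance level of 0.05; the adherence and infection rates and the sample size are hypothetical, the function names are our own, and clustering of patients within sites is ignored.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def approx_power(p_before, p_after, n_per_period, z_crit=1.96):
    """Approximate power to detect a change between two proportions.

    Normal approximation with equal sample sizes in each period; the
    negligible contribution of the opposite tail is ignored.
    """
    se = sqrt(
        p_before * (1 - p_before) / n_per_period
        + p_after * (1 - p_after) / n_per_period
    )
    return normal_cdf(abs(p_after - p_before) / se - z_crit)


# Hypothetical study with 200 patients in each period.
# Adherence is common, so a rise from 50% to 70% is easy to detect.
power_adherence = approx_power(0.50, 0.70, 200)
# Infections are rare, so a fall from 5% to 3% is hard to detect.
power_infection = approx_power(0.05, 0.03, 200)
```

Under these assumptions the power for the adherence comparison is well above 0.9, while the power for the infection comparison is below 0.3, even though the relative reduction in infections (40 percent) matches the relative improvement in adherence.
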

Evaluation of Study Designs

The evidence on the effectiveness of different QI strategies to encourage the use of preventive interventions, which in turn may reduce the rates of HAI, is contained in a set of studies that is heterogeneous in terms of research design, statistical methods, interventions, settings, and outcomes. The approaches used in these studies differ substantially from those used in more traditional clinical trials.

These study design categories form the basis for quality evaluation of individual articles. Several other characteristics are also taken into account, as noted below. Table 3 summarizes some of the key characteristics of these study types. The table is based on discussions by Shadish, Cook, and Campbell,37 Wagner and colleagues,41 Harris and colleagues,35 and Shardell and colleagues,36 which provide additional details on these issues.

Table 3. Characteristics of different study designs.


Evaluation of Study Quality

To assess the quality of the studies included in our review, we initially planned to use the quality assessment criteria developed by the authors of the 2007 AHRQ Evidence Report on HAI.3 This original plan was altered after an examination of the studies highlighted the heterogeneity of the research designs, statistical methods, and outcomes. In addition to the study design, which was emphasized in the 2007 report, the statistical approaches used to analyze the data are a key determinant of the validity of these studies. Therefore, both study design and adequacy of statistical analysis are now included as quality criteria. Two items from the 2007 report are included as well: whether both adherence rates and infection rates were reported, and whether the intervention was independent of other QI efforts. The following item from the RTI Item Bank42 for assessing risk of bias and precision for observational studies was also included: Is the length of followup sufficient to support evaluation of primary outcomes and harms? One-year followup was considered necessary to demonstrate durability of results. Some of the validity criteria used in the last report, for example, whether CLABSI, VAP, and CAUTI rates were adjusted for device days, were almost universally present and provided no discriminatory power. Therefore, these criteria were not used to assess quality, but their widespread use is noted. Completeness of reporting, as described in the SQUIRE guidelines,23 for example, was not assessed independently. To summarize, the criteria to evaluate study quality are as follows:

  1. Study design
  2. Whether baseline and postintervention adherence rates were reported and analyzed statistically
  3. Whether baseline and postintervention infection rates were reported and analyzed statistically
  4. Whether the statistical analysis was adequate
    1. Were potential confounders (e.g., baseline patient characteristics) assessed?
    2. If potential confounders existed, were they controlled for in the analysis?
    3. For interrupted time series designs, was an interrupted time series analysis used?
  5. Whether the intervention was independent of other QI efforts implemented at the same time
  6. Whether the followup period was 1 year or longer

Study design was used for the initial study quality classification: all controlled trials were assigned higher quality, interrupted time series analyses were assigned medium quality, and all simple before-after studies were assigned lower quality. Then, for each study, criteria 2 through 6 listed above were each rated plus, minus, or uncertain. Any study with two or more minuses was moved to the next lower quality ranking.
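
The ranking rule just described can be sketched as a small function. This is one reading of the rule, not the reviewers' own procedure: the names are hypothetical, "uncertain" ratings are assumed not to count as minuses, and a study that already starts at the lowest ranking is assumed to stay there because no lower ranking is defined.

```python
# Quality rankings from best to worst, as named in the text.
RANKS = ["higher", "medium", "lower"]

# Initial ranking by study design (hypothetical design labels).
INITIAL_RANK = {
    "controlled trial": "higher",
    "interrupted time series": "medium",
    "simple before-after": "lower",
}


def quality_rank(design, ratings):
    """Return the quality ranking of one study.

    design  -- one of the keys of INITIAL_RANK
    ratings -- ratings for criteria 2 through 6, each "+", "-", or "?"
               ("?" stands for uncertain)
    """
    rank = INITIAL_RANK[design]
    if ratings.count("-") >= 2:
        # Move down one ranking; assumed to stop at the lowest ranking.
        position = min(RANKS.index(rank) + 1, len(RANKS) - 1)
        rank = RANKS[position]
    return rank


# A controlled trial with two minuses drops from higher to medium.
example_downgraded = quality_rank("controlled trial", ["+", "-", "-", "+", "?"])
# A time series with one minus and one uncertain keeps its ranking.
example_unchanged = quality_rank("interrupted time series", ["+", "-", "?", "+", "+"])
```
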

The terms “higher” and “lower” are used to indicate the relative ranking of quality in this report. All of these studies were conducted in “real world” situations where the many controls against bias available in clinical randomized, controlled trials, for example, are not feasible. Such trials are often precluded for ethical reasons. Furthermore, the focus on the group as the unit of analysis weakens the study design because the sample size is usually much smaller, taking into account the number of groups and the intraclass correlation coefficient. All of the quality assessments and conclusions about evidence were made with this limitation in mind.
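
The effect of the intraclass correlation coefficient on sample size can be illustrated with the standard design-effect formula for equal-sized clusters, 1 + (m - 1) × ICC, where m is the number of patients per cluster. This formula is a textbook result rather than something stated in this report, and the numbers below are hypothetical.

```python
def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc


def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    total = n_clusters * cluster_size
    return total / design_effect(cluster_size, icc)


# Hypothetical study: 10 intensive care units with 100 patients each
# and an assumed intraclass correlation coefficient of 0.05.
ess = effective_sample_size(n_clusters=10, cluster_size=100, icc=0.05)
```

In this made-up case the 1,000 patients carry roughly the information of 168 independent patients. With an ICC of 0 the full sample counts, and with an ICC of 1 only the number of units counts.
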

Data Synthesis and Grading the Body of Evidence

As in the previous review,3 the articles in this review differed greatly in QI targets, QI strategies, methods of measuring adherence to preventive interventions, preventive interventions, contexts, and study design. Quantitative analyses were therefore not feasible, and the studies were synthesized qualitatively.

The articles included in this review are divided into two categories, those with infection rates or adherence rates that were adjusted for confounding or secular trends and those that adjusted for neither. Because of the extensive challenges to the validity of the latter, they are not included in the detailed description of the body of evidence or assessment of the strength of evidence. They are described briefly under each type of infection in the Results chapter of the full report, included in Appendix C, and enumerated in Appendix F.

The overall strength-of-evidence grade was determined in compliance with AHRQ's Methods Guide for Effectiveness and Comparative Effectiveness Reviews43 and is based on a system developed by the Grading of Recommendations Assessment, Development and Evaluation (GRADE) Working Group.44 This system explicitly addressed the following domains: risk of bias, consistency, directness, and precision. The grade of evidence strength was classified into the following four categories:

  • High. High confidence that the evidence reflects the true effect. Further research is very unlikely to change our confidence in the estimate of effect.
  • Moderate. Moderate confidence that the evidence reflects the true effect. Further research may change our confidence in the estimate of effect and may change the estimate.
  • Low. Low confidence that the evidence reflects the true effect. Further research is likely to change our confidence in the estimate of effect and is likely to change the estimate.
  • Insufficient. Evidence is either unavailable or does not permit estimation of an effect.

Additional domains including strength of association, publication bias, coherence, dose response relationship, and residual confounding were addressed if appropriate. Specific outcomes and comparisons were rated depending on the evidence found in the literature review. The grade rating was made by independent reviewers, and disagreements were resolved by consensus adjudication.

Originally, we planned to use the modification of the GRADE approach for patient safety practices proposed in the RAND report,20 but then decided to use the qualitative approach outlined above, given the heterogeneity of the included studies.

Peer Review, Public Commentary, and Technical Expert Panel

A Technical Expert Panel (TEP) was formed to provide consultation on the development of the protocol and evidence tables for the review. Ad hoc clinical questions were also addressed to the TEP. The TEP consisted of experts in healthcare-associated infectious diseases, epidemiology, hospital medicine, surgery, critical care, and perioperative nursing.

Experts in hospital-acquired infections and QI implementation fields and individuals representing stakeholder and user communities were invited to provide external peer review of this CER; AHRQ and an associate editor also provided comments. The draft report was posted on the AHRQ website for 4 weeks to elicit public comment. We addressed all reviewer comments, revising the text as appropriate, and documented our responses in a disposition of comments report that will be made available 3 months after the Agency posts the final CER on the AHRQ website.



The following fields were NOT abstracted for the studies that controlled for confounding and/or secular trend: (1) clinical characteristics, (2) number of health care staff, (3) interventionists, (4) intervention expected influence on behavior, (5) financial status, (6) description of incentives, and (7) description of feedback and consequences.

For the studies that did not control for confounding, fewer fields were abstracted. Only the following fields were abstracted for this set of articles: (1) study design, (2) infections reported, (3) QI strategies, (4) intervention and comparator used, and (5) cost data, if available.