Effectiveness of QI Strategies

This investigation determined that QI interventions provided small-to-modest improvements in glycemic control and provider adherence. Taken as a whole, the interventions studied in the 66 included comparisons reported a median absolute reduction in serum HbA1c of 0.48% (IQ range: 0.20%, 1.38%) and a median absolute increase in provider adherence of 4.9% (IQ range: 3.8%, 15.0%) above any improvements observed from “usual care.” The researchers also found that interventions involving more than one QI strategy resulted in a greater benefit than did interventions using a single strategy. This difference achieved statistical significance, but nevertheless should be interpreted with caution, as the small number of single-faceted interventions in the review makes confounding by other factors (e.g., the intensity of these single interventions, as well as various patient, provider, and organizational characteristics) a substantial possibility. Disease management and changes to the existing medical record system (e.g., implementation of a specialized patient registry, or a more generalized clinical information system) were associated with a trend toward larger improvements in glycemic control, but these relationships were not pre-specified as hypotheses. Moreover, they would not withstand correction for multiple comparisons in this largely exploratory analysis, and were less pronounced in RCTs.

The findings should be interpreted cautiously for several reasons. First, the reviewers found that larger and more rigorously designed trials found a smaller benefit than did smaller or less rigorously designed trials. As discussed below, this finding strongly suggests the presence of publication bias. Second, most interventions involved multiple QI strategies, thus limiting assessments of the intrinsic benefit for any particular QI strategy. Finally, this review considered only QI studies regarding diabetes. QI studies related to other diseases are relevant in understanding the usefulness of specific QI strategies, as discussed in further detail below. Because of the importance of potential publication bias, the discussion will begin with this topic.

Publication Bias

The researchers found a significant inverse correlation between trial design and the magnitude of reported improvements in provider adherence, and, to a lesser extent, glycemic control (i.e., comparisons employing a randomized design reported significantly smaller improvements). Whereas non-randomized trials reported a median absolute improvement in provider adherence of 18.0% (IQ range: 17.2%, 21.0%), randomized trials reported a median improvement of only 4.5% (IQ range: 3.5%, 5.4%). The Spearman correlation coefficient was significant for both outcomes, but the investigators also tested the impact of trial design in a regression model adjusted for baseline differences between the control and intervention groups, as well as weighting by sample size, to ensure that this did not reflect baseline imbalances in study groups (which would occur more commonly in non-randomized trials). This analysis eliminated the significant relationship between trial design and glycemic control, but the relationship remained highly significant for provider adherence. On average, the improvement in provider adherence observed in randomized trials was 14.3% less than that observed in non-randomized trials (p=0.001).
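The adjusted analysis described above can be sketched as a weighted least-squares regression. The following is a minimal illustration only, not the investigators' model: the study records, the variable names, and the rule used to generate the outcomes are all hypothetical, and this section does not specify the model form beyond adjustment for baseline differences and weighting by sample size.

```python
def solve_linear_system(matrix, vector):
    """Solve matrix @ x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot to keep the elimination stable.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def weighted_least_squares(rows, outcomes, weights):
    """Minimize sum_i w_i * (y_i - x_i . beta)^2 via the normal equations."""
    p = len(rows[0])
    xtwx = [[sum(w * r[a] * r[b] for r, w in zip(rows, weights))
             for b in range(p)] for a in range(p)]
    xtwy = [sum(w * r[a] * y for r, y, w in zip(rows, outcomes, weights))
            for a in range(p)]
    return solve_linear_system(xtwx, xtwy)


# Hypothetical comparisons (NOT data from the review):
# (randomized: 0 or 1, baseline difference between arms, sample size)
studies = [(0, 2.0, 50), (0, -1.0, 80), (0, 4.0, 120),
           (1, 0.5, 400), (1, -0.5, 900), (1, 1.0, 1500)]

# Outcomes generated from a made-up rule so the fit can be checked by eye.
outcomes = [20.0 - 10.0 * r + 0.5 * b for r, b, _ in studies]

design = [[1.0, float(r), b] for r, b, _ in studies]  # intercept, design, baseline
weights = [float(n) for _, _, n in studies]           # weight by sample size

intercept, randomized_effect, baseline_effect = weighted_least_squares(
    design, outcomes, weights)
print(intercept, randomized_effect, baseline_effect)
```

In this sketch the coefficient on the randomized indicator plays the role of the adjusted difference between randomized and non-randomized trials; a real analysis would also need standard errors and a significance test, which are omitted here.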

For studies of glycemic control, the relationship to trial design was less clear-cut, but a striking inverse relationship existed with sample size. Among the 38 comparisons reporting changes in mean glycemic control, those falling in the lowest quartile of sample size reported a median reduction in serum HbA1c of 1.35% (IQ range: 0.81%, 1.73%), whereas those in the highest quartile reported a median reduction of only 0.10% (IQ range: 0.10%, 0.33%). This inverse relationship between sample size and observed impact on glycemic control was statistically significant (Spearman rank correlation coefficient = 0.39, 95% CI: 0.01, 0.67; p=0.04).§§ These findings strongly suggest substantial publication bias operating at the level of sample size and trial design, such that smaller studies with non-randomized designs are published more often when reported improvements are large than when the improvements are small or negative.
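As an illustration of the rank-based test referred to above, the following minimal sketch computes a Spearman coefficient as the Pearson correlation of ranks, with tied values sharing their average rank. The sample sizes and HbA1c reductions are hypothetical and are not drawn from the included studies. Note that the sign of the coefficient depends on whether the outcome is coded as a reduction or as a signed change.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical comparisons (NOT data from the review).
sample_sizes = [40, 65, 120, 300, 850, 2000]
hba1c_reductions = [1.4, 1.1, 0.9, 0.4, 0.5, 0.1]  # absolute reduction, in %

rho = spearman_rho(sample_sizes, hba1c_reductions)
print(round(rho, 3))  # negative here: larger studies report smaller reductions
```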

Correction for multiple hypothesis testing might give the appearance of chance association for some of these relationships. However, the results of these hypothesis tests also must be considered in the context of the prior probability or expectation that such associations might well exist. As discussed in the Methods section, publication bias is likely to affect this review, at least to the same extent that it exists for meta-analyses of clinical research—if not to a greater extent. Sample size and trial design are the two most often identified factors playing a role in publication bias. Thus, the detection of an inverse relationship between either study size or trial design and the magnitude of reported effect is more plausibly regarded as a confirmation of publication bias, than as a chance association due to multiple comparisons.
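To make concrete what correction for multiple hypothesis testing does to a set of nominal p-values, the sketch below applies the Bonferroni threshold and Holm's step-down procedure. These are two common corrections chosen for illustration; this section does not state which correction the investigators had in mind or how many tests were run in total, so the family of five p-values is hypothetical.

```python
def bonferroni_threshold(alpha, num_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / num_tests


def holm_rejections(p_values, alpha=0.05):
    """Return booleans marking which hypotheses Holm's procedure rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for step, idx in enumerate(order):
        # The k-th smallest p-value is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - step):
            rejected[idx] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return rejected


# Hypothetical family of tests (the true number of tests is an assumption).
p_values = [0.001, 0.04, 0.01, 0.10, 0.005]

threshold = bonferroni_threshold(0.05, len(p_values))
survivors = holm_rejections(p_values)
print(threshold, survivors)
```

In this hypothetical family, a nominal p-value of 0.04 no longer counts as significant after correction, which is the situation the paragraph above weighs against the prior expectation of publication bias.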

Benefit of Multifaceted Interventions

Despite the associations of effect size with sample size and trial design, certain findings appear to reflect more than just the effects of publication bias. In particular, interventions having at least two component QI strategies were associated with median effects significantly larger than were single-faceted interventions. The 32 comparisons involving interventions with at least two strategies reported a median reduction in serum HbA1c of 0.60% (95% CI: 0.30%, 1.40%) compared with a median reduction of 0.00% (IQ range: -0.08%, 0.16%) for single-faceted interventions. These medians are unlikely to be equivalent, given the Mann-Whitney test result of p=0.01. The significance of this difference further increased (p=0.005) when the investigators reclassified interventions using a scheme similar to that of other authors, in which the major substrategies of provider education and organizational change were treated as their own categories. These results might be considered of borderline significance, given the multiple hypotheses explored in the analysis, except that this hypothesis differed from the others in its role as one of three a priori hypotheses.
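The Mann-Whitney comparison used above can be sketched as follows. This is a simplified illustration with hypothetical HbA1c reductions: it uses the normal approximation with a continuity correction and no tie correction, which is only a rough guide for groups this small and is not necessarily how the investigators computed their p-values.

```python
import math


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: each pair with a > b counts 1, each tie 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def two_sided_p_normal(u, n_a, n_b):
    """Two-sided p-value from the normal approximation to U.

    Applies a continuity correction and ignores ties (a simplification).
    """
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = max((abs(u - mean_u) - 0.5) / sd_u, 0.0)
    return math.erfc(z / math.sqrt(2))


# Hypothetical absolute HbA1c reductions in % (NOT data from the review).
multifaceted = [0.3, 0.5, 0.6, 0.9, 1.4, 1.6]
single_faceted = [-0.1, 0.0, 0.1, 0.2]

u = mann_whitney_u(multifaceted, single_faceted)
p = two_sided_p_normal(u, len(multifaceted), len(single_faceted))
print(u, round(p, 4))
```

Because the test works on ranks rather than raw values, it compares the two distributions without assuming normality, which suits the skewed effect sizes typical of small QI trials.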

Nevertheless, this finding will require further exploration. Other reviews have reached conflicting conclusions regarding the relative impact of adding more QI strategies, irrespective of their content.16, 54, 85, 103 As with the analysis of specific QI types, the apparent impact of a particular number of strategies (even the simple distinction between single- and multifaceted) is confounded by the distribution of the particular QI types across interventions. The authors cannot rule out the possibility that the one or two strategies with the largest true underlying effects happen to be included in interventions that incorporated more QI components. Further confounding undoubtedly occurs as a result of non-random relationships between the adoption of more complex interventions and characteristics related to the local proponents of the intervention and/or the organizational milieu. For example, more complex interventions may occur more commonly in institutions with a greater commitment to quality improvement, which might affect support from senior management, availability of resources, and attitudes of participants, among other potential predictors of intervention success.

Uncertain Benefit for Specific QI Strategies

Disease management was the only strategy to exhibit an impact on median effects on glycemic control that approached a level of significance such that it would withstand correction for multiple hypothesis testing. Even without such adjustment, however, this apparent effect was diminished somewhat by a focus on larger trials and diminished substantially by restricting the analysis to randomized trials. Moreover, the regression analysis adjusting for baseline group imbalances and weighting by sample size yielded a non-significant result for disease management as a predictor of improved glycemic control.

A recent systematic review of disease management strategies reported significant beneficial effects on measures of disease control such as those the authors examined.17 This comprehensive and well-conducted review had the advantage of cutting across multiple conditions (in contrast to this review of diabetes, and another systematic review focused on disease management for heart failure patients167). The recent crosscutting review,17 however, did not take into account cluster effects.168 Nor did it adjust for baseline differences between intervention and control groups. As outlined in the Methods section and reviewed elsewhere at length, the adoption of a randomized design does not preclude the need to adjust for baseline differences likely to impact the outcome of interest, even when these baseline differences do not appear significant.90–94

Among other individual QI strategies, trials using provider education achieved the highest absolute reduction in HbA1c and differed significantly, by the Mann-Whitney test, from trials without provider education. Provider education also was the only strategy to emerge as a significant predictor for improved provider adherence in regression analysis. As outlined above, however, these results lose their significance when adjusted for multiple comparisons.

Little Benefit from Existing Clinical Information Systems

Apart from the implementation of a new clinical information system (which was treated as a type of organizational change), the investigators further assessed the potential impact of existing clinical information systems performing any of five specific roles. Thirty percent of the included interventions involved some role for a clinical information system, and these interventions reported greater median improvement in glycemic control than did interventions without any role for a clinical information system. This difference was not statistically significant, however (p=0.10 for Mann-Whitney test even without adjustment for multiple comparisons), and shifting the focus to larger studies and those with a randomized design substantially diminished the appearance of a benefit for interventions with some role for a clinical information system. Clinical information systems also had no apparent additional effect on provider adherence, compared with interventions without any role for an information system.

Focusing on specific roles for clinical information systems suggested no incremental benefit for any particular informatics function (e.g., decision support, auditing clinical performance, reminder systems). It should be noted that these findings reflect very small numbers of studies. The disappointing findings, however, also should be considered in light of likely confounding factors, which could inflate reported effects. For instance, the presence of sophisticated information systems is likely to be associated with the presence of other factors plausibly associated with successful interventions (e.g., greater financial resources, increased institutional investment in QI). Furthermore, while other reviews have found evidence supporting the impact of decision support systems,33, 169 it is noteworthy that the most recent and possibly best-designed study assessing the impact of a clinical information system in outpatient management of chronic illnesses showed no beneficial impact on processes of care or any patient outcome for asthma or chronic angina.170 The same investigators are likely to publish the results of a similar trial focused specifically on diabetes care in the near future,171 which will add substantially to the evidence addressing this topic.

Of course, the absence of a demonstrable benefit does not prove a lack of benefit, and there are sound a priori reasons to believe that changes to existing medical record systems (e.g., a clinical information system deployment) might confer some benefit in diabetes care. In addition to the non-significance of this result, however, it is worth noting that evaluations of clinical information systems involve a special type of publication bias, insomuch as systems with failed172–174 or unsatisfactory175 implementations generally are excluded from evaluations of the intervention benefits, even though these implementations consume significant QI resources.

Comparison with Previous Review of this Topic

In the previous Cochrane review of this topic,16 all included trials were judged to have more than one QI strategy, permitting no direct comparison of single and multifaceted interventions. (The authors inferred a benefit from multifaceted interventions based on the general finding of positive effects for the various multifaceted interventions evaluated.) The researchers involved with this review regarded 14 trials as having a single QI strategy, nine of which were published after the last substantive update to the Cochrane review.89, 104, 112, 114, 116, 119, 121, 147, 155 The multiple QI strategies designation given to the remaining five studies by the Cochrane reviewers reflected differences in taxonomy in several instances. For instance, educational meetings and distribution of educational materials were considered separate strategies, rather than substrategies within the broader category of provider education. In other cases, however, there appears to have been a difference in judgment between the present reviewers and those involved with the previous study. Nevertheless, when the present investigators employed a taxonomy more akin to that used in the Cochrane review (in which major substrategies of provider education and organizational change were treated as their own categories), multifaceted interventions showed a median reduction in serum HbA1c of significantly greater magnitude than that reported by single-strategy interventions (0.58% vs. 0.05%; p=0.005).

The Cochrane review also included, as an interrupted time series, a study176 that the present investigators regarded as a simple before-after study and therefore excluded. Re-review of these studies might produce consensus, but this disagreement further reinforces the notion that categorizing design attributes for trials can be challenging and requires interpretation based on limited descriptions.

The Cochrane review16 of this topic suggested that multifaceted interventions carried benefit, and cited organizational changes (including computerized patient tracking systems and structured recall of patients) as particularly worthwhile interventions. The present review lends greater credence to the benefit of multifaceted interventions, as the Cochrane review included no single-faceted interventions and performed no quantitative analysis. Thus, the benefit of multifaceted interventions was inferred simply from the qualitative impression of benefit derived from the included studies, without any comparison to single-faceted interventions.


An important limitation of this review arises from the studies themselves, and underscores the need for more rigorously designed studies of quality improvement interventions. The limitation can be split into two categories: issues specifically related to the design and interpretation of research in quality improvement64, 79, 177, 178 and problems with respect to the optimal design and reporting of health care research in general. Examples of the second category include a failure to report baseline data for the outcomes of interest, data omissions such as sample sizes or standard deviations, and inappropriate choices of statistical tests. Examples of the first category (problems more specifically related to quality improvement) highlight important gaps in the literature. They include limited descriptions of the interventions themselves, such that many interventions could be neither categorized except in the most general terms (e.g., “provider education” or “disease management”) nor replicated by other investigators, and omissions of important information regarding factors likely to affect intervention success or failure (e.g., the degree of institutional support, the availability of ancillary administrative resources, the attitudes of participants towards the intervention, or the perceived quality target). Some of this information is challenging to collect or quantify (e.g., degree of institutional support), but other types of data (e.g., attitudes of participants) could be collected and may prove helpful in determining why some QI interventions succeed while others fail. Another factor hampering quality improvement research is the lack of grounding in a theoretical understanding of how to effect change at the level of individual behavior or organizational culture and structure (see Chapter 3). For instance, the vast majority of studies provided no answer to the most basic question of why a particular QI strategy was selected to address a given problem (e.g., why provider education and not, say, audit and feedback, or vice versa). Choices regarding the format for delivering the selected QI strategy similarly received little to no attention.

The lack of theoretic grounding for many QI interventions is in stark contrast to clinical research (see Chapter 3). By the time a clinical intervention reaches the stage of evaluation in a randomized trial, a substantial body of research (both basic scientific and epidemiologic) generally exists and lends credence to the hypothesis that the intervention will benefit patients. A number of theoretic models of behavioral or organizational change also exist, as noted in Chapter 2, but their usefulness and applicability to QI interventions in health care have not been well studied. Improving the state of evidence regarding QI interventions will require commensurate improvements in the preliminary research leading up to the design or selection of a particular intervention.

Despite the aforementioned general concerns regarding the methodology and underpinnings of many QI intervention evaluations, it is possible that beneficial interventions already exist and that our analysis has failed to identify their benefits relative to other interventions or usual care. It is worth noting in this regard that the investigators' analysis suggested a modest but statistically significant correlation between study period and baseline provider adherence (i.e., more recent studies tended to report higher baseline adherence). This suggests that quality improvement has occurred and, perhaps more importantly, that achieving the same rate of improvement may become more difficult with time. The decision to pursue further improvements with respect to any given target depends on a number of factors: the effectiveness of the targeted process of care (e.g., how well tight control works to prevent diabetic complications); the effectiveness of strategies for promoting further changes in patient or provider behavior; the costs associated with the targeted quality gap; and the costs of interventions attempting to narrow this gap.179

The small number of studies in certain areas also is likely to limit our ability to detect true benefits for QI strategies. Although this review is the largest study of this particular topic area to date, the number of studies for a particular QI strategy assessing specific outcomes was generally 10 or fewer. When possible, the investigators performed regression analysis and also nonparametric tests to assess for differences between medians. Given the small number of studies in some groups, however, the possibility that small effects may have gone undetected cannot be ignored.

In an attempt to increase the number of studies providing data for the quantitative analysis, the researchers focused on glycemic control (measured by serum HbA1c) as the sole measure of disease control. Thus, they were unable to capture the impacts of the interventions on other important aspects of diabetes-related morbidity, such as cardiovascular disease. A QI strategy producing even a modest impact on blood pressure control or hyperlipidemia could confer substantial benefits to patients. In fact, one of the most comprehensive studies in the sample82 demonstrated a reduction in cardiovascular mortality, while reporting no significant reduction in HbA1c for intervention patients compared with patients receiving usual care. Thus, in the absence of markedly positive effects for any single strategy, a crosscutting strategy that has modest individual effects on glycemic control, hypertension, and hyperlipidemia could result in a significant benefit for diabetic patients.



§§ This inverse correlation persisted with restriction to randomized trials and even among randomized trials without clustering effects (Spearman = 0.415; p=0.07). The loss of statistical significance for the second subset presumably reflects the decreased number of studies.