The methods for this review follow those suggested in the Agency for Healthcare Research and Quality (AHRQ) Methods Guide for Effectiveness and Comparative Effectiveness Reviews. The main sections in this chapter reflect the elements of the protocol established for the review (and the Closing the Quality Gap series). All methods and analyses were determined a priori, unless otherwise specified.

Topic Refinement and Review Protocol

Topics for the Closing the Quality Gap series were solicited from the portfolio leads at AHRQ. The nominations included a brief background and context, the importance and/or rationale for the topic, the focus or population of interest, relevant outcomes, and references to recent or ongoing work. In selecting nominated topics for inclusion in the series, the following considerations applied: the ability to focus and clarify the topic area appropriately; relevance to quality improvement and a systems approach; applicability to the Evidence-based Practice Center (EPC) program; amenability to systematic review; the potential for duplication and/or overlap with other known or ongoing work; relevance and potential impact in improving care; and fit of the topics as a whole in reflecting the AHRQ portfolios.

The EPC then clarified the scope of the project. A key consideration was ensuring that the report built upon and added to existing syntheses of this topic. Rather than replicate ongoing updates of a Cochrane review by Haynes and colleagues,28 we sought to address some of the areas outside its purview, and in doing so, pay attention to the themes of the Closing the Quality Gap series and AHRQ’s concerns regarding priority and vulnerable populations. The specific constraints of the Haynes review that we wanted to address included (1) the requirement that included studies had to report both adherence and health outcomes, (2) the focus on randomized controlled trials (RCTs) alone, (3) the absence of subanalyses on vulnerable subpopulations, and (4) the lack of focus on adverse events.

As noted in the introduction, one reason for expanding the scope to include studies that report adherence alone rather than both health outcomes and adherence is that this approach allowed us to include a more representative range of interventions that might improve adherence. We note that interventions may be designed to alter moderators of medication adherence at the level of the patient, health care provider, health system, or policy. The reason for expanding the scope to include some observational studies (such as controlled clinical trials, cohort studies with comparators, and large database analyses) is that these studies allowed us to assess the effectiveness of policy innovation in practice settings that are not usually tested in trial settings.

AHRQ staff generated the initial topics for this series and our review. We generated an analytic framework, preliminary Key Questions (KQs), and preliminary inclusion/exclusion criteria in the form of PICOTS (populations, interventions, comparators, outcomes, timing, settings). Our KQs were posted for public comment on AHRQ’s Effective Health Care Web site from March 11, 2011, to April 8, 2011. We revised the KQs as needed based on review of the comments and discussion with a five-member Technical Expert Panel (TEP), primarily for readability and greater comprehensiveness.

TEP members represented several professions (medicine, nursing, and pharmacy) and research areas (health services, pharmacoepidemiology, patient education, self-management, and health literacy). They provided high-level content and methodologic expertise throughout the development of the review.

Literature Search Strategy

Search Strategy

To identify articles relevant to each KQ, we began with a focused MEDLINE® search for medication adherence interventions using a combination of Medical Subject Headings (MeSH) and title and abstract keywords (Appendix A). We searched the Cochrane Library and the Cochrane Central Trials Registry using analogous search terms. To identify articles specifically relevant to KQ 2, we conducted a second, “policy-oriented” search (Appendix A) and added unique results to those references identified in the main search for medication adherence interventions. We reviewed our search strategy with TEP members and supplemented it as needed according to their recommendations. In addition, to avoid retrieval bias, we manually searched the reference lists of pertinent reviews on this topic to look for any relevant citations that might have been missed by our searches. We imported all citations into an EndNote® X4 (Thomson Reuters, New York, NY) electronic database.

We conducted an updated literature search (of the same databases searched initially) concurrent with the peer review process. Literature suggested by peer reviewers or by the public was investigated and, if appropriate, incorporated into the final review. Appropriateness for inclusion in the review was determined by the same methods listed above.

Inclusion and Exclusion Criteria

Table 2 presents the inclusion/exclusion criteria for our review. Details about PICOTS related to inclusion/exclusion criteria can be found in the Introduction chapter.

Table 2. Inclusion/exclusion criteria.

Study Selection

Two trained members of the research team independently reviewed all titles and abstracts (identified through searches) for eligibility against our inclusion/exclusion criteria. The abstract review form is shown in Appendix B. Studies marked for possible inclusion by either reviewer underwent a full-text review. For studies that lacked adequate information to determine inclusion or exclusion, we retrieved the full text and then made the determination. All results were tracked in an EndNote® database.

We retrieved and reviewed the full text of all titles included during the title and abstract review phase. Two trained members of the team independently reviewed each full-text article for inclusion or exclusion based on the eligibility criteria described above. The full-text review form is shown in Appendix B. If both reviewers agreed that a study did not meet the eligibility criteria, the study was excluded. If the reviewers disagreed, they resolved conflicts by discussion and consensus or by consulting a third member of the review team. All results were tracked in an EndNote database. We recorded the principal reason that each excluded full-text publication did not satisfy the eligibility criteria (Appendix C).
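
The two-stage screening rules described above can be sketched as a pair of small decision functions. This is an illustration only, not the review team's actual tooling: the function names, the vote labels, and the returned status strings are all hypothetical.

```python
from typing import Literal

# Hypothetical vote labels; "unclear" stands for an abstract that lacks
# adequate information to determine inclusion or exclusion.
Vote = Literal["include", "exclude", "unclear"]


def abstract_stage(vote_a: Vote, vote_b: Vote) -> str:
    """Title/abstract screening: a study advances to full-text review if
    either reviewer marks it for possible inclusion or cannot decide."""
    if any(v in ("include", "unclear") for v in (vote_a, vote_b)):
        return "full_text_review"
    return "excluded"


def full_text_stage(vote_a: Vote, vote_b: Vote) -> str:
    """Full-text screening: agreement settles the decision; any
    disagreement goes to discussion or a third reviewer."""
    if vote_a == vote_b == "exclude":
        return "excluded"
    if vote_a == vote_b == "include":
        return "included"
    return "resolve_by_consensus_or_third_reviewer"
```

The asymmetry between the stages is deliberate in this sketch: the abstract stage errs toward retrieval (one vote is enough to advance a study), whereas the full-text stage requires agreement before a study is excluded or included.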

Data Extraction

For studies that met our inclusion criteria, a trained reviewer abstracted important information into evidence tables; a second senior member of the team reviewed all data abstractions for completeness and accuracy. We designed and used structured data abstraction forms to gather pertinent information from each article, including characteristics of study populations, interventions, comparators, settings, study designs, methods, and results. All data abstraction was performed using Microsoft Excel® software. Evidence tables containing all abstracted data from included studies are presented in Appendix D. Evidence tables are presented in alphabetical order by last name of first author.

As specified above for KQ 1 and KQ 2, we abstracted data on other outcomes only for interventions that showed statistically significant improvement in at least one measure of medication adherence. We used thresholds for medication adherence as defined by each study; that is, we did not predefine standards for improvement in medication adherence for all clinical conditions. We recorded all morbidity and biomarker data for studies reporting any statistically significant improvement in medication adherence. We abstracted information on patient characteristics such as age, sex, race and ethnicity, special health care needs (such as low health literacy, comorbid disease, or severe disease), income, insurance status, and geographic location (inner city or rural), when available. We recorded intention-to-treat (ITT) results when available; ITT analysis treats all participants as if they have completed the study within their treatment assignment groups, even if they have stopped participating. This type of analysis can be done by carrying forward participants’ baseline observations or their last observations before study completion or attrition. We also abstracted intervention characteristics as described in KQ 3.
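
The two carry-forward approaches mentioned above can be illustrated with a minimal sketch. It is not taken from any included study: the function name, the method labels ("locf" and "bocf"), and the sample adherence values are hypothetical.

```python
from typing import Optional, Sequence


def endpoint_value(
    baseline: float,
    followups: Sequence[Optional[float]],
    method: str = "locf",
) -> float:
    """Return the value analyzed at study end for one participant.

    `followups` holds post-baseline observations in visit order;
    None marks a visit with no observation (e.g., after dropout).
    """
    # A participant observed at the final visit needs no imputation.
    if followups and followups[-1] is not None:
        return followups[-1]
    if method == "bocf":
        # Baseline observation carried forward.
        return baseline
    if method == "locf":
        # Last observation carried forward; fall back to baseline
        # when no follow-up observation exists at all.
        observed = [v for v in followups if v is not None]
        return observed[-1] if observed else baseline
    raise ValueError(f"unknown method: {method!r}")


# Hypothetical participant: proportion of doses taken at each visit,
# with the final visit missed.
locf_value = endpoint_value(0.60, [0.70, 0.80, None])
bocf_value = endpoint_value(0.60, [0.70, 0.80, None], method="bocf")
```

In this example the last-observation approach analyzes the participant at 0.80, and the baseline approach analyzes the same participant at 0.60, which shows how the choice of method can change the estimated effect.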

Risk-of-Bias Assessment of Individual Studies

To assess the risk of bias (internal validity) of studies, we used predefined criteria based on those developed by AHRQ83 and specified in the RTI Item Bank.84 In general terms, the results from a low-risk-of-bias study are considered to be valid. A study with moderate risk of bias is susceptible to some risk of bias but probably not enough to invalidate its results. A study assessed as high risk of bias has significant risk of bias (e.g., stemming from serious errors in design or analysis) that may invalidate its results.

Specific concerns for our review include selection bias, information bias, and detection bias. For selection bias, we evaluated studies for their approaches to recruitment and accounting or controlling for variations in past nonadherent behavior. Selection bias occurs when comparison groups are systematically different because of nonequivalent sample recruitment methods.

For information bias, we evaluated studies for their application of proper research design to reduce the possibility that factors other than the interventions affected outcomes of interest. Information bias refers to systematic error in the measurement of covariate and outcome data that leads to differences between comparison groups not caused by the intervention of interest. Design elements that reduced the risk of information bias included the use of double blinding, allocation concealment, ITT analysis, nonselective outcome reporting, and strategies to prevent or reduce treatment contamination. When investigators did not use ITT analysis, we considered the risk of information bias to be elevated if treatment completers differed from noncompleters or if completers were not compared with noncompleters.

For detection bias, we evaluated the method of recording adherence. In particular, we evaluated whether adherence measures relied solely on self-reported data. Detection bias is a type of information bias in which the measurement of outcomes is prone to error because of how they are measured.

Two reviewers independently assigned risk-of-bias ratings for each study. Disagreements between the two reviewers were resolved by discussion and consensus or by consulting a third member of the team. We excluded studies that were dually assessed as having high risk of bias from further analysis. The evidence tables present consensus ratings for all studies with low, medium, or high risk of bias (Appendix E). A list of scales used in included studies is presented in Appendix F.

Data Synthesis

We elected to stratify our results in KQ 1 by clinical condition. We chose clinical condition (rather than, say, intervention type) as our primary analytic lens because this approach allowed us to disentangle the possible confounding between clinical condition and type of intervention. Our analytic approach is useful for researchers working within a clinical condition. For clinical providers interested in the effectiveness of particular intervention approaches aimed at patients, providers, or the system, we present a brief synopsis of intervention effectiveness across clinical conditions in our discussion chapter.

Given the wide variation of care in the “usual care” arms of included interventions, we did not attempt indirect comparisons across interventions for KQ 1. For trials that selected patients with two concurrent clinical conditions and evaluated medication adherence and other outcomes for both conditions, we sought to reduce repetition by focusing on the outcomes specific to the medication relevant to each clinical condition. We grouped trials that selected patients with more than two concurrent clinical conditions under a section entitled “multiple chronic conditions.”

KQ 2, on policy interventions, summarizes information on interventions designed to address many or all clinical conditions. We present KQ 2 by intervention type first and then provide condition-specific details. KQ 3 presents results categorized by intervention characteristics. KQ 4 presents outcomes by vulnerable subpopulation, and KQ 5 presents a list of adverse events.

We specified all outcomes other than morbidity and biomarkers a priori and listed them in the PICOTS criteria (see the Introduction). Because of the breadth of the topic for our review, we elected, based on feedback from our TEP, to collect a comprehensive set of biomarkers and morbidity outcomes rather than make a priori judgments about which specific outcomes to include. When appropriate data were available, we described results from direct comparisons. We did not attempt indirect comparisons, given the heterogeneity of usual care comparators.

We evaluated whether the collected data could be pooled by considering similarity of PICOTS. In instances with three or more similar studies (population, intervention, comparator, outcome), we considered conducting quantitative analyses (i.e., meta-analysis) of the data from those studies. When quantitative analyses were not appropriate (e.g., because of heterogeneity, insufficient numbers of similar studies, or insufficiency or variation in outcome reporting), we synthesized the data qualitatively.
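
The three-study threshold described above can be sketched as a simple grouping step. This is an illustration under strong assumptions: the `Study` fields, the function name, and the use of exact string matching are hypothetical stand-ins for the reviewers' judgment about similarity, and meeting the threshold means only that pooling would be considered, not that it is appropriate.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Study(NamedTuple):
    # Hypothetical, simplified record of one included study.
    study_id: str
    population: str
    intervention: str
    comparator: str
    outcome: str


def pooling_candidates(
    studies: Iterable[Study], min_studies: int = 3
) -> Dict[Tuple[str, str, str, str], List[str]]:
    """Group studies that share population, intervention, comparator,
    and outcome, and keep groups with at least `min_studies` members."""
    groups: Dict[Tuple[str, str, str, str], List[str]] = defaultdict(list)
    for s in studies:
        key = (s.population, s.intervention, s.comparator, s.outcome)
        groups[key].append(s.study_id)
    return {k: ids for k, ids in groups.items() if len(ids) >= min_studies}
```

Groups that fall below the threshold, or that meet it but differ too much in how outcomes are reported, would be synthesized qualitatively instead.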

Grading Strength of Evidence

We graded the strength of evidence based on the guidance established for the EPC program.85 Developed to grade the overall strength of a body of evidence, this approach incorporates four key domains: risk of bias (including study design and aggregate quality), consistency, directness, and precision of the evidence. We reviewed and handsearched citations from relevant systematic reviews to ensure that we included all eligible studies.

We graded the strength of evidence for medication adherence, morbidity, mortality, and other long-term health outcomes for KQ 1 and KQ 2, for vulnerable subpopulations (KQ 4), and for harms (KQ 5). Two reviewers independently scored each domain for each key outcome and resolved differences by consensus; when they could not reach consensus, a third senior reviewer arbitrated the decision. Table 3 defines the strength-of-evidence grades.

Table 3. Definitions of the grades of overall strength of evidence.

Applicability Assessment

We assessed the applicability of the evidence following guidance from Atkins and colleagues.86

We used the PICOTS framework to explore factors that affect or limit applicability. They included the following:

  • Population

    Narrow eligibility criteria or exclusion of patients with comorbidities.

    Large differences between demographics of the study population and community patients.

    Narrow or unrepresentative disease severity, stage of illness, or comorbidities.

  • Interventions

    Intensity and delivery of behavioral interventions that may not be feasible for routine use.

    Highly selected intervention team or level of training and proficiency not widely available.

  • Outcomes

    Composite outcomes that mix outcomes of different clinical or policy significance.

    Short-term or surrogate outcomes.

Peer Review and Public Commentary

This report received external peer review. Peer Reviewers were charged with commenting on the content, structure, and format of the evidence report, providing additional relevant citations, and pointing out issues related to how we conceptualized the topic and analyzed the evidence. Our Peer Reviewers (listed in the front matter) gave us permission to acknowledge their review of the draft. In addition, the Scientific Resource Center placed the draft report on the AHRQ Web site for public review. We compiled all peer review and public comments and addressed each one individually, revising the text as appropriate. AHRQ staff and an associate editor provided reviews. A disposition of comments from public commentary and peer review will be posted on the AHRQ Effective Health Care Web site 3 months after the final report is posted.