NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Berkman ND, Sheridan SL, Donahue KE, et al. Health Literacy Interventions and Outcomes: An Updated Systematic Review. Rockville (MD): Agency for Healthcare Research and Quality (US); 2011 Mar. (Evidence Reports/Technology Assessments, No. 199.)

  • This publication is provided for historical reference only and the information may be out of date.

Health Literacy Interventions and Outcomes: An Updated Systematic Review.

In this chapter, we document the procedures used by the RTI International–University of North Carolina Evidence-based Practice Center (RTI–UNC EPC) to develop this comprehensive evidence report Health Literacy Interventions and Outcomes, an update to our 2004 systematic review Literacy and Health Outcomes. The key questions (KQs) for this update review are the same as those in the original review, with the exception that literacy has been replaced by the broader term health literacy. This decision, which is discussed in detail in Chapter 1, was primarily made to acknowledge numeracy (the ability to use quantitative information) and oral literacy (the ability to listen and speak effectively) in addition to print literacy. Thus, in this review as in our original report, we include studies that purport to measure either participants' health literacy or their general literacy in a health setting; however, we refer to these measures in aggregate as measures of health literacy. We also separately review studies of numeracy and health outcomes to highlight the findings from this relatively new body of research. Although we attempted to review the relationship between oral health literacy skills and health outcomes, we found no studies that both measured oral health literacy skills and met our other inclusion criteria.

Our specific methodology in conducting an updated review is discussed below. To provide a framework for the review, we first present changes from our prior review. We then describe the KQs and their underlying analytic framework, our inclusion and exclusion criteria, search and retrieval process, and methods of abstracting relevant information from the eligible articles to generate evidence tables. We also discuss our criteria for rating the quality of individual studies and for grading the strength of evidence as a whole.

Our overall goals were to evaluate whether newer literature was appropriate for answering our key questions and to determine whether earlier conclusions changed. We modified the original methods as follows:

  • We broadened our definition of health literacy to be consistent with the Ratzan and Parker (2000) definition used by Healthy People 2010 and the Institute of Medicine. Thus, we now include studies that evaluated the numeracy skills of participants. Our inclusion criteria also encompassed studies that used measures of oral (spoken) health literacy or other skills-based approaches to health literacy measurement, but we did not find any such published studies.
  • We examined knowledge as an outcome only in studies of numeracy level and in intervention studies, because the earlier review clearly established that greater literacy skills and higher health-related knowledge levels are positively related.
  • We required that studies directly measure the health literacy of the study population rather than infer health literacy level from self-report or from similarity to other populations.
  • We modified criteria for evaluating individual study quality to incorporate advances in the methodology of conducting systematic reviews, including not using a numeric summary of individual criteria in determining the overall quality rating.
  • We included studies conducted in developing countries as long as an objective assessment of literacy or health literacy was measured directly in participants.
  • If information was missing from articles about intervention studies, we queried the investigators to allow richer interpretation about what interventions may be effective in mitigating the effects of low health literacy.

Key Questions and Analytic Framework

Based on the growing appreciation of the complexity of the relationship between health literacy and obtaining medical care and achieving good health outcomes, we pose two key questions in this report. Both have four parts.

KQ 1.

Are health literacy skills related to

  1. Use of health care services?
  2. Health outcomes?
  3. Costs of health care?
  4. Disparities in health outcomes or health care service use according to race, ethnicity, culture, or age?

KQ 2.

For individuals with low health literacy skills, what are effective interventions to

  1. Improve use of health care services?
  2. Improve health outcomes?
  3. Affect the costs of health care?
  4. Improve health outcomes and/or health care service use among different racial, ethnic, cultural, or age groups?

Figure 1 depicts the analytic framework for our KQs. Solid lines show the relationship between health literacy skills and outcomes (KQ 1) and between interventions and outcomes (KQ 2); dotted lines show factors that might influence or be intermediaries in these relationships.

Figure 1. Analytic framework for the health literacy systematic review. This flow diagram depicts the analytic framework for our key questions. It consists of a series of boxes connected by solid and dotted lines. Solid lines show the relationship between health literacy skills and outcomes (key question 1) and between interventions and outcomes (key question 2); dotted lines show factors that might influence or be intermediaries in these relationships. The box on the left labeled health literacy level is connected to another box with a solid line. This box is labeled key question 1 A: use of health care services. The box on the right is labeled intervention. It is also connected to the key question 1 A box, use of health care services. There is a box on the bottom labeled age, culture, race, and ethnicity. This box is connected with dotted lines to both the boxes labeled health literacy level and intervention. The two boxes labeled health literacy level and intervention are interconnected by solid lines to other boxes labeled cost of health care, use of health care services, and health outcomes.

Figure 2 outlines a more detailed logic model explicating outcomes that were included in our review. This model draws both on several models of health literacy proposed by researchers in the field and on an integrated model of behavioral theory.55,56 The Integrative Theory, proposed by Fishbein in 2000, reflects a growing consensus that (1) a core set of variables (e.g., attitudes, social norms, and self-efficacy) derived from the major predictive theories of behavior change (e.g., Health Belief Model, Theory of Reasoned Action, Social Cognitive Theory) are responsible for most of behavioral intention, and that (2) these variables, in combination with an adequate skill set and removal of environmental constraints, predict actual behavior change.55

Figure 2. Logic model for the health literacy systematic review. This figure is a logic model for analyzing studies of health literacy. Proceeding from left to right in a series of boxes arranged in a flow chart, the figure begins at health literacy level and moves to two boxes, knowledge and accurate risk perception and self-efficacy. Knowledge and accurate risk perception proceeds to four boxes. The first three boxes are attitudes, social norms, and self-efficacy, which advance to intent for health behavior. The fourth box, skills, includes five items: take medications, self-monitoring, recognize emergency, seek additional health information, and access care. The skills box, the intent for health behavior box, and two other boxes, support from provider/joint decisionmaking and resources, which includes ability to pay and access to care, point toward the next box, initiation of health behavior. This box advances to adherence to health behavior. This box and another box, use of health care services, which includes emergency room visits, office visits, hospitalization, and prevention, both point to a final box, health outcomes, which includes disease, disease severity, quality of life, and death.

Our logic model was used to determine whether studies considered for inclusion have relevant health outcomes. It also guided our presentation of included articles. It was not meant to be a definitive guide to the relationship between variables because many of these relationships have not been explicitly tested in the field of health literacy. Furthermore, it was not meant to provide a definitive statement about what constitutes a “good outcome.” For some outcomes in the logic model, increases represent the good outcome (e.g., adherence, most screening tests). For other outcomes, decreases represent the good outcome (e.g., hospitalizations, mortality).

For KQ 1a and 2a, we consider any process of care as a health service; this includes clinic and hospital visits, hospitalizations, and use of preventive and screening services. For KQ 1b and 2b, we use the term “health outcomes” broadly to encompass both intermediate and distal outcomes, even though in many cases the intermediate outcomes will be only surrogates or proxies for health-related end results of care. Outcome categories include the following:

Knowledge: As described above, we consider knowledge as a final outcome only in relation to numeracy (KQ 1) and intervention studies (KQ 2). We do not include it in our consideration of the relationship between health literacy and health outcomes (KQ 1) because evidence in the earlier review clearly concluded that greater literacy skills and higher health-related knowledge levels are positively related.

Self-efficacy: Self-efficacy, a person's confidence in his or her ability to carry out a health behavior, is an important intermediate outcome in many behavioral theoretical models. It is a predictor of behavioral intent.

Behavioral intent: Behavioral intent is a person's stated likelihood of starting a behavior. It is an important hypothesized intermediate step in the causal pathway between health literacy level and health outcomes.

Skills and behaviors: The relationship between health literacy and intermediate and ultimate outcomes depends on a person's health skills and behaviors. Skills include a person's ability to recognize emergency situations, seek additional health information, or access needed health care. Behaviors include actions such as taking medication, changing one's lifestyle, or monitoring one's health.

Adherence to health behavior: Adherence is the ability to carry out a health behavior over a meaningful period of time, such as regularly taking a medication “as prescribed” over the period of time for which it is prescribed. Adherence is an important predictor of health outcomes.

Measures of disease incidence, prevalence, morbidity, and mortality: This category includes such outcomes as rates of physical and mental health conditions, stages of cancer presentation, severity of diseases, measures of disease control and complications, and death rates. These outcomes may be measured by biomarkers, validated survey instruments and questionnaires, patient self-report, or, in the case of mortality, vital records or proxy reports.

Health status: This outcome includes generic (and condition-specific) measures of health status or health-related quality of life; the domains of interest are physical health and mental health functioning (e.g., cognitive abilities), pain or fatigue, and perhaps social functioning and social networks. They are usually assessed by self-report questionnaires that have been shown to predict health outcomes.

Of particular note for KQ 1b is that we did not examine outcomes related to attitudes. This decision was based on the belief that attitudes result from knowledge, which, as described above, is not examined in the current report. Further, we did not examine outcomes related to social norms or patient-provider relationships (e.g., shared decisionmaking) because we thought that these variables likely affected the direction or strength of the relationship between behavioral intent and health outcomes rather than lying on the causal pathway. Clearly, however, empiric work is needed to test these assertions prior to future reviews.

For KQ 1c on measuring the cost of health care, we included any study that measured the monetary cost of health care services, including both direct and indirect costs. For KQ 2c, we also included studies measuring the cost of the intervention.

Finally, to address KQs 1d and 2d concerning disparities in health outcomes and use of health care services, we looked for studies that reported on health literacy level as a mediator of the relationship between age, race, ethnicity, or cultural background and health outcomes (or the effectiveness of interventions). We also included studies that reported moderators of the strength of the relationship between health literacy and health outcomes. This distinction between mediating and moderating is important. A moderator affects the direction or strength of a relationship between an independent and dependent variable and is generally examined by looking for differential effects in subgroup analysis. A moderator effect is commonly observed in an analytic model through a statistically significant interaction of the exposure and the moderator. A mediator, on the other hand, accounts for that relationship, answering the question as to how or why things occur. There are multiple approaches to mediation analysis, including path analysis, structural equation modeling, and methods such as those proposed by Baron and Kenny.57 All test the relationships between the exposure and mediator, mediator and outcome, and exposure and outcome before and after adjusting for the mediator. To establish mediation, these approaches require a reduction in the magnitude of the relationship between the exposure and outcome when the mediator is added to the model.
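The mediation logic described above can be illustrated with a minimal sketch. It is not the analysis used in any included study: the data are simulated, the variable names (exposure, mediator, outcome) and effect sizes are assumptions chosen for illustration, and only simple linear regressions are fitted. The sketch estimates the exposure-outcome relationship before and after adjusting for the mediator and checks that the adjusted estimate is smaller.

```python
import numpy as np

def ols_coefs(y, *predictors):
    """Return ordinary least squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coefs

# Simulated, hypothetical data in which part of the exposure's effect
# on the outcome runs through the mediator.
rng = np.random.default_rng(0)
n = 2000
exposure = rng.normal(size=n)                    # e.g., a sociodemographic score
mediator = 0.8 * exposure + rng.normal(size=n)   # e.g., a health literacy score
outcome = 0.5 * mediator + 0.1 * exposure + rng.normal(size=n)

# Exposure -> outcome, unadjusted (total effect).
total_effect = ols_coefs(outcome, exposure)[1]
# Exposure -> mediator.
a_path = ols_coefs(mediator, exposure)[1]
# Exposure -> outcome adjusted for the mediator, and mediator -> outcome.
direct_effect, b_path = ols_coefs(outcome, exposure, mediator)[1:3]

print(f"total effect:  {total_effect:.3f}")
print(f"direct effect: {direct_effect:.3f}")
print(f"indirect (a*b): {a_path * b_path:.3f}")
```

In this linear setting the total effect equals the direct effect plus the product of the two mediator paths, so a drop from the total to the direct estimate is the reduction in magnitude that the text describes. A moderator, by contrast, would be examined by adding an exposure-by-moderator interaction term to the outcome model.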

Literature Search and Retrieval Process

Database Search Terms

To identify the relevant literature for our review, we searched five electronic databases: MEDLINE®, the Cumulative Index to Nursing and Allied Health Literature (CINAHL), the Cochrane Library, PsycINFO, and the Educational Resources Information Center (ERIC). For health literacy, we searched using a variety of terms limited to English and studies conducted with human participants (no laboratory or animal studies) published from 2003 to May 25, 2010. For numeracy, we searched the same databases from 1966 to May 25, 2010. We conducted key word searches because no MeSH headings specifically identify health-literacy-related articles. The terms “health literacy,” “numeracy,” and “literacy,” and terms or phrases related to instruments known to measure health literacy and numeracy were the focus of the search. We limited the “health literacy” and “literacy [tw = ‘text word’]” searches to 2003 forward (including up to 1 year overlap with our earlier review) to be confident that we did not miss studies between the first review and this update, and we compared new and earlier reference lists to ensure that we did not unnecessarily overlap with the literature reviewed earlier. Editorials, letters to the editor, and case reports were excluded.

Across all databases searched, our initial searches yielded 2,855 citations (Appendix A). We reviewed our search strategy with the TEP and further supplemented our electronic searches by hand searching pertinent excluded articles, including other reviews.

We imported all citations into an electronic database (EndNote X.3) for a final unduplicated yield of 3,496 articles.

Study Selection Process

Inclusion and Exclusion Criteria

For each KQ, we developed detailed eligibility criteria with respect to population, intervention, comparison, outcomes, time frames, and settings (the PICOTS framework).58 The final criteria include the following:

KQ 1. Relationship of health literacy levels to utilization, outcomes, costs, and disparities

Population: Individuals and caregivers of all races and ethnicities.

Intervention: Not applicable.

Comparison: Different levels of health literacy or numeracy skills.

Outcomes: For studies of outcomes by levels of health literacy, relevant health or cost outcomes with the exception of knowledge; the relationship between literacy and health-related knowledge was considered well-established through the earlier review. For studies of outcomes by numeracy levels, relevant health or cost outcomes and knowledge.

Time: Cross-sectional or longitudinal studies, with varying lengths of time for followup, and with no restrictions for when the studies or data collection activities were done.

Setting: No exclusions by setting, so includes inpatient or outpatient settings in health care systems and institutions, various community-based settings, or homes.

KQ 2. Effective interventions to improve utilization or health outcomes or to affect costs or disparities among low literacy individuals

Population: Populations including individuals and caregivers of all races and ethnicities with low health literacy. Although the ideal populations to answer our question would include only individuals with low health literacy, much of the research about interventions designed to mitigate the effects of low health literacy has been done in populations that include a combination of low and high health literacy individuals and failed to perform separate analyses in these subgroups. Instead of excluding a large portion of the intervention literature, we decided to permit inclusion of populations with a combination of low and high literacy individuals (but no subgroup analysis), knowing that they may provide only indirect information about the effect of interventions on an exclusively low literacy population.

Intervention: All interventions specifically designed to mitigate the effects of low health literacy by improving the use of health care services or health outcomes in low-health-literacy or low-numeracy individuals; this includes, but is not limited to, interventions designed to simplify information presentation, circumvent poor reading skills (e.g., video), facilitate patient/provider communication, circumvent barriers to health care, or improve self-efficacy or health-related skills.

Comparison: Any comparator designated by the investigators. A comparator is not necessary for studies with pre/post-intervention measures.

Outcomes: Any health-related health care utilization, outcome, or cost.

Time: Studies (controlled and uncontrolled trials and observational studies) with varying lengths of time for followup and with no restrictions for when the studies or data collection activities were done.

Setting: No exclusions by settings.

Based on the final KQs specified above, we generated a list of inclusion and exclusion criteria (Table 3). We included prospective and cross-sectional observational studies of health outcomes, trials of materials developed for low-health-literacy populations, and trials of interventions that compared materials designed to be “easier to read or understand” with standard materials. We limited studies to those with outcomes related to health and use and costs of health services. Because this is an update to our original report, we limited our searches to studies that would not have been considered during the earlier review (e.g., those more recently published or those for which numeracy was the exposure).

Table 3. Inclusion/exclusion criteria for studies considered in this update.

As described in Table 3, we excluded studies for several reasons, including lack of any outcome of interest or results limited to the readability of materials. We also excluded studies that focused on literacy or health literacy as an outcome rather than an exposure, as is seen, for instance, in studies of physician office-based programs designed to improve children's literacy or studies of sociodemographic characteristics more likely to be associated with differences in health literacy level. We also excluded studies that used cognitive impairment or dementia as an outcome of interest because we would not be able to determine whether health literacy levels were causing or being affected by the condition.

Process for Considering Abstracts and Full Articles for Inclusion

Once we had identified articles through the electronic database searches, review articles, and reference lists, we examined abstracts of articles to determine whether the studies met our criteria for inclusion. Each abstract was independently, dually reviewed for inclusion or exclusion. If one reviewer concluded that the article should be included in the review, we obtained the full text. If two reviewers independently determined that the abstract did not meet eligibility criteria, we excluded it.
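The abstract screening rule above is asymmetric: one vote to include is enough to retrieve the full text, whereas exclusion requires both reviewers to agree. A minimal sketch of that rule follows; the function name and return labels are hypothetical and chosen only for illustration.

```python
def abstract_disposition(reviewer_a_includes: bool, reviewer_b_includes: bool) -> str:
    """Apply the dual-review rule for abstracts.

    Full text is obtained if either reviewer votes to include;
    the abstract is excluded only when both reviewers vote to exclude.
    """
    if reviewer_a_includes or reviewer_b_includes:
        return "retrieve full text"
    return "exclude"

# Enumerate all four combinations of independent reviewer decisions.
for a in (True, False):
    for b in (True, False):
        print(a, b, "->", abstract_disposition(a, b))
```

A rule of this shape favors sensitivity at the abstract stage: a disagreement between reviewers is resolved in favor of reading the full article rather than discarding it.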

In the full article review, two team members again read each article and decided whether it met our inclusion criteria, using a Full-Text Inclusion/Exclusion Form (Appendix C). Reviewers discussed any disagreements, and, if they could not resolve them, the disposition of the article was decided by discussion among the larger team. Excluded articles are listed in Appendix H.

Literature Synthesis

Development of Evidence Tables and Data Abstraction Process

The senior staff members for the systematic review jointly developed the design of the evidence tables. Evidence tables were designed to provide sufficient information to enable readers to understand the study and to determine study quality. In our design, we gave particular emphasis to essential information to answer our KQs and to determine study quality. The format of the tables, which was based on successful designs used for many prior systematic reviews from this EPC (not just the review of health literacy and outcomes), varied slightly by KQ; the tables for KQ 2 have additional columns that describe the control group, the intervention group, and specifics of the intervention.

We trained abstractors by having them abstract several articles into evidence tables and then reconvened as a group to discuss the results, including the utility of the table design. The abstractors repeated this process several times until everybody was capable of working with the tables, instructions, and other elements of the process.

Abstractors entered data directly into evidence tables. The first abstractors entered all relevant information into the evidence table. Second reviewers subsequently checked each abstraction for accuracy and completeness against the original articles. Abstractors reconciled all disagreements concerning the information reported in the evidence tables.

Abstractors, at the time of initial data abstraction, also performed a quality review (internal validity including risk of bias relevant to the study design) and rating of each study, using a separate quality review form for this process (Appendix C). As with data abstraction, second reviewers independently conducted a quality review and rating of each article. When ratings conflicted, each pair of reviewers discussed the problem; issues they could not resolve were brought to a third party for resolution.

The final evidence tables for KQ 1 (health literacy and numeracy separately) and KQ 2 are presented in their entirety in Appendix D. Entries for all evidence tables are listed alphabetically by the last name of the first author; multiple articles by the same team of authors are entered alphabetically by second or later authors. A list of abbreviations used in the evidence tables appears at the beginning of the appendix.

Quality Rating of Individual Studies

To assess the quality (internal validity including risk of bias) of studies, we used predefined criteria based on those developed for the earlier review. We adapted criteria from the US Preventive Services Task Force, the National Health Service Centre for Reviews and Dissemination, the AHRQ's Evidence-based Practice Center Systematic Review Manual, and a report on the quality of observational studies developed by the RTI-UNC EPC.59 We specifically addressed methodological issues including selection bias, measurement bias, confounding, and power.

Unlike our previous review, we rated the overall quality of studies qualitatively. In general terms, a “good” study has the least bias and results are considered to be valid. A “fair” study is susceptible to some bias but probably not enough to invalidate its results. A “poor” rating indicates significant bias (stemming, e.g., from serious errors in design or analysis) that may invalidate the study's results. Studies rated as “poor” were excluded from the analysis. A copy of the form used for quality rating a study is included in Appendix C.

As described above, two independent reviewers with no conflict of interest assigned quality ratings to each study. Disagreements were resolved by discussion and consensus or by discussion with the larger study team. Studies that met all criteria were rated good quality. Studies received a quality rating of fair when they presumably fulfilled all quality criteria but did not report their methods to an extent that answered all our questions or did not adequately fulfill all quality criteria. Thus, the fair-quality category includes studies with quite different strengths and weaknesses. Studies that had a fatal flaw (defined as a methodological shortcoming that leads to a very high probability of bias) in one or more categories were rated poor quality and excluded from our analyses. Poor-quality studies and reasons for that rating are presented in Appendix E. In situations where we assigned different quality ratings to different outcomes within the same study, we provide the quality rating for each.
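The rating categories described above can be summarized as a simple decision rule. The sketch below is a simplification offered only as an illustration: the actual ratings were qualitative reviewer judgments, and the domain names and the three judgment labels ("met", "partial", "fatal") are assumptions, not fields from the quality review form in Appendix C.

```python
# Hypothetical bias domains, based on the methodological issues named in the text.
DOMAINS = ("selection_bias", "measurement_bias", "confounding", "power")

def overall_quality(assessment: dict) -> str:
    """Map per-domain judgments ('met', 'partial', 'fatal') to good, fair, or poor."""
    judgments = [assessment[domain] for domain in DOMAINS]
    if "fatal" in judgments:
        return "poor"   # a fatal flaw in one or more categories
    if all(judgment == "met" for judgment in judgments):
        return "good"   # all criteria met
    return "fair"       # criteria not fully met or not fully reported

def included_in_analysis(assessment: dict) -> bool:
    """Poor-quality studies are excluded from the analysis."""
    return overall_quality(assessment) != "poor"

example = {
    "selection_bias": "met",
    "measurement_bias": "partial",
    "confounding": "met",
    "power": "met",
}
print(overall_quality(example), included_in_analysis(example))
```

Because a single fatal flaw overrides everything else, the rule is not a numeric summary of criteria, which matches the change from the earlier review noted at the start of this chapter.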

Data Synthesis

We synthesized the data in our review qualitatively. We did not have a sufficient number of studies with similar outcomes or similar interventions to consider quantitative analysis (meta-analysis or statistical pooling) of data. Furthermore, we primarily considered only information from the current searches. Given changes in our evidence tables and quality forms, we reviewed individual studies from the 2004 review in depth only if new evidence would seem to change overall conclusions. Because the structure of analysis for KQ 2 changed for this current review, we reorganized the 2004 review findings from KQ 2 to be consistent with our current organizational structure for results.

As part of data synthesis, we paid particular attention to a few issues. First, we closely examined whether studies accounted for relevant confounders in their analyses. Because the goal of etiologic research focuses on understanding the relationship between exposures and outcomes of interest, it is important that confounders are controlled for to determine accurate estimates of effect. Second, we looked closely at studies that reported the relationship between both health literacy and numeracy and the same outcome. This allowed inferences about the relative strengths of the relationships between the variables and the outcome. Third, for intervention studies, we looked at common features of successful interventions and at the impact of interventions on multiple related outcomes. This allows inference about the effective components and mechanisms of health literacy interventions.

Grading the Strength of Available Evidence

We evaluated the strength of evidence based on the AHRQ Methods Guide for Comparative Effectiveness Research.60 To determine overall strength, we first examined several key features contributing to evidence strength: risk of bias, consistency, directness, precision, and the presence of other modifying factors. We then combined these factors to grade the overall strength of evidence. As described in Owens et al., the evaluation of risk of bias includes assessment of study design and aggregate quality of studies.60 We judged good-quality studies with strong designs to yield evidence with low risk of bias. We graded evidence as consistent when effect sizes across studies were in the same direction and of similar magnitude. For studies addressing KQ 1, when the evidence linked differences in health literacy skill level or interventions directly to health outcomes, we graded the evidence as being direct. For studies addressing KQ 2, the evidence was graded as direct when at least one study for any given type of intervention or outcome included analyses specific to low-literacy participants. We graded evidence as being precise when results were in the same direction and had a narrow range.

Consistent with EPC policy, we independently dually evaluated the overall strength of evidence for each outcome based on a qualitative assessment of strength of evidence for each of the key features listed above. We then reconciled all disagreements through discussion by senior members of the team. The levels of strength of evidence as specified by AHRQ are shown in Table 4. Full results of our strength of evidence reviews are presented in Appendix F.

Table 4. Strength of evidence grades and definitions.

Applicability of the Evidence

We evaluated the applicability of the evidence based on a qualitative assessment of the population, the intensity or quality of treatment, the outcomes, and the timing of followup. Specifically, we considered whether enrolled populations differed from target populations, whether studied interventions were comparable with those in routine use, whether measured outcomes were known to reflect the most important clinical outcomes, and whether followup was sufficient.

Peer Review Process

Among the more important activities involved in producing a credible evidence report is conducting an unbiased and broadly based review of the draft report. External reviewers are clinicians, researchers, representatives of professional societies, and potential users of the report, including TEP members (see Appendix G). Peer reviewers provided comments on the content, structure, and format of the evidence report and completed a peer review checklist. We revised the report, as appropriate, based on comments from peer reviewers.

