Institute of Medicine (US) Roundtable on Health Literacy. Measures of Health Literacy: Workshop Summary. Washington (DC): National Academies Press (US); 2009.

Measures of Health Literacy: Workshop Summary.

3 Approaches to Assessing Health Literacy


Lauren McCormack, Ph.D., M.S.P.H.

RTI International

RTI International is developing and testing a new measure of health literacy. The objective of this R01 project funded by the National Institutes of Health (NIH) is to create a publicly available health literacy instrument that can be used for population-based surveillance and for measuring an individual’s health literacy in intervention and research studies. In addition to the research team, there is an external panel advising the project. Specific project tasks include developing a conceptual framework, developing health literacy items, cognitively testing these items, pilot testing the items in a survey, and conducting psychometric analyses of the pilot data.

As discussed previously, existing measures of health literacy have limitations. For example, a major limitation of the Test of Functional Health Literacy in Adults (TOFHLA), the Rapid Estimate of Adult Literacy in Medicine (REALM), the Wide Range Achievement Test (WRAT), and the Ask-Me-3 is that these instruments largely measure reading ability or print literacy. The National Assessment of Adult Literacy (NAAL), as has been discussed, is not publicly available, and there is uncertainty about when the next round will be fielded.

The project team began by reviewing existing definitions of health literacy. The Office of Disease Prevention and Health Promotion defines health literacy as not simply a function of basic literacy skills, but as “dependent on individual and system factors, including communication skills of lay persons and professionals, lay and professional knowledge of health topics, culture, the demands of the healthcare and public health systems, and the demands of the situation/context” (http://www.health.gov/communication/literacy/quickguide/factsbasic.htm). “Health literacy varies by context and setting and is not necessarily related to years of education or general reading ability,” according to the National Network of Libraries of Medicine (http://nnlm.gov/outreach/consumer/hlthlit.html). The Institute of Medicine (IOM) states that, “Even well-educated people with strong reading and writing skills may have trouble comprehending a medical form or doctor’s instructions regarding a drug or procedure” (2004). Thus, the literature review supports an increasing recognition of the importance of context and setting when assessing health literacy. The project team adopted a slightly modified version of the Ratzan and Parker (2000) definition.

Conceptual Framework and Skills-based Approach to Measurement

The next step in the project was to develop a conceptual framework (see Figure 3-1). As Pleasant said earlier, a conceptual framework is critical as a foundation of measurement. An important component of this framework is the feedback loop from health-related outcomes back into skills; people learn from their experiences, and that affects their skills for the future.

FIGURE 3-1. Conceptual framework for individual health literacy. SOURCE: McCormack, 2009.

There is an increasing call in health care for consumer activation, consumer empowerment, and consumer involvement. Under these circumstances a skills-based approach to measuring health literacy is warranted. Therefore, the approach under development will include assessments of people’s ability to use different types of health information to make informed decisions as well as the skills needed across the life course in periods of health and periods of illness. Issues addressed range from disease prevention to treatment and self-management. The assessment will be based on the U.S. health care system, which means that the measurement process reflects current health insurance issues and care provided in public and private systems. One challenge in creating a skills-based approach in which data are collected via a computer is keeping up with technological advancements and changes in health-related materials that are used in the measurement process.

The measures will cover several health literacy domains, including print (both prose and document), numeracy skills, communication (including listening, speaking, and negotiating), and information seeking or navigation. A hierarchical approach was taken to determine the measures. First the skill or task was identified. Second, stimuli that enabled measurements of the skill or task were selected. Finally, the mode of administration was chosen. Although some of these questions could be conducted over the telephone or in person, a web-based approach is the preferred mode at this time. The following criteria were used to identify the skills to include in the measures:

  • Understanding health-related concepts and terms (in writing and verbally);
  • Interpreting tables, charts, symbols, maps, and other visuals;
  • Making inferences based on available data;
  • Applying information to new situations; and
  • Using arithmetic manipulations.

Criteria used for selection of stimuli included the following:

  • Sufficiently related to the health of the public;
  • Widely applicable, balanced content;
  • Accessible to many subgroups (gender neutral, culturally sensitive);
  • Clinically important and not controversial;
  • Appropriate length of content;
  • Mixture of public- and private-sector materials;
  • Likely to stand the test of time;
  • Variety of formats/channels;
  • Wide range of difficulty; and
  • Has face validity.

In developing criteria for the survey items themselves, the project team determined that prior knowledge should not be required to answer the questions. Another criterion is that there must be only one correct response, but there also have to be reasonable distractors (alternatives) that are neither too obvious nor too difficult. The questions must be independent of each other, that is, respondents should not have to get the first question correct in order to get the second question correct. Finally, the questions must include a range of difficulty and must cognitively test well.

Survey Items

The following are examples of stimuli and survey items that the project team is considering for the assessment. Final decisions about the stimuli and items will be based on the pilot work and will depend on approval from the organizations that created the stimuli. One possible stimulus is “Signs of a Stroke” (Figure 3-2). A few survey questions are associated with each stimulus.

FIGURE 3-2. Signs of a stroke. SOURCE: McCormack, 2009.

Other items in the survey require reading an article to obtain information and then answering questions based on information provided in the article. Other questions are based on short videos such as the public service announcement The Faces of Influenza, sponsored by the American Lung Association and posted on YouTube.com. There are also questions about symbols. For example, the question appearing in Figure 3-3 is about medication adherence.

FIGURE 3-3. Caution symbols on medication bottles. SOURCE: McCormack, 2009.


A number of issues and challenges remain as health literacy measures are developed, including identifying skills that can be measured, selecting appropriate stimuli and items, and assessing the trade-offs associated with different modes of administration. Another issue is how emerging technologies will allow improvement in measurement of health literacy, especially oral literacy. Additional questions include, What are the advantages and disadvantages of using real-world stimuli versus stimuli developed for assessments? On which national surveys would health literacy items and scales best fit? How do we deal with the need for stimuli to be updated and/or changed over time?


Moderator: George Isham, M.D., M.S.


One audience participant asked whether there is enough knowledge and new technology today (e.g., with the personal health record and the new health initiative measures) that one could develop a measure, be it of knowledge, skills, or function, that would take 5 minutes and that could be used to rapidly move the field forward. McCormack said many in the field would like to have a 5-minute short form instrument to measure health literacy, and one could be created eventually. A first step is creating a longer form of the instrument and using psychometric and other analyses to determine which items reflect the core of the instrument, then eliminating items that contribute less. One possible model for measuring health literacy is to take an approach like the Patient Reported Outcomes Measurement Information System (PROMIS)1 for quality-of-care measurement. PROMIS uses a large bank of items that are rotated over time but still measure the same construct.


Elizabeth Hahn, M.A.

Northwestern University

The bilingual assessment of health literacy project at Northwestern is funded by the National Heart, Lung, and Blood Institute of the NIH. The project has four goals:

  1. Develop English- and Spanish-language item banks for reading-related health literacy skills;
  2. Evaluate the feasibility, validity, and acceptability of computer-based methods for assessment of health literacy;
  3. Develop computer-adaptive testing (CAT) of health literacy in clinical settings; and
  4. Evaluate the associations among health literacy, sociodemographic/clinical characteristics, and health outcomes in primary care patients.

There is a continuum in health literacy that goes from low health literacy to high health literacy (Figure 3-4). The project intends to develop items that span the continuum and to make sure that for each English item on the continuum, there is a corresponding item in Spanish that sits at the same place on the continuum. To have equivalence across English and Spanish, the items must mean the same thing.

FIGURE 3-4. Item response theory. SOURCE: Hahn, 2009.

There will be a bank of questions that identifies the underlying trait to be measured. The definition of the trait and the meaning of each item will be the same across all participant characteristics. If that were not the case, differences due to measurement bias could be interpreted incorrectly as real differences between groups.

Item bank

A well-constructed item bank will enable development of computer-adaptive tests or creation of short forms of the test. In other words, individuals could answer different questions in the item bank but, because it is known exactly where on the continuum each question is located, it will still be possible to estimate a health literacy score for each individual with good precision. The Talking Touchscreen2 (la Pantalla Parlanchina) will be adapted and used, providing those with low literacy an opportunity to self-administer questions by having some text on the screen read out loud.
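
The idea that two people can answer different questions and still be scored on the same continuum can be sketched in a few lines. The example below is only an illustration: it assumes a Rasch (one-parameter) item response model, which the presentation does not name, and the item difficulties and response patterns are hypothetical values, not project data.

```python
import math

def p_correct(theta, difficulty):
    """Rasch model: chance that a person at ability `theta` answers correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def estimate_ability(responses, iterations=25):
    """Maximum-likelihood ability estimate by Newton-Raphson.

    responses: list of (item_difficulty, score) pairs, score is 0 or 1.
    Needs at least one correct and one incorrect answer to converge.
    """
    theta = 0.0
    for _ in range(iterations):
        probs = [p_correct(theta, b) for b, _ in responses]
        gradient = sum(score - p for (_, score), p in zip(responses, probs))
        information = sum(p * (1.0 - p) for p in probs)
        step = gradient / information
        theta += max(-1.0, min(1.0, step))  # clamp to avoid overshooting
    return theta

# Hypothetical calibrated difficulties; each person sees different items.
person_a = [(-1.0, 1), (0.0, 1), (1.0, 0)]
person_b = [(-0.5, 1), (0.5, 1), (1.5, 0)]

theta_a = estimate_ability(person_a)
theta_b = estimate_ability(person_b)
print(round(theta_a, 2), round(theta_b, 2))
```

Person B answered the same pattern on items that are each 0.5 units harder, so under this assumed model B's estimate lands 0.5 units higher on the same scale, even though the two people never saw a common item.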

The definition of health literacy used in this project has essentially two parts: capacity and application. First, an individual must have the capacity to process and understand health-related information. He or she must then be able to apply that information in the management of her or his own health. The capacity to obtain information, which is part of other definitions of health literacy discussed earlier, is a navigation skill that is not included in this health literacy tool. Instead, the focus is on comprehending and interpreting information provided and understanding what an appropriate health care decision based on that information should be. Whether the patient actually implements an appropriate health care decision and related behavior is also beyond the capability of this assessment tool.

The following are examples of items in this tool. All items are in English for this presentation, but there are comparable items in Spanish. Figure 3-5 shows a prose item. There is a short paragraph with text drawn from real-world documents. This is followed by a sentence with a missing word. Options are then given for the respondent to choose what to use to fill in the blank.

FIGURE 3-5. Prose item. SOURCE: Hahn, 2009.

A second type of item included in this assessment tool is a document item. There is a stimulus (in Figure 3-6 a prescription label is the stimulus), followed by a question that asks about the stimulus. This particular item also has sound (the respondent would click on the “talking head” in the figure) so that information can be relayed orally.

FIGURE 3-6. Medications for Mr. Beta. SOURCE: Hahn, 2009.

The third type of item (Figure 3-7) involves a quantitative or numeracy skill. Again, the respondent can click to have the information delivered orally. All of the items have four response choices with only one correct answer.

FIGURE 3-7. Sample body mass index chart. SOURCE: Hahn, 2009.

Item Testing

All of the items were pilot tested with 97 English-speaking participants and 134 Spanish speakers. The characteristics of the pilot test participants can be seen in Table 3-1.

TABLE 3-1. Characteristics of Pilot Test Participants.

Most of the testing was done with paper and pencil, but the printed paper looked just like the Talking Touchscreen view will look when those components are completed. There were also research assistants present who could read the questions out loud for participants. Cognitive interviewing was conducted with some participants, who were shown the different types of items and then asked to describe how they would go about answering the questions.

The participants were recruited mainly in primary care clinics, which are also where the ultimate calibration testing is being conducted. To obtain sufficient numbers for the pilot test, some testing was conducted at community-based organizations that provide general education development (GED), literacy tutoring, or job training.

The pilot test showed that nearly all participants (>90 percent of English speakers, 100 percent of Spanish speakers) correctly described the steps needed to answer each type of question. Participants were also asked if they felt anxious, nervous, or uncomfortable completing this health literacy test. Only one English-speaking participant and three Spanish-speaking participants were uncomfortable or anxious.

Once the participants completed the computer-based test, cognitive interviews were conducted with 25 English-speaking participants. Most reported that the test was easy to use and commented favorably on the screen design and the availability of audio. Some evidence shows that even people with high literacy skills found comprehension was aided with sound as well as the visual prompt. Participants also commented favorably on the items, even when acknowledging that some of them were difficult to answer.

A large number of items are needed for a good bank of items. Ultimately, people will answer only a small number of items, but the pilot tested 98 English items and 127 Spanish items. Some items were eliminated, such as those that everyone completed correctly. Such items are not useful for measurement. The items that were left have a range of difficulty. A small number of items are at the easy and difficult ends of the range, and the bulk of the items are in the middle.
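
Why an item that everyone answers correctly is not useful can be shown with a small screening sketch. The presentation does not describe the actual elimination criteria, so the rule below (drop any item with no variation in scores) and the item names and scores are hypothetical.

```python
def proportion_correct(responses):
    """responses: dict of item id -> list of 0/1 scores, one per participant."""
    return {item: sum(scores) / len(scores) for item, scores in responses.items()}

def informative_items(responses):
    """Keep items that at least one participant got right and one got wrong."""
    rates = proportion_correct(responses)
    return [item for item, rate in rates.items() if 0.0 < rate < 1.0]

# Hypothetical pilot scores for three items and four participants.
pilot = {
    "item_a": [1, 1, 1, 1],  # everyone correct: cannot separate participants
    "item_b": [1, 0, 1, 1],
    "item_c": [0, 1, 0, 0],
}

kept = informative_items(pilot)
print(kept)
```

An item with a proportion correct of exactly 1.0 (or 0.0) ranks every participant identically, so it adds testing time without adding information about where anyone sits on the continuum.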

A 10-item short form was developed for the pilot test and is being used in other ongoing projects. Calibration testing is under way for the final set of 90 English items and 90 Spanish items. Those items are being tested with 600 English speakers and 600 Spanish speakers who are primary care patients. The analysis plan is to accomplish the following:

  • Examine the extent to which items measure a single latent trait;
  • Calibrate items on the health literacy continuum using the most parsimonious model that displays a good fit;
  • Evaluate the possibility of differential item functioning (DIF) across language, gender, age, education, and health care experience;
  • Convene an expert advisory panel to create ability classifications; and
  • Develop an algorithm for the CAT.
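
One common textbook rule for a computer-adaptive test is to administer next the unused item that is most informative at the current ability estimate. The sketch below illustrates that rule under an assumed Rasch model; it is not the project's algorithm, which the presentation does not describe, and the item bank is hypothetical.

```python
import math

def item_information(theta, difficulty):
    """Rasch item information p * (1 - p); largest when difficulty is near theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
    return p * (1.0 - p)

def next_item(theta, bank, administered):
    """Pick the unused item with the most information at `theta`.

    bank: dict of item id -> calibrated difficulty
    administered: set of item ids already shown to this respondent
    """
    candidates = [item for item in bank if item not in administered]
    return max(candidates, key=lambda item: item_information(theta, bank[item]))

# Hypothetical three-item bank on the health literacy continuum.
bank = {"easy": -2.0, "medium": 0.0, "hard": 2.0}

first = next_item(0.3, bank, set())
second = next_item(0.3, bank, {"medium"})
print(first, second)
```

In practice the ability estimate would be updated after every response and the loop would stop once the estimate is precise enough; those steps are omitted here to keep the selection rule visible.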


In conclusion, Hahn said, these new health literacy items have good content validity and cover a variety of topics relevant to primary care patients and their health care providers. The Talking Touchscreen (la Pantalla Parlanchina) is easy to use and acceptable for self-administration of a health literacy test. A computer could be placed in the waiting room of a clinic, and people coming in for an appointment could fill out the assessment and immediately receive a score. That score could then be used in the same clinical encounter.

A bilingual, computer-adaptive test of health literacy will enable clinicians and researchers to more precisely determine at what level low health literacy begins to adversely affect health and health care use. This tool will also provide better opportunities to determine the independent effects of limited English proficiency and limited health literacy. By using novel computer-based methods for health literacy assessment in clinical settings, the tool could also increase the access of underserved populations to new technologies and contribute information about the experiences of diverse populations with new technologies.


Moderator: George Isham, M.D., M.S.


Scott Ratzan, one of the authors of the definition of health literacy used in the IOM report (2004), clarified the way in which that health literacy definition was developed. It was not a consensus project, Ratzan said. The National Library of Medicine (NLM) conducted a review of some 6,000 abstracts and articles to see if the definition would be inclusive for all kinds of research. That definition was then published through the NLM NIH process.

What is important, Ratzan continued, is that the field today does not become too epistemological or ontological on the issue of the definition of health literacy, resulting in the perfect becoming the enemy of the good. Most health literacy research does aim to help America become a more health-literate society.

One participant asked for clarification on the conceptual framework being used to develop the bilingual health literacy assessment tool. A portion of the presentation suggested that the project was using an information gain-type model (that is, what somebody knows now that he or she did not know before). But another part of the presentation suggested that the model being used is far more comprehensive.

Hahn said the framework is for the purpose of understanding the big picture. When work began it was assumed that the framework would be a framework just for health literacy. However, the framework that the project ended up using is more of a continuum of what health literacy can impact, taking it all the way to health outcomes. That is where one sees some of the blending of the information gain and the skills.

Another participant said that the tools Hahn is developing are going to be very useful because there is little information on Spanish-language literacy among Spanish speakers in the United States. The few data that do exist indicate that the average literacy level may be lower among Spanish speakers than among English speakers. Has the project considered those who speak English as a second language and how appropriate the English-language instrument is for measuring those people’s English literacy?

Those taking the assessment will choose which language they want to use, Hahn said. There will also be a short acculturation scale that asks questions such as, When you are talking to your family, what language do you usually use? When you are talking to your friends, what language do you usually use? What is your country of origin? What languages do you speak at home? Using the answers to these questions, psychometric analyses will be conducted to determine whether items are working differently for those who are fully acculturated in English and those who are not.

One participant asked whether Hahn has considered bilingualism as a language. For example, many Latina mothers obtain information in English as well as Spanish. Their knowledge about medical issues is a mixture of English and Spanish. So information may be given in both languages in a pediatrician’s office. However, when testing for health literacy, the test is usually given in one language or the other. Furthermore, when looking at health literacy in children, it has been found that testing in both languages actually provides a better picture of what the children understand.

Hahn said it would be great if the resources were available to conduct testing in both languages; that is certainly something the project could consider for the future.

Another person said that a more rigorous definition is needed of what a Spanish speaker is. This participant’s group conducted an assessment of the quality of Spanish translations and found that there is a difference between what monolingual Spanish speakers understand and what those who also speak English understand from the same document. The participant went on to say that the translator is only half of the equation. The other half is the introduction of the use, purpose, and context for the materials.

Hahn replied that the project has had a very rigorous translation methodology. A team of people from multiple countries and regions who speak Spanish have been involved in translation for 15 years. The project recognizes that one cannot just translate the words but must also capture the meaning that would be understandable and appropriate for people who speak Spanish across the United States.

A participant said one concern she has is that people can often parrot back the correct response but cannot actually demonstrate what needs to be done. For example, when patients are given the instruction “take two tablets by mouth twice daily,” most patients might say that means they should take two pills two times a day. But in one study, only about a third could actually demonstrate what that meant—that is, only about a third could actually count out four pills (Davis et al., 2006).

As another example, the participant continued, a person might be able to read the ideal body weight chart and say what is ideal, but can that person stand on a scale, read his or her own weight, and then tell whether it is within the acceptable range? Has Hahn’s project considered developing test items that would determine whether participants could demonstrate the skill needed?

Hahn replied that with item response theory item banks, once one has a well-calibrated bank and knows where the continuum is and whether the items are on that continuum, it is possible to add items at any time. One can add other languages too. If this assessment works as intended and has a unidimensional construct, then it will be possible to add other item types to it and to add other languages.

Hahn said she is currently engaged in another project (in addition to the one described in her presentation) that is using the Talking Touchscreen to administer questionnaires that measure health literacy and deliver patient education materials to newly diagnosed cancer patients. All of that technology can be fed into the electronic medical record. The challenge is that none of the settings in which this project is being conducted has a true electronic medical record.

One participant asked Hahn how long the assessment takes, how easily it can be modified, and whether community health workers could use this tool. Hahn replied that participants currently take 30 to 45 minutes to answer the 30 questions, which is too long. But this stage of testing is for calibration purposes. Once the instrument testing phase is completed, one can customize the test by adding items or expanding it to other languages.


Lisa D. Chew, M.D., M.P.H.

University of Washington

Persons with limited health literacy are those individuals who read at a sixth-grade level or less. They often misread the simplest materials, including medication bottles and appointment slips. Persons with marginal health literacy are those individuals who read between a seventh- and eighth-grade level. They are able to perform better on simple tasks than those with limited health literacy, but they have difficulty reading and understanding more complicated materials such as educational brochures and informed consent documents. Persons with adequate health literacy are those individuals who read at a ninth-grade level or above and who are able to complete successfully more tasks required to function in the health care setting.

Approximately 90 million American adults have limited health literacy and lack the needed literacy skills to navigate the health care environment (IOM, 2004). Growing scientific evidence has shown an association between limited health literacy and poorer health outcomes, such as high rates of medication nonadherence (Kalichman et al., 1999), higher hospitalization rates (Baker et al., 1998, 2002), and poorer self-reported health (Baker et al., 1997; Weiss et al., 1992).

Despite the important implications of limited health literacy for patient care, health care providers are often unaware of patients’ reading abilities (Bass et al., 2002). Concerns about the implications of limited health literacy on the care of patients have led some experts to advocate for screening for limited health literacy.

Two commonly used formal health literacy assessment instruments are the TOFHLA and the REALM, both of which have been discussed previously. The TOFHLA is a comprehension test that has a short version and a full version. Its administration time ranges from 7 to 22 minutes. The REALM is a word recognition test with average administration time of 2 to 3 minutes. The TOFHLA and the REALM are often used for research purposes.

Despite the existence of these health literacy assessment instruments, there are major barriers to routine screening. Patients are sometimes ashamed of their limited health literacy, and many will attempt to conceal their reading impairments from others. In addition, the length of the formal health literacy instruments limits their clinical use. Finally, although there is an association between educational attainment and literacy level, certain questions that simply ask patients about their reading ability and educational attainment do not always accurately predict reading ability. Therefore, a self-report measure that could quickly and accurately screen patients for limited health literacy would help increase the feasibility of assessing a patient’s health literacy in busy settings.

An ideal self-report measure for health literacy would, Chew said, have the following characteristics:

  • Quickly identify patients with limited health literacy;
  • Be easy to administer so that it could be routinely integrated into busy settings;
  • Be acceptable to patients and not cause shame and embarrassment; and
  • Accurately identify patients with limited health literacy.

In evaluating the performance of any measure used to screen for a certain condition, whether it is limited literacy, colon cancer, or alcohol use, one often looks at sensitivity and specificity. Sensitivity is the true-positive rate. The higher the sensitivity, the better the measure or question is able to identify patients with this condition. Specificity is the true-negative rate. The higher the specificity, the better the measure or question is able to rule out patients with this condition.
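
The two definitions can be made concrete with a short calculation. This is a minimal sketch using made-up screening results for ten patients; the function and variable names are illustrative only.

```python
def sensitivity_specificity(screen_positive, has_condition):
    """Return (sensitivity, specificity) from paired boolean lists.

    screen_positive: True where the screening question flagged the patient
    has_condition:   True where the reference test shows the condition
    """
    tp = fn = tn = fp = 0
    for flagged, truth in zip(screen_positive, has_condition):
        if truth and flagged:
            tp += 1  # true positive
        elif truth:
            fn += 1  # missed case
        elif flagged:
            fp += 1  # false alarm
        else:
            tn += 1  # true negative
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical data: 4 patients with the condition, 6 without.
has_condition = [True, True, True, True, False, False, False, False, False, False]
screen_positive = [True, True, True, False, True, False, False, False, False, False]

sens, spec = sensitivity_specificity(screen_positive, has_condition)
print(sens, round(spec, 3))
```

Here the question catches 3 of the 4 patients with the condition (sensitivity 0.75) and correctly clears 5 of the 6 without it (specificity of about 0.83).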

An initial effort at self-report measures of health literacy included three questions: (1) Can you read a newspaper? (2) Can you read forms and other written materials obtained from the hospital? (3) Do you usually ask somebody to help you read materials you receive from the hospital? (Williams et al., 1995). Each of these questions had a dichotomous response of yes or no. Although the specificity was high, the sensitivity—being able to identify the portion of patients with limited health literacy—was low.

A separate study was conducted to determine if one could develop questions that better detect patients with limited health literacy. This study involved 332 patients seeking care at a Veterans Administration (VA) pre-operative clinic. The gold standard to determine if a patient had limited health literacy was a Short Test of Functional Health Literacy in Adults (STOFHLA). Of the 332 participants, the mean age was 58.2 years. Participants were mostly men and white. Thirty-eight percent had 12 years or less of education. Some 4.5 percent of patients had limited health literacy, and another 7.5 percent had marginal health literacy (Chew et al., 2004).

The content of the questions was based on important domains identified in a prior qualitative study of patients with limited health literacy. That study (Baker et al., 1996), which involved focus groups and one-on-one interviews, reported five problem areas that patients with limited health literacy experienced when interacting with the health care system:

  • Navigating the health care system;
  • Completing medical forms;
  • Following medication instructions;
  • Interacting with providers; and
  • Reading appointment slips.

In addition, previous studies reported the frequent use of a surrogate reader as a common coping mechanism for patients with limited health literacy. These six domains guided the development of the health literacy screening questions.

In anticipation of the underreporting of reading difficulties due to the shame associated with limited health literacy, several methods were used to attempt to increase patient reporting. First, questions were modeled on those developed for other sensitive areas, such as alcohol use, and were phrased as “how often” or “how confident” the patient was in each domain rather than asking if he or she had problems. Second, response options were scaled from 0 to 4 to allow patients to report even rare problems with reading. Finally, to avoid restrictions in patient reporting, no time frame or visit setting was specified for reading difficulties.

Of the 16 questions included in this study, the three strongest screening questions for detecting limited health literacy were the following:

  • How often do you have someone help you read hospital materials? (Help Read) (five possible responses ranging from never to always)
  • How confident are you filling out medical forms by yourself? (Confident with Forms) (five possible responses ranging from not at all to extremely)
  • How often do you have problems learning about your medical condition because of difficulty understanding written information? (Problems Learning) (five possible responses ranging from never to always)

The graph in Figure 3-8 represents the Receiver Operating Characteristic (ROC) curves of these three screening questions for detecting limited health literacy. The ROC curves plot the sensitivity versus 1 minus the specificity and allow us to graphically portray the trade-offs involved between improving a question’s sensitivity or its specificity. The area under the curve is a useful summary of the overall accuracy of a question and can be used to compare the accuracy of two or more questions. An ideal question is one that reaches the upper left corner of the graph with an Area Under the Receiver Operating Characteristic (AUROC) of 1.0. A poor-performing question is one that follows the diagonal from the lower left to the upper right corner with an AUROC of 0.5. The difference in the performances of these questions was not statistically significant.
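
For a question scored on a 0-to-4 scale, the ROC points and the AUROC can be computed directly. The sketch below uses hypothetical responses, not data from the study, and computes the AUROC as the chance that a randomly chosen patient with limited health literacy scores higher than one without, with ties counted as half; this pairwise count equals the area under the ROC curve.

```python
def roc_points(scores_with, scores_without, thresholds=range(0, 6)):
    """(1 - specificity, sensitivity) for each 'score >= threshold' cutoff."""
    points = []
    for t in thresholds:
        sensitivity = sum(s >= t for s in scores_with) / len(scores_with)
        false_positive_rate = sum(s >= t for s in scores_without) / len(scores_without)
        points.append((false_positive_rate, sensitivity))
    return points

def auroc(scores_with, scores_without):
    """Area under the ROC curve by pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for s_with in scores_with:
        for s_without in scores_without:
            if s_with > s_without:
                wins += 1.0
            elif s_with == s_without:
                wins += 0.5
    return wins / (len(scores_with) * len(scores_without))

# Hypothetical 0-4 responses (higher = more reported difficulty).
limited = [4, 3, 3, 2, 1]       # patients with limited health literacy
adequate = [0, 0, 1, 1, 2, 0]   # patients without

points = roc_points(limited, adequate)
area = auroc(limited, adequate)
print(points)
print(round(area, 3))
```

Each threshold gives one point on the curve: a low cutoff flags almost everyone (upper right), a high cutoff flags almost no one (lower left), and the area summarizes performance across all cutoffs.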

FIGURE 3-8. Receiver Operating Characteristic curves for detecting limited health literacy. SOURCE: Chew, 2009.

In identifying patients with limited health literacy, the Help Read question had a sensitivity of 93 percent and a specificity of 65 percent at a threshold of “occasionally” or more often. The self-report screening questions were less effective in identifying patients with marginal health literacy. Combining the questions did not improve screening performance for detecting limited or marginal health literacy.
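How sensitivity and specificity follow from a chosen response threshold can be illustrated with a short sketch. The response coding below is an assumption: the summary names only “never,” “occasionally,” and “always,” so the two middle labels are placeholders, and the function name `screen_accuracy` is hypothetical.

```python
from typing import Sequence, Tuple

# Assumed 0-to-4 coding. Only "never", "occasionally" and "always" appear in
# the text; "sometimes" and "often" are placeholder labels for this sketch.
RESPONSE_CODES = {"never": 0, "occasionally": 1, "sometimes": 2, "often": 3, "always": 4}


def screen_accuracy(
    responses: Sequence[str], limited: Sequence[bool], threshold: str
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when answers at or above the threshold screen positive."""
    cut = RESPONSE_CODES[threshold]
    flagged = [RESPONSE_CODES[r] >= cut for r in responses]
    true_pos = sum(1 for f, y in zip(flagged, limited) if f and y)
    true_neg = sum(1 for f, y in zip(flagged, limited) if not f and not y)
    positives = sum(limited)
    negatives = len(limited) - positives
    return true_pos / positives, true_neg / negatives
```

Lowering the threshold flags more patients, which tends to raise sensitivity at the cost of specificity; raising it does the reverse.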

Results of this study show that each of the three health literacy screening questions listed above was effective in identifying VA patients with limited health literacy, potentially offering a practical, inexpensive, and unobtrusive method to identify those at risk for reading difficulties.

There have been three studies validating the performance of these three questions in other populations. Wallace and colleagues (2006) conducted a study among 305 English-speaking adults at a university-based primary care clinic and found a 17.7 percent prevalence of limited health literacy. Another study (Wallace et al., 2007) among 100 English-speaking adults at a university-based vascular surgery clinic found an 18 percent prevalence of limited health literacy. More recently, a study of 1,796 English-speaking adults at four VA medical centers found a 6.8 percent prevalence of limited health literacy (Chew et al., 2008). In all three validation studies, the question “how confident are you filling out medical forms by yourself?” appeared to be the strongest-performing question.

Another recent study conducted in a primary care clinic among 225 patients with diabetes found a 15.1 percent prevalence of limited health literacy (Jeppesen et al., 2009). The responses to three self-report questions were combined with demographic information (highest education attained, sex, race) into one predictive model. The three questions were the following:

  • How would you rate your reading ability?
  • On a scale of 1 to 10 where 1 is “not at all” and 10 is “a great deal,” how much do you like reading?
  • How often do you need to have someone help you when you read instructions, pamphlets, or other written material from your doctor or pharmacy?

The raw data were not presented, so it is not possible to determine the screening performance of these questions.

There have been conflicting opinions about whether these self-report health literacy questions perform better than demographic information (e.g., education level, age) alone in detecting limited health literacy. Preliminary and unpublished data from the Minneapolis VA show no difference in performance between self-reported educational attainment and the Confident with Forms question. This suggests that further research may be needed to determine whether education level and other demographic characteristics perform as well in certain populations as the self-report measures discussed here.

The strengths of the self-report measures include the finding that limited health literacy may be detected with a single question. The measures are easy to administer and can be administered by anyone with minimal training. These measures may also be more acceptable to patients than a formal test, causing less shame and embarrassment. Because the measures are quick and easy to administer, they are practical to use in different settings. The measures may be a useful tool for identifying patients who may need more formal health literacy testing and for allocating resources to those patients at highest risk.

The weakness of these measures is that their generalizability is unknown. The development and validation studies were conducted at either a VA health care center or at university-based clinics, where the prevalence of limited health literacy is lower than what would be anticipated at a public hospital. It may be that different questions produce varying results in different populations.

What is also unclear is how these measures would perform in other languages. Furthermore, although these questions are able to detect patients with limited health literacy, the ability to detect marginal health literacy was less optimal. Finally, more studies are needed to determine whether the predictive value of these questions is better than demographic variables alone for limited health literacy.

In conclusion, Chew said, future research is needed to answer additional questions about the use of these measures. First, how do these questions perform in populations and languages other than those studied? Second, do self-report measures perform better than demographic variables, and could combining demographic characteristics and self-report measures improve screening performance? Third, how can these measures be integrated into systems of care and what are appropriate, practical, and feasible interventions for patients who screen positive? Finally, do screening and interventions improve the health outcomes of patients with limited health literacy?


Moderator: Cindy Brach, M.P.P.

Agency for Healthcare Research and Quality

One participant asked Chew how much of the work she has done has spread throughout the VA. Is there any uptake in using the tool and applying it to change interventions for individuals identified as having marginal health literacy? Chew said one of the challenges is how to integrate health literacy questions into systems of care. Some small pilot studies have been conducted with pharmacy care management of patients with chronic illness. Also, an electronic record system called MyHealtheVet is an online tool for VA patients to use. It was developed with a consumer focus and guided by the belief that knowledgeable patients are better able to make informed health care decisions, stay healthy, and seek services when needed than those without adequate knowledge.

One participant, Dr. Cecil Garcia from Harlingen, TX, said he used Chew’s study to conduct research on health literacy in Spanish-speaking patients. According to the Dartmouth Atlas data, this area of Texas has high costs but a low quality of care. Garcia translated Chew’s questions into Spanish and examined 116 patients. The prevalence of inadequate literacy was 45 percent in these Spanish-speaking patients. Questions were asked orally because the reading level was at the third- or fourth-grade level. If patients are given information that is above their reading levels, the information is not going to help.


Sandra Smith, M.P.H.

University of Washington School of Public Health

A very different approach to assessing health literacy from those discussed previously is one that focuses on function. The National Literacy Act of 1991, Public Law 102-73, 102nd Congress, 2nd session (July 25, 1991), marked a significant evolution in the understanding of literacy in America. That legislation aimed to broaden the concept of adult literacy and to differentiate it from the concept of basic literacy skills. The Act defined literacy as the “ability to read, write, and speak in English, and compute and solve problems at levels of proficiency necessary to function on the job and in society, to achieve one’s goals, and develop one’s knowledge and potential.”

Hourigan (1994) used the term “academic literacy” to differentiate students’ mastery of cognitive skills from adults’ functional literacy. Academic literacy is focused on reading, writing, and arithmetic. Functional literacy, however, involves putting into real-world practice a wide range of cognitive and noncognitive problem-solving, communication, interpersonal, and lifelong learning skills. Functional literacy is about what adults do rather than what they are capable of doing.

Academic literacy skills are considered to be individual, static, and transferable across settings. In contrast, functional literacy skills are social, evolve over time, and are content specific. An adult may, therefore, have many functional literacies. For example, computer literacy enables a person to use a computer but not necessarily to read the manual or understand programming language. Similarly, functional health literacy enables people to use the health care system and take care of themselves but not necessarily to read insurance documents and understand medical terminology.

Academic literacy becomes apparent in reading and comprehension test scores. A good score suggests capability to function in other settings. Functional literacy, however, is about social practices instead of individual abilities. It manifests in actions, behaviors, and relationships. Functional literacy requires authentic assessment, which means assessment of performance or practice in the real world.

Measures matter because they drive intervention. What is measured and how it is measured determine what is discovered about what works and what is worth doing. Researchers and policy makers have dropped the function from functional health literacy and have switched the focus back to academic skills and reading tests. Nearly all studies have operationalized health literacy as reading skills in a medical setting and measured those skills with standardized reading tests. The focus has been on understanding information. As a result, interventions primarily have been aimed at making information easier to understand by reducing the cognitive demand.

Such work is important and must continue, but it does have its limitations. What has been learned from intervention studies is that improved information does improve knowledge for readers with both higher and lower reading skills. However, skilled and unskilled readers alike still struggle to use their acquired knowledge. Reading and understanding information are important parts of functional health literacy, but they offer an incomplete picture and they are insufficient to promote appropriate use of health services, good self-care, and improved health.

The problem with focusing on academic skills and information is that, like money, one needs it. But what one really needs is not the money itself, but what the money enables one to do. Similarly, it is not really the information that patients need for health, it is what the information enables them to do and how it enables them to function, and that is the function in health literacy.

A number of literacy scholars characterize types, levels, or layers of literacy. Donald Nutbeam (2000, 2008) applied the work of Freebody and Luke (1990) to characterize three types of health literacy. One type is functional health literacy, defined as reading and writing associated with tasks. In this usage it is associated with literacy tasks: read the list of words, then pass the comprehension test. This is the current stage of the field in its measurement of health literacy. Nutbeam also refers to functional health literacy as fundamental or basic literacy that is associated with everyday tasks.

Interactive literacy is another type of literacy. It requires social skills such as listening and speaking to complete more complicated interactive tasks. Such tasks might be making an appointment, getting to the appointment, describing symptoms, and listening to treatment instructions. This type of health literacy is analogous to oral health literacy.

Critical reflection is the third and highest level of literacy needed to manage one’s health. As an example, a mother goes to the pediatrician and hears that her baby should sleep on his back to avoid Sudden Infant Death Syndrome. She hears from her grandmother that the baby should sleep on his stomach to avoid aspiration. She needs critical literacy, reflective literacy, to differentiate the sources of information, to reconcile the conflicting advice, to manage the power differentials, to control the sleep position for her son, and to thereby manage his health.

The current conceptualization and measures of health literacy miss much of this deeper meaning and purpose of literacy for health.

What does this conceptualization have to do with measurement? One might infer from the use of the term “functional” in this model that interactive health literacy and reflective health literacy are not functional. That is not the case. It is possible to extend the idea of function to all three types of health literacy. Functional health literacy, then, becomes a concept that describes the practical application of a wide range of cognitive and noncognitive skills in real life, rather than a single literacy skill in a clinical setting. Functional health literacy is the outcome of intervention rather than the independent variable. It captures how people use literacy for health as patients and also as family members, workers, and citizens. It captures social capital.


How can one measure the function in functional health literacy? A good example of promoting many aspects of family functioning can be found in the work of public health nurses in maternal and child health home visitation programs. These visiting nurses link disadvantaged parents to health care services and community resources. They provide social support, practical assistance, information, and health education. Many of these programs use an instrument called the Life Skills Progression™ (Wollesen and Peifer, 2006) to monitor parents’ progress toward higher levels of functioning.

To measure functional health literacy, two scales were derived from the Life Skills Progression instrument: a Functional Healthcare Literacy Scale (Figure 3-9) and a Functional Selfcare Literacy Scale (Figure 3-10). The Functional Healthcare Literacy Scale rates parents’ use of health information and services for both parent and child. Each of the items is a Likert scale that identifies behaviors, practices, and characteristics that indicate progressive levels of function that range from dysfunction to optimal functioning. Scores greater than 4 indicate adequate to optimal functioning. One might think of this scale as a map and the items as pathways toward optimal functioning in a health care system.

FIGURE 3-9. Functional Healthcare Literacy Scale. SOURCE: Wollesen and Peifer, 2006. Reprinted by permission from Paul H. Brookes Publishing Co., Inc.

FIGURE 3-10. Functional Selfcare Literacy Scale. SOURCE: Wollesen and Peifer, 2006. Reprinted by permission from Paul H. Brookes Publishing Co., Inc.

Both of these assessments function in the same manner. To monitor progress, a home visitor completes the instrument on a parent at intake, every 6 months, and at end of service. The comparison of these sequential measures allows one to track progress over time and to see points of regression. The data are immediately available for intervention planning.
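The comparison of sequential measures can be sketched in a few lines. This is an illustration only: it assumes that a scale score is the mean of its item ratings, which the summary does not state, and the names `scale_score` and `track_progress` are hypothetical. The cut-off of 4 is the one given above for adequate to optimal functioning.

```python
from statistics import mean
from typing import Dict, List, Sequence, Tuple

# From the text: scores greater than 4 indicate adequate to optimal functioning.
ADEQUATE_CUTOFF = 4


def scale_score(item_ratings: Sequence[float]) -> float:
    """Summarize one assessment. Averaging the items is an assumption of this sketch."""
    return mean(item_ratings)


def track_progress(assessments: Sequence[Tuple[str, Sequence[float]]]) -> List[Dict]:
    """Compare chronologically ordered assessments and flag points of regression."""
    rows = []
    previous = None
    for label, ratings in assessments:
        score = scale_score(ratings)
        rows.append(
            {
                "label": label,
                "score": score,
                "adequate": score > ADEQUATE_CUTOFF,
                # Regression means a lower score than the preceding assessment.
                "regressed": previous is not None and score < previous,
            }
        )
        previous = score
    return rows
```

Called with assessments labeled, for example, "intake", "6 months", and "end of service", the function returns one row per assessment, so a home visitor could see at a glance where a parent crossed the adequate-functioning cut-off and where a score fell back.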

The elegance in measuring function is that it provides for solutions along with the identification of problems. One can choose to intervene on a need, which would be indicated by a low score on the left in Figure 3-10. One can also choose to intervene by building on a strength, which is indicated by a high score on the right. Subsequent measurement allows one to see the impact of the interventions.

These scales demonstrated good reliability. Validity testing is under way.


This project was a 2-year, quasi-experimental, multicohort intervention study with multiple waves of measurement. The total database from seven home visitation programs has 2,532 parents. The data below are on about 1,800 of those parents. One can see in Figure 3-11 that the intervention worked quickly in the first six months. Parents demonstrated statistically significant linear stepwise progress over time, regardless of their reading level.

FIGURE 3-11. Home visitation promotes parental functional health literacy. SOURCE: Smith, 2009.


Measuring function is important in assessments of health literacy, Smith said, because measuring function captures the impact of efforts to reduce the risk of low literacy skills as well as the efforts to promote functioning directly. It allows the integration of the social determinants of health, it guides interventions, it informs practice, and it is patient centered. The Life Skills Progression method presented here could be adapted for clinical use, particularly for adults with chronic conditions that require frequent visits, clinical encounters, and significant self-care.

Is the instrument clinically feasible? It takes an experienced user about 5 minutes to complete both of the scales, and the data are immediately available for intervention planning. There is a limitation in that the assessments were implemented in home visitation programs, a hallmark of which is that the visitor and the family build a relationship over time. Therefore, the degree to which the clinical practice environment limits relationship development could affect use of the instrument.

In conclusion, Smith said, focus on function.


Kathleen Mazor, Ed.D.

University of Massachusetts Medical School

The focus of this R01-funded project is on understanding spoken communication. The project team was multidisciplinary, and the research was carried out within the Cancer Research Network (CRN). The CRN is a consortium of 14 health plans around the country that cover approximately 10 million enrollees (about 4 percent of the population).

Not much attention has been paid to oral communication in the health literacy field. It is often said that if people cannot read, one needs to speak with them or let them listen to an audio version of the information. But do people actually understand even if they hear information? It is not just a question of hearing the words—the listener must know what the words mean and the context within which the words are spoken and be able to act on the information provided. Better measurement of oral communication is important in improving health literacy.

The project has three aims, although this presentation will discuss only the first aim, which is to develop and validate a psychometrically sound test of oral health literacy. The project also aims to investigate the relationship between oral health literacy and cancer prevention behaviors by comparing scores from the instrument with actual health behaviors. The third aim is to develop and test recommendations for improving oral communication about cancer prevention and screening.


The first step in developing the measures for assessing oral health literacy was to specify the test blueprint. Because the assessment was administered in the CRN, the focus was on common cancers (breast, cervical, colorectal, general, lung, prostate, and skin), cancer prevention, and screening. Not included were factors such as diagnosis, treatment, follow-up, and survival. The blueprint also specified the context within which messages are received (e.g., media or clinical), the style of the communication (i.e., narrative, statistical or numeric, factual), the purpose of the communication (i.e., instruction, information, or query), and the content of the information (i.e., prevention, screening methods, or risk factors).

The next step was to collect and examine messages about cancer that one might receive from different media, including television, radio, the Internet, and patient education materials. It was essential to include only those clips that contained accurate information. Furthermore, a variety of clips representing the kinds of things one might encounter in everyday life were included.

The selected clips of oral communication varied in content. Some clips showed a person describing his or her cancer experience, or the experience of quitting smoking. These personal stories were identified as narratives. Another set of clips presented factual information such as the type of cancer or the stage of the cancer. Such information can be delivered in three different ways: (1) One can simply provide information, (2) one can potentially provide information and then ask something about it, or (3) one can give someone an instruction with the intention that the person take action related to that information.

The third step was to develop some clinical vignettes. To construct such vignettes, physicians agreed to participate in role playing, which would be audiotaped and then transcribed. A professional writer helped create the scripts, which clinicians then reviewed and revised. After that, the project team conducted its own review and revisions. The project team then produced videos of those vignettes.

The next step was to construct the items for the test. Unfortunately, while there is literature on different approaches to use in developing items, there is not a great deal of literature on how to measure comprehension. The approach the team chose to use is the sentence verification technique (SVT).³ This is a method that is used to examine comprehension of text messages. The first step is to select a portion of the transcript that contains what one is trying to measure. Next, the sentence(s) is paraphrased so that the wording is changed but the meaning remains constant. The information is also rewritten so that the wording is similar to the original sentence(s) but with a different meaning.

For example, an original sentence said “overall HPV [human papillomavirus] prevalence among females in the United States, ages 14 to 59 years of age, was 26.8 percent, and that means one in four women are infected with HPV.” When the original material was paraphrased, the result was written as, “A quarter of women ages 14 to 59 are infected with HPV.” When the material was rewritten for a meaning change, the sentence read, “One in four women in the United States are infected with cervical cancer,” which is not true because only certain HPV strains develop into cervical cancer.

A respondent would hear the original statement and then hear either the paraphrase or the changed-meaning statement. Then the respondent would be asked, “Is the meaning of the statement about the same as the content of the original sample, or is it different?”
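The structure of an SVT item and the scoring of a “same” or “different” response can be sketched as follows. The class `SvtItem` and the functions `is_correct` and `proportion_correct` are hypothetical names introduced for illustration; they do not describe the project’s actual software. The example statements are the HPV sentences quoted above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class SvtItem:
    original: str       # statement the respondent hears first
    restatement: str    # paraphrase or meaning-change version heard second
    same_meaning: bool  # True for a paraphrase, False for a meaning change


def is_correct(item: SvtItem, answer: str) -> bool:
    """An answer is correct when it matches the item's keyed meaning."""
    expected = "same" if item.same_meaning else "different"
    return answer.strip().lower() == expected


def proportion_correct(items: Sequence[SvtItem], answers: Sequence[str]) -> float:
    """Share of items answered correctly, with one answer per item."""
    if not items or len(items) != len(answers):
        raise ValueError("one answer is required for each of at least one item")
    return sum(is_correct(i, a) for i, a in zip(items, answers)) / len(items)


ORIGINAL = (
    "Overall HPV prevalence among females in the United States, ages 14 to 59 "
    "years of age, was 26.8 percent, and that means one in four women are "
    "infected with HPV."
)
ITEMS = [
    SvtItem(ORIGINAL, "A quarter of women ages 14 to 59 are infected with HPV.", True),
    SvtItem(
        ORIGINAL,
        "One in four women in the United States are infected with cervical cancer.",
        False,
    ),
]
```

A respondent who answers “same” to both restatements would score 0.5 on these two items, having missed the meaning change.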


Pilot testing is currently under way. The test is administered using a touchscreen laptop. No reading is required; everything is spoken. Instructions are given at the beginning, and the test takes about an hour. Currently, the test has 16 videos and 66 questions. It is in English only, which is a limitation imposed because of resources available.

Participant feedback on the test is that it is user friendly, even for those unfamiliar with a computer; it is engaging and informative; it has clear instructions; and participants are not fatigued at the conclusion of the test. Once pilot testing is completed, the items will be revised and the test will then be administered to about 1,000 adults at four sites.


Results to date have shown that the comprehension of spoken messages is variable, Mazor said. Measuring comprehension of spoken messages is challenging because many factors affect comprehension and all of those factors cannot be evaluated fully in a single study.


Moderator: Cindy Brach, M.P.P.

Agency for Healthcare Research and Quality

One participant said it appears that what both the assessment of functional health literacy described by Smith and the assessment of oral health literacy described by Mazor are attempting to do is to develop more authentic health literacy measures. For the future, in terms of functional health literacy, what are some of the factors that may predict whether people score high or low on a functional scale? Smith said that major predictors are likely to be self-efficacy, confidence, and social support. Sometimes having a child will galvanize one’s motivation and interest, and parents become very ready to learn and to change.

Another participant commented that Smith’s measure has not focused on prediction, but rather on intervention and how one moves forward with that. It is very exciting to think about obtaining information that one can use to intervene and improve health outcomes.

One participant stated that in terms of what had been presented as measures for health literacy, what was missing was a focus on measurement specifically related to either parents or children. The way in which health literacy is measured in adults is very different from what parents understand about taking care of their children’s health. Furthermore, measuring health literacy in children, from young children through adolescence, is exceedingly complex. The main social support for a child is the parent or caregiver. But as the child moves from childhood through adolescence to adulthood, there is dynamic change in whose knowledge and whose management determines health actions.

Smith responded that the functional health literacy measure is designed specifically to address literacy in parents and how it affects children’s health. Ratzan suggested that a framework for health literacy could follow a life course determinant model. One of the things such a model would do is address the issue raised earlier about the lack of measures of parent and child health literacy.

One participant referenced the levels of intervention in the health care system discussed in the report Crossing the Quality Chasm (IOM, 2001). The discussion describes opportunities to intervene at different levels, such as at the level of the individual patient, the team, the organizational level, and the specific context or environment. These different approaches to measuring health literacy appear to be moving back and forth among these levels. At some point, it might be helpful to develop a table or graph that sets out the domains of activity and organizes the measures at different levels within these domains. Developing such a table might help clarify which factors contribute to problems in health literacy at different levels. On the other hand, such a table might help determine which interventions are likely to work at different levels.

Smith responded that the Life Skills Progression Instrument and the functional health literacy measure derived from it are used at all levels. The data are rich with information for the individual level, the particular practice level, and the organizational level. The assessment looked at seven different programs with a number of visitors in each one. Analysis of differences among sites is ongoing. Not every site achieves the same progress. Differences in program emphasis and differences in individual visitor practices create different levels of progress for the families.

One participant said he appreciated the focus on functionality, which is important to incorporate into the testing of health literacy. Also important are constructs from behavioral science, such as the self-efficacy construct used by Smith. Were there other constructs from behavioral theories that could be suitably incorporated into measurement of functional health literacy? Smith said health belief model theories can be incorporated.

One participant said Mazor appeared to be looking at oral health literacy in a static way rather than taking advantage of the simple ability in an interpersonal situation to ask a question for clarification. Is that being factored into the analysis? Mazor responded that it is not, because of resource limits and the constraints of the testing situation, which called for a test that did not require someone to score it. It is an important piece missing from the study, but there is still value in learning whether people understand information when they hear it the first time.

If one could get at the interaction effect, the participant continued, there is an opportunity to weave health literacy into the care model. The care model⁴ is a systems approach to health care that involves productive interactions among activated patients and informed providers. If one could weave health literacy into that model, it could be a very important way to evaluate whether those interactions have been productive.

Mazor said that one will probably find that people do not understand a lot of what are fairly simple bits of interactions. It is important to know that what is said in interactions is not understood in the same way that print literacy is understood.

Another participant from Health Literacy Consulting said that her experience has shown that difficulty in understanding is increased at the moment of encounter when the provider speaks English but the patient speaks English as a second language. Assessment of oral health literacy would benefit from looking at this issue.

One participant asked whether there is any assessment that observes what happens during an interaction between the patient and clinician, either with a peer observation of the process, a patient exit interview, a doctor exit interview, or another method. Cindy Brach responded that Johns Hopkins University has a project where interactions are videotaped in order to study the nature of the interaction.

One participant asked, for the assessment of oral health literacy taking place in the CRN, are there any plans to study people’s ability to understand information under distressing conditions? For the most part, it appears these assessments are being conducted under ideal conditions. But how will patients perform when they have just been given distressing news, such as that they have cancer or that they need to come back for a second mammogram?

Mazor said that issue is very important, but it is not something that is covered by the study described. An assessment conducted under stressful conditions would require different measurement questions from the kind of standardized instrument being tested in the assessment of oral health literacy.

Another participant said it appears that for each of the items included in the assessment described by Mazor, one can examine the difference in difficulty of each of the messages as well as the performance of individual participants. There are informed providers (i.e., the messages) and activated patients (i.e., participants). One should be interested not only in the performance of the participants—the test takers—but also the performance of the items that represent the context.

Mazor said she agreed completely. Looking at how difficult the items are—the items within a clip and the individual items associated with each clip—is important. It is planned that during the final year of the project work can focus on modifications to messages that allow one to test whether it is easier to give a message in one way versus another. That would, hopefully, lead to recommendations to providers as well as public health communication personnel about how better to construct messages.

One participant said that the study under discussion is measuring both the understanding of the stimulus and the understanding of the question. But is the study trying to match these in terms of level of difficulty? Mazor said that each of the demonstrations will have a number of restatements associated with it. One could conceivably write easy ones or hard ones. One does not want everything to be too easy because that would not allow discrimination of levels of health literacy. The level of difficulty is really a function of both the original statement and the item associated with that statement.

Another participant said that one of the differences between oral communication and print communication is that the printed material can be taken home while the oral communication exists in the interaction only. Additionally in terms of the immediacy of measurement, there is also the delay factor. Has thinking been given to exploring not only what is understood in the office, but what is understood once the patient returns home?

Mazor said that one of the reasons print materials are valuable is because one can take them and review them later. Patients want materials to take home. This underscores the fact that attention must be paid to print material. In the study of oral health literacy, there are measures of cognitive function as well as memory.



1. PROMIS “is an NIH Roadmap network project intended to improve the reliability, validity, and precision of PROs and to provide definitive new instruments that will exceed the capabilities of classic instruments and enable improved outcome measurement for clinical research across all NIH institutes” (http://aramis.stanford.edu/downloads/2005FriesCERS53.pdf). Accessed April 9, 2009.

2. “The TT [Talking Touchscreen] is a practical, user-friendly data acquisition method that provides greater opportunities to measure self-reported outcomes in patients with a range of literacy skills” (Hahn et al., 2004).

3. “The Sentence Verification Technique (SVT) is a procedure that non-psychometricians can use to develop reading and listening comprehension tests that can be based on a wide variety of text materials” (http://www.readingsuccesslab.com/publications/Svt%20Review%20PDF%20version.pdf). Accessed April 12, 2009.

4. “The Care Model is a population-based model that relies on knowing which patients have the illness, assuring that they receive evidence-based care, and actively aiding them to participate in their own care. . . . Effective outpatient chronic illness care is characterized by productive interactions between activated patients (as well as their family and caregivers) and a prepared practice team. This care takes place in a health care system that utilizes community resources” (http://www.tachc.org/HDC/Overview/CareModel.asp). Accessed April 12, 2009.

Copyright © 2009, National Academy of Sciences.
Bookshelf ID: NBK45378

