Chapter 8: Diagnosis and Treatment of Swallowing Disorders (Dysphagia) in Acute-Care Stroke Patients: Evidence Report/Technology Assessment Number 8

THIS EVIDENCE REPORT IS OUTDATED AND IS NO LONGER VIEWED AS GUIDANCE FOR CURRENT MEDICAL PRACTICE. IT IS MAINTAINED FOR ARCHIVAL PURPOSES ONLY.

Prepared for:
Agency for Health Care Policy and Research

U.S. Department of Health and Human Services
2101 East Jefferson Street
Rockville, MD 20852
http://www.ahcpr.gov

Contract No. 290-97-0020

Prepared by:
ECRI
Plymouth Meeting, Pennsylvania

AHCPR Publication No. 99-E024

July 1999

Preface

The Agency for Health Care Policy and Research (AHCPR), through its Evidence-based Practice Centers (EPCs), sponsors the development of evidence reports and technology assessments to assist public- and private-sector organizations in their efforts to improve the quality of health care in the United States. The reports and assessments provide organizations with comprehensive, science-based information on common, costly medical conditions and new health care technologies. The EPCs systematically review the relevant scientific literature on topics assigned to them by AHCPR and conduct additional analyses when appropriate prior to developing their reports and assessments.

To bring the broadest range of experts into the development of evidence reports and health technology assessments, AHCPR encourages the EPCs to form partnerships and enter into collaborations with other medical and research organizations. The EPCs work with these partner organizations to ensure that the evidence reports and technology assessments they produce will become building blocks for health care quality improvement projects throughout the Nation. The reports undergo peer review prior to their release.

AHCPR expects that the EPC evidence reports and technology assessments will inform individual health plans, providers, and purchasers as well as the health care system as a whole by providing important information to help improve health care quality.

We welcome written comments on this evidence report. They may be sent to: Director, Center for Practice and Technology Assessment, Agency for Health Care Policy and Research, 6010 Executive Blvd., Suite 300, Rockville, MD 20852.

John M. Eisenberg, M.D.
Administrator
Agency for Health Care Policy and Research

Douglas B. Kamerow, M.D.
Director, Center for Practice and Technology Assessment
Agency for Health Care Policy and Research

The authors of this report are responsible for its content. Statements in the report should not be construed as endorsement by the Agency for Health Care Policy and Research or the U.S. Department of Health and Human Services of a particular drug, device, test, treatment, or other clinical service.

Structured Abstract

Diagnosis and Treatment of Swallowing Disorders (Dysphagia) In Acute-Care Stroke Patients

Objectives

This report was requested by the Health Care Financing Administration, which sought an evidence-based assessment of methods for diagnosing and treating swallowing disorders (dysphagia) in elderly individuals with neurologic diseases, and specifically those methods associated with the services provided by speech-language pathologists. About 6,228,000 Americans over age 60 have dysphagia, and about 300,000 to 600,000 people are affected by dysphagia resulting from neurologic disorders each year.

This report addresses four questions: (1) How does diagnosis of dysphagia or aspiration affect treatment courses and patient outcomes? (2) What are the indications for diagnosing patients using a full bedside exam (BSE), modified barium swallow, fiberoptic endoscopy, or other instrumented exams? (3) Is one diagnostic technology more effective than any other? and (4) When is noninvasive therapy appropriate, and when should feeding tubes be used in certain patient populations?

Search Strategy

We used broad-based literature searching to ensure that we found all relevant information. Thus, we searched 23 electronic databases, including MEDLINE and Embase. Search terms included those relevant to the disorder, diagnostics, epidemiology, etiology, and treatment of dysphagia. Search dates depended on the database, but ranged from 1945 to July 1998. We also searched 23 World Wide Web sites, conducted hand searches of article bibliographies, and sought published and unpublished information from experts in the field.

Selection Criteria

We adopted broad criteria to ensure retrieval of all relevant information. Articles were retrieved if they were clinical studies of any design containing information derived from 10 or more patients. Two researchers independently requested articles. Disputes were resolved in favor of obtaining the article.

Data Collection and Analysis

Data from clinical trials on diagnostic test performance and/or patient outcomes were entered directly into evidence tables. This information included study size, design, diagnostic method(s), patient selection criteria and characteristics, and patient outcomes.

Studies were separated by their design, but their quality was not necessarily ranked according to it. We did not rigidly adhere to a formal quality ranking scheme because some randomized controlled trials (RCTs) had flaws serious enough to render their results unreliable. Similarly, we did not always consider case-controlled studies to provide more reliable information than case series because of certain flaws in the former.

To reach our conclusions, we performed two cost-effectiveness analyses, a meta-analysis of the efficacy of the 3-ounce water test, a similar meta-analysis of the BSE, an exploratory meta-analysis (so called because the control data were derived from historical sources) on the effects of dysphagia diagnosis and treatment programs on patient outcomes, an illustrative meta-analysis that addresses the technical issue of the low statistical power of vote-counting procedures, and numerous other quantitative analyses. The need to perform these original calculations is a reflection of the relatively poor reporting in this literature.

Main Results

We focused on dysphagia diagnosis and treatment programs because assessment of dysphagia diagnosis and treatment independent of one another provides only limited information. The emphasis of our analysis was on stroke victims, which reflects the preponderance of the literature. Stroke victims also comprise the largest group of patients with neurologic disease who have dysphagia. Outcomes of particular interest were aspiration pneumonia, malnutrition, dehydration, and quality of life (QOL), but the most data were available for aspiration pneumonia.

The evidence regarding the first question, which is the most significant of our key questions, suggests that dramatic reductions in the occurrence of pneumonia are observed when a systematic program of diagnosis and treatment of dysphagia in an acute stroke management plan is implemented. Because these data are from historically controlled studies rather than RCTs (which may be unethical in this context), the exact magnitude of this reduction in pneumonia rates is difficult to determine. Also for this reason, it is equally difficult to determine the contribution of the dysphagia-specific aspects of the management programs to these rate reductions (as opposed to those aspects of the stroke management program not related to diagnosis and treatment of dysphagia). However, because the effects observed in these studies are substantial, it would be imprudent to ignore them. Therefore, these results must be taken as evidence of efficacy of these programs. Malnutrition, dehydration, and QOL were not addressed in available studies.

Regarding the second question, the risk for developing aspiration pneumonia cannot be accurately predicted from any single clinical sign or symptom. There is a clear-cut need to optimize a brief initial exam that employs several key signs and symptoms to accurately detect patients with possible unsafe swallows and who therefore need more extensive testing.

The results of the studies of diagnostic test performance that address the third question are erroneous because treatment necessarily follows diagnosis of a swallowing disorder. This treatment makes it impossible to distinguish between the true- and false-positive results of these tests. Because of this, the purely diagnostic abilities of the various tests cannot be determined; thus, the tests can only be compared by assessing how they direct treatment and influence adverse patient outcomes, as addressed in our first question above.

We constructed two cost-effectiveness models to address this issue, and their results suggest that use of the full BSE in dysphagia programs reduces costs. These models, which are based on published protocols and findings, assume that patients will receive a preliminary bedside assessment, and that one result of this assessment is that no more than about 39 percent of patients will be referred for further evaluation, as occurred in the best study we evaluated. Under these conditions, our cost-effectiveness analysis suggests that dysphagia diagnosis and treatment programs that employ the full BSE would either save money or have very little net cost if they reduced pneumonia rates by amounts similar to those obtained in certain published studies. In addition, our results indicate that the slightly higher costs of videofluoroscopy would be offset if it provided slightly less than an additional 10 percent reduction in pneumonia rates, again assuming that no more than about 39 percent of patients are referred for videofluoroscopy or the full BSE.

Evidence related to the fourth question consists of limited data from a single RCT concluding that a soft mechanical diet including thickened liquids results in lower morbidity than does a pureed diet. Although numerous other studies have been conducted on dysphagia treatments, their designs make it impossible to assess the efficacy of individual treatments. A single RCT with low statistical power reported inconclusive results about the effect of treatment intensity level on patient outcomes.

A well-designed trial that compares dysphagia management programs using different diagnostic modalities is needed. This trial should take into account the current lack of a demonstrated gold standard diagnostic test, have acceptable statistical power, and consider the fact that treatment of patients distorts measures of diagnostic test performance. Because of this urgent need, we provide a detailed description of the design and analysis of a trial that would answer several major unanswered questions.

Conclusions

Available evidence on the diagnosis and treatment of dysphagia is extremely limited. Nevertheless, current data suggest that implementation of dysphagia management programs for stroke patients in the acute-care setting is accompanied by a reduction in pneumonia rates. Use of the full BSE in these programs appears to be cost-effective. The limitations of available evidence do not allow us to determine the extent to which videofluoroscopy or fiberoptic endoscopy reduce pneumonia rates compared with the full BSE despite the fact that it is reasonable to believe the additional information they provide would lead to improved patient outcomes. Although the added benefits provided by instrumented exams are likely to be small, videofluoroscopy may be cost-effective if used in programs in which no more than about 39 percent of all patients are referred to this test, and if its use leads to an additional 10 percent reduction in pneumonia rates, compared with the full BSE. There is a clear-cut need to optimize a brief initial exam that accurately detects patients with possible unsafe swallows who may therefore need more extensive testing. Study designs used in the evaluation of noninvasive therapy have made it impossible to assess the efficacy of individual treatments. However, there is evidence supporting the use of a soft mechanical diet over a pureed diet for preventing aspiration pneumonia in stroke patients with dysphagia who have a history of aspiration pneumonia. The evidence on whether treatment that is more intensive yields better patient outcomes than less intensive treatments is inconclusive. Well-designed studies, particularly those that compare diagnostic modalities, are needed.

This document is in the public domain and may be used and reprinted without permission, except for those copyrighted materials noted for which further reproduction is prohibited without specific permission of copyright holders.

Suggested Citation

Diagnosis and Treatment of Swallowing Disorders (Dysphagia) in Acute-Care Stroke Patients. Evidence Report/Technology Assessment No. 8. (Prepared by ECRI Evidence-based Practice Center under Contract No. 290-97-0020.) AHCPR Publication No. 99-E024. Rockville, MD: Agency for Health Care Policy and Research. July 1999.

Summary

Overview

This report has two goals: (1) to examine, in an evidence-based fashion, the efficacy and cost-effectiveness of methods for diagnosing and treating swallowing disorders in elderly individuals with neurologic disorders, and (2) to suggest important directions for future dysphagia research.

In addressing the first goal, we concentrate on the broad area of speech-language pathology and, more specifically, on the diagnostic and treatment methodologies associated with the services provided by speech-language pathologists. Services provided by gastroenterologists or other medical professionals are not emphasized. Partly because of this, we focus on oropharyngeal dysphagia and do not consider esophageal dysphagia to any great extent. Thus, those services provided to cancer patients, patients who have undergone esophageal surgery, or patients with esophageal reflux are not discussed here.

The available evidence further defined the scope of this report. We attempted to address all neurologic disorders that affect the elderly and that commonly cause dysphagia. However, the preponderance of published data focuses on stroke victims, making this, by necessity, the focus of this report. This restriction is not severe because most cases of dysphagia occur in these patients. However, it does mean that we focus on the diagnosis and treatment of dysphagia in a patient population that contains some patients who may ultimately spontaneously recover their swallowing function and we do not focus on patients with neurodegenerative diseases, whose swallowing function may progressively worsen.

In addressing the second goal of this report, we offer detailed recommendations for conducting a clinical trial. To make the reasons for the details of this trial clear, we comment throughout this report on the strengths and weaknesses of available studies. In doing so, we develop two major themes. The first is that in many studies on diagnostic tests for dysphagia, there is a relationship between diagnosis and treatment that makes it impossible to distinguish between the true- and false-positive results of these tests in terms of predicting risk for pneumonia (by detection of aspiration). Successful treatment will turn some true positives for pneumonia risk into apparent false positives. This, in turn, means that the diagnostic test sensitivities and specificities for pneumonia risk reported in many studies are incorrect, and the true values cannot be known without unethically withholding treatment from some patients diagnosed with dysphagia and aspiration. Because these true values cannot be known, it is not possible to compare tests on the basis of these values. Thus, the only ethical way to compare diagnostic tests is to compare the patient outcome of prevention of pneumonia following diagnosis and treatment, not the sensitivities and specificities for detection of pneumonia risk. In such comparisons of pneumonia outcomes it must be ensured that patients given different diagnostic tests receive similar treatment.
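The effect of treatment on apparent test performance can be illustrated with a small arithmetic sketch. The cohort sizes and the treatment success rate below are hypothetical and are not taken from this report; the function name is likewise illustrative. The sketch assumes that every test-positive patient is treated and that treatment prevents pneumonia in a fixed fraction of the patients who would otherwise have developed it.

```python
def apparent_ppv(true_pos, false_pos, treatment_success_rate):
    """Positive predictive value for pneumonia as it would be observed
    when every test-positive patient is treated.

    true_pos: test-positive patients who would develop pneumonia if untreated
    false_pos: test-positive patients who would not develop pneumonia anyway
    treatment_success_rate: fraction of true positives whose pneumonia is prevented
    """
    prevented = true_pos * treatment_success_rate
    observed_pneumonia = true_pos - prevented
    return observed_pneumonia / (true_pos + false_pos)


# Hypothetical cohort: 40 true positives and 60 false positives.
untreated = apparent_ppv(40, 60, 0.0)   # no treatment: true PPV is visible
treated = apparent_ppv(40, 60, 0.75)    # effective treatment: PPV appears lower
print(untreated, treated)
```

In this hypothetical cohort, the more effective the treatment, the worse the test appears to predict pneumonia, even though the test itself has not changed.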

The second theme we develop is one of statistical power. Virtually all of the studies in this area that attempt to compare the efficacy of diagnostic tests or treatment are too small. In general, they contain no more than 150 patients when 10 times this many are needed to obtain statistically significant differences.
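To give a sense of scale for this point, the sketch below applies a standard normal-approximation formula for the number of patients needed per group to compare two proportions. It is not the power calculation used in this report. The pneumonia rates, significance level, and power are hypothetical inputs chosen only for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_group(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect a difference
    between two proportions with a two-sided test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treated) / 2
    pooled = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    separate = z_beta * sqrt(
        p_control * (1 - p_control) + p_treated * (1 - p_treated)
    )
    return ceil((pooled + separate) ** 2 / (p_control - p_treated) ** 2)


# Hypothetical pneumonia rates of 10% versus 6% (not taken from this report).
n_each = patients_per_group(0.10, 0.06)
total = 2 * n_each
print(n_each, total)
```

With these hypothetical inputs the requirement comes to several hundred patients per group, far more than the roughly 150 patients that the largest available studies contain.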

Dysphagia is commonly associated with several neurologic disorders. Oropharyngeal dysphagia can be quite serious if it results in leakage of food, drink, or oral secretions into the lungs (aspiration); this can lead to aspiration pneumonia. Aspiration pneumonia can be fatal, and the elderly are at particular risk for death. Dysphagia can also be serious if the patient is unable to eat enough food to maintain a healthy weight. Malnutrition may occur if the problem is not diagnosed correctly, and malnutrition in turn can weaken the immune system, leaving the patient susceptible to illness. Finally, patients with dysphagia may suffer from dehydration and have a reduced quality of life (QOL).

Information on the incidence and prevalence of dysphagia is sparse. To estimate the incidence of dysphagia and consider the impact of this disease on society, we performed 99 original calculations based on existing data. From these calculations, we estimate that approximately 300,000 to 600,000 people per year are affected by dysphagia resulting from neurologic disorders. Only about 51,000 of these cases are from neurologic disorders other than stroke. On the basis of data from the stroke literature, we estimate that approximately 43 percent to 54 percent of stroke patients with dysphagia experience aspiration, approximately 37 percent of these patients will develop pneumonia, and 3.8 percent of these will die of pneumonia if they are not part of a dysphagia diagnosis and treatment program. Up to 48 percent of all acute-care stroke patients with dysphagia will experience malnutrition.
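The percentages above can be chained to give expected counts. The sketch below does this for a hypothetical cohort of 1,000 stroke patients with dysphagia who are not in a diagnosis and treatment program. It assumes that each percentage applies to the group named immediately before it (pneumonia among aspirators, death among pneumonia cases), which is one reading of the text; the cohort size and function name are illustrative.

```python
def expected_cases(n_dysphagic, p_aspirate, p_pneumonia=0.37, p_death=0.038):
    """Chain the estimated proportions into expected patient counts.

    Assumes each proportion is conditional on the preceding group.
    """
    aspirators = n_dysphagic * p_aspirate
    pneumonia = aspirators * p_pneumonia
    deaths = pneumonia * p_death
    return aspirators, pneumonia, deaths


# Lower and upper aspiration estimates (43% and 54%) for a hypothetical
# cohort of 1,000 dysphagic stroke patients.
low = expected_cases(1000, 0.43)
high = expected_cases(1000, 0.54)
print(low, high)
```

Under these assumptions, roughly 160 to 200 of the 1,000 patients would be expected to develop pneumonia, and about 6 to 8 would be expected to die of it.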

Formal diagnosis of oropharyngeal dysphagia is commonly carried out using a full bedside exam (BSE) or videofluoroscopy (also called the modified barium swallow, or MBS), but other diagnostic methods are available, including several variants of fiberoptic endoscopy. Common treatments include both noninvasive therapies (such as diet modification and swallow therapy) and invasive therapies such as percutaneous endoscopic gastrostomy (PEG). PEG is the most common invasive intervention for neurogenic oropharyngeal dysphagia, and is often used when dysphagia and aspiration are serious enough to be life threatening.

Reporting the Evidence

In examining the efficacy and cost-effectiveness of methods for diagnosing and treating dysphagia, we address four key questions:

1. How does diagnosis of dysphagia or aspiration affect the subsequent course of treatment and outcomes?

2. What are the appropriate indications for having patients diagnosed using a full BSE, the MBS, fiberoptic endoscopy, or another instrumented exam?

3. Is there any evidence that one diagnostic technology provides more useful information than any other diagnostic?

4. When is noninvasive swallow therapy appropriate? Does it work particularly well or particularly poorly in any particular patient population? What can the evidence tell us about this therapy? Are feeding tubes useful or a last resort that might be avoided for some patients by dysphagia diagnosis and therapy?

We ranked each of these questions according to their presumed significance to patients and society using a six-level hierarchy. Thus, questions with the greatest significance are more general and encompass the patient outcomes obtained using the combination of dysphagia diagnosis and treatment, a combination that mirrors actual clinical practice. Those questions with presumably lesser significance focus either on treatment alone or only on diagnostic efficacy.

According to this hierarchy, Question 1 has the greatest significance to patients and society because it directly addresses the benefits that accrue to patients as a result of dysphagia diagnosis and treatment. Neither Question 2 nor Question 3 specifically addresses treatment, and Question 4 does not consider the benefits that may accrue to patients as a result of diagnosing dysphagia.

Despite the relative importance of these four questions, we devote considerable effort to determining the relative efficacy of different diagnostic tests because patients afforded the most efficacious test are the most likely to benefit. Of primary interest in our assessment, therefore, were the efficacies of the commonly used instrumented and noninstrumented technologies. Instrumented tests include the videofluoroscopic swallow study (VFSS), the primary variant of which, in this field, is the MBS. Investigators in dysphagia research consider the MBS the gold standard against which to compare new diagnostic technologies. Also of particular interest were two newer technologies: the fiberoptic endoscopic exam of swallowing (FEES) and the fiberoptic endoscopic exam of swallowing with sensory testing (FEESST).

The noninstrumented diagnostic test of primary interest in this report is the full BSE. This is a formal, structured test that often incorporates a questionnaire, noninvasive oral-pharyngeal physical examination, and functional swallowing tests. There are many variants of this diagnostic test, and sometimes it consists only of a simple water swallow test. To avoid confusion of this exam with a preliminary BSE, which also has many variants, we refer to the formal, structured exam as the full BSE.

The focus of our analysis of these diagnostic tests, which is again dictated by the preponderance of the literature, is on the ability of these tests to predict aspiration and aspiration pneumonia. We were, however, interested in all patient outcomes, including mortality and any morbidity directly associated with dysphagia or aspiration. These include pneumonia, malnutrition, dehydration, QOL, feeding tube complications, and death, and short-term outcomes such as energy intake, weight change, and successful feeding levels. We examine these to the extent that the literature will allow.

The treatments on which we focused were primarily noninvasive; however, a minimally invasive therapy, PEG, was also considered because some patients may have severe dysphagia that precludes oral feeding. There are two basic categories of noninvasive treatment: swallow therapy and diet modification. Within the category of swallow therapy are three basic subcategories: compensatory techniques (which teach the patient postural maneuvers to compensate for swallowing difficulty or provide more sensory information to the patient through changes in bolus characteristics), indirect swallow therapy (which teaches the patient exercises to strengthen impaired or weakened muscles), and direct swallow therapy (in which patients are taught exercises to perform during the swallow to force the bolus to pass through the pharynx correctly).

Diet modification is sometimes used if the patient aspirates only certain substances while swallowing. This modification is individualized to the patient's needs, depending on the viscosity or volume of the boluses aspirated. Most patients who aspirate because of neurogenic dysphagia aspirate thin liquids, so patients are often put on a diet in which liquids are thickened, usually with a commercial thickening agent. Less common is difficulty with solids, but if a patient were to experience this, solid food would be pureed before it is eaten.

Methodology

We searched 23 electronic databases to retrieve both clinical trials for analysis and review articles to gauge current opinion in the field. The databases searched and general keywords used are listed below:

American Speech-Language-Hearing Association (ASHA) Database (1945 to 1996)

CANCERLIT (through September 4, 1997)

CATLINE (through August 25, 1997)

The Cochrane Database of Systematic Reviews (through 1998 Issue 3)

The Cochrane Registry of Clinical Trials (through 1998 Issue 3)

The Cochrane Review Methodology Database (through 1998 Issue 3)

Combined Health Information Database (CHID) (through July 29, 1998)

Current Contents (through July 1998)

The Database of Reviews of Effectiveness (Cochrane Library) (through 1998 Issue 3)

DIRLINE (through November 1997)

ECRI Health Devices Alerts (1977 through July 1998)

ECRI Health Devices Sourcebase (through July 1998)

ECRI International Health Technology Assessment (IHTA) Database (1990 through July 1998)

ECRI Healthcare Standards Database (1975 through July 1998)

EMBASE (Excerpta Medica) (1974 through February 6, 1998)

Health Care Financing Administration (HCFA) Coverage Manuals CD-ROM (through July 1998)

HealthSTAR (Health Services, Technology, Administration, and Research) (1990 through May 20, 1998)

Incidence and Prevalence Database (1988 through August 25, 1997)

MEDLINE (1964 through July 24, 1998)

NIH Grants Database (through December 12, 1997)

Nursing and Allied Health (NAHL) (1988 through April 30, 1998)

PsycINFO (1967 through September 10, 1997)

Sociological Abstracts (1963 through November 1997)

The search strategies employed a number of free-text keywords, as well as controlled vocabulary terms, including but not limited to:

Diagnostic modalities: barium sulfate; barium swallow; barium; fluoroscopy; cineradiology; videofluoroscopy; FEES (fiberoptic endoscopic evaluation of swallowing); ESE (endoscopic swallowing evaluation); FEED (fiberoptic endoscopic evaluation of dysphagia); FEESST (fiberoptic endoscopic evaluation of swallowing with sensory testing); VEED (videoendoscopic evaluation of dysphagia)

Disorder: deglutition disorders (exploded); deglutition (exploded); dysphagia; swallowing

Epidemiology: epidemiology; research design; epidemiologic study characteristics; epidemiologic methods; epidemiologic studies; evaluation studies; incidence; prevalence; statistics and numbers; aspiration pneumonia; neurodegenerative diseases (exploded); Parkinson disease; silent aspiration; stroke

Etiology: aging; Alzheimer disease; dementia; multiple sclerosis; Parkinson disease; stroke

Miscellaneous: cachexia; wasting; weight loss; quality of life; QOL; life satisfaction; satisfaction

Treatment: speech therapy; speech-language pathology; electrical stimulation; enteral nutrition; intubation, gastrointestinal; nasogastric; nasointestinal; NG; percutaneous endoscopic gastrostomy; PEG tube feeding; rehabilitation, geriatric; rehabilitation; speech and language; rehabilitation, patients; elder care; mobile health units

World Wide Web Searches

In addition to the above searches, we conducted searches of the World Wide Web. These searches were conducted using various search engines including (but not limited to) AltaVista, Hotbot, and Yahoo. Searches focused on the areas of dysphagia, aging, neurologic disorders, and pneumonia. Twenty-three Web pages were accessed for this project.

Hand Searches of Journal and Nonjournal Literature

We searched Current Contents - Clinical Medicine on a weekly basis as well as more than 1,600 journals and supplements maintained in our collections. In addition, we conducted a hand search of the Cumulated Index Medicus (1960-1964) for the terms deglutition, deglutition disorders, speech therapy, and speech disorders. We also screened nonjournal publications and conference proceedings from professional organizations, private agencies, and government agencies.

Other Mechanisms

Other mechanisms were used to retrieve additional relevant information, including review of bibliographies/reference lists from peer-reviewed journals and gray literature. (Gray literature includes reports, studies, etc., produced by local government agencies, private organizations, educational facilities, corporations, etc., that do not commonly appear in the published peer-reviewed journal literature.) Published and unpublished information was also solicited from a panel of experts in the field.

Information Retrieved

The use of these search methodologies, as well as personal communications from several technical experts, resulted in the identification of 4,485 items of information in the form of journal articles, book chapters, manuscripts, monographs, Web pages, personal communications, and other miscellaneous items. Two primary analysts blinded from each other independently reviewed the titles and abstracts of all electronic search results and ordered articles using the following inclusion criteria:

  • 10 or more subjects

  • Human studies

  • In vivo studies

  • English language

Our broad-based literature searching and broad inclusion criteria ensured retrieval of all relevant information. To further this goal, literature searches were provided to two analysts who independently requested information. Discrepancies in requests were resolved in favor of obtaining the information. Further, literature requested by each analyst was delivered to both analysts who, during initial review of the retrieved literature, reviewed the bibliographies of each article for additional information that might not have been found during the database searches.

In certain instances, more specific inclusion criteria were used. These criteria are described in the appropriate sections of this report.

As a result of our searches, a total of 1,808 articles were retrieved. These included 1,467 clinical trials, 183 review articles, and material from nine World Wide Web sites. In addition, we obtained 32 unpublished articles and received 28 personal communications.

We adopted a flexible scheme for assessing the quality of the literature, which consists of several types of study designs. Among these designs were randomized controlled trials (RCTs), historical prospective case series, case-controlled studies, and case series. In general, we considered data from RCTs to be more reliable than that from studies of other design and considered data from historical prospective case series to be of the second highest level of reliability. We did not, however, rigidly adhere to this scheme because some RCTs had flaws serious enough to render their results unreliable. Similarly, we did not always consider case-controlled studies to provide more reliable information than case series because of certain flaws in the former. These flaws were particularly evident when case controls were used to measure the sensitivity and specificity of diagnostic tests. Such designs artificially set the prevalence and severity of disease, rendering measures of test performance derived from them unreliable.
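One part of this problem, the dependence of predictive values on the prevalence fixed by the study design, can be shown with a short calculation. The sensitivity, specificity, and prevalence values below are hypothetical and are not drawn from any study in this report. The sketch does not model the separate effect of disease severity on sensitivity and specificity themselves.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test (80% sensitivity, 80% specificity) under a
# case-control design with equal numbers of cases and controls (50%
# prevalence) and in a consecutive series with 10% prevalence.
case_control = predictive_values(0.80, 0.80, 0.50)
consecutive = predictive_values(0.80, 0.80, 0.10)
print(case_control, consecutive)
```

In this hypothetical example the positive predictive value falls from 0.80 to about 0.31 when the same test is moved from the artificially balanced sample to the lower-prevalence series.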

Because of the scarcity of controlled trials on the effectiveness of dysphagia diagnosis and treatment programs or of treatment per se, a standard meta-analysis that used improvement in patient outcomes as the dependent variable was not possible. Nevertheless, the analytical approach used throughout this evidence report is heavily quantitative. It contains two cost-effectiveness analyses, a meta-analysis (in the form of a summary receiver operating characteristic curve) on the efficacy of the 3-ounce water test, a similar meta-analysis on the efficacy of the BSE, an exploratory meta-analysis (so called because it derives control group data from historical sources) on the efficacy of dysphagia diagnosis and treatment programs, and a meta-analysis that addresses the more technical issue of the low statistical power of vote-counting procedures. In addition to these analyses, the report contains numerous other original calculations. Noteworthy among these is that we calculated the 95 percent confidence intervals (CIs) around all of the proportions in the supplemental analyses and Results section of this report. We performed these calculations using a formula that corrects for the spurious results that common formulae for CIs yield when a proportion deviates substantially from 0.5. We also computed all of the Fisher's exact tests shown in this report and used this test rather than the χ² test or odds ratios because the latter two yield suspect results when a zero (or a near-zero number) is in one of the cells of the 2 × 2 tables on which these statistics are computed. We also adjusted for the varying lengths of followup used by different studies (and hence the varying risks of contracting pneumonia) by using a curve-fitting procedure, and we provide original calculations (or independently verified the published calculations) for all sensitivities, specificities, and positive and negative predictive values shown in the evidence tables of this report. In our suggested clinical trial we provide original statistical power calculations based on our own number needed to treat analysis. Finally, the appendices and supplemental analyses provided with this evidence report contain numerous other original calculations.
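The two statistical choices described above can be illustrated with a minimal sketch. The report does not name the confidence interval formula it used, so the Wilson score interval below is an assumption chosen only because it behaves sensibly for proportions near 0 or 1; the function names and the example counts are hypothetical and are not taken from the report's data.

```python
import math


def wald_interval(successes, n, z=1.96):
    """Common textbook CI for a proportion; degenerates when p is 0 or 1."""
    p = successes / n
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)


def wilson_interval(successes, n, z=1.96):
    """Wilson score CI (an assumed stand-in for the report's unnamed formula)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(k):
        return (math.comb(row1, k) * math.comb(row2, col1 - k)
                / math.comb(total, col1))

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_value = sum(prob(k) for k in range(k_min, k_max + 1)
                  if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)


# Hypothetical example: 0 events among 20 patients in one group,
# 5 events among 10 patients in a comparison group.
print(wald_interval(0, 20))                 # collapses to a zero-width interval
print(wilson_interval(0, 20))               # still gives a usable upper bound
print(fisher_exact_two_sided(0, 10, 5, 5))  # remains valid with a zero cell
```

With zero observed events the Wald interval collapses to a single point, which is the kind of spurious result the text refers to, whereas the Wilson interval and Fisher's exact test remain well defined.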

This need to perform a substantial number of original calculations reflects the relatively poor reporting and quality of the literature related to this evidence report. Without our original calculations, it would have been extraordinarily difficult to reach meaningful conclusions. We recognize that attempting to reach conclusions from studies of relatively poor design is uncommon. However, many of today's pressing healthcare questions must be answered in the absence of strong evidence, and failure to do so imposes serious limitations on the practical applications of evidence-based medicine.

Findings

The findings outlined here are applicable only to acute-care stroke patients. Insufficient information was available on patients with other neurologic diseases or conditions, and it is entirely possible that at least some of the findings we present are not applicable to patients with these other diseases and conditions.

1. How does diagnosis of dysphagia or aspiration affect the subsequent course of treatment and outcomes?

In answering this question, we focus on dysphagia diagnosis and treatment programs in general. We address the specific diagnostic tools and the specific treatments that might be used to optimize these programs in Questions 3 and 4, respectively.

Current evidence suggests that implementation of a systematic program of diagnosis and treatment of dysphagia in an acute stroke management plan yields dramatic reductions in pneumonia rates. Because these data are derived from historically controlled studies rather than RCTs, the exact magnitude of this reduction in pneumonia is difficult to determine. Also for this reason, it is equally difficult to determine the contribution of the dysphagia-specific aspects of the management programs to these rate reductions (as opposed to those aspects of the stroke management program not related to diagnosis and treatment of dysphagia). However, because the effects observed in these studies are substantial, it would be imprudent to ignore them. Therefore, these results must be taken as evidence of efficacy of these programs.

Partly for the reasons noted in the preceding paragraph, it seems prudent to include dysphagia-specific management with formal diagnosis and treatment as part of the standard protocol of stroke management in acute-care settings, despite the sparse available data. Also, these programs appear to have a minimal potential to harm patients. Finally, dysphagia diagnosis and treatment programs may be cost-effective. Thus, withholding such programs would deny potential benefits to patients, perhaps at some additional costs.

2. What are the appropriate indications for having patients diagnosed using a full BSE, MBS, fiberoptic endoscopy, or another instrumented exam?

The risk for developing aspiration pneumonia cannot be accurately predicted from any single clinical sign or symptom. There is a clear-cut need to optimize a brief initial exam that accurately identifies patients who may have unsafe swallows and who therefore need more extensive testing. An optimum combination of signs and symptoms for such an initial test has not been determined, and further appropriate research is needed. This research should seek to devise an initial test that has high sensitivity and moderate specificity. This would ensure that the great majority of patients with unsafe swallows are appropriately referred for further testing and that only a minimum number of patients with safe swallows are erroneously referred for more extensive testing.

3. Is there any evidence that one diagnostic technology provides more useful information than any other diagnostic?

Neither videofluoroscopy nor fiberoptic endoscopy can serve as a perfect gold standard for detection of aspiration, because each yields false-negative and false-positive results. Without a third, better reference standard, the ability of these two methods to detect aspiration cannot be compared with each other.

Full BSEs can have sensitivities for aspiration near 80 percent, with specificities near 70 percent. Epidemiological evidence indicates that about half of the patients with dysphagia who aspirate do so silently (without a cough). These two points, taken with the very low pneumonia rates observed in dysphagia management programs that used full BSEs, indicate that these exams are capable of detecting most aspiration, even silent aspiration (or that any undetected aspiration does not contribute greatly to the pneumonia rate).
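To make the sensitivity and specificity figures above more concrete, the following sketch converts them into positive and negative predictive values. The sensitivity (0.80) and specificity (0.70) come from the text; the prevalence of aspiration (0.40) is a purely hypothetical value chosen for illustration, and the function name is our own.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity from the text; prevalence is an assumption.
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.70, prevalence=0.40)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Because predictive values depend on prevalence, the same exam would yield different values in a population where aspiration is more or less common than assumed here.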

The ability of the full BSE to detect silent aspiration should be optimized in future research. Whether it can be conclusively stated that an optimized full BSE can entirely replace imaging exams such as videofluoroscopy and fiberoptic endoscopy depends partly on the degree to which the BSE is optimized, and partly on the additional benefit that results from the direct internal visual information provided by the imaging exams.

In all studies that attempted to predict pneumonia from the results of a diagnostic test, measurements of the ability of a diagnostic test to predict pneumonia, when expressed in terms of test sensitivity and specificity, were distorted by intervening treatment that prevented some pneumonia from occurring. This distortion cannot be circumvented by withholding treatment, because this would be unethical. Therefore, it does not appear possible to ethically measure either the absolute or relative sensitivities or specificities of dysphagia diagnostic modalities for predicting pneumonia. This also appears to be true for other important patient outcomes such as malnutrition, dehydration, and QOL. This means that the only ethical method of comparing various diagnostic modalities is to conduct controlled trials that measure the combined effects of diagnosis and treatment on the rate of pneumonia and/or other patient outcomes. This limitation on study design provides one reason why the superiority of any given diagnostic method is not conclusively shown by currently available data.

Another reason that current studies do not conclusively show the superiority of any diagnostic test relates to their small size. These studies do not have enough statistical power to allow one to detect the effects of interest, which may be small.
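The link between small effects and required study size can be sketched with a standard normal-approximation formula for comparing two proportions. This is not the report's own power calculation; the pneumonia rates below (10 percent versus 5 percent) are hypothetical, as are the function names.

```python
import math
from statistics import NormalDist


def number_needed_to_treat(control_rate, treated_rate):
    """Patients who must be managed to prevent one additional event."""
    return 1 / (control_rate - treated_rate)


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients per arm to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical pneumonia rates: 10 percent with one exam, 5 percent with another.
print(number_needed_to_treat(0.10, 0.05))
print(sample_size_per_group(0.10, 0.05))
```

Under these assumed rates, each arm would need several hundred patients, which is far more than the small studies described above enrolled.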

The differences in the ability of various diagnostic tests to predict pneumonia are likely to be small because the two studies using BSEs were so successful (nearly all pneumonia was prevented). It is difficult to obtain a statistically significant improvement upon their results.

Although currently available data do not allow one to determine the degree to which, or even whether, use of videofluoroscopy or other instrumented exams leads to lower pneumonia rates than the full BSE, it is entirely reasonable to expect that their use might lead to lower pneumonia rates. This is because instrumented exams provide more information than the full BSE. Partly for this reason, and even though no data currently demonstrate a difference in effectiveness among diagnostic modalities, one should not conclude that our results prove that the BSE and instrumented exams are equivalent. In terms of patient outcomes, we have found no evidence of a difference between these technologies, not evidence of no difference. Research that appropriately addresses this issue is needed.

Because no data satisfactorily compare the abilities of the full BSE and videofluoroscopy to prevent pneumonia, we constructed two cost-effectiveness models to address this issue. These models, which are based on published protocols and findings, assume that patients will receive a preliminary bedside assessment, and that one result of this assessment is that no more than 39 percent of patients will be referred for further evaluation. Under these conditions, our cost-effectiveness analysis, although based on less than perfect data, suggests that dysphagia diagnosis and treatment programs that employ the BSE would either save money or have very little net cost if they reduced pneumonia rates by amounts similar to those obtained in certain published studies. In addition, our results indicate that the slightly higher costs of videofluoroscopy would be offset if it provided slightly less than an additional 10 percent reduction in pneumonia rates, again assuming that no more than 39 percent of patients are referred for videofluoroscopy or the full BSE.

Some epidemiological evidence suggests that dysphagia patients who aspirate have about a 50 percent greater risk of developing aspiration pneumonia than dysphagia patients who do not aspirate during videofluoroscopy exams. However, the relationship between aspiration and pneumonia is not perfect. Other patient characteristics such as oral hygiene and immune system strength may play equal or even greater roles in causing pneumonia. This loose relationship between aspiration and pneumonia means that aspiration should not be considered a definitive marker for the patient outcome of pneumonia.

Large RCTs measuring patient outcomes such as aspiration pneumonia, malnutrition, dehydration, and QOL are needed to determine the comparative effectiveness of the various dysphagia diagnostic and treatment methods. These trials should enroll patients in reasonably homogeneous disease groups or stratify patients into appropriate subsets. We have included detailed suggestions for such studies along with some estimates of the number of patients needed.

4. When is noninvasive swallow therapy appropriate? Does it work particularly well or particularly poorly in any particular patient population? What can the evidence tell us about this therapy? Are feeding tubes useful or a last resort that might be avoided for some patients by dysphagia diagnosis and therapy?

Most study designs used in the evaluation of noninvasive therapy have made it impossible to assess the effectiveness of individual treatments.

The results of a single RCT supported the use of a soft mechanical diet over a pureed diet for preventing aspiration pneumonia in stroke patients with dysphagia who had a history of aspiration pneumonia.

A single RCT reported inconclusive results about the effect of treatment intensity level on patient outcomes. The statistical power of this trial may have been too low to detect the appropriate differences.

Future Research

Research in the areas of diagnosis and treatment of dysphagia in neurologic patients has thus far largely focused on stroke patients, who comprise the largest proportion of patients with dysphagia, and has mostly been conducted as case reports and small case series. Only a few randomized trials have been attempted, and a few additional studies included historical controls. Future research needs to be conducted on all aspects of dysphagia. Our suggestions for study design and methodology are outlined below.

Results from patients whose dysphagia is caused by different diseases, disorders, or conditions should not be combined. When they are combined, it is probable that the results of a study will be influenced as much, if not more, by patient characteristics than by treatment.

Patient outcomes should be analyzed by stage of disease. For example, combining results obtained from patients soon after a stroke and results obtained from patients weeks or months after a stroke is inappropriate. Here, too, patient characteristics can have an undue influence on the overall results of a study.

Case series provide extremely limited information. RCTs are preferable, but where they are not ethical, controlled studies of other design should be employed. Randomization of patients to receive diagnosis by different methods is ethical because no single diagnostic has yet been proven superior to any other. This is not necessarily the case in the treatment of dysphagia, where randomizing patients to an untreated group may be unethical. If a control group is deemed unethical, the comparison group should be a similar patient population receiving a different treatment or different level of treatment.

Studies should focus on assessing overall dysphagia management programs because of the difficulty of separating diagnosis and treatment in this field.

Researchers should specify the kind of pneumonia they are studying. Results from patients with aspiration pneumonia should be reported separately from results from patients with pneumonia resulting from other causes (understanding, however, that the distinction between these two diagnoses is generally a hypothesis, rather than certainty, because of difficulty in ascertaining the etiology of pneumonia).

More study of whether dysphagia causes malnutrition and/or dehydration is required. If such a causal relationship is confirmed, investigators should begin to report these measures as outcomes.

Researchers need to report the specific causes of patient mortality, thus making it clear whether a patient died from complications resulting from dysphagia, or from the primary disease.

Additional information on the QOL of patients with dysphagia, and on the degree to which treatment of dysphagia improves QOL, is needed. QOL should be measured using standard methods.

Studies on feeding tube efficacy should separately report results obtained in patients with dysphagia and should not combine such results with feeding tubes placed for other reasons.

Patients should be followed for longer times than those used in available studies. Followup times should be standardized within any given study so that all patients are followed for approximately the same length of time.

Investigators comparing two diagnostic methodologies should ensure that patients in different groups do not receive different treatment, or comparisons between the two are no longer feasible.

Because of the shortcomings in available research, we suggest a multiarm, randomized trial to evaluate the efficacy of different dysphagia management programs. The primary outcome of this trial should be aspiration pneumonia rates, but other outcomes, such as those related to malnutrition and dehydration, should also be measured. The trial can consist of two to four groups, but for the purposes of illustration, we will describe it here as if it consisted of four. In this trial, the control group would consist of patients randomized to receive a full BSE alone; the three experimental groups would consist of patients randomized to receive the full BSE plus one of three possible instrumented examinations. Readers of the instrumented examinations would be blinded to the results of the BSE. The results of the BSE would be used to determine whether combinations of signs and symptoms predicted aspiration pneumonia and other outcomes. Treatment choices in the group that received only the BSE and in the groups that received the instrumented exams would be appropriately based on each of these exams. The results of this trial could be extended using decision analysis and cost-effectiveness analysis. Only if data from such a trial are available will it be possible to conclude definitively whether one exam is superior to another in reducing the incidence of aspiration pneumonia and/or other adverse outcomes.

Introduction

Purpose of This Report

The purpose of this report is to assess, in an evidence-based fashion, the efficacy and cost-effectiveness of methods for the diagnosis and treatment of oropharyngeal dysphagia (swallowing disorders) in elderly individuals with neurologic diseases. Dysphagia is commonly associated with many neurologic diseases that often occur in the elderly, including cerebrovascular accident (acute stroke, CVA), Parkinson's disease, Alzheimer's disease, and motor neuron disease. Dysphagia in these patients most often manifests itself in the oral and pharyngeal stages of swallowing (for further discussion of these stages, see the section entitled Physiology and Symptomatology of Dysphagia). Dysphagia can be serious if it results in leakage of food or drink, oral secretions, or stomach contents below the true vocal cords into the lungs (aspiration). Aspiration of these materials can lead to aspiration pneumonia. Because pneumonia is a leading cause of death in the elderly, it is important to determine which current diagnostic and treatment technologies help prevent patients with dysphagia from contracting pneumonia.

Dysphagia, if untreated, can also potentially lead to malnutrition or dehydration if the patient is unable to ingest enough food or drink to maintain healthy weight or nutritional status. Malnutrition, in turn, can weaken the immune system; dehydration can lead to dementia-like symptoms. The accurate diagnosis of dysphagia and determination of whether current dysphagia programs prevent malnutrition and dehydration are important.

In addition to pneumonia, malnutrition, and dehydration, dysphagia may cause substantial deficits in quality of life (QOL). Although the literature on the impact of diagnosis and treatment of dysphagia on QOL does not allow formal evaluation, this issue will be discussed in the present report.

Several instrumented and noninstrumented techniques for diagnosing oropharyngeal dysphagia exist; the bedside exam (BSE) and videofluoroscopy (VF) are the most common of those currently used. Once dysphagia is diagnosed, several treatment options are available: noninvasive therapies include diet modification and swallow therapy, which may include postural changes or specific exercises. If these are not adequate or possible, minimally invasive parenteral (intravenous) or nasogastric (NG) feeding may be instituted temporarily. These interventions are frequently carried out even before a formal diagnosis of dysphagia has occurred. Invasive therapy may be necessary when dysphagia and aspiration are serious enough to threaten health on a long-term basis; the most common invasive therapy for neurogenic dysphagia is a percutaneous endoscopic gastrostomy (PEG). This report evaluates the current evidence regarding these diagnostic and noninvasive treatment techniques, along with some discussion of parenteral and NG feeding and evaluation of PEG.

Neurogenic dysphagia is encountered in most care settings, including hospital acute-care, rehabilitation units, nursing homes, and community outpatient and home care settings. Because of limitations in the published literature, we were able to carry out a formal evaluation of only the hospital acute-care setting; however, some of these findings can reasonably be extrapolated to other settings, and, where appropriate, we discuss this possibility in this report.

Our evaluation of these diagnostic and treatment techniques centers on four major questions. The question of greatest interest to patients and society is the most global: How does the presence of a dysphagia management program (including both diagnosis and treatment) ultimately affect long-term patient outcomes (morbidity and mortality incidence)? Other specific analyses in this report address:

  • Signs and symptoms predictive of serious morbidity and mortality that may help select patients for extensive diagnostic testing

  • The comparative sensitivity and specificity of different diagnostic tests

  • The use of diagnostics to guide treatment

  • The relative efficacy of different treatment programs

The Results section of this report addresses these issues. The Introduction provides a background on neurogenic swallowing disorders, an overview of diagnostic and treatment methods, and a discussion of the epidemiology and burden of illness caused by neurogenic dysphagia. The Methodology section gives a technical overview of the questions we addressed and of the processes we used to identify and select all relevant literature. It also includes a discussion of the evidence model and hierarchy of evidence around which we have structured the analyses.

In the Results section of this report, we summarize our findings. The Future Research section makes specific recommendations for improvements in future clinical studies, and suggests a multicenter randomized trial that would answer several open questions in this field. A detailed description of the design and statistical analysis of this trial is provided.

A supplementary analysis is also included. This analysis consists of a model of the process of diagnosing and treating dysphagia using a decision tree. This tree compares the cost-effectiveness of a directed dysphagia program for hospitalized stroke patients with hospital care without such a directed program. This tree, however, is limited by available data. In Appendix F is another decision tree that is meant to serve as a guidepost for future research. Much of it is built upon the clinical trial that we suggest.

Elderly Defined

Elderly is usually not well defined. The ages encompassed by this classification vary from researcher to researcher. While Medicare may consider the elderly to be those over 65 years, many others have included those as young as 55. The oldest old are those 85 years old and over. For the purposes of this report, we attempt to compare results of different studies in such a way that the ages included are comparable. Results for different subgroups of the elderly are presented when available; the results of the oldest old, who are likely to be the most disabled, are presented separately.

Dysphagia Defined

Table 1. Examples of Definitions of Dysphagia
Author | Year | Definition
Kahrilas | 1989 | Implies difficulty in swallowing
Dorland's Illustrated Medical Dictionary | 1994 | Difficulty in swallowing
American Academy of Otolaryngology-Head and Neck Surgery | 1995 | Feeling of difficulty passing food or liquid from the mouth to the stomach
Buchholz | 1996 | Deficiency in the achievement of the purposes of eating: (1) pleasure, both personal and social, and (2) nutrition and hydration
Boyce | 1997 | Sensation of delay in passage of a food bolus within 10 seconds of initiation of a swallow

Different authors define the term dysphagia (from the Greek dys, meaning disordered, and phagein, to eat; Winstein, 1983) in various ways, as illustrated by the examples shown in Table 1 (which is not exhaustive).

These definitions vary along a continuum from the specific and quantitative (e.g., Boyce, 1997) to those with subjective and qualitative elements (e.g., Buchholz, 1996). Differences among them are not insignificant, and have implications for the scope of the diagnosis and treatment of dysphagia. For example, Boyce (1997) confined the disorder to sensations of delay in food passage, thus presumably excluding silent aspiration (aspiration resulting in no symptoms and thus not detectable to the patient or on BSE) as an outcome of dysphagia. The definition by Buchholz (1996) would include not only silent aspiration, if classified as a deficiency in nutrition and hydration, but also deficiencies in eating pleasure that arise from oral problems such as poorly fitting dentures.

For the purposes of this assessment, we define dysphagia very generally, following Kahrilas (1989), as disordered swallowing. This definition does not require the patient to consciously perceive the disorder and, therefore, includes silent aspiration. However, of all the disorders that may arise during feeding, we narrow our focus to those that originate after the bolus has passed through the initial oral stages of mastication (oral dysphagia) and is at least at the point when transit from the mouth to the pharynx is initiated reflexively (oropharyngeal dysphagia). We will not consider esophageal dysphagia in this report; it falls more into the realm of gastroenterology than within the realm of disorders dealt with by speech-language pathologists (SLPs) and otolaryngologists, who sometimes identify these problems and bring them to the attention of gastroenterologists. In our discussion and analyses, we generally accept each study's definition of dysphagia but discuss any outcome differences that may have resulted from these different definitions and the diagnostic technologies used. In particular, we note differences in the reported occurrence of dysphagia diagnosed at BSE versus VF or fiberoptic methods [videofluoroscopic swallowing studies (VFSS), modified barium swallow (MBS), fiberoptic endoscopic examination of swallowing (FEES), and fiberoptic endoscopic examination of swallowing and sensory test (FEESST)], because BSE detects only dysphagia with overt symptomatology, whereas instrumented methods may detect cases that are asymptomatic but will not necessarily cause a problem for the patient.

Physiology and Symptomatology of Dysphagia

This section will focus on the physiological dysfunctions and the particular symptoms demonstrated by patients with dysphagia due to neurologic disease. While the Epidemiology section of this report focuses on the overall occurrence of dysphagia in each of several neurologic disorders, this section will center specifically on what complaints these patients have, as well as what particular abnormalities in physiological processes during oropharyngeal dysphagia are present during instrumented exam (usually VFSS or MBS).

Physiology: Background

Table 2. The Oral and Pharyngeal Stages of a Normal Swallow

Table 2a. Oral Stage
Function | Results
Tongue progressively elevates anteriorly to posteriorly. | Carries material through the oral cavity.
Groove is created on the midline of the tongue. | Guides material down the tongue.
Material reaches the anterior faucial arch region. | The pharyngeal swallow is triggered, which ends the oral stage of the swallow.

Table 2b. Pharyngeal Stage
Function | Results
Velopharyngeal musculature elevates, closes nasal cavity, and approximates the pharyngeal wall. | Prevents material from entering the nasal cavity.
Tongue base retracts toward the posterior pharyngeal wall. | Creates pressure to drive the bolus through the pharynx.
Posterior pharyngeal wall contracts toward the tongue base. | Creates pressure to drive the bolus through the pharynx.
Hyoid elevation occurs. | Lowers the epiglottis and lifts the larynx.
Larynx closes at three levels: true vocal folds, false vocal folds, and base of epiglottis to aryepiglottic folds. | Protects airway.
Larynx elevates and moves forward. | Protects airway; pulls open cricopharyngeal and upper sphincter.
Cricopharyngeal sphincter relaxes and is pulled open. | Allows material to pass into esophagus; pharyngeal stage of swallowing ends.

Although a discussion of the physiology of a normal swallow is beyond the scope of this report, a certain degree of understanding is necessary to discuss abnormalities demonstrated by patients with neurogenic dysphagia. Swallowing is generally discussed as occurring in three stages: oral, pharyngeal, and esophageal (some authors include oral preparatory as a fourth stage) (Hardy and Robinson, 1993). Tables 2a and 2b describe the oral and pharyngeal stages of normal swallowing.

Swallowing is a complex function that involves several nerve and muscle groups. Six of the 12 cranial nerves and 4 cervical nerves play intrinsic roles in the swallowing process; they mediate communication between the brain stem and the muscles of the oral and pharyngeal regions to perform a complex set of processes necessary for a safe swallow. The muscles of the oral and pharyngeal region send messages to the brain stem reticular formation, medullary integrative center, and frontal cortex through the 7th (facial), 9th (glossopharyngeal), and 10th (vagus) cranial nerves; messages are received from the brain through the 9th, 10th, and 12th (hypoglossal) cranial nerves (Bass and Morrell, 1992; Lugger, 1994); these nerves also innervate muscles involved in swallowing. In addition to these four cranial nerves, two other cranial nerves (trigeminal and spinal accessory) also innervate muscles used in chewing and swallowing, while four cervical nerves are integral to the pharyngeal phase of swallowing. Thus, damage to any of these nerves or muscles, or lesions in the brain stem, medulla, or cortex, can lead to disorders of swallowing.

Neurogenic Disorders of Oropharyngeal Swallowing

As discussed above, a safe swallow requires an intact and functional nervous system. Thus, swallowing disorders commonly occur in many neurologic disorders. The physiologic mechanisms of the most common oropharyngeal swallowing disorders that are often neurogenic in origin are discussed briefly below (Logemann, 1983c). The frequency with which each of these occurs in specific neurologic diseases is discussed in the subsequent section, Swallow Disorders of Neurologic Diseases.

Delayed pharyngeal swallow

After the bolus passes through the mouth and approaches the entrance to the pharynx, it pools briefly at the valleculae, a crevice between the base of the tongue and the epiglottis. A swallow reflex can be triggered at any point when the bolus is between the anterior faucial arches and the point where the tongue base crosses the lower rim of the mandible (Logemann, 1998a). Until this reflex is triggered, the larynx remains open and material can enter the airway. In patients experiencing a delayed pharyngeal swallow (or vallecular stasis; the most common swallow disorder in many neurologic patients, including stroke), the chances of aspiration are increased. The chances are especially high with thin liquids, which flow more easily. If the movement of the material into the airway triggers a cough reflex, the material will be expectorated. However, if the cough reflex is absent (as sometimes occurs in these patients), the material enters the lungs.

Absent pharyngeal swallow

If the pharyngeal swallow is completely absent, material will fill the valleculae, overflow, and spill into the airway. Again, the result is expectoration if the cough reflex is functional, and entry into the lungs if it is not.

Inadequate velopharyngeal closure

Velopharyngeal muscles elevate during a swallow to protect the nasal cavity so that materials are not regurgitated through the nasal passages. If the closure of this cavity is not complete, material may enter. This is not usually severe enough to cause serious nasal regurgitation but may warrant diet modification.

Reduced pharyngeal wall contraction

When the pharyngeal swallow stage is initiated, the base of the tongue retracts toward the pharyngeal wall, and the pharyngeal wall contracts toward the base of the tongue. The pharyngeal constrictor muscles squeeze, and this contraction helps propel the bolus through the pharynx. Reduction in this squeezing action decreases the efficiency of bolus transport and slows the pharyngeal transit time (time from faucial arches to esophagus). Residual material may be left between the valleculae and pyriform sinus after the bolus has entered the esophagus. If the residue is excessive, material may fall into the airway after the swallow is completed when the larynx reopens to resume respiration. Adequate laryngeal sensation, if present, will prevent this. If both pharyngeal function and laryngeal sensation are impaired, silent aspiration will occur. This symptom is often found in patients with neuromuscular diseases.

Unilateral pharyngeal paralysis

In some patients with neurologic disorders, especially those with a unilateral cerebral infarct, the muscles on one side of the pharynx do not function, and, therefore, material that passes down that side of the pharynx may not clear into the esophagus, thus increasing the risk for post-swallow aspiration. If patients with this disorder tilt their heads toward the nonparalyzed side or turn their heads toward the paralyzed side, the bolus material will flow down the nonparalyzed side, perhaps decreasing risk of aspiration.

Cricopharyngeal dysfunction

The cricopharyngeus muscle is normally in a state of tonic contraction, so that the individual does not breathe air into the esophagus; it relaxes only during a swallow, when elevation and anterior movement of the larynx pull the sphincter open. If the cricopharyngeus does not relax or relaxes too early or too late, the bolus will remain in the pyriform sinus(es) and potentially overflow into the trachea. The mechanism for cricopharyngeal hypertonicity is not well understood.

Reduced laryngeal elevation

Laryngeal elevation closes the airway and prevents aspiration of material into the trachea. Reduced laryngeal elevation results in aspiration during the swallow, not as an after-effect as with pharyngeal paralysis or reduced peristalsis.

Reduced laryngeal closure

Laryngeal closure prevents aspiration of materials into the trachea. It normally occurs at three different levels: the true vocal folds, the false vocal folds, and the epiglottis and aryepiglottic folds. If any of these levels is affected by neurologic damage, the larynx may remain open and aspiration can occur during the swallow as the bolus passes through the larynx. It is not an after-effect of the swallow.

Summary

There are several anatomical levels at which neurogenic swallow dysfunction can occur during oropharyngeal swallowing. These specific dysfunctions are important in that they can cause aspiration of materials into the lungs. Aspiration can occur before, during, or after the swallow:

Before: results from poor tongue bolus control, resulting in material spillage, or from a delayed or absent swallow reflex

During: results from reduced laryngeal elevation or closure

After: results from reduced pharyngeal contraction, unilateral or bilateral pharyngeal paralysis, reduced closure of the airway entrance, cricopharyngeal dysfunction, or reduced tongue base motion.
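
The summary above can be read as a lookup from a specific dysfunction to the phase of the swallow in which aspiration occurs. The following is a minimal Python sketch of that lookup; the names ASPIRATION_TIMING and aspiration_phases are illustrative assumptions, not part of the report, and the entries only restate the list above.

```python
# Illustrative lookup restating the summary list above.
# The structure and names are hypothetical, not taken from the report.
ASPIRATION_TIMING = {
    "before": [
        "poor tongue bolus control",
        "delayed swallow reflex",
        "absent swallow reflex",
    ],
    "during": [
        "reduced laryngeal elevation",
        "reduced laryngeal closure",
    ],
    "after": [
        "reduced pharyngeal contraction",
        "unilateral pharyngeal paralysis",
        "bilateral pharyngeal paralysis",
        "reduced closure of the airway entrance",
        "cricopharyngeal dysfunction",
        "reduced tongue base motion",
    ],
}


def aspiration_phases(dysfunction):
    """Return the phases (before/during/after) listed for a dysfunction."""
    key = dysfunction.strip().lower()
    return [phase for phase, items in ASPIRATION_TIMING.items() if key in items]


print(aspiration_phases("Cricopharyngeal dysfunction"))
```

A dysfunction that does not appear in the summary returns an empty list rather than a guess.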

Swallow Disorders of Neurologic Diseases

Each of the particular dysfunctions discussed above may occur at varying rates in patients with different neurologic disorders, depending on the region(s) of the central nervous system affected. The frequency of these disorders as detected by diagnostic instrumentation such as VF and fiberoptic endoscopy (FE) will be discussed. Most studies thus far have been conducted using VF. Of interest here are recuperative disorders (stroke), in which patients usually recover most function spontaneously over time (depending on the severity), and degenerative disorders (e.g., Parkinson's, Alzheimer's, multiple sclerosis), in which most people's neurologic functioning deteriorates over time.

Stroke

A stroke can be termed a recuperative neurologic disorder, because most patients will gradually recover at least some functions that were impaired or lost at the time of the event, depending on the severity of the stroke. Dysphagia, which occurs in 20 to 90 percent of stroke patients (as discussed in the Epidemiology section of this report), depending on the diagnostic method and criteria used, has been found to gradually disappear in most patients, so that 6 months after the stroke very few stroke patients still demonstrate any major dysfunction (Barer, 1989; Smithard, O'Neill, England et al., 1997).

Delayed triggering of the swallow reflex has been the most often reported feature of dysphagia in stroke patients, occurring in up to 91 percent of patients within 3 months after stroke (Horner and Massey, 1988; Veis and Logemann, 1985). Reduced pharyngeal wall contraction is also common, as is reduced lingual control. Most patients exhibit more than one swallowing dysfunction; delayed pharyngeal swallow is most often seen in conjunction with reduced pharyngeal wall contraction (Veis and Logemann, 1985).

Pharyngeal transit time has been demonstrated on VF to be significantly increased in stroke patients as opposed to normal controls (Johnson, McKenzie, Rosenquist et al., 1992; Robbins, Levine, Maser et al., 1993). This can result in pooling of residual material in the pyriform sinus, which may then be aspirated into the lungs postswallow.

Laryngeal sensory deficit, as measured by fiberoptic sensory testing, has been found significantly more often in stroke patients without dysphagic complaints than in normal controls, with stroke patients demonstrating a higher sensory threshold response to an air pulse. There are, therefore, silent sensory deficits in this population that may be predictive of aspiration pneumonia (Aviv, Sacco, Thomson et al., 1997), although this is not firmly established.

One study found that patients with right hemispheric strokes demonstrated longer pharyngeal response duration than patients with left hemispheric strokes and a higher incidence of aspiration. In addition, patients with anterior lesions showed greater dysfunction in transit duration than patients with posterior lesions (Robbins, Levine, Maser et al., 1993). However, other studies have reported no correlation between lesion location and symptomatology (Johnson, McKenzie, Rosenquist et al., 1992; Veis and Logemann, 1985).

Parkinson's disease

Dysphagia has been demonstrated in 63 to 81 percent of patients with Parkinson's disease (see the subsection entitled Epidemiology and Appendix A for more details). Delayed swallow reflex with vallecular pooling, as in stroke, is the most commonly reported disorder, occurring in nearly all patients with dysphagia in some studies (Ali, Wallace, Schwartz et al., 1996; Bushmann, Dobmeyer, Leeker et al., 1989; Robbins, Logemann, and Kirshner, 1986). Pooling at the pyriform sinuses is also very common (up to 80 percent of dysphagics). Reduced pharyngeal wall contraction has been reported in 40 to 65 percent of patients (Ali, Wallace, Schwartz et al., 1996; Blonsky, Logemann, Boshes et al., 1975).

Oral-pharyngeal transit time has been demonstrated to be significantly longer in patients with Parkinson's disease than in controls (Blonsky, Logemann, Boshes et al., 1975; Nilsson, Ekberg, Olsson et al., 1996; Robbins, Logemann, and Kirshner, 1986). The severity of dysphagia has not been correlated with severity of disease (Ali, Wallace, Schwartz et al., 1996; Fuh, Lee, Wang et al., 1997; Nilsson, Ekberg, Olsson et al., 1996).

Alzheimer's disease

Many of the feeding problems in patients with Alzheimer's disease come as a result of the dementia; patients lose comprehension necessary for self-feeding and require assistance or cues to complete this process (Priefer and Robbins, 1997; Volicer, Seltzer, Rheaume et al., 1989). These are not dysphagia problems. Few researchers have documented specific dysphagia disorders in patients with Alzheimer's disease, possibly due to the difficulty in communication required to apply a diagnostic test such as VF.

One study on 25 patients with Alzheimer's disease reported that 84 percent demonstrated swallowing abnormalities on VF (Horner, Alberts, Dawson et al., 1994); the most common disorders reported were delayed swallowing reflex, followed by hesitancy of oral preparation, and deficient pharyngeal clearance (pooling of residue in the pyriform sinus). Six patients aspirated (24 percent).

One other study reported on measures of swallow duration. These researchers reported that patients with Alzheimer's disease demonstrated significantly longer oral transit duration with solids, pharyngeal response duration, and total swallow duration with thin liquids. In elderly patients with Alzheimer's disease, additional dysfunctions were noted, such as delayed pharyngeal swallow on solids and longer pharyngeal response duration and total swallow duration with solids (Priefer and Robbins, 1997).

No study has demonstrated a correlation between severity of disease and severity of dysphagia; one study found a trend between occurrence of aspiration and severity of disease (Horner, Alberts, Dawson et al., 1994).

Other neurologic disorders

Literature on the nature of dysphagia in other neurologic disorders is scarce. One study reported on dysphagia symptoms in patients with motor neuron disease (Leighton, Burton, Lund et al., 1994); the most common symptom reported in 70 untreated patients was poor bolus formation, followed by dysfunction of the cricopharyngeus musculature. However, these researchers may have selectively chosen patients with particular disorders to report. Correlation between disease severity and severity of dysphagia was not reported, but researchers did observe a correlation between the type of motor neuron disease and the frequency of dysphagia, with bulbar palsy most often demonstrating dysphagia, followed by progressive muscular atrophy, and then amyotrophic lateral sclerosis patients.

Two studies have been published on dysphagia in patients with Huntington's disease (Kagel and Leopold, 1992; Leopold and Kagel, 1985). Both indicated that oral bolus retention (squirreling) was a commonly observed disorder. Impaired bolus formation and voluntary swallow initiation were reported to occur in all subjects in one study (Leopold and Kagel, 1985); reduced pharyngeal wall contraction was also common. The other study reported pooling in pyriform sinuses, supraglottic penetration, and delayed pharyngeal swallow (Kagel and Leopold, 1992). Both studies reported that hyperextension of head and neck during swallow contributed to many of these problems, including some cases of aspiration. Correlation between disease severity and dysphagia severity was not reported in these studies.

One study reported dysphagia dysfunctions in patients with progressive supranuclear palsy (Litvan, Sastry, and Sonies, 1997). Swallow duration was significantly longer in such patients than in controls. Specific commonly occurring disorders noted were delayed pharyngeal swallow in 70 percent of patients, impaired tongue motility and pooling in the valleculae in 63 percent, and premature dripping into the pharynx in 59 percent. A significant correlation existed between severity of disease (as measured by the Hoehn and Yahr Scale) and severity of dysphagia.

Clinical Symptoms and Reported Complaints

Table 3. Detectable Symptoms of Oral-Pharyngeal Dysphagia During a Physical Exam or BSE

Neurogenic symptoms: oral
Drooling; Difficulty chewing; Difficulty initiating swallow; Dry mouth; Compensatory strategies; Retention of food in lateral sulcus, anterior sulcus, beneath tongue, or hard palate; Spitting food out; Slowed oral transit; Excessive lingual movement

Neurogenic symptoms: pharyngeal
Throat clearing; Wet voice; Hoarseness; Nasal regurgitation; Coughing; Choking; Laryngospasm; Pneumonia; Stridor; Feeling food catch in throat; Excessive saliva or mucus; Delayed elevation of hyoid bone

A variety of symptoms manifest themselves as a result of the above-mentioned physiologic abnormalities. Table 3 lists some of the general clinical signs and symptoms of swallowing disorders readily detectable during a physical exam or a BSE. It is also necessary for the clinician to talk to the patient and ask questions, because the patient is often aware of problems he or she is having. The patient may report an inability to chew, a feeling of food catching in his or her throat, problems in holding food in the mouth, collection of food low in the throat, or coughing or choking before or during the swallow (Logemann, 1983a).

Some of these observable symptoms may be indicative of a more serious problem, such as aspiration. The Results section discusses the predictive value of some of these signs and symptoms in predicting aspiration and pneumonia.

Morbidity and Mortality Resulting from Dysphagia

Dysphagia itself is not a life-threatening disorder; it is the morbidities that may arise as a result that can potentially lead to death. Dysphagia alone, however, can seriously affect the QOL of the patient. This section discusses the serious morbidities that can result from dysphagia or aspiration.

Malnutrition

Patients unable to ingest food safely will not eat and therefore will not be able to maintain a healthy weight and nutritional status. Malnutrition can cause weakening of the immune system, leaving the patient susceptible to viral and bacterial illness (Chandra, 1990). This condition is only relevant if the dysphagia is not diagnosed early; once dysphagia is diagnosed, steps are taken to ensure that a patient is able to ingest food and/or drink. If the condition is serious enough that malnutrition or dehydration is feared, the patient may be placed on a feeding tube, and then malnutrition is unlikely. Some cases of dysphagia, however, may not be detected until malnutrition has already occurred, and then interventions must be made to alleviate this condition before serious resultant morbidity occurs.

However, a causative relationship between untreated dysphagia and malnutrition has not yet been conclusively established. It has been explored in only four published studies (Davalos, Ricart, Gonzalez-Huix et al., 1996; Keller, 1993; Keller, 1995; Thomas, Verdery, Gardner et al., 1991), and results have not been consistent. More research is required in this area to determine whether malnutrition is in fact an outcome of importance in this field.

Dehydration

Because many patients with neurologic impairments have difficulty swallowing thin liquids, dehydration is a risk that must be addressed. However, because intravenous drip is commonly used in hospitalization regardless of the reason for admission, the actual risk of dehydration in patients with dysphagia is difficult to assess.

Pneumonia

Pneumonia can result if certain types of substances are aspirated into the lungs of susceptible individuals. Pneumonia of this etiology is termed aspiration pneumonia, although this specific diagnosis is difficult to make without invasive measures (Limeback, 1998). Therefore, clinicians often make the inference that pneumonia is due to aspiration if aspiration is also observed in the patient.

The terms aspiration pneumonitis and aspiration pneumonia, although synonymous (Dorland's Illustrated Medical Dictionary, 1994), are not always used in a consistent manner in the literature. Both terms encompass pulmonary consequences of acute aspiration of stomach contents, as well as chronic aspiration of oral microbes, food, and drink (Bartlett and Gorbach, 1975). The former may or may not have a microbial component, as stomach acid alone can in some cases lead to serious morbidity and possibly death, although microbes of oral origin are usually associated with gastric contents (Bartlett, 1974; Bartlett and Gorbach, 1975; Smith, 1927; Smith, 1928). Aspiration of stomach contents is a catastrophic event with immediate and possibly fatal results, both from the stomach acid and the oral microorganisms that are found in the stomach. This type of aspiration is frequently associated with patients of all ages who experience an acute aspiration event because of temporary swallow function disruption from anesthesia, alcohol, or drug intoxication (Awe, Fletcher, and Jacob, 1966; Bynum and Pierce, 1976; DeMeester, Bonavina, Iascone et al., 1990; DePaso, 1991; Greenfield, Singleton, McCaffree et al., 1969; Jorgensen, Byer, and Gould, 1989; LoCicero, 1989; Wynne, 1982). We do not address this type of aspiration in this report. In addition, people of any age who chronically reflux stomach contents are at increased risk for aspiration pneumonia. We will not directly address this condition either, except to point out that this condition is more likely to lead to pneumonia in people with other dysphagic symptoms than in those with an otherwise normal swallow. Our decision not to address the relationships between chronic GER and dysphagia or aspiration pneumonia is not because of their lack of importance. Rather, these relationships overlap with the very large gastroenterology literature on GER, and it would be impractical to expand our assessment into that area. 
We also do not address acute choking episodes, known as restaurant aspiration. While people with dysphagia are clearly more susceptible to this phenomenon, which may be fatal, this type of event is directly observable and does not require diagnostic tests. While choking in some patients with dysphagia may be prevented with prior diagnosis and treatment of dysphagia, an event in progress cannot be treated except by immediate removal of the object with the Heimlich maneuver or instruments capable of directly grasping the object (Ekberg and Feinberg, 1992).

In this report, we will be concerned with aspiration pneumonia caused by the repeated aspiration of either oral secretions or food and drink. People who aspirate repeatedly as a result of oral-pharyngeal dysphagia will invariably aspirate the microbes in oral secretions, and, if fed orally, will aspirate food and drink. This can lead to two different, but overlapping, pulmonary conditions. There are descriptions in the literature of the consequences of chronic aspiration of food and drink into the lungs (Emery, 1960; Pinkerton, 1928; Vidyarthi, 1967), and Schwartz (1973) introduced the term asylum pneumonitis to describe the course of institutionalized patients who progress from mental symptoms that affect feeding abilities at admittance to pulmonary lesions. Autopsy studies indicated a predominance of legume material in the lungs of these patients (Schwartz, 1973). It is not entirely clear from these studies, but it seems possible that the sheer mechanical obstruction by these substances, and the deposits that developed around them, could have caused serious morbidity and death, even in the absence of microbial manifestations. However, it seems likely that such massive amounts of food aspirate would be accompanied by sufficient oral fauna to also cause microbial pulmonary symptoms. The aspiration of food and drink can potentially be prevented entirely by therapy to prevent aspiration or, if that is not possible, by parenteral or enteral feeding accompanied by the elimination of oral intake (NPO).

It has been known for about 70 years that microorganisms typically found in the lungs of elderly and frail pneumonia patients are the same as those found in oral secretions and that derive from the gingiva (Bartlett, 1974; Bartlett and Gorbach, 1975; Palmer, 1987; Smith, 1927; Smith, 1928). Recent research continues to support the importance of the relationship between oral hygiene, dysphagia, and aspiration pneumonia (Langmore, Terpenning, Schork et al., 1998). It may have been underappreciated that, although parenteral and enteral feeding can prevent the aspiration of food and drink, these interventions alone cannot prevent aspiration of oral secretions. This may partially explain the disappointing results of such interventions in the prevention of aspiration pneumonia that we will discuss in later sections. Much of this type of aspiration may also not be prevented by diet modification or therapy by SLPs although it is theoretically possible that some of this type of aspiration could be minimized or prevented in patients who can learn and consciously use positional maneuvers to safely swallow their own secretions, or whose swallowing function can be improved with skill-building and strengthening exercises.

Aspiration pneumonia from the repeated aspiration of oral secretions, food, and drink due to neurogenic dysphagia is the primary outcome of interest in this assessment. The epidemiology and burden of disease caused by aspiration pneumonia is discussed in the Epidemiology section of this report.

Quality of Life (QOL)

In addition to the hard outcomes discussed above, such as pneumonia and malnutrition, dysphagia causes substantial deficits in QOL. Not only are the physical pleasures of eating disrupted, but important mealtime social interactions are also disrupted (Gustafsson, 1995; Gustafsson and Tibbling, 1991). In addition, for patients capable of laryngeal sensation, the experience of repeated choking and aspiration is accompanied by substantial anxiety and even terror (Gustafsson and Tibbling, 1991). Indeed, some patients describe choking and aspiration as the most traumatizing experience in their lives. QOL, therefore, may be the most important outcome to the patients themselves. We note these issues in the present report, but the literature was not sufficient to allow formal evaluation.

Diagnosis of Oropharyngeal Dysphagia

Oropharyngeal swallowing disorders are diagnosed with three levels of diagnostic methodologies. First, there is a preliminary clinical examination of patients with suspected dysphagia or with a condition considered to put the patient at risk for dysphagia. Second, in some centers, patients with clinically suspected dysphagia are then given a formal BSE by an SLP or other specialist with training in dysphagia diagnosis and treatment. Third, based on the clinical exam or BSE and input from support staff involved in feeding activities, the patient may be referred for an imaging study. We describe these diagnostic steps below. Comparison and evaluation of the diagnostic efficacy of these exams is deferred to the Results section of this report; however, we discuss here some peripheral issues such as the physical limitations and convenience factors involved in these exams. For these latter issues, literature was not sufficient for a formal evaluation.

Clinical and Bedside Exams

Approaches for clinical and BSEs range from an informal clinical evaluation to a very structured assessment involving multiple defined steps. An informal assessment may be carried out by a clinician of any specialty who has a patient who appears to have difficulty swallowing or evidence of aspiration, or who has a condition, such as a stroke, that is believed to place a patient at risk for dysphagia and aspiration. In some cases, such as with comatose or severely impaired patients, the clinical assessment alone will be the basis for the decision to suspend oral feeding. In such cases, patients are fed parenterally (intravenous line) or enterally with an NG feeding tube or a PEG or jejunal (PEJ) tube.

There are several structured clinical exams: various versions of the formal BSE (Splaingard, Hutchins, Sulton et al., 1988), the 3-ounce (50-ml or 50-cc) water test (DePippo, Holas, and Reding, 1992; Wade and Hewer, 1987), the Burke Dysphagia Screening Test (BDST) (DePippo, Holas, and Reding, 1994) (which includes the 3-ounce water test), and others (Fleming, 1987; Gordon, Hewer, and Wade, 1987; Martens, Cameron, and Simonsen, 1990; Sonies, Weiffenbach, Atkinson et al., 1987). There is also a bedside test using a citric acid spray to elicit a reflexive cough (see the section entitled Other Dysphagia Analysis Techniques in Development for further discussion). The simpler versions of these exams may be administered by nurses (Odderson, Keaton, and McKenna, 1995), and the more comprehensive versions may be administered by any clinician with dysphagia training, but are often administered by SLPs.

Most BSEs have three parts. The first part consists of taking a detailed history that accounts for any physical conditions, surgeries, or medications that might contribute to dysphagia. The second part is a physical exam that involves listening to the patient's speech, observing facial expressions, and performing a hands-on physical examination of the mouth and throat. The final part involves observing the patient attempting to swallow various consistencies and sizes of foods and liquids and noting the presence of cough, throat clearing, or change in voice quality following these consistency tests.

A structured BSE requires time and training to administer. The 3-ounce water test is an attempt to minimize these limitations by taking a few of the most salient features of the BSE. It is intended mainly to assess the potential for aspiration. Some are concerned that encouraging patients to drink this much water could be dangerous, if it is aspirated; however, we found no reports of such problems.

All of the above clinical exams have the limitation that they cannot directly examine the pharynx and larynx, and thus cannot directly assess whether aspiration is occurring. Sensory or motor deficits in the larynx may cause the patient to be incapable of volitional or reflexive coughing and unable to even sense that aspiration has occurred. This is known as silent aspiration (Garon, Engle, and Ormiston, 1996). One readily assessable phenomenon, the gag reflex, was assumed in the past to be an indicator of laryngeal competence; however, in recent studies it has not been found to be a sensitive or specific indicator of aspiration (Bleach, 1993). This is understandable, because different cranial nerves control the gag reflex and pharyngeal-laryngeal areas (Bleach, 1993; Kim, Goodhart, Aviv et al., 1998). Research is ongoing into physical signs or symptoms of aspiration that can be assessed at the bedside without instrumentation (see Results section for a full discussion); however, a lack of consensus on such clinical signs and symptoms has been a major impetus toward instrumented exams designed to detect aspiration, as well as provide information on its cause and treatment.
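
The terms sensitive and specific used above have standard definitions: sensitivity is the proportion of patients with aspiration on the reference exam whom the bedside sign correctly flags, and specificity is the proportion of patients without aspiration whom it correctly clears. The following is a minimal Python sketch of that arithmetic; the function name and the counts are hypothetical, invented for illustration only, and are not taken from any study cited in this report.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Compute sensitivity and specificity from 2x2 counts.

    tp: sign present, aspiration on reference exam
    fn: sign absent,  aspiration on reference exam
    tn: sign absent,  no aspiration on reference exam
    fp: sign present, no aspiration on reference exam
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one aspirator and one non-aspirator")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts for illustration only (not data from this report).
sens, spec = sensitivity_specificity(tp=12, fn=8, tn=45, fp=15)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
# prints: sensitivity=0.60, specificity=0.75
```

With these invented counts, the sign would miss 8 of 20 aspirators, which illustrates why a bedside sign with modest sensitivity cannot rule out silent aspiration.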

Instrumented and Imaging Exams
Videofluoroscopic swallowing studies

The VFSS is recognized as the reference standard for assessment of oropharyngeal swallowing, although it has not been shown to be a perfect gold standard. This technique was originally developed for observation of the esophageal region. It was later modified for assessment of the oropharyngeal component of swallowing (Ekberg, 1982; Logemann, 1983b), and is thus called the MBS. The MBS is often carried out in a radiological suite, although portable units are available. Recently, the MBS has begun to be conducted in mobile clinics set up for fluoroscopic exams, or with transportable units that can be taken into facilities lacking an inhouse fluoroscopic unit. The MBS is administered by a team consisting of a radiologist to perform the fluoroscopy, an SLP to assess swallowing function and administer therapy, and often a nurse or paramedical attendant to assist the patient. Administration of radiation is legally limited to medical doctors; however, a radiologist per se is not a legal requirement, and mobile units may use other medical doctors.

The MBS is often administered to the patient in an upright seated position, although the patient's typical eating position can also be used if it is different. The patient attempts to swallow barium-impregnated boluses of different consistencies, progressing from solids to pudding to thick liquids and ending with thin liquid (although some practitioners reverse this order). If aspiration is observed, patient maneuvers or various diet consistencies are tried to see if aspiration can be eliminated or minimized. The exam is ended if unpreventable or dangerous aspiration is observed. The patient is observed from the front and from the side, with the side view being the most useful. Because of limitations on radiation exposure, the entire test usually is conducted within approximately 5 minutes. This exam allows direct observation of not only aspiration and timing, but also of other structural and functional anomalies of the swallow. The dynamic fluoroscopic images are captured on videotape and can be viewed repeatedly and in slow motion following the exam.

While such VFSS are generally considered the most direct and comprehensive method for assessing swallowing function, these exams are not without limitations. In a study involving 11 clinicians and 182 patients with dysphagia in 10 nursing homes on the U.S. East Coast and Midwest, Hageman (Hageman, unpub.) recently found that 55 percent of these patients with dysphagia of typical etiologies were not able to undergo VFSS. The top nine reasons in order were: status too mild, poor patient cooperation, transportation difficulty, status too severe, insufficient oral motor ability, patient was combative, physician denial, payer denial, and patient refusal. Some of these reasons are the results of the healthcare system, and are not inherent limitations of the exam itself. Other authors (Bastian, 1993; Kaye, Zorowitz, and Baredes, 1997; Kim, Goodhart, Aviv et al., 1998; Langmore, Schatz, and Olson, 1991; Spiegel, Creed, and Selber, unpub(a)) have also criticized the VFSS method for the above limitations, as well as for its inability to image soft tissue, observe dry swallows, observe a normal meal throughout its full time course, and its limitation to passive observation of bolus passage without the ability for direct sensory testing in the absence of a bolus.

It is an important aspect of VFSS and other imaging methods that they are not limited to passive observation of signs and symptoms, but instead provide immediate feedback to the patient and clinician in terms of maneuvers, therapies, and diet changes that may prevent or improve aspiration and other abnormal aspects of swallowing. Thus, they can be a major part of the planning, initiation, and monitoring of treatment.

In terms of safety, the radiation dosage involved in modern fluoroscopy is not considered a major concern (Beck and Gayler, 1990; Martin and Hunter, 1994; Potvin, Kaur, and Williams, 1991; Wright, Boyd, and Workman, 1998), particularly for the elderly patients discussed in this assessment. We found no evidence-based reports in the literature of morbidity or deaths caused by aspiration of barium compounds during oral-pharyngeal swallowing studies. There were case reports of two deaths (Gray, Sivaloganathan, and Simpkins, 1989) and one serious morbidity report (Penington, 1993) from barium aspiration; however, all of these appeared to be cases in which lack of attention to the possibility of aspiration allowed aspiration of large quantities of barium typically used in gastrointestinal (GI) tract barium swallows. This seems unlikely in exams intended to assess oral-pharyngeal dysphagia, because they begin with the specific purpose of observing whether there is any aspiration of small initial quantities of barium. Allergic reactions to barium sulfate appear to be quite rare and have been estimated to occur at a rate less than two per million (Muroi, Nishibori, Fujii et al., 1997).

Fiberoptic endoscopy

A more recent development is the FEES (Bastian, 1991; Bastian, 1993; Kidder, Langmore, and Martin, 1994; Langmore, Schatz, and Olsen, 1988; Selkin, 1984). This exam is administered by an otolaryngologist and an SLP; in some states, an SLP alone may administer the exam, while in other states, this exam is beyond the scope of an SLP's practice. It is performed with portable equipment at the patient's bedside. A fiberoptic device attached to a video camera is inserted nasally so that the pharynx and larynx can be observed from above. The image is observed on a small video monitor and is recorded on videotape. Patients are first observed swallowing their own secretions. Next, food and liquid of decreasing viscosity are swallowed. Various postural positions can be tried. Barium is not necessary, but dye is often added to the food to increase visibility (the image is in color). FEES cannot directly observe the oral region during swallowing, and in the pharyngeal and laryngeal regions, it is limited to observations immediately before and after a swallow, because pharyngeal closure obscures the exact moment of swallowing. However, aspiration can be clearly detected before the swallow, and aspiration during the swallow is usually detectable by post-swallow residue in the trachea or by the act of coughing up aspirated material. Because a fiberoptic exam can be carried out over the full course of a meal, fatigue factors can be noted that might be missed on the necessarily briefer VFSS evaluation. Also, the portability and repeatability of FEES allow followup assessment-therapy sessions as needed throughout the course of dysphagia therapy. This not only can provide feedback necessary for changes or reinforcement of therapy but may also allow the discontinuation of ineffective or unnecessary therapy.

The fiberoptic exam is not limited to passive observation. Sensory and reflex information can be obtained by touching the tip of the device to different pharyngeal structures or administering regulated air puffs (Aviv, 1997; Aviv, Martin, Keen et al., 1993; Kim, Goodhart, Aviv et al., 1998). If a pneumatic aspect is involved, the exam is called FEES with sensory testing (FEESST). Advocates of FEES and FEESST believe the sensory measures, particularly bilateral deficits, are prognostic for intermittent aspiration that may not always be observable on short VFSS imaging tests (Aviv, unpub.). Because of the pneumatic equipment involved, FEESST is not as easily portable as FEES. In the present report, we will sometimes use FE as a generic term to refer to all of the above forms of fiberoptic endoscopic exams.

The MBS and FE methods can both provide patient feedback on swallowing performance. The two methods yield overlapping as well as unique information; thus, it is not necessarily a matter of replacing one test with another. The issue is the proper use of both tests. However, some practitioners prefer to view the swallow and any aspiration directly with a VFSS, if this type of exam is available and the patient's condition is suitable.

Repetitive oral suction swallow (the ROSS test)

The ROSS test has only been used in Sweden (Nilsson, Ekberg, and Hindfelt, 1995; Nilsson, Ekberg, Olsson et al., 1998); however, it illustrates the use of simple noninvasive instruments to evaluate swallowing, and addresses oral function, a stage that cannot be well evaluated by either VFSS or FE methods. In the ROSS test, the patient repetitively sucks water through a straw from a glass that sits on a scale. The scale is connected to a strip-chart recorder that plots the removal of the water from the glass. A pressure detector in the straw records the magnitude of the suction through the straw. A Doppler probe is attached to the patient's neck just to the right of the midline at the level of the cricothyroid membrane, and it records Doppler shift at the mid-pharyngeal region. At the same level on the left side of the neck, a piezoelectric movement sensor records movement of the larynx. Finally, a temperature sensor placed in one of the patient's nostrils records airflow. These devices provide a simultaneous plot of the magnitude and timing of all of the above functions as the patient attempts to drink 200 ml of water.

Piezoelectric computerized laryngeal analysis (CLA)

In the CLA exam, a miniature piezoelectric strip is taped on the exterior of the neck at the laryngeal prominence. As the patient is asked to dry swallow or to swallow various consistencies and sizes of bolus, the speed, extent, and timing of laryngeal elevation are measured and recorded as a strip-chart readout on a laptop computer. Normal swallow references for all of these measurements are a part of the computer's program, so that instantaneous comparisons to these standards are made, and the chart readout indicates with colors and markers where deviations from normal occur. While the information provided appears to be mostly limited to laryngeal movement, as discussed in the Physiology and Symptomatology of Dysphagia section above, this movement is an integral part of laryngeal protection. Delay, insufficiency, or aberrations of this movement are strong indicators of aspiration and other swallowing problems. Thus, while this exam is not as comprehensive as the MBS or an FE exam, it may provide sufficient information to detect and treat aspiration in some cases.
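The comparison of measured values against stored normal references described above can be sketched in a few lines. This is only an illustration of the general idea, not the CLA device's actual software: the measure names, the reference ranges, and the function `flag_deviations` are all hypothetical, and the numeric ranges are placeholders rather than clinical norms.

```python
# Placeholder reference ranges (NOT clinical norms): (low, high) per measure.
# The three measures mirror the speed, extent, and timing named in the text.
REFERENCE = {
    "speed_mm_per_s": (20.0, 60.0),
    "extent_mm": (15.0, 30.0),
    "onset_delay_s": (0.0, 1.0),
}


def flag_deviations(measured, reference=REFERENCE):
    """Return {measure: 'low' or 'high'} for values outside their range."""
    flags = {}
    for name, value in measured.items():
        low, high = reference[name]
        if value < low:
            flags[name] = "low"
        elif value > high:
            flags[name] = "high"
    return flags


# Hypothetical swallow: normal speed, reduced elevation, delayed onset.
swallow = {"speed_mm_per_s": 35.0, "extent_mm": 9.0, "onset_delay_s": 1.8}
flags = flag_deviations(swallow)
```

In this hypothetical swallow, the reduced extent and delayed onset are flagged while the speed is not, which corresponds to the markers the readout is described as showing where deviations from normal occur.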

As with FEES, the CLA exam can be performed at the bedside, be used for dry swallows, assess swallowing during the entire course of normal meals, and be carried out repeatedly throughout the course of treatment to evaluate results. There is a less expensive and simpler version of the CLA that substitutes a sensing and signaling device for the computer readout; a patient's acceptable swallowing parameters can be programmed into the device so that it signals when the patient is not achieving adequate parameters. It can thus be used to monitor swallowing on a long-term, ongoing basis by the patients themselves or by minimally trained support staff and family members in nursing facilities and at home.

CLA is a relatively new technology, but a small, unpublished study compares it with VFSS for prediction of the important patient outcome of aspiration pneumonia. Because we did not comprehensively solicit unpublished studies from all dysphagia device manufacturers, we did not consider it fair to other manufacturers to present this CLA study in the Results section of this assessment. However, we describe this study here. In spite of this study's small size (16 patients), it was a prospective trial in which patients were each given both tests, and in which the readers of one test were blinded to the results of the other test. The endpoint in this trial was pneumonia, and patients were followed for 6 months. This would be a strong study design if patients positive for aspiration on either test were treated identically. Unfortunately, such was not the case for this study, because treatment was based only on the VFSS results. The results of CLA were not used in patient management. The fact that some patients were treated, which likely prevented some cases of pneumonia, means that sensitivity and positive predictive value (PPV) may have been underestimated for both tests. Furthermore, because only the patients who received a positive VFSS test result were treated to prevent pneumonia, the VFSS sensitivity and PPV were likely underestimated to a greater degree. The results of this study cannot be reliably interpreted because of this design flaw; thus, the effectiveness of CLA for predicting patients at risk for pneumonia compared with VFSS is not currently known. However, further research on this new technology, as well as others described below, should be encouraged.
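The direction of this bias can be illustrated with a small worked example for the test that guides treatment. The counts below are entirely hypothetical (they are not data from the CLA study), and the sketch assumes, for illustration only, that treatment prevents a fixed number of the pneumonia cases that would otherwise have occurred among test-positive patients.

```python
def sensitivity(tp, fn):
    """Fraction of patients who developed pneumonia that the test flagged."""
    return tp / (tp + fn)


def ppv(tp, fp):
    """Fraction of test-positive patients who developed pneumonia."""
    return tp / (tp + fp)


def observed_counts(tp, fp, fn, prevented):
    """Reclassify `prevented` true positives as apparent false positives.

    A test-positive patient whose pneumonia is prevented by treatment is
    recorded as test-positive without pneumonia.
    """
    return tp - prevented, fp + prevented, fn


# Hypothetical cohort if nobody were treated: 8 flagged patients would develop
# pneumonia, 4 flagged patients would not, and 2 unflagged patients would.
TP, FP, FN = 8, 4, 2

true_sens = sensitivity(TP, FN)  # 8 / 10
true_ppv = ppv(TP, FP)           # 8 / 12

# Assumption: treatment prevents 4 of the 8 cases among flagged patients.
tp_obs, fp_obs, fn_obs = observed_counts(TP, FP, FN, prevented=4)
obs_sens = sensitivity(tp_obs, fn_obs)  # 4 / 6
obs_ppv = ppv(tp_obs, fp_obs)           # 4 / 12
```

Under these assumptions, both the observed sensitivity and the observed PPV fall below the values the test would have shown in an untreated cohort, because every prevented case is counted as an apparent false positive.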

Other dysphagia analysis techniques in development

All of the above methods had at least one study that attempted to assess the impact of the diagnostic method on aspiration pneumonia frequency. Although the lack of comparative or controlled studies does not allow us to formally assess other instrumented methods in this evidence report, brief descriptions of some of these other methods are worthwhile. Among them are manometry, manofluorography, electromyography (EMG), electrical impedance tomography, scintigraphy, pulse oximetry, respiratory pattern analysis, echoplanar magnetic resonance imaging (MRI), cervical auscultation, and ultrasound.

EMG measures electrical activity of muscles. This activity increases during muscular contraction. EMG can be used to monitor the activity of the muscles involved in swallowing, particularly the middle pharyngeal constrictor and the cricopharyngeus (Hanson, Lawson, and Remacle, 1995). Lack of coordination of contraction of these two muscles is taken to be indicative of a swallowing disorder. Other abnormalities include increased resting contraction or decreased activity of one or the other of these muscles.

Manometry is used to measure pharyngeal or esophageal motility. A pressure-sensing device, or manometer, is inserted into the lumen, and the pressure exerted by the walls on various parts of the manometer during swallowing is measured. The timing of muscular contraction is noted. The presence or absence of peristalsis can be measured, as well as the timing of peristalsis (Johnston, Collins, McFarland et al., 1993). These measurements are influenced by the consistency of the bolus, body position, manometer placement, and gastric fullness, as well as the presence of any swallowing disorder. The utility of this method may be limited by difficulties in ascertaining the exact position of the manometer unless performed simultaneously with VF (Ergun, Kahrilas, and Logemann, 1993). Combining the two techniques (manofluorography) allows coordinated analysis of bolus transport and pharyngeal or esophageal pressure (Olsson, Nilsson, and Ekberg, 1994).

The citric acid cough test can be used to measure the cough reflex. The patient irritates his or her throat by inhaling nebulized citric acid dissolved in saline (Sekizawa, Ujiie, Itabashi et al., 1990). Greater concentrations of citric acid produce greater irritation. The concentration of citric acid required to produce cough is determined. Higher concentrations indicate a diminished cough reflex.
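The threshold determination described above can be sketched as follows. The sketch assumes a simplified protocol in which each concentration is tried once and the threshold is the lowest concentration that elicited a cough; the function name `cough_threshold`, the concentration values, and this definition of threshold are illustrative assumptions, not the procedure of the cited study.

```python
def cough_threshold(trials):
    """Return the lowest concentration at which a cough was observed.

    `trials` is a list of (concentration, coughed) pairs. Returns None when
    no tested concentration produced a cough.
    """
    positive = [conc for conc, coughed in trials if coughed]
    return min(positive) if positive else None


# Hypothetical ascending series (arbitrary concentration units).
trials = [(0.5, False), (1.0, False), (2.0, True), (4.0, True)]
threshold = cough_threshold(trials)
```

A patient whose threshold falls at a higher concentration than another patient's would, by the reasoning in the text, have the more diminished cough reflex of the two.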

Oximetry has been used to indicate the occurrence of aspiration. Aspiration of food or liquid is thought to cause reflex bronchoconstriction and can cause a rapid alteration in the degree of oxygen saturation of arterial blood, which can be measured using a finger probe attached to a pulse oximeter (Zaidi, Smith, King et al., 1995). Simultaneous measurement of arterial oxygen saturation and assessment of dysphagia using VF suggest that pulse oximetry may detect the presence or absence of silent aspiration (Collins and Bakheit, 1997). Although there are several studies on pulse oximetry, none have addressed whether use of this technique has an impact on patient pneumonia rates and, for this reason, we do not formally assess this technology.

Respiratory inductance, respirometry, plethysmography, and nasal thermistor airflow recording are other methods of detecting cardiopulmonary adaptation during and after swallowing (Nilsson, Ekberg, Bulow et al., 1997; Rogers, Msall, and Shucard, 1993).

Scintigraphy during the swallowing of a radionuclide may detect aspiration (Muz, Mathog, Miller et al., 1987). This method allows determination of flow dynamics and quantification of the amount of liquid aspirated (Hamlet, Muz, Farris et al., 1992).

Critically ill patients are often fed a high-glucose formula through NG tubes. If this formula is aspirated, glucose will be found in the trachea. When these patients have a tracheostomy or translaryngeal intubation, the presence of glucose in the tracheal secretions can easily be assayed (Winterbauer, Durning, Barron et al., 1981).

Echoplanar magnetic resonance imaging, a form of MRI, allows superior temporal resolution compared with conventional MRI (Gilbert, Daftary, Woo et al., 1996). This rapid imaging may enable the clinician to distinguish the rapid events of swallowing better than conventional MRI.

Respiratory patterns during swallowing can be measured electronically (Selley, Flack, Ellis et al., 1989a). Sounds produced during swallowing and breathing can be recorded and timed. In neurologically impaired patients, these patterns differ from those of healthy controls (Selley, Flack, Ellis et al., 1989b). Use of this technique, the Exeter dysphagia assessment technique (EDAT), to assess oropharyngeal motor and sensory function may aid in diagnosis of dysphagia (Selley, Flack, Ellis et al., 1990). Simultaneous recording of swallowing behavior using VF and EDAT allows events recorded by VF to be related to events in the respiratory pattern as recorded by EDAT (Selley, Ellis, Flack et al., 1994).

Listening to the patient's breathing patterns through direct, stethoscopic auscultation may enable the physician to detect unusual sounds such as bubbling, which may indicate aspiration (Zenner, Losinski, and Mills, 1995).

The thickness of the esophageal wall and the width of the lumen can be measured using ultrasonography (Sobin, Nathanson, and Engstrom, 1996). Patients with esophageal motility disorders have been reported to show a prolonged swallowing time and widened liquid- and air-filled lumina. This method of measurement may prove useful in the diagnosis of dysphagia. Ultrasonography can also be used to detect abnormal tongue movements during swallowing that might be associated with dysphagia (Wein, Bockler, and Klajman, 1991). Abnormal movements of the hyoid bone and associated muscles can be detected using ultrasound duplex-Doppler imaging (Sonies, Wang, and Sapper, 1996).

Electrical impedance tomography measures conductivity of tissues. Pharyngeal conductivity changes during swallowing. Measuring the timing of this conductivity change may allow measurement of the speed and timing of a swallow (Hughes, Liu, Griffiths et al., 1996).

Summary of Instrumented Diagnostic Tests

Table 4. Reported Advantages and Disadvantages of Each Instrumented Diagnostic Test
Columns: Test; Reported advantages; Reported disadvantages

Technologies supported by published clinical trials measuring pneumonia as the outcome

VFSS
Reported advantages:
  • Visualizes frequency and amount of aspiration.
  • Visualizes physiologic variables such as timing of swallow and paralysis.
  • Can test several amounts and viscosities of boluses.
  • Can test therapies during the diagnostic exam.
Reported disadvantages:
  • Not portable.
  • Requires patient to be able to sit upright and understand instructions.
  • Time limitation: cannot observe entire meal.
  • Cannot image soft tissue.
  • Cannot examine dry swallows.
  • Cannot measure accumulated secretions in laryngopharynx.
  • Only analyzes motor component of swallow.
  • Small radiation risk.

FEES
Reported advantages:
  • Portable to bedside of patient.
  • Can document entire course of meal to record fatigue effects.
  • No radiation risks.
  • Can assess sensory and reflex status of pharynx through endoscopic touch.
Reported disadvantages:
  • Cannot visualize exact moment of swallow or aspiration due to white out.
  • Cannot visualize oral region.

FEESST
Reported advantages:
  • Tests sensory aspects of swallow through quantitative pneumatic measures: identifies patients who may experience aspiration due to lack of laryngeal sensation.
  • Can document entire course of meal to record fatigue effects.
  • No radiation risks.
Reported disadvantages:
  • Not as easily portable as FEES.
  • Cannot visualize exact moment of swallow or aspiration due to white out.
  • Cannot visualize oral region.

Technologies supported by other clinical trials

Manometry
Reported advantages:
  • Allows quantitative analysis of pressure dynamics within the oropharynx.
  • Detects elevated resting pressure indicative of gastroesophageal reflux.
  • No radiation risks.
Reported disadvantages:
  • Variability precludes intercenter comparison of absolute measurements.
  • Does not measure the axial component of pharyngeal contraction.
  • Perfusion manometers may not register rapid pressure changes.
  • Does not provide information on the degree of esophageal sphincter opening.

Manofluorography
Reported advantages:
  • Allows quantitative analysis of pressure dynamics within the oropharynx.
  • Detects elevated resting pressure indicative of gastroesophageal reflux.
  • Fluorography allows more precise control of manometer placement, reducing variability of measurements.
  • Allows coordinated analysis of bolus transport and intraluminal pressure changes.
Reported disadvantages:
  • Perfusion manometers may not register rapid pressure changes.

EMG
Reported advantages:
  • No radiation risks.
  • Surface EMG is noninvasive.
Reported disadvantages:
  • There is often no correlation between electrical events on the pharyngeal muscles and intraluminal pressure changes.
  • Needle electrodes may cause patient discomfort.
  • Surface EMG cannot distinguish the contribution of individual muscles to the signal.

CLA and other piezoelectric movement detectors
Reported advantages:
  • Portable to bedside of patient.
  • No radiation risks.
  • Noninvasive.
  • Requires little patient cooperation.
  • Can document entire meal.
Reported disadvantages:
  • Aspiration cannot be viewed directly, but rather is inferred by the degree of oral/laryngeal dysfunction.
  • Cannot measure accumulated secretions in laryngopharynx.

Pulse Oximetry
Reported advantages:
  • Portable to bedside of patient.
  • No radiation risks.
  • Noninvasive.
  • May detect silent aspiration.
Reported disadvantages:
  • Older patients, smokers, and patients with chronic lung disease may confound test results.

Scintigraphy
Reported advantages:
  • Enables detection and quantification of small amounts of aspirated material.
  • Allows determination of flow dynamics.
Reported disadvantages:
  • Cannot examine dry swallow.
  • Radiation risk.

Technologies supported by published initial demonstrations

Citric Acid Cough Test
Reported advantages:
  • Measures cough reflex.
Reported disadvantages:
  • Only measures cough reflex.

ROSS Test
Reported advantages:
  • Noninvasive.
  • Provides objective information describing the oral phase of swallow.
  • Portable to bedside of patient.
  • No radiation risks.
Reported disadvantages:
  • Piezoelectric transducer is subject to distortion from patient movement.
  • Piezoelectric transducer measures only extent, but not direction of movement.

Tracheal Glucose Assay
Reported advantages:
  • Detects aspirated feeding formula.
Reported disadvantages:
  • Only useful in tube-fed patients.

Respirometry
Reported advantages:
  • Respirometry has been reported only as a supplement to VFSS. It therefore shares the advantages and disadvantages of this technique.
Reported disadvantages:
  • Nasal obstruction may hamper recording.
  • Respirometry has been reported only as a supplement to VFSS. It therefore shares the advantages and disadvantages of this technique.

Echoplanar MRI
Reported advantages:
  • Provides soft tissue imaging.
  • Enables visualization of the entire swallow.
Reported disadvantages:
  • The adduction-abduction procedure cannot be visualized entirely within a single axial slice.
  • Multislice acquisitions place restrictions on temporal resolution.
  • Limited spatial resolution allows visualization of gross surfaces only.

Stethoscopic Auscultation
Reported advantages:
  • Noninvasive.
  • Portable to bedside of patient.
  • No radiation risks.
Reported disadvantages:
  • No nomenclature exists to describe swallowing sounds.
  • No data are available ascribing specific sounds to specific swallowing events.

EDAT
Reported advantages:
  • Noninvasive.
  • Portable to bedside of patient.
  • No radiation risks.
Reported disadvantages:
  • Provides only limited data and is therefore a supplemental, rather than a replacement technology.
  • May be confounded by patient deficits unrelated to swallow, such as visual impairment.

Ultrasonography
Reported advantages:
  • May detect abnormal tongue and hyoid bone movements.
  • Noninvasive.
  • No radiation risks.
Reported disadvantages:
  • Can be confounded by motion of the patient or the equipment.
  • Doppler imaging requires careful selection of the appropriate wavelengths.

Electrical Impedance Tomography
Reported advantages:
  • No radiation risks.
  • Noninvasive.
  • Portable to bedside of patient.
Reported disadvantages:
  • Poor spatial resolution.
  • Incorrect electrode placement may lead to misinterpretation of the temporal aspects of the signal.
  • May be subject to limitations based on the size and shape of the patient's laryngeal framework.
  • Can be confounded by patient movements unrelated to swallow.
  • Results become less reproducible with decreasing bolus size.

Table 4 summarizes the suggested advantages and disadvantages of each of the three major instrumented diagnostic tests. While the information provided by these tests overlaps, because of their differences they may each detect patients at risk that the other tests do not detect. It may ultimately not be a matter of deciding which test is the best one to use, but which patients should undergo which test based on symptoms. However, evidence is not yet available to selectively guide patients to different diagnostic methods. Similarly, evidence is not yet available to determine whether any single difference in the physiological parameters measured by any of these tests is important. For this reason, we do not attempt to assess whether any particular difference leads to improved patient outcomes.

The Relationship Between Diagnosis and Treatment of Swallowing Problems

A number of studies have attempted to determine the relative diagnostic usefulness of the BSE, 3-ounce water test, MBS and FEES, or FEESST. One outcome of interest is detection of aspiration, because it is believed that chronic aspiration will lead to a higher incidence of pneumonia. One group has reported that a small amount of aspiration may occur in normal subjects during sleep and apparently does not normally lead to pneumonia (Huxley, Viroslav, Gray et al., 1978). However, others have not found aspiration in sleeping or awake normal subjects (Winfield, Sande, and Gwaltney, 1973). It is possible that the lessening of oral secretions that commonly occurs during sleep helps minimize aspiration during sleep. The frequency of aspiration, amount and type of aspirate, and strength of the immune system strongly influence the incidence of aspiration pneumonia (Bartlett and Gorbach, 1975). The trace aspiration possibly experienced by normal people is quantitatively quite different than the aspiration observed on swallowing exams in some patients with dysphagia. In the Epidemiology section, we discuss the fact that patients with dysphagia and/or aspiration are more likely to get pneumonia than nondysphagic and nonaspirating patients. Nevertheless, not all people with dysphagia or aspiration acquire pneumonia. Thus, aspiration is a loosely correlated surrogate measure for untoward outcomes such as incidence of aspiration pneumonia, hospitalization for aspiration pneumonia, and death from aspiration pneumonia. But these latter methodologically preferred outcome measures are measured after therapy and treatment, meaning that the purely diagnostic aspects of dysphagia studies cannot be separated from the effects of treatment. This makes it impossible to distinguish between false-positive pneumonia risk predictions of a diagnostic test and the true-positive predictions that were prevented by treatment. 
Compounding this problem is that treatment to prevent aspiration typically begins during the initial diagnostic test (Linden, 1989; Logemann, 1993). Positional maneuvers and dietary modifications are tried at this time, and their effect on aspiration is observed. If aspiration cannot be prevented, nonoral feeding may be recommended temporarily or permanently.

Because of the relationship between diagnosis and treatment of dysphagia, the relative ability of a particular diagnostic method to assist in preventing pneumonia can only be determined by comparing the outcomes of patients diagnosed with this method with the outcomes of patients diagnosed with another method, and only in a study in which patients from both groups receive similar treatment. This issue is further discussed in the Methodology and Results sections of this report.

Treatment of Oropharyngeal Dysphagia

Noninvasive Therapies

Numerous noninvasive approaches exist for treating oropharyngeal dysphagia, and the method used depends upon the type of underlying disorder and its manifestations. Logemann (1983, 1991, 1994; Logemann and Kahrilas, 1990) classifies these approaches into three categories: compensatory techniques, indirect therapy, and direct therapy.

Compensatory techniques

Compensatory techniques attempt to eliminate the symptoms of dysphagia, but not to change the actual swallow physiology. This is usually accomplished by teaching patients to position their heads and bodies to control the flow of food or liquid, by modifying the consistency and volume of food, and by modifying the rate at which the food is given. Thermal stimulation with special appliances or food, as well as prosthetics, are also sometimes used as compensatory measures (Logemann, 1991, 1994).

Table 5. Postural Techniques for Aspiration Elimination in Dysphagia
Observed disorder (VFSS) | Posture applied | Rationale
Inefficient oral transit | Head back | Utilizes gravity to clear oral cavity
Delayed swallow reflex | Chin down | Widens valleculae to prevent bolus from entering airway
Reduced posterior motion of tongue | Chin down | Pushes tongue base backward toward pharyngeal wall
Unilateral laryngeal dysfunction (aspiration during swallow) | Head rotated to damaged side | Places extrinsic pressure on thyroid cartilage, increasing adduction
Reduced laryngeal closure (aspiration during swallow) | Chin down; head rotated to damaged side | Puts epiglottis in more protective position; narrows laryngeal entrance; increases vocal fold closure by applying extrinsic pressure
Reduced pharyngeal contraction | Lying down on one side | Eliminates gravitational effect on pharyngeal residue
Unilateral pharyngeal paresis | Head rotated to damaged side | Eliminates damaged side from bolus path
Unilateral oral and pharyngeal weakness on the same side | Head tilt to stronger side | Directs bolus down stronger side
Cricopharyngeal dysfunction | Head rotated | Pulls cricoid cartilage away from posterior pharyngeal wall, reducing resting pressure in cricopharyngeal sphincter

Postural techniques are usually used temporarily until the patient's swallow function recovers, or until direct therapy begins to have an effect. Occasionally, patients with extreme neurologic or structural damage must use these techniques permanently to eliminate aspiration. Postural techniques used to alter the flow of food are shown in Table 5.

Table 6. Diet Modification Techniques
Swallowing disorder | Easiest food consistencies | Food consistencies to avoid
Reduced range of tongue motion | Liquid | Thick foods
Reduced tongue coordination | Liquid | Thick foods
Reduced tongue strength | Liquid | Thick, heavy foods
Delayed pharyngeal swallow | Thick liquids and thicker foods | Thin liquids
Reduced airway closure | Pudding and thick foods | Thin liquids
Reduced laryngeal movement; cricopharyngeal dysfunction | Liquid | Thicker, higher viscosity foods
Reduced pharyngeal wall contraction | Liquid | Thick, higher viscosity foods
Reduced tongue base posterior movement | Liquid | Higher viscosity foods

Food consistency alterations commonly used for different dysfunctions are shown in Table 6 (Logemann, 1994).

Indirect therapy

In indirect swallow therapy, the patient is given exercises to improve neuromuscular control over chewing and swallowing (Logemann, 1991). These are especially useful in patients who lack tongue control in any of the following ways (Logemann, 1983b):

  • Lateralization of tongue during chewing.

  • Elevation of tongue to hard palate.

  • Cupping of tongue around bolus.

  • Elevation of tongue against palate to hold bolus.

  • Range of anterior to posterior movement.

  • Coordination of anterior to posterior movement.

Exercises include (Logemann, 1991):

  • Range of motion or resistance exercises for tongue and jaw.

  • Tongue coordination and chewing exercises utilizing a material (such as gauze controlled by the clinician) with which the patient practices movements.

  • Laryngeal adduction exercises.

  • Bolus control exercises: the patient manipulates food or liquid in the mouth without actually swallowing it.

Direct therapy

Direct therapy attempts to change swallow physiology through special swallowing techniques, or medical or surgical management. Three swallowing maneuvers are commonly used to alter physiology during the swallow:

  • Mendelsohn Maneuver: Patients are instructed to feel their larynx elevate during the swallow and attempt to prolong the period of maximal elevation. The rationale for this is that maximal cricopharyngeal opening occurs during maximal elevation of the larynx and hyoid (Logemann and Kahrilas, 1990). This technique is useful for patients with reduced hyolaryngeal movement (Logemann, 1991).

  • Supraglottic Swallow: This technique is used to minimize aspiration. Patients voluntarily hold their breath before and during the swallow, thereby closing the true vocal folds. The patient then coughs when the swallow is completed, to clear any residual material from the pharynx.

  • Super-Supraglottic Swallow: In a variation on the supraglottic swallow, patients apply increased effort to holding their breath before the swallow.

No medical (i.e., pharmaceutical) treatments are known to specifically improve oropharyngeal swallowing in neurologic patients. Drugs given to treat the underlying disease, however, may result in some improved swallow function.

Invasive Therapies

There are several types of surgical interventions to treat oropharyngeal dysphagia, including surgical reconstruction after head or neck surgery, tracheal resection or removal, vocal cord medialization, Teflon, gelfoam, or collagen injection into the vocal folds to facilitate vocal fold closure, and cricopharyngeal myotomy. A discussion of these procedures is beyond the scope of this report.

Enteral Feeding (Feeding Tubes)

A feeding tube is indicated when oral feeding is not safe due to severe aspiration, or for those who cannot chew or ingest enough food to maintain adequate weight and nutrition. In current practice, it is most often used as a last resort.

The two most common types of feeding tubes are NG tubes (NGT) and gastrostomy tubes. NGTs are indicated for short-term use by cognizant patients unlikely to pull the tube out. Gastrostomy tubes are generally for long-term use, because long-term use of NGTs may cause nasopharyngeal erosions, sinus pain, and laryngeal injury (Arrowsmith, 1996).

Percutaneous endoscopic gastrostomy

Percutaneous endoscopic gastrostomy (PEG) involves placing a feeding tube through a small incision in the abdomen and stomach walls. The tube is guided into place with an endoscope. A variant of the PEG is percutaneous endoscopic gastrostomy/jejunostomy (PEG/J), in which the inner end of the feeding tube is extended into the small intestine, to help avoid esophageal reflux of stomach contents. Alternatively, the tube can be placed directly into the jejunum through a percutaneous endoscopic jejunostomy (PEJ).

A common alternative method to endoscopy is insertion under fluoroscopic control. Indications for PEG and its variants include (Ciocon, 1990; Shike, 1995):

  • Severe dysphagia

  • Coma or delirium

  • Persistent anorexia

  • Inability to consume sufficient amounts of food

  • Malabsorption secondary to decreased absorption in the GI tract

  • Repeated aspiration with NG tube

  • Head and neck surgery

  • Physical impairment

  • Hypermetabolic state

  • Massive small-bowel resection.

Contraindications to PEG include (Arrowsmith, 1996; Larson, Fleming, Ott et al., 1983; Liddle and Yuill, 1995):

  • GI obstruction or fistula

  • Nonfeasibility of bringing anterior gastric wall against anterior abdominal wall (such as in morbidly obese patients)

  • Esophageal obstruction

  • Current chest infection

  • Ascites

  • Portal hypertension

  • Active gastric ulcer

  • Total gastrectomy

  • Uncorrected coagulopathy.

The above indications can originate from many diseases. The most common diseases, conditions, and disorders leading to PEG are:

  • CVA

  • Head and neck cancer

  • Head injury

  • Degenerative neurologic diseases: multiple sclerosis, motor neuron disease, Parkinson's disease, Alzheimer's disease

  • Head/neck surgery

  • Malnutrition

  • Neck burns, inflammatory disorders, strictures.

Complications of PEG, both minor and major, occur in 5 to 50 percent of patients. Minor complications include wound infection, the most common complication, which is often caused by the collection of secretions around the incision. Infection can be avoided by using antibiotics for several days after tube insertion and by daily cleansing of the area. Other minor complications include leakage of gastric contents (which can cause skin erosion) and excessive tension at the connection point that anchors the tube against the abdominal and stomach walls (which can cause underlying skin to become macerated) (Kirtley, Willis, and Thomas, 1987). It is important to watch for slow gastric emptying, especially in the elderly (Campbell-Taylor and Fisher, 1987) and for tube migration (which occurs if the tube is not anchored properly), diarrhea (most often caused by antibiotics, sorbitol in elixirs, and antacids), and constipation from low-residue feeding formula (Henderson, 1991).

Potentially more severe complications include esophageal reflux of stomach contents, which can lead to aspiration (Campbell-Taylor and Fisher, 1987), and upper GI bleeding due to stress ulcers, gastritis, or reflux esophagitis (Henderson, 1991).

Insertion success ranges from 95 to 100 percent. Mortality rates directly related to PEG have been reported ranging from 0.6 to 8.1 percent, depending on the patient population; underlying disease and age are the most important factors.

Natural History of Dysphagia

Epidemiology

Dysphagia is not a condition that occurs in isolation; rather, it can be caused by many different disorders, including stroke, degenerative neurologic conditions, head and neck cancer, and head injury. It can also occur as a natural part of the aging process. Ideally, patients should be selected for testing on the basis of clinical signs and symptoms detected during physical exam or noninstrumented swallow exam. However, as will be discussed in the Results section of this report, few clinical signs and symptoms have been found that predict dysphagia-related morbidity such as aspiration or pneumonia. This means that the only currently available means of determining which patients are most likely to benefit from a dysphagia diagnosis and treatment program involves determining whether patients with any given disease or condition are particularly likely to become dysphagic. If so, it may then be possible to selectively choose patients for extended diagnostic testing based on the disease from which they suffer. Additionally, if most cases of dysphagia can be shown to result from a particular disease or condition, then treating patients suffering from this condition will alleviate most of the morbidity and mortality due to dysphagia. For these reasons, we discuss in this section not only the incidence of dysphagia overall, but also the incidence of dysphagia associated with a variety of neurologic conditions.

Arriving at the exact incidence and prevalence of dysphagia is, however, not possible. This is because dysphagia is not a single disease, but a cluster of symptoms, not all of which may be detected with current diagnostic technology (see the section entitled Diagnosis of Oropharyngeal Dysphagia), and because of variations in definitions and interpretation of what constitutes dysphagia. It is therefore not possible to discuss the incidence or prevalence of dysphagia; rather, we can only discuss diagnosed occurrence because no current diagnostic test is 100 percent accurate. Also, precise data on the number of patients affected by dysphagia-related morbidity and mortality are not available. Because of this, the present section and the section on Burden of Illness contain numerous original calculations. Appendix H contains further details about these calculations.

Table 7. Range of Reported Epidemiological Findings for Each Disease: Prevalence/Incidence of Disease Overall and Occurrence of Dysphagia Within Each Disease

| Condition | Overall prevalence/incidence (per 100,000): whole population | Overall prevalence/incidence (per 100,000): elderly population | Dysphagia (percent): whole population | Dysphagia (percent): elderly population |
|---|---|---|---|---|
| Overall | NA | NA | BSE: 12-17; VFSS: NR | BSE: 14-34; VFSS: NR |
| Stroke | Annual incidence: 145-290 | Annual incidence: 65-74: 142-235; 85+: 1,390-2,574 | BSE: 19-59; VFSS: 50-90 | 47 |
| PD | Prevalence: 107 1; Annual incidence: 13-65 | Prevalence: 790 to 2,250 | BSE: 23-77; VFSS: 63-81 | NR |
| AD | Prevalence: 260 1 | Prevalence: 65+: 2,634-6,240; 85+: 18,018-28,850 | VFSS: 84 1 | NR |
| MS | 70-171; Prevalence: 265 1 | | Questionnaire: 13-33 1; BSE: NR; VFSS: NR | NR |
| MND / ALS | Prevalence: NR; Annual incidence: 6; Prevalence: 3.12 1; Annual incidence: 1.14-1.8 | Annual incidence: 18-25 | Questionnaire: 71 1; BSE/VFSS: 51.2; BSE/VFSS: 29 | NR |
| PSP | Prevalence: 1.39 1; Incidence: 1.1 1 | Prevalence: NR; Incidence: 7 1 | BSE/VFSS: 56 1 | NR |
| HD | Prevalence: 1.9 1; Incidence: 0.2 1 | NR | BSE: NR; VFSS: 100 2 | NR |

Table 8. Epidemiological Data From the Published Literature: Neurologic Diseases and the Rate of Dysphagia Within Each

| Disease | Prevalence (per 100,000) | Incidence (per 100,000) | Study | Reason | Diagnosed occurrence of dysphagia (%) | Study | Reason |
|---|---|---|---|---|---|---|---|
| Stroke | NA | 145; 289 | Brown, Whisnant, Sicks et al., 1996; Modan and Wagener, 1992 | Mayo Clinic; Mayo Clinic seemed low: this provides an upper estimate | VFSS: 74.6; BSE: 41.7 | Daniels, McAdam, Brailey et al., 1997; DePippo, Holas, and Reding, 1992 | Median of VFSS studies; median of BSE studies |
| Parkinson's disease | 106.9 | 13 | Mayeux, Marder, Cote et al., 1995 | Only number on general population that included elderly | VFSS: 69.1 | Bushmann, Dobmeyer, Leeker et al., 1989; Fuh, Lee, Wang et al., 1997 | Mean of 2 studies in which L-dopa was withheld |
| Alzheimer's disease | 259.8 | NR | Beard, Kokmen, Offord et al., 1991 | Only published number | VFSS: 84 | Horner, Alberts, Dawson et al., 1994 | Only published number |
| Multiple sclerosis | 170.8 | NR | Wynn, Rodriguez, O'Fallon et al., 1990 | Only number; Mayo Clinic | NR | NA | NA |
| Motor neuron disease | 170.8 | 6.2 | Lilienfeld, Sprafka, Pham et al., 1991 | Only published number | 51.2 (method not reported) | Leighton, Burton, Lund et al., 1994 | Exam, not survey |
| Amyotrophic lateral sclerosis | NR | 1.8 | McGuire, Longstreth, Koepsell et al., 1996 | Exam, not survey | 29 (method not reported) | Leighton, Burton, Lund et al., 1994 | Only published number |
| Progressive supranuclear palsy | 1.39 | 1.1 | Golbe, Davis, Schoenberg et al., 1988; Bower, Maraganore, McDonnell et al., 1997 | Only published number | VFSS: 55.6 | Litvan, Sastry, and Sonies, 1997 | Only published number |
| Huntington's disease | 1.9 | 0.2 | Kokmen, Ozekmekci, Beard et al., 1994 | Only published number | VFSS: 100 | Kagel and Leopold, 1992 | Only published number |

Table 7 shows a summary of all studies located on the epidemiology of major neurologic diseases and the rate of dysphagia within each of these diseases. These studies are further described in Evidence Table 1 through Evidence Table 9. Table 8 displays the best studies out of all those reviewed, according to our evaluation. When choosing the best studies, we took into consideration the research methods, populations included, and generalizability to the U.S. population as a whole. These methods are described in Appendix A. Unfortunately, many of these studies were chosen because they were the only published study that reported a particular statistic. A critical discussion of this literature is also found in Appendix A. The rates shown in Table 8 were used in the Burden of Illness section to determine the number of individuals in the United States affected by these diseases.

It is currently difficult to determine whether patients with any specific disease should automatically undergo extensive diagnostic testing for dysphagia. It appears, as shown in Table 8, that stroke, Alzheimer's disease, and Huntington's disease all have high enough rates of dysphagia to warrant diagnostic testing for all patients with these conditions. However, in the case of Huntington's and Alzheimer's diseases, only a single study on dysphagia was available for each; therefore, current data cannot be considered reliable. Further, these rates were determined by VFSS, which does not necessarily detect only cases of dysphagia that actually cause problems, but also potentially many cases that never will.

With stroke, there is a large body of literature (found in Evidence Table 3) on the diagnosed occurrence of dysphagia, and the reported rates of dysphagia range from 20 to 90 percent. Table 8 shows the median as determined by each diagnostic method. It is not possible to determine whether these rates are accurate or what proportion of these cases are actually at risk for serious morbidity.

Because of these limitations in available evidence, we cannot make a recommendation about diagnosing patients for dysphagia solely on the basis of the neurologic disorder from which they suffer.

Burden of Illness

To determine the burden of illness on society resulting from dysphagia in these disorders, we need to determine the total number of people affected. An important question is how many of these patients would suffer serious morbidity or death if preventive treatment for dysphagia were not available. Dysphagia may lead to malnutrition because the patient is unable to take in adequate amounts of food. However, it is often not the dysphagia, per se, that leads to serious morbidity (pneumonia) and death, but aspiration. It is therefore necessary to determine what proportion of patients with dysphagia aspirate, what proportion of aspirators are expected to contract pneumonia if not treated, and what proportion of elderly patients with pneumonia die of the illness. Through these steps, we can determine the number of patients with dysphagia who will subsequently die of resulting morbidity without directed dysphagia management (if data directly making this link are not available). Knowing the raw number of individuals affected yearly by morbidity may help determine priorities in clinical management and insurance coverage. It would also be interesting to link the occurrence of dysphagia to the incidence of malnutrition, but the data to support this calculation do not exist in the literature.
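The stepwise reasoning above amounts to multiplying a starting count by a series of conditional proportions. The following Python sketch is illustrative only. It uses the rates that appear in Table 12 (33.5 percent of stroke patients aspirating, 19.2 percent of aspirators contracting pneumonia, and 19.8 percent of pneumonia patients dying). The function name `chain_counts` is hypothetical, and carrying unrounded values through each step is an assumption that reproduces the low-estimate totals in the table.

```python
def chain_counts(start_count, rates):
    """Return the running count after applying each conditional proportion in turn."""
    counts = []
    current = float(start_count)
    for rate in rates:
        current *= rate
        counts.append(current)
    return counts


# Low estimate of annual strokes: U.S. population times 145 per 100,000 (Table 12).
strokes_low = 265_284_000 * 145 / 100_000

# Proportions taken from Table 12 of this report.
steps = [
    0.335,  # stroke patients who aspirate
    0.192,  # aspirators who contract pneumonia
    0.198,  # pneumonia patients who die of pneumonia
]

aspirators, pneumonias, deaths = chain_counts(strokes_low, steps)
print(round(aspirators), round(pneumonias), round(deaths))
```

Each intermediate value corresponds to one "Total affected" row in the table, so the same function could be reused for the high estimate or for the dysphagia-based chain by changing the starting count and the list of rates.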

It is important to point out that the figures we discuss in this section are not rates of morbidity or mortality that are found in the absence of treatment; rather, we have limited the discussion to pneumonia and to those studies in which there was no apparent dysphagia-oriented treatment. For this purpose, we have not limited the publication date of the studies we examined; historical data are particularly important in addressing these questions to find morbidity and mortality rates before current dysphagia treatment was implemented.

The focus of this discussion will be on stroke patients, as they make up the greatest proportion of neurologic dysphagia patients, and because most of the published literature focuses on stroke patients. We are currently unable to document the rate of aspiration or pneumonia in patients with neurologic diseases other than stroke, because the data do not exist. We can, however, calculate the number of patients whose dysphagia was caused by a certain disease and/or the rate of dysphagia caused by certain diseases.

This discussion first reviews the literature on the rates of illness in patients with stroke or dysphagia; we then translate the rates discussed here and in the Epidemiology section above into total numbers of patients affected in the United States.
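Translating a rate into a number of affected patients is a single multiplication. As a minimal sketch, the Python below applies the stroke rates and population figures listed in Table 12. The helper name `cases_from_rate` is hypothetical, and rounding each result to the nearest whole person is an assumption about how the table's totals were obtained.

```python
def cases_from_rate(population, rate_per_100k):
    """Convert a rate per 100,000 population into an expected number of cases."""
    return population * rate_per_100k / 100_000


# All ages, low estimate (145 per 100,000; Brown, Whisnant, Sicks et al., 1996).
strokes_all_ages_low = round(cases_from_rate(265_284_000, 145))

# Age 65+, computed separately for men and women (Barker and Mullooly, 1997).
strokes_men_65 = round(cases_from_rate(13_880_892, 953))
strokes_women_65 = round(cases_from_rate(19_979_990, 736))
strokes_65_plus = strokes_men_65 + strokes_women_65

print(strokes_all_ages_low, strokes_men_65, strokes_women_65, strokes_65_plus)
```

Under these assumptions the results agree with the corresponding "Total affected" entries in Table 12 (384,662; 132,285; 147,053; and 279,338).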

Malnutrition

There is little information on the incidence of malnutrition in elderly patients with dysphagia. Two groups of researchers have looked at malnutrition in the nursing home elderly and its possible link to swallowing and chewing problems (Keller, 1993; Keller, 1995; Thomas, Verdery, Gardner et al., 1991). One group found a significant association between dysphagia and malnutrition, while the other did not. Neither group looked specifically at any neurologic disorder.

Only one study has been published examining the rate of malnutrition in acute-care stroke patients (see Evidence Table 10). Davalos et al. (1996) followed 104 stroke patients who were admitted within 24 hours of stroke. At admission, 16.3 percent were malnourished. At one week, this increased to 26.4 percent of 91 survivors, and at two weeks, it increased to 35 percent in 43 patients remaining in the hospital (Davalos, Ricart, Gonzalez-Huix et al., 1996). This last figure is likely skewed because only the most debilitated patients would stay in the acute-care hospital for that long.

Malnutrition in dysphagia

This same study reported that 43 (41.3 percent) of these patients had swallowing problems at admission. While there was no difference in the rate of malnutrition between patients with dysphagia and those with a normal swallow at admission, at 1 week, patients with dysphagia were significantly more likely to be malnourished than patients with normal swallows (48.3 percent versus 13.6 percent) (Davalos, Ricart, Gonzalez-Huix et al., 1996). Standard treatment for patients with dysphagia in this acute-care facility was enteral feeding. While tube feeding is generally a last resort measure that is supposed to prevent malnutrition, this study suggests that it does not always achieve its goal.

Dehydration in dysphagia

One would expect patients with dysphagia to have an increased risk of dehydration because of their inability to swallow thin liquids safely. Dehydration is generally measured through blood tests such as hematocrit or the blood urea nitrogen/creatinine ratio. Very few studies have reported these test results for patients with dysphagia. Smithard et al. (1996) reported no changes in hydration status regardless of the ability to swallow in acute-care stroke patients (Smithard, O'Neill, Parks et al., 1996). However, these patients were only followed for 1 week post-stroke, and therefore these findings do not address long-term dehydration risks in this population.

One study (Barer, 1989) compared hydration-related blood measurements among three groups of patients with different levels of swallowing disability. Blood urea nitrogen measurements were significantly higher at day 8 post-stroke for patients on NG tubes compared with those with more minor swallowing problems. However, blood urea nitrogen changes examined in isolation do not provide definitive information about hydration status; they must be examined in relation to creatinine levels.

Two studies have examined the occurrence of dehydration in aspirators versus nonaspirators (Holas, DePippo, and Reding, 1994; Schmidt, Holas, Halvorson et al., 1994). Neither found any significant difference in the occurrence of dehydration between these two groups of patients.

In conclusion, the current data examining the relationship between dysphagia and dehydration are sparse, and do not suggest a significant causal relationship. However, more research in this area is needed before a firm conclusion can be drawn.

Aspiration in dysphagia

Not all patients with dysphagia suffer from aspiration, but many do (see Evidence Table 11). Three studies specifically examined the proportion of stroke patients with dysphagia who present with aspiration. These three studies are in general agreement on the proportion of stroke dysphagics who aspirate: 46.3 percent (Kidd, Lawson, Nesbitt et al., 1993); 53.5 percent (Holas, DePippo, and Reding, 1994); and 43.2 percent (Daniels, McAdam, Brailey et al., 1997).

Aspiration in stroke

Five studies (Chen, Ott, Peele et al., 1990; Daniels, Brailey, Priestly et al., 1998; Daniels, McAdam, Brailey et al., 1997; Kidd, Lawson, Nesbitt et al., 1993; Smithard, O'Neill, Parks et al., 1996) directly examined the occurrence of aspiration in a stroke population within 7 days of the event (without consideration of general dysphagia) (see Evidence Table 12). Their estimates ranged from 21.3 percent (Smithard, O'Neill, Parks et al., 1996) to 38.2 percent (Daniels, Brailey, Priestly et al., 1998). Diagnosis of aspiration will depend upon bolus size and consistency tested during VF. The study reporting the highest rate was the only one to recruit patients (Daniels, Brailey, Priestly et al., 1998); patients sensing a swallowing problem may have been more likely to volunteer for a study. Smithard et al. (1996), reporting the lowest rate with 21.3 percent, did not confirm stroke diagnoses with computed tomography (CT) or MRI, only through clinical assessment (Smithard, O'Neill, Parks et al., 1996); it is possible that some cases are not truly acute stroke, thus potentially lowering any estimation of morbidity. As a result, the best estimate is probably a median of all hospitalized strokes, 33.5 percent, which is the average of the two median numbers from Daniels et al. (1997) and Kidd et al. (1993).

If we assume, as discussed in the previous section, that approximately half of stroke dysphagics experience aspiration, then, from the numbers calculated above, approximately 42 to 76 percent of the stroke patients in these three studies would be expected to be dysphagic. These numbers are in rough accordance with rates discussed in Appendix A, in which a range of 46 to 90 percent of hospitalized CVA patients were diagnosed with dysphagia on VF.
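The back-calculation in the preceding paragraph can be written out explicitly. This short Python sketch assumes, as the text does, that half of stroke dysphagics aspirate, and divides the lowest and highest reported aspiration rates (21.3 and 38.2 percent) by that share. The function name `implied_dysphagia_rate` is hypothetical.

```python
def implied_dysphagia_rate(aspiration_rate_pct, aspirating_share=0.5):
    """Estimate the dysphagia rate implied by an aspiration rate.

    Assumes every aspirator is dysphagic and that `aspirating_share`
    of dysphagic patients aspirate.
    """
    return aspiration_rate_pct / aspirating_share


implied_low = implied_dysphagia_rate(21.3)   # lowest reported aspiration rate
implied_high = implied_dysphagia_rate(38.2)  # highest reported aspiration rate
print(f"{implied_low:.1f} to {implied_high:.1f} percent")
```

The result, about 42.6 to 76.4 percent, corresponds to the range of approximately 42 to 76 percent quoted above.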

In conclusion, studies indicate that approximately 43 to 55 percent of stroke patients with dysphagia experience aspiration; 21 to 38 percent of all stroke patients experience aspiration.

Pneumonia

Pneumonia is a fairly common occurrence after a stroke, especially if there is no preventative treatment. Four studies have examined the rate of pneumonia after a stroke (Dobkin, 1987; Haerer and Smith, 1974; Odderson and McKenna, 1993; Young and Durant-Jones, 1990) (see Evidence Table 13); rates range from 1.5 to 13.0 percent. Because we have calculated (in Appendix D) that most pneumonia resulting from a stroke occurs within the first week after the event, the rates are comparable even though the followup for each of these studies was different. The lowest figure comes from a stroke rehabilitation unit (Dobkin, 1987), the highest from an acute-care hospital that included only aspiration pneumonia (Young and Durant-Jones, 1990). Both were retrospective case series that only followed patients for their length of stay. Two other studies also followed stroke patients in an acute-care hospital and found pneumonia rates of 6.7 percent (Odderson and McKenna, 1993) and 8.8 percent (Haerer and Smith, 1974). There is no obvious reason for the differences found in the three acute-care studies, although Odderson and McKenna (1993) only included nonhemorrhagic strokes (and found the lower of the two rates), while Young and Durant-Jones (1990) did not have such an inclusion criterion. It may be the case that patients who suffer nonhemorrhagic strokes are less disabled and, therefore, less susceptible to pneumonia. Haerer and Smith (1974) included only cases of acute pneumonia, thus possibly lowering their reported rates slightly.

The fact that the highest rate from these four studies included only aspiration-specific pneumonia suggests that most pneumonia after stroke is aspiration related. This supposition may be confounded by the fact that these patients were followed longer than those in other studies. This may not be specifically aspiration resulting from oropharyngeal dysphagia but may also include many patients suffering from esophageal reflux. However, this does appear to be the most reliable figure (13.0 percent) for acute hospital stroke patients (Young and Durant-Jones, 1990).

Aspiration resulting in pneumonia: Relative risk compared with dysphagics and nonaspirators

Aspiration, a common side effect of stroke, can put individuals at risk for pneumonia. Five studies examined the occurrence of all types of pneumonia in patients experiencing aspiration (Croghan, Burke, Caplan et al., 1994; Holas, DePippo, and Reding, 1994; Schmidt, Holas, Halvorson et al., 1994; Smithard, O'Neill, Parks et al., 1996; Teasell, McRae, Marchuk et al., 1996) (see Evidence Table 14). The rates of pneumonia ranged from 11.9 percent (Teasell, McRae, Marchuk et al., 1996) to 50 percent (Croghan, Burke, Caplan et al., 1994). The highest rate came from a nursing home, the lowest from a stroke rehab unit, reflecting the relative debilitation of patients in these different care settings.

However, almost everyone experiences aspiration at one time or another, and, therefore, it is not a given that pneumonia will result. It is therefore important to determine whether aspiration is significantly associated with pneumonia, through a comparison of those who experience aspiration to dysphagics without aspiration and other nonaspirators. Evidence Table 15 displays those studies providing such comparisons. Because these are case series data, they cannot be subject to a meta-analysis; we can, however, pool the data in such a way that we can approximate the trend in pneumonia incidence in those with and without dysphagia.

Evidence Table 15 displays data from patients in all care settings, including acute-care, rehabilitative, and long-term care. Patients in these different care settings will have different risks of contracting pneumonia because of general differences in health status. We therefore pool the data separately for these three settings.

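Table 9. Summary Comparison Among Different Patient Groups for Risk of Pneumonia

| Condition | Acute-care: # with pneumonia | Acute-care: total with condition | Stroke rehabilitation: # with pneumonia | Stroke rehabilitation: total with condition | Long-term care: # with pneumonia | Long-term care: total with condition |
|---|---|---|---|---|---|---|
| Aspirators | 35 | 96 | 23 | 171 | 24 | 42 |
| Proportion | 0.365 | | 0.135 | | 0.571 | |
| Nonaspirators | 17 | 142 | 13 | 517 | 6 | 18 |
| Proportion | 0.120 | | 0.025 | | 0.333 | |
| Dysphagic nonaspirators | 32 | 135 | 3 | 75 | 8 | 16 |
| Proportion | 0.237 | | 0.040 | | 0.500 | |
| All dysphagics | 56 | 196 | 21 | 220 | 32 | 58 |
| Proportion | 0.286 | | 0.095 | | 0.552 | |
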
Table 9 provides a summary comparison from several studies, shown in Evidence Tables 14 through 18, among aspirators, nonaspirators, nonaspirating dysphagics, and all patients with dysphagia.

The most informative comparison in the table above is between those who aspirate and those who have dysphagia but do not aspirate; any difference between these two groups will be the net additional risk of contracting pneumonia due to the presence of aspiration. There appears to be a trend for acute-care patients that aspirators have a greater risk of contracting pneumonia than the other three groups of patients. However, because these are case series data, we cannot determine whether there is a significant difference between aspirators (36.5 percent) and nonaspirating dysphagics (23.7 percent). This difference is very strong in the rehabilitation group (13.5 versus 4.0 percent).

This difference is not apparent in the nursing home group. While 57.1 percent of aspirators contract pneumonia, 50.0 percent of nonaspirating dysphagics do. It is unlikely that this is a significant difference. It may be the case that the elderly in nursing homes are so debilitated that any such disability can lead to greater risk of illness.
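The comparison between aspirators and nonaspirating dysphagics can be reproduced from the pooled counts in Table 9. The Python sketch below is descriptive only: it recomputes each proportion and reports the difference and the ratio between the two groups in each care setting. The ratio is an illustrative addition that does not appear in the table, the variable names are hypothetical, and no significance test is implied because these are pooled case series data.

```python
# Pooled counts from Table 9: (number with pneumonia, total with condition).
table_9 = {
    "acute-care": {"aspirators": (35, 96), "dysphagic nonaspirators": (32, 135)},
    "stroke rehabilitation": {"aspirators": (23, 171), "dysphagic nonaspirators": (3, 75)},
    "long-term care": {"aspirators": (24, 42), "dysphagic nonaspirators": (8, 16)},
}


def proportion(cases, total):
    """Proportion of patients in a group who contracted pneumonia."""
    return cases / total


summary = {}
for setting, groups in table_9.items():
    p_asp = proportion(*groups["aspirators"])
    p_non = proportion(*groups["dysphagic nonaspirators"])
    summary[setting] = {
        "aspirators": p_asp,
        "dysphagic nonaspirators": p_non,
        "difference": p_asp - p_non,  # additional risk associated with aspiration
        "ratio": p_asp / p_non,       # illustrative only; not a tested relative risk
    }

for setting, row in summary.items():
    print(
        f"{setting}: {row['aspirators']:.3f} vs {row['dysphagic nonaspirators']:.3f}, "
        f"difference {row['difference']:.3f}, ratio {row['ratio']:.2f}"
    )
```

Viewed this way, the rehabilitation group shows the largest ratio between the two proportions, while the long-term care group shows the smallest, which is consistent with the pattern described in the text.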

Several of the studies shown in Evidence Tables 14 through 18 found a significant association between the presence of aspiration and the development of subsequent pneumonia (Holas, DePippo, and Reding, 1994; Kidd, Lawson, Nesbitt et al., 1993; Langmore, Terpenning, Schork et al., 1998; Reynolds, Gilbert, Good et al., 1998; Schmidt, Holas, Halvorson et al., 1994). One study found that silent aspiration was significantly more predictive of subsequent pneumonia than symptomatic aspiration (Holas, DePippo, and Reding, 1994). However, many of these studies also found a significant relationship between dysphagia and subsequent pneumonia (DePippo, Holas, and Reding, 1994; Kidd, Lawson, Nesbitt et al., 1993; Langmore, Terpenning, Schork et al., 1998; Reynolds, Gilbert, Good et al., 1998; Smithard, O'Neill, Parks et al., 1996). In fact, one study (DePippo, Holas, and Reding, 1994) found that the presence of cough alone detected on a bedside screening test (Burke Dysphagia Screening Test) predicted 100 percent of pneumonia cases (however, with a high false positive rate). It is unclear how much of the relationship between dysphagia and pneumonia is caused by dysphagic aspirators, because aspiration was not controlled for in the researchers' statistical analyses. Interestingly, two other studies found no relationship between aspiration and pneumonia/chest infection (Croghan, Burke, Caplan et al., 1994; Smithard, O'Neill, Parks et al., 1996) (although some of this may have been due to lack of statistical power).

In conclusion, while there appears to be a trend for aspirators, especially stroke patients in acute-care, to be more likely to contract pneumonia, it is currently not possible to determine if this is a statistically significant increase in risk. Statistical analyses in the published literature are currently equivocal about these possible differences.

Pneumonia mortality

Pneumonia can be successfully treated with antibiotics. However, in weakened elderly people, pneumonia is a leading cause of death; the Centers for Disease Control and Prevention (CDC) reported that in 1995 alone, 74,297 elderly people died of pneumonia or influenza, the fifth leading cause of death in the elderly (Department of Health & Human Services, 1997).

Pneumonia-specific mortality in elderly pneumonia patients

Three studies have examined the pneumonia-specific mortality rate among the elderly with pneumonia (Riquelme, Torres, El-Ebiary et al., 1996; Thompson, Hall, Szpiech et al., 1997; Venkatesan, Gladman, Macfarlane et al., 1990) (see Evidence Table 20). These estimates range from 13.7 percent (Venkatesan, Gladman, Macfarlane et al., 1990) to 39.5 percent (Thompson, Hall, Szpiech et al., 1997). Both the lowest and highest numbers were reported from a hospital on patients with similar mean age; however, the higher number included only nursing home ward patients. Both included all causes of pneumonia. A median number reported by Riquelme et al. (1996) provides probably the best estimate, a 19.8 percent death rate after 30 or more days of hospitalization.

Table 12. Dysphagia-Related Morbidity and Mortality in Stroke Patients

| Disease/disorder | Number or rate | Source | Comments |
|---|---|---|---|
| Stroke, annual incidence | | | |
| 1. All ages | | | |
| Total population | 265,284,000 | U.S. Department of Commerce, 1997 | |
| Rate of stroke, low estimate | 145/100,000 | Brown, Whisnant, Sicks et al., 1996 | Mayo Clinic |
| Rate of stroke, high estimate | 290/100,000 | Modan and Wagener, 1992 | National Hospital Discharge Survey |
| Total affected | 384,662 | | |
| | 768,528 | | High estimate |
| 2. 65+ | | | |
| Men, total population | 13,880,892 | Centers for Disease Control and Prevention, 1998 | |
| Rate of stroke | 953/100,000 | Barker and Mullooly, 1997 | |
| Total affected, male | 132,285 | | |
| Women, total population | 19,979,990 | Centers for Disease Control and Prevention, 1998 | |
| Rate of stroke | 736/100,000 | Barker and Mullooly, 1997 | |
| Total affected, female | 147,053 | | |
| Total affected, 65+ | 279,338 | | From above |
| All-case dysphagia in elderly population | | | |
| Total population, age 60+ | 43,859,973 | Centers for Disease Control and Prevention, 1998a | |
| Rate of dysphagia | 14.20% | Baum and Bodner, 1983 | Only study |
| Total affected | 6,228,116 | | |
| Dysphagia in stroke | | | |
| 1. Total stroke population, low estimate | 384,662 | | Calculated above |
| Total stroke population, high estimate | 768,528 | | Calculated above |
| Rate of dysphagia, VFSS | 74.60% | Daniels, McAdam, Brailey et al., 1997 | Median of all studies, hospitalized |
| Total affected | 286,958 | | Low |
| | 573,322 | | High |
| Rate of dysphagia, BSE | 41.70% | Kidd, Lawson, Nesbitt et al., 1993 | Median of all hospitalized BSE numbers taken within 5 days of stroke |
| Total affected | 160,404 | | Low |
| | 320,476 | | High |
| 2. Total elderly stroke population, age 65+ | 279,338 | | From above |
| Rate of dysphagia, BSE, age 70+ | 35.50% | Barer, 1989 | 1 day post-stroke |
| Total affected | 99,165 | | |
| Malnutrition in stroke | | | |
| Total stroke population | 384,662 | Brown, Whisnant, Sicks et al., 1996 | Mayo Clinic + CDC wonder |
| | 768,528 | Modan and Wagener, 1992 | Modan, national, hospital discharge survey + CDC wonder |
| Rate of malnutrition | 26.4% | Davalos, Ricart, Gonzalez-Huix et al., 1996 | At 1 week |
| Total affected | 101,551 | | |
| | 202,891 | | |
| Malnutrition in dysphagia | | | |
| Total dysphagic stroke | 160,404 | | Low |
| | 320,476 | | High |
| Rate of malnutrition | 48.4% | Davalos, Ricart, Gonzalez-Huix et al., 1996 | |
| Total affected | 77,636 | | |
| | 155,110 | | |
| Aspiration in stroke | | | |
| Total stroke | 384,662 | | Low estimate, from above |
| | 768,528 | | High estimate, from above |
| Rate of aspiration | 33.50% | Daniels, McAdam, Brailey et al., 1997; Kidd, Lawson, Nesbitt et al., 1993 | Average of Daniels, 1997 and Kidd, 1993, median number for acute hospital strokes <5 days poststroke assessment |
| Total affected | 128,862 | | |
| | 257,457 | | |
| Pneumonia, annual incidence | | | |
| Total population | 265,284,000 | U.S. Department of Commerce, 1997 | |
| Rate of pneumonia | 1,600/100,000 | Adams and Marano, 1995 | |
| Total affected | 4,244,544 | | |
| Total population 65+ | 33,860,882 | Centers for Disease Control and Prevention, 1998c | |
| Rate of pneumonia | 3,032/100,000 | Houston, Silverstein, and Suman, 1995 | Mayo Clinic |
| Total affected | 1,026,662 | | |
| Pneumonia in stroke | | | |
| Total stroke | 384,662 | | Low, from above |
| | 768,528 | | High, from above |
| Rate of pneumonia | 13.00% | Young and Durant-Jones, 1990 | Acute-care hospital, aspiration pneumonia only, LOS, no dysphagia program |
| Total affected | 50,006 | | |
| | 99,909 | | |
| Dysphagia to pneumonia in stroke | | | |
| 1. Total dysphagic stroke, BSE | 160,404 | | Low, from above |
| | 320,476 | | High, from above |
| Rate of pneumonia | 14.30% | Nilsson, Ekberg, Olsson et al., 1998 | No specific swallow treatment, acute hospital, LOS, BSE |
| Total affected | 22,938 | | Low |
| | 45,828 | | High |
| 2. Total dysphagic stroke, VFSS | 286,958 | | Low |
| | 573,322 | | High |
| Rate of pneumonia | 14.30% | Nilsson, Ekberg, Olsson et al., 1998 | No specific swallow treatment, acute hospital, LOS, BSE |
| Total affected | 41,035 | | Low |
| | 81,985 | | High |
| 3. Total dysphagic stroke, elderly | 99,165 | | |
| Rate of pneumonia | 14.30% | Nilsson, Ekberg, Olsson et al., 1998 | No specific swallow treatment, acute hospital, LOS |
| Total affected | 14,181 | | |
| Nondysphagia to pneumonia | | | |
| Total nondysphagic in stroke, BSE | 224,258 | | Low, from above |
| | 448,052 | | High, from above |
| Nondysphagia to pneumonia | 0.198 | Reynolds, Gilbert, Good et al., 1998 | BSE |
| Total affected | 44,403 | | Low |
| | 88,714 | | High |
| Aspiration to pneumonia in stroke | | | |
| Total aspirating stroke | 128,862 | | Low, from above |
| | 257,457 | | High, from above |
| Aspiration to pneumonia | 19.20% | Schmidt, Holas, Halvorson et al., 1994 | Acute-care hospital, no specified dysphagia treatment |
| Total affected | 24,741 | | Low |
| | 49,432 | | High |
| Proportion of all CVA pneumonias occurring in dysphagic CVAs | | | |
| Total pneumonia in CVA | 50,006 | | |
| | 99,909 | | |
| Dysphagia to pneumonia | 22,938 | | |
| | 45,828 | | |
| Proportion of pneumonias that occur after dysphagia | 0.459 | | |
| Dysphagia to pneumonia, BSE | 22,938 | | Low |
| | 45,828 | | High |
| Nondysphagia to pneumonia, BSE | 44,403 | | Low |
| | 88,714 | | High |
| Proportion of CVA pneumonias that result after dysphagia | 0.341 | | |
| Proportion of all CVA pneumonias occurring in aspirating CVA | | | |
| Aspiration to pneumonia | 24,741 | | |
| | 49,432 | | |
| Total CVA pneumonia | 50,006 | | |
| | 99,909 | | |
| Proportion of CVA pneumonias that result after aspiration | 0.495 | | |
| Pneumonia mortality | | | |
| 1. Total pneumonia incidence, all ages | 4,244,544 | Adams and Marano, 1995; U.S. Department of Commerce, 1997 | |
| Total pneumonia deaths, all ages | 81,972 | Centers for Disease Control and Prevention (CDC), 1998a | |
| Rate of death | 0.019 | | |
| 2. Total pneumonia incidence, 65+ | 1,026,662 | Houston 1995 + CDC | |
| Total deaths | 74,297 | Department of Health & Human Services, 1997 | Health, U.S., 1996-7, for 1995, age 65+, Table 34 |
| Rate of death | 0.072 | | Calculated from above |
| | | | Clinical literature on neurogenic pneumonia indicates 19.8 percent |
| 3. Pneumonia hospital admissions, age 65+ | 690,000 | U.S. Department of Health & Human Services, 1997 | Health, U.S., 1996-7, Table 89 |
| Total deaths | 71,000 | Centers for Disease Control and Prevention, 1998b | National Hospital Discharge Survey, 1995 |
| Rate of death, hospitalized | 0.103 | | Calculated from above |
| Dysphagia to pneumonia death | | | |
| Total dysphagic pneumonia | 22,938 | | |
| | 45,828 | | |
| Rate of pneumonia death | 19.8% | Riquelme, Torres, El-Ebiary et al., 1996 | |
| Total number of deaths | 4,542 | | |
| | 9,074 | | |
| Total dysphagic pneumonia, age 65+ | 14,181 | | |
| Rate of pneumonia death | 19.8% | Riquelme, Torres, El-Ebiary et al., 1996 | |
| Total number of deaths | 2,808 | | |
| Total dysphagic stroke, BSE | 160,404 | | Low |
| | 320,476 | | High |
| Proportion of dysphagic strokes leading to pneumonia deaths | 0.028 | | Only diagnosed on BSE, so doesn't include silent aspirators who are at particular risk for pneumonia |
| Aspiration to pneumonia death | | | |
| Total aspirating pneumonia | 24,741 | | |
| | 49,432 | | |
| Rate of pneumonia death | 19.8% | Riquelme, Torres, El-Ebiary et al., 1996 | |
| Total number of deaths | 4,899 | | |
| | 9,787 | | |
| | 128,862 | | |
| | 257,457 | | |
| Proportion of aspiration resulting in pneumonia death | 0.038 | | |
| Nondysphagia to pneumonia death | | | |
| Total nondysphagic pneumonia | 44,403 | | |
| | 88,714 | | |
| Rate of pneumonia death | 19.8% | Riquelme, Torres, El-Ebiary et al., 1996 | |
| Total number of deaths | 8,792 | | |
| | 17,565 | | |
| Nondysphagic strokes, BSE | 224,258 | | Low, from above |
| | 448,052 | | High, from above |
| Proportion of nondysphagic strokes leading to pneumonia death | 0.039 | | |

The 19.8 percent figure reported by Riquelme et al. (1996) is at odds with the estimate determined from vital statistics data (shown in Table 12), which was 7.2 percent. The vital statistics data rely on reporting hospitals giving data on all pneumonia cases and related deaths, whereas a single clinical study reflects the quality of care at a single institution. However, the national data on mortality include deaths occurring in all pneumonia cases, not just those resulting from neurologic disease. Therefore, it is prudent to use the Riquelme data for the purposes of this report.
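The vital statistics estimates referred to here are simple ratios of deaths to cases. The Python sketch below recomputes the three death rates listed under "Pneumonia mortality" in Table 12 and sets them beside the 19.8 percent clinical figure. The dictionary keys and the function name `death_rate` are hypothetical labels chosen for illustration.

```python
def death_rate(deaths, cases):
    """Deaths divided by cases, as a proportion."""
    return deaths / cases


# (deaths, cases) pairs as listed in Table 12.
vital_statistics = {
    "all ages": (81_972, 4_244_544),
    "age 65+": (74_297, 1_026_662),
    "age 65+, hospitalized": (71_000, 690_000),
}

rates = {group: death_rate(*pair) for group, pair in vital_statistics.items()}

CLINICAL_RATE = 0.198  # Riquelme, Torres, El-Ebiary et al., 1996

for group, rate in rates.items():
    print(f"{group}: {rate:.3f} (clinical estimate: {CLINICAL_RATE:.3f})")
```

Even the highest of the three vital statistics rates, the one for hospitalized patients aged 65 and older, is roughly half the clinical estimate, which illustrates the size of the discrepancy discussed above.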

Pneumonia-specific mortality in patients with aspiration

Three studies have reported on the pneumonia-specific mortality rate among patients with aspiration (Croghan, Burke, Caplan et al., 1994; Feinberg, Knebl, and Tully, 1996; Pick, McDonald, Bennett et al., 1996) (see Evidence Table 19). The rates range from 18.8 percent (Pick, McDonald, Bennett et al., 1996) to 31.8 percent (Croghan, Burke, Caplan et al., 1994). All three studies examined nursing home patients in particular. Both the lowest and highest rates came from studies with a 1-year followup. However, the study reporting the lowest rate determined aspiration on the basis of nurse observation rather than instrumented exam and, therefore, would have missed cases of silent aspiration. The other two rates were determined from VF on patients of mixed etiology. The most reliable rate comes from Feinberg et al. (1996), who evaluated 152 patients versus Croghan's 42 and estimated a pneumonia-specific mortality rate for patients with aspiration of 20.9 percent after a mean followup of 29 months (Feinberg, Knebl, and Tully, 1996). There is no figure for the elderly population as a whole, and, therefore, an estimate cannot be made for stroke patients in acute care.

All-cause mortality in pneumonia patients

Pneumonia may contribute to mortality resulting from other causes through its weakening of the immune system. Nine studies have reported on the overall death rates of patients with pneumonia (see Evidence Table 21). The overall mortality rates ranged from 5.8 percent (Beck-Sague, Villarino, Giuliano et al., 1994) to 53.5 percent (Thompson, Hall, Szpiech et al., 1997). These mortality rates are not specific to aspiration pneumonia; only one study (Jones, 1993) reported mortality rates specific to aspiration pneumonia, a rate of 47.5 percent, the second highest rate reported by all studies in Evidence Table 28. It has not been reported in the literature what proportion of elderly pneumonia patients suffer from aspiration pneumonia specifically. These studies are also largely not specific to stroke patients or even general neurologic patients.

Both the lowest and highest rates were found in nursing home patients, normally the frailest population represented. Beck-Sague et al. (1994) examined all nursing home residents with pneumonia, both those treated at the nursing home and those sent to acute-care (Beck-Sague, Villarino, Giuliano et al., 1994). Thompson et al. (1997) examined only those admitted to an acute-care hospital, which may account for some of the difference (Thompson, Hall, Szpiech et al., 1997). On the other hand, Houston, Silverstein, and Suman (1995) compared the mortality rates of nursing home residents treated in the home versus those treated in the hospital and found a higher mortality rate in those treated at the home (46 percent versus 35.5 percent), which would suggest that pneumonia in these patients should be taken more seriously, and more patients should be hospitalized.

Hospitalized elderly pneumonia patients demonstrate an overall mortality rate of 11 to 48 percent in a 30-day period (Garb, Brown, Garb et al., 1978; Houston, Silverstein, and Suman, 1995; Jones, 1993; Marrie, Durant, and Kwan, 1986; Marston, Plouffe, File et al., 1997; Reynolds, Gilbert, Good et al., 1998; Riquelme, Torres, El-Ebiary et al., 1996); neither length of followup nor age seems to correlate positively with mortality rate in these studies. Among nursing home patients specifically, hospitalized patients demonstrated a mortality rate of 36 to 54 percent (Houston, Silverstein, and Suman, 1995; Marrie, Durant, and Kwan, 1986; Thompson, Hall, Szpiech et al., 1997), substantially higher than the general pneumonia population. Because many nursing home patients with pneumonia are treated in the nursing home, the nursing home elderly who are admitted to the hospital may represent the sickest of the sick.

For elderly patients with neurologic disease, the most reliable figure is the only one specific to stroke: Reynolds, Gilbert, Good et al. (1998) reported an all-cause mortality rate of 23.8 percent. With only 21 patients, this estimate is likely to carry some error. Nevertheless, it seems to correspond well with the pneumonia-specific mortality rate of 19.8 percent reported above (Riquelme, Torres, El-Ebiary et al., 1996).

Calculation of Burden of Illness

Table 10. Dysphagia-Related Morbidity and Mortality in Stroke
Disease/disorder | Number affected/probability of occurrence
Stroke: individuals affected annually | 384,662-768,528
Total affected, age 65+ | 279,338
Pneumonia | 4,244,544
Pneumonia, 65+ | 1,026,662
Pneumonia in stroke | 50,006-99,909
Pneumonia mortality | 0.019
Total pneumonia rate of death, age 65+ | 0.198
Rate of death, hospitalized | 0.103
BSE findings
All-cause dysphagia in elderly population | 6,228,116
Dysphagia in stroke | 160,404-320,476
Dysphagia in the elderly stroke patients | 99,165
Malnutrition in stroke | 101,551-202,891
Malnutrition in dysphagic stroke | 77,636-155,110
Dysphagia resulting in pneumonia in stroke | 22,938-45,828
Pneumonia in neurogenic dysphagic elderly | 14,181
Nondysphagia resulting in pneumonia in stroke | 44,403-88,714
Proportion of all CVA pneumonias occurring in dysphagic CVAs | 0.34-0.459
Dysphagia resulting in pneumonia death | 4,542-9,074
Dysphagia resulting in pneumonia death, age 65+ | 2,808
Nondysphagia resulting in pneumonia death | 8,792-17,565
Proportion of dysphagic strokes leading to pneumonia death | 0.028
Proportion of nondysphagic strokes leading to pneumonia death | 0.039
VFSS findings
Dysphagia in stroke | 286,958-573,322
Dysphagia resulting in pneumonia in stroke | 41,035-81,985
Aspiration in stroke | 128,862-257,457
Aspiration resulting in pneumonia in stroke | 27,834-55,611
Aspiration resulting in pneumonia death | 5,511-11,011
Proportion of all CVA pneumonias occurring in aspirating CVAs | 0.557
Proportion of aspiration resulting in pneumonia death | 0.043
Table 11. Number of Patients Affected by Neurologic Disease Other Than Stroke
Disease/disorder (annual incidence) | Number affected
Parkinson's disease | 34,487
Dysphagia in Parkinson's | 23,830
Motor neuron disease | 15,917
Dysphagia in MND | 8,150
Amyotrophic lateral sclerosis | 4,775
Dysphagia in ALS | 1,385
Progressive supranuclear palsy | 2,918
Dysphagia in PSP | 1,622
Huntington's disease | 531
Dysphagia in HD | 531
Total burden of illness of dysphagia in neurologic diseases other than stroke | 51,435
Total burden of illness, all neurologic diseases including stroke, per year | 338,393-624,757
The total numbers of patients afflicted with each of the conditions, morbidities, and mortalities we have discussed in the Epidemiology and Burden of Illness sections of this report are shown in Table 10 and Table 11. Figures in these tables are based on our calculations using data from several studies, including government census figures. For population statistics, we accessed the CDC website (http://www.cdc.gov/), on which the most up-to-date population figures are stored. Population statistics discussed here are from 1996, unless otherwise noted. The rates taken from the epidemiological literature are shown in Table 8. A discussion of why these rates were chosen is found in Appendices A and B. Table 12 shows burden-of-illness calculations specifically for a stroke population; Table 11 addresses other neurologic disorders. These tables contain data from only those neurologic disorders for which there were reliable dysphagia statistics. Therefore, multiple sclerosis and Alzheimer's disease are not included.

A full discussion of these calculations is found in Appendix B. A summary of findings is found in Table 10 and Table 11.
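
As a rough cross-check of the summary tables, the VFSS-based aspiration rows and the combined burden-of-illness total can be recomputed from the tabulated inputs. The Python sketch below applies the 19.8 percent pneumonia death rate to the aspiration pneumonia estimates and adds the Table 11 total to the VFSS dysphagia-in-stroke range. Variable names are illustrative assumptions; the full derivation remains the one in Appendix B.

```python
# Illustrative check of selected Table 10 and Table 11 figures.
PNEUMONIA_DEATH_RATE = 0.198  # Riquelme, Torres, El-Ebiary et al., 1996

# VFSS-based low/high estimates from Table 10.
aspiration_in_stroke = (128_862, 257_457)
aspiration_pneumonia = (27_834, 55_611)
dysphagia_in_stroke_vfss = (286_958, 573_322)

# Table 11 total for neurologic diseases other than stroke.
other_neurologic_dysphagia = 51_435

# Deaths = aspiration pneumonia cases x pneumonia death rate.
aspiration_deaths = tuple(
    round(n * PNEUMONIA_DEATH_RATE) for n in aspiration_pneumonia
)

# Proportion of aspirating stroke patients who die of pneumonia.
death_proportions = tuple(
    round(d / n, 3) for d, n in zip(aspiration_deaths, aspiration_in_stroke)
)

# Combined annual burden: stroke (VFSS) plus other neurologic diseases.
combined_burden = tuple(
    n + other_neurologic_dysphagia for n in dysphagia_in_stroke_vfss
)

print(aspiration_deaths, death_proportions, combined_burden)
```

With these inputs the sketch yields 5,511 to 11,011 deaths, a proportion of 0.043 at both ends of the range, and a combined burden of 338,393 to 624,757, matching the tabulated values.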

Summary

Acute stroke affects approximately 384,662 to 768,528 individuals annually in the United States. Dysphagia occurs in over half of these patients at least temporarily during the first few weeks after the event. Malnutrition will affect more than 25 percent of these patients with dysphagia, if an effective treatment is not applied.

Some of these patients also experience aspiration, which leaves them susceptible to pneumonia. We have calculated that, ultimately, 4.3 percent of aspirating stroke patients will die of pneumonia if they are not part of a directed dysphagia management program. Approximately 240,781 people each year are affected by dysphagia resulting from other neurologic disorders. The large majority of these cases are the result of Parkinson's disease (almost 213,000). Combined with stroke, approximately 338,393 to 624,757 people will be affected by dysphagia resulting from neurologic disorders each year. Many of these patients may subsequently be affected by aspiration or pneumonia resulting from a disordered swallow, although presently, lack of data in the published literature makes it impossible to calculate these rates.

Methodology

Focus and Refinement of Topic

To focus, refine, and arrive at the key questions addressed by this assessment, the research team consulted with the initiator of the request for an evidence report [the Health Care Financing Administration (HCFA)] and a panel of nine experts in the field. Initial discussions clarified that the topic was to cover only swallowing disorders in elderly individuals and that dysphagia resulting from neoplasms or esophageal dysfunction was beyond the scope of the evidence report. Noninvasive therapies and feeding tubes were similarly deemed the only treatments within the scope of this report. Telephone conversations were then held with the experts, eight of whom were recognized experts in the areas of speech-language pathology, gastroenterology, and neurology, and one of whom was a patient representative.

During these conversations, a set of four basic questions was developed that would address the issues most important to HCFA and the technical experts. The research team then developed an evidence model based on these four questions (see the Evidence Model section below for a description of this model and the key questions). A document containing this evidence model and written descriptions of the specific issues depicted in this model (and addressed in the report) was then sent to HCFA and the experts for comment. Upon receipt of their written comments, members of the research team met to ensure that any issues raised by these individuals would be addressed in the evidence report.

Databases and Other Sources Searched

Electronic Database Searches

Several electronic databases were searched with the intention of retrieving both clinical trials for analysis, and review articles for background knowledge. Specific search strategies are shown in Appendix C. The databases searched and general keywords used are listed below:

American Speech-Language-Hearing Association (ASHA) Database (1945 to 1996)

CANCERLIT (through September 4, 1997)

CATLINE (through August 25, 1997)

The Cochrane Database of Systematic Reviews (through 1998, Issue 3)

The Cochrane Registry of Clinical Trials (through 1998, Issue 3)

The Cochrane Review Methodology Database (through 1998, Issue 3)

Combined Health Information Database (CHID) (through July 29, 1998)

Current Contents (through July 1998)

The Database of Reviews of Effectiveness (Cochrane Library) (through 1998, Issue 3)

DIRLINE (through November 1997)

ECRI Health Devices Alerts (1977 through July 1998)

ECRI Health Devices Sourcebase (through July 1998)

ECRI Healthcare Standards Database (1975 through July 1998)

EMBASE (Excerpta Medica) (1974 through February 6, 1998)

Health Care Financing Administration Coverage (HCFA) Manuals CD-ROM (through July 1998)

HealthSTAR (Health Services, Technology, Administration, and Research) (1990 through May 20, 1998)

Incidence and Prevalence Database (1988 through August 25, 1997)

ECRI International Health Technology Assessment (IHTA) Database (1990 through July 1998)

MEDLINE (1964 through July 24, 1998)

NIH Grants Database (through December 12, 1997)

Nursing and Allied Health (NAHL) (1988 through April 30, 1998)

PsycINFO (1967 through September 10, 1997)

Sociological Abstracts (1963 through November 1997)

The search strategies employed a number of free-text keywords, as well as controlled vocabulary terms, including but not limited to:

Diagnostic modalities: barium sulfate; barium swallow; barium; fluoroscopy; cineradiology; videofluoroscopy; FEES (fiberoptic endoscopic evaluation of swallowing); ESE (endoscopic swallowing evaluation); FEED (fiberoptic endoscopic evaluation of dysphagia); FEESST (fiberoptic endoscopic evaluation of swallowing with sensory testing); VEED (videoendoscopic evaluation of dysphagia)

Disorder: deglutition disorders (exploded); deglutition (exploded); dysphagia; swallowing

Epidemiology: epidemiology; research design; epidemiologic study characteristics; epidemiologic methods; epidemiologic studies; evaluation studies; incidence; prevalence; statistics and numbers; aspiration pneumonia; neurodegenerative diseases (exploded); Parkinson disease; silent aspiration; stroke

Etiology: aging; Alzheimer's disease; dementia; multiple sclerosis; Parkinson disease; stroke

Miscellaneous: cachexia; wasting; weight loss; quality of life; QOL; life satisfaction; satisfaction

Treatment: speech therapy; speech-language pathology; electrical stimulation; enteral nutrition; intubation, gastrointestinal; nasogastric; nasointestinal; NG; percutaneous endoscopic gastrostomy; PEG tube feeding; rehabilitation, geriatric; rehabilitation; speech and language; rehabilitation, patients; elder care; mobile health units

In general, the searches were restricted to human studies. Case reports were excluded.

World Wide Web Searches

Searches of the World Wide Web were also conducted using various search engines including, but not limited to, AltaVista, Hotbot, and Yahoo. Pertinent websites include:

Dysphagia

American Academy of Private Practice in Speech Pathology and Audiology (AAPPSPA) http://www.aappspa.org

American Dietetic Association (ADA) http://www.eatright.org

American Physical Therapy Association http://www.apta.org

American Speech-Language-Hearing Association http://www.asha.org

Atlanta Voice and Swallowing Center http://www.mindspring.com/~newvoice/

Otolaryngology for the Primary Care Practitioner [continuing education] http://cpmcnet.columbia.edu/dept/cme/cont0013.html

Dysphagia Research Society http://www.als.uiuc.edu/drs/

Evaluation and Interpretation of Acute Dysphagia Disorders and Dysphagia Rehabilitation/EMG Biofeedback Assisted Treatment http://www.speechpaths.com/pms9.htm

Ivanhoe's Medical Breakthroughs-Electrical Swallowing #1057 http://www.ivanhoe.com/docs/backissues/electricalswallowing.html

NIH Speech Pathology http://www.cc.nih.gov/rm/sp

NYEEI: Otolaryngology: FAQs about Swallowing Disorders http://www.nyee.edu/otolaryn/dysphfaq.htm

Swallowing Disorders http://www.webpages.marshall.edu/~lynch4/swallow.html

Swallowing Disorders Program http://www.stjosephs.org/iv_k_6.htm

Aging

AgeInfo http://www.cpa.org.uk/ageinfo/html

GoldenAge.Net http://www.elo.mediasrv.swt.edu/goldenage/script.htm

National Aging Information Center http://www.aoa.dhhs.gov/naic/

Netherlands Institute of Gerontology: GeronLine http://www.nig.nl/

Statistical Information on the Aging - Online Data http://www.aoa.dhhs.gov/aoa/stats/statlink.html

Diseases

American Heart Association 1997 Statistical Supplement http://www.amhrt.org/1997/stats/Stroke.html

Ask NOAH About: Aging and Alzheimer's Disease http://www.noah.cuny.edu/aging/aging.html

The National Multiple Sclerosis Society http://www.nmss.org

National Stroke Association http://www.stroke.org

NINDS Stroke Information Guide http://www.ninds.nih.gov/healinfo/disorder/stroke/strokehp.htm

Hand Searches of Journal and Nonjournal Literature

In addition to weekly searches of Current Contents - Clinical Medicine, more than 1,600 journals and supplements maintained in ECRI's collections were routinely reviewed. A hand search of the Cumulated Index Medicus (1960-1964) was conducted for the terms deglutition, deglutition disorders, speech therapy, and speech disorders. Nonjournal publications and conference proceedings from professional organizations, private agencies, and government agencies were also screened.

Other Mechanisms

Other mechanisms were used to retrieve additional relevant information, including review of bibliographies/reference lists from peer-reviewed and gray literature. (Gray literature includes reports, studies, etc., produced by local government agencies, private organizations, educational facilities, corporations, etc., that do not commonly appear in the published peer-reviewed journal literature.) Published and unpublished information was also solicited from a panel of experts in the field.

Information Retrieved

The use of these search methodologies, as well as personal communications from many technical experts, resulted in the identification of 4,646 items of information in the form of journal articles, book chapters, manuscripts, monographs, Web pages, personal communications, and other miscellaneous items. The titles and abstracts of all electronic search results were independently reviewed by two primary analysts blinded from each other, and articles were ordered using the following inclusion criteria:

  • Human studies

  • In vivo studies

  • English language

  • 10 or more subjects

The last criterion warrants some explanation inasmuch as there are a number of studies in this literature that contain small numbers of patients. The primary reason for excluding such studies is that their results are of uncertain generalizability. It is difficult to determine whether patients chosen for these studies were unique, or whether unique treatments were used in such studies. Hence, it is not clear that the results reported in these studies would be obtained at other sites and in other settings. Another reason for excluding small studies is statistical. The variance (and, hence, the confidence intervals) surrounding an effect increases as sample size decreases. Consequently, small studies have larger variances than large ones. This means that the information from small studies is not as "precise" as information from larger ones.
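
The statistical point can be illustrated with a short Python sketch. It uses the normal approximation for a proportion, in which the half-width of a 95 percent confidence interval is 1.96 times the square root of p(1 - p)/n. The 20 percent event rate is a hypothetical value chosen only for illustration. The sample sizes are the 10-subject inclusion threshold and the 42- and 152-patient studies mentioned earlier in this report. The function name is an assumption.

```python
import math


def ci_half_width(p, n, z=1.96):
    """Approximate 95% confidence-interval half-width for a proportion.

    Uses the normal approximation: z * sqrt(p * (1 - p) / n).
    """
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical event rate of 20 percent observed at several sample sizes.
HYPOTHETICAL_RATE = 0.20

for n in (10, 42, 152):
    half_width = ci_half_width(HYPOTHETICAL_RATE, n)
    print(f"n={n:>3}: 20% plus or minus {half_width * 100:.1f} percentage points")
```

Because the half-width shrinks in proportion to the square root of the sample size, halving the uncertainty requires roughly four times as many patients.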

Whenever possible, we used U.S. literature. This was to ensure that the potentially different healthcare systems and practices of other countries did not influence the information obtained. However, when little or no U.S. information was available, we used information collected in other countries.

These inclusion criteria, like our literature search strategies, were chosen to be broad to ensure retrieval of all relevant information. To further this goal, literature requested by each analyst was delivered to both analysts. During an initial review of the retrieved literature, the bibliographies of each article were reviewed for additional information that might not have been found during the database searches.

In certain instances, more specific inclusion criteria were used. These criteria are described in the appropriate sections of this report.

As a result of our searches, a total of 1,808 articles was retrieved. These included 1,467 clinical trials, 183 review articles, and material from 9 World Wide Web sites. In addition, we obtained 32 unpublished articles and received 28 personal communications.

Evidence Model

   Figure 1. Evidence Model of Diagnosis and Treatment of Dysphagia in the Elderly

To organize a discussion on the effects of different diagnostic and treatment strategies for dysphagia management in the elderly, we created an evidence model depicting the relationships among treatments, diagnostics, and patient outcomes. This evidence model is shown in Figure 1. The top box in this model shows the entry of patients age 65 and older into the healthcare system (as related to swallowing disorders). Below this, the model is divided into six sections (indicated by the solid black lines that surround groups of boxes). These sections depict diagnostic tools, symptoms detected by these tools, treatments, short-term outcomes, morbidities, and long-term outcomes. Within each box is a code that consists of a letter followed by a number (e.g., D1). The letters D, S, T, and M stand for Diagnostic Tools, Symptoms Detected by Diagnostic Methods, Treatments, and Morbidities, respectively. The letter O is used for both Short and Long-Term Outcomes. Any meaningful relationship between two boxes in any given section is noted in the subsequent section.

Typically, lines are drawn between the boxes in an evidence model to indicate the particular relationships being addressed in an evidence report. For example, a line might be drawn between boxes D5 and S1 to indicate the relationship between the modified barium swallow test and aspiration, a relationship that manifests itself as the ability of this test to detect aspiration. As another example, a line between boxes T1 and M1 indicates that the report is addressing the effect of swallow therapy on rates of aspiration pneumonia. These lines, sometimes called links, are not shown in this evidence model because the large number of them that we addressed could not be displayed in an easily readable fashion. Instead, we refer to these links by their code numbers (for example, Link T1-M1 refers to the link, or line, between boxes T1 and M1).
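
The link notation can be treated as a small lookup scheme. The Python sketch below expands a link code into the two boxes it joins. Only the box labels named in this report (D5, S1, T1, and M1) are filled in; the dictionary layout and the function name are illustrative assumptions, not part of the evidence model itself.

```python
# Section letters used for box codes in the evidence model.
SECTIONS = {
    "D": "Diagnostic Tools",
    "S": "Symptoms Detected by Diagnostic Methods",
    "T": "Treatments",
    "M": "Morbidities",
    "O": "Short and Long-Term Outcomes",
}

# Only the boxes named in the examples above; other boxes are left out.
BOXES = {
    "D5": "modified barium swallow",
    "S1": "aspiration",
    "T1": "swallow therapy",
    "M1": "pneumonia",
}


def describe_link(link):
    """Expand a link code such as 'T1-M1' into a readable description."""
    source, target = link.split("-")
    for code in (source, target):
        if code[0] not in SECTIONS:
            raise ValueError(f"unknown section letter in {code!r}")
    # Fall back to the bare code when a box label is not listed.
    return f"{BOXES.get(source, source)} -> {BOXES.get(target, target)}"


print(describe_link("D5-S1"))
print(describe_link("T1-M1"))
```

Referring to links by code in this way keeps the text unambiguous even though the lines themselves are not drawn in Figure 1.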

When viewed in this way, the evidence model depicts the kinds of diagnostics and treatments available, and the kinds of patient outcomes that may result from their use. In this respect, the evidence model is a map of available data, and is used to answer the four key questions of this report. These key questions are:

  1. How does diagnosis of dysphagia or aspiration affect the subsequent course of treatment and patient outcomes? (A related question, addressed in a Supplemental Analysis, is whether dysphagia diagnosis and treatment programs are cost-effective.)

  2. What are the appropriate indications for having patients diagnosed using a full bedside exam (BSE), modified barium swallow (MBS), fiberoptic endoscopy (FE), or another instrumented exam?

  3. Is one diagnostic technology more effective than any other diagnostic?

  4. When is noninvasive swallow therapy appropriate? Does it work particularly well or particularly poorly in any particular patient population? Are feeding tubes useful as a primary therapy, or should they be used as a last resort that might be avoided for some patients by dysphagia diagnosis and noninvasive therapy?

Common to the first three questions is a question about the efficacy and appropriate role of the BSE.

In addition to serving as a tool that assists in answering the questions upon which this report is based, the evidence model also serves a second function: to illustrate the approximate clinical pathway that patients follow. Each of these two functions of the evidence model is described in detail in the following sections of this report.

The Evidence Model as a Clinical Pathway

The evidence model approximates the clinical pathway followed by patients. Understanding this pathway is critical not only for understanding the relationship(s) between the kinds of available evidence, but also for understanding the kinds of future research that might be conducted. Thus, the purpose of this section is to elaborate on the use of the evidence model as an illustration of the clinical pathway.

Section one of evidence model: Diagnostic tools

The first step in patient management is to diagnose the patient as having or not having dysphagia. The patient will likely have a history and physical first, which will identify any outward symptoms that may indicate a problem; alternatively, the physical exam may simply be part of the diagnosis of a disease or disorder with which dysphagia is commonly associated (e.g., stroke). Patients who meet some diagnostic criteria for having dysphagia (or suspected dysphagia) will continue on to a more dysphagia-specific examination (designated as D2 to D7 in the evidence model), which may include one or more of the following diagnostic tests (described in detail in the Introduction to this report):

  • 3-ounce water test

  • Formal BSE

  • Informal BSE

  • MBS

  • FEESST

  • FEES

  • Other instrumented exam

Section two of evidence model: Symptoms detected by diagnostic tools

These diagnostic tests will have varying abilities to detect signs and symptoms of dysphagia. One of the main issues of this report is how well these diagnostic tests detect aspiration or risk for aspiration pneumonia. The two important symptoms (designated in the evidence model as S1 and S2) are therefore dysphagia and aspiration.

Section three of evidence model: Treatments

Depending on the symptoms detected, a patient with dysphagia or aspiration may undergo one or more treatments (designated as T1 to T4 in the evidence model). These include:

  • Swallow therapy

  • Modified diet

  • Combined swallow therapy and modified diet

  • Feeding tube

The treatment a patient receives depends on the type and severity of the problem detected.

Section four of evidence model: Short-term outcomes

There are many important outcomes reported in the field of medicine. Some of the most common and important ones are complications resulting from treatment and mortality. However, not every outcome of interest to society or patients is relevant to every intervention. It is therefore important to pick outcomes of interest based on relevance to the topic as determined by clinical experience (e.g., the observation of pneumonia in patients with dysphagia) or logical inference (e.g., that dysphagia may lead to malnutrition or dehydration).

It is possible, relatively soon after treatment has begun, to measure certain outcomes that indicate whether the treatment is working. These short-term outcomes are designated O1 to O6 in the evidence model and include:

  • Swallow function

  • Food intake

  • Feeding method

  • Weight change

  • Aspiration

  • Tube complications

Swallow function refers to the physiological changes in the process of swallowing that may occur after certain therapies, such as total swallow time and pharyngeal transit time. Food intake refers to the change in the amount of energy (e.g., kilocalories) ingested by the patient as a result of treatment. Weight change is an important outcome, as many elderly are malnourished, potentially as a result of a swallowing problem. Aspiration indicates whether the therapy of choice has successfully eliminated or reduced a previous aspiration problem. Tube complications, an outcome specific to the feeding tube, indicate that a problem has occurred with the tube, either intraoperatively or postoperatively (e.g., wound infection).

Section five of evidence model: Morbidities

One goal in the diagnosis and treatment of dysphagia is to prevent serious morbidities that can result from aspiration and an inability to ingest adequate amounts of food. The morbidities of interest in this report, designated M1 to M3, are:

  • Pneumonia

  • Malnutrition/dehydration

  • Major tube complications

Pneumonia resulting from dysphagia and aspiration is usually specifically aspiration pneumonia, the result of material passing into the lungs and causing infection. Pneumonia is a major cause of morbidity and mortality in the elderly, and is therefore perhaps the most important consideration. Malnutrition and dehydration are a result of the inability to ingest adequate amounts of food or liquid and are included in this model through logical inference rather than clinical evidence (to be discussed in depth later in this report). Major tube complications can lead to serious systemic morbidity such as gastric fistula, septicemia, peritonitis, or gastric hemorrhage. Treatment effects on the incidence of these morbidities are important to examine.

Section six of evidence model: Long-term outcomes

The ultimate concern in the diagnosis and treatment of any disorder is how the patient is ultimately affected: Does the patient live or die, and if the patient lives, is it a good life? These concerns are addressed in the evidence model by boxes O6 to O9:

  • QOL

  • Mortality (other)

  • Mortality resulting from morbidity

  • Mortality resulting from treatment (feeding tube)

QOL specifically addresses a concern about the ability of the elderly person to maintain an active, rewarding life although suffering from dysphagia. The measurement of QOL is a developing field, in which the patient's physical, psychological, and social well-being are assessed. From the patient perspective, this is perhaps the most important outcome to consider because it incorporates all outcomes, including death. Mortality (other) refers to patients who die of their underlying disease rather than from complications related to or resulting from the dysphagia problem. Mortality resulting from morbidity refers to those patients who die directly as a result of the morbidities outlined above: either from malnutrition, pneumonia, or an acute aspiration event. Mortality resulting from treatment refers specifically to those tube-fed patients who experience serious complications with the feeding tube and die as a result; all other treatments covered in this report are noninvasive and therefore do not include any directly related mortality. In this report, the effect of treatment on these long-term outcomes is examined.

Viewing the evidence model as a clinical pathway, and not as six discrete sections, highlights the fact that no single section should be reviewed separately from any other section. Further, to separately measure the effects of diagnosis and treatment of swallowing disorders on patients is not possible (see the section titled The Relationship between Diagnosis and Treatment of Swallowing Problems).

Use of the Evidence Model to Answer the Four Main Questions

Table 14. Levels of the Key Questions of this Report
Level of evidence | Type of question(s) posed by level | Examples of experimental end points
Level 6 | Is a dysphagia management program cost-effective? | Cost-effectiveness
Level 5 | Is a dysphagia management program efficacious? | Comparison of morbidity and mortality combined with demonstration of diagnostic efficacy
Level 4, Type 1 | Are dysphagia diagnostic methods efficacious? | Percentage of patients given specific treatments (without regard to treatment efficacy)
Level 4, Type 2 | Are dysphagia treatment methods efficacious? | Morbidity and mortality (without regard to diagnostic efficacy)
Level 3 | Do dysphagia diagnostic methods change patient management? | Percentage of patients whose management was changed
Level 2 | What is the performance of the various bedside exams, MBS, FEES, and FEESST? | Sensitivity, specificity, positive and negative predictive values
Level 1 | What is the resolution of the MBS, FEES, and FEESST? | Not relevant to the present report
In this section, we discuss each of the links addressed in this evidence report as they pertain to the key questions. It is important to recognize at the outset that the answers to some of these questions may be of greater interest to patients and society than the answers to others. This suggestion agrees with the evidence hierarchy proposed by Fryback and Thornbury (1991), and we have employed a modification of this scheme for the purposes of the present report. Our scheme, as it pertains to the diagnosis and treatment of swallowing disorders, is depicted in Table 14.

This hierarchy contains six levels. The lowest level (Level 1) addresses the technical efficacy of a technology. This level of evidence is illustrated by the situation often found in radiology in which it is determined that an imaging device can resolve a certain number of lines per centimeter.

Level 2 evidence provides information related to a more general question: how well a given diagnostic can determine whether a patient has (or does not have) a certain disease or condition. Evidence at this level, unlike that at the first level, provides the sensitivity and specificity of the diagnostic.
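
The Level 2 measures can be computed from a two-by-two table that cross-classifies the diagnostic result against a reference standard. The Python sketch below shows the standard definitions of sensitivity, specificity, and positive and negative predictive values. The counts are hypothetical and are not taken from any study reviewed in this report, and the function name is an assumption.

```python
def diagnostic_performance(tp, fp, fn, tn):
    """Level 2 measures from a 2x2 table of test result vs. reference standard.

    tp: test positive, condition present   fp: test positive, condition absent
    fn: test negative, condition present   tn: test negative, condition absent
    """
    return {
        "sensitivity": tp / (tp + fn),  # condition present and detected
        "specificity": tn / (tn + fp),  # condition absent and ruled out
        "ppv": tp / (tp + fp),          # positive results that are correct
        "npv": tn / (tn + fn),          # negative results that are correct
    }


# Hypothetical counts for illustration only.
result = diagnostic_performance(tp=40, fp=10, fn=20, tn=30)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also depend on how common the condition is in the population tested.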

Level 3 evidence has even greater generality: it addresses how physicians use the results of a diagnostic test. For example, a study providing evidence at this level could measure the percentage of patients whose management was changed as a result of the diagnostic test. The precedence this evidence takes over Level 2 evidence becomes apparent when we consider that even a perfectly sensitive and specific diagnostic test would not benefit patients if its results were not used in patient management. This level of evidence, however, does not specify the treatments given to patients.

At Level 4, there are two types of evidence. The first type is evidence that examines the performance of a diagnostic without regard to treatment efficacy. The second type is evidence that examines the efficacy of treatments without regard to the performance of the diagnostic test. Neither type of evidence is as general as evidence about the efficacy of a program of diagnosis and treatment, and both are more meaningful than the previous levels, which do not provide much information about patient outcomes. An example of the first type of Level 4 evidence is a hypothetical study that specifies that a certain percentage of patients given an MBS exam were instructed to tuck their chins when they swallowed and that another percentage of patients were instructed to turn their heads to one side when they swallowed. Even though the treatments are specified by this type of Level 4 evidence, the efficacy of these treatments (e.g., how many patients died of aspiration pneumonia) is not considered. Rather, this level of evidence specifies the percentages of patients given treatments thought to be effective. Therefore, this type of evidence primarily gives information about the performance of the diagnostic test. An example of the second type of Level 4 evidence would be a study that compared the effectiveness of two treatments given to patients who all had the same results on a diagnostic test. This type of evidence primarily provides information about treatment efficacy and does not consider whether patients are receiving the best possible diagnostic test. As such, it provides information about the outcomes of patients given diagnostic tests thought to be effective.

Level 5 evidence addresses both the efficacy of the diagnostic test and the efficacy of treatment. Evidence at this level addresses whether patients are receiving both the best available diagnostic and the best available treatment. In other words, the efficacy of dysphagia programs is addressed. As such, information that is of greater interest to patients and society than that provided by the evidence in Levels 1 through 4 is provided.

The highest level of evidence is provided by Level 6, which addresses the cost-effectiveness of any given program. At one extreme are programs that might be highly effective but so expensive that society's finite amount of dollars would save more lives were they spent elsewhere. At the other extreme are inexpensive programs that are not particularly efficacious. Such programs give rise to debates about whether they should be implemented because of the remote possibility they might benefit some patients.

How this evidence hierarchy is used to rank the major questions of this evidence report is explained below, in the discussion of the evidence model links of interest.

Question 1: How does diagnosis of dysphagia affect subsequent course of treatment and outcomes? (Level 5)

This question is a Level 5 question because it calls for assembling evidence or for studies that directly address the efficacy of a swallowing disorder program. As such, this question calls for one to directly link diagnostic methodologies to treatment and then to patient outcomes. Specifically, this question asks: When a particular diagnostic technology is used to direct treatment, how often do patients come down with pneumonia, become malnourished, or require a feeding tube after failed treatment? How many of these patients die or experience decreased quality of life? The ultimate purpose of any diagnostic and treatment program is to avoid such patient outcomes, and thus the focus of this question is of relatively great importance.

The specific links in the evidence model that address this question and the question represented by each link are:

Link D2-D8 to M1: How does the use of each of these diagnostic tests affect the rate of pneumonia in patients with dysphagia?

Link D2-D8 to M2: How does the use of each of these diagnostic tests affect the rate of malnutrition or dehydration in patients with dysphagia?

Link D2-D8 to O7: How does the use of each of these diagnostic tests affect the quality of life of patients with dysphagia?

Link D2-D8 to O9: How does the use of each of these diagnostic tests affect the rate of death due to dysphagia-related morbidities (pneumonia, malnutrition/dehydration)?

Link D2-D8 to T1-T4 to O1-O10: How does the use of each diagnostic to guide treatment affect outcomes after specific treatments?

Link D2-D8 to T1-T4 to M1 to M3: How does the use of each diagnostic to guide treatment affect morbidities after specific treatments?

Link D2-D8 to T1-T4: How are the results of diagnostic tests used to guide treatment?

Question 2: What are the appropriate indications for having patients diagnosed using a full BSE, MBS, or FEES? (Level 2)

This question is a Level 2 question because in asking how to selectively choose patients to undergo more extensive testing for dysphagia and aspiration, the question is about the sensitivities and specificities of a particular noninstrumented exam, and does not consider the efficacy of any treatment(s) that may result from this test or how patient management might be changed by the test results. Of interest in this question is whether particular signs and symptoms are detected during preliminary clinical or noninstrumented examination (general physical exam, preliminary BSE, etc.) that identify patients who would be most or least likely to benefit from further testing (patients who in particular are at risk for morbidities or mortality resulting from dysphagia or aspiration).

The links addressed by this question are:

Link D1-D4 to M1: Do particular signs or symptoms detected during a history and physical or preliminary BSE predict pneumonia?

Link D1-D4 to S1: Do particular signs or symptoms detected during a history and physical or preliminary BSE predict aspiration?

Question 3: Is there any evidence that one diagnostic technology (full BSE, MBS, etc.) provides more useful information than any other diagnostic? (Level 2)

This question is a Level 2 question because it is restricted to the sensitivity and specificity of each diagnostic modality and compares them with one another in their ability to predict pneumonia or aspiration. This question only addresses the ability of these tests to identify the probability for these problems and does not consider the effect of these technologies on patient outcomes.

The links that represent these issues are:

Link D2-D8 to M1: How well does each of these diagnostic methods predict pneumonia?

Link D2-D7 to S1: How well does each of these diagnostic methods predict chronic aspiration?

Question 4: When is noninvasive swallow therapy appropriate? Does it work particularly well or particularly poorly in any particular patient population? What can the evidence tell us about this therapy? Is PEG useful, or is it a last resort? (Level 4)

This question is a Level 4 question because it addresses the efficacy of different therapies for the treatment of dysphagia and which patients, if any, receive the most benefit from the therapies. It does not, however, address the efficacy of any diagnostic test.

Of interest are both short-term outcomes, such as physiological changes immediately detectable during or after treatment, as well as long-term outcomes, such as rate of morbidity and mortality. The long-term outcomes are more important, and are therefore the primary focus of this question.

Several links in the evidence model are important in answering this question:

Treatments to Short-term Outcomes

Link T1 to O1: How does swallow therapy affect physiological swallow function?

Link T1-T3 to O2: How does noninvasive therapy affect the patient's ability to ingest adequate amounts of food and drink as measured by energy intake and weight change?

Link T1-T3 to O3: How does noninvasive therapy affect the type of feeding (oral versus tube, special textures versus normal diet) the patient is able to safely perform?

Link T1-T3 to O4: How does noninvasive therapy affect the weight of the patient (as an indirect measure of food ingestion)?

Link T1-T3 to O5: How does noninvasive therapy affect the occurrence of aspiration in known aspirators?

Link T4 to O4: How does tube feeding affect the weight of the patient (as an indirect measure of food ingestion)?

Link T4 to O5: How does tube feeding affect the occurrence of aspiration in known aspirators?

Link T4 to O6: How often does the use of a feeding tube result in complications directly related to the tube?

Treatments to Morbidities

Link T1-T3 to M1: How often does pneumonia occur in patients receiving each of these noninvasive therapies compared with those receiving other therapies or no therapy?

Link T1-T3 to M2: How well does each noninvasive therapy prevent or cure malnutrition or dehydration compared with other therapies or no therapy?

Link T4 to M1: How often does pneumonia occur in patients on a feeding tube compared with those receiving other therapies or no therapy?

Link T4 to M2: How well does the use of a feeding tube prevent or cure malnutrition compared with other therapies or no therapy?

Link T4 to M3: How often does the use of a feeding tube result in major tube-related complications?

Treatments to Long-term Outcomes

Link T1-T3 to O7: How does each noninvasive therapy affect a patient's QOL?

Link T1-T3 to O8: How often do patients undergoing each noninvasive therapy die of underlying disease?

Link T1-T3 to O9: How often do patients undergoing each noninvasive therapy die of serious morbidity related to the dysphagia?

Link T4 to O7: How does tube feeding affect a patient's QOL?

Link T4 to O8: How often do patients undergoing tube feeding die of their underlying disease?

Link T4 to O9: How often do patients undergoing tube feeding die of serious morbidity related to dysphagia?

Link T4 to O10: How often do patients undergoing tube feeding die of causes directly related to the feeding tube itself?

Supplemental analysis: What is the cost-effectiveness of dysphagia diagnosis and treatment programs? (Level 6)

This question addresses the cost-effectiveness of dysphagia programs. Ideally, this sort of analysis is performed from a societal perspective, but such analyses require data that are often difficult to obtain (e.g., costs to family members to drive a patient to a care facility). Even though some data relevant to the strict societal perspective of a Level 6 question have not yet been collected, we will, nevertheless, treat this question as if it were Level 6. It is the only question to take both costs and effectiveness into account, so it is clearly of a higher level than our other questions.

Study Quality

There are essentially four types of study designs in the literature that we analyzed for an evidence-based appraisal of diagnosis and treatment of swallowing disorders: randomized controlled trials (RCTs), historical prospective case series,1 case-control studies, and case series. In general, we rank data from RCTs higher than data from studies of other designs, and assign data from historical prospective case series to the second highest rank. We do not, however, adhere rigidly to this scheme because, as discussed below, some RCTs have flaws serious enough to preclude assigning their data to the highest rank. Similarly, we do not rank case-control studies above case series because of certain flaws in the former. These flaws are particularly evident when case-control designs are used to measure the sensitivity and specificity of diagnostic tests. Such designs artificially set the prevalence and severity of disease, rendering measures of test performance derived from them suspect.

Because no inviolable rule for preferring the data from a study of one design to a study of another design can be constructed for the present report, we do not assign numerical or letter codes to these study designs. Rather, we simply provide the study designs in the evidence tables and the narrative of the report. Also, because of the lack of an inviolable rule, we discuss the quality of individual studies at some length in the following sections. This is done to provide evidence that such rules cannot be rigidly adhered to. Our discussions of studies in the sections below generally follow the order in which studies are typically ranked; they begin with RCTs and end with case series. As explained in the section entitled Information Retrieved, we did not consider case reports.

Quantitative Methods

Throughout the Results section of this report, we perform numerous original calculations. A list of these original calculations appears in Appendix H. Briefly, however, we computed many of the effect sizes of treatment, computed their 95 percent confidence intervals (C.I.s), conducted statistical tests on the reported effects, and computed most of the sensitivities, specificities, positive predictive values, and negative predictive values discussed below. It is important to note that the 95 percent C.I.s calculated around proportions were computed using a formula that corrects for the distortions observed in these intervals when any given proportion greatly deviates from 0.5. The need to perform this substantial number of original calculations is a reflection of the relatively poor reporting and the relatively poor quality of the literature related to this evidence report. Without these calculations, it would have been difficult to address our four key questions.

In addition to the above-mentioned calculations, we also adjusted for the varying lengths of followup used by different studies, conducted meta-analyses (in the form of summary receiver operating characteristic curves) on the sensitivities and specificities of the 3-ounce water test and the BSE, performed an exploratory meta-analysis (termed exploratory because it employs pooled historical data as the control group data), and conducted a meta-analysis used to illustrate the low statistical power of vote-counting procedures.
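For readers unfamiliar with summary receiver operating characteristic (SROC) curves, the following is a minimal sketch of one common way to construct them, the linear regression approach of Moses and Littenberg. That this particular approach matches the one used in the report is an assumption; the study values and function names below are hypothetical and serve only to show the mechanics.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_sroc(studies):
    """Fit D = a + b*S by ordinary least squares.

    studies: list of (sensitivity, specificity) pairs, each strictly
    between 0 and 1. D is the log diagnostic odds ratio and S is a
    proxy for the positivity threshold used in each study.
    """
    d_vals, s_vals = [], []
    for sens, spec in studies:
        tpr, fpr = sens, 1.0 - spec
        d_vals.append(logit(tpr) - logit(fpr))
        s_vals.append(logit(tpr) + logit(fpr))
    n = len(studies)
    mean_s = sum(s_vals) / n
    mean_d = sum(d_vals) / n
    sxx = sum((s - mean_s) ** 2 for s in s_vals)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = mean_d - b * mean_s
    return a, b


def sroc_sensitivity(fpr, a, b):
    """Sensitivity predicted by the fitted curve at a given false-positive rate."""
    return inv_logit(a / (1.0 - b) + (1.0 + b) / (1.0 - b) * logit(fpr))


# Hypothetical (sensitivity, specificity) pairs from four imaginary studies.
example_studies = [(0.80, 0.70), (0.65, 0.85), (0.90, 0.55), (0.75, 0.75)]
a, b = fit_sroc(example_studies)
print(f"a = {a:.3f}, b = {b:.3f}")
print(f"sensitivity at 20% false positives: {sroc_sensitivity(0.20, a, b):.2f}")
```

A summary curve is preferred to averaging sensitivities and specificities separately because studies may use different positivity thresholds, which trade one measure against the other.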

We also employed other quantitative methods in our supplemental analysis, which contains two cost-effectiveness analyses. Both of these latter analyses play an important role in our conclusions.

Quality Control Methods

Preparation of this evidence report was monitored by an eight-member ECRI internal review committee. This committee performed several functions. The first was to oversee the abstraction of data from full-length articles. This was accomplished by having the review committee examine the evidence tables into which data were directly entered. Data collection forms were not used in the present report because this method was cost-ineffective. This was because of the large number of links we examined (each of which required unique information) and because of the relative lack of data relevant to some links (preparing an abstraction form for only a few studies is inefficient). Data for the key questions were abstracted independently by two analysts.

The second function of the committee was to critically evaluate the methods, logic, and conclusions of the evidence report. This was accomplished by reviewing the project at several phases. These reviews consisted of examination of all aspects of the report, but particularly the evidence model, the results and conclusions, the supplemental analysis, and all calculations performed in the report. Six full-day review meetings were held in addition to several shorter meetings. As a result of these processes, no one individual had undue influence over the conclusions made in this report.

Results

In this section, we discuss and assess the evidence for each relevant link in the evidence model as it relates to each of our four primary questions.

1. How does the diagnosis of dysphagia or aspiration affect the subsequent course of treatment and outcomes?

We address the first question in two subsections. The first addresses whether use of an instrumented or noninstrumented diagnostic improves patient outcomes, particularly aspiration pneumonia. In the second subsection, we examine the evidence for whether these diagnostics change patient management. Nevertheless, we do not ignore outcomes in this second section, and we discuss, wherever possible, the outcomes associated with specific treatments. We more directly address the question of which treatments are most effective in the fourth question of this evidence report.

Outcomes: Dysphagia Diagnosis Treatment Program for Prevention of Serious Morbidities

The primary reason to use any diagnostic test is to improve patient management and, thus, patient outcomes. Several dysphagia-related outcomes are important, including subjective patient quality of life (QOL) improvement, prevention of malnutrition, prevention of dehydration, prevention of pneumonia, and prevention of death.

We could not find sufficient suitable data for formal evaluation of QOL improvement due to dysphagia diagnosis, so this outcome is not evaluated in relation to our first question.

One of the most meaningful endpoints to patients is prevention of death. Again, there was little evidence on this in the literature, particularly on death due to malnutrition (if, despite the availability of feeding tubes, it occurs). We were, however, able to estimate potential death rates due to dysphagia, and these are addressed in the Epidemiology section of this report.

Malnutrition and Dehydration

Prevention of malnutrition and dehydration is also important, but the relationship between these outcomes and diagnosis of dysphagia is difficult to determine from available evidence. This is partly because the administration of parenteral or enteral nutrition is often introduced if there is even a threat of malnutrition. Similarly, intravenous fluids are commonly given to inpatients with any serious disease or illness that precludes ingestion of liquids. For these reasons, in acute care, malnutrition and dehydration are largely dealt with outside of the context of formal diagnosis and treatment of dysphagia. This could explain why few studies have attempted to determine the extent to which (or even whether) dysphagia causes malnutrition/dehydration. Because of the resulting lack of evidence, we do not examine this outcome in relation to this question. There is, however, the possibility that for some patients, diagnosis of dysphagia may lead to therapy that prevents malnutrition and/or dehydration, and therefore obviates the need for feeding tubes. Whether, in fact, dysphagia management does reduce the use of feeding tubes is addressed in our fourth question.

Pneumonia

For the purposes of this question, then, prevention of pneumonia is the primary endpoint, an endpoint that is reported relatively often in the literature. The advantage of this endpoint is that it allows us to ask whether use of any particular diagnostic leads to lower pneumonia rates. In addressing this question, we focused on stroke patients, either in acute care, rehabilitation units, or nursing homes, inasmuch as this reflects the emphasis of the literature.

Aspiration pneumonia, in particular, is the specific diagnosis of interest in a population with dysphagia, as these patients are at risk of aspirating food or liquid into their lungs. The definition of aspiration pneumonia is not clear in the medical literature, and it is often difficult to diagnose without the use of invasive procedures. It is therefore questionable in many studies whether patients diagnosed as having aspiration pneumonia were actually definitively diagnosed as such. Adding to this confusion is the fact that the causative relationship between aspiration and pneumonia has been demonstrated in the literature as somewhat tenuous (see the section entitled Burden of Illness). In addition to aspiration, the general debilitation, malnutrition, and dehydration that may accompany stroke and degenerative neurologic conditions could increase risk of pneumonia by weakening immune defenses. Also, poor oral hygiene might accompany these conditions and thus increase the presence of oral-pharyngeal bacteria or other microorganisms that cause increased risk of pneumonia. Nevertheless, in the presence of markedly increased dysphagia and aspiration in these neurologically impaired patients, aspiration remains the most plausible route into the lungs for microorganisms that might overwhelm a weakened immune system. Therefore, even though aspiration alone may not be sufficient to cause pneumonia in otherwise uncompromised people, aspiration does appear to participate in the etiology as the most plausible route of harmful microorganisms into the lungs. Thus, the term aspiration pneumonia is probably warranted in patients who are at such increased risk of both aspiration and pneumonia.

Because of the lack of uniformity in the criteria for distinguishing aspiration pneumonia from other pneumonia in dysphagia in published studies, we were forced to accept the terminology used by the authors of each study. This is unlikely to have had a major impact on the results of this report because there were no marked differences in the frequency of pneumonia in acute stroke patients, regardless of whether the authors stated they were studying aspiration pneumonia or just pneumonia. However, the term chest infection, which is common in the medical literature of the U.K., did appear to be a broader definition that was associated with markedly higher frequencies. For this reason, we generally did not use this type of data. Otherwise, we included and analyzed any dysphagia study that examined pneumonia, regardless of whether the authors stipulated that it was aspiration pneumonia. Because we must frequently refer to results from more than one study, we often use the generic term pneumonia, even though some of the studies being referred to may have used the term aspiration pneumonia. Under the circumstances, it does not appear practical to put too fine a point on the possible distinctions between these terms.

The links of the evidence model that are relevant to answering this first question are the links between each of the diagnostic techniques and aspiration pneumonia. Referring to the evidence model shown in Figure 1, these are the eight imaginary lines between each of the boxes labeled D1 through D8, and box M1. These links each ask whether there is a relationship between use of any of the diagnostic tests and pneumonia or, more specifically, does use of any of these tests prevent aspiration pneumonia? It is also possible to compare the results of one link with another to determine whether the use of one diagnostic leads to lower pneumonia rates than another. As such, these links presume that treatment intervened between diagnosis and the outcome and, therefore, these links address the efficacy of dysphagia diagnosis and treatment programs. To ensure that the meaning of each link is clear, we preface our discussion of these links by presenting in each heading the question posed by the link and parenthetically noting the codes for the specific boxes of the evidence model that comprise the link.

Does Use of Noninstrumented Exams in a Dysphagia Program Reduce Pneumonia Rates? (Links D1 through D4 to M1)

To address this link, we have combined history and physical examinations and the formal and informal bedside examinations (BSEs) into a single question. This is appropriate because we can assume that all patients receive some sort of history and physical as part of the routine examinations conducted on stroke patients.

We found no randomized controlled trials (RCTs) that were relevant to this question.

We found one study (Odderson, Keaton, and McKenna, 1995) of a dysphagia program that used aspiration pneumonia as an outcome and that included a control group. This study was a historical prospective case series that assessed the efficacy of a dysphagia program using a BSE followed by speech-language therapy. The assessment consisted of comparing pneumonia rates at the same institution immediately before and after introduction of the program (see Evidence Table 25 for a description of this study and its results). This study was performed in an urban community hospital. A pneumonia frequency of 6.7 percent (95 percent CI of 3.1 to 14 percent) was reported in consecutively admitted stroke patients the year before instituting a dysphagia program (the historical control group). A frequency of 4.1 percent (CI of 1.8 to 9.3 percent) was reported in 121 patients during the first year of the program, and no cases of pneumonia out of 124 admissions (CI = 0 to 3.0 percent) during the second program year. Oral communication with the author indicated that there was one case of pneumonia in the third year and no cases in the fourth year (Odderson, 1998). In this program, a preliminary BSE, typically administered by a nurse, was used to determine whether patients had a safe swallow, defined by the following criteria: (1) the patient is alert, follows simple requests, has a clear, strong voice, and can produce a strong cough; (2) the patient can handle his or her own secretions without difficulty, and can swallow ice chips and sips of ice water briskly; and (3) the larynx elevates completely at the time of swallowing, the voice remains clear after swallow, and there is no coughing afterwards. If the patient met all of these criteria, trial meals were instituted starting with thin and thick liquids.

Patients not passing the preliminary test had oral intake restricted and were given a formal BSE by a speech-language pathologist (SLP). Dysphagia patients were followed by speech pathologists with 15-minute daily sessions for 6 days. The mean length of hospital stay for all stroke patients was 7 days during the program and 9 days in the pre-program period.

The authors did not determine whether the reported decrease in pneumonia after initiation of the program was statistically significant. The authors of this study also did not test statistically whether the results obtained in the second program year were different from those obtained in the first. We carried out such an analysis by calculating the 95 percent CI around the difference between the two proportions (6.7 percent pneumonia prior to the program, and 0 percent for the second year of the program) using the method of Wilson as described by Newcombe (Newcombe, 1998), which compensates for the fact that the commonly used methods of calculating CIs around proportions (or their differences) yield spurious results when the proportions are near zero. The absolute difference between the percentage of all stroke patients with pneumonia before the program and during the second year was 6.7 percent, with a 95 percent CI from 2.0 to 11 percent. The fact that the CI around the difference does not include zero indicates that the difference is statistically significant. Another way to analyze these data is to calculate the proportional reduction in relative risk (RRR).1 We calculated that the RRR was 39 percent for the first program year and 100 percent for the second year.
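The single-proportion intervals and the risk reductions quoted for this study can be sketched as follows. This is a minimal illustration rather than the report's own calculation: it assumes the Wilson score interval with z = 1.96, and the count of 5 pneumonia cases in the first program year is inferred from 4.1 percent of 121 patients, not stated directly. The function names are hypothetical, and the interval for the difference between two proportions is not reproduced here.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Wilson score interval for a proportion (assumed method, z = 1.96)."""
    p = events / n
    denom = 1.0 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)


def relative_risk_reduction(control_rate, program_rate):
    """Proportional reduction in risk relative to the control rate."""
    return (control_rate - program_rate) / control_rate


# Second program year: 0 cases in 124 admissions (from the text).
low2, high2 = wilson_interval(0, 124)
print(f"year 2: {low2 * 100:.1f} to {high2 * 100:.1f} percent")  # about 0 to 3.0

# First program year: 5 of 121 is an inferred count (assumption).
low1, high1 = wilson_interval(5, 121)
print(f"year 1: {low1 * 100:.1f} to {high1 * 100:.1f} percent")  # about 1.8 to 9.3

# RRR from the reported percentages: 6.7 pre-program, 4.1 and 0 in program years.
print(f"RRR year 1: {relative_risk_reduction(6.7, 4.1) * 100:.0f} percent")
print(f"RRR year 2: {relative_risk_reduction(6.7, 0.0) * 100:.0f} percent")
```

Unlike the simple normal-approximation interval, the Wilson interval does not collapse to zero width when no events are observed, which is why a program year with no pneumonia cases still has an upper bound of about 3 percent.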

This trial may represent one of the few ethical ways to compare the results of an explicit program that included diagnosis and treatment with a situation in which there was no diagnosis. Nevertheless, the pre-program data almost certainly underestimate the pneumonia rate of a true no-diagnosis situation, because according to oral communication with the author, some patients in the pre-program year were receiving whatever dysphagia diagnosis and treatment was requested by attending physicians (Odderson, 1998). Given this, it is tempting to speculate that greater pneumonia rates would be found in the true no-diagnosis situation and that, therefore, this trial underestimates the effect of a dysphagia program. However, the study has weaknesses that preclude reaching such a conclusion. First, there was no in-depth comparison of the patient characteristics before the program and during the program, so the authors did not formally demonstrate that these patient groups were similar. Second, the data were not collected in the same way for the pre-program and program groups. The pre-program data were collected retrospectively, and the program data were collected prospectively. It would have been preferable if both were collected prospectively, because this would provide more confidence that there was no bias in the selection or comparison of the groups in this nonrandomized study. Finally, because the control and experimental group data were obtained at different time periods, there remains the formal possibility that, as with any time series-like study, improvements in patient outcomes were due to some unspecified change in physician practice, patient management, or hospital practice and policies. Because of these shortcomings, it is fair to conclude that an improvement in pneumonia rates was correlated with the presence of a dysphagia program that used a BSE as the diagnostic tool, but the evidence that this improvement was caused by this program is relatively weak. It is also imprudent to assume that the size of the effect measured by these authors (i.e., that their program was perfectly successful) is generalizable, but in the face of such a large apparent effect, it is similarly imprudent to entirely dismiss the results of this study because of its weaknesses.

Because this was the only relevant U.S. information we located, we searched the foreign literature for additional evidence. We found an uncontrolled study from a Swedish program (Nilsson, Ekberg, Olsson et al., 1998) that reported a pneumonia frequency of 2.8 percent (2/72) for consecutive stroke patients over a 6-month interval (see Evidence Table 27). This was the frequency for patients who were able to respond to the preliminary BSE, which was simply patient report of difficulty swallowing. Patients who thus self-reported dysphagia were treated with parenteral nutrition (IV drips) or diet modification as needed; no nasogastric (NG) or gastrostomy tubes were used. An additional 28 percent of stroke patients were not evaluable, and they had a pneumonia frequency of 11 percent (3/28).

Adjustment of published results to a common followup period

It is difficult to interpret or compare the results of the above studies because they used different followup periods. For example, the pre-program followup in the Odderson et al. study (Odderson, Keaton, and McKenna, 1995) averaged 9.8 days, and the followup during the second program year averaged 7.2 days. Further, the Nilsson et al. study (Nilsson, Ekberg, Olsson et al., 1998) used yet another followup, 6 months. The problem in comparing such data arises because the incidence of pneumonia may change as a function of the time since the stroke. We therefore sought to adjust these results to a standard time period after stroke.

In this effort (described more fully in Appendix A), we found no studies that specifically reported the change in the incidence of aspiration pneumonia with time following stroke. We did, however, find one study that reported the change in incidence of chest infection over the time following stroke. Davenport, Dennis, Wellwood et al. (1996) (Webb, Fayad, Wilbur et al., 1995) plotted the cumulative percent of chest infection out to 30 days following stroke. Chest infection is a broader definition than the usual aspiration pneumonia criterion of radiologically confirmed lower pulmonary evidence of pneumonia; however, the shape of this declining incidence curve should be approximately the same for both definitions. This means that at each weekly interval following stroke, the proportion of cases should be the same for both curves, although the absolute numbers might be different. Thus, we fitted a logarithmic curve (SPSS 7.5, Chicago, IL) to the Davenport et al. data, and from this curve, we were able to interpolate the proportion of pneumonia cases occurring at each weekly interval following stroke. We then applied the appropriate weekly interval adjustment factor to the frequency for any reported interval, and standardized the pneumonia frequencies to a single interval.
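The adjustment described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure documented in Appendix A: it assumes the fitted curve has the form cumulative percent = a + b ln(days), that the adjustment factor is the ratio of the fitted cumulative values at the two follow-up lengths, and the data points are hypothetical placeholders, not the Davenport et al. values. Ordinary least squares is used here in place of the SPSS fit.

```python
import math


def fit_log_curve(days, cumulative_pct):
    """Least-squares fit of cumulative_pct = a + b * ln(days)."""
    xs = [math.log(d) for d in days]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cumulative_pct) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cumulative_pct))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def standardize_frequency(freq_pct, from_days, to_days, a, b):
    """Rescale a frequency observed over from_days to a to_days interval.

    The factor is the share of cumulative cases the fitted curve places
    within to_days relative to from_days (an assumed form of adjustment).
    """
    factor = (a + b * math.log(to_days)) / (a + b * math.log(from_days))
    return freq_pct * factor


# Hypothetical cumulative chest-infection percentages by day after stroke.
days = [1, 3, 7, 14, 21, 30]
cumulative = [4.0, 8.5, 12.0, 14.5, 16.0, 17.0]

a, b = fit_log_curve(days, cumulative)
adjusted = standardize_frequency(6.7, from_days=9.8, to_days=7.0, a=a, b=b)
print(f"fitted curve: {a:.2f} + {b:.2f} * ln(days)")
print(f"6.7 percent over 9.8 days, rescaled to 7 days: {adjusted:.1f} percent")
```

Because the curve rises steeply at first and then flattens, shortening a follow-up interval that is already beyond the first week changes the frequency only modestly, while extending a very short interval changes it more.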

For the Odderson et al. study, we standardized to a 7-day interval, because that closely approximated the 7.3- and 7.2-day length-of-stay (LOS) intervals for their pneumonia frequencies for the first and second years of their program. The 9.8-day LOS interval for the pre-program year was likewise adjusted to a 7-day interval. The only effect of this adjustment was to change their historical control frequency of pneumonia from 6.7 to 6.3 percent (Evidence Table 25). The program frequencies did not change, and the difference in pneumonia frequency between the pre-program year and the second program year was still statistically significant.

Exploratory analysis using pooled historically controlled data

Because only the Odderson et al. study provided a historical control pneumonia frequency for a period without a structured dysphagia program, we searched for additional historical control data (Evidence Table 26) from studies that did not evaluate a dysphagia program or attempt such an evaluation. We found one such contemporary U.S. study that reported the frequency of pneumonia in stroke patients in an acute-care facility (Young and Durant-Jones, 1990). The pneumonia frequency was 13 percent (28 of 216 patients; 95 percent CI = 9.1 to 18 percent). Four percent of the patients were excluded because they were comatose. These patients are likely to have had a high frequency of pneumonia that would have increased the overall frequency slightly had they been included. However, on an intent-to-treat basis, these patients would not be eligible for modified barium swallow (MBS), fiberoptic endoscopic examination of swallowing (FEES), or interactive therapy; therefore, we can exclude them from analysis. The followup interval was a mean LOS of 31 days. In Evidence Table 26, we show the pre-program pneumonia frequency from Odderson, Keaton, and McKenna (1995) and the additional U.S. no-program frequency we located (Young and Durant-Jones, 1990).
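
The text does not state which method was used to compute confidence intervals for single proportions. As a hedged illustration, the reported interval for 28 of 216 patients is numerically consistent with the Wilson score interval, sketched below. The function name is illustrative, and the choice of method is an inference from the numbers, not something the report confirms.

```python
from statistics import NormalDist

def wilson_interval(events, n, confidence=0.95):
    """Wilson score interval for a binomial proportion, as (low, high) fractions."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95 percent
    p = events / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * ((p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)

low, high = wilson_interval(28, 216)
print(f"frequency = {100 * 28 / 216:.1f} percent, "
      f"95 percent CI = {100 * low:.1f} to {100 * high:.1f} percent")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within 0 to 100 percent and behaves sensibly for the small counts (for example, 1 or 2 events) that appear in several of the studies discussed in this section.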

We also found two older studies reporting pneumonia frequency after stroke, covering the years 1967 through 1984 (Barker and Mullooly, 1997; Haerer and Smith, 1974). These studies reported frequencies of 8.6 percent and 10 percent. Because these frequencies were within the range reported in the two more recent studies described above and did not appear to change over these years, we performed an exploratory analysis in which we averaged all of these reported frequencies to get a historical no-program pneumonia frequency of 9.2 percent (95 percent CI = 7.8 to 11 percent) to compare with the above dysphagia program case series pneumonia frequencies. Adjusted to a 2-week interval, this historical control frequency was 8.2 percent (CI = 6.9 to 9.7 percent; Evidence Table 27). This is an arbitrary interval, but it does approximate the typical LOS in the articles we examined.

We then performed an exploratory analysis in which the Odderson, Keaton, and McKenna (1995) and Nilsson, Ekberg, Olsson et al. (1998) data were compared with the historical controls. We refer to this as an exploratory analysis because it is subject to the same problems as other quasi-experimental designs. Among these problems are the inability to control for patient characteristics and the inability to control for different practices and policies that may exist at different institutions. These weaknesses exist not only in our comparison of the experimental and control groups but also in combining the historical data into a single number. Therefore, the results of this analysis should be considered circumstantial.

Stated in another way, although our comparison involves statistically combining results, it should not be confused with a formal meta-analysis. The latter is usually conducted on data from a homogeneous set of RCTs (all using the same treatment), whereas we have compared the results of heterogeneous case series studies (different diagnostic and treatment methods) with those of other case series that we used as historical controls. Our analysis does not have the internal validity of a meta-analysis of controlled trials, which would pool only the within-study differences between experimental and control groups, because it cannot control for different treatment settings or patient populations.

Evidence Table 27, which displays the results of this exploratory analysis, shows the raw and adjusted frequencies for the program years for Odderson, Keaton, and McKenna (1995) and Nilsson, Ekberg, Olsson et al. (1998), along with the difference between each program year and the historical control frequency we calculated, both raw and adjusted to a 2-week interval. Reductions of relative risk, adjusted and unadjusted, are also shown. The 95 percent CIs around the differences indicate that, for the raw or adjusted frequencies, the second year of the Odderson, Keaton, and McKenna (1995) study achieves statistical significance at an alpha level of 0.05 (the 95 percent CIs for the differences do not include 0), but that the reported frequencies in the Nilsson, Ekberg, Olsson et al. (1998) study do not achieve statistical significance (the CIs around the differences include 0).

Because our analysis is exploratory, it should be considered a descriptive analysis, and not one that attempts to ascribe causation. Nevertheless, this analysis is consistent with the results obtained when the Odderson, Keaton, and McKenna (1995) data are considered by themselves and are consistent with the idea that there are lower pneumonia rates in the second year of this program than in historical controls.

It is tempting to speculate that the reason that the Odderson, Keaton, and McKenna (1995) data were different from historical controls and from the Nilsson, Ekberg, Olsson et al. (1998) data is that the latter investigators diagnosed dysphagia on the basis of patient self-report, and the former used a more structured exam. This, in turn, would imply the superiority of the more structured exam. Such an explanation, however, is not warranted because, although the results of the Nilsson et al. study were not significantly different from those of the controls, they were also not different from those of Odderson and colleagues.

We can also speculate that the reason that Odderson, Keaton, and McKenna (1995) found results that were significantly different from those of our historical controls and Nilsson, Ekberg, Olsson et al. (1998) did not is that the latter had fewer patients (72 versus 124 in the Odderson et al. study). Stated in another way, the statistical power of the comparison of Nilsson, Ekberg, Olsson et al. (1998) data to the historical controls was too low. However, this argument cannot be proven, because we cannot demonstrate that the results of the comparison of the two studies would have been different if Nilsson et al. had used more patients.

Because of the issues related to statistical power that surround this comparison, we also calculated the proportional RRR brought about by incorporating a swallowing program into stroke patient management. For the raw numbers this ranged from 55 to 100 percent RRR, and for the frequencies adjusted to a 2-week interval this ranged from 44 to 100 percent reduction in risk. We did not conduct a significance test on these reductions, and provide them only to illustrate that the same trend exists in both the Odderson, Keaton, and McKenna (1995) and Nilsson, Ekberg, Olsson et al. (1998) data. Again, these data are suggestive that dysphagia management programs are efficacious, but cannot be conclusive.

For these reasons, the results of our exploratory analysis are equivocal. It is therefore prudent to rely more heavily on the results of the internally controlled analysis of the Odderson et al. data. This evidence indicates that a reduction in the frequency of aspiration pneumonia is correlated with the presence of a dysphagia diagnosis and treatment program.

Does Use of Videofluoroscopic Swallowing Studies (VFSS) in A Dysphagia Program Reduce Pneumonia Rates? (Link D5 to M1)
VFSS in acute-care settings

Next, we examined whether using VFSS (see Evidence Table 24 for study descriptions and Evidence Table 27 for results) in dysphagia programs reduces aspiration pneumonia rates. We found no relevant controlled trials or historical prospective case series. Rather, we found only one case series performed at an acute-care facility that enrolled consecutive stroke patients and diagnosed them using VFSS (Daniels, Brailey, Priestly et al., 1998). This study was conducted at a Veterans Administration medical center. All stroke patients were given an MBS exam, except for 4 percent (2/55) whose mental state did not permit the exam. Patients with dysphagia were treated using one or more of the following: swallowing therapy, compensatory strategies, diet alteration, or nonoral feeding. This study reported a pneumonia frequency of 1.8 percent (1/55; 95 percent CI = 0.3 to 9.6 percent) over an interval of 3 months.

Exploratory analyses using pooled historically controlled data

In another exploratory analysis, we compared the results of the study noted in the previous paragraph with the above historical no-program controls. The raw difference between the pneumonia frequency in this study and these controls (Evidence Table 26) was 7.4 percent (95 percent CI = 0.5 to 9.6 percent). The frequency adjusted to a 2-week interval (1.8 percent; CI = 0.3 to 9.6 percent) was nearly identical to the nonadjusted frequency. The adjusted difference was 6.4 percent (-1.5 to 8.5 percent), which was not statistically significant. The relative reduction of risk for the raw values was 80 percent, and the adjusted value was 78 percent.
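
The differences and relative risk reductions quoted above follow directly from the frequencies. A minimal sketch of the arithmetic is given below, using the historical control frequencies of 9.2 percent (raw) and 8.2 percent (adjusted to 2 weeks) and the program frequency of 1.8 percent from the text. The function names are illustrative only.

```python
def absolute_difference(control_pct, program_pct):
    """Absolute reduction in pneumonia frequency, in percentage points."""
    return control_pct - program_pct

def relative_risk_reduction(control_pct, program_pct):
    """Reduction expressed as a fraction of the control frequency."""
    return (control_pct - program_pct) / control_pct

for label, control, program in [("raw", 9.2, 1.8), ("adjusted", 8.2, 1.8)]:
    diff = absolute_difference(control, program)
    rrr = relative_risk_reduction(control, program)
    print(f"{label}: difference = {diff:.1f} points, RRR = {100 * rrr:.0f} percent")
```

The sketch covers only the point estimates. The confidence intervals around the differences require the patient counts behind each frequency, which are given in the Evidence Tables.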

As with the above study by Nilsson, Ekberg, Olsson et al. (1998), there is a possibility that the failure to achieve statistical significance in this VFSS study was due to its small size (55 patients) and consequent low statistical power. In view of the statistical significance of the more powerful (90 control year patients, 124 program year patients) results of Odderson, Keaton, and McKenna (1995) for their program using a BSE on only 39 percent of stroke patients and considering that VFSS apparently detects patients with silent aspiration who may not be detected by the BSE, it seems unlikely that a program using VFSS on all patients would do worse in preventing aspiration pneumonia than a BSE program. Thus, the interpretation most consistent with the Odderson et al. study and the ability of VFSS to detect more patients with aspiration is that the trend reported in the Daniels et al. VFSS study is real, but the study had statistical power too low to detect the effect with statistical significance (a type II error in statistical parlance). However, this argument is not conclusive, because it does not prove that the comparison involving the Daniels et al. study would have yielded different results had they enrolled more patients.

One way to compensate for the low power of the Nilsson et al. and Daniels et al. studies is to determine whether taking the results of all three dysphagia programs together, despite their obvious diagnostic and treatment differences, yields a result that is significantly different from the result obtained in the combined historical control group. We did this in a third exploratory analysis, in which we examined the CI around the difference between the mean of the three program groups and the mean of the historical control group (see Evidence Tables 26 and 27). The differences, both unadjusted and adjusted to a 2-week interval, and the appropriate CIs are shown at the bottom of Evidence Table 27. The unadjusted difference is 8.0 percent (95 percent CI = 5.4 to 9.8 percent), and the difference adjusted to a 2-week interval is 6.9 percent (4.3 to 8.6 percent). Neither of the confidence intervals includes 0, demonstrating statistical significance at an alpha level of 0.05. This statistical significance is likely due to a real effect, either of the dysphagia programs or of any of several possible confounding factors.

This statistically significant difference also strongly suggests that low statistical power is, indeed, a problem in these individual studies. Further, it supports the idea that reductions in aspiration pneumonia are correlated with the presence of a dysphagia diagnosis and treatment program. This difference, however, does not demonstrate that dysphagia management programs cause a reduction in aspiration pneumonia. This is because this exploratory analysis cannot compensate for possible confounding of results, for example, by differences in patient characteristics between the program studies and the historical controls or other unspecified differences in provider practice or hospital policy.

The results of one BSE program study were statistically different from those of historical controls, but the results of the VFSS program study were not. However, it would be a mistake to conclude that the two tests were of equal efficacy from these data. This is for several reasons, the first of which is the experimental confounds that hamper reaching conclusions from our comparisons. Equally worthy of mention, however, is that the statistical power of these studies is too low to detect the apparently small difference between programs. If two of these three studies did not have sufficient power to detect the substantial apparent difference between the program results and the historical controls, then they must clearly have much less power to detect the much smaller apparent differences among the three program results. We have calculated in the Future Research section of this report that it would require tens of thousands of patients to detect a meaningful difference (25 percent reduction of relative risk) between two diagnostic programs. Therefore, it would be wrong to interpret these data to indicate that the BSE is as good as or better than VFSS, or vice versa, because these studies are too small to resolve the expected small difference between these methods.
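
To give a sense of why such a comparison would need so many patients, the following sketch applies the standard normal-approximation sample-size formula for comparing two proportions. It is not the calculation from the Future Research section. The baseline pneumonia frequency of 2 percent (roughly the level reported for the programs in this section), the two-sided alpha of 0.05, and the power of 80 percent are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist

def patients_per_group(p_control, rrr, alpha=0.05, power=0.80):
    """Approximate patients needed in EACH of two groups to detect a given
    relative risk reduction (two-sided test, normal approximation)."""
    p_treat = p_control * (1 - rrr)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treat * (1 - p_treat)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treat) ** 2)

# ASSUMED baseline of 2 percent; 25 percent relative risk reduction from the text.
n_each = patients_per_group(p_control=0.02, rrr=0.25)
print(f"about {n_each} patients per group, {2 * n_each} in total")
```

Under these assumptions the total is on the order of 20,000 patients, which is in line with the "tens of thousands" stated above. A lower baseline frequency or a higher required power would raise the figure further.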

VFSS in rehabilitation care settings

We found one relevant U.S. study conducted at a rehabilitation center (Evidence Table 29). This study, performed by DePippo, Holas, and Reding (1994), reported a frequency of pneumonia of 6.5 percent (9/139) over a mean LOS interval of 9 weeks. Patients judged clinically to have dysphagia were given an MBS exam followed by compensatory training and/or a modified diet. Prior to admission to the rehabilitation unit, 13 percent of these patients had pneumonia during acute care (mean LOS 5 weeks). It is not clear how many of them were exposed to a dysphagia program during acute care; however, pneumonia frequency during acute care is the same as the no-program frequency reported by Young and Durant-Jones (1990), but over a longer interval (5 weeks versus 31 days), and only for those patients who went into rehabilitation, not consecutive stroke patients.

We found a Canadian study (Teasell, McRae, Marchuk et al., 1996) reporting a pneumonia frequency of 2.7 percent (12/441) at a rehabilitation center, with an unreported LOS. Clinical suspicion of aspiration resulted in 24 percent of these consecutive stroke patients being given an MBS exam. High-risk patients received one or more of the following: modified diet, compensatory swallowing techniques, or NG or gastrojejunal tubes. We also found an Israeli study (Gottlieb, Kipnis, Sister et al., 1996) that used a 50-ml water test BSE for a stroke dysphagia program. They reported a pneumonia frequency of 16 percent (8.1 to 27 percent) for their first 60 patients, 12 percent (5.8 to 22 percent) for the second 60, and 3.3 percent (0.9 to 11 percent) for the third 60 patients. Their overall mean was 10 percent (6.5 to 15.6 percent). Their followup intervals were between 7.4 and 8 weeks.

We could find no historical control data for aspiration pneumonia frequency for stroke patients in rehabilitation centers. Without either internal or historical controls, the results of these studies cannot be interpreted.

Does Use of FEES in A Dysphagia Program Reduce Pneumonia Rates? (Link D6 to M1)

We located one recent historical prospective case series, published in two articles, in which FEES was used to assess swallowing in patients referred to an otolaryngologist and SLPs in two nursing homes [Spiegel, Creed, Selber et al., unpub.(a); Spiegel, Creed, Selber et al., unpub (b)] (Evidence Table 29). The patients received individualized diet therapy depending on the FEES results. The authors found no cases of pneumonia during a 6-month period among the 85 referred patients; whereas, in the same 2 nursing homes there were 11 cases of aspiration pneumonia in the 6 months prior to instituting systematic use of FEES. The authors do not report the total number of patients for the pre-program period. If we assume that approximately the same number of patients were involved in the previous 6 months, the frequency of pneumonia would have been approximately 13 percent (95 percent CI = 7.4 to 22 percent), a statistically significant result. These frequencies are not directly comparable to the above frequencies for three reasons. First, these were nursing home patients. Second, these were not stroke patients, but rather patients with mixed etiology of dysphagia. Finally, these pneumonia frequencies were not for the whole cohort of patients, but only for those suspected of dysphagia and referred to dysphagia specialists. Nursing home stroke patients would be expected to have a lower frequency of stroke-induced aspiration pneumonia than do patients in acute-care or rehabilitation facilities because they are further from the event of the stroke. However, they do not comprise the whole stroke cohort, but only the most serious cases that cannot be rehabilitated to the point of home care. Also, within the nursing homes, the selected group of patients who are suspected of dysphagia and referred to swallowing specialists would be expected to have a much higher frequency of pneumonia than patients not referred.

As with the Odderson, Keaton, and McKenna (1995) study, this study lacks information on the comparability of the patients before and after the FEES dysphagia program was introduced, and the lack of a control group makes it possible to ascribe the apparent effect of FEES to some unspecified change in physician practice, patient management, or nursing home practice and policies. Therefore, although it is possible to conclude that the introduction of FEES was correlated with a decrease in pneumonia rates, we cannot conclude that FEES was the cause of this decrease. In addition, the authors did not report the number of patients with clinically suspected dysphagia. Finally, there is no assurance that the tracking and determination of aspiration pneumonia cases were carried out in exactly the same way both years, preferably blinded to FEES diagnosis results and treatment. Nevertheless, and again as with the Odderson et al. study, the relative reduction in risk after introduction of FEES was 100 percent, a large apparent effect that cannot be dismissed lightly. This result is important because it was obtained in patients clinically suspected of dysphagia, a group known to have a high risk of aspiration pneumonia in any care setting (see Appendix B and Appendix C for further details).

Also similar to the Odderson et al. study is that our estimated absolute difference between the pre-program and program periods (13 percent; 95 percent CI of 5.9 to 22 percent) is statistically significant. Although these latter calculations are based on the assumption that the same number of patients were seen during the pre-program and program years, the statistical significance of this result is unlikely to be overturned by any reasonable imprecision in this assumption. Thus, we calculated that the pre-program year had to have 228 patients clinically suspected of dysphagia, or more than 2.6 times the number of clinically suspected dysphagia patients referred during the program year, for the difference between the pre-program year and the FEES program year to become nonsignificant at an alpha level of 0.05.
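
The method used for the confidence interval around this difference is not named in the text. As a hedged illustration, the reported interval (5.9 to 22 percent) is numerically consistent with Newcombe's hybrid score interval for the difference between two independent proportions, which combines the Wilson score limits of each proportion. The sketch below applies it to the assumed pre-program count (11 of 85) and the program count (0 of 85), and then varies the assumed pre-program denominator. Function names are illustrative.

```python
from statistics import NormalDist

def wilson_interval(events, n, z):
    """Wilson score interval for one proportion, as (low, high) fractions."""
    p = events / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * ((p * (1 - p) / n + z * z / (4 * n * n)) ** 0.5) / denom
    return max(0.0, center - half), min(1.0, center + half)

def newcombe_difference(events1, n1, events2, n2, confidence=0.95):
    """Newcombe hybrid score interval for p1 - p2 (independent groups)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p1, p2 = events1 / n1, events2 / n2
    l1, u1 = wilson_interval(events1, n1, z)
    l2, u2 = wilson_interval(events2, n2, z)
    diff = p1 - p2
    low = diff - ((p1 - l1) ** 2 + (u2 - p2) ** 2) ** 0.5
    high = diff + ((u1 - p1) ** 2 + (p2 - l2) ** 2) ** 0.5
    return low, high

# Pre-program: 11 cases; the denominator of 85 is the ASSUMPTION stated in the text.
low, high = newcombe_difference(11, 85, 0, 85)
print(f"difference CI = {100 * low:.1f} to {100 * high:.1f} percent")

# Sensitivity to the assumed pre-program denominator.
for n_pre in (85, 150, 1000):
    lo_n, _ = newcombe_difference(11, n_pre, 0, 85)
    print(n_pre, "pre-program patients: lower limit", f"{100 * lo_n:.2f} percent")
```

The difference remains significant (lower limit above 0) when the pre-program denominator is well above 85, and loses significance only when the denominator is several times larger, which is the pattern described above.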

Unevaluable studies

We retrieved a number of studies that we did not further consider in our analysis. These were case series studies without historical controls that only reported the pneumonia frequency for patients clinically suspected of dysphagia and referred for dysphagia assessment by specialists. Because of differences in referral patterns and patient mixes, it was not possible to make comparisons between these studies. In addition, we could not devise any method for determining historical control pneumonia frequencies with which to compare these disparate studies. Some of these studies had two or more arms comparing different diagnostic methods or treatments, and these are presented below in sections dealing with treatment comparisons. However, none of them contained no-program arms that would allow assessment of the efficacy of whole diagnosis-treatment programs compared with no intervention.

Conclusions

The conclusions in this subsection concern comparisons of pneumonia rates obtained with and without dysphagia diagnostic and treatment programs. The above-evaluated studies of Odderson, Keaton, and McKenna (1995), Daniels, Brailey, Priestly et al. (1998), and Spiegel, Creed, Selber et al. [unpub.(b)] have provided data (albeit data that are limited, and potentially confounded) that are consistent with, but do not prove, the idea that use of a BSE, VFSS, or FEES in a dysphagia diagnosis and treatment program may substantially reduce pneumonia rates in stroke patients with dysphagia. These data do not allow us to specify exactly how successful these programs are (i.e., how many cases of pneumonia they would prevent in a larger context), nor do they allow us to conclude definitively that these programs caused the reduction in pneumonia frequency (as noted above, that could have been caused by changes in hospital policy, physician practice, or some other unspecifiable factor). Nevertheless, it would be imprudent to dismiss the substantial effects observed in these studies, and thus they provide evidence about a question that is at the fifth level, near the top, in the hierarchy of the societal interest of the questions that we describe in the Evidence Model section of this evidence report.

The suggestion of apparently substantial efficacy of these dysphagia programs, while derived from historical prospective case series and case series compared with historical controls (which are typically not as reliable as RCTs), indicates that conducting a randomized controlled trial in which some patients were not allowed to participate in a dysphagia program would be unethical. Additional data from dysphagia programs and from historical controls or contemporary controls in institutions or geographic areas without dysphagia programs would be helpful in assessing the magnitude of the effect of dysphagia programs in preventing pneumonia.

Because of the lack of even a historically controlled study of a VF dysphagia program, because of a lack of studies of the long-term patient outcome of interest (prevention of aspiration pneumonia), and because of the small size of the one reported uncontrolled case series, there is no statistically significant efficacy demonstrated for VF studies in the prevention of aspiration pneumonia (although there is an apparent trend in the one study reported). Further, it has not been demonstrated that predictions of aspiration pneumonia derived from videofluoroscopy are superior to those derived from the BSE or FEES, although there are some studies demonstrating that VFSS can detect substantially more patients with the surrogate endpoint of aspiration than the BSE. This lack of statistically significant evidence does not mean VF swallowing diagnosis and treatment programs have no effect. Rather it reflects the lack of hard experimental evidence.

This section has only dealt with comparisons of diagnostic methods used in dysphagia programs compared with the no-program situation. Direct comparisons among BSEs and instrumented exams are discussed below in Questions 2 and 3.

Diagnostic Test Guidance of Treatments

The results of a diagnostic test are typically used to determine the course of subsequent treatment. The diagnosis of aspiration or dysphagia per se will simply determine that treatment is necessary; more specific symptoms (such as the severity of aspiration) will determine the specific treatment applied. Clinicians may do this to different degrees: while some clinicians may simply determine treatment on the basis of gross manifestations (such as the presence or absence of aspiration), other clinicians may use a more specific technique, such as assessment of particular physiologic functions, as predictors of which treatment will be most effective.

One implication is that a description of the diagnosis that guided treatment is relatively uninformative unless the outcomes after the treatment are described as well. A study may report, for example, that detection of aspiration of more than 10 percent of bolus volume led to the placement of a feeding tube. If the study does not report the outcomes from these patients placed on feeding tubes, and compare them with other patients placed on feeding tubes using different diagnostic criteria, there is no way to determine if the influence of the diagnostic test on treatment helped the patient, had no beneficial effect, or harmed the patient.

Outcomes reported after the treatment should be clinically meaningful. It is not sufficient to report that the pharyngeal transit time as measured by videofluoroscopy was significantly reduced after tactile-thermal application (TTA). Rather, the outcome measure must in some way directly measure the patient's well-being. Clinically meaningful outcomes are occurrence of pneumonia, dehydration, malnutrition, QOL, and mortality. Intermediate or surrogate measures that would fit these criteria would be occurrence of aspiration, coughing, or choking; these are all measurable outcomes that directly affect the patient.

Diagnostic technologies can be used to guide treatment in two ways: the diagnostic can detect signs that predict the success of future treatments, or the diagnostic can be used while treatments are being attempted. Videofluoroscopy has often been used in the latter way. However, while videofluoroscopy may indicate that a particular treatment is effective in the first 5 minutes, it cannot determine whether the patient is able to continue receiving benefit long term.

Many studies, most using videofluoroscopy, have used a diagnostic technology to guide or assess treatment. This has been done to assess the appropriateness of specific dietary characteristics (Bisch, Logemann, Rademaker et al., 1994; Logemann, Pauloski, Colangelo et al., 1995), postural changes (Ekberg, 1986; Logemann, Kahrilas, Kobara et al., 1989; Rasley, Logemann, Kahrilas et al., 1993; Shanahan, Logemann, Rademaker et al., 1993), and other applied treatments (de Lama Lazzara, Lazarus, and Logemann, 1986; Rosenbek, Robbins, Willford et al., 1998). Only one study used a diagnostic method other than videofluoroscopy to make treatment recommendations (Eibling, 1994). No studies have reported the use of symptoms detected using any other diagnostic method to guide the course of treatment.

Does Diagnosis-Directed Diet Modification Improve Swallow Function? (Links D1 - D8 to T2)

Evidence Table 30 shows two studies that used specific findings on videofluoroscopy to guide diet modification. Logemann, Pauloski, Colangelo et al. (1995) tested the use of a sour bolus at different volumes on several physiologic swallow measurements in 27 patients with neurogenic dysphagia (19 stroke, 8 other) who demonstrated delayed onset of oral stage or delayed pharyngeal swallow. The theory behind this test is that boluses providing a great deal of sensory information will be more easily swallowed, and that patients with dysphagia often have problems swallowing because the food they are given is very bland. The results indicated that for both stroke and other neurologic patients, the sour bolus did improve several physiologic swallow measures compared with an unflavored bolus. Bolus volume also had a significant effect, with the larger volume (3 ml versus 1 ml) causing improved swallow function. Aspiration was eliminated in three patients who demonstrated it on normal boluses. Stroke patients appeared to reap the most benefit from this treatment. However, only physiological measurements were provided as outcomes of this treatment; the only clinically meaningful outcome was an anecdotal remark that the patients did not like the sour bolus. This indicates that the use of this particular treatment may be impractical, as patients would be reluctant to continue it voluntarily. Otherwise, there was no indication of how this treatment would affect patient well-being; we can only infer indirectly that these improved swallow measures would lead to decreased morbidity and increased QOL.

Bisch, Logemann, Rademaker et al. (1994) tested the use of cold boluses in two volumes and two viscosities with 18 neurologic patients (10 stroke with mild dysphagia; 8 other patients with neurologic disorders and moderate to severe dysphagia) and 10 controls. Again, the theory of this test is that sensory stimulation will improve swallow function. On videofluoroscopy, the cold 1-ml bolus improved several swallow measures in the first-time stroke patients; increased volume and pudding viscosity resulted in shorter pharyngeal delay time. For patients with other neurologic conditions and with moderate to severe dysphagia, there were no temperature effects. Increased volume improved measures, and pudding improved measures compared with thin liquids. This study did not report any clinically relevant outcome measures, but does seem to indicate that cold bolus treatment is more useful for those with mild dysphagia than those with more severe problems. As found in Logemann et al. (above), 1-ml volume appears to be less safe than 3-ml.

One study (not tabled) reported anecdotally on eight patients who were examined using FEES to determine if their diet should be modified. Of eight patients who demonstrated significant swallow delay, it appeared that the longest delay was on the first swallow. Therefore, the researchers recommended that such patients be given a warmup swallow with ice or water to increase safety (Eibling, 1994).

Evidence from this field therefore suggests that boluses with high sensory information may help neurologic patients swallow safely, as do increased volume and thickened viscosity.

Do Diagnosis-Directed Postural Techniques Improve Swallow Function? (Links D1 - D8 to T1)

Evidence Table 31 shows four studies that used videofluoroscopy to assess different postural techniques. Logemann, Kahrilas, Kobara et al. (1989) found that a head turn 90 degrees to the weaker side in unilateral stroke patients caused swallow success to improve 33 to 65 percent. Rasley, Logemann, Kahrilas et al. (1993) also tested the head turn in a variety of neurogenic dysphagics with aspiration, and found that head rotation eliminated aspiration on five different volumes in 26 percent of those tested (20 out of 77).

The chin tuck is another frequently used postural technique. Ekberg (1986) tested 53 patients with dysphagia using a forward head tilt, and found that, of 18 patients who demonstrated defective laryngeal vestibular closure in the normal head position, 50 percent (nine patients) developed a normal swallow with the head tilt. Rasley et al. (1993) found that a chin tuck eliminated aspiration for 25 percent of patients (21 out of 84). Shanahan, Logemann, Rademaker et al. (1993), in a study to identify signs that would predict chin tuck success, found that those patients unsuccessful with chin tuck (50 percent in this study) were more likely to experience aspiration from the pyriform sinus, versus successful patients who experienced it from the valleculae.

Thus, in studies using VFSS to evaluate postural techniques, the only clinically meaningful measure that has been reported is the elimination of aspiration in two studies; these studies found that approximately 25 percent of patients were helped by postural techniques. From the other studies, we can again only infer that improvements in swallow measures would lead to a safer swallow and lower incidence of morbidities, but this has yet to be reported.

Do Other Diagnosis-Directed Treatments Improve Swallowing Function? (Links D1 - D8 to T1 - T4)

Two studies have used videofluoroscopy to assess thermal stimulation as a treatment for dysphagia (see Evidence Table 32). De Lama Lazzara et al. (1986) examined the efficacy of thermal stimulation of the anterior faucial arches in dysphagics with swallow reflex delay. Videofluoroscopy results indicated that pharyngeal transit time and total transit time were reduced immediately after treatment; however, this effect diminished within two to three swallows after the stimulation. This suggests that thermal stimulation is not a very practical treatment approach.

Rosenbek, Robbins, Willford et al. (1998) assessed TTA at four different intensities: 150, 300, 450, or 600 trials per week. Videofluoroscopy results indicated that there was no significant effect of treatment intensity on duration of stage transition, penetration, or aspiration. Results did not indicate the efficacy of this type of treatment; however, they indicate that if this treatment is at all efficacious, a lower intensity level is safe and might be used to lessen person-hours.

These two studies did not provide clinically meaningful outcome measures. However, the use of videofluoroscopy during these treatment trials did provide practical considerations for two types of treatment: the results of one indicated that the treatment provided no long-term effects and therefore may be impractical; the other allowed for lower intensity of treatment.

Other published studies have reported that the use of videofluoroscopy or fiberoptic endoscopy played a role in the determination of appropriate treatment (Aviv, Kim, Sacco et al., 1998; Eibling, 1994; Feinberg, Ekberg, Segall et al., 1992; Ott, Hodge, Pikna et al., 1996), but provide no indication as to what particular signs or symptoms led to the choice of treatment, or what the outcomes from treatment were. These studies therefore tell us little about the efficacy of the diagnostic.

Conclusions Concerning Diagnostic Guidance of Treatment Management

Eight of nine studies discussed here used videofluoroscopy to assess treatment options. The most meaningful outcome, reported by three of these studies, was the occurrence of aspiration. All other measures reported were specific physiological swallow measures. These are considered a way of measuring the likelihood of having a safe swallow; however, they only imply changes in swallow safety. No studies have compared different diagnostic criteria in the determination of treatment to compare the relative predictive value of specific symptoms shown by the diagnostic tool. No diagnostic methodologies have been compared with one another as to their ability to guide treatment. In particular, no studies have assessed the ability to use specific symptomatology during a BSE to guide treatment.

2. What are the appropriate clinical indications or components of a preliminary BSE useful for referring patients for a full BSE or instrumented exams?

In this section, we address the question of determining the best clinical indications for referring patients for further diagnostic testing. This question is best answered by examining the signs and symptoms readily apparent during a history and physical or preliminary BSE that may predict future morbidity or mortality.

Noninstrumented diagnostic tests have been used to determine if a patient has dysphagia or if the patient has a special risk for dysphagia. But dysphagia is a poorly defined condition in which the patient's subjective experience, and hence QOL, plays a major role (see the section entitled Dysphagia Defined). While patient QOL issues are important, we could find no suitable QOL efficacy data by which to judge dysphagia diagnosis or treatment and therefore were unable to analyze this issue. Thus, in the first subsection below we consider the ability to predict pneumonia and in the second subsection the ability to predict aspiration resulting from dysphagia. However, because not all aspiration inevitably causes pneumonia, aspiration must be considered a surrogate endpoint for pneumonia and thus will be considered only as an outcome of secondary interest. In the third subsection, we consider any clinical signs or symptoms predictive of malnutrition that results from dysphagia.

What Clinical Signs and Symptoms Predict Pneumonia? (Link D1 to M1)

Recent studies have not been able to determine the absolute test characteristics [sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV); for definitions see Appendix E] for signs, symptoms, or BSEs. This is because treatment intervenes between the test and the outcome of pneumonia (for further discussion of this issue, see the section entitled The Relationship between Diagnosis and Treatment of Swallowing Programs above). In this situation, the test characteristics indicate the ability of the test to detect or predict only the unpreventable cases of aspiration pneumonia; we have no idea how many cases would have occurred without treatment. To the extent that some cases were prevented and false positives are unknown, sensitivity, specificity, and PPV would be underestimated, while NPV would remain unaffected. Nevertheless, within the patient population of a single study, it is possible to determine the relative measures of diagnostic examination performance carried out on the same subjects, provided that treatment following each exam is chosen in exactly the same way, and caregivers are preferably blinded as to which test was used to prescribe treatment.
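The direction of this bias can be illustrated with a minimal sketch. The cohort counts and the number of prevented cases below are hypothetical, chosen only for illustration, and the sketch assumes that treatment is given only to patients who test positive.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def apply_treatment(tp, fp, fn, tn, prevented):
    """Move `prevented` true positives into the false-positive cell.

    Assumption: only test-positive patients are treated, so a prevented
    pneumonia case is observed as a test-positive patient without disease.
    """
    return tp - prevented, fp + prevented, fn, tn


# Hypothetical cohort: what would be observed if no one were treated.
untreated = (40, 30, 10, 120)
# Hypothetical effect: treatment prevents 20 of the 40 predicted cases.
treated = apply_treatment(*untreated, prevented=20)

before = diagnostic_measures(*untreated)
after = diagnostic_measures(*treated)
for name in before:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

Under these assumptions the observed sensitivity, specificity, and PPV all fall, while NPV is unchanged because the test-negative cells are not touched.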

We are interested in detecting as many of the patients at high risk for pneumonia as possible; thus, sensitivity needs to be high, over 80 or 90 percent (fewer than 10 to 20 percent of the high-risk patients would be missed). At the same time we want to minimize the number of false positives referred for the more expensive instrumented exam; thus, specificity needs to be at least moderate, say over 70 or 80 percent (fewer than 20 or 30 percent of the low-risk patients would be wrongly referred for the instrumented exam). Because, as discussed above, intervening treatment will result in underestimates of both sensitivity and specificity, we should possibly require 90 percent sensitivity and accept 70 percent specificity.

Evidence Table 33 shows clinical signs and symptoms that might possibly predict pneumonia in patients with and without dysphagia. One of the better and most recent studies of signs and symptoms was carried out by Langmore, Terpenning, Schork et al. (1998) in a prospective case series that analyzed the data using a multivariate analysis. The signs and symptoms that they determined to be independently correlated with pneumonia on long-term followup of up to 4 years are presented in the table. Although there was some correlation for these, all of the signs and symptoms had sensitivities below 50 percent, and only dependent-for-feeding had a PPV over 50 percent; its PPV was 65 percent. The low sensitivities and PPVs are all the more noteworthy because, as explained above, any successful efforts to prevent pneumonia would have falsely decreased sensitivity values and PPVs.

The signs and symptoms data reported by Harkness, Bentley, and Roghmann (1990) in a matched-control study show higher sensitivity and specificity than those examined by Langmore et al. (1998), but the methods are problematic. The biases in matched-control diagnostic studies are such that they typically overestimate measures of test performance. For a given diagnostic test, PPV and NPV are not fixed: as prevalence increases, PPV increases and NPV decreases (Fletcher, Fletcher, and Wagner, 1988). Yet in matched-control studies, prevalence is usually set artificially high, which overestimates PPV and underestimates NPV (see Appendix E). For example, in this study, each case in the disease arm was matched with two nondiseased controls; thus, pneumonia prevalence was artificially set at 33 percent, artificially inflating PPV and deflating NPV relative to any population with a lower prevalence.
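The dependence of the predictive values on prevalence follows directly from their definitions, as the following minimal sketch shows. The sensitivity, specificity, and the lower comparison prevalence are hypothetical values chosen for illustration; only the 1-in-3 prevalence corresponds to the two-controls-per-case matching described above.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical sign with 60 percent sensitivity and 80 percent specificity.
SENS, SPEC = 0.60, 0.80

# 1/3: prevalence fixed by matching two controls to each case.
# 0.10: a hypothetical lower prevalence in an unselected population.
for prevalence in (1 / 3, 0.10):
    ppv, npv = predictive_values(SENS, SPEC, prevalence)
    print(f"prevalence {prevalence:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

With the same sensitivity and specificity, the matched design yields the higher PPV and the lower NPV of the two settings.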

Sensitivity for a given test is also not fixed, but depends on the spectrum of disease severity (Brenner and Gefeller, 1997; Fletcher, Fletcher, and Wagner, 1988; Gann, 1996; Ransohoff and Feinstein, 1978). Most tests are more sensitive for severe disease (or risk). Case-control and matched-control studies usually preselect disease cases so that the spectrum of disease severity is not typical, which overestimates sensitivity. In the Harkness et al. study, only radiographically confirmed cases with one or more signs or symptoms possibly associated with pneumonia that occurred within an 18-week period were chosen for cases. Atypical cases of pneumonia that were not radiographically confirmed or that were slow to develop and had not yet occurred in the 18-week period may have been more difficult to predict but were not included in the sensitivity count, thus possibly inflating sensitivity. Specificity is also not fixed but is dependent on the presence of comorbidities with symptoms that overlap those of the disease of interest. In case-control studies, controls are usually selected to have no confounding comorbidities, which overestimates specificity. Such was the case in this study, in which controls were selected who had no obvious signs or symptoms of respiratory infection. To the extent that signs and symptoms of respiratory infection and dysphagia overlap, some patients with these overlapping symptoms were excluded from the controls even though they may have been falsely predicted to get pneumonia by these very signs and symptoms, thus artificially inflating specificity. It is not surprising then that the test characteristics given by the Harkness, Bentley, and Roghmann (1990) data are generally higher values than Langmore, Terpenning, Schork et al. (1998) reported, given the potential biases described above, and therefore are probably unrealistically inflated.

Conclusions

If there are any single clinical signs or symptoms that alone would be useful for referring patients for instrumented exams, they have apparently not been discovered, and more effort needs to be expended on this problem in future research. However, single signs and symptoms do not have to be used, and using combinations of them might better reflect clinical reality. If any one of several items is considered a positive test result for risk of pneumonia, then sensitivity will be increased, but at the expense of specificity. The low sensitivities and high specificities in Evidence Table 33 do indeed indicate that this would be a good strategy. The likely result of using such combinations would be to increase the sensitivity of these noninstrumented tests. Thus, the above studies and the data of Odderson, Keaton, and McKenna (1995) discussed above suggest that some clinical signs and symptoms can be combined into an algorithm that may achieve high enough sensitivity and specificity to be useful. This, in turn, suggests that future research should concentrate not on single signs and symptoms, but on combinations of them.
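The tradeoff involved in such an "any one positive" rule can be sketched as follows. The three sign values are hypothetical, and the calculation assumes that the signs are conditionally independent given pneumonia status, which real signs and symptoms are unlikely to satisfy exactly.

```python
from math import prod


def any_positive(signs):
    """Combine signs so that any single positive sign counts as a positive test.

    `signs` is a list of (sensitivity, specificity) pairs.
    Assumption: signs are conditionally independent given disease status.
    """
    # A case is missed only if every sign misses it.
    sensitivity = 1 - prod(1 - sens for sens, _ in signs)
    # A non-case is correctly cleared only if every sign is negative.
    specificity = prod(spec for _, spec in signs)
    return sensitivity, specificity


# Hypothetical signs with low sensitivity and high specificity.
signs = [(0.40, 0.90), (0.45, 0.85), (0.35, 0.95)]
combined_sens, combined_spec = any_positive(signs)
print(f"combined sensitivity {combined_sens:.2f}, specificity {combined_spec:.2f}")
```

Correlated signs would raise sensitivity less, and lower specificity less, than this independence calculation suggests.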

What Clinical Signs and Symptoms are Indicative of Aspiration? (Link D1 to S1)

Because aspiration does not inevitably cause pneumonia, it must be considered a surrogate endpoint for pneumonia, and thus of secondary importance. Nevertheless, because of our inability to determine true test characteristics for prediction of pneumonia (as discussed in the previous subsection), we were forced to consider an intermediate portion of the overall link between test effectiveness and the outcome of pneumonia. Therefore we now consider the link between clinical signs and symptoms on a BSE (D1 through D4) and detection of aspiration (S1).

Evidence Table 34 shows oral-pharyngeal signs and symptoms for detecting aspiration from several studies that included patients with mixed etiologies, and Evidence Table 35 shows signs and symptoms from studies that included only stroke patients. The endpoint of aspiration was determined by VFSS in these studies. While VFSS is not a perfect gold standard, for the purposes of these tables we must assume that it is reasonably perfect compared with clinical symptoms, and that all VFSS studies have reasonably similar high sensitivity and specificity for aspiration. The studies included in our tables for signs and symptoms had too many items to practically present here; therefore, we only tabled and only discuss here those signs and symptoms that were indicated by the authors or the evidence they presented to be the most useful signs and symptoms (either sensitivity or specificity over 60 percent).

For detection of aspiration, we would want signs and symptoms that are almost as good as VFSS at detecting aspiration, that is, above 80 percent sensitivity (less than 20 percent of patients with aspiration detected by VFSS would be missed by the sign or symptom), and we would want specificity to be moderately high, above 70 percent (no more than 30 percent of patients referred for VFSS would be needlessly tested by VFSS for aspiration). In general it can be seen in Evidence Tables 34 and 35 that items with higher sensitivities have low specificities. This is expected because of the inevitable tradeoff between sensitivity and specificity as the diagnostic threshold for a positive test is changed, either intentionally or unintentionally. Because of this tradeoff, sensitivity and specificity values are best considered as sensitivity-specificity pairs, and the relationship of the pair values to the threshold chosen by the examiner, or enforced by the symptom, is best appreciated on a plot of sensitivity versus specificity. Such a plot is known as a receiver operating characteristic (ROC) plot, and the regression curve for a set of sensitivity-specificity points is the ROC curve (see Appendix E for more details of this statistical method).

We examined the available data for each of the other individual signs and symptoms of the BSE, to determine whether they could be combined meta-analytically. Combining all the points from the different subtests into a summary ROC curve would not be an appropriate way to determine the effectiveness of the BSE. Such a curve would underestimate the sensitivity and specificity of the BSE because: (1) it would not consider the effects of employing multiple tests (results of other tests could cancel out a false-positive or false-negative result of any given test, making the whole more effective than the parts), and (2) some of the less-effective subtests are unlikely to be used in an optimized BSE, and therefore should not be allowed to affect the results of the meta-analysis. Also, there are only a small number of trials for some of the individual signs and symptoms (e.g., only one trial reported "multiple swallows"). For these reasons, we chose not to perform a meta-analysis on the data for each individual subtest and simply present these data in ROC space.

   Figure 2. Sensitivity and Specificity of Oral-Pharyngeal Symptoms for Detection of Aspiration Plotted in ROC Space

A plot in ROC space of the individual signs and symptoms is shown in Figure 2. If an imaginary 45-degree diagonal line were drawn from lower left to upper right, it would represent points that are of no diagnostic value; such points would detect aspiration no better than chance. The top left corner represents a perfect test. The higher an ROC curve rises toward the top left corner, the more useful the test would be (both the sensitivity and specificity increase toward the top left).
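As a rough numerical companion to reading such a plot, the sketch below scores a sensitivity-specificity point by its distance above the chance diagonal and checks it against the 80 percent sensitivity and 70 percent specificity targets stated earlier for aspiration. Youden's index is used here only as a convenient summary and is not a measure used in this report; the example points are hypothetical.

```python
def youden_j(sensitivity, specificity):
    """Youden's index: 0 on the chance diagonal, 1 for a perfect test."""
    return sensitivity + specificity - 1


def meets_target(sensitivity, specificity, min_sens=0.80, min_spec=0.70):
    """True if the point reaches both the sensitivity and specificity targets."""
    return sensitivity >= min_sens and specificity >= min_spec


# Hypothetical points in ROC space: (label, sensitivity, specificity).
points = [
    ("sign A", 0.85, 0.45),
    ("sign B", 0.55, 0.90),
    ("sign C", 0.82, 0.74),
]
for label, sens, spec in points:
    print(f"{label}: J = {youden_j(sens, spec):.2f}, "
          f"meets target = {meets_target(sens, spec)}")
```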

Thus, some of these signs and symptoms look more promising than others, but there needs to be more research on this topic before a conclusion can be reached. As noted above, more useful than the test characteristics for these items alone are the characteristics of various combinations of signs and symptoms, and some of these authors reported results of combinations that we present in the next two tables.

Silent aspiration

A number of studies have reported that VFSS detects a substantial number of patients who aspirate silently, and whose aspiration may thus be missed by some BSEs (Garon, Engle, and Ormiston, 1996; Holas, DePippo, and Reding, 1994; Horner, Massey, Riski et al., 1988; Kidd, Lawson, Nesbitt et al., 1993; Leder, Sasaki, and Burrell, 1998). In these studies, the proportion of aspirators who aspirated silently ranged from 20 to 73 percent. We found only one study (Holas, DePippo, and Reding, 1994) that reported on the pneumonia outcome for silent aspirators. For silent aspirators, 16 percent (7/44) contracted pneumonia, compared with 6 percent (1/17) for nonsilent aspirators. Thus, the relative risk of developing pneumonia within 1 year for silent aspirators was 2.7 compared with aspirators who coughed. This was not statistically significant (RR 95 percent CI was 0.36 to 20.4, which includes 1.0); however, the low power of the comparison prevents the conclusion of no difference.
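The relative risk and its confidence interval can be recomputed from the counts given above. The sketch below uses the standard log-scale normal approximation; the report does not state which method was used, so this choice is an assumption, although it yields values that agree with the reported figures after rounding.

```python
from math import exp, log, sqrt


def relative_risk_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Relative risk of group A versus group B with an approximate 95% CI.

    Uses the normal approximation on the log relative risk.
    """
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Counts from the text: 7/44 silent aspirators and 1/17 nonsilent
# aspirators contracted pneumonia.
rr, lower, upper = relative_risk_ci(7, 44, 1, 17)
print(f"RR = {rr:.1f}, 95% CI {lower:.2f} to {upper:.1f}")
```

The width of the interval reflects the single event in the nonsilent group, which dominates the standard error.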

Silent aspiration was generally defined as absent or weak cough upon aspiration. Such silent aspiration is indeed indicated by the sensitivity data we discussed above. Detection of aspiration by cough has a sensitivity ranging from 57 percent (Daniels, Brailey, Priestly et al., 1998) to 84 percent (Horner, Massey, and Brazer, 1990) in Evidence Tables 34 and 35. Thus, if cough upon aspiration were the only criterion for detecting aspiration, a BSE would be expected to miss from 16 to 43 percent of aspirators. However, BSE assessment of aspiration is not limited to cough alone, but other signs and symptoms are available. Thus, the well-documented existence of silent aspiration does not automatically imply that BSEs are inadequate for detecting aspiration. The overall assessment of aspiration danger by BSEs is considered in the next subsection.

Meta-analysis of the 3-ounce Water Test: Sensitivity and Specificity for Aspiration

   Figure 3. Sensitivity and Specificity of the 3-oz Water Test for Detection of Aspiration Plotted in ROC Space

Various versions of the BSE include some of the signs and symptoms discussed above along with other items. Evidence Table 36 shows the test characteristics obtained in several studies of BSEs for detecting aspiration confirmed by VFSS for patients with mixed etiologies. Evidence Table 37 shows BSE test characteristics for aspiration in stroke patients referred to VFSS with dysphagia symptoms. Evidence Table 38 is similar to Evidence Table 37, except that Evidence Table 38 presents results obtained from examining all stroke patients, not just those referred for VFSS. Sensitivities range from 0 to 91 percent. However, in general, specificity is high when sensitivity is low, and low when sensitivity is high. Four of the studies tabled used an abbreviated version of a full BSE, the 3-ounce (or 50-ml) water test. In Figure 3, these sensitivity-specificity points have been used to construct a summary ROC curve (Littenberg and Moses, 1993) (see Appendix E). The points of the individual signs and symptoms discussed above are also plotted. It can be seen that the 3-ounce water test summary ROC curve falls below our hypothetical curve passing through 80 percent sensitivity and 70 percent specificity. This indicates that, even if the threshold is somehow manipulated (which causes one to move along the curve as opposed to shifting the curve), this test is unlikely to achieve the necessary sensitivity without sacrificing too much specificity, and vice versa. Also, it can be seen that the 3-ounce water test curve is similar to the average ROC curve we would expect for the whole group of signs and symptoms (11 points are above and 12 points are below the 3-ounce water test curve). This indicates that the 3-ounce water test may not contain the best individual signs and symptoms available.
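The general idea of a summary ROC curve in the style of Littenberg and Moses can be sketched as below. This is a simplified, unweighted version written for illustration, not the computation used for Figure 3: the study points are hypothetical, and no correction is applied for sensitivities or specificities of exactly 0 or 1, for which the logit is undefined.

```python
from math import exp, log


def logit(p):
    """Log odds of a proportion strictly between 0 and 1."""
    return log(p / (1 - p))


def fit_sroc(points):
    """Fit D = a + b*S by ordinary least squares.

    `points` is a list of (sensitivity, specificity) pairs, with
    D = logit(TPR) - logit(FPR) and S = logit(TPR) + logit(FPR).
    """
    d_vals, s_vals = [], []
    for sens, spec in points:
        fpr = 1 - spec
        d_vals.append(logit(sens) - logit(fpr))
        s_vals.append(logit(sens) + logit(fpr))
    n = len(points)
    mean_s = sum(s_vals) / n
    mean_d = sum(d_vals) / n
    sxx = sum((s - mean_s) ** 2 for s in s_vals)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = mean_d - b * mean_s
    return a, b


def sroc_sensitivity(specificity, a, b):
    """Sensitivity on the fitted summary curve at a given specificity."""
    fpr = 1 - specificity
    z = (a + (1 + b) * logit(fpr)) / (1 - b)
    return 1 / (1 + exp(-z))


# Hypothetical study results: (sensitivity, specificity).
studies = [(0.50, 0.88), (0.62, 0.80), (0.76, 0.66), (0.86, 0.50)]
a, b = fit_sroc(studies)
print(f"sensitivity on the curve at 70% specificity: "
      f"{sroc_sensitivity(0.70, a, b):.2f}")
```

Moving the threshold corresponds to moving along the fitted curve, so a curve that passes below the point at 80 percent sensitivity and 70 percent specificity cannot reach both targets at once.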

Meta-analysis of the Characteristics of the BSE: Sensitivity and Specificity for Aspiration

   Figure 4. Sensitivity and Specificity of BSEs for Detection of Aspiration Plotted in ROC Space

In Figure 4, we plotted the summary ROC curve for the five studies that reported both sensitivity and specificity for a complete, formal BSE. These comprehensive BSEs produce a curve that is above both the 3-ounce water test curve and the average curve for all the individual signs and symptoms. Interestingly, the BSE curve falls very close to our hypothetical 80 percent sensitivity and 70 percent specificity curve. This reasonably high sensitivity of the BSE is one explanation for why the BSE dysphagia program reported by Odderson, Keaton, and McKenna (1995) (discussed under Question 1 above) had such a substantial apparent effect in reducing aspiration pneumonia in spite of the likely existence of substantial silent aspiration (as defined by absence of cough). Full BSEs score several items that may increase their sensitivity for aspiration, and so are not limited to the single symptom of failure to cough.

The data in the above-mentioned tables and figures indicate that only a BSE using more and/or better signs and symptoms than the 3-ounce water test would be appropriate for detection of aspiration and/or referral of patients to an instrumented exam. In fact, these data, along with the effectiveness of the BSE program reported by Odderson, Keaton, and McKenna (1995), call into question the need for instrumented exams to detect aspiration. The 80 percent sensitivity of the BSE for aspiration using VFSS as a gold standard indicates that VFSS would only detect an additional approximately 20 percent of patients who aspirate. Instrumented exams, particularly those providing imaging, seem likely to be more useful than a BSE for guiding treatment and therapy, because of the additional physiological and functional information they provide. However, the effect of this additional information on patient outcomes such as pneumonia, malnutrition, dehydration, and QOL has been poorly demonstrated. Also, as discussed above under Question 1, the approximately equal effectiveness of the Odderson BSE program (Odderson, Keaton, and McKenna, 1995) and the Daniels VFSS program (Daniels, Brailey, Priestly et al., 1998) in preventing pneumonia indicates that the added benefit of VFSS over BSE is likely to be small, and, in fact, too small to be detected by anything other than a study much larger than either of these.

The potentially small benefit that accrues to patients as a result of using VFSS, and its somewhat greater cost than the BSE, mean that a comparison of the BSE with VFSS is primarily a cost-effectiveness issue. We address cost-effectiveness in our supplemental analysis.

Conclusions

Single symptoms detectable on clinical exam do not appear to reliably predict aspiration. It seems likely that there should be some combination of clinical oral-pharyngeal signs and symptoms that can predict aspiration with adequate sensitivity and at least moderate specificity. The approximately 80 percent sensitivity and 70 percent specificity for aspiration of full BSEs that include several signs and symptoms bear this out, and could possibly be improved upon. Further research on this topic is of great interest, but it is important that researchers understand the tradeoff involved in the sensitivity-specificity relationship. They may be able to reach a consensus on some combination of these values that achieves both adequate sensitivity and specificity for prediction of aspiration or aspiration pneumonia. The reasonably high sensitivity and specificity of BSEs raise the issue of whether a BSE could entirely replace instrumented exams. The possibly small improvement of instrumented exams over BSEs and the increased cost of instrumented exams present a cost-effectiveness question that is discussed in our supplemental analysis. This question also raises the issue of the effect of diagnostic exam findings on subsequent treatment.

What Clinical Signs and Symptoms Predict Malnutrition? (Link D1 to M2)

Malnutrition is a significant predictor of mortality in the elderly (Elmstahl, Persson, Andren et al., 1997; Keller, 1995; Murden and Ainslie, 1994; Ryan, Bryant, Eleazer et al., 1995; Wallace, Schwartz, LaCroix et al., 1995). It can also lead to other morbidities through a weakening of the immune system (Chandra, 1990). It is therefore important to enact measures to prevent malnutrition if it is determined to be prevalent in a population of patients with dysphagia. To do this, it would be helpful to identify any signs or symptoms that predict eventual malnutrition in these patients.

The relationship between malnutrition and diagnosis of dysphagia is difficult to determine from available evidence. This is partly because the administration of parenteral or enteral nutrition is often introduced if there is even a threat of malnutrition, so the number of patients whose dysphagia might lead to malnutrition may be at least partially obscured. In addition, diagnosis of dehydration and malnutrition does not require extensive diagnostic methods for detection. For these reasons, in acute care, malnutrition may be largely dealt with outside of the context of formal diagnosis and treatment of dysphagia. This could explain why few studies have attempted to determine the extent to which (or even whether) dysphagia causes malnutrition. There is, however, the possibility that for some patients diagnosis of dysphagia may lead to therapy that prevents malnutrition and therefore obviates the need for feeding tubes.

There is limited evidence about the prevalence of malnutrition in patients with dysphagia, and whether patients with dysphagia are at increased risk for malnutrition compared with individuals with normal swallows. Current evidence on this is equivocal, as two studies have reported a significant association (Davalos, Ricart, Gonzalez-Huix et al., 1996; Keller, 1995) and another found no association (Thomas, Verdery, Gardner et al., 1991). Only one of these studies was specifically about acute-care stroke patients, and it found that 1 week after stroke, patients with dysphagia were significantly more likely to be malnourished than those with a normal swallow (Davalos, Ricart, Gonzalez-Huix et al., 1996).

However, no published studies examine signs or symptoms that may predict malnutrition in patients with dysphagia. Given that an association between dysphagia and malnutrition has not been definitively made, there may be no reason for such studies to be conducted. Therefore researchers must definitively document a link (if one does exist) between dysphagia and malnutrition in specific patient populations to determine the burden of illness in this population; then clinicians can work on ways to detect potential malnutrition risk factors before malnutrition occurs.

What Clinical Signs and Symptoms Predict Dehydration? (Link D1 to M2)

Dehydration may be a particularly important outcome in regard to neurologically disabled patients with dysphagia because they most often suffer from an inability to swallow thin liquids safely; less common is the inability to swallow thick liquids and solids. However, this is a logical inference, rather than one based on clinical studies. While clinicians recognize the importance of measuring dehydration (Langmore, 1998; Logemann, 1998b; Yorkston, 1998), few studies have yet been conducted on this outcome. Currently, no studies examine the predictive value of clinical signs and symptoms for development or presence of dehydration.

3. Is there any evidence that one diagnostic technology is preferable/is more effective/gives more information than any other diagnostic technology?

In addressing this question, we address whether any of the diagnostic exams is better than another. Better in this section refers to the ability of a diagnostic test to accurately detect or predict the presence or absence of important outcomes. In this section, we pay particular attention to videofluoroscopy because the MBS has been the reference standard for diagnosis of dysphagia, detection of aspiration, and prediction of patients at high risk for pneumonia (Logemann, 1983b). Thus, we begin this section by examining the evidence on the reliability of VFSS results. In the subsequent subsections, we then compare videofluoroscopy with certain of the instrumented tests that have been introduced in recent years. In making these comparisons, we included only articles that directly compared videofluoroscopy and another instrumented diagnostic exam.

The outcomes discussed here are pneumonia and aspiration (as a surrogate measure of pneumonia). Other outcomes are of clinical significance and interest; however, pneumonia is the only outcome on which researchers in this field have focused. Therefore, mortality, malnutrition, dehydration, and QOL will not be discussed here.

How Reliably do Videofluoroscopic Studies Detect Aspiration? (Link D5 to S1)

Interobserver Variability of VFSS

While workers in the field have accepted the VFSS as the reference standard, there is no evidence showing that it is a perfect gold standard, so this test likely yields some false-positive and false-negative results. It may also be the case that aspiration detected by one clinician will not be detected by another. In this case, it would be difficult to determine which clinician is correct. Evidence Table 39 shows the three studies we found that examined interobserver variability for detection of aspiration on several different bolus viscosities. The study by Ekberg, Nylander, Fork et al. (1988) reported a kappa value of 0.83 for independent observations of 6 radiologists on 72 patients. This is a moderately high kappa value.

The study by Perlman, Booth, and Grayhack (1994) reported 100 percent agreement between independent observations of 2 speech pathologists on 33 patients. The study was conducted by having the 2 SLPs retroactively review videotapes of 33 patients in slow motion, frame-by-frame. There were at least six swallows for each patient, presumably of different consistencies. While some SLPs may sometimes review tapes over and over, frame-by-frame, the more typical situation is that the SLP views the exam in real time and makes decisions and starts therapy measures on the spot. Although there may also be some post-exam review of the tapes before writing a report, it seems unlikely that most SLPs routinely carry out extensive post-exam frame-by-frame or slow motion review. Therefore, this study may not be generalizable.

The study by Kuhlemeier, Yates, and Palmer (1998) reported interobserver variability for each consistency of the bolus separately. They conducted an all-way pairwise comparison among nine raters (four physicians and five SLPs). Furthermore, they reported both positive and negative agreement. The negative agreement was very good, being above 98 percent for all consistencies. However, the positive agreement showed considerable interobserver variability. Juice had the most positive agreement with 85 percent. Beef stew had the lowest with 0 percent. The three other consistencies, cookie, spread, and nectar, all had positive agreement below 20 percent.
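The way very high negative agreement can coexist with poor positive agreement when aspiration is rare can be illustrated with a minimal two-rater sketch. The counts are hypothetical, and the formulas for positive and negative agreement are the commonly used proportions of specific agreement; the formulas actually used in the studies above are not stated here and may differ.

```python
def agreement_measures(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa and specific agreement for two raters on a yes/no call."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    # Chance agreement from each rater's marginal rate of positive calls.
    a_pos = (both_pos + only_a) / n
    b_pos = (both_pos + only_b) / n
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    disagreements = only_a + only_b
    return {
        "kappa": (observed - expected) / (1 - expected),
        "positive_agreement": 2 * both_pos / (2 * both_pos + disagreements),
        "negative_agreement": 2 * both_neg / (2 * both_neg + disagreements),
    }


# Hypothetical ratings of 100 swallows in which aspiration is rarely called.
result = agreement_measures(both_pos=1, only_a=3, only_b=3, both_neg=93)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this hypothetical table the raters agree on 94 of 100 swallows, yet most of that agreement comes from swallows that neither rater scored as aspiration.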

Conclusions

These studies together indicate that there is fairly good interobserver agreement on aspiration of thin liquids, but that thicker liquids and solids present a problem of interobserver agreement. Unfortunately, the inter- and intraobserver variability of a diagnostic technology that is used as the reference standard for judging all newer technologies has not been thoroughly studied. Potentially, the absence of interobserver agreement on aspiration of solids and thicker consistencies could present a risk to some patients, because in the standard procedure (Logemann, 1983b), these consistencies are tried first as indications of whether it is safe to proceed to thin liquids. We were unable to find any evidence that addressed this issue, and there is likely to be little risk for stroke patients, who usually have more difficulties when swallowing thin liquids.

Which is a Better Predictor of Pneumonia: BSE or Videofluoroscopy? (Links D4 and D5 to M1)

There is a general assumption in the field of dysphagia research that VFSS (or any instrumented exam for that matter) is better than any noninstrumented exam. This hypothesis needs to be tested in clinical trials. The sensitivity and specificity of the BSE for detection of aspiration using VFSS as a gold standard were discussed above in Question 2. However, the use of VFSS as a gold standard requires the assumption that VFSS is 100 percent accurate, which may not be the case. Thus, in order to compare the BSE with VFSS, a clinical outcome must be used as the gold standard. This approach is itself problematic: measuring effectiveness only in terms of the ability of the test to predict pneumonia is unreliable, because treatment intervenes between the test and the outcome and biases the results. Examples of this will be presented below.

Within-study Comparisons of VFSS and BSEs

We located two trials that compared the abilities of VFSS and the BSE to predict clinical outcomes, in terms of test performance (Reynolds, Gilbert, Good et al., 1998; Smithard, O'Neill, Parks et al., 1996). They are described in Evidence Table 40. Some test characteristics, however, apply only to patient populations with the same prevalence as the study population, which limits the generalizability of these results (see Appendix E). Also, as mentioned above, treatment biases the measurement of test performance by decreasing sensitivity, specificity, and PPV. Assuming treatment is given only to those who test positive, treatment will have no effect on NPV (this is further explained in Appendix G).
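The direction of the treatment bias described above can be illustrated with a short calculation. The sketch below is only an illustration under stated assumptions: the 2x2 counts are hypothetical (they are not taken from any study in this report), the helper names `characteristics` and `apply_treatment` are invented for this example, and treatment is assumed to be given only to test-positive patients and to prevent pneumonia in some of those who would otherwise have developed it. Each prevented case moves one patient from the true-positive cell to the false-positive cell, while the test-negative row is untouched.

```python
from fractions import Fraction


def characteristics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV, and NPV for one 2x2 table."""
    return {
        "sensitivity": Fraction(tp, tp + fn),
        "specificity": Fraction(tn, tn + fp),
        "ppv": Fraction(tp, tp + fp),
        "npv": Fraction(tn, tn + fn),
    }


def apply_treatment(tp, fp, fn, tn, prevented):
    """Reclassify `prevented` treated test-positive patients from TP to FP.

    Assumes only test-positive patients are treated, so FN and TN are
    left unchanged.
    """
    if not 0 <= prevented <= tp:
        raise ValueError("prevented must be between 0 and tp")
    return tp - prevented, fp + prevented, fn, tn


# Hypothetical counts (assumption for illustration only): TP, FP, FN, TN.
untreated = (20, 10, 5, 65)
treated = apply_treatment(*untreated, prevented=10)

before = characteristics(*untreated)
after = characteristics(*treated)

for name in before:
    print(f"{name}: {float(before[name]):.2f} -> {float(after[name]):.2f}")
```

With these hypothetical counts, sensitivity, specificity, and PPV all fall once treatment prevents some cases, while NPV is identical before and after, which matches the pattern described in the text.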

Smithard, O'Neill, Parks et al. (1996) followed 117 consecutive stroke patients who presented at an acute-care hospital within 24 hours of the stroke event. All patients received a BSE and 94 also received a VFSS; clinicians performing one test were blinded to the results of the other. If the BSE indicated that swallowing was unsafe, parenteral feeding was initiated; diet modification was subsequently instituted based on findings from both diagnostic tests. The abilities of BSE and VFSS to predict chest infection and death were compared in terms of measures of test performance.

Although Smithard and colleagues report that the BSE was superior to VFSS in its sensitivity, inferior in specificity, and similar in PPV (see Evidence Table 40 for details), these findings do not take into account the apparent spectrum bias in this study. Spectrum bias is introduced when the patients selected for one diagnostic or treatment differ from the patients selected for the other, and the difference is likely to affect the results. In this instance, the bias arises because patients who received only the BSE were apparently sicker than those who received both the BSE and VFSS. Specifically, patients who did not receive the VFSS were those with reduced consciousness or medical unfitness for the test. The BSE, however, was administered to all patients, including these sicker individuals. The fact that some of the patients who received the BSE alone were sicker than those who received the VFSS is supported by the death statistics provided by the report. Twenty-one patients died over the course of the study, all of whom contributed data to the BSE results. Only 10 of these patients contributed data to the VFSS results. If 8 of the 11 patients not given VFSS had aspirated during the exam (which is entirely reasonable because of their relatively ill health), the diagnostic effectiveness of the BSE and VFSS would not have been significantly different, as tested by the chi-square statistic.

Smithard and colleagues base their conclusion that VFSS did not appear to add greatly to this risk profile partly on the absence of a statistically significant correlation between the results of the VFSS examination and clinical outcomes of chest infection and mortality. In other words, patients whom the VFSS predicted would have these outcomes did not develop them significantly more often than other patients. As a result, observed VFSS sensitivity was much lower than observed BSE sensitivity. As suggested in the previous paragraph, this finding may be the result of spectrum bias, and VFSS may actually be equal to or more effective than BSE. If Smithard et al. had published the data only for the subset of BSE patients who also had VFSS, this would have permitted a less biased comparison of the two diagnostic tests, even though their absolute sensitivity and specificity would still be affected by treatment intervention. Given the data that were reported, it is not possible to draw conclusions from this study about which test is more effective for identifying patients at elevated risk for pneumonia.

NPVs from the trial of Smithard et al. are not affected by treatment bias, and since the patients who died before they could undergo VFSS are not likely to have had negative results, the selection bias is unlikely to have affected NPV. For both outcomes (chest infection and death), the NPVs of both tests are high (84 percent NPV for chest infection, 91 to 94 percent NPV for death). This suggests that both the BSE and the VF examination would be reasonable tests for determining which patients need no further dysphagia followup.

The other direct comparison of VFSS and BSE was by Reynolds, Gilbert, Good et al. (1998). This was a retrospective case series of 102 acute stroke patients who were given a BSE and then referred to videofluoroscopy if oropharyngeal dysfunction was suspected. Patients exhibiting no dysfunction on BSE were not followed; this patient selection bias inflates the sensitivity and lowers the NPV reported for this test. Silent aspirators (those who did not cough while aspirating on VFSS) were excluded from the study, which may reduce any advantage VFSS might have over BSE as a result of improved detection of these silent cases. Subsequent treatment included diet modification and treatment recommendations (Reynolds, Gilbert, Good et al., 1998).

Reynolds et al. properly limited their internal comparisons to the subset of patients who had both the barium swallow and the standardized BSE. However, treatment intervened between the diagnostic procedure and the outcome measurement (pneumonia) for those who tested positive by barium swallow. As we have shown in Appendix G, this causes a decrease in sensitivity, specificity, and PPV. Only NPV remains unchanged. The treatment bias will affect barium swallow results more than it affects BSE results, since only half (6 out of 12) of the patients with positive BSE results also had a positive barium swallow. Data reported are insufficient to allow the magnitude of the bias to be determined.

In this trial, the treatment bias favoring the BSE (fewer patients treated) is offset (to an unknown degree) by spectrum bias against the BSE. Assuming the protocol described in the report was followed when the patients were diagnosed and treated, patients who had no dysfunction during the BSE and no complaints of cough or throat-clearing were not given a barium swallow. Also, patients exhibiting significant problems and overt signs of severe aspiration throughout the BSE were not given a barium swallow. The effect of these exclusions is to create a spectrum bias, because the cases that would be easiest to diagnose by BSE were excluded. This would cause decreases in both sensitivity and specificity. To the extent that BSE and barium swallow results are correlated, the spectrum bias would also extend to the barium swallow results. While we cannot determine the magnitude of the effect, we can be sure that the net effect of the spectrum bias is to favor barium swallow.

Omission of obvious positives should not affect the NPV results significantly. Results from the Reynolds et al. trial were similar to those in the Smithard et al. trial, although the VF examination had somewhat better results than the BSE. NPVs were 87 percent for VFSS and 80 percent for BSE. As expected, the NPV increased to 91 percent if both tests were negative.

Case Series Comparisons of VFSS and BSEs

Because of the scarcity of trials comparing BSE with VFSS within the same patient population, we turned to the case series literature. Obvious differences in study design and patient characteristics among these studies make comparisons among them of questionable reliability. For example, treatments may have differed in significant ways, although none of these studies provided in-depth description of the therapies. Study design may also have affected the findings: a retrospective case review, unlike prospective case identification, may not capture all diagnoses or occurrences of illness, and this would skew the results. We seek, however, only to roughly approximate a comparison of test characteristics for these two tests. Given the purported advantages of VFSS in detecting cases of silent aspiration missed by BSE that may ultimately lead to pneumonia, even with study differences, VFSS should appear to have increased sensitivity.

Several case series studies have been published that examined either the ability of BSE to diagnose an unsafe swallow (Evidence Table 41) or MBS detection of aspiration to predict pneumonia (Evidence Table 42). We have assumed for this discussion that care setting does not affect test characteristics in any significant way (though the differences in prevalence of disease in different care settings will obviously have some effect); therefore, acute-care and rehabilitation centers are analyzed together. We have limited the pool of studies to those reporting on 90 percent or more stroke patients. Because of a paucity of data, the formal BSE and water swallow test have been collapsed into a single category of noninstrumented exam. We are, therefore, disregarding many factors that may potentially confound results.

Evidence Table 41 includes trials that related pneumonia rates to results of the BSE or 3-ounce/50-ml water test. Key points regarding these trials and their suitability for between-studies comparison are presented below.

Gordon, Hewer, and Wade (1987), examining the 50-ml water test, excluded those patients known to have choked on fluids the day of the test. Since these patients likely would have choked during the 50-ml water test, this selection bias is likely to have excluded cases that would have been true positives, thus underestimating sensitivity and PPV of the test.

DePippo had very sensitive criteria for interpreting the Burke Dysphagia Screening Test (BDST). Failure on any of the seven elements was interpreted as failure of the test. Thus, sensitivity was high (92 percent) and specificity was low (45 percent). Patients who failed the BDST were referred for barium swallow. Patients failing the barium swallow were enrolled in a dysphagia program including diet modification and compensatory swallowing techniques. As a result, treatment probably intervened for most patients who had dysphagia.

Table 15. One Study's Results Illustrative of the Effects of Treatment Bias (Gottlieb, Kipnis, Sister et al., 1996)

50-ml water test results in the first group

                           Pneumonia outcome
50-ml water test result    Positive    Negative
Positive                   6 TP        7 FP        46 percent PPV
Negative                   3 FN        44 TN       94 percent NPV
Pneumonia rate: 15 percent; 67 percent sensitivity; 86 percent specificity

50-ml water test results in the third group

                           Pneumonia outcome
50-ml water test result    Positive    Negative
Positive                   0 TP        13 FP       0 percent PPV
Negative                   2 FN        45 TN       96 percent NPV
Pneumonia rate: 3 percent; 0 percent sensitivity; 76 percent specificity

Gottlieb, Kipnis, Sister et al. (1996) called their trial a validation of the 50-ml water test, but because it does not compare the 50-ml water test with a reference test for diagnosis of dysphagia, it is not truly a validation. It is, however, a trial measuring the ability of the 50-ml water test to predict pneumonia. More than that, this is a trial measuring the effect of a comprehensive swallowing program on pneumonia rates. Three consecutive groups of 60 patients were enrolled in the swallow program, using the 50-ml water test as the criterion for selecting patients for swallow therapy. Patients who failed the 50-ml water test were given a modified diet, along with three sessions per week of swallow therapy by a speech therapist. Pneumonia rates in patients with and without the modified diet were compared. The change in results is graphic evidence of the effect of treatment bias (Table 15).

As the dysphagia program became established, true-positive results decreased while false-positive results increased. Sensitivity, specificity, and PPV all appeared to decrease as treatment became more successful at preventing pneumonia. (See Appendix G for further discussion of the effects of treatment bias.)
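The figures in Table 15 can be rechecked directly from the cell counts. The following is a minimal sketch, not part of the original analysis; the function name `summarize` and the choice to round to whole percentages are assumptions made for this illustration. Recomputing from the printed counts reproduces the tabled values except for the third-group specificity, which comes out near 78 percent rather than the 76 percent shown; the report does not explain the difference.

```python
def summarize(tp, fp, fn, tn):
    """Return whole-number percentages for one 2x2 table of test vs. outcome."""
    total = tp + fp + fn + tn

    def pct(numerator, denominator):
        # Round to the nearest whole percent, as the table appears to do.
        return round(100 * numerator / denominator)

    return {
        "pneumonia_rate": pct(tp + fn, total),
        "sensitivity": pct(tp, tp + fn),
        "specificity": pct(tn, tn + fp),
        "ppv": pct(tp, tp + fp),
        "npv": pct(tn, tn + fn),
    }


# Cell counts as printed in Table 15 (Gottlieb, Kipnis, Sister et al., 1996).
first_group = summarize(tp=6, fp=7, fn=3, tn=44)
third_group = summarize(tp=0, fp=13, fn=2, tn=45)

for label, result in (("first", first_group), ("third", third_group)):
    print(label, result)
```

Comparing the two groups shows the pattern described in the text: true positives fall from 6 to 0 and false positives rise from 7 to 13, while NPV stays essentially the same.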

All of these trials found that the BSE had high NPV. Reported PPVs were very low; this could be the result of treatment bias as well as selection of a sensitive test threshold to minimize false-negative results. Smithard's trial found NPV to be 84 percent; all the others had NPV results over 90 percent. This demonstrates that BSE results can be used to identify patients who are unlikely to need swallow therapy.

Although Evidence Table 42 shows DePippo, Holas, and Reding (1994) twice, only the results for the BSE were graphed to avoid double counting of patients. Smithard et al. (1996) is included in both tables but not included in our graph, due to only a 1-week followup period that may affect its results relative to the results from other studies with longer followup times (it has, however, already been discussed above). Holas et al. (1994) is tabled but excluded from analysis because results were only reported for those patients who tested positive on VFSS, thus resulting in patient selection bias.

Johnson's group tested the ability of VF examination to predict pneumonia. Kinematic pharyngeal transit time (Em) was the primary measure of swallow function; pneumonia rates were stratified by group, with threshold Em times ranging from 2 to 5 seconds. This allows consideration of a range of thresholds and illustration of the tradeoff between sensitivity and specificity. At the lowest threshold (2 seconds), sensitivity was 100 percent and specificity was 42 percent; at 3 seconds, sensitivity was 86 percent and specificity was 68 percent; at 4 seconds, sensitivity was 76 percent and specificity was 81 percent; and at 5 seconds, sensitivity was 66 percent and specificity was 94 percent. Treatment was not reported, so it cannot be determined if treatment bias affected the results. The trial does not appear to have spectrum bias, because it was a retrospective review of consecutive cases referred for videofluoroscopy. The only exclusion criteria were dysphagia caused by disease other than stroke (two cases) and patient noncompliance causing an examination of poor quality (two cases, 3.2 percent). The latter exclusion causes a small bias against VFSS, but is appropriately conservative in design.

The Holas, DePippo, and Reding (1994) report on the correlation between MBS results and pneumonia showed that all patients selected for this trial had videofluoroscopy results that were positive for dysphagia. Patients also were given unspecified dysphagia treatment, so treatment bias probably affected the results.

Aviv, Sacco, Mohr et al. (1997) reported MBS results in 20 patients with known dysphagia. Although all patients got some treatment, treatment differed by group. The 10 patients whose barium swallow demonstrated aspiration were given a percutaneous endoscopic gastrostomy (PEG) tube, and the 10 who did not have aspiration were given swallow therapy and no PEG tube. Pneumonia risk was twice as great for those patients who had negative barium swallow results as for those who had positive results, a counterintuitive result. The reported barium swallow sensitivity of 33 percent and specificity of 43 percent are worse than would be obtained by random chance. This suggests that treatment bias affected the results.

Aviv et al. also tested laryngopharyngeal sensory discrimination testing (LPSDT) in a matched-control trial. Twenty patients who had complaints of dysphagia and were not taking food by mouth were paired with healthy age-matched controls. The controls had no dysphagia and no history of stroke. None of the controls contracted pneumonia or had abnormal LPSDT results. Six of the dysphagia patients (30 percent) contracted pneumonia, all of whom exhibited a bilateral sensory deficit on the LPSDT. Treatment bias affects these results too, although sensitivity remained at 100 percent because there were no false negatives. Specificity of the LPSDT was 57 percent while PPV was 43 percent.

Daniels et al. (1998) and Aviv et al. (1997) are interesting studies to discuss because of outlier VFSS test characteristic findings. Aviv reports a very low sensitivity of 33 percent from a very small pool of patients (20). Because all patients who tested positive for aspiration were put on a feeding tube (as opposed to noninvasive therapies reported in most other studies), we excluded these results from further analysis in an attempt to control for treatment effects. On the other hand, Daniels et al. (1998) reported a sensitivity of 100 percent, but only one case of pneumonia occurred in their patient population of 55. With so few patients developing pneumonia, statistical power is low and these results are unreliable.

NPV is the parameter least affected by the known biases. For the barium swallow trials listed in Evidence Table 42, there was a wide range of NPV values, from 55 to 100 percent. The wide range of values could be explained by the possibility that some investigators used more sensitive thresholds than others. Summary ROC meta-analysis would be able to test this hypothesis, but meta-analysis of this evidence base is not appropriate because of treatment bias and selection bias previously described.

Because so many of the trials are confounded by treatment bias and/or spectrum bias, we could not use their results in a between-studies comparison or draw conclusions on the relative effectiveness of the 3-ounce water test, the barium swallow, and other tests based on these results. Trials measuring the effect of these diagnostic procedures on outcomes such as pneumonia rate are necessary to determine the value of these diagnostic tests.

Which is the Better Predictor of Pneumonia: Videofluoroscopy or Fiberoptic Endoscopy? (Links D5 through D7 to M1)

We located two studies that addressed this question, a pseudo-randomized controlled trial and a nonrandomized controlled trial. We address these two studies in separate subsections below.

A pseudo-randomized controlled trial comparing VFSS and FEESST for prevention of pneumonia in patients with mixed etiology of dysphagia

Evidence Table 43 shows the results of a trial using 1-year pneumonia incidence as the primary outcome for comparing the efficacy of VFSS and fiberoptic endoscopic examination of swallowing and sensory test (FEESST) (Aviv et al., 1997). We have called this a pseudo-randomized controlled trial because of the possible nonrandom manner in which the patients were directed to each diagnostic test. Patients referred to an otolaryngology department for outpatient diagnosis received FEESST if they were examined on Monday or Thursday, and VFSS if examined on Tuesday, Wednesday, or Friday. This apparently resulted in more patients being directed to VFSS (75 versus 51), which, by itself, is not a major concern. More problematic, however, is that the patient characteristics in terms of dysphagia etiology were unequally distributed between the comparison groups. Although patients and referring physicians were blinded as to which test would be used on each day of the week, it is a formal possibility that unknown factors could have influenced the selection of patients on particular days of the week. For example, Social Security or welfare checks might arrive on a particular day, or alcoholics might be less likely to show up on Mondays. It is recognized that it would be impractical for examiners to switch methods for randomized individual patients; however, the days of the week for each type of exam could be randomized. The same team of SLPs conducted both diagnostic tests, blinded to the comparison test results, and treated patients according to the same algorithm. Thus, while diagnosis and treatment were supposedly consistent, the treatment was not prescribed blinded to the test used. This would be considered a design omission in many studies, but in the present case, this limitation seems unavoidable, because for many patients treatment begins during the course of the diagnostic exam.

After 1 year of followup, 18 percent (14/75) of the VFSS patients developed pneumonia, compared with 9.8 percent (6/51) of FEESST patients. The difference between these two groups was not statistically significant (p = 0.176). These results fail to show that the apparent trend toward FEESST superiority is reliable, possibly because of low statistical power; they equally fail to show that VFSS was superior (again with low power). Because the study is still ongoing, the issue of low power may ultimately be resolved.

The authors of this study suggest that the sensory testing provided by FEESST can identify patients who may aspirate intermittently but who do not necessarily aspirate during the test. They also suggest that the longer time over which FEES/FEESST can be used during a single session may detect aspiration that occurs with fatigue that is not observable on the shorter VFSS exam. However, there was no evidence presented that the sensory testing part of the exam per se was a predictive factor for aspiration pneumonia.

Studies of other designs comparing VFSS and FEESST for prediction of pneumonia

Because the abilities of VFSS and fiberoptic endoscopic (FE) exam to assist in pneumonia prevention have been compared by only the single trial discussed in the preceding subsection (the results of which were inconclusive), we examined the other reported controlled trials. Evidence Table 44 presents the only other study we found that made this comparison (Aviv, Sacco, Mohr et al., 1997). This was a prospective trial controlled by giving VFSS and FEESST to the same patients, and blinding the readers of the results of one test to the results of the other test. This, plus the fact that this trial used the gold standard of long-term followup for pneumonia, can make the design of this trial quite strong, but only if patients who test positive with either test receive the same treatment. That was not the case for this trial, because the results of FEESST were not used in choosing treatment; only patients who aspirated on VFSS were given a PEG tube. After 2 years of followup, the number of cases of pneumonia was counted and compared with the test results. VFSS had a sensitivity of 33 percent (2/6) and a specificity of 43 percent (6/14); FEESST had a sensitivity of 83 percent (5/6) and a specificity of 50 percent (7/14). A combination of the two tests had a sensitivity of 83 percent (5/6) and a specificity of 36 percent (5/14); these figures are consistent with counting a patient as positive when either test was positive, since a rule requiring both tests to be positive could not be more sensitive than VFSS alone. The fact that some patients were treated, which likely prevented some cases of pneumonia, means that sensitivity and PPV may be underestimated for both tests. Furthermore, because only patients with a positive VFSS test were treated, the VFSS sensitivity and PPV were likely underestimated to a greater degree. This problem makes it impossible to use this trial to compare diagnostic methods.

Which Better Detects Aspiration, VFSS or FEES? (Links D5 through D7 to S1)

Because of the inconclusive nature of the above studies, we looked beyond the preferred outcome of long-term followup for pneumonia and considered studies that compared VFSS and fiberoptic endoscopy for detecting aspiration. Although aspiration is only loosely correlated to pneumonia (not all patients who aspirate get pneumonia, and not all patients who get pneumonia aspirate) and thus cannot be considered a reliable surrogate measure for pneumonia, the limited correlation that appears to exist makes aspiration an interesting symptom for test comparisons. As with the preceding question, we included only controlled trials. We found four such studies, which are shown in Evidence Table 45. All of these studies compared the ability of VFSS and FEES to detect aspiration.

Although some studies used one diagnostic as a gold standard against which to compare the other diagnostic, this is not appropriate, because FEES may detect some cases of aspiration that VFSS does not, and VFSS may provide some false positives that FEES correctly identifies as negative, and vice versa. The Leder et al. (Leder, Sasaki, and Burrell, 1998) study only reported diagnostic accuracy, which is the combination of positive and negative agreement. Unfortunately, diagnostic accuracy does not tell one whether the disagreements were in the positive or negative direction, which is particularly important when the consequences of a false positive are different from those of a false negative. In the present case, the consequences of a false negative are pneumonia and, perhaps, death. On the other hand, the most serious possible consequences of a false positive are the complications caused by feeding tubes.
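The limitation of reporting only overall agreement can be made concrete with a small example. The sketch below uses hypothetical counts for two imaginary study results (none of these numbers come from Leder et al. or any other study cited here), and the names `overall_agreement`, `only_vfss`, and `only_fees` are invented for this illustration. It shows two patterns of disagreement that point in opposite directions yet yield the same overall agreement figure.

```python
def overall_agreement(both_pos, both_neg, only_vfss, only_fees):
    """Fraction of exams on which the two tests gave the same result."""
    total = both_pos + both_neg + only_vfss + only_fees
    return (both_pos + both_neg) / total


# Hypothetical study A: every disagreement is aspiration seen on VFSS only.
study_a = dict(both_pos=30, both_neg=60, only_vfss=10, only_fees=0)

# Hypothetical study B: every disagreement is aspiration seen on FEES only.
study_b = dict(both_pos=30, both_neg=60, only_vfss=0, only_fees=10)

agreement_a = overall_agreement(**study_a)
agreement_b = overall_agreement(**study_b)

print(f"Study A agreement: {agreement_a:.2f}")
print(f"Study B agreement: {agreement_b:.2f}")
```

Both hypothetical studies report 90 percent agreement, yet in one the discordant cases would all be missed by FEES and in the other they would all be missed by VFSS. A single agreement figure cannot distinguish these situations, which is why the direction of the disagreements needs to be reported separately.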

The remaining three studies all reported finding some apparent cases of aspiration detected by VFSS but not FEES, but they also reported that some cases detected by FEES were not detected by VFSS. Kaye, Zorowitz, and Baredes (1997) reported substantially more cases of aspiration detected by VFSS than by FEES. This was the largest study, but there was no indication that the readers of one test were blinded to the results of the other test. Also, the bolus volumes, consistencies, and number of boluses were not equivalent between the two exams, and it is difficult to detect aspiration of water by FE (milk or colored liquids are typically used). In the Wu, Hsiao, Chen et al. (1997) study, the readers were blinded, and more positives were reported for FEES than for VFSS. The remaining study by Langmore, Schatz, and Olson (1991) reported the same number of positives for each test that were missed by the other test; however, this was only 1 (out of 21 tests) in each case.

The important finding is that in all of these studies, each test detected some apparent positives that the other test did not. This means that neither test can be the reference standard for the other. The amount of disagreement over these positives is not reliable, because there was no outcomes-based gold standard for aspiration. Investigators on future studies should recognize these limitations of study design and they should also recognize that aspiration is a surrogate marker for the more useful endpoint of long-term pneumonia frequency.

Conclusions Concerning the Comparability of Videofluoroscopy and Fiberoptic Endoscopy for Prevention of Pneumonia and Detection of Aspiration

A single pseudo-randomized controlled trial with low statistical power and moderately strong internal validity reported a trend that incidence of aspiration pneumonia in dysphagia patients with mixed etiologies was lower after FEESST than VFSS but failed to show that either test was statistically significantly superior. This study did not address whether either of these tests is superior to the BSE, and the reader is referred to Question 2 for a discussion of this latter issue.

We found no reliably interpretable evidence comparing the abilities of VFSS and FE to detect aspiration. This difficulty was primarily due to the fact that the trials that examined aspiration failed to incorporate an appropriate gold standard external to VFSS and FE. However, even were such an aspiration gold standard possible, it is important to remember that aspiration is a surrogate endpoint for the long-term followup of pneumonia frequency. Thus, at the present, aspiration pneumonia frequency is both the most desirable and the only available gold standard for any instrumented exam. However, intervening treatment biases diagnostic test detection and prediction results. Thus, without withholding treatment, the absolute test detection and prediction results cannot be known. If treatment is not identical after each test, the relative detection and prediction abilities also cannot be known.

Summary

Measuring the performance of diagnostic tests for dysphagia is difficult because of the lack of a gold standard. While VFSS is the most widely used diagnostic imaging method, and therefore considered the gold standard, it does not demonstrate perfect sensitivity and specificity. As a result, it becomes difficult to interpret the results of studies that compare VFSS with other newer diagnostics, such as FEES and FEESST. This is because when one test yields a positive diagnosis and the other test yields a negative one, it is impossible to determine which test is correct. Similarly, if the two tests agree, they may both be wrong or they may both be right. The only way to assess test performance is to use clinical followup. In the studies we examined, however, followup was confounded by treatment intervention.

Because of these limitations, research thus far has been unable to demonstrate any significant advantage of one test over another. Comparisons have been made of the abilities of VFSS and FEESST for prevention or prediction of aspiration pneumonia and of the abilities of VFSS and FEES to detect aspiration. There was a statistically nonsignificant trend in one pseudo-randomized trial comparing VFSS and FEESST suggesting that incorporation of FEESST into a dysphagia diagnosis and treatment program results in lower aspiration pneumonia rates than in programs that employ VFSS. Further data are needed to determine whether this trend is reliable. Another within-subjects trial comparing the predictive value of these tests for aspiration pneumonia was flawed in that treatment decisions were made on the basis only of VFSS, and not on the basis of the FEESST results, thus leaving the results of each program noncomparable.

A comparison of VFSS and FEES for detection of aspiration in four studies showed that VFSS detects some aspiration that FEES does not, and that FEES detects some aspiration that VFSS does not. The ultimate gold standard, clinical outcome, was not used in any of these trials. Because of the above-mentioned limitations in studies of dysphagia diagnostic technologies, and because of the lack of a reliable gold standard, these trials provide no reliable information about relative efficacy of these two diagnostic tests.

A few studies have evaluated the interobserver reliability of videofluoroscopy. Two studies reported very high reliability overall when testing several different viscosities. These studies used retrospective viewing of tapes or film. As such, these results may not be generalizable to the more common procedure of viewing the test and making clinical decisions in real time. One study that analyzed each viscosity independently reported a high reliability only for thin liquids, and very low or nonexistent reliability for thick liquids and solids. While this low reliability is problematic because solids and thick liquids are the first substances tested on VFSS, the high reliability of thin liquids is reassuring because this is the viscosity most often problematic for patients with neurogenic dysphagia.

No studies have compared the test characteristics of FEES and FEESST. Such a comparison is warranted to determine if the quantitated sensory testing included as part of FEESST actually provides any clinically relevant information. Further within-subjects studies comparing VFSS and FE should make treatment determinations for patients tested positive by either test so that results, although confounded by the treatment, will be comparable between the two tests. Alternatively, a randomized controlled trial should be carried out in which the patients in each arm are diagnosed by only one method, and are treated solely based on that method. This would demonstrate the relative effect of each test on actual patient outcomes; however, the detection and prediction abilities of the tests in the absence of treatment would not be known.

In the absence of such a trial, and because current literature does not appropriately address the issue of differences between diagnostic tests, we further explored the answer to this third question in our supplemental analysis. This latter analysis, in turn, relies upon the data we discussed in our answer to Question 1.

4. When is noninvasive swallow therapy or diet modification appropriate? Do these therapies work particularly well in any particular patient population? What can the evidence tell us about these therapies? Is PEG useful as a primary therapy or is it a last resort?

In addressing this last question, we turn our attention to treatment of dysphagia. The primary question asks: Do these treatments prevent morbidity and/or mortality after dysphagic stroke, and if so, which treatments work the best? As we have noted in our discussions of the previous three questions, treatment is difficult to address in the absence of a discussion of diagnosis, so the discussion surrounding the present question cannot be viewed apart from our previous discussions. In the present section, however, we broaden our discussion to include specific outcomes (e.g., malnutrition) that we have not previously discussed in detail. We also expand our discussion to include feeding tubes and the benefits and risks associated with these tubes.

As with our previous discussions, we focus on stroke victims, which again is a reflection of the emphasis in the literature. Because of this, one must be cautious about extrapolating our results to patients with other neurologic conditions; dysphagia in stroke victims typically does not progress (in fact, it often spontaneously resolves) but in patients with neurodegenerative disorders (e.g., Parkinson's and Alzheimer's disease), the opposite is true.

Because stroke victims often spontaneously regain their ability to swallow (Barer, 1989; Smithard, O'Neill, England et al., 1997), the effects of treatment are difficult to separate from those of the recuperation that occurs after a stroke. Because of this, it is especially important that researchers who evaluate dysphagia treatments in stroke patients include control groups in their studies. Although the ethics of including a no-treatment control group in a prospective trial are questionable, it remains possible to compare different treatments. This has been done in only two published trials (DePippo, Holas, Reding et al., 1994; Groher, 1987). Alternatively, historically controlled trials or other quasi-experimental designs can be employed. A large literature exists on how such studies can be conducted and validly analyzed, and the interested reader is referred to Cook and Campbell's classic work on this subject (Cook and Campbell, 1979) for further discussion.

Many noninvasive dysphagia therapies have been developed, as discussed in the section entitled Treatment of Oropharyngeal Dysphagia. Treatments of interest in this section are: diet modification, noninvasive swallow therapy (postural maneuvers and exercises), a combined treatment including both diet modification and swallow therapy, and PEG tubes. While diet modification could be considered one part of noninvasive swallow therapy, we broke it out separately here because many of the published studies looked at these two approaches independently.

In considering this fourth question, we included studies of noninvasive treatments regardless of design if they contained 10 or more treated patients and quantitatively reported results. For our analyses of PEG tubes, we were able to limit the studies included to those with the following characteristics:

  • 50 or more patients

  • U.S. studies

  • Quantitative results

  • Reported followup time

More stringent limitations are applied to some specific PEG tube outcomes because of possible effects of patient characteristics on results.

Outcomes Considered

As implied by the Evidence Model (see Figure 1), we considered a large number of outcomes for this evidence report. The most important are those that directly impact the long-term health or well-being of the patient. We divided these into short-term outcomes, major morbidities, and long-term outcomes. Short-term outcomes include energy intake, weight change, aspiration, prevention of feeding tube usage, and minor feeding tube complications. Major morbidities were pneumonia, malnutrition, and major feeding tube complications. Long-term outcomes are QOL, mortality resulting from underlying illness, mortality as a result of morbidity, and mortality resulting from treatment (feeding tube only).

Many studies have looked at the short-term effects that noninvasive swallow therapies (such as chin tuck, head turning, and palatal training appliances) and diet modification (such as changes in bolus volume, flavor, or temperature) have on the physiological process of swallowing, such as total swallow duration and pharyngeal transit time (Bisch, Logemann, Rademaker et al., 1994; Lazarus, Logemann, Rademaker et al., 1993). However, little evidence in the literature directly links changes in these physiological processes to changes in the overall swallowing-related health of the patient; for example, there is no evidence linking improved (decreased) pharyngeal transit time to reduced incidence of aspiration pneumonia. Because of this, we will not discuss these findings, and we limit our discussion to the clinically meaningful outcomes outlined above.

Table 16. Treatments for which Outcomes of Interest Were Not Reported

| Swallow therapy: Outcome | Link | Diet modification: Outcome | Link | Combined swallow therapy and diet modification: Outcome | Link | Feeding tube: Outcome | Link |
|---|---|---|---|---|---|---|---|
| Change in food intake | T1 to O2 | Change in food intake | T2 to O2 | Malnutrition | T3 to M2 | Malnutrition | T4 to M2 |
| Weight change | T1 to O4 | Prevention of feeding tube | T2 to O3 | Dehydration | T3 to M2 | Dehydration | T4 to M2 |
| Malnutrition | T1 to M2 | Weight change | T2 to O4 | QOL | T3 to O7 | QOL | T4 to O7 |
| Dehydration | T1 to M2 | Dehydration | T2 to O7 | | | | |
| QOL | T1 to O7 | QOL | T2 to M2 | | | | |

Many of these outcomes of interest were not reported in the treatment literature. These treatments and outcomes, along with the evidence model links that represent their relationship, are depicted in Table 16.

Table 17. Outcomes of Interest for which No Comparison Groups Exist in the Published Literature

| Swallow therapy: Outcome | Link | Diet modification: Outcome | Link | Combined swallow therapy and diet modification: Outcome | Link |
|---|---|---|---|---|---|
| Aspiration | T1 to O5 | Aspiration | T2 to O5 | Aspiration | T3 to O5 |
| Prevention of feeding tube | T1 to O3 | Malnutrition | T2 to M2 | | |
| Mortality | T1 to O8, O9 | Mortality | T2 to O8, O9 | | |

Because the case series design provides little information on treatment efficacy in stroke patients, we looked for other case series that might provide comparative data (historical studies for control groups and contemporary studies for other treatments). In some cases potential comparison data were identified but were unusable because of differences in care setting, patient population, or method of measuring the outcome of interest. When no comparative data could be located, we could not analyze these case series. Outcomes for which only case series were available and no comparative data were located are shown in Table 17; they are included in Evidence Tables 46 through 68 but are not discussed here.

The exception to this exclusion of case series is the literature on feeding tube usage, which may provide useful information from case series because this treatment is more applicable to chronic stroke patients who may not spontaneously recuperate.

Table 18. Outcomes Available for Analysis by Treatment Category

| Swallow therapy: Outcome | Link | Diet modification: Outcome | Link | Combined swallow therapy and diet modification: Outcome | Link | Feeding tube: Outcome | Link |
|---|---|---|---|---|---|---|---|
| Pneumonia | T1 to M1 | Pneumonia | T2 to M1 | Pneumonia | T3 to M1 | Pneumonia | T4 to M1 |
| | | | | Mortality | T3 to O8, O9 | Mortality | T4 to O8, O9, O10 |
| | | | | Change in food intake | T3 to O2 | Aspiration | T4 to O5 |
| | | | | Prevention of feeding tube | T3 to O3 | Weight change | T4 to O4 |
| | | | | Weight change | T3 to O4 | Minor tube complications | T4 to O6 |
| | | | | | | Major tube complications | T4 to M4 |

Therefore, we considered only the outcomes shown in Table 18 in our discussion of Question 4.

We discuss each of these outcomes for each of these treatments (and, hence, each appropriate link in the evidence model) in the following sections and subsections. We will first discuss pneumonia, the most commonly reported outcome, followed by any other morbidities for which there were data; short-term outcomes (weight change, feeding method, energy intake, aspiration, minor tube complications) are then discussed, followed by mortality.

Noninvasive Therapies

Because of the inconvenience, discomfort, and cost of enteral feeding, it is important to patients, clinicians, caregivers, and payers that the patient be able to eat orally and independently whenever it is deemed safe to do so. This means that the patient must be able to ingest adequate amounts of food to maintain a healthy weight and not aspirate significant amounts of harmful substances. It must also be comfortable for the patient to eat; if they experience pain or the disquiet of coughing on every swallow, they will be reluctant to eat, and will ultimately lose weight and become susceptible to further illnesses.

Table 19. Characteristics of Controlled Studies on Noninvasive Therapies for Dysphagia

| Study | N | Study design | Care setting | Mean age | Primary disease(s) | Treatment | Time frame |
|---|---|---|---|---|---|---|---|
| Groher 1987 | 23; 23 | RCT | Nursing home | 71.8; 74.2 | CVA with history of aspiration pneumonia | Pureed diet; Mechanically altered diet | 6 mos |
| Kasprisin 1989 | 48; 13; 8 | Retro CT | Hospital | NR | Various | Swallow therapy | NR |
| Martens 1990 | 16; 15 | HPCS | Acute hospital unit | 49.3; 46.1 | Brain injury, tumor, CVA | Various (diet, exercise, counseling); No specific dysphagia treatment | NR |
| DePippo 1994 | 38; 38; 38 | RCT | Rehab unit | 76; 74.5; 73 | CVA | Diet and swallow technique recommendations; Therapist-prescribed diet and swallow techniques; Therapist prescribed diet and techniques reinforced | 1 year |

Four published studies on noninvasive therapies in the dysphagia literature meet our inclusion criteria for size and study design in this section (DePippo, Holas, Reding et al., 1994; Groher, 1987; Kasprisin, Clumeck, and Nino-Murcia, 1989; Martens, Cameron, and Simonsen, 1990). Each of these studies used a different study design, patient population, care setting, or followup time that potentially affected results; these studies are therefore not comparable to one another. Table 19 displays the characteristics of each of these studies that render them noncomparable. We will therefore only perform within-study comparisons.

All of these studies suffer from the same limitation, which makes it impossible to draw conclusions from their results no matter how well designed they are. The problem is one of statistical power: if the outcome of interest (in this case, pneumonia or mortality) has a low incidence rate, the total number of patients in the study must be high enough to avoid a type II error, that is, a failure to detect a group difference that actually exists. An example of how to calculate the number of subjects needed is included in the Future Research section. Studies of patients with dysphagia that report pneumonia or death as an outcome would require several hundred patients to obtain reliable findings.
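
The dependence of study size on incidence can be illustrated with a minimal sketch. It uses the standard normal-approximation formula for comparing two proportions, which is not necessarily the calculation presented in the Future Research section. The function name and the incidence values are hypothetical and chosen only for illustration; the two-sided alpha of 0.05 and power of 0.80 are conventional defaults, not values taken from the studies reviewed here.

```python
from math import ceil, sqrt
from statistics import NormalDist


def patients_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate patients needed per group to detect p1 versus p2.

    Normal-approximation formula for two independent proportions,
    two-sided test. Illustrative helper, not taken from the report.
    """
    if p1 == p2:
        raise ValueError("the two incidence rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the type I error rate
    z_beta = NormalDist().inv_cdf(power)           # quantile for 1 minus the type II error rate
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical rare outcome: 5 percent versus 10 percent incidence.
n_rare = patients_per_group(0.05, 0.10)

# Hypothetical very large effect: 20 percent versus 80 percent incidence.
n_large = patients_per_group(0.20, 0.80)

print(n_rare, n_large)
```

Under these assumptions, a modest difference in a rare outcome calls for several hundred patients per group, whereas a very large difference can reach significance with only a small number of patients per group.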

Pneumonia

Pneumonia has been the outcome most often reported in these studies. This, however, is not to say that pneumonia is the only important outcome after dysphagia; malnutrition, QOL, dehydration, and death are also important outcomes. Pneumonia is, however, the most common illness experienced by the elderly, and a common cause of death for these patients. It has thus been the primary focus of researchers in this field. There are no studies on the other major morbidities to discuss in this section, and we therefore recommend that researchers pay more attention to these outcomes when conducting future research.

Pneumonia as an outcome is shown in Evidence Tables 56 through 58. Pneumonia has been reported as an outcome in a retrospective controlled trial on noninvasive therapies (Kasprisin, Clumeck, and Nino-Murcia, 1989), a randomized controlled trial on diet modification (Groher, 1987), an RCT on combined therapy (DePippo, Holas, Reding et al., 1994), and a historical prospective case series on combined therapy (Martens, Cameron, and Simonsen, 1990). Only well-designed RCTs can be completely free of potential confounds. Controls in these studies were not patients who received no treatment at all, but a different treatment (Groher, 1987) or a different level of treatment (DePippo, Holas, Reding et al., 1994). Therefore, even in these relatively well-designed studies, no pure effects of treatment can be determined, only effects relative to the control treatment.

Groher (1987) compared the use of a pureed diet (control group) with the use of a mechanically altered diet (treatment group) for patients with a history of aspiration pneumonia. This study specifically challenged the notion that a pureed diet was safe for such patients. A mechanically altered diet was defined as one in which soft foods were allowed as well as altered fluids (thickened or frozen). The researchers reported that in the treatment group, 17.4 percent of patients developed pneumonia, while in the control group, 86.9 percent developed pneumonia; this difference was statistically significant (p<0.05). Although the number of subjects in this trial was relatively small, the large effect size made statistical significance possible. The results of this trial strongly indicate that a mechanically altered diet is safer for stroke patients with dysphagia than the conventional pureed diet; these results, however, are limited to those in a nursing home with a history of aspiration pneumonia. These results may not be generalizable to patients with higher functional status.
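
The role of the large effect size can be sketched with an exact test on a 2x2 table. The counts used below (4 of 23 and 20 of 23 patients with pneumonia) are an assumption: they are back-calculated from the reported percentages and the group sizes in Table 19, not taken from the study itself. The report does not state which test Groher used, so the two-sided Fisher's exact test here is a stand-in for illustration, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * tolerance)


# ASSUMED counts, back-calculated from 17.4 percent and 86.9 percent of 23 patients.
mech_pneumonia, mech_none = 4, 19      # mechanically altered diet group
pureed_pneumonia, pureed_none = 20, 3  # pureed diet group

p_value = fisher_exact_two_sided(mech_pneumonia, mech_none,
                                 pureed_pneumonia, pureed_none)
print(p_value)
```

With these assumed counts the p-value falls far below 0.05, which is consistent with the significance reported for this trial despite only 23 patients per group.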

DePippo, Holas, Reding, et al. (1994) compared three different levels of intervention in a randomized controlled trial with 115 stroke patients in a rehabilitation center. Patients in Group I received diet and swallow technique recommendations without therapist followup or reinforcement, those in Group II received therapist-prescribed diet and special swallowing techniques, and those in Group III received therapist-prescribed diet and swallowing techniques with daily reinforcement from the therapist. The treatment lasted for about 2 months, and followup continued for 1 year. Those patients with the lowest intervention level had a pneumonia incidence of 2.6 percent; those on intermediate intervention experienced the highest pneumonia incidence at 13.2 percent. The most intense intervention group had an incidence rate of 5.1 percent. Pneumonia incidence did not correlate with the level of intervention, and the difference between the groups was not significant.

Because these two studies were well-controlled for patient population, treatment, and care setting, their results could, theoretically, be reliable. However, because of low statistical power (small number of patients and low incidence of pneumonia in these populations), conclusions cannot be drawn.

The results of DePippo et al. (1994) can be compared with those reported in a case series on the same type of treatment for a similar patient population. Gottlieb et al. (1996), in a case series of 50 stroke patients with dysphagia, reported a pneumonia incidence of 18 percent (9/50) after a mean 53.5 days. Gottlieb's pneumonia incidence rates are higher than DePippo's; DePippo's subjects received VFSS exams, while Gottlieb's were diagnosed using a water swallow test. It is conceivable that the water swallow test did not detect silent aspirators, and therefore the treatment applied in this study was not geared to prevent aspiration, thus resulting in more cases of pneumonia. However, it is also unclear how intensive the therapy was in Gottlieb's study. Thus, it is impossible to determine the exact cause of these differences.
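
The uncertainty attached to a single case-series rate such as Gottlieb's can be sketched with a confidence interval. The example below applies the Wilson score interval to the reported 9 pneumonia cases among 50 patients; the choice of the Wilson method and the function name are assumptions made for illustration and are not part of the original analysis.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(events, n, confidence=0.95):
    """Wilson score confidence interval for a proportion events / n."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need n > 0 and 0 <= events <= n")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = events / n
    denominator = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denominator
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denominator
    return center - half_width, center + half_width


# Counts reported for the Gottlieb et al. (1996) case series: 9 of 50 patients.
lower, upper = wilson_interval(9, 50)
print(f"{lower:.3f} to {upper:.3f}")
```

The resulting 95 percent interval runs from roughly 10 to 31 percent, which illustrates how imprecise an incidence estimate from 50 patients is and why differences between such series are hard to interpret.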

Less well-controlled studies were also done on swallow therapy and combined therapy. Kasprisin et al. (1989) conducted a retrospective pseudo-controlled trial that compared three groups of patients with various diseases (both neurologic and nonneurologic): two groups that received swallow therapy (including thermal stimulation, supraglottic swallow, and bolus propulsion exercises) and one that did not. Group 1 (48 patients) received therapy and had no history of aspiration pneumonia; Group 2 (13 patients) received therapy and had a history of aspiration pneumonia; Group 3 (8 patients) did not receive treatment (for a variety of self-determined reasons) and their history of pneumonia was not reported. Incidence of aspiration pneumonia in each of these groups was 6.3 percent, 15.4 percent, and 100 percent, respectively, 1 to 36 weeks after treatment (mean not given). The differences between Groups 1 and 3, and Groups 2 and 3 were statistically significant. Although the results of this study suggest that swallow therapy helps prevent aspiration pneumonia both in patients with a history of it and those without such a history, few details were provided on the untreated group, and therefore potential biases may have been introduced. The control group consisted of a very small number of patients (eight) and could have differed from the treated groups in clinically significant ways. Their pneumonia rates were tracked immediately while the treatment group's pneumonia rates were not tracked until they had completed treatment (2-4 weeks after admission); this difference may be significant if pneumonia rates change over time (as discussed in Appendix D). It is also unclear how the control group patients were tracked after they transferred to a different hospital or moved out of the geographical area. It is possible that patients who stayed within the healthcare system because of health problems such as pneumonia were more easily followed, thus inflating the pneumonia rates in the untreated individuals. The difference between the pre- and post-treatment rates of pneumonia in Group 2 (100 to 15.4 percent) looks promising, but the meaning of this difference is difficult to evaluate without knowing the etiological makeup of the patient population; if there were many stroke patients, recovery could have occurred spontaneously.

Martens et al. (1990) conducted a historical prospective case series comparing 16 acute-care patients receiving individualized dysphagia treatment with 15 historical controls with no such individualized treatment. They reported no cases of aspiration pneumonia in either group during a 30- to 40-day acute-care stay (Martens, Cameron, and Simonsen, 1990), which suggests that a dysphagia program was not needed for these patients to prevent aspiration pneumonia. However, the etiologies of these patients were mixed (patients with head injuries and tumors were included), and the average age of these patients was in the upper 40s. Because this was the only study not reporting cases of pneumonia in the control group, it is possible that publication bias is playing a role, because investigators would not be likely to publish a study reporting no effect.

Short-Term Outcomes

There are several different ways of measuring the short-term efficacy of a treatment. In dysphagia treatment, whether noninvasive or invasive, the short-term goal is to safely increase a patient's food intake and/or weight. Only one study has thus far reported these results in a pseudo-controlled trial on dysphagia treatment (Martens, Cameron, and Simonsen, 1990) (see Evidence Tables 46 and 49). These researchers reported that patients of various etiologies who underwent a directed dysphagia treatment program fared better than those not treated, in terms of weight gain (+1.41 kg versus -2.82 kg) and kilocalorie intake (97.3 kilocalories fewer than desired versus 488.8 kilocalories fewer than desired); these differences were marginally significant (p = 0.05). Only limited information can be gleaned from this study because of design, patient characteristics, and nonspecific treatment descriptions (see Table 19); it suggests, but does not prove, that a directed dysphagia treatment program is better than no such program.

Another issue is whether patients who learn swallowing techniques can perform them independently after leaving the inpatient care setting; these techniques are useless (or, alternatively, very expensive) if they can only be performed under the direction of a clinician. DePippo et al. (1994) reported on three different levels of intervention in a stroke rehab unit (see Evidence Table 48). These researchers found that the patients undergoing the most intensive therapy were significantly more likely to be able to perform these techniques independently. However, this same study found no differences in pneumonia incidence among the three groups; this suggests that the ability to perform the techniques independently is not clinically significant. We do not, however, know the effect of independent performance on the QOL or nutritional status of the patient.

Mortality

Patients undergoing noninvasive therapy are not usually those patients expected to die; the more severely debilitated patients will likely be put on a feeding tube instead. As a result, very few studies have reported mortality as clinically important when examining noninvasive treatments. Indeed, the one analyzable study that did report this outcome found that no patients died in any of the treatment groups (DePippo, Holas, Reding et al., 1994) (see Evidence Table 65). Mortality is a significant outcome in patients with a PEG tube (as discussed below), suggesting that patients undergoing noninvasive therapy are characteristically different from those undergoing enteral feeding. Therefore, PEG tubes will be discussed below without comparison to the results of noninvasive treatment.

Summary

Only four studies in the published literature provided information suitable for analysis of the effectiveness of noninvasive therapy. Two of these studies provided data from well-designed and controlled RCTs that may, as a result, be considered reliable. These were also the only studies that exclusively examined stroke patients. These two studies suggest that soft mechanical diets lead to lower pneumonia incidence than pureed diets in patients with a history of aspiration pneumonia, and that intensity of treatment intervention has no apparent effect on pneumonia incidence.

Less well-controlled studies have reported contradictory findings: one reported that aspiration pneumonia did not occur in acute-care patients, so that no treatment effects could be reported, and the other reported that those without a directed dysphagia program experienced a significantly higher incidence of pneumonia. Because these studies included patients of mixed etiology and had flawed study designs, no conclusions can be drawn.

There is also a suggestion from a case series with historical comparison group that dysphagia treatment leads to improved weight gain and calorie intake. Such short-term improvements may lead to reduced morbidities later, and therefore such outcomes are important to report.

DePippo et al. (1994) have also reported that greater intensity of treatment leads to better ability of patients to perform exercises independent of clinician assistance. However, they provided no evidence that such independent function leads to improved clinical outcomes.

In conclusion, there is limited information that some specific noninvasive treatments have significant effects on clinical outcome measures (pneumonia and weight gain) in certain care settings and specific patient populations. However, contradictory findings have been reported, and no two studies have examined identical patients in identical care settings, thus rendering these studies noncomparable.

PEG Tubes

In many cases, as discussed previously in this report, stroke patients spontaneously regain their ability to swallow. These are recuperative stroke patients with mild to moderate dysfunction. If these patients have severe difficulty swallowing immediately after the event, they may be given an NG tube temporarily to help them ingest adequate amounts of food. They may also receive noninvasive swallow therapy and/or diet modification until the dysphagia resolves. However, there are also chronic stroke patients who recover more slowly, or never recover full swallow ability. These patients would most benefit from effective noninvasive or invasive treatments because of the long-term risks of unsafe swallowing.

While NG tubes are used short term, PEG tubes, because they are invasive, are generally reserved for those patients requiring long-term enteral feeding (Lazarus, 1990). However, acute-care stroke patients are sometimes given a PEG tube for sustenance while undergoing noninvasive swallow therapy or while receiving recreational oral feeding. Those patients with chronic stroke accompanied by severe dysphagia may receive a PEG tube if noninvasive therapies prove ineffective in eliminating aspiration or sustaining adequate energy intake. With the advent of more specialized noninvasive therapies, PEG tube placement may become a last resort. The patients placed on PEG tubes are thus likely to be sicker than those who continue with noninvasive therapy. Consequently, it is not appropriate to compare the outcomes of PEG tube patients with those undergoing noninvasive therapy. Tube patients are not only at increased risk of morbidity and mortality because of their debilitated status, but the insertion of the feeding tube also can cause many different minor and major complications. These additional factors make it all the more important to identify efficacious noninvasive measures whenever possible and reserve PEG for those patients with unresolvable dysphagia or other feeding-related functional problems. Most studies conducted thus far on PEG tubes have included a mixture of patients with widely varying etiology (from paralysis to Parkinson's) and varying levels of cognition. Many studies included patients in a comatose state, and these studies will not be discussed here except in reference to tube-related complications and tube-related mortality, for which we assume the condition of the patient plays no significant role. 
For our discussion of morbidities and non-tube-related mortality, we limit our discussion to studies in which 90 percent or more of patients were cognizant and the etiology was largely neurologic (>75 percent), because those patients are the most relevant to this report, the most likely to be suffering from dysphagia, and therefore potentially eligible for noninvasive therapy. This reduces our pool of eligible studies from 20 to 7 (all studies are recorded in the appropriate evidence table, however). Only two studies were located that reported patients with dysphagia as a major part of their patient pool (Britton, Lipscomb, Mohr et al., 1997; Norton, Homer-Ward, Donnelly et al., 1996). We will pay particular attention to these two studies in this discussion.

Major Tube Complications

Major complications are those that seriously affect the health of the patient, threaten mortality, or require major interventions (such as surgery) to repair. Major complications were reported by 11 studies and included:

  • Gastric fistula

  • Bowel obstruction

  • Gastric bleeding

  • Peritonitis

  • Severe infection

  • Wound closure failure

  • Necrotizing fasciitis (extensive streptococcal cellulitis)

  • Gastric perforation

  • Hematoma

  • Septicemia

  • Ileus

Studies reporting these outcomes are shown in Evidence Table 61. If incidence rates of any of these complications were not reported in a given study, we did not automatically assume that this complication did not occur; instead, we do not discuss studies for which no data were reported. This may lead to overestimates of complication rates if, in fact, the researchers found that the complication of interest did not occur. This possible bias is unfortunately unavoidable.

All studies reporting major complications were case series. Studies on PEG tubes meeting our inclusion criteria reported that 1 to 2 percent of the patients experienced major complications during the acute-care stage or a 30-day followup interval. These complications included failure of the incision to heal at tube removal (Kadakia, Sullivan, and Starnes, 1992), gastric fistula (Miller, Castlemain, Lacqua et al., 1989; Ponsky, Gauderer, and Stellato, 1983), peritonitis (Grant, 1993; Miller, Castlemain, Lacqua et al., 1989; Stern, 1986), bowel obstruction (Golden, Beber, Weber et al., 1997), and bladder perforation (Larson, Burton, Schroeder et al., 1987). Higher percentages of complications were reported over longer periods of time by two studies (Kaw and Sekas, 1994; Taylor, Larson, Ballard et al., 1992). Three complications were reported in these two studies: bowel obstruction, gastric bleeding, and ileus. All had an incidence rate of 4.3 percent with a followup of about a year.

Because each of these complications was reported by only a few studies, it is not possible to determine the precise risk of each, or trends based on age, patient characteristics, or time. Also, because these complications occur at such a low rate (often occurring in only a single patient in each study), a larger patient population is necessary to achieve sufficient statistical power to compare these numbers. We have assumed that patients with neurogenic dysphagia would face the same risk of complications as other enterally fed patients; however, it is conceivable that a pool of patients with dysphagia who were alert and cognizant (the population of primary interest in this report) might experience fewer complications because of stronger physiology.

Minor Tube Complications

Minor complications are fairly common after PEG insertion, occurring in 4 to 40 percent of patients. For the purposes of this particular analysis, we modified the inclusion criteria of at least 50 patients that we adopted for studies of feeding tubes to include all studies with 10 or more patients with long-term followup, defined as greater than 30 days. We modified our inclusion criteria because such long-term studies were scarce.

The most commonly reported minor tube complications were:

  • Tube blockage

  • Wound infection

  • Tube extrusion

  • Tube migration

  • Cellulitis

  • Tube leakage

These complications are considered minor because they can usually be resolved with tube reinsertion, replacement, or antibiotics; invasive or extreme measures are rarely needed.

Those studies reporting minor tube complications are shown in Evidence Table 55. Dysphagia has not been broken out in any study reporting tube complications; we assume for the purposes of this discussion that disease etiology plays no significant role in the occurrence of tube complications. When studies did not report on individual complications, we do not automatically assume that they did not occur during that study, and therefore that cell of the relevant evidence table is left blank rather than filled with a 0.

Tube blockage

As shown in Evidence Table 55, tube blockage has been reported in 3 to 30 percent of patients with a PEG tube (Jarnagin, Duh, Mulvihill et al., 1992; Kaw and Sekas, 1994; Kirby, Craig, Tsang et al., 1986; Llaneza, Menendez, Roberts et al., 1988; Taylor, Larson, Ballard et al., 1992; Wolfsen, Kozarek, Ball et al., 1990). All studies were case series. The highest rate was found in a nursing home population (Kaw and Sekas, 1994); the lowest was in a mixed population followed only for the length of their acute-care stay (Jarnagin, Duh, Mulvihill et al., 1992). A tube blockage incidence rate of 8 to 9 percent was reported in three studies (Llaneza, Menendez, Roberts et al., 1988; Taylor, Larson, Ballard et al., 1992; Wolfsen, Kozarek, Ball et al., 1990); most were patients with neurologic disorders who were released to a variety of care settings. This may provide an estimate of an average occurrence rate for this complication.

One case series compared PEG with percutaneous endoscopic jejunostomy (PEJ) tubes, and found significantly more tube blockage in the latter (23 percent versus 9 percent) (Wolfsen, Kozarek, Ball et al., 1990). Given that PEJ also results in increased aspiration (as reported by the same study), as well as increased tube migration and tube leakage (reported below), it appears that PEG should be used instead of PEJ whenever possible.

Wound infection

Wound infection has been reported in 0 to 41 percent of PEG tube patients (Grant, 1993; Kaw and Sekas, 1994; Larson, Burton, Schroeder et al., 1987; Miller, Castlemain, Lacqua et al., 1989; Ponsky, Gauderer, and Stellato, 1983; Stern, 1986; Stiegmann, Goff, Silas et al., 1990; Taylor, Larson, Ballard et al., 1992). The lowest rate was reported in a case series of 302 patients followed for 30 days (Miller, Castlemain, Lacqua et al., 1989); the highest was reported in a similar study in which patients were followed for an average of 327 days. There were no apparent differences in patient mean age or proportion of patients having neurologic conditions. It therefore appears that the incidence of wound infection continues to increase over time for PEG tube patients; this again may be caused by patient selection bias, in which sicker patients remain on the tube and may be more susceptible to infections.

One randomized controlled trial (Stiegmann, Goff, Silas et al., 1990) compared the incidence of wound infection after PEG insertion with wound infection after operative gastrostomy (OG). While 3 percent of PEG tube patients experienced wound infection, 9 percent of OG patients did. No statistical analysis of this difference was reported.

Pain

Two studies reported that pain occurred in 3 percent (Taylor, Larson, Ballard et al., 1992) or 7 percent (Llaneza, Menendez, Roberts et al., 1988) of patients. The patients in these studies were similar in etiology and age. Given that pain is a subjective sensation that can be measured in many different ways, test methodology and time of measurement may account for this difference. Thus, limited evidence suggests that PEG tubes cause minimal pain for most patients.

Tube extrusion

Inadvertent tube extraction by the patient or caregiver was reported by five studies (all retrospective case series) as occurring in 4 to 41 percent of patients (Jarnagin, Duh, Mulvihill et al., 1992; Kirby, Craig, Tsang et al., 1986; Larson, Burton, Schroeder et al., 1987; Llaneza, Menendez, Roberts et al., 1988; Taylor, Larson, Ballard et al., 1992). The lowest incidence rate was reported by Kaw and Sekas (1994), who followed 46 nursing home residents for almost a year. The highest, from Taylor et al. (1992) (who also reported the highest infection rate), was in a group of 92 largely neurologic patients in various care settings. While nursing home patients in general have the highest rates of complications and morbidities, this is not the case here.

There appeared to be no correlation between time of followup or patient characteristics and the incidence of inadvertent tube extraction.

Tube migration

A feeding tube may slip from its position as a result of patient movement or improper anchoring. Given that reporting of complications was often not clear in these studies, it may be the case that tube migration and inadvertent tube extraction (above) were mixed together in some studies.

Tube migration has been reported by nine studies as occurring in less than 1 to 21 percent of PEG tube patients (Golden, Beber, Weber et al., 1997; Kadakia, Sullivan, and Starnes, 1992; Kaw and Sekas, 1994; Larson, Burton, Schroeder et al., 1987; Miller, Castlemain, Lacqua et al., 1989; Ponsky, Gauderer, and Stellato, 1983; Stiegmann, Goff, Silas et al., 1990; Taylor, Larson, Ballard et al., 1992; Wolfsen, Kozarek, Ball et al., 1990). The lowest rate was reported by a case series of 87 consecutive patients in various care settings (Larson, Burton, Schroeder et al., 1987); the highest was reported by a study of 46 nursing home patients (Kaw and Sekas, 1994), again suggesting that nursing home residents suffer from more complications and morbidities. One randomized controlled trial compared PEG tubes with OG and found tube migration to occur slightly more often with PEG tubes (7 percent versus 5 percent), not a significant difference (Stiegmann, Goff, Silas et al., 1990). One case series compared PEG with PEJ and found that tube migration occurred more often with PEJ tubes (15 percent with PEJ versus 3 percent with PEG) (Wolfsen, Kozarek, Ball et al., 1990); statistical significance was not reported.

There is no apparent effect of age or etiology on the incidence of tube migration.

Cellulitis

Four studies reported the incidence of tissue inflammation after infection (Jarnagin, Duh, Mulvihill et al., 1992; Kadakia, Sullivan, and Starnes, 1992; Kirby, Craig, Tsang et al., 1986; Llaneza, Menendez, Roberts et al., 1988), ranging from 3 to 12 percent. All were case series. The lowest incidence was found in a study of 79 patients who were followed during their acute-care stay (Jarnagin, Duh, Mulvihill et al., 1992). The highest rate was reported among 73 patients followed for an average 420 days and living at home or in a nursing home (Llaneza, Menendez, Roberts et al., 1988). These findings suggest that the incidence of cellulitis may increase over time; but again, patient attrition may have affected these rates. Interestingly, although cellulitis usually occurs after an infection, the rate of wound infections reported by these studies was lower than the rate of cellulitis.

Tube leakage

Nine studies reported tube leakage as a complication after PEG (Grant, 1993; Kadakia, Sullivan, and Starnes, 1992; Kaw and Sekas, 1994; Larson, Burton, Schroeder et al., 1987; Llaneza, Menendez, Roberts et al., 1988; Stern, 1986; Stiegmann, Goff, Silas et al., 1990; Taylor, Larson, Ballard et al., 1992; Wolfsen, Kozarek, Ball et al., 1990). Incidence ranged from 1 to 13 percent. The lowest rate, reported by Larson et al. (1987), was in a pool of 299 consecutive patients, 75 percent with neurologic disorders, mean age of 64, followed for 30 days. The highest rate, from Kaw and Sekas (1994), was in 46 nursing home patients, 91 percent of whom had neurologic disease, and who were followed for a mean of 321 days. Thus, duration of tube use, age, and debilitation may all play a role in the incidence of this complication. However, when all nine studies are examined, the incidence of tube leakage does not appear to change over time. On the other hand, there was a trend for age and incidence of tube leakage to correlate.

One randomized controlled trial reported tube leakage in PEG patients and OG patients. The incidence rate was 2 percent for both groups during their acute-care stay (Stiegmann, Goff, Silas et al., 1990). One nonrandomized trial comparing PEG with PEJ found a slightly lower rate of tube leakage in PEG patients followed over 275 days (9 percent versus 11 percent) (Wolfsen, Kozarek, Ball et al., 1990); statistical significance of this finding was not reported.

Other minor complications

Other complications reported by some studies included tube breakage, minor ileus, minor bleeding, gastric ulcer, hematoma, dehydration, and gastric fistula. Minor bleeding was reported as occurring in about 1 percent of patients in three case series (Golden, Beber, Weber et al., 1997; Grant, 1993; Stern, 1986), and in 5 percent of patients in a randomized controlled trial (Stiegmann, Goff, Silas et al., 1990). There is no clear reason for this discrepancy. One study reported a rate of 7 percent gastric hemorrhage (Taylor, Larson, Ballard et al., 1992); these patients were followed for an average of 327 days.

Summary of minor tube complications

Variations in rates of each minor complication suggest differences resulting from patient or care setting characteristics (nursing home residents experiencing more severe problems except with tube extraction), and in a few cases, length of followup time or age of patients. Two studies have reported the rates of complications broken out by care setting. Taylor et al. (1992) reported that 14 percent of all complications occurred while patients were in the hospital, 14 percent occurred when patients were released to go home, 62 percent occurred in a nursing home, and 9 percent occurred in other care settings. These findings suggest that nursing home patients experience substantially more complications than patients in other settings. On the other hand, another study (Hull, Rawlings, Murray et al., 1993) showed no significant differences in overall complication rates based on residential status; however, three of the four specific complications examined (tube blockage, tube migration, and cellulitis, but not tube extraction) were found to occur more often in nursing home patients than in patients in other care settings. Overall, the evidence does suggest increased complications among nursing home patients, although because these are comparisons among case series, a lack of internal control makes these trends unclear.

Because patients suffering from severe dysphagia may require a feeding tube if they are unable to feed well enough orally to ingest adequate amounts of food safely, it is important to determine which enteral feeding method is the most effective and safest. One case series has suggested a superiority of PEG over PEJ in terms of incidence of complications (Wolfsen, Kozarek, Ball et al., 1990). One other randomized controlled trial found PEG to be far superior to OG in the rate of wound infection (Stiegmann, Goff, Silas et al., 1990). Such findings, if supported by further, well-controlled research, should be considered when placing these patients on enteral feeding.

Major Morbidities

The primary morbidity discussed in this report thus far has been pneumonia, primarily because it has been the only major outcome consistently reported in the literature. For patients with neurologic disease with a PEG tube, pneumonia has been reported in 3 percent during acute care (Stiegmann, Goff, Silas et al., 1990) and 10 percent when patients were followed for up to 210 days (Valenti, Trudell, and Bentley, 1978) (see Evidence Table 59). Neither study reported the number of stroke patients, but in both, more than 80 percent of patients had a neurologic etiology. These results do suggest that for PEG tube patients, unlike patients orally feeding, the risk of pneumonia does not drop sharply after the first few weeks after the stroke event, further emphasizing that the patients placed on PEG tubes are patients suffering from chronic neurologic problems that do not clear up spontaneously.

However, when aspiration was measured as a surrogate for pneumonia immediately after the tube placement, one study reported an incidence rate of only 1 percent (Stern, 1986); few details were provided on patient characteristics or care setting. This raises the obvious question as to what kind of pneumonia these PEG patients are getting over the long term, if it is not the result of aspiration. Two other studies that did not meet our inclusion criteria because of mixed patient population and condition, reported that aspiration rates increased over time, from 4 percent during acute care (Kadakia, Sullivan, and Starnes, 1992) to 5 percent at 30 days (Wolfsen, Kozarek, Ball et al., 1990). Because these patient populations include many degenerating patients, this increasing rate could be a reflection of advancing disease. This is not necessarily applicable to stroke patients.

Malnutrition has not been reported as an outcome in any published PEG tube study.

Short-term Outcomes

The most immediate purpose of enteral feeding is to stabilize the patient nutritionally. Many patients put on a feeding tube have already lost significant amounts of weight by the time of insertion, thus necessitating the tube. If weight gain cannot be achieved, then there is no point to the feeding tube. One study reporting this outcome exclusively examined stroke patients with dysphagia who were randomized to receive either a PEG tube or an NG tube (see Evidence Table 50). These researchers found that while PEG tube patients gained a mean 2.2 kilograms over a 1- to 6-week period, the NG tube patients lost a mean 2.6 kilograms (Norton, Homer-Ward, Donnelly et al., 1996). However, these results are not reported on an intent-to-treat basis, only including surviving patients. Another study with 100 percent of patients with neurologic disorders, 86 percent of whom suffered from dysphagia, reported that after a 6-month mean period, patients on average had gained 9 kilograms (Britton, Lipscomb, Mohr et al., 1997). Two other similar studies not reporting on patients with dysphagia specifically reported similar mean weight gain at 4 weeks of 4.5 kilograms (Sant, Gilvarry, Shannon et al., 1993) and 3.4 kilograms (Park, Allison, Lang et al., 1992). These positive results may not be so much an indication of the effectiveness of tube feeding, but a reflection of how much weight patients have lost by the time they are on a feeding tube. These results in toto still suggest, however, that patients with neurologic disorders do benefit from PEG tubes in terms of nutritional status, and patients with dysphagia in particular may continue to gain weight for up to 6 months.

With such stabilization of nutritional status, it may be more likely for stroke patients to recover function and ultimately be removed from the feeding tube, and then become eligible for noninvasive therapies. Neumann (1993) reported that in a neurologic rehabilitation center during a 17-week period, 67.5 percent of patients on a feeding tube who were also given indirect and direct swallow therapy were able to be removed from tube feeding (tube feeding method not specified) (see Evidence Table 47). This study suggests that the use of enteral feeding in addition to noninvasive therapy may help the patient regain exclusive oral feeding. However, because this study did not include a control group of tube-fed patients who did not receive therapy, it is unclear what exactly affected the feeding method outcome.

Mortality (Resulting from Morbidity, Underlying Illness, or Tube Complications)

If patients suffer from severe pneumonia, and are in a debilitated state, they may die from this illness. Such mortality has been reported in 2 percent (Stern, 1986) of patients after 30 days, and 6 percent of patients during a period of up to 210 days (Kirby, Craig, Tsang et al., 1986) (see Evidence Table 66). These findings were in populations of 85 percent or more neurologic etiology. They suggest that as the cumulative rate of pneumonia continues to rise over time (as discussed above), a substantial portion of these patients will die of the illness. If pneumonia incidence rates are 3 to 10 percent during a period of up to 210 days, and death rates are 2 to 6 percent during the same period, this suggests that approximately 60 percent of PEG tube patients with neurologic disorders will die of pneumonia if they contract it. The causal relationship between these morbidities and mortalities is unclear in PEG tube patients: Are they intubated because of severe illness, or do they become sicker because they are on a feeding tube?
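The arithmetic behind the 60 percent figure can be sketched as follows. This is only an illustration of the ratio implied by the rates quoted above; pairing the low ends and the high ends of the two ranges is an assumption, and the function name is hypothetical.

```python
def implied_case_fatality(death_rate_pct, pneumonia_rate_pct):
    """Share of pneumonia cases ending in death, as a percentage.

    Assumes both rates refer to the same patient population and period,
    which the cited case series do not guarantee.
    """
    return 100.0 * death_rate_pct / pneumonia_rate_pct

# Rates quoted in the text: pneumonia 3 to 10 percent, pneumonia deaths 2 to 6 percent.
low_end = implied_case_fatality(2, 3)     # low ends of both ranges
high_end = implied_case_fatality(6, 10)   # high ends of both ranges
print(round(low_end), round(high_end))    # 67 60
```

Because the pneumonia and mortality rates come from different case series, the ratio is a rough indication rather than a measured case-fatality rate.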

Patients may also die of their underlying illness while receiving enteral feeding (see Evidence Table 66). In populations consisting largely of patients with neurologic disorders, this has been reported in 4 percent of patients after 30 days and in up to 24 percent after a mean of 100 days (Stern, 1986). These findings suggest that many of these patients had neurodegenerative disease or recurring stroke, and thus were severely ill.

As discussed earlier in this section, the insertion of a PEG tube can lead to serious complications, such as gastric fistula, severe infections, and bowel obstruction. If the patient is frail and the complication is not detected soon enough, the patient may die (post-operative mortality). Complications may also occur peri-operatively, although these are more rare.

Data addressing this outcome were available in a single study (see Evidence Table 68), in which a single patient died as a result of peritonitis (1 percent death rate) (Stern, 1986). Peritonitis was the most often reported cause of tube-related death in all PEG studies. More information is needed in this area before making conclusions; when such mortality rates are so small, larger patient populations are needed for statistical power in order to compare these results with the results of other studies.

Summary: Mortality After a Feeding Tube

By far, most of the patients on feeding tubes die as a result of their underlying illness that necessitated the feeding tube in the first place; studies have reported this rate of mortality to be 4 to 24 percent, depending on the followup time and possible patient differences. The next most common cause of death appears to be serious comorbidity (pneumonia or aspiration), causing deaths in 2 to 6 percent of PEG tube patients with neurologic disease; the longer the PEG tube is retained, the greater the chance of dying of one of these morbidities. This may indicate that patients who retain the tube for longer periods of time are the frailest, perhaps most likely to have a terminal illness, and thus most likely to die. Deaths due to tube complications have been reported in only one study, with a single patient dying as a result of peritonitis.

Summary: Outcomes After Treatment for Dysphagia

Question 4 contained four separate (but interrelated) subquestions about the efficacy of different treatments for the prevention or elimination of complications and morbidities associated with dysphagia. The first subquestion asks when noninvasive swallow therapy or diet modification is appropriate. Answering this question requires the ability to identify patient populations who would most likely benefit (the second subquestion) as well as symptomatology and stage of disease at which noninvasive therapy has the greatest effect. No research to date has addressed these two questions; an answer would require a study that compared different patient populations in their response to particular therapies. Because the current literature is largely case series data on stroke patients, this information cannot be extrapolated.

The third subquestion asks simply, What can the evidence tell us about these therapies? The simple answer is: very little. To determine treatment efficacy in any patient group suffering from a recuperative or degenerative disease, a comparison group (whether an untreated control or another treatment) is necessary to adjust for the changing effects of disease over time. This has been done in only a few studies. It is also necessary to include enough patients in the groups so that statistical power can be achieved for the results; assuming that the rate of aspiration pneumonia in stroke patients with dysphagia who are treated noninvasively is approximately 9.2 percent (DePippo, Holas, Reding et al., 1994) and the aspiration pneumonia rate in untreated stroke patients with dysphagia is about 14.2 percent (Nilsson, Ekberg, Olsson et al., 1998), it would require about 1,296 patients to find a statistically significant difference (see footnote 2). However, such a comparison between a treatment and a no-treatment group would not be ethical; smaller differences between different levels of intervention would require even more subjects.
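A sample-size estimate of this kind can be sketched with the usual normal-approximation formula for comparing two proportions. The sketch below is an illustration, not the report's documented procedure: the two-sided alpha of 0.05, the power of 0.80, equal group sizes, and the function name are all assumptions. Under those assumptions the formula gives a total consistent with the figure quoted above.

```python
from math import ceil, sqrt
from statistics import NormalDist

def patients_needed(p1, p2, alpha=0.05, power=0.80):
    """Total patients (two equal groups) needed to detect p1 versus p2.

    Normal-approximation formula for two proportions, without continuity
    correction. The default alpha (two-sided) and power are assumptions,
    not values stated in the text.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    per_group = ceil(numerator / (p1 - p2) ** 2)
    return 2 * per_group

# Pneumonia rates quoted in the text: 9.2 percent treated, 14.2 percent untreated.
print(patients_needed(0.092, 0.142))  # 1296 under the assumed alpha and power
```

Raising the assumed power, or narrowing the difference between the two rates, increases the required number of patients, which is the point made in the text about comparisons between different levels of intervention.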

One controlled trial has indicated that intensity of swallow therapy and diet modification affects how well stroke patients with dysphagia continue these exercises independently (DePippo, Holas, Reding et al., 1994); however, intensity of treatment appeared to have no impact on the incidence of pneumonia or mortality after treatment. Another study determined that the use of a mechanically altered thickened diet resulted in better outcomes (as measured by pneumonia incidence) than a pureed diet in stroke patients with dysphagia (Groher, 1987). Although these studies do not provide a no-treatment group to determine pure treatment efficacy, their use of a comparison group with different treatment does provide information on relative efficacy.

One study included a historical control group (Martens, Cameron, and Simonsen, 1990) and found no cases of pneumonia in either the historical controls or the treatment group; they did, however, report marginally significant improvements in nutritional intake and weight gain in the treatment group compared with the controls.

The only way to glean perspective from case series data is to compare their results with other studies reporting the same outcomes in patients with dysphagia without a treatment program. This was done for pneumonia outcomes in our discussion of main Question 1 above. Some relevant calculations are discussed in Appendix D, and there does appear to be evidence that pneumonia rates are lower in the presence of a dysphagia management program.

We found no specific research on those swallow therapies (as described in the section Treatment of Oropharyngeal Dysphagia) that use voluntary physiological maneuvers during a swallow to force the bolus to pass through the pharynx correctly (the Mendelsohn maneuver, the supraglottic swallow, and the super-supraglottic swallow). In fact, very little literature reported patient outcomes after specific therapies at all; there were a few studies on palatal training appliances and TTA, but the rest were on general swallow therapy and/or diet modification. We do understand that these treatments are individualized and can be applied in many combinations, thus making outcomes reporting difficult for each one; however, it is impossible to make evidence-based practice decisions solely on the basis of anecdotal clinical accounts that these treatments are effective.

The final subquestion asked, Is PEG useful as a primary therapy or is it a last resort? This may not be an appropriate question if patient populations directed to noninvasive therapy differ in clinically significant ways from those directed to PEG tubes. There are, however, scarce data reporting current practice patterns for directing patients to different treatment; we can only conjecture on the basis of anecdotal reports and personal communications. For one, stroke patients with mild to moderate dysfunction who are expected to recuperate will probably only receive an NG tube if enteral feeding is necessary. These patients are also most likely to receive diet modification or swallow techniques as temporary measures until their swallow function spontaneously improves with time. Patients likely to be recommended for a PEG tube are those whose swallow function does not appear to improve during the first few weeks after the stroke event, and therefore these may represent more debilitated, chronic stroke patients. However, these patients, if not severely debilitated, may be the most likely to benefit from long-term dietary modification and swallow techniques if they are functionally able to follow directions and perform the maneuvers; for these patients, a PEG tube and noninvasive therapy can be used simultaneously. It may then be possible, as suggested by a single study (Neumann, 1993), that tube-fed patients could resume an oral diet.

Evidence from the literature thus far suggests that patients on PEG tubes fare worse than those undergoing noninvasive therapy. It is unclear whether this is caused by the tube or by the severity of illness. These findings suggest that there are clinically significant differences in current practice patterns between those patients referred for noninvasive therapy and those referred for a PEG tube.

We strongly recommend that researchers consider conducting controlled trials on dysphagia treatment in a defined patient population, including such outcomes as aspiration, aspiration pneumonia, malnutrition, wasting, weight change, caloric intake, mortality resulting from pneumonia and other morbidities, and overall mortality. We also strongly recommend that patients be followed for an extended period of time so that the long-term consequences of therapy can be explored.

Synthesis of Results Common to the Four Questions

Although this evidence report has concentrated on four main questions, these questions are not entirely independent of one another. Therefore, at this point, we present related information in one section. Thus, we present here the results of all of the controlled studies addressing our main question: Does a dysphagia diagnosis and treatment program prevent aspiration pneumonia? This, as noted in the Methodology section of this report, may be the question of most interest to society.

The studies relevant to the present discussion are described in Evidence Table 69. There were four studies of diagnostic-treatment programs [Daniels, Brailey, Priestly et al., 1998; Nilsson, Ekberg, Olsson et al., 1998; Odderson, Keaton, and McKenna, 1995; Spiegel, Creed, Selber et al., unpub(b)]. The Odderson and Spiegel studies had their own historical controls, and we calculated historical control values for the other two. (The caution that must be exercised in interpreting the results of such exploratory analyses is discussed in our answer to Question 1.) The two studies with their own controls had highly statistically significant results (p = 0.0050 and p = 0.0003, respectively, with the Fisher exact test, one-tailed, alpha = 0.05). These studies also had statistically significant results by the Wilson score method for 95 percent confidence limits around the difference between two proportions (two-tailed). The two studies for which we provided historical controls had near-significant results (p = 0.056 and p = 0.063, respectively) with the Fisher exact test and also by the Wilson method. These latter two studies were of low statistical power, so their nonsignificant results do not convincingly demonstrate a lack of effect and can best be considered inconclusive. One study of palatal training devices by Selley, Roche, Pearce et al. (1995) found nonsignificant results (p = 0.210) by the Fisher exact test, but attained a statistically significant difference by the Wilson method (95 percent CI 2.8 to 73 percent). Because of the extremely low power of this study, this is also an inconclusive result, and it is again important not to confuse lack of evidence for an effect with evidence of no effect.

Two studies of diet and/or swallow therapy comparisons (Groher, 1987; Kasprisin, Clumeck, and Nino-Murcia, 1989) had highly significant results by both tests (p<0.0001 and p = 0.0002, respectively, on the Fisher exact test); and one (DePippo, Holas, Reding et al., 1994) had a nonsignificant result by both tests (p = 0.88 on the Fisher exact test, and a nonsignificant difference on the Wilson test). As in the studies above, this was a low-powered study, and it is again difficult to have confidence in the finding of no statistically significant effect.
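The two tests named above can be sketched as follows. This is a minimal illustration under stated assumptions: the patient counts are hypothetical and are not taken from any cited study, the function names are ours, and the interval for the difference is built from two Wilson score intervals in the manner usually attributed to Newcombe, which we assume is the "Wilson method" referred to in the text.

```python
from math import comb, sqrt

def fisher_lower_tail(a, b, c, d):
    """One-tailed Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Row 1 is the program group (a events, b non-events); row 2 is the
    control group. Returns the probability of a or fewer events in row 1
    with all margins held fixed.
    """
    n1, events, total = a + b, a + c, a + b + c + d
    lowest = max(0, n1 - (b + d))
    tail = sum(comb(events, x) * comb(total - events, n1 - x)
               for x in range(lowest, a + 1))
    return tail / comb(total, n1)

def wilson(events, n, z=1.959964):
    """Wilson score limits for a single proportion (95 percent by default)."""
    p = events / n
    centre = (p + z * z / (2 * n)) / (1 + z * z / n)
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / (1 + z * z / n)
    return centre - half, centre + half

def difference_ci(e1, n1, e2, n2):
    """Limits for p2 - p1 (control minus program) from two Wilson intervals."""
    p1, p2 = e1 / n1, e2 / n2
    l1, u1 = wilson(e1, n1)
    l2, u2 = wilson(e2, n2)
    diff = p2 - p1
    lower = diff - sqrt((p2 - l2) ** 2 + (u1 - p1) ** 2)
    upper = diff + sqrt((u2 - p2) ** 2 + (p1 - l1) ** 2)
    return lower, upper

# Hypothetical counts: 1 pneumonia case among 40 program patients,
# 8 cases among 40 historical controls.
print(fisher_lower_tail(1, 39, 8, 32))
print(difference_ci(1, 40, 8, 40))
```

Because one test is one-tailed and exact and the other is a two-tailed interval, the two can disagree for very small studies, as the Selley result above illustrates.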

Thus, for the eight controlled studies above, we found four with highly significant results, two with near significant results and low power, and two with nonsignificant results and low power. Because of the low power, none of the nonsignificant results convincingly demonstrated no effect. This puts the emphasis on the four trials that found statistical significance. The most unbiased appraisal of these results is that there is an apparent effect (regardless of the cause of that effect) demonstrated by the studies that is not caused by random error. We again caution that a statistical analysis such as ours cannot eliminate confounding differences between control and experimental groups resulting from differences in treatment, patient characteristics, or other factors that contribute partially or wholly to the apparent effect.

Effect of Increased Power: An Illustration

Expressed in the simplest terms, the preceding analysis identified four trials with a statistically significant result and four without. Were we simply to tally the number of studies for and against an effect (i.e., to count votes), we would likely determine that the sum total of the results was inconclusive. However, such vote-counting procedures have low statistical power (Hedges and Olkin, 1985). To illustrate this point and to further illustrate the low statistical power of the individual studies, we provide below another example using the data from the above-mentioned studies. We caution, however, that this exercise is solely for the purposes of illustration; although it can be used to infer that a correlation exists between the presence of a dysphagia program and a decrease in aspiration pneumonia rates, this analysis cannot be used to conclude that the program is the sole cause of these decreases. We also note that we could have illustrated the point with hypothetical data, but chose not to do so because using available data provides a more direct example of how vote-counting is conservative in the present case. Finally, we note that we did not use the results of this illustration in deriving the conclusions of this report.

In this example, we combine the above one-tailed p-values of the individual studies obtained from the Fisher exact test to yield an overall level of statistical significance. We chose to meta-analytically combine p-values rather than to employ a statistic that provides an effect size (e.g., the odds ratio) because any effect size estimated by combining the results of these studies is likely to be imprecise. In effect, our present analysis nevertheless approximates combining the results of the individual studies into one larger study. We combined the p-values by first taking their natural logarithms and multiplying each by -2. These quantities were summed according to the Fisher method, as described by Rosenthal (1991). The sum, 76.946, was compared with the critical value for a chi-square distribution with df = 16 (number of studies multiplied by 2). The resulting probability was less than 0.0001, which is highly significant. We obtained a similarly significant p-value using the method proposed by Stouffer et al., also described by Rosenthal (1991). The z-scores corresponding to the p-values were summed and this quantity was divided by the square root of the number of studies. The obtained z-value, 5.984, was then compared with the standard normal distribution. Again, the resulting probability was less than 0.0001.
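The Fisher combination described above can be sketched as follows. The p-values are the rounded one-tailed values quoted earlier in this section; entering "p<0.0001" as 0.0001 is an assumption, and the function name is ours. With these inputs the sketch gives a sum that agrees with the reported value to three decimal places.

```python
from math import exp, factorial, log

def fisher_combined(p_values):
    """Fisher's method: returns (chi-square statistic, df, overall p-value)."""
    statistic = sum(-2 * log(p) for p in p_values)
    k = len(p_values)                      # df = 2k, always even
    half = statistic / 2
    # Closed-form upper tail of a chi-square distribution with even df.
    overall_p = exp(-half) * sum(half ** i / factorial(i) for i in range(k))
    return statistic, 2 * k, overall_p

# One-tailed Fisher exact p-values as rounded in the text;
# "p<0.0001" is entered as 0.0001 (an assumption).
reported = [0.0050, 0.0003, 0.056, 0.063, 0.210, 0.0001, 0.0002, 0.88]
statistic, df, overall_p = fisher_combined(reported)
print(round(statistic, 3), df, overall_p < 0.0001)  # 76.946 16 True
```

The closed-form tail probability is used only to keep the sketch free of external libraries; any chi-square routine would serve equally well.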

We also conducted a statistical test to determine whether the results of these trials were significantly different from one another, using the heterogeneity test described by Rosenthal (1991). Briefly, the sum of the squared differences between each z-score and the mean z-score was compared with the chi-square value for df = 7 (number of studies minus one). The obtained value, 14.811, was statistically significant, p = 0.0385, suggesting that the p-values were heterogeneous. We did not extensively investigate the source of this heterogeneity but, given that in some of these studies a single case of pneumonia can make the difference between statistical significance and nonsignificance, this heterogeneity is not surprising. We chose to be extremely cautious about combining these heterogeneous data, and, partly for this reason, we did not use the results of this analysis in deriving the conclusions of this report. Nevertheless, the present analysis serves the purposes of our illustration by showing that, because of an increase in statistical power, the overall test is statistically significant even when the results of the nonsignificant trials are included in the analysis.
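The z-score calculations used in this illustration (the Stouffer combination and the heterogeneity test) can be sketched as follows. The p-values in the example are hypothetical; applied to the rounded p-values quoted in the text, the sketch need not reproduce the reported statistics exactly, since the rounding alone changes the z-scores. Function names are ours.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_from_p(p):
    """Standard normal deviate for a one-tailed p-value."""
    return NormalDist().inv_cdf(1 - p)

def stouffer_z(p_values):
    """Stouffer's method: sum of z-scores over sqrt(number of studies)."""
    z_scores = [z_from_p(p) for p in p_values]
    return sum(z_scores) / sqrt(len(z_scores))

def heterogeneity(p_values):
    """Sum of squared deviations of the z-scores from their mean, and its df."""
    z_scores = [z_from_p(p) for p in p_values]
    centre = mean(z_scores)
    return sum((z - centre) ** 2 for z in z_scores), len(z_scores) - 1

CRITICAL_DF7 = 14.067  # chi-square critical value for df = 7, alpha = 0.05

# Hypothetical one-tailed p-values for eight studies.
hypothetical = [0.01, 0.02, 0.04, 0.05, 0.10, 0.20, 0.30, 0.60]
statistic, df = heterogeneity(hypothetical)
print(stouffer_z(hypothetical))
print(df, statistic > CRITICAL_DF7)
```

A heterogeneity statistic above the critical value, as with the 14.811 reported above, indicates that the studies' results differ from one another by more than chance alone would suggest.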

Conclusions

Dysphagia (difficulty swallowing) is a cluster of symptoms that is commonly experienced by elderly patients with neurologic diseases. The incidence and prevalence of dysphagia cannot be exactly determined, precisely because it is a cluster of symptoms, implying that diagnostic technology as well as clinician judgment will affect the reported occurrence. With this in mind, it can be estimated that the greatest burden of illness of dysphagia occurs in stroke patients; approximately 160,000 to 573,000 stroke patients are affected by dysphagia each year; this is 42 to 75 percent of all stroke patients. The annual burden of illness created by other neurologic diseases is much smaller (approximately 51,000), due to the much lower incidence of these diseases.

This report addresses four main questions, but there are two interrelated themes that run throughout it. The first of these is that the treatment given to patients diagnosed with a swallowing disorder makes it impossible to accurately determine the sensitivities and specificities of the diagnostic test. The second is that the statistical power of many studies in this field is simply too low to find differences between groups, should such differences exist. The relatively low quality of the literature in this area is also worth noting. We recognize that our attempt to come to conclusions based on the limited evidence available to us is uncommon. However, it must also be recognized that many of today's pressing healthcare questions must be answered in the absence of strong evidence, and that failure to do so imposes serious limitations on the practical applications of evidence-based medicine.

The first question we addressed in this report is perhaps the one most important to society. This question asks whether stroke patients fare better with a dysphagia diagnosis and treatment program than without. As such, the answer to this question depends on both the performances of the tests used to diagnose dysphagia and the efficacy of the treatments for this disorder. We conclude from our analysis of the evidence pertaining to this question that there is a correlation between use of a dysphagia program and a reduction in aspiration pneumonia rates. However, the available studies do not allow us to rule out the possibility that this reduction is due to other causes, because their designs cannot exclude other factors such as unspecified changes in physician practice, changes in hospital policy and procedures, or even some unknown factor. This is why the efficacy of dysphagia diagnosis and treatment programs must be ascertained using randomized controlled trials. While it is probably unethical to perform a controlled trial that includes a patient group that does not receive treatment, it is ethical to compare the efficacy of different dysphagia programs. In this light, it is important to remember that the purpose of randomization is not to control for known confounds; it is to control for unknown confounds. Therefore, in the face of the current data that lack such control, we can only conclude that use of dysphagia programs is correlated with a reduction in aspiration pneumonia rates. We cannot technically conclude that these programs cause this reduction. However, it is also important to note that the effects observed in the better studies are large, so it would be imprudent to ignore them. Therefore, these results must be taken as evidence of the efficacy of dysphagia management programs.

We further address this first question in our supplemental analysis. The results of this analysis indicate that implementation of a dysphagia diagnosis and management program is cost-effective. However, the results of this supplemental analysis are subject to the same caveats as our analysis of the effectiveness of these programs, so they should not be regarded as absolute.

Our second question addressed the ability to predict or detect morbidity (in the form of aspiration, aspiration pneumonia, or malnutrition) from signs and symptoms detectable during a preliminary bedside exam (BSE). One aspect of this question concerns whether a preliminary BSE is sensitive enough to be used in isolation [without a full BSE or instrumented followup], or, alternatively, whether it can selectively identify patients who would benefit from more extensive diagnostic testing.

No single clinical sign or symptom appears to have satisfactory clinical utility for prediction of aspiration pneumonia or aspiration. The presence of a cough is probably the best detector of aspiration, but its detection of aspiration is not sufficiently reliable. However, there is no need to restrict clinicians to use of a single sign or symptom, and more research on the utility of combinations of signs and symptoms needs to be conducted.

Aspiration is not the only potential risk factor for aspiration pneumonia. While there were no clinical signs or symptoms directly related to dysphagia that predicted aspiration pneumonia well, it was found that dentition status and oral hygiene did show some predictive value for pneumonia. This is an area not yet well-explored, but worth further study.

No studies have examined signs and symptoms that may predict malnutrition, and it has not even been definitively established that dysphagia is associated with an increased incidence of malnutrition.

The third question addressed in this report compared the sensitivity and specificity of several dysphagia diagnostic tests. While there was a trend in one study that suggested that use of fiberoptic endoscopic examination of swallowing and sensory test (FEESST) in a dysphagia management program better prevented aspiration pneumonia than modified barium swallow (MBS), the difference was not statistically significant. Studies comparing MBS with fiberoptic endoscopic examination of swallowing (FEES) have generally found that MBS detects some cases of aspiration that FEES does not, while FEES detects some that MBS does not. There is no way to determine which diagnostic is correct because, in spite of common belief that MBS is a gold standard test, this has not been proven.

Full BSEs can have sensitivities for aspiration near 80 percent, and specificities near 70 percent. This, plus the fact that about half of the patients with aspiration do so silently (without a cough) and the very low pneumonia rates observed in dysphagia management programs that used full BSEs, indicates that these BSEs are capable of detecting most aspiration (even silent aspiration), or that any undetected aspiration does not contribute greatly to the pneumonia rate. In spite of these good results, the ability of the BSE to detect aspiration should be optimized in future research.

Currently available data do not allow one to determine the degree to which, or even whether, use of videofluoroscopy leads to lower pneumonia rates than the BSE. This is in spite of the fact that it is reasonable to believe that videofluoroscopy would be superior because of the additional information it provides. One reason that videofluoroscopy has not been shown to be superior to the full BSE is that the results of the latter are difficult to improve upon. One result of this is that available studies do not have enough statistical power to detect the relatively small differences between diagnostic tests that one should expect.

Because of the lack of appropriate data, we compared videofluoroscopy and the full BSE in our supplemental cost-effectiveness analyses. These analyses, which are based on published protocols, assume that a preliminary test will be used to refer no more than 39 percent of all patients to a subsequent full BSE or videofluoroscopy. Under these conditions, our analysis suggests that dysphagia diagnosis and treatment programs that employ the BSE would either save money or have very little net cost if they reduced pneumonia rates by amounts similar to those obtained in certain published studies. In addition, our results indicate that the slightly higher costs of videofluoroscopy would be offset if it provided an additional 10 percent reduction in pneumonia rates.

The fourth primary question in this report addressed the efficacy of different treatment interventions in preventing serious morbidity (pneumonia and malnutrition), their effect on short-term outcomes (weight change, energy intake, aspiration, prevention of a feeding tube, and minor feeding tube complications), and their effect on long-term outcomes [quality of life (QOL) and mortality]. Data provided by these studies are of very limited use because of poor study design. There is limited evidence from one randomized controlled trial (RCT) that a mechanically altered (soft solids and thickened liquids) diet is superior to a pureed diet for stroke patients with a history of aspiration pneumonia. Results of another randomized controlled trial on the intensity of swallow therapy on pneumonia incidence were inconclusive. Otherwise, the literature is exclusively case series, which were not comparable to one another because of differences in care settings, patient characteristics, or length of followup. Percutaneous endoscopic gastrostomy (PEG) tube literature generally indicates a superiority of PEG over nasogastric (NG) tubes, operative gastrostomy (OG), and percutaneous endoscopic jejunostomy (PEJ) tubes; these studies also report a higher rate of morbidity than do those studies of patients undergoing noninvasive therapy. This could be due either to the greater level of debilitation in patients needing a feeding tube or aspects of the treatment itself.

In conclusion, data to answer these four questions are sparse and not of the highest reliability. Determining the clinically prudent course of action in these circumstances raises ethical issues. This is because, while the superiority of any given dysphagia diagnosis and treatment program has not been proved, such programs have the potential to benefit patients, and a minimal potential to harm them. They also have the potential to be cost-effective. Withholding such programs would deny these potential benefits to patients. It therefore seems prudent to conclude that active stroke management in acute care should continue to include dysphagia-specific management with formal diagnosis and treatment (the specifics of which can be determined by the individual hospital) as part of standard protocol.

At the same time, current data prohibit us from specifying the precise components of these programs, and future research in this area is urgently needed. To this end, we detail in the Future Research section of this evidence report a recommended clinical trial that will answer many of the outstanding questions. We strongly suggest that this trial, or some version of it, be conducted in the near future.

Future Research

In this section, we first discuss particular shortcomings of study design and research in the available literature; we then focus on the most important areas needing research and discuss the design of a trial that would answer the above questions.

Shortcomings of Available Research

Patient Selection

As mentioned above, most patients included in the studies thus far have been stroke patients; some researchers have examined exclusively stroke patients, while others have included a patient mix of many neurologic etiologies. Because stroke is a recuperative neurologic disorder, while many others that affect the elderly are degenerative, it creates problems to pool the outcomes data obtained from these patients. We suggest that in the future, researchers make sure to include patients with dysphagia of only one etiology in each study.

The lack of research on dysphagia in patients other than stroke victims ignores the considerable burden of illness resulting from these other diseases. More research on these latter kinds of patients is needed. It must be remembered that treatment recommendations made for a patient with Parkinson's disease, for example, on the basis of how a stroke patient with similar symptoms responded to that treatment are not necessarily appropriate.

Patients should all be at the same stage of disease, or plans should be made before a study is begun to stratify patients by disease stage. The ambiguity caused by failure to use a homogeneous patient group or to stratify is illustrated by studies reporting that they examined stroke patients a mean of 15 days post-stroke. This could mean that some patients were seen immediately after the stroke while others were seen 2 months post-stroke. Symptoms may be different at these different time periods, making results difficult to interpret. The same holds true for other patients with neurologic disease: in neurodegenerative diseases, swallowing difficulties may be more pronounced as the disease progresses. Therefore, the efficacy of diagnosis and treatment should be examined by severity of disease.

Control Groups

Issues of study design are especially important with stroke patients, because they often regain their swallowing ability spontaneously as they recover after the acute stroke episode. Thus, in case series following the treatment of a group of stroke patients after the event, there is a high potential for time confounds; this makes it extraordinarily difficult to interpret results of such studies because they do not conclusively show what caused the recovery of these patients. Similarly, in degenerative neurological disease, swallowing function could deteriorate spontaneously, thus masking the effectiveness of treatment. Only controlled studies can resolve these problems.

Also, only controlled studies can determine whether a particular treatment is best for any given patient population. Although we have argued in the Methods section of this report that comparing the efficacy of different treatments is not as interesting as comparing the efficacy of different diagnosis and treatment programs, we also recognize that refinements in treatment strategies will most likely come about as a result of controlled trials comparing treatment efficacy. In some situations (e.g., patients with advanced Alzheimer's disease), it may even be desirable to determine whether treatment is more effective than doing nothing at all.

In other situations, it would not be ethical to include a control group that receives no treatment if it has already been found that patient condition improves with treatment, and there is some limited evidence in the literature that this is the case. In the case of diet modification, there is enough evidence of treatment effect in some patients that employing an untreated control group would not be ethical. In cases where it is unethical to include an untreated control group, the control group would have to be either a group receiving a different level of treatment or a historical control. Historical controls have been used in some of the treatment studies evaluating the implementation of a dysphagia treatment program. There are, however, limitations to this type of control group, as there is no way to ensure patient or management equivalency, and patients are not chosen in the same way (consecutive versus record review). However, a historical control within the same institution is preferable to controls from a different institution. One study has been conducted comparing different levels of intervention (DePippo, Holas, and Reding, 1994); in such a study design with randomization, it could be ensured that patient characteristics were similar in each trial arm and therefore did not affect results. We therefore suggest that all future research evaluating treatment efficacy include at least two randomized treatment arms in each study so that even if absolute efficacy cannot be determined, relative efficacy can.

We also recognize that refinements in diagnosis will come about primarily as a result of trials that compare the efficacy of different diagnostic modalities. Designing such studies at the current time is problematic. Determining the performance of a diagnostic method (sensitivity and specificity) requires comparison to a gold standard diagnostic that identifies every positive and negative case of disease correctly. It has not been satisfactorily demonstrated that any of the available tools are, in fact, gold standards. Further, determining diagnostic test performance requires that false-positive rates be measured. This, in turn, means that some patients who are diagnosed as having a swallowing disorder should not receive treatment. Therefore, the only ethical kind of studies comparing diagnostic test performance would have to be conducted on a patient population known not to benefit from treatment. Because disease in such patients may be more severe, it is not clear that sensitivities and specificities from such studies would be generalizable to other patient populations.

Partly for these reasons, determining the percent agreement between a diagnostic test and the modified barium swallow (MBS) is not an appropriate measure by which to gauge the efficacy of the first test. It is, after all, possible for the results of two relatively poor tests to perfectly agree with each other, and high correlations between the results of two tests do not mean that the sensitivities and specificities of the two tests are similar. This latter point can be illustrated by a hypothetical experiment in which two tests are given to ten patients. Both tests yield one false positive and four true negatives, and therefore have a specificity of 80 percent. However, one of these tests yields four true positives and one false negative, while the other yields three true positives and two false negatives. The correlation between these tests (0.9) is relatively high, but the sensitivity of the former test is 80 percent and that of the latter is 60 percent, a difference that could be clinically important in spite of the apparently "good" correlation between the two tests.
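
The hypothetical ten-patient experiment can be worked through numerically. The sketch below assumes a particular patient-level layout that the text does not specify: five patients truly have the condition, five do not, and the two tests disagree on only one patient. The function names are illustrative. Under this assumed layout the tests agree on 9 of 10 patients (0.9), while the phi coefficient for the same layout is somewhat lower; either way, the sensitivities differ by 20 percentage points.

```python
import math

# Assumed layout: patients 1-5 truly have the condition, patients 6-10 do not.
truth  = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
test_a = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]   # 4 TP, 1 FN, 1 FP, 4 TN
test_b = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]   # 3 TP, 2 FN, 1 FP, 4 TN


def sensitivity(test, truth):
    """Fraction of truly positive patients that the test calls positive."""
    hits = [t for t, d in zip(test, truth) if d == 1]
    return sum(hits) / len(hits)


def specificity(test, truth):
    """Fraction of truly negative patients that the test calls negative."""
    calls = [t for t, d in zip(test, truth) if d == 0]
    return 1.0 - sum(calls) / len(calls)


def agreement(a, b):
    """Proportion of patients on whom the two tests give the same result."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def phi(a, b):
    """Phi coefficient (Pearson correlation for two binary variables)."""
    n11 = sum(x == 1 and y == 1 for x, y in zip(a, b))
    n10 = sum(x == 1 and y == 0 for x, y in zip(a, b))
    n01 = sum(x == 0 and y == 1 for x, y in zip(a, b))
    n00 = sum(x == 0 and y == 0 for x, y in zip(a, b))
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom


sens_a, sens_b = sensitivity(test_a, truth), sensitivity(test_b, truth)
spec_a, spec_b = specificity(test_a, truth), specificity(test_b, truth)
agree_ab = agreement(test_a, test_b)
phi_ab = phi(test_a, test_b)
```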

Outcomes

Meaningful outcomes to consider in any study are those outcomes that measure aspects of health that are important to the patient. However, it is also important to clearly report these outcomes. It has been reported that aspiration leads to increased risk of aspiration pneumonia, and incidence of pneumonia has been the most commonly reported long-term outcome in any of the studies reviewed in this report. However, it was uncommon to find aspiration pneumonia distinguished from general pneumonia when reported as an outcome in the dysphagia literature. It would be interesting to see studies explore the changing rate of pneumonia over time after an acute stroke. With this information, studies following patients for different lengths of followup can be compared using meta-analysis, decision analysis, or other mathematical calculations. Malnutrition, quality of life (QOL), and disease-specific mortality are other potentially important outcomes of interest that have been seldom reported.

The relationship of malnutrition to dysphagia has not been definitively determined, and this relationship needs to be explored. If it does turn out that malnutrition is a serious health concern resulting from dysphagia, then this endpoint should be reported as an outcomes measure in evaluation of dysphagia diagnostic and treatment technologies. Thus far, only two studies have been conducted, both on nursing home patients (Keller, 1993; Keller, 1995; Thomas, Verdery, Gardner et al., 1991); this research should be expanded to include patients in other care settings with different severity of disease.

QOL is an important, often neglected, measure that should be more seriously considered by researchers. No diagnostic or treatment program is worth doing if it results in a patient who is just as physically, socially, or psychologically impaired as before the program was undertaken. QOL is a subjective measurement that can be made in numerous different ways, and no one measurement method has been judged to be superior, thus making this endpoint difficult to address. However, it should not be ignored.

Many studies have been conducted with such a short followup period that mortality is not addressed. In those cases where mortality has been addressed, overall mortality has generally been the only measure. It would be interesting to know the causes of mortality in dysphagia patients; obviously, death from pneumonia is a serious consideration. Another question of interest may be whether a general weakening of the patient's system increases the risk of dying from other causes. If the link between dysphagia and malnutrition is substantiated, then this is an important concern as malnutrition in the elderly has been found to weaken the immune system (Chandra, 1989, 1990). These possible links need to be explored and reported.

Other short-term outcomes of interest have seldom been reported specific to particular treatments. In the field of noninvasive swallow therapy using different postural techniques and exercises, it would be interesting to see more on changes in nutritional intake (as measured by kilocalories) or weight changes. In the field of diet modification, short-term outcomes seldom reported include the ability to maintain the recommended diet safely, nutritional intake, and weight changes. Elimination of aspiration is a specific measure of interest, which should be evaluated over an extended duration of treatment; many studies have measured it during a barium swallow as the treatment is initially being tested, but few have followed this outcome for these patients after they have attempted to use the treatment independently.

Investigators should attempt to identify particular symptoms that could be used to determine an appropriate treatment plan. It is not clear whether one set of signs and symptoms would serve all patients, or whether different signs and symptoms should be used for patients with different primary diagnoses. It is also not clear that the same set of signs and symptoms should be acted upon in the same way for patients at different stages in their disease. For example, a given set of signs and symptoms in patients with early-stage Alzheimer's disease might lead to one course of action whereas the same set of signs and symptoms in a patient with advanced Alzheimer's disease may lead to another course of action, if any action at all is taken.

With such measures, studies could be conducted over a short period of time and still be able to report meaningful treatment response measures.

Followup

To compare the outcomes of any two groups of patients, they need to have been followed for the same length of time after the onset of disease. This is because the risks of morbidity and mortality resulting directly from that disease will change as time progresses and the total incidence of these endpoints will change. Similarly, the number of patients experiencing morbidity and mortality will accumulate over time. Therefore, the outcome of a patient followed for 4 weeks after stroke cannot be compared with a patient followed for 8 weeks.

Unfortunately, most studies on dysphagia management published to date have followed patients for a mean length of followup; in acute care, this is often simply the length of stay in the hospital, which is different for each patient. However, even in other care settings, followup time has not been standardized.

Any meaningful analysis comparing study results without such standardization is impossible. It clearly is very difficult for a researcher to follow patients once they have left the inpatient care setting, and obvious confounds arise when a care setting changes. Analysis would then need to take into account care setting, and perhaps compare the outcomes of patients released to the community versus those sent to continuing care (either in rehabilitation or a nursing home).

Specific Areas of Recommendation

The current literature contains several gaps on management of dysphagia. We suggest additional research is needed in the specific areas discussed below.

Clinical Signs and Symptoms

Individual signs and symptoms for prediction of pneumonia during noninstrumented exam have been found to be unreliable; all tested symptoms have been found to have either a very low sensitivity or a low specificity. However, researchers have not examined the co-occurrence of multiple symptoms to find an algorithm that may successfully predict pneumonia. Such research is suggested for the future in order to isolate those patients who would be best served by undergoing more extensive, instrumented diagnosis and treatment.

Some individual symptoms, in particular dysphonia, have been found predictive of aspiration. As with pneumonia, however, co-occurrences of several symptoms have not been fully explored, and such research may produce a method of noninstrumented diagnosis that is accurate enough to selectively choose the appropriate patients at risk to undergo further testing.

Diagnosis

There is currently no conclusive evidence about the superiority of instrumented diagnostic tools over noninstrumented ones. It has been assumed by clinicians that imaging technology such as videofluoroscopy or fiberoptic endoscopy must be superior to a noninvasive BSE because of additional information provided. However, published research has yet to prove this superiority. Researchers must document what information provided by instrumentation but not by BSE results in better outcomes for patients.

It would also be interesting to see if there are any symptoms on BSE that are reliably predictive of particular physiological dysfunction. As there is currently some limited information on the relationship of particular physiological abnormalities and appropriate treatment, such BSE predictive value would make it a better diagnostic tool in isolation.

Treatments

Most treatment trials on noninvasive swallow therapy have thus far examined results of the program overall, rather than broken out by specific techniques. For example, a study may report that patients underwent strengthening exercises or postural techniques, and then report outcomes for all these patients together. It would be interesting to see long-term results after individual techniques, such as the chin tuck or Mendelsohn Maneuver. (Such results have been reported for the palatal training appliance and tactile-thermal stimulation.) Such studies, as discussed above, would have to be constructed with a control or comparison group of some sort, either a historical control or comparisons among different intensities of treatment; comparison of different therapies is inappropriate unless patients with identical indications could appropriately be referred for completely different treatments.

Clinical studies examining outcomes with the use of feeding tubes have exhibited many problems. The most obvious problem has been that the overwhelming majority of these studies did not isolate patients with dysphagia from patients put on a tube for other causes. Thus, the results of those with dysphagia are mixed with those suffering from dementia or paralysis that makes self-feeding difficult. It is then impossible to determine what the effects of tube feeding are for dysphagia patients specifically. Only two studies have been identified that examined dysphagia patients specifically [Norton, Homer-Ward, Donnelly et al., 1996; Spiegel, Creed, Selber et al., unpub.(a)]; one reported mortality and the other weight change. More of this type of research is recommended.

As with the other treatment literature, the feeding tube literature suffers from inconsistent followup times. The literature would benefit if results were reported using survival curves (e.g., Kaplan-Meier curves) so that the percentage of mortality, morbidities, and complications were standardized on the basis of the number of patients actually followed for the specific time intervals. We also suggest that the rates of all possible feeding tube complications be reported, even if none occurred, because currently these complications are inconsistently reported and it is unclear when they are not reported whether none occurred or whether the researchers did not consider them important.
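
A minimal sketch of the Kaplan-Meier product-limit estimate shows how such curves standardize event rates on the number of patients actually followed at each time point. The followup times and outcomes below are hypothetical and the function name is illustrative; patients who leave followup without the event are treated as censored and simply drop out of the at-risk count.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: followup time for each patient (e.g., weeks).
    events:    1 if the event (e.g., death) occurred at that time,
               0 if the patient was censored (followup ended without event).
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose followup ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical followup data for six patients (NOT from any study).
weeks = [2, 4, 4, 6, 8, 8]
died  = [1, 0, 1, 0, 1, 0]
curve = kaplan_meier(weeks, died)
```

In this toy data set the estimate at 8 weeks is based only on the two patients still under observation at that time, which is the kind of standardization the paragraph above calls for.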

Clinical Trial Suggestion

This section contains our suggestions for a multicenter, randomized trial that would provide useful information for relative efficacy of different dysphagia management algorithms. We do not suggest that this trial would answer all questions that are of interest. In particular, it will not determine the sensitivities and specificities of the various diagnostic methodologies. We propose not to determine these measures of test performance because their determination would require determining the number of patients who receive false-positive diagnoses. This could only be done by denying treatment to certain patients who receive a positive diagnosis, a maneuver that would be unethical. Further, knowing these performance measures is not required for determining which diagnostic methodology provides greatest benefit to patients. It is more important to know which leads to the most favorable patient outcomes. In the lexicon of the evidence hierarchy we described in the Methods section, we will here describe a trial that provides Level 5 evidence. This trial compares patient outcomes after different diagnostic methodologies are used, to determine if any specific diagnostic results in a significantly better outcome than any other diagnostic. In our second supplemental analysis, we will describe additional information that could be collected to render this a trial that provides additional information about Level 6 evidence (cost-effectiveness), the highest level of evidence possible.

We also do not suggest that this trial is the perfect trial in all aspects. In particular, practitioners and patients would not be blinded to most (but not all; see below) of the diagnostic tests that patients received, nor would they be blinded to the type of treatment given. The reason for this is simple -- such blinding is not possible (for example, treatment is often provided during the initial MBS test). Thus, the trial we describe takes clinical reality into account.

Trial Design

The trial we suggest is a multicenter, multiarm trial, with patients randomly assigned to each arm. Randomization should be accomplished by accepted means (e.g., by using a table of random numbers), and not according to patient hospital number, the day of the week on which patients arrive at the hospital, or other means not accepted by methodologists.
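
As one illustration of an acceptable allocation procedure, the sketch below uses a seeded pseudo-random number generator in place of a printed table of random numbers, and permuted blocks to keep the arms balanced. Both choices are assumptions made for this example rather than requirements stated above, and the arm labels and function name are hypothetical.

```python
import random

# Hypothetical labels for the four trial arms.
ARMS = ["arm_1", "arm_2", "arm_3", "arm_4"]


def block_randomize(n_patients, arms, seed):
    """Assign consecutively enrolled patients to arms in permuted blocks.

    Each block contains every arm exactly once in random order, so the arms
    stay balanced throughout accrual. The seed makes the allocation list
    reproducible for auditing; it should be fixed before enrollment begins.
    """
    rng = random.Random(seed)
    assignments = []
    while len(assignments) < n_patients:
        block = list(arms)
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n_patients]


allocation = block_randomize(20, ARMS, seed=1999)
```

Unlike assignment by hospital number or day of admission, the resulting sequence cannot be predicted from any patient characteristic.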

This trial can consist of two to four groups, depending upon the number of questions one wishes to answer. For the purposes of this description, we will describe the trial as if it contained four groups, with the understanding that the number of groups can be reduced.

Patient Population

The patient population should be as homogeneous as possible. Thus, only patients in a certain age range should be enrolled. This is required inasmuch as very old patients may have had more co-morbidities (or have been otherwise more debilitated) than relatively younger ones. Additional patient inclusion criteria may also be desirable to ensure that the study is conducted on a homogeneous group. Regardless of the specific criteria, however, the characteristics of the enrolled patients should be recorded to determine, after the conclusion of the trial, whether the patients in any of the four groups were different from those in other groups.

Patients should be consecutively enrolled. We recommend that the trial be limited to patients with a single disease, preferably stroke. This ensures that the trial will study the population most likely to benefit from dysphagia programs. It also makes accruing patients to the trial easier than it would be if other patients were studied. This means that the trial can be concluded more rapidly. Further, we are unable to estimate how effective dysphagia diagnosis and treatment programs are in other patient populations. Enrolling patients from these other populations could, therefore, have an impact on our statistical power calculations (see below), making it possible that many more patients than we have specified might need to be enrolled. In practice, this could yield a worst-case scenario in which the trial found no effect because it was underpowered.

Diagnosis and treatment should begin at a uniform time, and as soon as is practical. Delaying diagnosis and treatment too long would mean they would be offered to patients increasingly less likely to benefit from a dysphagia program. Because the trial should be restricted to stroke victims immediately after the stroke event, only acute-care centers should participate.

Diagnostic Methods

All patients in each arm undergo the dysphagia diagnostic method to which they are randomized, regardless of whether they are exhibiting clinical symptoms of dysphagia. In the first arm (the control group), all patients receive diagnosis by a noninstrumented test and then treatment. Although it is commonly believed that noninstrumented tests are inferior to instrumented diagnostics, no definitive evidence of this exists and, in fact, there is some limited evidence that dysphagia programs using the BSE are effective in preventing aspiration pneumonia [Odderson, Keaton, and McKenna, 1995; Spiegel, Creed, Selber et al., unpub.(a)]. Therefore, there is no compelling evidence to suggest that inclusion of such a group would be unethical. If suspicions about such a group remain, it is important to understand that the trial we describe would be subject to the normal ethical stopping rules. Thus, accrual to this group could be immediately terminated should it be found during the trial that patients in this group fared significantly worse than did patients in another group. At the same time, we strongly recommend that this group be included, no matter how many groups the trial ultimately contains. Inclusion of this group answers a most fundamental question: Are instrumented exams more efficacious than structured formal BSEs?

Patients in the other three arms would also receive the noninstrumented test. The results of this exam, however, would not be used in patient management. In fact, physicians and other caregivers would be blinded to the results of this test (the reasons for these BSEs are described below). We recommend blinding of providers to the results of these BSEs to ensure that interpretation of the results of the subsequent tests is not biased by the results of the BSE.

These patients would then randomly receive diagnosis with a single instrumented test (to be chosen by the researchers). Numerous instrumented tests are currently used in dysphagia management, and the clinical literature is currently equivocal about the superiority of any one over any other. We do not specifically recommend the inclusion of any particular instrumented tests, because to include only the widely used tests might be construed as exclusionary of new, emerging (and potentially superior) technologies; however, to recommend the newer technologies would discount the refinement and extensive development of the established technologies. We therefore defer to the clinical establishment to choose which instrumented exams to include. Treatment would be determined based upon the results of these diagnostic tests.

Because all patients in this trial receive noninstrumented diagnosis, it is important that this exam be standardized across centers. The specific components of this exam could be determined by a panel of experts (which, if desired, could include patient representatives). We recommend, however, that this exam include (but not be limited to) assessment of patients' oral hygiene habits, dysphonia, the ability to voluntarily cough (rated on a four-point scale), and the 3-ounce water test. As discussed previously in this report, limited data suggest these tests are effective at predicting aspiration. For reasons that will become apparent below, a thorough and standardized patient history should be taken as part of this test. Again, the expert panel could determine specific elements of this history.

Treatments

As with the diagnostics discussed above, treatments in different centers should also be standardized to the greatest extent possible; standardization should be accomplished by the same expert panel that standardizes the diagnostic methods. This is not meant to imply that all patients receive the same treatment; rather, patients are assigned to treatments based on symptomatology. However, the methods of determining the appropriate treatment (i.e., the symptom-based treatment choices) should be the same across the different diagnostic methods. We recognize that this is likely to be problematic. However, it is important to accomplish this standardization to reduce the variability of the trial's results (thus increasing its statistical power) and to ensure that apparent effects of diagnostics are not confounded by treatment differences.

Because this standardization should ensure that centers or individuals providing therapies are equally well trained, it is important that none of the sites (or providers) are in a startup phase of speech-language therapy, when results might be poorer than those obtained after more experience is gained.

Outcome Measures

The primary outcome measure of the trial should be pneumonia, as it is the most serious morbidity that may result from dysphagia. We recommend this as the primary outcome for several purely pragmatic reasons. First, the cost of treating pneumonia is among the greatest of the costs of treating morbidities. Second, although death rates due to dysphagia are of extreme importance, one could obtain low death rates by curing pneumonia, not by preventing it. Thus, any effect of dysphagia diagnosis and treatment programs observed in such a trial would be contaminated by the effectiveness of pneumonia treatment. Finally, there is so little information on other morbidities resulting from dysphagia that one cannot, on the basis of current data, even be assured that these other morbidities pose a major health problem. In fact, one of the purposes of this trial is to provide information about the health problem caused by these other morbidities. Thus, secondary outcomes should include, but not be limited to, measures of: (a) the number of patients placed on feeding tubes, (b) weight change, (c) body mass index, (d) serum albumin, (e) dehydration (measured by the blood urea nitrogen:creatinine ratio), (f) morbidities resulting from feeding tubes, (g) dysphagia-related mortality, and (h) all-cause mortality. Data should be analyzed on an intent-to-treat basis.

The results of the BSEs can also be considered among the outcomes of this trial. These results will be used to provide two kinds of information. First, they will allow one to determine whether any signs and symptoms, and particularly any combination of signs and symptoms, predict morbidity (and thus whether the BSE can be used as a stand-alone diagnostic). Second, these results will allow one to determine, if warranted, which BSE results can be used to selectively refer appropriate patients for subsequent instrumented diagnostic tests. It may be possible to pool the BSE results from the three groups that receive subsequent diagnostics to enhance the statistical power of the analyses that will be required; such analyses will involve multivariate statistics, which are lower-power tests than univariate analyses. The fact that analysis of these data will be conducted using a form of multivariate statistics (and, more specifically, a form of multiple regression analysis) means that the results of any one trial, including the one we describe here, will not be completely generalizable to all settings. It is well known that multiple regression equations derived in one setting are less predictive when used in another (this reduction in predictive ability is termed shrinkage). It is even possible that some variables that appear to be predictive in this clinical trial will not be found predictive in later work. In other words, it is possible that the multivariate results obtained in a well-controlled clinical trial that employs a strict protocol may not be entirely generalizable to actual clinical practice, where protocols are often less strict. This does not imply that the results of these analyses are not worthwhile, but it does mean that they will have to be checked, and possibly further refined, based on how predictive these signs and symptoms are in actual clinical practice.

Followup

Patients should be followed for a uniform period of time, at least 1 year. This requires active monitoring of patients after they leave acute care. Following patients for an average (or mean) length of time is inappropriate. Means are distorted by outliers, and would be misleading if a few patients or any group(s) were followed for abnormally short or long periods. We recognize the difficulty in performing this type of followup, so it will most likely be prudent to offer patients remuneration to enhance compliance.

Even patients who receive negative diagnoses for swallowing disorders should be followed. This is required to ensure that results are analyzed on an intent-to-treat basis.

Statistical Power Issues

An a priori power analysis for this trial is required. The purpose of this analysis is to estimate the number of patients the trial should enroll. Our estimates of the number of patients required are shown in the following table:

Table 20. Number of Patients Required in Proposed Trial

Baseline pneumonia rate | Program rate | Percent decrease | N per group | Total N, 2 groups | Total N, 3 groups | Total N, 4 groups
2% | 1.5% | 25% | 10,795 | 21,590 | 32,385 | 43,180
2% | 1.0% | 50% | 2,318 | 4,636 | 6,954 | 9,272

The lefthand column displays the rate of aspiration pneumonia that, for the purposes of our statistical power calculations, we assume will occur in patients whose treatment is based on only the results of the BSE (see our discussion of Question 1 in the Evidence Report and our first Supplemental Analysis for further information about this rate). In the next column from the left, we portray two hypothetical pneumonia rates that might occur in a dysphagia diagnosis and treatment program that employs an instrumented diagnostic. These hypothetical rates translate to 25 percent and 50 percent reductions in the rate of aspiration pneumonia, as shown in the third column from the left. The next column shows the number of subjects needed in each group of the trial for each of the two hypothetical reductions in pneumonia rates. The final three columns show the total number of patients the trial must contain for each of these two rate reductions for a two-, three-, and four-group trial, respectively. Thus, if a program results in a 25 percent decrease in the rate of aspiration pneumonia, 10,795 patients will be required for each group, meaning that the trial must contain 21,590 to 43,180 patients, depending on the number of groups. Similarly, for a program resulting in a 50 percent reduction in the pneumonia rate, 2,318 patients per group will be required, meaning that the trial must contain 4,636 to 9,272 patients.

For the preceding calculations, we sought a trial that would give 80 percent statistical power at an alpha level of 0.05, and assumed that χ² would be the test statistic (the reasons for this assumption are discussed below).
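The per-group numbers in Table 20 can be approximated with the standard normal-approximation sample-size formula for comparing two proportions, which corresponds to a χ² test on a 2 × 2 table. The report does not say which formula or software was used, so the pooled-variance form below, the two-sided alpha, and the function name `n_per_group` are assumptions. Under them, the sketch gives roughly 10,795 and 2,318 patients per group for the two rows of the table.

```python
from math import sqrt
from statistics import NormalDist


def n_per_group(p_control, p_program, alpha=0.05, power=0.80):
    """Approximate patients per arm needed to detect p_control vs. p_program.

    Normal approximation for two independent proportions, two-sided alpha.
    (Assumed formula; the report does not state the one it used.)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_program) / 2
    pooled = sqrt(2 * p_bar * (1 - p_bar))  # SD term under the null
    unpooled = sqrt(p_control * (1 - p_control) + p_program * (1 - p_program))
    return (z_alpha * pooled + z_beta * unpooled) ** 2 / (p_control - p_program) ** 2


for p_program in (0.015, 0.010):
    n = round(n_per_group(0.02, p_program))
    totals = [n * groups for groups in (2, 3, 4)]
    print(f"program rate {p_program:.3f}: N per group {n}, totals {totals}")
```

Changing `p_control` or `p_program` by small amounts moves the result substantially, which is the sensitivity the report discusses when it recommends a pilot study.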

For this area of research, this is a relatively large trial. However, it is not large when compared with the size of other trials reported in the medical literature. The size of this trial, however, is one reason for our recommendation that this trial be a multicenter trial. Using a number of centers will make it easier to enroll this number of patients and will decrease the length of time required to complete the trial.

In constructing this table, we have chosen a 25 percent decrease in pneumonia as the lowest effect size of interest on the assumption that a smaller decrease in the aspiration pneumonia rate would be cost-prohibitive. This in fact may not be the case, so we believe it prudent to describe the calculations we used to reach this conclusion. In this way, others can both evaluate our results and use our calculations to employ their own judgment about what is and what is not cost-prohibitive. These calculations are based upon this trial's costs for preventing one case of aspiration pneumonia.

In a trial containing two groups, each patient would receive a BSE at a cost of $141. (The sources of this cost information are described in our first Supplemental Analysis.) Half of the patients in the trial would receive an instrumented exam at a cost of $218, so the average cost of instrumented exams would be $109 per trial enrollee. We also assumed that about 38 percent of all patients would receive diagnosis-directed treatment (derived from the estimate of aspiration in this population by Daniels, Brailey, Priestly et al., 1998), at a cost of $242 per treated patient, which yields an average cost per enrollee of $92. Summing the costs ($141 + $109 + $92) yields an approximate cost of $342 per enrollee. If a dysphagia program employing an instrumented diagnostic decreases the rate of aspiration pneumonia by 25 percent (i.e., from 2 to 1.5 percent), then the program will prevent one case of pneumonia in every 200 enrollees in a two-group trial. This means that the total cost of preventing one case of pneumonia in such a trial is $68,400. Similarly, in a trial with three groups, the cost to prevent one case of pneumonia is $75,600, and in a trial with four groups, the cost is $79,400. Although these numbers are admittedly crude, they far exceed the highest estimate of the cost of treating pneumonia, $19,000, in the literature (Aviv, unpub.). Hence, our assumption that a 25 percent program-related decrease in pneumonia is the smallest effect of interest is probably a liberal one.
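The arithmetic behind these cost-per-case figures can be sketched as follows. The unit costs, the 38 percent treatment fraction, and the pneumonia rates are taken from the paragraph above; the function name and the assumption that (groups - 1)/groups of enrollees receive an instrumented exam in the three- and four-group designs are ours. The text rounds intermediate values, so the sketch lands within about a hundred dollars of the reported figures rather than matching them exactly.

```python
def cost_per_case_prevented(n_groups, bse=141.0, instrumented=218.0,
                            treatment=242.0, treated_fraction=0.38,
                            baseline_rate=0.02, program_rate=0.015):
    """Trial cost of preventing one pneumonia case, per the report's logic."""
    # One arm is BSE-only, so (n_groups - 1) / n_groups of enrollees
    # also receive an instrumented exam (assumption for 3 and 4 groups).
    share_instrumented = (n_groups - 1) / n_groups
    cost_per_enrollee = (bse
                         + share_instrumented * instrumented
                         + treated_fraction * treatment)
    # A drop from 2 to 1.5 percent prevents one case per 200 enrollees.
    enrollees_per_case = 1 / (baseline_rate - program_rate)
    return cost_per_enrollee * enrollees_per_case


for groups in (2, 3, 4):
    print(groups, "groups:", round(cost_per_case_prevented(groups)))
```

Readers who judge a different threshold to be cost-prohibitive can rerun the function with their own unit costs or with a smaller assumed reduction in the pneumonia rate.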

Our estimates of the number of patients required for this trial are highly sensitive to the pneumonia rate predicted to occur in those patients whose treatment is determined solely by the BSE and to the anticipated magnitude of the effect of a dysphagia program. For example, if we assume that an instrumented diagnostic will decrease pneumonia rates to 1 percent (instead of 1.5 percent), a four-group trial will require 33,908 fewer patients (9,272 instead of 43,180). Similarly, if we assume a 4 percent base pneumonia rate and, as above, a 25 percent decrease as a result of employing instrumented diagnostics, then only 21,100 patients are needed for the four-group trial, which is 22,080 fewer than if the base rate is 2 percent. This sensitivity arises because the power function is relatively steep in the area in which we are working, and not as a result of any inaccuracies in our calculations. Because of this sensitivity, it is wise to consider a pilot study to better estimate the base pneumonia rates and the anticipated difference in pneumonia rates between patients who receive only the BSE and those who receive an instrumented exam.

Our calculations of the number of patients needed for this trial assume that the statistical analysis of the data will be conducted to maximize power. This reduces the number of patients required for the study, and without analyses that maximize power, this number could become prohibitively large. The implication is that focused contrasts should be used. Taking the four-group trial as an example, a relatively low-power analysis could result if the analysis were conceived as a χ² analysis of a 2 × 4 design in which the four treatment groups are the column headings and the presence or absence of pneumonia are the row headings. On the other hand, it is possible to analyze these data using three orthogonal contrasts. These contrasts might, for example, ask whether the results of the BSE are different from those of all other tests, whether the results of the MBS exam are different from those of the two fiberoptic exams, and whether the results of the two fiberoptic exams are different from each other. By employing these contrasts, one reduces the degrees of freedom from three (for the omnibus χ² test) to one, and there is a resulting increase in statistical power. Obviously, the focused contrasts can also be used in the three-group trial and are not relevant to the two-group trial, which is already analyzed on 1 degree of freedom.

Some Final Remarks

Although the trial need not contain all four groups we have described, using fewer groups should be approached cautiously. As a rule, data obtained from a single trial that directly compares the diagnostic strategies of interest are stronger than those of several trials, each of which makes only some of the comparisons of interest. This remains true even if the aggregate of all trials makes all of the desired comparisons. However, relying on several smaller studies would be feasible if the same centers took part in all of them, using the same documented diagnostic and treatment criteria.

It is also possible to add groups to this trial that incorporate combinations of diagnostics. Similarly, some diagnostic combinations may be substituted for the groups we have outlined here.

The results of this trial can be extended to answer other questions of interest, even though it (or any other feasible clinical trial) will likely contain too few patients to yield appropriate statistical power. This can be accomplished by using the data from this trial to construct a decision analysis. Such an analysis is further discussed in Appendix F.

Supplemental Analysis

Cost-Effectiveness Analysis of a Dysphagia Diagnosis and Treatment Program

Introduction

In this supplemental analysis, we determine whether dysphagia diagnosis and treatment programs are a cost-effective means to prevent aspiration pneumonia. As in our main analysis, we focused on stroke patients, because they are the largest group of elderly adults experiencing dysphagia. The clinical measure of effectiveness for this supplemental analysis is prevention of aspiration pneumonia, because it is the major cause of serious morbidity, mortality, and expense for patients with dysphagia. Also, as discussed in the main assessment, we were unable to find adequate literature to estimate the effect of dysphagia diagnosis and treatment on the other important patient outcomes of malnutrition, dehydration, and quality of life (QOL). Finally, we focused on the acute-care setting because patients are at greatest risk for aspiration pneumonia during this time, because the greatest efforts of dysphagia programs are made at this time, and because most of the data required for cost-effectiveness analysis were for the acute-care setting. However, we also discuss the implications of our results for nursing home patients.

We performed this analysis by constructing a decision tree that specifically addresses the question of whether a typical program to diagnose and treat dysphagia is effective enough in preventing pneumonia to justify the cost of diagnosis and treatment. We first addressed the question for a dysphagia program that uses a preliminary bedside exam (BSE) followed by a full BSE for patients judged to have dysphagia by the preliminary exam. We next examined the cost-effectiveness of a program that uses a preliminary exam to refer dysphagia patients to a videofluoroscopic swallowing study (VFSS). To answer each question, we set up a decision tree that mimics a hypothetical randomized controlled trial of a cohort of acute stroke patients.

There are two primary arms of this decision tree. One arm models the historical situation of a typical acute-care hospital with no special dysphagia program. In this arm, the swallowing competence and needs of stroke patients are assessed informally by attending physicians, nurses, and dieticians without special training in diagnosis and treatment of dysphagia, and without the services of speech-language pathologists (SLPs) who are specially trained to diagnose and treat dysphagia. In this situation, a major involvement of physicians is in placing enteral tubes in patients judged to have a swallowing problem that could make it difficult for them to maintain hydration and nutrition or that could lead to aspiration of food and liquids. Patients judged less severely impaired may be provided with pureed food and possibly a soft mechanical diet including thickened liquids. The historical proportion of stroke patients who acquire aspiration pneumonia is used as the effectiveness measure of this arm. There is a branch for the proportion of patients who get aspiration pneumonia during a typical acute-care stay of 2 weeks. The tree counts the number of patients who get pneumonia and the number who remain free of pneumonia by setting the effectiveness payoff for the former patients at 0 and the effectiveness payoff for the latter patients at 1. Thus, the measure of effectiveness in this analysis is number of pneumonia cases prevented. The cost of pneumonia is calculated as the Medicare reimbursement rate for pneumonia. The expected value for this entire arm is calculated as the pneumonia treatment costs to Medicare per stroke patient.

Using Medicare reimbursements instead of true provider costs allows us to estimate Medicare's net costs for dysphagia diagnosis-treatment minus pneumonia costs. We could find no recent literature sources for comprehensive provider costs of dysphagia diagnosis-treatment and pneumonia, whereas Medicare reimbursement rates are known. If Medicare reimbursements underestimate the provider costs of treating pneumonia to a greater extent than dysphagia diagnosis-treatment costs, then any cost savings found by our analysis will be even greater for providers than for Medicare. Thus, our estimates will be conservative and for the minimum savings.

The second primary arm of the tree models a typical dysphagia program in the same setting as above. For the BSE program, clinicians, typically nurses, carry out a preliminary swallowing evaluation as described by Odderson et al. (Odderson, Keaton, and McKenna, 1995) (described in the main evidence report in the Results section under Question 1) and refer stroke patients suspected of dysphagia to SLPs specially trained to diagnose and treat dysphagia. These dysphagia specialists carry out a formal BSE. Treatment of these patients may include (alone or in combination) enteral intubation, positional maneuvers, strengthening exercises, and diet modifications such as the restriction of certain consistencies and the addition of thickening agents to certain consistencies. Patients also receive 15 minutes a day of therapy for 6 days to reinforce and maintain these measures. We took the incidence of aspiration pneumonia observed in recent studies of BSE dysphagia programs as the effectiveness of this arm. For the cost of this arm, we used the Medicare reimbursement rate for diagnosis and treatment of dysphagia and the reimbursement rate of treating pneumonia. Therefore, the costs associated with this arm are the net costs for dysphagia diagnosis-treatment and pneumonia per stroke patient. The decrease in the number of PEG tubes is also considered in the costs. However, rather than adding PEG costs to both arms, we merely subtracted the decrease from the dysphagia program arm.

The simple cost-effectiveness of each arm is calculated as the costs divided by the proportion of pneumonia-free patients. However, this is not a particularly useful measure, because we are interested in the costs for all patients. Of primary interest is the overall difference in cost-effectiveness between the two arms, called the marginal or incremental cost-effectiveness. This tells the cost-effectiveness of adding the dysphagia program to the routine care of stroke patients. The marginal cost-effectiveness between the two arms is calculated as the difference in total costs for each arm divided by the difference in effectiveness for each arm. This gives the cost (or savings) per pneumonia case prevented. These are the net direct medical costs (or savings) of preventing pneumonia.
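The marginal cost-effectiveness described above reduces to a single ratio. The sketch below is a minimal illustration; the function name and the input values are hypothetical and are not results from this analysis.

```python
def marginal_cost_effectiveness(cost_program, cost_no_program,
                                effect_program, effect_no_program):
    """Difference in cost divided by difference in effectiveness.

    Costs are expected costs per stroke patient; effectiveness is the
    proportion of pneumonia-free patients. A negative result means the
    program saves money for each pneumonia case it prevents.
    """
    delta_cost = cost_program - cost_no_program
    delta_effect = effect_program - effect_no_program
    if delta_effect == 0:
        raise ValueError("arms are equally effective; the ratio is undefined")
    return delta_cost / delta_effect


# Hypothetical inputs, for illustration only.
icer = marginal_cost_effectiveness(cost_program=150.0, cost_no_program=200.0,
                                   effect_program=0.97, effect_no_program=0.92)
print(f"cost per pneumonia case prevented: {icer:,.0f}")
```

When the program arm is both cheaper and more effective, the ratio is negative, which corresponds to the situation the analysis later calls dominance.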

It is important to place these medical costs in perspective, which can only be accomplished by adjusting the cost of pneumonia by some measure of QOL and the threat of death from pneumonia. Meaningful pneumonia-related QOL data are not currently available, so it is necessary for individual users of this analysis to employ their own data and judgment for this evaluation. Also, it is necessary to realize that because we were limited by published data to calculating only the pneumonia cost-effectiveness, one is forced to take pneumonia as representative of the other important patient outcomes such as dysphagia-related malnutrition, dehydration, and QOL. Dysphagia diagnosis and treatment costs relating to these have unavoidably been included here in our pneumonia analysis, while the costs of these additional patient outcomes and the savings for their prevention have not. Thus, the additional savings (monetary or QOL) from preventing or ameliorating these must be considered, although we were unable to explicitly calculate these.

Methods

We used the results and evidence tables of the main analysis as the probabilities and costs in the decision tree. The methods for searching the literature and for extracting and analyzing the data are described in detail in the Methods section of the main analysis. Briefly, the peer-reviewed literature was searched for publications containing information on the epidemiology, burden of disease, diagnosis, treatment, and costs of stroke, aspiration pneumonia, and dysphagia in the elderly population. The titles and abstracts in the search results were examined independently by two analysts, and publications that appeared to have relevant data were ordered. The set of ordered publications was manually examined independently by two analysts to extract data, and bibliographies were examined for additional publications that were not in the searches. The only inclusion criteria were that the studies contain 10 or more patients and be conducted in the United States. In some cases where no appropriate data meeting those requirements were available, smaller studies and studies outside the United States were included, with appropriate indication of these exceptions.

The variables used in the decision trees are shown in Tables S-1 and S-2. These same data, along with additional data provided for comparison purposes, are presented in the evidence tables at the end of the supplemental analysis.

The decision tree(s) were constructed, calculated, and illustrated using DATA™ decision analysis software (version 3.0.18; TreeAge Software, Cambridge, MA).

Cost-Effectiveness of a BSE Dysphagia Program Compared with No Program

Tree Construction

No-program arm

As described above, the no-program arm of the tree was analogous to a historical control group composed of patients in a typical acute-care hospital that has no special dysphagia program. The probabilities and costs used in this comparison arm are presented in Table S-1. These data, along with additional data provided for comparison purposes, are presented in the evidence tables at the end of the supplemental analysis. The proportion of stroke patients who acquire aspiration pneumonia without a dysphagia program is 0.082, which is the N-weighted mean for Haerer and Smith, 1974; Young and Durant-Jones, 1990; Odderson, Keaton and McKenna, 1995; and Barker and Mullooly, 1997. These are the historical control data adjusted for time of followup that were discussed in the main body under Question 1, and are described in more detail in Appendix D. We assumed the cost of treating a case of aspiration pneumonia in a stroke patient is $2,164, which is the 1997 Medicare reimbursement difference between stroke (DRG17) and stroke with comorbid condition (DRG16). Under the current prospective payment plan, this is meant to cover all provider costs in this situation. As noted above, this may underestimate the true provider costs for pneumonia. The rationale for using Medicare reimbursement rates was explained above.

BSE dysphagia program arm

All the probabilities and costs used in the arm of the decision tree that represents a BSE dysphagia program, along with names of the appropriate variables, are presented in Table S-2. The BSE dysphagia program in this arm is modeled after Odderson, Keaton and McKenna, 1995. In such a program, nurses give a preliminary bedside assessment. In that study, 39 percent of acute stroke patients failed the preliminary assessment, and these were referred for a full BSE (1.5 billable hours) by an SLP. Those patients are also given followup of 15 minutes of SLP therapy daily for 6 days. In addition, Odderson et al. (Odderson, Keaton, and McKenna, 1995) reported that the placement of percutaneous endoscopic gastrostomy (PEG) tubes went from 8.3 percent of stroke patients in the year prior to their BSE program to 7.3 percent during their program. This is a difference of 1 percentage point. Therefore, we subtracted 1 percent of the cost of PEG tube placement from the costs of the BSE program arm. In the tree, the variable Etube represents these effects of the program on tube placement rates.

The effect of the dysphagia program on aspiration pneumonia is calculated in the tree as the relative risk reduction (RRR) subtracted from 1 and multiplied by the probability of pneumonia with no program. This method of calculating the pneumonia probability for the dysphagia program allows us to carry out sensitivity analyses in which the RRR, the no-program pneumonia probability, or both are changed. RRR is taken as 0.84. The Odderson et al. program had an RRR of 100 percent. We considered it unrealistic to assume that all dysphagia programs would have this perfect result. Therefore, we averaged this with results reported in the only other two reports we found of dysphagia programs for consecutive acute stroke patients, Nilsson et al., 1998, and Daniels et al., 1998 (Evidence Table S-4). The latter study used VFSS on all patients; however, it reported a result intermediate between the Odderson and the Nilsson studies. Thus, we felt it could be averaged with the others. We did not weight this mean, because it is not clear that the differences in RRR among these studies are due only to random variation attributable to study size. At the same time, we were forced to accept this mean RRR for the three studies (84 percent) as the best available estimate of RRR for dysphagia programs, because it is unlikely that any of these studies were large enough to resolve the real differences in RRR that might result from the different methods used in the dysphagia programs.
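The calculation described in this paragraph is one line of arithmetic. The sketch below uses the no-program probability (0.082) and the RRR (0.84) given in the text; the function name and the alternative RRR values in the loop are illustrative assumptions.

```python
def pneumonia_probability_with_program(p_no_program, rrr):
    """Probability of aspiration pneumonia under the program: (1 - RRR) * p."""
    if not 0.0 <= rrr <= 1.0:
        raise ValueError("RRR must lie between 0 and 1")
    return (1.0 - rrr) * p_no_program


p_program = pneumonia_probability_with_program(0.082, 0.84)
print(f"base case: {p_program:.5f}")

# Varying the RRR, as a sensitivity analysis would (illustrative values).
for rrr in (0.50, 0.84, 1.00):
    print(rrr, round(pneumonia_probability_with_program(0.082, rrr), 5))
```

Keeping the RRR and the no-program probability as separate inputs is what allows either one to be varied independently in the sensitivity analyses.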

The costs of a full BSE, SLP therapy, and PEG tube placement are shown in Table S-2 and are detailed along with comparison costs in Evidence Table S-6. Aspiration pneumonia costs are the same as described above in the no-program arm. The rationale for using Medicare reimbursement rates as costs was explained above in the introduction.

The decision tree comparing a no-program arm with a BSE program arm is shown in Figure S-1. Each branch of the tree is described above the branch, and the probability of entering each branch appears below the branch using the tree variable names given in Tables S-1 and S-2. A pound sign (#) in the probability place indicates that the probability is automatically calculated by subtracting the probability of the complementary branch from one. Figure S-2 depicts the same tree with all of the variables and their values shown under the root branch and the formulas used to calculate the costs and effectiveness payoffs at each terminal node. Each pneumonia-free branch provides an effectiveness payoff of 1; each aspiration pneumonia branch provides an effectiveness payoff of 0. There are no pneumonia costs associated with the aspiration pneumonia-free branches; and the cost of aspiration pneumonia is assigned to each aspiration pneumonia branch. In addition, all branches in the dysphagia program arm are assigned the cost of a full BSE and the cost of SLP therapy, all multiplied by the proportion of patients referred by the preliminary assessment to SLP BSE and therapy (Evidence Table S-2).
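The expected-value calculation the tree performs can be sketched by representing each arm as a list of terminal nodes, each with a probability, a cost, and an effectiveness payoff. The no-program inputs (0.082 and $2,164) come from the text above; the function names are ours, and the per-patient program cost is left as a parameter because the Table S-2 values are not reproduced here. This is a simplified stand-in for the DATA software, not a reconstruction of the authors' model.

```python
def build_arm(p_pneumonia, pneumonia_cost, program_cost=0.0):
    """Two terminal nodes: pneumonia (payoff 0) and pneumonia-free (payoff 1)."""
    return [
        (p_pneumonia, program_cost + pneumonia_cost, 0),
        (1.0 - p_pneumonia, program_cost, 1),  # complementary branch ("#")
    ]


def roll_back(branches):
    """Return (expected cost, expected effectiveness) for one arm."""
    total = sum(p for p, _, _ in branches)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    expected_cost = sum(p * cost for p, cost, _ in branches)
    expected_effect = sum(p * effect for p, _, effect in branches)
    return expected_cost, expected_effect


no_program = roll_back(build_arm(0.082, 2164.0))
# Program arm with program_cost=0.0 shows only the pneumonia component;
# the per-patient diagnosis and therapy cost would have to be supplied.
program_pneumonia_only = roll_back(build_arm(0.16 * 0.082, 2164.0))
print(no_program, program_pneumonia_only)
```

The expected effectiveness of each arm is simply the proportion of pneumonia-free patients, because the payoffs are coded 0 and 1.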

Results

Figure S-3 presents the tree with the expected costs calculated. Expected costs are the average costs for each arm. Figure S-4 presents the tree with expected pneumonia rate calculated. The expected values in this tree represent the proportion of pneumonia-free patients in each arm. Results are shown in text form in Table S-3. The dysphagia program arm has an expected cost of $174 per stroke patient, and an effectiveness of 98.7 percent pneumonia-free patients. The no-program arm has an expected cost of $177, and an effectiveness of 91.8 percent pneumonia-free patients. Thus, the BSE dysphagia program arm is both more effective and less expensive than no-program. In cost-effectiveness analysis, this is termed dominance.

The difference in cost between the two arms is $4, in favor of the BSE program. The difference in effectiveness is 6.9 percent, in favor of the BSE program. In other words, the BSE program saved $4 per stroke patient and prevented 6.9 percent of the stroke patients from getting aspiration pneumonia.

This effectiveness is not a new finding calculated by the tree but strictly reflects the 84 percent RRR fed into the tree combined with the no-program pneumonia risk of 8.2 percent fed into the tree: 0.84 RRR X 8.2 percent pneumonia = 6.9 percent of stroke patients prevented from getting pneumonia. A tree is not required to obtain that result, rather it is a given derived from the analysis in the main body of the assessment, and with certain caveats presented there. The tree takes this given effect and calculates the net cost of obtaining this effect. In this case, there was a negative net cost, or a savings.

Sensitivity Analysis

We carried out sensitivity and threshold analyses to determine how sensitive the results of the tree were to changes in its variables. To determine the relative sensitivity of the marginal cost-effectiveness to changes in the variables, we varied each variable by a fixed ratio of plus or minus 25 percent, a method called tornado diagram analysis. The tornado diagram is shown in Figure S-5, and the text results of the sensitivity analysis are shown in Table S-4.

Variations in the base risk of pneumonia in patients who do and do not undergo a dysphagia program have the largest impact on incremental cost-effectiveness. A 25 percent reduction from the base risk causes the incremental cost-effectiveness of the dysphagia program to increase to a net cost of $647 per pneumonia case prevented (Table S-4). The amount of pneumonia risk reduction for the dysphagia program has a similar impact on the incremental cost-effectiveness. The proportion of patients being treated for dysphagia has somewhat less influence on incremental cost-effectiveness, and costs of specific procedures have even less influence.

The results of the tornado analysis show two things. First, the numbers of pneumonia cases, with and without the dysphagia program, most strongly influence the incremental cost-effectiveness of a dysphagia program. Costs of dysphagia diagnosis and therapy have less influence. Efforts to determine the values of these variables more precisely should focus on pneumonia risk without a dysphagia program and the amount of reduction of that risk provided by a program. Second, even if a more expensive diagnostic procedure were used (for example, VFSS), moderate changes in this cost do not lead to large values for the incremental cost-effectiveness of the dysphagia program (this possibility is analyzed further below in a tree modeling a VFSS dysphagia program). Over all variables, the results are favorable enough that it seems unlikely that moderate changes in input variables will result in the dysphagia program's having an unacceptable cost-effectiveness. As noted in the introduction to this supplemental analysis, this should be true because we used conservative cost assumptions.
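The one-way procedure behind a tornado diagram can be sketched as follows. This is an illustration, not the DATA software's method: the net-cost formula is our assumed reading of the tree (referred proportion times exam and therapy costs, less avoided PEG placements, less pneumonia costs avoided), and the dictionary keys are hypothetical names. Under that assumption, varying base pneumonia risk, RRR, and pneumonia cost by 25 percent gives the corresponding Table S-4 entries. The remaining rows come out close to, but not identical with, the table, possibly because the table lists rounded limits.

```python
BASE = {
    "p_ap_no_prog": 0.082, "rrr": 0.84, "p_diag_treat": 0.39,
    "cost_exam": 141.0, "cost_therapy": 242.0,
    "cost_peg": 416.0, "etube": 0.01, "cost_pneumonia": 2164.0,
}
PROBABILITIES = {"p_ap_no_prog", "rrr", "p_diag_treat", "etube"}


def icer(v):
    """Net cost per pneumonia case prevented; None when the program costs no more."""
    prevented = v["rrr"] * v["p_ap_no_prog"]
    program_cost = v["p_diag_treat"] * (v["cost_exam"] + v["cost_therapy"])
    net = program_cost - v["etube"] * v["cost_peg"] - prevented * v["cost_pneumonia"]
    return None if net <= 0 else net / prevented


def one_way(name, ratio=0.25):
    """Largest ICER found when one variable moves by plus or minus `ratio`."""
    worst = None
    for factor in (1.0 - ratio, 1.0 + ratio):
        value = BASE[name] * factor
        if name in PROBABILITIES:
            value = min(value, 1.0)  # probabilities are capped at 100 percent
        outcome = icer({**BASE, name: value})
        if outcome is not None and (worst is None or outcome > worst):
            worst = outcome
    return worst


tornado = {name: one_way(name) for name in BASE}
for name, worst in sorted(tornado.items(), key=lambda kv: -(kv[1] or 0.0)):
    label = "dominant at both limits" if worst is None else f"${worst:,.0f}"
    print(f"{name:15s} {label}")
```

Sorting the variables by their largest incremental cost-effectiveness gives the top-to-bottom ordering of bars in a tornado diagram.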

Threshold Analysis

We also calculated the threshold value for each variable. A threshold is that point at which a change in any of the variables causes a change in the conclusions derived from the tree. In this case, the threshold is between a dominance situation, where the diagnosis and treatment of dysphagia would lower costs and improve outcomes, and a situation in which diagnosis and treatment of dysphagia would increase costs while still improving outcomes. In the latter situation, the individual making a decision on whether a dysphagia program is worthwhile must make his or her own judgment about what is a reasonable price to pay to avoid one case of aspiration pneumonia.

Table S-5 shows the threshold value for each variable, beyond which the BSE program arm becomes more costly than the no-program arm (the probabilities entered into the tree do not allow for the BSE program arm to be less effective than no-program).

These thresholds are of interest, because they demonstrate how much the probabilities and costs we used in the tree would have to change in order to reverse the conclusion that a dysphagia program saves money. However, it should be understood that considering the threshold to be where dominance ends (that is, where the dysphagia program begins to cost money rather than save) is not the most realistic way to carry out threshold analysis. Most consumers and policymakers would not require that a program save money to be a success, but would be willing to spend some practical amount to prevent pneumonia. In that case one should choose the cost of pneumonia prevention that is considered the practical limit and let that be the threshold. However, that practical limit may vary for different readers, and we have no basis for choosing such a limit; therefore, we confine ourselves to the dominance threshold. Because most readers would be willing to pay some amount to prevent pneumonia, using the dominance threshold means we are underestimating the amount these variables would have to change to reach any practical threshold. In other words, this is necessarily a very conservative threshold analysis.

The variable closest to a threshold is the probability of aspiration pneumonia for the no-program arm. When pneumonia without a program is less than the threshold of 7.5 percent, there will be so few cases of aspiration pneumonia compared with the number of dysphagia patients diagnosed and treated, that the savings from prevented pneumonia will not completely offset the costs of dysphagia diagnosis and treatment, and the program will begin to cost a small amount of money to prevent pneumonia. This has implications for care settings with a lower frequency of pneumonia than that found in stroke patients within the first 2 weeks of acute care. Rehabilitation centers and nursing homes are likely to have lower pneumonia frequencies than acute-care facilities. However, these latter facilities may not need to diagnose and treat the same proportion of patients as an acute-care facility (39 percent in our base case).

Another variable that is of interest is the cost of SLP therapy. This would need to increase by only 18 percent to begin to increase costs of a dysphagia program over those of not having such a program. This may be of little concern, because the increase would not be steep, and there may be important quality-of-life benefits for patients that would be worth a small increase in cost. This is further discussed below, in the section on two-way sensitivity analysis.

Two-Way Sensitivity Analysis

While a threshold analysis is useful for determining which variables may possibly affect the cost-effectiveness of a swallowing program, it only tests one variable at a time. Two-way sensitivity analysis allows testing of simultaneous changes in two variables.

The DATA software used to create the decision trees and perform one-way sensitivity analysis is not capable of two-way sensitivity analysis of incremental cost-effectiveness. Therefore, separate two-way analyses of cost and effectiveness were performed, with the results exported into text files. The text files were imported into a Microsoft Excel spreadsheet (Office 97 SR-1: Microsoft Corp., Redmond WA) developed by us. The spreadsheet calculates incremental cost and incremental effectiveness for each pair of values, then calculates incremental cost-effectiveness, and draws a three-dimensional graph of the results.
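The spreadsheet step described above (incremental cost and incremental effectiveness for each pair of values, then their ratio) can be sketched in Python. This is not the authors' spreadsheet: the formula is our assumed reading of the tree, and the function names are hypothetical. The example varies base pneumonia rate and RRR over a subset of the grid points of Table S-6.

```python
def icer(p_ap_no_prog, rrr, p_diag_treat=0.39, cost_exam=141.0,
         cost_therapy=242.0, cost_peg=416.0, etube=0.01, cost_pneumonia=2164.0):
    """Incremental cost per pneumonia case prevented, or None if dominant."""
    incr_effect = rrr * p_ap_no_prog  # proportion of patients spared pneumonia
    incr_cost = (p_diag_treat * (cost_exam + cost_therapy)
                 - etube * cost_peg            # assumption: avoided PEG tubes
                 - incr_effect * cost_pneumonia)
    return None if incr_cost <= 0 else incr_cost / incr_effect


def two_way_grid(rates, reductions):
    """Rows are RRR values; columns are base pneumonia rates."""
    return [[icer(p, r) for p in rates] for r in reductions]


rates = [0.074, 0.092, 0.110]    # first, middle, and last columns of Table S-6
reductions = [1.00, 0.85, 0.70]  # first, middle, and last rows of Table S-6
grid = two_way_grid(rates, reductions)

for r, row in zip(reductions, grid):
    cells = ["dominates" if c is None else f"${c:,.0f}" for c in row]
    print(f"{r:.0%}", *cells, sep="\t")
```

Returning `None` for dominated cells keeps the "dominates" entries distinct from small positive ratios when the grid is printed or plotted.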

The first two-way analysis examined the effects of variations in aspiration pneumonia rate in patients not undergoing the dysphagia program, and variations in risk reduction brought about by the program. These are the variables that determine pneumonia rates in the two groups. Figure S-6 and Table S-6 show the results of this analysis for the BSE dysphagia program tree. This analysis demonstrates why two-way analysis is valuable. Only when both of these variables are relatively unfavorable (back corner of graph) does the dysphagia program greatly increase net costs. If one variable is favorable and the other unfavorable (front edge of graph and right edge of graph), net cost is decreased by the dysphagia program. In other words, the dysphagia program is dominant over most of the range considered. Even at the least favorable combination considered (base pneumonia rate: 7.4 percent, risk reduction with dysphagia program 70 percent), the dysphagia program costs only $639 per pneumonia case prevented.

The next two-way sensitivity analysis was performed on the most important cost variables: the cost of the BSE and the cost of swallow therapy performed by the SLP. Results are shown in Figure S-7 and Table S-7. They show that the net cost per pneumonia case prevented increases greatly if the cost of swallow therapy is increased. But therapy by an SLP may cost little more than $200. The peak incremental cost-effectiveness is about $6,000 per pneumonia case prevented when the exam costs about $400 and therapy costs about $1,000.

The final two-dimensional analysis is useful for assessing the tradeoff between cost and effectiveness of the diagnostic test. The variables in the analysis are the cost of the BSE and pneumonia risk reduction (effect size) of the total dysphagia program. With this analysis, one can determine the incremental cost-effectiveness of a dysphagia program with a different test just by substituting the appropriate cost and pneumonia reduction values into Figure S-8 and Table S-8. The figures show that variations in both cost and risk reduction have a substantial effect on cost-effectiveness. The maximum incremental cost-effectiveness over the variable ranges included in the analysis (exam cost $400, effectiveness 70 percent RRR) is just over $2,000 per pneumonia case prevented.

Cost-Effectiveness of a VFSS Dysphagia Program Compared with No Program

Tree Construction

Next we altered the cost-effectiveness decision tree to model a program that used a VFSS (modified barium swallow) rather than a BSE for the SLP evaluation of patients suspected of dysphagia at a preliminary exam. In the main evidence report we found no reliable data on the extent of improvement in aspiration pneumonia reduction that might occur with the use of VFSS compared with BSE. Thus, for the base-case tree for a VFSS program, we were forced to use the same effectiveness probability as for the BSE program above; however, in the sensitivity analysis for this tree we examined various levels of improvement in pneumonia prevention for the VFSS compared with the BSE. We changed the cost of the exam to reflect the 1997 Medicare reimbursement for an SLP-administered VFSS ($218: 74230, cinema X-ray throat/esophagus; 92525, oral function evaluation; 1.5 SLP billable hours). For this tree, the variables in the no-program arm are the same as above. In addition, all the variables in the VFSS dysphagia program arm are also the same as above, except for the cost of exam. Therefore, the structure of the trees is the same as those for the BSE dysphagia program (see Figures S-1 through S-4).

Results

Table S-9 shows the results of the cost-effectiveness analysis for a VFSS dysphagia program. If the VFSS program were no more effective in reducing aspiration pneumonia frequency than a BSE program, the VFSS program would cost an additional $26 per stroke patient (net), or $380 per case of aspiration pneumonia prevented. Furthermore, because of the known ability of VFSS to detect more cases of aspiration than a BSE, there is the potential to offset some of this additional cost by preventing more cases of aspiration pneumonia. Because of a lack of data we do not know how much more aspiration pneumonia would be prevented by VFSS; however, with threshold analysis we can estimate how much improvement in prevention would be required to offset the additional cost.
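The base-case VFSS figures can be checked with a short Python sketch that changes only the exam cost. As before, the cost formula is our assumed reading of the tree rather than the formulas of Figure S-2, and the function name is hypothetical; all input values are those given in the text and in Tables S-1 and S-2.

```python
def arm_costs(cost_exam, rrr=0.84, p_ap_no_prog=0.082, p_diag_treat=0.39,
              cost_therapy=242.0, cost_peg=416.0, etube=0.01,
              cost_pneumonia=2164.0):
    """Return (no-program cost, program cost, cases prevented) per patient."""
    cost_none = p_ap_no_prog * cost_pneumonia
    p_ap_prog = (1.0 - rrr) * p_ap_no_prog
    cost_prog = (p_diag_treat * (cost_exam + cost_therapy)
                 - etube * cost_peg  # assumption: avoided PEG placements
                 + p_ap_prog * cost_pneumonia)
    return cost_none, cost_prog, p_ap_no_prog - p_ap_prog


COST_VFSS = 218.0  # 1997 Medicare reimbursement cited in the text
cost_none, cost_vfss, prevented = arm_costs(COST_VFSS)
net_per_patient = cost_vfss - cost_none
per_case_prevented = net_per_patient / prevented

print(f"VFSS arm: ${cost_vfss:.0f}; no-program arm: ${cost_none:.0f}")
print(f"Net cost: ${net_per_patient:.0f} per patient, "
      f"${per_case_prevented:.0f} per case prevented")
```

Dividing the net cost per patient by the proportion of patients spared pneumonia converts a per-patient cost into a cost per case prevented.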

Also, as in the above discussion on BSE threshold analysis, placing the threshold where dominance ends is an extremely conservative way to perform a threshold analysis. A well-designed clinical trial to determine the comparative effectiveness of the two dysphagia programs would be required to determine exactly how much improvement VFSS might provide in terms of pneumonia prevention.

Table S-10 shows the results of the sensitivity analysis of the VFSS program tree and Figure S-9 shows the tornado diagram. The tree is most sensitive to the risk of pneumonia in the no-program arm and the amount of RRR in the VFSS dysphagia program arm. The next most sensitive variable is the proportion of patients referred for VFSS diagnosis and treatment. The tree is only moderately sensitive to any of the costs.

Table S-11 shows the thresholds for the variables in the VFSS program tree. The base-case reduction in relative risk for aspiration pneumonia used in the tree was 84 percent. The threshold analysis indicated that this would need to increase to 91 percent to make the VFSS program dominant in cost-effectiveness (both more effective and less expensive than no program). This would be an 8.3 percent proportional increase in effect in pneumonia prevention. Considering that some cases of aspiration are missed by a BSE and are detected by VFSS, it seems possible that a VFSS program could achieve this much improvement over a BSE dysphagia program.

Two-Way Sensitivity Analysis

Our first two-way analysis examined the effects of aspiration pneumonia rate in patients not undergoing the dysphagia program and risk reduction brought about by the program. These are the variables that determine pneumonia rates in the two groups. Figure S-10 and Table S-12 show this analysis for the VFSS tree. Only when both of these variables are relatively unfavorable (back corner of graph) does the dysphagia program increase costs. If one is favorable and the other unfavorable (front edge of graph and right edge of graph), net cost is decreased by the dysphagia program. In other words, the dysphagia program is dominant over most of the range considered. At the least favorable combination considered (base pneumonia rate 7.4 percent, risk reduction with dysphagia program 70 percent), the dysphagia program costs $1,219 per pneumonia case prevented.

The next two-way sensitivity analysis was performed on the most important cost variables: the cost of the VFSS and the cost of swallow therapy performed by the SLP. Results are shown in Figure S-11 and Table S-13. They show that net cost per pneumonia case prevented increases greatly if the cost of swallow therapy is increased. But therapy by an SLP may cost little more than $200. The peak incremental cost-effectiveness is about $6,000 per pneumonia case prevented when the cost of the exam is around $500 and the cost of therapy is around $1,000.

Our final two-way analysis is useful for assessing the tradeoff between cost and effectiveness of the diagnostic test. The variables in the analysis are cost of the VFSS and pneumonia risk reduction (effect size) of the total dysphagia program. With this analysis, one can determine the incremental cost-effectiveness of a dysphagia program with a different test just by substituting the appropriate cost and pneumonia reduction values into Figure S-12 and Table S-14. The figure shows that both cost and effect size have a substantial effect on cost-effectiveness. The maximum incremental cost-effectiveness is just under $3,000 per pneumonia case prevented, if the cost of the exam is around $500 and the reduction in relative risk is around 70 percent.

Supplemental Analysis Conclusions

The conclusions derived from this decision tree and cost-effectiveness analysis should be viewed as our best guess, given the available data. Many of the assumptions included in this model come from evidence of only moderate reliability. For example, the effectiveness data comes from historically controlled case series that could be confounded by changes in stroke patient management other than dysphagia management. There is extensive discussion of the reliability of these data in the main analysis. Nevertheless, taking these historically controlled case series results as crude estimates of the effectiveness of a BSE dysphagia program, and taking the present Medicare reimbursements as the costs, our cost-effectiveness analysis indicates that there would likely be little or no change in the net cost of managing stroke patients in an acute-care setting if a dysphagia program is implemented to reduce aspiration pneumonia. The costs of dysphagia diagnosis and treatment would be approximately balanced by the savings in aspiration pneumonia treatment. Because of limitations in available data, this limited outcome of pneumonia prevention must be taken as representative of the other important patient outcomes of dysphagia-related malnutrition, dehydration, and QOL that we were unable to include in the analysis. Dysphagia diagnosis and therapy costs relating to these have unavoidably been included here in our pneumonia analysis, while the monetary and QOL costs of these conditions and the savings for their prevention have not. Thus, the additional savings (monetary or QOL) from preventing or ameliorating these must be considered in addition to the savings from pneumonia prevention, even though we were unable to calculate this.

Our sensitivity and threshold analyses give some idea of how much our cost and effectiveness estimates would have to be changed to invalidate this result. If certain single variables were changed by about 10 percent, then a BSE dysphagia program would begin to increase per-patient costs, unless another variable changed to offset that cost (e.g., if fewer patients were referred for SLP BSE evaluation and therapy). Such a change in this latter variable would in fact be likely if the aspiration pneumonia frequency decreased because the patient population had less severe neurologic disease.

Assuming no improvement in pneumonia prevention, a VFSS dysphagia program would cost more than a BSE program because of the additional expense of the exam. All other variables being equal to the BSE program, a VFSS program would not be dominant over the no-program scenario, but would instead cost an additional $26 per stroke patient, or $380 per case of aspiration pneumonia prevented. However, the VFSS program would only have to improve aspiration pneumonia prevention by 8 percent proportionally over a BSE program to become dominant (less expensive and more effective) over a no-program situation. In light of the expected increase in detection of aspiration by VFSS, such a small improvement in pneumonia prevention is possible. VFSS is preferred by many clinicians, not only because of the improvement in aspiration detection, but also because of the additional information on structural and functional aspects of dysphagia that aid in the planning and management of diet and therapy. Our cost-effectiveness analysis indicates that VFSS would likely provide these additional advantages with little or no increase in costs.

[Supplemental Analysis Tables]

Table S-1. Probabilities and Costs for the No-program Arm
Item | Tree variable name | Probability or cost | Source
Proportion of stroke patients acquiring aspiration pneumonia | pAPNoProg | 0.082 | N-Weighted mean for Haerer and Smith, 1974; Young and Durant-Jones, 1990; Odderson, Keaton and McKenna, 1995; Barker and Mullooly, 1997
Cost of aspiration pneumonia as comorbidity for stroke patient | CostAspPneum | $2,164 | 1997 Medicare reimbursement: difference between stroke (DRG17) and stroke with comorbid condition (DRG16)
Table S-2. Probabilities and Costs for the BSE Dysphagia Program Arm
Item | Tree variable name | Probability or cost | Source
Reduction in relative risk | RRR | 0.84 | Nilsson, Ekberg, Olsson et al., 1998 evaluable patients; Odderson, Keaton, and McKenna, 1995 2nd year of program; Daniels, Brailey, Priestly et al., 1998: unweighted mean adjusted for 2 week LOS
Proportion of stroke patients who fail preliminary test and are referred to SLP for BSE and therapy | PDiag_Treat | 0.39 | Odderson, Keaton, and McKenna, 1995
Cost of SLP BSE | CostBSE | $141 | 1997 Medicare reimbursement
Cost of SLP therapy | CostSLPTherapy | $242 | 1997 Medicare reimbursement for 15 min. sessions with SLP on 6 days: 6 X ($28.96 [92526, oral function therapy] + 0.25 X $45.23 [15 min. of SLP billable hours])
Cost of PEG tube placement | CostPEG | $416 | 1997 Medicare reimbursement: $339 (43246, place gastrostomy tube) + $77 (74230, cinema X-ray throat/esophagus)
Decrease in proportion of stroke patients who receive a PEG tube, if dysphagia program is implemented | Etube | 0.01 | Odderson, Keaton, and McKenna, 1995
Cost of case of aspiration pneumonia | CostAspPneum | $2,164 | 1997 Medicare reimbursement: difference between stroke (DRG17) and stroke with comorbid condition (DRG16)
Table S-3. Results of the Comparison of a No-program Arm with a BSE Program Arm
Measure | No-program | BSE program
Effectiveness | 91.8% aspiration pneumonia-free patients | 98.7% aspiration pneumonia-free patients
Marginal Effectiveness | 6.9% of stroke patients had aspiration pneumonia prevented
Cost | $177 per stroke patient | $174 per stroke patient
Marginal Cost | $4 per stroke patient
Marginal Cost-effectiveness | Dominated by BSE program | $4 saved per stroke patient, 6.9% of stroke patients had aspiration pneumonia prevented
Table S-4. Results of Sensitivity Analysis - BSE
Variable | Base case | Lower limit (-25%) | Upper limit (+25%) | Maximum incremental CE
Base pneumonia risk | 8.2% | 6.15% | 10.25% | $647
Risk reduction with swallow program | 84% | 63% | 100% [1] | $647
Proportion diagnosed and treated | 39% | 29% | 49% | $500
Cost of pneumonia | $2,164 | $1,623 | $2,705 | $485
Cost of swallow therapy | $242 | $182 | $303 | $290
Cost of BSE | $141 | $106 | $177 | $142
Effect of swallow program on need for PEG tube | -1% | -0.75% | -1.25% | [2]
Cost of PEG tube | $416 | $312 | $520 | [2]
[1] Increased by less than 25%.
[2] Dysphagia program dominated no program at all tested values of the variable.
Table S-5. Thresholds for Variables in BSE Program versus No Program
Variable | Threshold | Values used in tree | Proportional change in tree value required to pass threshold
Reduction in relative risk | <0.74 | 0.84 | -12%
Probability of aspiration pneumonia for no-program | <0.075 | 0.082 | -11%
Proportion of stroke patients who fail preliminary test and are referred to SLP for BSE and therapy | >0.435 | 0.39 | +12%
Cost of case of aspiration pneumonia | <$1,935 | $2,164 | -11%
Cost of SLP therapy | >$286 | $242 | +18%
Cost of SLP BSE | >$185 | $141 | +31%
PEG tube placement absolute decrease | [1] | 1% |
Cost of PEG tube placement | [1] | $416 |
[1] No threshold found
Table S-6. Two-way Sensitivity Analysis for Base Pneumonia Rate and Risk Reduction with BSE Dysphagia Program
Incremental Cost-Effectiveness
P(AP)
RRR | 7.4% | 7.8% | 8.1% | 8.5% | 8.8% | 9.2% | 9.6% | 9.9% | 10.3% | 10.6% | 11.0%
100% | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
97% | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
94% | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
91% | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
88% | $66 | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
85% | $145 | $37 | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
82% | $229 | $118 | $17 | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
79% | $320 | $205 | $100 | $4 | dominates | dominates | dominates | dominates | dominates | dominates | dominates
76% | $418 | $298 | $189 | $89 | dominates | dominates | dominates | dominates | dominates | dominates | dominates
73% | $524 | $399 | $286 | $182 | $86 | dominates | dominates | dominates | dominates | dominates | dominates
70% | $639 | $509 | $391 | $282 | $183 | $91 | $6 | dominates | dominates | dominates | dominates
Table S-7. Two-way Sensitivity Analysis for Cost of Diagnostic Test and Cost of Swallow Therapy with BSE Dysphagia Program
Incremental Cost-Effectiveness
C(BSE)
C(therapy) | $50 | $85 | $120 | $155 | $190 | $225 | $260 | $295 | $330 | $365 | $400
$1,000 | $3,721 | $3,919 | $4,117 | $4,315 | $4,513 | $4,712 | $4,910 | $5,108 | $5,306 | $5,504 | $5,702
$910 | $3,211 | $3,409 | $3,607 | $3,806 | $4,004 | $4,202 | $4,400 | $4,598 | $4,797 | $4,995 | $5,193
$820 | $2,702 | $2,900 | $3,098 | $3,296 | $3,494 | $3,692 | $3,891 | $4,089 | $4,287 | $4,485 | $4,683
$730 | $2,192 | $2,390 | $2,588 | $2,786 | $2,985 | $3,183 | $3,381 | $3,579 | $3,777 | $3,976 | $4,174
$640 | $1,682 | $1,881 | $2,079 | $2,277 | $2,475 | $2,673 | $2,871 | $3,070 | $3,268 | $3,466 | $3,664
$550 | $1,173 | $1,371 | $1,569 | $1,767 | $1,966 | $2,164 | $2,362 | $2,560 | $2,758 | $2,956 | $3,155
$460 | $663 | $861 | $1,060 | $1,258 | $1,456 | $1,654 | $1,852 | $2,050 | $2,249 | $2,447 | $2,645
$370 | $154 | $352 | $550 | $748 | $946 | $1,145 | $1,343 | $1,541 | $1,739 | $1,937 | $2,135
$280 | dominates | dominates | $40 | $239 | $437 | $635 | $833 | $1,031 | $1,229 | $1,428 | $1,626
$190 | dominates | dominates | dominates | dominates | dominates | $125 | $324 | $522 | $720 | $918 | $1,116
$100 | dominates | dominates | dominates | dominates | dominates | dominates | dominates | $12 | $210 | $408 | $607
Table S-8. Two-way Sensitivity Analysis for Cost of Diagnostic Test and Risk Reduction with BSE Dysphagia Program
Incremental Cost-Effectiveness Ratio
Cost of Bedside Swallow Exam
RRR | $50 | $85 | $120 | $155 | $190 | $225 | $260 | $295 | $330 | $365 | $400
100% | dominates | dominates | dominates | dominates | dominates | $6 | $173 | $339 | $506 | $672 | $839
97% | dominates | dominates | dominates | dominates | dominates | $73 | $245 | $417 | $588 | $760 | $932
94% | dominates | dominates | dominates | dominates | dominates | $145 | $322 | $499 | $676 | $853 | $1,030
91% | dominates | dominates | dominates | dominates | $38 | $221 | $404 | $587 | $770 | $953 | $1,136
88% | dominates | dominates | dominates | dominates | $113 | $302 | $491 | $681 | $870 | $1,059 | $1,248
85% | dominates | dominates | dominates | dominates | $194 | $389 | $585 | $781 | $977 | $1,173 | $1,369
82% | dominates | dominates | dominates | $77 | $280 | $483 | $686 | $889 | $1,092 | $1,295 | $1,498
79% | dominates | dominates | dominates | $162 | $373 | $583 | $794 | $1,005 | $1,215 | $1,568 | $1,787
76% | dominates | dominates | $35 | $254 | $473 | $692 | $911 | $1,130 | $1,349 | $1,568 | $1,787
73% | dominates | dominates | $125 | $353 | $581 | $809 | $1,037 | $1,265 | $1,493 | $1,721 | $1,949
70% | dominates | dominates | $223 | $461 | $669 | $937 | $1,174 | $1,412 | $1,650 | $1,888 | $2,126
Table S-9. Results of the Comparison of a VFSS Program with a No-program Arm
Measure | No-program | VFSS program
Effectiveness | 91.8% aspiration pneumonia-free patients | 98.7% aspiration pneumonia-free patients
Marginal effectiveness | 6.9% of stroke patients had aspiration pneumonia prevented
Cost | $177 per stroke patient | $204 per stroke patient
Marginal cost | $26 per stroke patient
Marginal cost-effectiveness | $380 per case of aspiration pneumonia prevented
Table S-10. Results of Sensitivity Analysis
Variable | Base case | Lower limit (-25%) | Upper limit (+25%) | Maximum incremental CE
Base pneumonia risk | 8.2% | 6.15% | 10.25% | $1,228
Risk reduction with swallow program | 84% | 63% | 100% [1] | $1,228
Proportion diagnosed and treated | 39% | 29% | 49% | $1,048
Cost of pneumonia | $2,164 | $1,623 | $2,705 | $921
Cost of swallow therapy | $242 | $182 | $303 | $726
Cost of MBS | $218 | $164 | $273 | $692
Effect of swallow program on need for PEG tube | -1% | -0.75% | -1.25% | $395
Cost of PEG tube | $416 | $312 | $520 | $395
[1] Increased by less than 25%
Table S-11. Thresholds for Variables in VFSS Program versus No Program
Variable | Threshold | Values used in tree | Proportional change in tree value required to pass threshold
RRR | <0.91 | 0.84 | +8%
Probability of aspiration pneumonia for no program | <0.088 | 0.082 | +7%
Proportion of stroke patients who fail preliminary test and are referred to SLP for BSE and therapy | >0.36 | 0.39 | -7%
Cost of case of aspiration pneumonia | <$2,335 | $2,164 | +8%
Cost of SLP therapy | >$209 | $242 | -14%
Cost of SLP BSE | >$185 | $218 | -15%
PEG tube placement absolute decrease | [1] | 1% |
Cost of PEG tube placement | [1] | $416 |
[1] No threshold found
Table S-12. Two-way Sensitivity Analysis for Base Pneumonia Rate and Risk Reduction with VFSS Dysphagia Program
Incremental Cost-Effectiveness Ratio
P(AP)
RRR | 7.4% | 7.8% | 8.1% | 8.5% | 8.8% | 9.2% | 9.6% | 9.9% | 10.3% | 10.6% | 11.0%
100% | $204 | $94 | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
97% | $277 | $164 | $61 | dominates | dominates | dominates | dominates | dominates | dominates | dominates | dominates
94% | $355 | $238 | $132 | $34 | dominates | dominates | dominates | dominates | dominates | dominates | dominates
91% | $438 | $318 | $208 | $107 | $14 | dominates | dominates | dominates | dominates | dominates | dominates
88% | $527 | $402 | $288 | $184 | $89 | $1 | dominates | dominates | dominates | dominates | dominates
85% | $622 | $493 | $375 | $267 | $168 | $77 | dominates | dominates | dominates | dominates | dominates
82% | $724 | $590 | $468 | $356 | $254 | $159 | $71 | dominates | dominates | dominates | dominates
79% | $834 | $695 | $568 | $452 | $345 | $247 | $156 | $72 | dominates | dominates | dominates
76% | $952 | $807 | $676 | $555 | $444 | $342 | $248 | $160 | $79 | $3 | dominates
73% | $1,080 | $929 | $792 | $667 | $552 | $445 | $347 | $256 | $171 | $92 | $18
70% | $1,219 | $1,062 | $919 | $788 | $668 | $557 | $455 | $360 | $271 | $189 | $112
Table S-13. Two-way Sensitivity Analysis for Cost of Diagnostic Test and Cost of Swallow Therapy with VFSS Dysphagia Program
Incremental Cost-Effectiveness Ratio
C(MBS)
C(therapy) | $100 | $140 | $180 | $220 | $260 | $300 | $340 | $380 | $420 | $460 | $500
$1,000 | $4,004 | $4,230 | $4,457 | $4,683 | $4,910 | $5,136 | $5,363 | $5,589 | $5,816 | $6,042 | $6,269
$910 | $3,494 | $3,721 | $3,947 | $4,174 | $4,400 | $4,627 | $4,853 | $5,080 | $5,306 | $5,533 | $5,759
$820 | $2,985 | $3,211 | $3,438 | $3,664 | $3,891 | $4,117 | $4,344 | $4,570 | $4,797 | $5,023 | $5,249
$730 | $2,475 | $2,702 | $2,928 | $3,155 | $3,381 | $3,607 | $3,834 | $4,060 | $4,287 | $4,513 | $4,740
$640 | $1,966 | $2,192 | $2,418 | $2,645 | $2,871 | $3,098 | $3,324 | $3,551 | $3,777 | $4,004 | $4,230
$550 | $1,456 | $1,682 | $1,909 | $2,135 | $2,362 | $2,588 | $2,815 | $3,041 | $3,268 | $3,494 | $3,721
$460 | $946 | $1,173 | $1,399 | $1,626 | $1,852 | $2,079 | $2,305 | $2,532 | $2,758 | $2,985 | $3,211
$370 | $437 | $663 | $890 | $1,116 | $1,343 | $1,569 | $1,796 | $2,022 | $2,249 | $2,475 | $2,702
$280 | dominates | $154 | $380 | $607 | $833 | $1,060 | $1,286 | $1,513 | $1,739 | $1,966 | $2,192
$190 | dominates | dominates | dominates | $97 | $324 | $550 | $776 | $1,003 | $1,229 | $1,456 | $1,682
$100 | dominates | dominates | dominates | dominates | dominates | $40 | $267 | $493 | $720 | $946 | $1,173
Table S-14. Two-way Sensitivity Analysis for Cost of Diagnostic Test and Risk Reduction with VFSS Dysphagia Program
Incremental Cost-Effectiveness Ratio
Cost of MBS
RRR | $100 | $140 | $180 | $220 | $260 | $300 | $340 | $380 | $420 | $460 | $500
100% | dominates | dominates | dominates | dominates | $173 | $363 | $553 | $744 | $934 | $1,124 | $1,314
97% | dominates | dominates | dominates | $49 | $245 | $441 | $637 | $833 | $1,030 | $1,226 | $1,422
94% | dominates | dominates | dominates | $120 | $322 | $524 | $727 | $929 | $1,132 | $1,334 | $1,536
91% | dominates | dominates | dominates | $195 | $404 | $613 | $822 | $1,031 | $1,240 | $1,449 | $1,658
88% | dominates | dominates | $59 | $275 | $491 | $708 | $924 | $1,140 | $1,356 | $1,572 | $1,789
85% | dominates | dominates | $138 | $361 | $585 | $809 | $1,033 | $1,257 | $1,480 | $1,704 | $1,928
82% | dominates | dominates | $222 | $454 | $686 | $918 | $1,150 | $1,382 | $1,614 | $1,846 | $2,078
79% | dominates | $72 | $312 | $553 | $794 | $1,035 | $1,276 | $1,516 | $1,757 | $1,998 | $2,239
76% | dominates | $160 | $410 | $660 | $911 | $1,161 | $1,411 | $1,662 | $1,912 | $2,162 | $2,413
73% | dominates | $255 | $516 | $777 | $1,037 | $1,298 | $1,558 | $1,819 | $2,080 | $2,340 | $2,601
70% | $87 | $359 | $631 | $903 | $1,174 | $1,446 | $1,718 | $1,990 | $2,261 | $2,533 | $2,805

[Supplemental Analysis Figures]

Figure S-1. Cost-effectiveness Decision Tree for BSE

graphic element

Figure S-2. Cost-effectiveness Decision Tree for BSE Program Showing Variable Base-case Values and Terminal Node Payoffs

graphic element

Figure S-3. Cost-effectiveness Decision Tree for BSE Program Showing Expected Costs

graphic element

Figure S-4. Cost-effectiveness Decision Tree for BSE Program Showing Expected Pneumonia Cases

graphic element

Figure S-5. Tornado Diagram for BSE Dysphagia Program

graphic element

No bars are shown for effect of swallow program on PEG rate or for cost of PEG placement because the dysphagia program is dominant and no cost-effectiveness ratio can be calculated for these variables.

Figure S-6. Two-way Sensitivity Analysis for Base Pneumonia Rate and Risk Reduction with BSE Dysphagia Program

graphic element

Figure S-7. Two-way Sensitivity Analysis for Cost of Diagnostic Test and Cost of Swallow Therapy with BSE Dysphagia Program

graphic element

Figure S-8. Two-way Sensitivity Analysis for Cost of Diagnostic Test and Risk Reduction with BSE Dysphagia Program

graphic element

Figure S-9. Tornado Diagram for Dysphagia Program

graphic element

Figure S-10. Two-way Sensitivity Analysis for Base Pneumonia Rate and Risk Reduction with VFSS Dysphagia Program

graphic element

Figure S-11. Two-way Sensitivity Analysis for Cost of Diagnostic Test and Risk Reduction with VFSS Dysphagia Program

graphic element

Figure S-12. Two-way Sensitivity Analysis for Cost of Diagnostic Test and Risk Reduction with VFSS Dysphagia Program

graphic element

Supplemental Evidence Tables

Appendix A. Epidemiology

This section discusses the incidence and prevalence of several neurological disorders that result in dysphagia, as well as the diagnosed occurrence of dysphagia in general and within patient populations suffering from each of these neurological disorders. The purpose is to determine, using these rates in conjunction with vital statistics data, approximately how many people in the United States may be affected by dysphagia and its resulting complications and morbidities (found in the Burden of Illness section). This information allows us to determine which disease causes the most individuals to suffer from dysphagia and, in turn, which patients should receive the most diligent management.

We have limited the scope of our review of the epidemiological literature on neurological disorders to studies conducted in the United States because there are often cultural differences in the epidemiological rates of many diseases. We have also limited the evidence base to studies published within the past 10 years because the rates of many diseases have been found to change over time. Studies on the occurrence of dysphagia within groups of patients suffering from particular diseases were not limited to U.S. populations, nor to the past 10 years, because while the rates of the diseases themselves appear to vary by culture or time, there is no reason to believe that the symptomatology of these diseases would differ among cultures or change over time.

Epidemiological Methods

Table 7 in the Epidemiology section of this report shows a summary of the findings for the incidence and prevalence of various neurologic diseases, and the diagnosed occurrence of dysphagia within each of these diseases. Estimates are wide-ranging, and, when examining the literature, there is not always an obvious reason for these variations. Such differences can be created most readily by differences in populations (ethnic makeup, age-distribution, sex-ratio); as a result, if a study does not show age-, sex- and race-adjusted results, the findings will reflect only that population in which the study was performed, rather than the U.S. population as a whole. Many of the epidemiology studies considered here did indicate age- and sex-adjusted findings; however, few, if any, included a race-mix adjustment. This unfortunately causes limitations in data generalization from such respected disease epidemiology centers as the Mayo Clinic, which samples from a 96 percent Caucasian population.

The setting from which subjects are sampled can also affect results. A common method of disease surveillance involves review of hospital records. The limitations of this method are obvious: a hospital population does not necessarily represent the state of health in the population as a whole (usually leading to an overestimate of disease prevalence), and patient charts may include miscodings or omissions leading to miscounting of diseases. Door-to-door surveys suffer from some of the same problems. Very often, these community interviews will miss people who are in institutions, such as nursing homes or hospitals, thus underestimating the rates of diseases.

The method used to confirm the disease is another important consideration. In hospital record review, it is often not known whether the admitting diagnosis was confirmed on examination, or by what method. For example, a patient may be admitted with a diagnosis of acute stroke that is not confirmed on CT examination. The patient's record, however, may still include the DRG code for acute stroke. Any epidemiological study that does not impose diagnostic requirements for disease confirmation runs similar risks. Some studies on major neurological diseases identify patients through associations devoted to that disease, plus clinicians in the area who treat such diseases. Suspected cases are then examined by a study physician, and only included as a case if certain diagnostic criteria are met. Such diagnostic confirmation is the only way to ensure that counted cases are true cases; however, the case-finding method in this example may itself be flawed. When examining the epidemiology of a disease in a defined geographical area, many researchers survey local physicians and associations to locate possible cases. This requires the assumption that all cases would have contacted the association or the relevant specialist within that defined area. It therefore requires a very isolated population. Such places are rare in the United States, and there is always the possibility that a patient traveled to a major medical center in the nearest metropolitan area for treatment, or has not sought treatment at all.

If epidemiology is determined retrospectively (through case review), the accuracy (as mentioned above) and extensiveness of the records are important, especially in prevalence studies. When examining annual incidence, it may be fairly easy to contact patients and physicians to obtain information about recent health status. However, when prevalence is of interest, accurate records are needed that extend back several years. Many hospitals do not keep such extensive records, and their records are generally kept by admission DRG code rather than by patient; this allows for the possibility of double-counting some patients.

An alternative is offered by the Mayo Clinic, which has an extensive medical record linkage system that tracks cases by patient rather than by diagnosis or admission. This system tracks both inpatient and outpatient medical visits in the Olmsted County, Minnesota, area, and has records back to 1917. This makes it one of the most valuable and often-used tools for epidemiological research in the United States. Its only limitation is a largely white, upper-middle class population whose rates of disease may not be generalizable to the U.S. population as a whole.

The statistical measures are important as well. To compare the results from various studies, we must be able to evaluate exactly what each study is measuring. Thus, if a study appears to measure prevalence, it is important to determine whether prevalence was measured using a point prevalence method (i.e., all cases that exist on a particular prevalence date) or a range prevalence method (e.g., all cases during the years 1988-1996). These two methods are not easily comparable, and the latter may actually be more suited to determining average annual incidence. Similar problems occur when examining studies on annual incidence; some report an incidence per year, while others report an incidence per person-year. These methods also may not be validly comparable.
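
The distinction between these measures can be illustrated with a short sketch. The cohort below is entirely hypothetical: the case records, dates, and population size are invented for illustration and come from no study reviewed here. The sketch shows only how point prevalence, range prevalence, and incidence per person-year are computed differently from the same set of records.

```python
from datetime import date

# Hypothetical records: (onset date, end date or None if still a case).
# All values are invented for illustration only.
cases = [
    (date(1987, 3, 1), None),
    (date(1989, 6, 15), date(1991, 2, 1)),
    (date(1990, 1, 10), None),
    (date(1992, 9, 5), date(1993, 1, 20)),
]
population = 10_000  # hypothetical, assumed constant over the period


def point_prevalence(cases, day, population):
    """Cases existing on a single prevalence date, per 100,000 people."""
    n = sum(onset <= day and (end is None or end > day) for onset, end in cases)
    return n * 100_000 / population


def range_prevalence(cases, start, stop, population):
    """Cases existing at any time between start and stop, per 100,000 people."""
    n = sum(onset <= stop and (end is None or end > start) for onset, end in cases)
    return n * 100_000 / population


def incidence_per_person_year(cases, start, stop, population):
    """New cases between start and stop per 100,000 person-years.

    Crude simplification: the whole population is treated as at risk
    for the entire period.
    """
    new_cases = sum(start <= onset <= stop for onset, _ in cases)
    years = (stop - start).days / 365.25
    return new_cases * 100_000 / (population * years)


start, stop = date(1988, 1, 1), date(1992, 12, 31)
print(point_prevalence(cases, date(1990, 1, 1), population))
print(range_prevalence(cases, start, stop, population))
print(incidence_per_person_year(cases, start, stop, population))
```

With the same four hypothetical records, the range figure is twice the point figure, which illustrates why the two should not be pooled or compared directly.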

Given the possibilities of study limitations as discussed above, the best study or studies about each neurologic disorder will be described below, with explanations about why that particular study (or studies) is superior to the others located in the published literature.

Epidemiology of Dysphagia in the U.S. Population

Dysphagia can include any swallowing problem originating in the mouth, pharynx, or esophagus (see the section entitled Dysphagia Defined for further discussion). Because we are focusing on oropharyngeal swallowing disorders, we will parse out any statistics pertaining specifically to these disorders whenever possible when reviewing the epidemiology literature in this section.

Studies on the epidemiology of dysphagia in the United States (without limitation to specific disorders or diseases) are shown in Evidence Table 1. Again, it must be stressed that these studies do not report definitive statistics on the epidemiology of dysphagia; rather, they report on the diagnosed occurrence, which will vary depending on the diagnostic tool(s) used.

In surveys of hospital patients (a commonly used tool, but biased because the patients surveyed are not representative of the population-at-large), dysphagia has been diagnosed in 12.5 percent (Groher and Bukatman, 1986) to 16.6 percent (Layne, Losinski, Zenner et al., 1989) of people of all ages, fairly consistent findings. However, swallowing function may gradually deteriorate as a function of aging (Shaker and Lang, 1994), and it is also a symptom of several neurological disorders more common among the elderly than the rest of the population, including stroke, Alzheimer's disease, Parkinson's disease, and motor neuron disease (to be discussed below). Therefore, the prevalence of dysphagia should be higher in an elderly or nursing home population than in the general population.

However, research investigating this has yielded mixed results. Rates for dysphagia in the elderly have ranged from 14.3 percent (Baum and Bodner, 1983) to 34.1 percent (Layne, Losinski, Zenner et al., 1989). The lowest rate was a survey of community-dwelling elderly, while the highest was a survey of elderly in a nursing home unit of a hospital. Diagnostic methodology was similar. Thus, differences in this rate are probably largely accounted for by the functional level of the population. Because of this increased incidence in nursing home patients, and because of their general tendency to experience increased morbidity from debilitation, it is especially important to ensure that dysphagia is diagnosed and treated correctly in these patients.

Epidemiology of Dysphagia Originating from Neurologic Conditions

Table 21. Neurologic Causes of Oropharyngeal Dysphagia in the Elderly
Central nervous system cerebrovascular accidents/strokes
  • Anterior cortical stroke
  • Brainstem stroke
  • Wallenberg's syndrome
  • Pseudobulbar palsy
Parkinson's disease
Multiple sclerosis
Alzheimer's disease
Other neuromuscular disorders
  • Motor neuron disease (MND)
  • Polymyositis and dermatomyositis
  • Muscular dystrophies
  • Myasthenia gravis
  • Hypothyroidism or hyperthyroidism
  • Bulbar poliomyelitis
  • Peripheral neuropathy (secondary to diabetes mellitus)

A list of neurological causes of oropharyngeal dysphagia is shown in Table 21.

No information was found in a Medline search on the occurrence of dysphagia in bulbar poliomyelitis, muscular dystrophy, hyperthyroidism or hypothyroidism, or peripheral neuropathy, or on the overall prevalence of these disorders in the U.S. population. Therefore, we will focus on those relatively common disorders in the elderly for which there are data on both the overall incidence or prevalence of the disorder and the diagnosed occurrence of dysphagia in individuals with that disorder: stroke (cerebrovascular accident, or CVA), Parkinson's disease, Alzheimer's disease, amyotrophic lateral sclerosis (ALS), and a few other less common degenerative neurologic diseases. Neoplasms are also not considered in this report.

Epidemiology of Stroke in the United States

Six published studies were located that reported the annual incidence rate1 of CVA in the United States using various measurement techniques (Barker and Mullooly, 1997; Brown, Whisnant, Sicks et al., 1996; Kittner, White, Losonczy et al., 1990; McGovern, Burke, Sprafka et al., 1996; Modan and Wagener, 1992; Sacco, Boden-Albala, Gan et al., 1998). Prevalence rates cannot be examined for CVA, because it is an event rather than an ongoing disorder. The prevalence of cerebrovascular disease, on the other hand, can be examined; however, we are primarily interested in dysphagia that occurs shortly after an acute accident. Cerebrovascular disease overall was reported in the 1995 National Nursing Home Survey as the primary admitting diagnosis in 2.4 percent of elderly individuals (Dey, 1997). It is the third leading cause of death among individuals 65 and older (Department of Health & Human Services, 1997).

Evidence Table 2 shows six published studies on CVA incidence in the United States. Incidence rates for all people (all races, ages, and both sexes) were reported by two studies, and range from an estimate of 145 per 100,000 people per year (Brown, Whisnant, Sicks et al., 1996), to 289.7 per 100,000 people per year (Modan and Wagener, 1992). The lower rate comes from the Mayo Clinic records, and therefore records all strokes, whether hospitalized or not; this provides a more accurate reflection of incidence rates than a hospital study, which produced the higher estimate listed above. However, as mentioned before, since the Mayo Clinic represents a largely Caucasian population, this may skew results, as African-Americans and Hispanics have been found to demonstrate higher rates of stroke (Sacco, Boden-Albala, Gan et al., 1998).

A sex discrepancy, in which women suffered fewer strokes than men, was noted in all studies reporting these statistics. Within each of these studies, there is a consistent trend that the stroke risk ratio for women to men is approximately 1:1.5; this seems to indicate that men have a 50 percent higher risk of having a stroke than do women.

Table 22. Annual Incidence of Stroke per 100,000 Individuals Age 65 to 74 Years
Study | Women | Men
McGovern, Burke, Sprafka et al., 1992 | 514.9 (age 70-74) | 818.2 (age 70-74)
Brown, Whisnant, Sicks et al., 1996 | 524 | 885
Gillum and Ingram, 1996 | 1,172 to 1,805 (depending on geographic region) per 100,000 person-year at risk | 1,676 to 2,315 (depending on geographic region) per 100,000 person-year at risk
Barker and Mullooly, 1997 | 736 | 953

Age-specific annual incidence rates for elderly people aged 65 to 74 were available from four studies shown in Table 22. Rates among both men and women varied more than three-fold. Two studies (McGovern, Burke, Sprafka et al. 1992; Barker and Mullooly, 1997) were hospital records surveys; these two studies may not be accounting for strokes treated outside a hospital system.
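
As a rough check on the sex ratio described earlier, the men-to-women ratio can be computed directly from the Table 22 figures. This is a minimal sketch using only the three studies that report a single rate per sex; Gillum and Ingram (1996) is left out because it reports regional ranges per person-year at risk. The dictionary keys and function name are illustrative choices, not part of the original report.

```python
# Annual stroke incidence per 100,000, age 65 to 74, taken from Table 22.
# Gillum and Ingram (1996) is omitted: it reports regional ranges per
# person-year at risk rather than a single rate for each sex.
table_22 = {
    "McGovern, Burke, Sprafka et al. (age 70-74)": {"women": 514.9, "men": 818.2},
    "Brown, Whisnant, Sicks et al., 1996": {"women": 524.0, "men": 885.0},
    "Barker and Mullooly, 1997": {"women": 736.0, "men": 953.0},
}


def men_to_women_ratio(rates):
    """Ratio of the male incidence rate to the female incidence rate."""
    return rates["men"] / rates["women"]


ratios = {study: round(men_to_women_ratio(r), 2) for study, r in table_22.items()}
for study, ratio in ratios.items():
    print(f"{study}: men-to-women ratio {ratio}")
```

The three ratios fall between roughly 1.3 and 1.7, which brackets the approximate 1:1.5 women-to-men ratio noted for these studies.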

Table 23. Annual Incidence of Stroke per 100,000 Individuals Age 85 Years and Over
Study | Women | Men
Brown, Whisnant, Sicks et al., 1996 | 2,574 | 1,563
Barker and Mullooly, 1997 | 1,522 | 1,816

Incidence rates for individuals 85 years and older were available from two of these studies, and are shown in Table 23. Of particular interest is the Brown study, in which the rates for women were considerably higher than those for men (Brown, Whisnant, Sicks et al., 1996). Interestingly, these same researchers also found that the mean age of onset was 69 years for men and 77 years for women; the combination of these two findings might suggest that women simply have delayed onset of stroke compared with men rather than an actual lower rate.

To summarize, the annual incidence of stroke in the United States has been estimated to be between 145 and 290 per 100,000 individuals. Caucasians have a lower incidence than African-Americans and Hispanics, and women have lower incidence than men, except among the oldest elderly. Incidence increases with age: populations age 65 to 74 show incidence of 515 to 2,315 per 100,000; those age 85 and over range from 1,522 to 2,574 per 100,000. Rates vary depending on the study methodology (hospital survey versus medical records/physician survey) as well as population differences (age, sex, race, and place of residence). It therefore appears that both environmental and genetic influences affect risk for a CVA.

Epidemiology of dysphagia in stroke patients

The location and severity of a stroke lesion will determine its effects on swallowing function. Hemispheric and brainstem strokes are both associated with dysphagia; however, unilateral hemispheric strokes are less likely to cause problems because there is bilateral representation in the brainstem (Morrell, 1992). Also, posterior lesions that do not affect the motor cortex may not lead to dysphagia (Logemann, 1983c).

Brainstem and anterior cortical strokes most often lead to dysphagia characterized by a delay in triggering of the swallow reflex. Dysfunction of the cricopharyngeal muscle is also common (Logemann, 1983a; Morrell, 1992). In cortical strokes, lingual and pharyngeal paresis may be unilateral, thereby allowing for compensation by turning the head toward the paretic side while swallowing (Mendez, Friedman, and Castell, 1991).

Because dysphagia is so often a temporary result of an acute stroke, we have limited our discussion to those studies that specifically examined the occurrence of dysphagia shortly after an acute stroke event using the water swallow test, bedside examination (BSE), videofluoroscopic swallowing study/modified barium swallow (VFSS/MBS), or fiberoptic endoscopic examination of swallowing/fiberoptic endoscopic examination of swallowing and sensory test (FEES/FEESST). Any studies that did not define their diagnostic method, tested patients for dysphagia later than 7 days post-stroke, included transient ischemic attacks (TIAs), or did not test 100 percent of the consecutive stroke population for dysphagia were excluded. This eliminated 8 out of 19 studies identified (Berg and Mor, 1995; Chua and Kong, 1996; Gresham, 1990; Hamdy, Aziz, Rothwell et al., 1997; Robbins, Levine, Maser et al., 1993; Selley, Roche, Pearce et al., 1995; Taub, Wolfe, Richardson et al., 1994; Teasell, Bach, and McRae, 1994). Evidence Table 3 shows the study results from 13 worldwide studies of the diagnosed occurrence of dysphagia after a CVA.

Many of these studies were hospital surveys. This may be an advantage: such studies assess patients just after the acute CVA, when dysphagia is often most severe (Barer, 1989; Smithard, O'Neill, England et al., 1997; Smithard, O'Neill, Parks et al., 1996). This allows us to compare the occurrence of dysphagia across studies without the confound of time variations.

Of studies conducted exclusively in a hospital setting, the reported occurrence of dysphagia among stroke patients ranged from 19.4 percent (Nilsson, Ekberg, Olsson et al., 1998) to 90 percent (Kidd, Lawson, Nesbitt et al., 1993). Diagnosis in the Kidd study was made on the basis of videofluoroscopic study; this method often increases the number of positive tests because it detects silent aspiration as well as symptoms that are not detected by and may never affect the patient. This is supported by the fact that these same authors assessed the same 60 patients using a water swallow test, and only 25 patients (41.7 percent) were diagnosed dysphagic on the basis of coughing, choking, or wet voice. There is an overall trend in the hospital-based literature that diagnoses made using VFSS yield higher prevalence rates [65.5 percent (Daniels, Brailey, Priestly et al., 1998); 74.6 percent (Daniels, McAdam, Brailey et al., 1997); and 90 percent (Kidd, Lawson, Nesbitt et al., 1993)] than diagnoses made using other methodologies such as observation and BSE [19.4 percent (Nilsson, Ekberg, Olsson et al., 1998); 29.4 percent (Barer, 1989); 30 percent (Axelsson, Asplund, Norberg et al., 1989); 31.5 percent (Smithard, O'Neill, England et al., 1997); 41.7 percent (Kidd, Lawson, Nesbitt et al., 1993); 45.1 percent (Gordon, Hewer, and Wade, 1987); 49.6 percent (Smithard, O'Neill, Parks et al., 1996); 59.0 percent (DePippo, Holas, and Reding, 1994)].

One study did not limit its patient pool to hospitalized patients (Wade and Hewer, 1987). Wade and Hewer (1987) reviewed all records of stroke patients for whom evaluation of swallowing function (as measured by a water swallow test) was available within the first 7 days after the stroke event. They did not report overall swallowing disorders or dysphagia, but rather reported occurrence of individual symptoms: 14.2 percent choked, 6.2 percent displayed an abnormal swallowing pattern, and 22.6 percent demonstrated an abnormally slow swallow. It is impossible to determine the overall rate of swallowing disorders from this study because many patients may have shown more than one symptom; however, this study does provide a portrait of the typical symptoms experienced by a stroke patient.

The occurrence of dysphagia in a stroke population may differ depending on where the stroke occurred (left hemisphere, right hemisphere, both, or brainstem). It used to be thought that dysphagia occurred primarily after brainstem strokes, but more recent evidence suggests that it occurs at a similar rate after hemispheric strokes (Lugger, 1994). Of the studies discussed above, most of those that reported lesion location had a patient population made up largely of hemispheric stroke patients (85 to 100 percent of patient pool) (Barer, 1989; Daniels, Brailey, Priestly et al., 1998; Gordon, Hewer, and Wade, 1987; Kidd, Lawson, Nesbitt et al., 1993; Nilsson, Ekberg, Olsson et al., 1998). The reported occurrence of dysphagia in these patient populations ranged from 19.4 percent (Nilsson, Ekberg, Olsson et al., 1998) to 65.5 percent (Daniels, Brailey, Priestly et al., 1998). No studies exclusively examined a brainstem stroke population.

There is consistent evidence that dysphagia is a transient symptom for many stroke patients. Barer (1989) followed 357 hemispheric stroke patients admitted to a hospital within 48 hours of the event. Swallowing was assessed with a water swallow test. At day 1 assessment, 105 of 357 patients demonstrated swallowing impairment (29 percent). This decreased steadily until, at 6 months, only 1 out of 248 (0.4 percent) assessed still demonstrated swallowing dysfunction (Barer, 1989).

The rate of swallowing impairment was related to the age of the patients at initial assessment: while 23 percent of those patients under age 70 were swallowing impaired, 36 percent of those over age 70 were. This was a significant difference (p<0.01). This difference disappeared by 1 week, however. The authors suggest that this may be due to the high mortality rate of the elderly patients with dysphagia, leading to a decrease in statistical power of the findings (Barer, 1989).

Smithard et al. (1997) followed 121 consecutive stroke patients admitted to a hospital within 24 hours of a stroke. They followed the patients for up to 180 days. At admission, 61 (51 percent) demonstrated a compromised swallow using a BSE method of assessment. At day 180, 8 out of 73 assessed (11 percent) demonstrated dysphagia. Overall, 6 cases out of 61 original cases (9.8 percent) were known to persist to 6 months. Some cases developed after the initial assessment (24 total); all but 2 were transient (Smithard, O'Neill, England et al., 1997; Smithard, O'Neill, Parks et al., 1996).

In conclusion, dysphagia is a symptom experienced by many stroke patients just after the event, occurring in 20 to 90 percent of patients, depending on the diagnostic methodology. It is impossible to determine which of these studies provides the most accurate rate, because, as mentioned before, dysphagia is not a defined disease, but a cluster of symptoms that may not occur all the time and the diagnosis of which may be influenced by subjective judgment of the clinician. A median of similar studies provides the best overall estimate: hospital studies using videofluoroscopy had a median diagnosis rate of 74.6 percent (Daniels, McAdam, Brailey et al., 1997). Those using BSE or a water swallow test had a median of 41.7 percent (Kidd, Lawson, Nesbitt et al., 1993). The difference between these two estimates is accounted for both by silent aspiration and physiologic dysfunction that may not affect the patient's health. If it were found that all of the cases of dysphagia diagnosed with videofluoroscopy caused significant disability to the patient, then it might be warranted to test all acute stroke patients for dysphagia. However, there is no evidence that this is the case.
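
The medians quoted above can be checked with a short sketch. The two lists below are assumed to be the rates grouped by diagnostic method earlier in this section (three videofluoroscopy figures and eight bedside, observation, or water swallow figures); whether the authors used exactly these sets is an assumption, and the variable names are illustrative.

```python
from statistics import median, median_high

# Reported occurrence of dysphagia (percent) by diagnostic method, as
# listed earlier in this section. Assumed here to be the sets behind
# the medians quoted in the text.
vfss_rates = [65.5, 74.6, 90.0]
bedside_rates = [19.4, 29.4, 30.0, 31.5, 41.7, 45.1, 49.6, 59.0]

# Odd number of values: the median is the middle value.
print("VFSS median:", median(vfss_rates))

# Even number of values: the conventional median averages the two
# middle values, while median_high returns the upper of the two.
print("Bedside median:", median(bedside_rates))
print("Bedside upper-middle value:", median_high(bedside_rates))
```

Under these assumptions, the videofluoroscopy median is the middle value of 74.6 percent. The bedside list has an even number of entries, so the conventional median averages the two middle values (31.5 and 41.7), whereas the 41.7 percent figure cited in the text matches the upper of those two values.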

Limited evidence suggests that dysphagia occurs more often in older stroke victims. It appears that hemispheric strokes are commonly characterized by dysphagia, despite commonly held beliefs that dysphagia usually occurs only after a brainstem stroke. Dysphagia is most often a transient symptom of acute stroke, with a large majority of cases spontaneously recovering within 6 months.

Epidemiology of Parkinson's Disease in the United States

Parkinson's disease was reported as the admitting diagnosis in 2.1 percent of all elderly admitted to nursing homes in the 1985 National Nursing Home Survey conducted by the U.S. Department of Health and Human Services (Hing, Sekscenski, and Strahan, 1989). Three other studies on prevalence rates of Parkinson's disease in the United States were found in the medical literature and are shown in Evidence Table 4.

The methods of these three studies are fairly distinct from one another. Lilienfeld et al. (1990), who reported the lowest rates, reviewed acute-care hospital discharges over a 5-year period (excluding individuals over 75 years of age) (Lilienfeld, Sprafka, Pham et al., 1991). Mayeux et al. (1995) established a community registry by polling health maintenance organizations, hospitals, and nursing homes; they interviewed everyone (except those over 75 years old) suspected of having Parkinson's disease for confirmation (Mayeux, Marder, Cote et al., 1995). Mitchell et al. (1996), who reported the highest rates, focused solely on nursing home residents, examining ICD-9 diagnoses (Mitchell, Kiely, Kiel et al., 1996). Prevalence rates across these studies ranged from 45 per 100,000 (Lilienfeld, Sprafka, Pham et al., 1991) to 2,250 per 100,000 (Mitchell, Kiely, Kiel et al., 1996) when only primary Parkinson's disease is considered (Parkinson's disease not secondary to any other disorder). It is clear from these studies that the elderly experience more Parkinson's disease, and those in nursing homes have an even higher rate. One study (Lilienfeld, Sprafka, Pham et al., 1991) reported that women have a lower rate than men (45 versus 65 per 100,000).

Epidemiology of dysphagia in Parkinson's disease

Seven studies were identified that analyzed the diagnosed occurrence of dysphagia or swallowing problems in patients with Parkinson's disease (see Evidence Table 5 for details on study characteristics and results). Study designs of particular interest were those that withheld levodopa therapy (L-dopa) during the day of dysphagia testing, and those that reported disease severity on widely used scales (such as Hoehn and Yahr). Estimates of the occurrence of dysphagia in patients with Parkinson's disease range from 23 to 81 percent. These rates may be affected by duration and severity of disease, whether patients were taking medication at the time of testing, and the method of measuring dysphagia. Most of these researchers examined patients with mild to moderate Parkinson's disease (stage 1 to 3 out of 5 on the Hoehn and Yahr scale). Three studies reported on the relationship between severity of illness and severity of dysphagia; two studies (Fuh, Lee, Wang et al. 1997; Wintzen, Badrising, Roos et al. 1994) reported no correlation when severity of disease was measured using the Hoehn and Yahr scale. One study (Coates and Bakheit, 1997) reported a strong correlation between both severity and length of illness and severity of dysphagia, when the Unified Parkinson's Disease Rating Scale (UPDRS) was used to assess disease severity and the Chicago Assessment Scale was used to assess level of swallowing dysfunction.

Those researchers who measured dysphagia using videofluoroscopy (VFSS or MBS) (three studies) found rates of dysphagia varying from 63.2 to 81.1 percent (Bushmann, Dobmeyer, Leeker et al., 1989; Fuh, Lee, Wang et al., 1997; Wintzen, Badrising, Roos et al., 1994). Questionnaires (3 studies) yielded rates from 22.9 to 76.9 percent (Coates and Bakheit, 1997; Hartelius and Svensson, 1994; Singer, Weiner, and Sanchez-Ramos, 1992). Most researchers reported rates in the 60 to 80 percent range. The mean age and duration of illness seem comparable in most of these studies and thus probably do not affect the variation of the rates. However, whether the patients were receiving L-dopa at the time of the study may have affected results. It is unclear whether severity of swallow dysfunction correlates with severity of Parkinson's disease.

Because L-dopa appears to affect swallow function in some patients (Bushmann, Dobmeyer, Leeker et al., 1989; Fuh, Lee, Wang et al., 1997), it appears that the best study would test patients while not taking the medication. Two studies did this (Bushmann, Dobmeyer, Leeker et al., 1989; Fuh, Lee, Wang et al., 1997), and found rates of 75 percent and 63.2 percent in groups of 20 and 19 patients, respectively. Both used MBS. There is no apparent advantage to either of these studies, so the mean will be used in the calculation of burden of illness, 69.1 percent.
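
The averaging step can be reproduced in a few lines. This sketch computes the unweighted mean of the two study rates that the report uses, and, purely for comparison, a sample-size-weighted mean. The weighted figure is not the report's method; it is shown only as an assumption-labeled alternative, and the field names are illustrative.

```python
# Rates and group sizes as given in the text for the two studies that
# tested patients while off L-dopa.
studies = [
    {"name": "Bushmann, Dobmeyer, Leeker et al., 1989", "n": 20, "rate_pct": 75.0},
    {"name": "Fuh, Lee, Wang et al., 1997", "n": 19, "rate_pct": 63.2},
]

# Unweighted mean of the two rates, as used in the report.
simple_mean = sum(s["rate_pct"] for s in studies) / len(studies)

# Alternative (NOT the report's method): weight each rate by its
# sample size.
total_n = sum(s["n"] for s in studies)
weighted_mean = sum(s["rate_pct"] * s["n"] for s in studies) / total_n

print(f"Unweighted mean: {simple_mean:.1f} percent")
print(f"Sample-size-weighted mean: {weighted_mean:.2f} percent")
```

Because the two groups are nearly the same size, the two approaches differ by only a fraction of a percentage point.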

Epidemiology of Alzheimer's Disease

Patients with Alzheimer's disease and other degenerative brain disorders made up 2.7 percent of all nursing home admissions in the 1985 National Nursing Home Survey (Hing, Sekscenski, and Strahan, 1989). (This was classified separately from senile dementia.) Two other epidemiological studies were found for the United States: one that looked at the population in general through review of clinical records, and one that focused on the African-American community in a cross-sectional interview study. Beard et al. (1991), using records from the Mayo Clinic record linkage system, identified all medical cases of Alzheimer's disease on prevalence date January 1, 1980. They reported age- and sex-adjusted prevalence rates of 259.8 per 100,000 overall and 2,634.1 per 100,000 people age 65 and older. Age-specific information indicated that this rate rose from 236 per 100,000 in those age 65 to 69 to 18,018 per 100,000 in those age 85 and older (Beard, Kokmen, Offord et al., 1991). The Mayo Clinic serves a 96 percent Caucasian community; however, this was the only study on the overall population, and therefore it is the study we will use for calculation of burden of illness.

Epidemiology of dysphagia in Alzheimer's disease patients

One study was located that examined the prevalence of swallowing disorders in Alzheimer's patients (see Evidence Table 7 for details). Horner et al. (1994), in a prospective case series, examined 25 patients age 55 and over with moderate or severe Alzheimer's disease using videofluoroscopy. Twenty-one out of 25 patients (84 percent) showed some sort of swallowing abnormality on videofluoroscopy. Six of these patients with abnormalities showed evidence of aspiration (24 percent of all patients). The most prevalent abnormality was delayed reflex initiation, followed by hesitancy of oral preparation and deficient pharyngeal clearance (Horner, Alberts, Dawson et al., 1994). This study suggests that an overwhelming majority of Alzheimer's patients experience swallowing problems; however, it is only one study and included only a small patient pool. No evidence is available about the occurrence of dysphagia in patients with mild Alzheimer's disease; it is therefore currently not possible to determine if there is a correlation between severity of Alzheimer's disease and severity of dysphagia.

Epidemiology of Dysphagia in Other Degenerative Neurologic Diseases

Several other less common neurologic diseases contribute to the prevalence of dysphagia in the elderly population. Discussed here are only those for which we could find published evidence that dysphagia is a common symptom: multiple sclerosis (MS), progressive supranuclear palsy (PSP), motor neuron disease (MND) including ALS (also known as Lou Gehrig's disease), and Huntington's disease. Those epidemiological studies examining the incidence or prevalence of these neurological disorders in the U.S. population are shown in Evidence Table 8; those studies examining the occurrence of dysphagia within each of these disorders are shown in Evidence Table 9.

Multiple sclerosis

The onset of MS usually occurs between the second and fifth decades (McFarlin and McFarland, 1982a) but because of its slow, insidious progression over several years, many elderly also suffer from it. It is characterized by the idiopathic demyelination of the white matter of the central nervous system. When certain cranial nerves are affected by this demyelination, dysphagia will occur.

MS was recorded in 0.6 percent (600 per 100,000) of all nursing home admissions documented by the 1985 National Nursing Home Survey (Hing, Sekscenski, and Strahan, 1989). Three other studies were identified that examined prevalence of MS in the United States. Two of these studies were investigations of possible cluster areas of MS (Helmick, Wrigley, Zack et al., 1989; Hopkins, Indian, Pinnow et al., 1991), and therefore are not appropriate for determining a U.S. generalizable burden of illness.

In Olmsted County, Minnesota, Wynn et al. (1990) examined patient records of the Mayo Clinic records linkage system and found a sex- and age-adjusted prevalence of 170.8 per 100,000 on prevalence date January 1, 1985. This rate was significantly higher for women (231.9) than for men (79.8). Of those age 65 and older, the overall prevalence was 265.7 per 100,000, again higher for women (369.1) than for men (89.7) (Wynn, Rodriguez, O'Fallon et al., 1990). Even though the Mayo Clinic records are very thorough, these figures may not be generalizable to the entire population because of a latitudinal gradient of MS prevalence that has been observed in North America (McFarlin and McFarland, 1982b), in which more northern areas have higher rates.

We identified one study that examined the diagnosed occurrence of swallowing disorders in individuals with MS. Hartelius and Svensson (1994) sent a questionnaire to members of the Multiple Sclerosis section of the Association for the Neurologically Disabled in Sweden. A total of 278 questionnaires were sent and 203 were returned (response rate of 73 percent). Of respondents, 68 percent were female, consistent with the observation that women are at higher risk of MS than men (Wynn, Rodriguez, O'Fallon et al., 1990); 33 percent reported that mastication and swallowing were more difficult than prior to disease onset; 13 percent reported having difficulty swallowing liquid at least fairly often; 16 percent reported problems swallowing solids at least fairly often; and 27 percent reported choking on food or drink at least fairly often (Hartelius and Svensson, 1994). An overall prevalence rate of dysphagia could not be determined given the self-report nature of this study. The occurrence of dysphagia reported in this study may be unrealistically high because of self-selection bias.

Motor neuron disease

MND is a family of neurodegenerative diseases in which there is a selective progressive depletion of motor neurons (Kirshner, 1989). There are three common subtypes of MND: ALS, the most common type, involves both the upper and lower motor neurons of the pyramidal tracts; progressive muscular atrophy (PMA) is characterized by motor neuron depletion in the anterior horn of the spinal cord; and bulbar palsy (BP) mainly affects the lower cranial nerve nuclei (Leighton, Burton, Lund et al., 1994). Progression of this disease is rapid and therefore life expectancy is generally short (less than 5 years).

The epidemiology of motor neuron disease in the United States was investigated in one published study, and ALS specifically investigated in two others. Lilienfeld et al. (1991) examined hospital discharge data for a 5-year period in a large metropolitan area; the average annual age-adjusted incidence of MND was estimated at 6 per 100,000 in 1984, the most recent year covered. For individuals age 65 to 74, men showed an incidence of 25 per 100,000, and women, 18 per 100,000 (Lilienfeld, Sprafka, Pham et al., 1991). The prevalence rate of MND was not calculated. These incidence rate estimates may be low because of the focus on hospital discharges.

Two studies looking specifically at ALS reported annual incidence rates of 1.14 (Annegers, Appel, Lee et al., 1991) and 1.8 (McGuire, Longstreth, Koepsell et al., 1996) per 100,000. In both studies, women appeared to have a slightly lower incidence than men. Incidence rates were higher among the elderly; Annegers et al. (1991) reported that individuals age 75 and over had a rate of 4.01 per 100,000; McGuire et al. (1996) reported higher rates, at 7.18 per 100,000 men, and 5.54 per 100,000 women. The Annegers study results came from surveys of neurologists, many of whom refused participation; therefore, these rates may be unrealistically low.

We identified two published studies that investigated the diagnosed occurrence of swallowing disorders in MND. Reported dysphagia rates were 51.2 percent (Leighton, Burton, Lund et al., 1994) and 71 percent (Mayberry and Atkinson, 1986). The dysphagia rate for ALS reported by Leighton et al. was 29 percent; this study is likely superior to that of Mayberry and Atkinson because it does not suffer from self-selection bias as a result of survey methods.

Progressive supranuclear palsy

PSP is often mistaken for Parkinson's disease because of its Parkinsonian symptom characteristics. It has a late middle-age onset, and is characterized by a high density of neurofibrillary tangles and neuropil threads in the basal ganglia and brainstem (Litvan, Agid, Calne et al., 1996). Symptoms include bradykinesia and rigidity, severe gait difficulty, and pseudobulbar signs. It progresses more rapidly than Parkinson's disease (Golbe, Davis, Schoenberg et al., 1988) and does not generally respond to L-dopa treatment. Onset of these symptoms has been reported to begin at a mean age between 55 and 70 years (Litvan, Agid, Calne et al., 1996).

Two studies were located that examined incidence or prevalence of PSP; Golbe et al. (1988), in a survey and exam, determined a prevalence rate of 1.39 per 100,000 (Golbe, Davis, Schoenberg et al., 1988). Bower et al. (1997), using the Mayo Clinic records linkage system, calculated an annual incidence rate of 1.1 per 100,000; this rose to 5.3 per 100,000 for those age 50 and older (Bower, Maraganore, McDonnell et al., 1997). These are the only two studies we found in the published literature on PSP, and they are not comparable given that one study calculated prevalence and one calculated incidence.

The most frequent cause of death in PSP is pneumonia (Litvan, Agid, Calne et al., 1996); therefore, determination of the occurrence and characteristics of swallowing problems is important. Litvan et al. (1997) assessed 27 patients with PSP using a swallowing questionnaire and ultrasound imaging of swallowing. Results indicated that all PSP patients had at least one swallowing complaint; 15 (55.6 percent) complained of difficulty swallowing: 5 had more difficulty swallowing solids than liquids, and 10 had more difficulty with liquids than solids. Five (19 percent) demonstrated aspiration (Litvan, Sastry, and Sonies, 1997). This was the only published study identified that investigated swallowing problems in PSP patients.

Huntington's disease

Huntington's disease is a hereditary neurodegenerative disease affecting the basal ganglia, causing movement disorders, dementia, and emotional impairment (Kagel and Leopold, 1992; Leopold and Kagel, 1985). Dysphagia is very common in these patients, characterized by lack of control of the rate of food intake, difficulty with liquids, bolus retention in the buccal recesses, and choking. Many of these problems are side effects of cognitive impairment rather than directly the result of neurologic or neuromuscular dysfunction, per se (Leopold and Kagel, 1985).

One study was located that examined the prevalence and incidence of Huntington's disease in a U.S. population. Kokmen et al. (1994), at the Mayo Clinic, examined records from 1950 to 1990 and identified 10 definite cases of Huntington's disease. The prevalence on January 1, 1990 was 1.9 per 100,000 overall (1.8 for women and 2.0 for men). The average annual incidence rate between 1970 and 1989 was 0.2 per 100,000 (0.3 for women and 0.1 for men) (Kokmen, Ozekmekci, Beard et al., 1994).

One study examined the diagnosed occurrence of dysphagia in Huntington's disease; Kagel and Leopold (1992) examined 35 patients with Huntington's disease (mean age 45.5) using a clinical questionnaire, assessment of feeding, and videofluoroscopic exam. Twenty-seven patients (77 percent) coughed or choked on both solids and liquids; six were symptomatic with food only, and two with liquid only. Thus, 100 percent of patients with Huntington's disease showed dysphagia of one kind or another. In the hyperkinetic variant of Huntington's disease (30 of 35 patients), 29 (96.7 percent) demonstrated oral dysphagia, 27 (90 percent) pharyngeal dysphagia, and 11 (36.7 percent) esophageal dysphagia during clinical assessment. All five patients with the hypokinetic variant (100 percent) demonstrated oral and pharyngeal swallowing dysfunction; none demonstrated esophageal dysphagia (Kagel and Leopold, 1992). The method of selecting these patients was not described; therefore, it is possible that the researchers specifically selected patients with dysphagia for study.

Summary

Dysphagia is quite commonly diagnosed in all of these neurological patients. Estimates have been very high for stroke, Parkinson's disease, Alzheimer's disease, and Huntington's disease; however, fairly low estimates have also been reported for those diseases on which there was more than one study published. These variations indicate interstudy differences in diagnostic criteria and methodology. It is therefore impossible to determine if dysphagia is prevalent enough in any one of these diseases that 100 percent of these patients should be tested for dysphagia; a determination whether or not to do this would be based on clinical judgment. It must also be pointed out that dysphagia, per se, is not a risk factor for pneumonia; it is the aspiration of certain substances that leads to pneumonia. Dysphagia, on the other hand, may put a patient at risk for malnutrition if the patient is unable to safely swallow adequate amounts of food.

Appendix B. Burden of Illness of Dysphagia and Its Complications in Neurologic Diseases

This section provides an in-depth discussion of the number of people in the United States affected by neurogenic dysphagia and resulting complications and death. This subject was touched upon briefly in the Burden of Illness section, and a summary of the figures is shown in Tables 8 and 9. We must stress that these numbers are rough estimates, as they are taken from several different small clinical trials. Often, only one number was available, and therefore used by default, not because it was necessarily an accurate number. A determination of the number of people affected in the United States is done using vital statistics data provided by the U.S. government.

Stroke

The rate of stroke in the United States cannot be definitively determined from the literature, as estimates have ranged widely. For the calculation of burden of illness, therefore, the most reasonable way to proceed is to offer a probable low estimate and high estimate. Brown et al. (1996) at the Mayo Clinic estimated the annual incidence of stroke to be 145 per 100,000. This included both hospitalized and nonhospitalized cases (Brown, Whisnant, Sicks et al., 1996). They made several adjustments to avoid double counting cases and to ensure that only new incident cases were counted and prevalent cases were excluded. In contrast, Modan and Wagener (1992), using the National Hospital Discharge Survey, reported an annual incidence of 290 per 100,000. Their methodology has a strong chance of double counting and including prevalent cases, even though the researchers limited their survey to acute-care facilities and excluded chronic and rehabilitative care. However, the population surveyed by Brown et al. was predominantly white, a group with a lower stroke rate than other races, and therefore may not be representative of the entire U.S. population.

As of 1996, there were slightly more than 265 million people in the United States (U.S. Department of Commerce, 1997). Using these rates to model low and high possibilities, we calculated that 384,662 to 768,528 Americans suffer a stroke each year.
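The low and high estimates above can be reproduced with a short calculation. The sketch below is illustrative only: the function name `annual_cases` is hypothetical, and the population figure of 265,284,000 is the one listed in Table 13 of this appendix. Using the rounded rate of 290 per 100,000 gives a slightly larger count than the 768,528 reported, which suggests (as an inference, not a statement from the source studies) that an unrounded rate of roughly 289.7 was used.

```python
# Population figure as listed in Table 13 (U.S. Department of Commerce, 1997).
US_POPULATION_1996 = 265_284_000


def annual_cases(rate_per_100k, population=US_POPULATION_1996):
    """Expected new cases per year for an incidence rate given per 100,000."""
    return round(rate_per_100k * population / 100_000)


low_estimate = annual_cases(145)   # rate from Brown et al. (1996)
high_estimate = annual_cases(290)  # rate from Modan and Wagener (1992), as rounded in the text

# Assumption: back-solve the rate implied by the reported high estimate of 768,528.
implied_high_rate = 768_528 / US_POPULATION_1996 * 100_000

print(low_estimate, high_estimate, round(implied_high_rate, 1))
```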

Of those 65 and older, the annual incidence rate of stroke has been estimated to be 953 per 100,000 men and 736 per 100,000 women in a health care plan in Oregon (Barker and Mullooly, 1997); this was the only study to report incidence rates in the elderly group as a whole. In 1996, there were approximately 14 million men over 65, and 20 million women [Centers for Disease Control and Prevention (CDC), 1998]. Thus, the total burden of illness for stroke in the U.S. elderly population is approximately 279,338 individuals. This makes up 36.3 to 72.6 percent of all strokes in the population from the calculations above. Elderly patients normally make up the majority of all stroke incidents; three studies report that the proportion of stroke patients who are elderly ranges from 59 to 79 percent (Broderick, Phillips, Whisnant et al., 1989; Brown, Whisnant, Sicks et al., 1996; Taub, Wolfe, Richardson et al., 1994). This indicates that either the total stroke incidence rate reported by Modan and Wagener is too high, or the rate for the elderly reported by Barker and Mullooly is too low; one factor may be that Barker and Mullooly allowed fewer ICD-9 stroke codes for inclusion than did Modan and Wagener, but which of these methods is the more accurate is impossible to determine.
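The elderly estimate is a sex-stratified sum, and its share of all strokes follows by division. The sketch below uses the rounded populations quoted in the text (14 and 20 million), so it is an approximation: it gives about 280,620 rather than the reported 279,338, which presumably reflects unrounded population counts. Variable names are illustrative.

```python
# Rounded 1996 populations over age 65, as quoted in the text (assumption: the
# report itself used unrounded counts, so the total here differs slightly).
MEN_65_PLUS = 14_000_000
WOMEN_65_PLUS = 20_000_000

# Incidence per 100,000 from Barker and Mullooly (1997).
RATE_MEN = 953
RATE_WOMEN = 736

elderly_strokes_approx = (RATE_MEN * MEN_65_PLUS + RATE_WOMEN * WOMEN_65_PLUS) // 100_000

# Share of all strokes, using the figures reported in the text.
ELDERLY_STROKES_REPORTED = 279_338
ALL_STROKES_LOW, ALL_STROKES_HIGH = 384_662, 768_528

share_of_high = ELDERLY_STROKES_REPORTED / ALL_STROKES_HIGH  # smaller share
share_of_low = ELDERLY_STROKES_REPORTED / ALL_STROKES_LOW    # larger share

print(elderly_strokes_approx, f"{share_of_high:.1%}", f"{share_of_low:.1%}")
```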

Not enough published data are available to calculate the number of nursing home patients affected by an acute stroke event each year.

Dysphagia

CDC indicates that there are almost 44 million people in the United States over the age of 60 [Centers for Disease Control and Prevention (CDC), 1998]. Of those, one study (the only study reporting this statistic) estimated that approximately 14.2 percent suffer from dysphagia (Baum and Bodner, 1983). Thus, the total elderly population affected by dysphagia is approximately 6,228,116.

There were not enough data to calculate the rate of dysphagia in a nursing home population specifically.

Dysphagia in Stroke

The annual incident stroke population, as calculated above, constitutes 385,000 to 769,000 people each year, depending on which study estimate is used. The rate of dysphagia within stroke has been calculated in various ways; for the purposes of this discussion, we will examine the occurrence of dysphagia as measured by videofluoroscopic swallowing study (VFSS) and bedside examination (BSE) separately. In general, more patients are diagnosed dysphagic when a VFSS is used than when a BSE is used.

Three studies examined the rate of dysphagia in hospitalized stroke patients shortly after the event (within 5 days) using VFSS (see Evidence Table 3) (Daniels, Brailey, Priestly et al., 1998; Daniels, McAdam, Brailey et al., 1997; Kidd, Lawson, Nesbitt et al., 1993). Given that none of these studies shows any particular superiority over any other, we will use the median occurrence rate of dysphagia reported by these studies, 74.6 percent (Daniels, McAdam, Brailey et al., 1997). Using this figure and the incident stroke statistics reported above, it appears that approximately 287,000 to 573,000 individuals are diagnosed with dysphagia by VFSS after a stroke each year.

Several studies have also examined the rate of dysphagia in hospitalized stroke patients as measured by BSE, water swallow test, or structured observation. Again, these studies report widely varying rates but do not appear to have study or patient characteristics that account for the differences. The median rate reported by those studies examining hospitalized patients within 5 days after stroke was 41.7 percent (Kidd, Lawson, Nesbitt et al., 1993). This calculates to an overall burden of illness of 160,000 to 320,000 individuals per year.
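Both dysphagia estimates come from applying a single median rate to the low and high stroke counts. A minimal sketch, with the hypothetical helper `apply_rate`, is shown below; the inputs are the figures given in the text.

```python
# Annual incident strokes (low, high) as calculated earlier in this appendix.
STROKES_PER_YEAR = (384_662, 768_528)


def apply_rate(proportion, cases):
    """Apply a proportion to each end of a (low, high) range of case counts."""
    return tuple(round(proportion * n) for n in cases)


dysphagia_vfss = apply_rate(0.746, STROKES_PER_YEAR)  # median VFSS rate, 74.6 percent
dysphagia_bse = apply_rate(0.417, STROKES_PER_YEAR)   # median BSE rate, 41.7 percent

print(dysphagia_vfss, dysphagia_bse)
```

Rounded to the nearest thousand, these reproduce the ranges quoted in the two paragraphs above.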

For elderly stroke patients specifically, the occurrence of dysphagia as measured by BSE appears to be somewhat lower than in the stroke population as a whole (35.5 percent versus 41.7 percent). More than 279,000 people over age 65 suffer from stroke each year [Barker and Mullooly, 1997; Centers for Disease Control and Prevention (CDC), 1998]. One study examined specifically the rate of occurrence of dysphagia immediately after the stroke event in patients age 70 and older; using BSE, the rate was 35.5 percent (Barer, 1989). This calculates to a burden of illness of about 99,165 elderly stroke patients with dysphagia each year.

Malnutrition in Stroke

The results from a single, European study suggest that within 1 week of an acute stroke, 26.4 percent of patients are malnourished (Davalos, Ricart, Gonzalez-Huix et al., 1996). This calculates to 101,551 to 202,891 possible cases of malnutrition occurring within 1 week after an acute stroke.

The same European study discussed above reported that 48.4 percent of all stroke patients with dysphagia became malnourished within 1 week of the event (Davalos, Ricart, Gonzalez-Huix et al., 1996). If we assume that 160,000 to 320,000 stroke patients have dysphagia, then the burden of illness resulting from malnutrition in dysphagia is 77,636 to 155,110 individuals.
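The two malnutrition figures use different bases: the first applies 26.4 percent to all strokes, while the second chains the BSE dysphagia rate (41.7 percent) with the 48.4 percent malnutrition rate among dysphagic patients. The sketch below makes that chain explicit; it assumes the BSE-based dysphagia counts are the intended base, which is what reproduces the reported numbers.

```python
STROKES_PER_YEAR = (384_662, 768_528)   # annual incident strokes (low, high)
MALNOURISHED_ALL = 0.264                # all stroke patients, within 1 week
DYSPHAGIA_BSE = 0.417                   # median BSE dysphagia rate
MALNOURISHED_DYSPHAGIC = 0.484          # dysphagic stroke patients, within 1 week

# Malnutrition among all stroke patients.
malnutrition_all = tuple(round(n * MALNOURISHED_ALL) for n in STROKES_PER_YEAR)

# Malnutrition among dysphagic stroke patients: stroke -> dysphagia -> malnutrition.
malnutrition_dysphagic = tuple(
    round(n * DYSPHAGIA_BSE * MALNOURISHED_DYSPHAGIC) for n in STROKES_PER_YEAR
)

print(malnutrition_all, malnutrition_dysphagic)
```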

Aspiration in Stroke

Aspiration, in particular, is a condition that can lead to serious morbidity, especially in stroke patients. As stated above, stroke affects 385,000 to 769,000 people each year. Several studies have examined the occurrence of aspiration shortly after the stroke incident. Of studies reporting this rate in acute hospital strokes within 5 days after the event as measured by VFSS or modified barium swallow (MBS), the median reported rate is 33.5 percent (an average from two median studies reporting 32.2 percent and 34.7 percent) (Daniels, McAdam, Brailey et al., 1997; Kidd, Lawson, Nesbitt et al., 1993). This calculates to an annual burden of illness of 128,862 to 257,457 stroke patients with documented aspiration on VFSS.

Pneumonia

The annual incidence rate of pneumonia most recently calculated by CDC from 1994 estimated 1,600 cases per 100,000 people per year (Adams and Marano, 1995). Calculated with the total population for 1996 (U.S. Department of Commerce, 1997), approximately 4,244,544 people each year contract pneumonia.

The rate of pneumonia increases as age increases (Houston, Silverstein, and Suman, 1995); elderly people are particularly susceptible to this illness. The elderly over 65 number almost 34 million [Centers for Disease Control and Prevention (CDC), 1998]. The rate of pneumonia among this subset of the population is estimated at 3,032 per 100,000 per year (Houston, Silverstein, and Suman, 1995). Thus, the total number of elderly people affected is approximately 1,026,662 each year. This is just over 24 percent of all pneumonia cases.
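The pneumonia totals and the elderly share can be checked as follows. Because the text gives the elderly population only as "almost 34 million," the sketch also back-solves the population implied by the reported count; that implied figure is an inference, not a number stated in the report.

```python
US_POPULATION_1996 = 265_284_000   # as listed in Table 13
PNEUMONIA_RATE_ALL = 1_600         # cases per 100,000 per year (Adams and Marano, 1995)
PNEUMONIA_RATE_ELDERLY = 3_032     # cases per 100,000 per year (Houston et al., 1995)
ELDERLY_CASES_REPORTED = 1_026_662 # figure reported in the text

total_cases = round(PNEUMONIA_RATE_ALL * US_POPULATION_1996 / 100_000)
elderly_share = ELDERLY_CASES_REPORTED / total_cases

# Assumption: infer the elderly population that yields the reported case count.
implied_elderly_population = ELDERLY_CASES_REPORTED / PNEUMONIA_RATE_ELDERLY * 100_000

print(total_cases, f"{elderly_share:.1%}", round(implied_elderly_population))
```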

Pneumonia After Stroke

An acute stroke incident occurs in 385,000 to 769,000 people each year, as calculated above. The rate of pneumonia after stroke (as shown in Evidence Table 13) is best represented for the entire population by Young and Durant-Jones (1990), who examined hospitalized patients with no age limits. The hospital diagnosis rate of pneumonia after stroke reported by these researchers was 13 percent. Thus, the total number of individuals afflicted with pneumonia during an acute-care hospital stay after stroke is approximately 50,006 to 99,909 each year.

Dysphagia that leads to pneumonia after stroke

Dysphagia is an important symptom of stroke if it leads to decreased quality of life (QOL), serious morbidity, or mortality. As reported above, approximately 160,000 to 320,000 individuals suffer from dysphagia as a result of stroke each year, as measured by BSE; 287,000 to 573,000 as measured by VFSS. One study reported the rate of pneumonia after dysphagia was diagnosed with a water swallow test in hospitalized acute stroke patients at a facility with no directed dysphagia treatment program (Nilsson, Ekberg, Olsson et al., 1998). (No studies were available that reported dysphagia and pneumonia after VFSS or MBS in a hospital with no directed dysphagia treatment program.) The reported rate was 14.3 percent. Because dysphagia was diagnosed in this study using a noninstrumental method, we must apply the pneumonia rate to the dysphagic-stroke count calculated by BSE (160,000 to 320,000); this makes it impossible to correlate aspiration-related morbidity with dysphagia-related morbidity, because the two are diagnosed by very different methods.

The calculated burden of illness is therefore 22,938 to 45,828 individuals who would contract pneumonia while in the hospital after acute dysphagic stroke each year if no dysphagia treatment program exists. (In contrast, in hospitals where a dysphagia program does exist, rates of pneumonia after swallow therapy in acute care have ranged from 0 to 5.0 percent (Nilsson, Ekberg, Olsson et al., 1998; Odderson, Keaton, and McKenna, 1995), which would result in a burden of illness of 0 to 16,024 individuals with pneumonia each year.)
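The two scenarios can be laid side by side. In the sketch below, the "with program" range pairs the lowest reported rate with the low dysphagia count and the highest rate with the high count; that pairing is an assumption about how the 0 to 16,024 range was obtained, although it does reproduce it.

```python
# BSE-diagnosed dysphagic stroke patients per year (low, high), from the text.
DYSPHAGIC_STROKE_BSE = (160_404, 320_476)

RATE_NO_PROGRAM = 0.143              # Nilsson et al. (1998), no directed program
RATE_WITH_PROGRAM = (0.0, 0.05)      # reported range where a program exists

pneumonia_no_program = tuple(round(RATE_NO_PROGRAM * n) for n in DYSPHAGIC_STROKE_BSE)

# Assumption: lowest rate applied to the low count, highest rate to the high count.
pneumonia_with_program = tuple(
    round(rate * n) for rate, n in zip(RATE_WITH_PROGRAM, DYSPHAGIC_STROKE_BSE)
)

print(pneumonia_no_program, pneumonia_with_program)
```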

As calculated above, more than 99,000 elderly stroke patients are diagnosed with dysphagia on BSE. No pneumonia statistics specific to the elderly were available on stroke patients with dysphagia (except from two studies that specifically examined nursing homes). If we assume the elderly have a similar pneumonia rate to the dysphagic stroke population as a whole, then 14,181 cases of pneumonia co-occur with dysphagic stroke in the elderly each year.

Aspiration that leads to pneumonia after stroke

Aspiration is diagnosed in 129,000 to 257,000 stroke patients each year using VFSS. Aspiration places patients at particular risk of contracting aspiration pneumonia. The only study that examined stroke patients in an acute-care hospital reported that 21.6 percent of all aspirating stroke patients contracted aspiration pneumonia if given no specific dysphagia treatment (Schmidt, Holas, Halvorson et al., 1994). Thus, approximately 27,834 to 55,611 stroke patients with aspiration will contract pneumonia each year.

Nondysphagic stroke to pneumonia

It is important to point out that stroke patients with dysphagia are not the only stroke patients to contract pneumonia. Those without dysphagia, although their chances are smaller, also contribute to the burden of illness through pneumonia and pneumonia deaths. It is also useful to compare the burden of illness in these patients with that in stroke patients with dysphagia.

By subtracting dysphagic stroke from total stroke (discussed above), we calculate that 224,000 to 448,000 stroke patients each year appear not to be dysphagic. We can assume that these patients do not receive any directed swallow therapy, whether or not the hospital incorporates such a therapy into the treatment paradigm. Reynolds, Gilbert, Good et al. (1990) reported that 19.8 percent of stroke patients without dysphagia (as measured by BSE) came down with pneumonia in an acute hospital setting (Young and Durant-Jones, 1990). Therefore, approximately 44,403 to 88,714 stroke patients without dysphagia will contract pneumonia each year.
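The subtraction and the resulting pneumonia counts for stroke patients not diagnosed as dysphagic can be sketched as follows, using the BSE-based dysphagia counts as the quantity subtracted. Variable names are illustrative.

```python
STROKES_PER_YEAR = (384_662, 768_528)       # annual incident strokes (low, high)
DYSPHAGIC_STROKE_BSE = (160_404, 320_476)   # BSE-diagnosed dysphagia (low, high)
PNEUMONIA_RATE_NONDYSPHAGIC = 0.198         # rate cited in the text

# Stroke patients who appear not to be dysphagic on BSE.
nondysphagic = tuple(
    total - dysphagic for total, dysphagic in zip(STROKES_PER_YEAR, DYSPHAGIC_STROKE_BSE)
)

pneumonia_nondysphagic = tuple(
    round(PNEUMONIA_RATE_NONDYSPHAGIC * n) for n in nondysphagic
)

print(nondysphagic, pneumonia_nondysphagic)
```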

Proportion of post-stroke pneumonia co-occurring with dysphagia

We have calculated that pneumonia occurs in approximately 50,000 to 100,000 individuals after an acute stroke incident. Dysphagic stroke leads to pneumonia in approximately 23,000 to 46,000 individuals. Thus, the proportion of pneumonia in stroke that co-occurs with BSE-diagnosed dysphagia is estimated at 45.9 percent when using a combination of medical literature and government population statistics. However, BSE-diagnosed dysphagia does not include many cases of aspiration, and therefore this estimate is likely under-representing the burden of illness for this population.

If instead we add together our estimates of dysphagic-pneumonia and nondysphagic-pneumonia, the total number of pneumonia cases in stroke is estimated at 67,000 to 134,000, about one-third higher than the figure calculated using stroke-to-pneumonia statistics. When we calculate the burden of illness using the dysphagia plus nondysphagia calculations, the proportion of pneumonia cases co-occurring with dysphagia shrinks to 34.1 percent. It is impossible to determine which of these estimates is correct.

Proportion of post-stroke pneumonia co-occurring with aspiration

Approximately 28,000 to 56,000 aspirating stroke patients contract pneumonia each year. Thus, the proportion of pneumonia in stroke that co-occurs with diagnosed aspiration is 55.7 percent.
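The proportions in this subsection and the preceding one are ratios of the low-estimate counts calculated earlier (the high-estimate counts give the same percentages, since every figure scales with the stroke count). A brief sketch with illustrative variable names:

```python
# Low-estimate annual counts from the preceding sections.
PNEUMONIA_AFTER_STROKE = 50_006        # 13 percent of all strokes
PNEUMONIA_DYSPHAGIC = 22_938           # BSE-diagnosed dysphagic stroke
PNEUMONIA_NONDYSPHAGIC = 44_403        # stroke without BSE-diagnosed dysphagia
PNEUMONIA_ASPIRATING = 27_834          # VFSS-documented aspiration

# Share of post-stroke pneumonia co-occurring with dysphagia, two ways.
share_dysphagia_direct = PNEUMONIA_DYSPHAGIC / PNEUMONIA_AFTER_STROKE
summed_total = PNEUMONIA_DYSPHAGIC + PNEUMONIA_NONDYSPHAGIC
share_dysphagia_summed = PNEUMONIA_DYSPHAGIC / summed_total

# Share of post-stroke pneumonia co-occurring with aspiration.
share_aspiration = PNEUMONIA_ASPIRATING / PNEUMONIA_AFTER_STROKE

print(f"{share_dysphagia_direct:.1%}", f"{share_dysphagia_summed:.1%}",
      f"{share_aspiration:.1%}", summed_total)
```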

Pneumonia Mortality

Pneumonia resulting in pneumonia-specific death

Pneumonia was the reported cause of death for 81,972 people in 1996 [Centers for Disease Control and Prevention (CDC), 1998]. This includes all etiologies (not just stroke or patients with dysphagia). (We assume, for the purposes of this discussion, that once an elderly person contracts pneumonia, his/her probability of dying is the same no matter the etiology of the pneumonia.) The number of elderly people who reportedly died of pneumonia or influenza in 1995 (the most recent year reported) was 74,297 (Department of Health & Human Services, 1997). These figures suggest that the large majority (90.6 percent) of pneumonia deaths occur in the elderly. While the death rate for pneumonia for all age groups is 1,900 per 100,000 cases (1.9 percent), we calculated that the death rate for the elderly is 7,200 per 100,000 (74,297 out of 1,026,662) (7.2 percent), and 10,300 per 100,000 among those hospitalized (71,000 out of 690,000) (10.3 percent) (using Mayo Clinic and government data) [Centers for Disease Control and Prevention (CDC), 1998a; Centers for Disease Control and Prevention (CDC), 1998b; Department of Health & Human Services, 1997; Houston, Silverstein, and Suman, 1995]. This is somewhat lower than the rates reported in the medical literature (Evidence Table 20), where 14 to 40 percent of pneumonia patients died from this illness (a rate of 19.8 percent was used below for burden of illness calculations). It is possible that government-collected data do not count all cases because the cause of death listed on death certificates is not always accurate.
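The case-fatality percentages quoted above are simple ratios of the reported counts. The sketch below recomputes them; note that, as in the text, the elderly death count is from 1995 while the all-ages count is from 1996, so the elderly share of deaths is approximate.

```python
# Counts as reported in the text.
DEATHS_ALL_AGES_1996 = 81_972
DEATHS_ELDERLY_1995 = 74_297
CASES_ALL_AGES = 4_244_544
CASES_ELDERLY = 1_026_662
DEATHS_ELDERLY_HOSPITALIZED = 71_000
CASES_ELDERLY_HOSPITALIZED = 690_000

fatality_all = DEATHS_ALL_AGES_1996 / CASES_ALL_AGES
fatality_elderly = DEATHS_ELDERLY_1995 / CASES_ELDERLY
fatality_hospitalized = DEATHS_ELDERLY_HOSPITALIZED / CASES_ELDERLY_HOSPITALIZED
elderly_share_of_deaths = DEATHS_ELDERLY_1995 / DEATHS_ALL_AGES_1996  # mixes 1995 and 1996 data

for label, value in [
    ("all ages", fatality_all),
    ("elderly", fatality_elderly),
    ("elderly, hospitalized", fatality_hospitalized),
    ("elderly share of deaths", elderly_share_of_deaths),
]:
    print(f"{label}: {value:.1%}")
```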

Dysphagia leading to pneumonia death

Above, we calculated that 23,000 to 46,000 stroke patients with dysphagia contract pneumonia each year. If 19.8 percent of these patients die of pneumonia, that is a burden of illness of 4,542 to 9,074 deaths per year possibly attributable to dysphagia in stroke victims. (However, these statistics do not imply causation; any attribution is purely speculative.)

Aspiration leading to pneumonia death

We have calculated that 28,000 to 56,000 aspirating stroke patients will contract pneumonia. If we assume a 19.8 percent mortality rate for pneumonia, this suggests that 5,511 to 11,011 pneumonia deaths each year will occur after aspirating stroke. These numbers appear paradoxical when compared with the dysphagia figures above, because more aspirators than dysphagics are estimated to die, and aspiration is a subset of dysphagia. However, these numbers are not comparable because the dysphagia numbers are calculated from BSE diagnosis rates, while the aspiration numbers are calculated from VFSS diagnosis rates.

Nondysphagic stroke leading to pneumonia death

As calculated above, about 44,000 to 89,000 stroke patients without dysphagia will contract pneumonia each year. Again assuming a pneumonia-specific mortality rate of 19.8 percent, the burden of illness is 8,792 to 17,565 individuals with nondysphagic stroke dying of pneumonia each year. In absolute numbers, this is a higher burden of illness than that calculated above for dysphagic or aspirating strokes.
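The three mortality estimates in this section apply the same assumed 19.8 percent pneumonia mortality rate to different pneumonia counts. A compact sketch, with illustrative names, reproduces them from the figures given earlier:

```python
PNEUMONIA_MORTALITY = 0.198  # rate assumed in the text for burden-of-illness calculations

# Annual pneumonia cases (low, high) by stroke subgroup, from the preceding sections.
PNEUMONIA_CASES = {
    "dysphagic stroke (BSE)": (22_938, 45_828),
    "aspirating stroke (VFSS)": (27_834, 55_611),
    "nondysphagic stroke": (44_403, 88_714),
}

pneumonia_deaths = {
    group: tuple(round(PNEUMONIA_MORTALITY * n) for n in cases)
    for group, cases in PNEUMONIA_CASES.items()
}

for group, (low, high) in pneumonia_deaths.items():
    print(f"{group}: {low:,} to {high:,} deaths per year")
```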

Summary

There are many discrepancies in the information presented above because the statistics have come from many different sources, and the reliability of the information from the medical literature is questionable because of varying followup times, different diagnostic methods, small patient populations, and methodologies that were not described adequately.

Other Neurologic Disorders

Table 13. Total Number of Patients with Dysphagia in Other Neurologic Diseases

Disease/disorder | Number/percent | Source | Comments
Parkinson's disease | Incidence | |
Total population | 265,284,000 | U.S. Department of Commerce, 1997 |
Rate of PD | 13/100,000 | Mayeux, Marder, Cote et al., 1995 | Only study did total pop; rates may not be reliable because this population does not racially represent the nation and has not been sex-, age-, or race-adjusted in any way.
Total affected | 34,487 | |
Dysphagia in Parkinson's | | |
Total PD population, incidence | 34,487 | |
Rate of dysphagia | 69.1% | Bushmann, Dobmeyer, Leeker et al., 1989; Fuh, Lee, Wang et al., 1997 | Mean of 2 studies in which L-dopa was withheld. MBS
Total affected | 23,830 | |
*Questionnaires appear to yield about the same rate when L-dopa is withheld.
Motor neuron disease | Annual incidence | |
Total population | 265,284,000 | U.S. Department of Commerce, 1997 |
Rate of MND | 6/100,000 | Lilienfeld, Sprafka, Pham et al., 1991 | Only study. 1984 rates.
Total affected | 15,917 | |
Dysphagia in MND | | |
Total MND population | 15,917 | |
Rate of dysphagia | 51.20% | Leighton, Burton, Lund et al., 1994 | Distribution of subsets of disease not same as overall MND population.
Total affected | 8,150 | |
Amyotrophic lateral sclerosis | Prevalence | |
Total population | 265,284,000 | U.S. Department of Commerce, 1997 |
Rate of ALS | 1.8/100,000 | McGuire, Longstreth, Koepsell et al., 1996 |
Total affected | 4,775 | |
Dysphagia in ALS | | |
Total ALS | 4,775 | |
Rate of dysphagia | 29% | Leighton, Burton, Lund et al., 1994 | Only study.
Total affected | 1,385 | |
Progressive supranuclear palsy | Annual incidence | |
Total population | 265,284,000 | U.S. Department of Commerce, 1997 |
Rate of PSP | 1.10/100,000 | Bower, Maraganore, McDonnell et al., 1997 |
Total affected | 2,918 | |
Dysphagia in PSP | | |
Total PSP | 2,918 | |
Rate of dysphagia | 55.6% | Litvan, Sastry, and Sonies, 1997 | Only study
Total affected | 1,622 | |
HD | Annual incidence | |
Total population | 265,284,000 | |
Rate of HD | 0.2/100,000 | Kokmen, Ozekmekci, Beard et al., 1994 |
Total affected | 531 | |
Dysphagia in HD | 100% | Kagel and Leopold, 1992 |
 | 531 | |
Total burden of illness of dysphagia in neurologic diseases other than stroke | 51,435 | |
Total with stroke, low estimate | 338,393 | |
High estimate | 624,757 | |

Table 11 in the Burden of Illness section shows a summary of the burden of illness (i.e., the number of people affected) caused by dysphagia resulting from Parkinson's disease, motor neuron disease (MND), amyotrophic lateral sclerosis (ALS), and progressive supranuclear palsy (PSP). Table 13 in that same section gives details on the calculations described below. Not enough data are available to determine the burden of illness resulting from pneumonia or death in these patients. All rates discussed are annual incidence rates, in order to calculate an annual burden of disease.

Parkinson's Disease

Only one study looked at the incidence rate of Parkinson's disease for the entire population (all ages and both sexes): Mayeux et al. (1995) reported an incidence rate of 13 per 100,000 per year (Mayeux, Marder, Cote et al., 1995) from a Mayo Clinic study. As there are more than 265 million people in the United States (Department of Commerce, 1997), the number of people affected by Parkinson's disease each year is 34,487.

The diagnosed occurrence of dysphagia in Parkinson's disease has been reported by several studies, using both videofluoroscopy and noninstrumental methods. For the two videofluoroscopy studies that withheld L-dopa (which can affect dysphagia) on the day of testing, we calculated the mean rate, 69.1 percent (Bushmann, Dobmeyer, Leeker et al., 1989; Fuh, Lee, Wang et al., 1997). This indicates that of the 34,487 individuals affected by Parkinson's disease each year, 23,830 demonstrate dysphagia on videofluoroscopy. Data are not available to determine burden of illness resulting from malnutrition, pneumonia, or death.

Motor Neuron Disease

The annual incidence rate of MND, reported by Lilienfeld, Sprafka, Pham et al. (1991), was 6 per 100,000 in 1984, the most recent statistics available. In a population of more than 265 million, this means that 15,917 individuals are affected by MND each year.

There was only one potentially reliable number on the rate of dysphagia in MND, provided by Leighton, Burton, Lund et al. (1994), who reported that 51.2 percent of MND patients demonstrated swallowing graded as moderate or poor on videofluoroscopy. The total annual burden of illness is therefore 8,150 individuals affected by dysphagia in MND.

Amyotrophic Lateral Sclerosis

The most common form of MND is ALS. The annual incidence of ALS has been reported at 1.8 per 100,000 (Annegers, Appel, Lee et al., 1991). Thus, the total number of people affected by ALS in the United States each year is 4,775.

Only one study examined the rate of dysphagia in ALS. Leighton, Burton, Lund et al. (1994) reported a rate of 29 percent of poor or moderate swallowing as demonstrated on VFSS. Thus, 1,385 people are affected by dysphagia resulting from ALS.

Progressive Supranuclear Palsy

The annual incidence rate of PSP has been reported at 1.10 per 100,000 (Bower, Maraganore, McDonnell et al., 1997) (the prevalence has not been reported). Thus, approximately 2,918 people are affected by PSP each year.

Only one study reported the rate of dysphagia in PSP; Litvan, Sastry and Sonies (1997) reported that 55.6 percent of PSP patients complained of swallowing problems. This calculates to 1,622 people each year experiencing swallowing problems as a result of PSP.

Huntington's Disease

The annual rate of new cases of Huntington's disease is 0.2 per 100,000 (Kokmen, Ozekmekci, Beard et al., 1994). The burden of illness is therefore approximately 531 each year in the U.S. population. The diagnosed occurrence of dysphagia was reported by a single study, at 100 percent of all Huntington's disease patients (Kagel and Leopold, 1992). Thus, the total number of new dysphagic Huntington's disease patients each year is 531.
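The arithmetic used in the sections above (annual incidence rate times total population, then times the reported dysphagia rate) can be sketched as follows. This is a minimal illustration rather than the tool used to prepare the report; the function and variable names are hypothetical, and the inputs are the figures quoted above for MND, ALS, PSP, and Huntington's disease.

```python
US_POPULATION = 265_284_000  # total population figure quoted in the text


def annual_cases(rate_per_100k, population=US_POPULATION):
    """New cases per year implied by an annual incidence rate per 100,000."""
    return round(population * rate_per_100k / 100_000)


def dysphagia_cases(cases, dysphagia_rate):
    """Cases with dysphagia, given the fraction of patients affected."""
    return round(cases * dysphagia_rate)


# (annual incidence per 100,000, dysphagia rate) as quoted in the text
diseases = {
    "MND": (6.0, 0.512),
    "ALS": (1.8, 0.29),
    "PSP": (1.10, 0.556),
    "HD": (0.2, 1.00),
}

burden = {}
for name, (incidence, rate) in diseases.items():
    cases = annual_cases(incidence)
    burden[name] = (cases, dysphagia_cases(cases, rate))
    print(name, cases, burden[name][1])
# MND 15917 8150
# ALS 4775 1385
# PSP 2918 1622
# HD 531 531

# The same first step for Parkinson's disease (13 per 100,000 per year)
print(annual_cases(13))  # 34487
```

Rounding to whole persons after each step, as done here, reproduces the counts quoted in the text for these four disorders.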

Summary

Approximately 51,435 people this year will be affected by dysphagia resulting from these neurologic disorders. The large majority of these cases are the result of Parkinson's disease (more than 34,000). Combined with stroke, approximately 340,000 to 625,000 people will be affected by dysphagia resulting from a variety of neurologic disorders this year. Many of these patients may subsequently be affected by aspiration or pneumonia as the result of a disordered swallow although, presently, lack of data in the published literature makes the calculation of these rates impossible. If, however, we were to assume that the rates of aspiration and pneumonia were the same for all neurologic patients as for stroke, then we would estimate that approximately 14.3 percent of these patients with dysphagia would contract pneumonia (48,390 to 89,340 cases of pneumonia each year). If pneumonia death occurs at a rate of 19.8 percent, then approximately 9,581 to 17,689 patients will possibly die as a result of their untreated dysphagia.
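The pneumonia and death estimates in this summary follow from two multiplications, sketched below. The function name is hypothetical; the rates (14.3 percent and 19.8 percent) and the low and high dysphagia totals are those stated above, and the sketch carries over the text's assumption that stroke rates apply to all neurologic patients.

```python
PNEUMONIA_RATE = 0.143        # assumed equal to the rate in stroke, as in the text
PNEUMONIA_DEATH_RATE = 0.198  # death rate among pneumonia cases, as in the text


def pneumonia_and_deaths(dysphagia_cases):
    """Return (pneumonia cases, pneumonia deaths) for a number of dysphagia cases."""
    pneumonia = round(dysphagia_cases * PNEUMONIA_RATE)
    deaths = round(pneumonia * PNEUMONIA_DEATH_RATE)
    return pneumonia, deaths


low = pneumonia_and_deaths(338_393)   # low estimate, including stroke
high = pneumonia_and_deaths(624_757)  # high estimate, including stroke
print(low)   # (48390, 9581)
print(high)  # (89340, 17689)
```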

Appendix C. Detailed Search Strategies for Data Collection

Medlars

(Parallel strategies were constructed in Dialog, Ovid, and PubMed syntax.)

To identify swallowing disorders not associated with cancer

  • 1

    explode deglutition disorders or deglutition

  • 2

    (tw) all swallow: or dysphagia

  • 3

    neoplasms(px) [(px) indicates a pre-explosion of a MeSH term]

  • 4

    1 or 2

  • 5

    4 and not 3

  • 6

    5 and aged(px)

  • 7

    6 and human

  • 8

    7 and not case report

To identify documents related to barium swallow in stroke patients

  • 1

    barium sulfate or barium adj sulfate

  • 2

    explode deglutition disorders

  • 3

    explode cerebrovascular disorders or stroke(tw)

  • 4

    1 and 2 and 3

To identify documents related to barium swallow in patients over 65

  • 1

    barium sulfate or barium adj sulfate

  • 2

    explode deglutition disorders

  • 3

    aged(px)

  • 4

    1 and 2 and 3

To identify therapies

  • 1

    explode deglutition disorders or explode deglutition

  • 2

    (tw) all swallow: or dysphagia

  • 3

    1 and di&(px) [this is a pre-explosion of a subheading category]

  • 4

    1 and th&(px)

  • 5

    3 or 4

  • 6

    1 or 2

  • 7

    6 and <speech therapy or speech-language pathology>

  • 8

    6 and <barium sulfate or barium(tw) or fluoroscopy or cineradiography or all videofluoroscopy:(tw)>

  • 9

    5 or 7 or 8

  • 10

    9 and aged(px)

  • 11

    10 and human

To identify complications associated with and costs of PEG tubes

  • 1

    gastrostomy

  • 2

    percutaneous(tw) and endoscopic(tw) and gastrostomy(tw) or <PEG(tw) and all tube:(tw)>

  • 3

    1 and instrumentation(sh)

  • 4

    2 and equipment a#d supplies(px)

  • 5

    3 or 4

  • 6

    1 or 2

  • 7

    6 and <fatal outcome or treatment outcome or mortality or survival or explode morbidity>

  • 8

    6 and explode economics

  • 9

    7 and aged(px)

  • 10

    5 or 7 or 8 or 9

  • 11

    10 and human

  • 12

    neoplasms(px)

  • 13

    11 and not 12

  • 14

    13 and not <infant(px) or child(px)>

To assess issues related to enteral nutrition

  • 1

    enteral nutrition or intubation, gastrointestinal

  • 2

    1 and <nasogastric(tw) or nasointestinal(tw)>

  • 3

    1 contains NG

  • 4

    2 or 3

  • 5

    4 and <adverse effects(sh) or complications(sh) or mortality(sh) or hazard:(tw) or safe(tw) or safety(tw)>

  • 6

    5 and not child(px)

  • 7

    gastrostomy and <percutaneous(tw) or PEG(tw)>

  • 8

    6 and not 7

  • 9

    8 and human

To assess epidemiology and economics of various disorders and their relation to dysphagia

Alzheimer's Disease
Strategy 1

  • 1

    Alzheimer disease or all Alzheimer:(tw)

  • 2

    (tw) dysphagia or all swallow:

  • 3

    explode deglutition disorders or deglutition

  • 4

    2 or 3

  • 5

    1 and 4

  • 6

    5 and not adult(px)

Strategy 2

  • 1

    Alzheimer disease or all Alzheimer:(tw)

  • 2

    1 and not <explode dementia or explode dementia, vascular>

  • 3

    2 and <explode economics or economics(sh)>

  • 4

    2 and <epidemiology(sh) or incidence or prevalence>

  • 5

    4 and human

Cervical Osteophytes

  • 1

    (tw) cervical and osteophyte:

  • 2

    swallow:(tw) or dysphagia(tw) or explode deglutition disorders

  • 3

    1 and 2

  • 4

    3 and human

Dementia
Strategy 1

  • 1

    explode dementia or explode dementia, vascular

  • 2

    explode deglutition disorders or deglutition

  • 3

    (tw) dysphagia or all swallow:

  • 4

    2 or 3

  • 5

    1 and 4

Strategy 2

  • 1

    explode dementia or explode dementia, vascular

  • 2

    1 and <explode economics or economics(sh)>

  • 3

    1 and <epidemiology(sh) or incidence or prevalence>

  • 4

    2 or 3

  • 5

    4 and human

Lou Gehrig's Disease (ALS)

  • 1

    explode amyotrophic lateral sclerosis

  • 2

    ALS(tw) or gehrig(tw) or (lateral adj sclerosis)

  • 3

    swallow:(tw) or dysphagia(tw) or explode deglutition disorders

  • 4

    1 or 2

  • 5

    3 and 4

  • 6

    5 and human

Neurodegenerative Diseases

  • 1

    explode *neurodegenerative diseases/statistics and numbers

  • 2

    explode *neurodegenerative diseases/epidemiology

  • 3

    1 or 2

  • 4

    3 and aged(px)

  • 5

    4 and human

[Note: the category neurodegenerative diseases includes: Alzheimer disease; demyelinating diseases; multiple sclerosis; spinal cord diseases (including amyotrophic lateral sclerosis).]

Parkinson's Disease

  • 1

    *Parkinson disease/epidemiology

  • 2

    1 and human

[Note: we originally included the terms incidence or prevalence in the search strategy for Parkinson's disease but excluded them to minimize extraneous retrieval.]

Pneumonia

  • 1

    pneumonia, aspiration or silent adj aspiration

  • 2

    1 and <incidence or prevalence>

  • 3

    pneumonia, aspiration/epidemiology or pneumonia, aspiration/statistics and numbers

  • 4

    2 or 3

  • 5

    4 and human

Raynaud's Disease
Strategy 1

  • 1

    Raynaud disease or all Raynaud:(tw)

  • 2

    (tw) dysphagia or all swallow:

  • 3

    explode deglutition disorders or deglutition

  • 4

    2 or 3

  • 5

    1 and 4

  • 6

    5 and human

Strategy 2

  • 1

    Raynaud disease or all Raynaud:(tw)

  • 2

    1 and <explode economics or economics(sh)>

  • 3

    1 and <epidemiology(sh) or incidence or prevalence>

  • 4

    2 or 3

  • 5

    4 and human

Stroke
Strategy 1

  • 1

    *stroke

  • 2

    1 and <explode economics or economics(sh)>

  • 3

    1 and epidemiology(sh)

  • 4

    2 or 3

Strategy 2

  • 1

    stroke

  • 2

    explode deglutition disorders or deglutition

  • 3

    (tw) dysphagia or all swallow:

  • 4

    2 or 3

  • 5

    1 and 4

  • 6

    5 and not infant(px)

  • 7

    6 and not child(px)

  • 8

    7 and not adult(px)

  • 9

    8 and human

Strategy 3

  • 1

    explode cerebrovascular disorders

  • 2

    1 and not cerebral ischemia, transient

  • 3

    explode hospitals or explode nursing homes

  • 4

    clinical protocols or clinical pathways or outcome assessment (health care) or patient care team

  • 5

    2 and 3 and 4

Strategy 4

  • 1

    explode cerebrovascular disorders/complications and explode pneumonia/etiology

Supranuclear Palsy

  • 1

    supranuclear palsy, progressive

  • 2

    supranuclear adj palsy

  • 3

    swallow:(tw) or dysphagia(tw) or explode deglutition disorders

  • 4

    1 or 2

  • 5

    3 and 4

  • 6

    5 and human

Xerostomia

  • 1

    Xerostomia

  • 2

    swallow:(tw) or dysphagia(tw) or explode deglutition disorders

  • 3

    1 and 2

  • 4

    3 and human

To assess weight loss

  • 1

    *weight loss or *cachexia or *wasting syndrome

  • 2

    aged(px)

  • 3

    1 and 2

  • 4

    neoplasms(px)

  • 5

    3 and not 4

  • 6

    5 and human

[Note: we considered but excluded the following terms: eating; eating disorders; nutritional requirements; nutritional status; energy intake; activities of daily living.]

To assess the cost of VFSS mobile units

  • 1

    explode mobile health units and explode economics

  • 2

    1 and United States

PsycINFO

To identify material describing dysphagia in the elderly

  • 1

    s swallowing or sh=50920

  • 2

    s (aged or geriatrics or gerontology or elder care or geriatric patient) or sh=(16364 or 20950 or 01370 or 20970 or 21000)

  • 3

    s s1 and s2

  • 4

    s s3 not children

  • 5

    s s4/eng

To assess quality of life in the elderly

  • 1

    s quality of life or life satisfaction or QOL or satisfaction

  • 2

    s aged or geriatrics or gerontology or elder care or geriatric patients or QOL

  • 3

    s s1 and s2

  • 4

    s s3 and literature review

  • 5

    s s3 and ((control? or random?)(2n)(trial?))

  • 6

    s s4 or s5

  • 7

    s s6 not child?

To identify controlled trials, the above strategies were combined with

  • 1

    randomized controlled trials or randomized controlled trial(pt) or <random:(tw) and controlled(tw) and trial#(tw)> or random allocation or double-blind method or single-blind method

  • 2

    controlled clinical trials or controlled clinical trial(pt)

  • 3

    <explode clinical trials or all clinical trial:(pt) > and controlled(tw)

  • 4

    meta-analysis or meta-analysis(pt) or <meta adj analysis>

  • 5

    1 or 2 or 3 or 4

In PubMed syntax:

(randomized controlled trials OR randomized controlled trial[pt] OR (random*[tw] AND controlled[tw] AND trial*[tw]) OR random allocation OR double-blind method OR single-blind method OR controlled clinical trials OR controlled clinical trial[pt] OR ((clinical trials OR clinical trial*[pt]) AND controlled[tw]) OR meta-analysis OR meta-analysis[pt] OR meta-analysis[tw] OR "meta analysis")

Appendix D. Time Distribution of Pneumonia Cases in Acute Stroke Patients

Because the cumulative number of patients with pneumonia increases with time, it is inappropriate to directly compare the pneumonia rates reported by studies that used different followup times. To allow us to make such comparisons, we derived a time curve that would allow us to adjust pneumonia rates from studies that employed different followup times to a single common time. Analysis of available data on the incidence of pneumonia in patients hospitalized for stroke suggests that pneumonia risk decreases substantially after the first week or two after the stroke. Evidence Table 26 presents data on pneumonia rates in patient groups not getting swallow therapy. Odderson's trial (Odderson, Keaton and McKenna, 1995), which had a mean followup of 1 week, had a pre-program pneumonia rate of 6.7 percent. Young's trial (Young and Durant-Jones, 1990), which had a mean followup of 4.4 weeks, had a pneumonia rate of 13 percent. This suggests a steep fall-off in pneumonia rates with time but other differences between the two trials could account for some or all of the effect. Therefore, one needs a single trial reporting pneumonia rates at two or more different times in order to determine the effect of time on pneumonia rate.

The most detailed data of this type were reported by Davenport, Dennis and Wellwood (1996). We determined the pneumonia rate as a function of time by measuring the graph published in their report. Although they used the broader criterion of chest infection, we assumed that the narrower criterion of pneumonia would follow the same time pattern following stroke. That is, the absolute number of diseased patients would be different by the two criteria, but the proportion of the cumulative total that occurred within a given time interval would be approximately the same for both criteria.

Twelve points were measured, representing times of 0 to 27.5 days post-stroke. Data were entered into a spreadsheet (Excel 97, Microsoft Corp., Redmond, WA) for fitting and plotting. We tested four different models: a single exponential, a single exponential plus constant, a single-exponential fit to the first 12 points, and a power series. Exponentials were fitted automatically using the linear regression function of the spreadsheet, while the power series was fitted using SPSS software (version 7.5, SPSS Inc., Chicago). The power series [y = B0 + B1 ln(t), where B0 and B1 are constants and t is time] gave the closest fit to the Davenport et al. data, both by subjective (visual inspection) and objective (RMS error) methods. Therefore, correction terms were derived from the power series results.

To adjust rates that are figured at different followup times, the ratio of expected prevalences at each time can be calculated. This is an approximate correction because the followup times are expressed as means.
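A minimal sketch of this adjustment is given below, assuming the fitted form y = B0 + B1 ln(t). The function names and the data points are hypothetical (they are not the values measured from the Davenport et al. graph), and ordinary least squares stands in for the spreadsheet and SPSS fits. Because ln(t) is undefined at t = 0, the sketch uses only positive times.

```python
import math


def fit_log_curve(times, values):
    """Least-squares fit of y = b0 + b1 * ln(t); every time must be > 0."""
    xs = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def adjust_rate(rate, t_observed, t_common, b0, b1):
    """Scale a rate observed at t_observed to a common followup time
    using the ratio of the values expected from the fitted curve."""
    expected_observed = b0 + b1 * math.log(t_observed)
    expected_common = b0 + b1 * math.log(t_common)
    return rate * expected_common / expected_observed


# Hypothetical cumulative-incidence points (days, percent), for illustration only
days = [1, 2, 5, 10, 20, 27.5]
cumulative = [2.0 + 3.0 * math.log(d) for d in days]

b0, b1 = fit_log_curve(days, cumulative)
# Adjust a 6.7 percent rate observed at 7 days to a 30.8-day (4.4-week) followup
adjusted = adjust_rate(6.7, t_observed=7, t_common=30.8, b0=b0, b1=b1)
print(round(b0, 3), round(b1, 3), round(adjusted, 1))
```

The correction is only as good as the fitted curve, and, as noted above, it is approximate because followup times are reported as means.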

Appendix E. Methods for Evaluation of Diagnostic Tests

Sensitivity and Specificity

In general, medical imaging procedures and other diagnostic tests are intended to distinguish patients with a particular condition from those without the condition. In the typical case of a binomial decision (whether the patient has the condition), there are four outcomes of the test:

  • True positive (TP) (patient has condition, test detects condition)

  • False negative (FN) (patient has condition, test fails to detect it)

  • False positive (FP) (patient is normal, test mistakenly detects condition)

  • True negative (TN) (patient is normal, test finds patient normal)

The outcomes can be expressed as a 2-by-2 table:

                        Actual Condition
                        Present   Absent
Test Results  Positive  TP        FP
              Negative  FN        TN

The accuracy of the test (i.e., the fraction of results that are correct) is of only limited use in assessing the value of the test. The consequences of a false positive may be much greater or much less than the consequences of a false negative. Therefore, it is best to separately report the false-negative and false-positive rates.

FNR (false-negative rate) = FN / (TP + FN)
FPR (false-positive rate) = FP / (TN + FP)

Sensitivity and specificity (also called likelihoods) are the complements of these quantities. They are commonly used to report results of clinical trials of a diagnostic test.
Sensitivity = the proportion of patients with the disease who are detected by the test.

Sensitivity = 1 - FNR
Sensitivity = TP / (TP + FN)
Specificity = the proportion of patients without the disease who are correctly diagnosed as negative.
Specificity = 1 - FPR
Specificity = TN / (TN + FP)

For purposes of clinical decisionmaking, it is useful to know the probability that, if a patient tests positive, the patient actually has the condition. That figure is called the positive predictive value (PPV; also called the post test probability). Its converse is the negative predictive value (NPV).

PPV = TP / (TP + FP)
NPV = TN / (TN + FN)

But the prevalence of the condition in a population other than the initial study population may be different.

Prevalence = (TP + FN) / (TP + FN + FP + TN)

If the ratio of disease positives to negatives changes, both PPV and NPV change. These parameters are not externally valid (results obtained from one patient population are not the same as results from a different patient population) so they are not usually used as the figures by which the test is judged, despite their decisionmaking significance.
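The definitions above can be collected into one small function. This is an illustrative sketch: the function name and the two sets of counts are hypothetical, chosen so that both populations share the same sensitivity and specificity but differ in prevalence.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Compute the quantities defined above from the cells of a 2-by-2 table."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "fnr": fn / (tp + fn),
        "fpr": fp / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "prevalence": (tp + fn) / total,
        "accuracy": (tp + tn) / total,
    }


# Hypothetical populations: same test performance, different prevalence
low_prevalence = diagnostic_metrics(tp=80, fn=20, fp=90, tn=810)     # prevalence 0.10
high_prevalence = diagnostic_metrics(tp=400, fn=100, fp=50, tn=450)  # prevalence 0.50

for label, m in (("low", low_prevalence), ("high", high_prevalence)):
    print(label, m["sensitivity"], m["specificity"], round(m["ppv"], 3), round(m["npv"], 3))
```

In this example sensitivity and specificity are identical in the two populations, while PPV rises and NPV falls as prevalence increases.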

The mathematical relationship between PPV and disease prevalence is described by Bayes' rule:

PPV = (Sensitivity x Prevalence) / [(Sensitivity x Prevalence) + ((1 - Specificity) x (1 - Prevalence))]

As prevalence increases, PPV also increases.
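The dependence of PPV on prevalence can be illustrated by evaluating Bayes' rule at several prevalences. The function name and the sensitivity, specificity, and prevalence values below are hypothetical and serve only as an example.

```python
def ppv_from_bayes(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule."""
    true_positive_mass = sensitivity * prevalence
    false_positive_mass = (1 - specificity) * (1 - prevalence)
    return true_positive_mass / (true_positive_mass + false_positive_mass)


# Hypothetical test: sensitivity 0.80, specificity 0.90
for prevalence in (0.01, 0.10, 0.50):
    print(prevalence, round(ppv_from_bayes(0.80, 0.90, prevalence), 3))
```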

Sensitivity and specificity are theoretically independent of prevalence. However, sensitivity is dependent on the spectrum of severity of disease in the test population. Most diagnostic tests are more sensitive for severe disease. Also, specificity is dependent on the prevalence of comorbidities with confounding symptoms. The possibility of these types of bias should be considered whenever one examines the results of a clinical trial. Ideally, the study population of cases and controls will be selected in such a way that the prevalence of disease, spectrum of severity, and prevalence of comorbidities with confounding symptoms are the same as they would be in the population routinely examined in clinical practice.

Receiver Operating Characteristic (ROC) Analysis

Interpretation of any image or other diagnostic test requires establishing a threshold of normal variance beyond which a result will be called abnormal. This threshold may be quantitative, such as a normal limit for the concentration of something in the blood, or it may be a subjective opinion of the person interpreting the test, such as a radiologist's determination that the size and shape of the liver are not consistent with cirrhosis. Whatever the type, the threshold can be adjusted so as to increase or decrease the number of results called abnormal.

Selection of a test threshold has important effects on the sensitivity and specificity of a test and on the economic and clinical value of the test (Figures E-1 to E-4). The optimum threshold of a test may vary depending on the circumstances. For example, a preliminary screening test for HIV should have very high sensitivity because the consequences of failing to detect a person with the disease are dire. The specificity of the test is less important because all positive screening results would be confirmed with a second test. In that confirmatory test, specificity is much more important.

The relationship among sensitivity, specificity, and threshold can be described by plotting sensitivity as a function of specificity for all possible thresholds. This graph, adopted from signal detection theory, is called a receiver operating characteristic (ROC) curve. By convention, it is drawn with an inverted scale (1 - specificity) on the specificity axis. That is to say, true-positive rate is plotted as a function of false-positive rate (this is the same as a plot of the likelihood ratios at various thresholds).
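The construction of an ROC curve can be sketched by sweeping a threshold over a set of test scores. The scores below are hypothetical, and the convention that a score at or above the threshold is called positive is an assumption made for this example.

```python
def roc_points(diseased_scores, healthy_scores):
    """Return (false-positive rate, true-positive rate) pairs, one per threshold.
    A result is called positive when score >= threshold."""
    thresholds = sorted(set(diseased_scores) | set(healthy_scores), reverse=True)
    points = [(0.0, 0.0)]  # threshold above every score: nothing called positive
    for threshold in thresholds:
        tpr = sum(s >= threshold for s in diseased_scores) / len(diseased_scores)
        fpr = sum(s >= threshold for s in healthy_scores) / len(healthy_scores)
        points.append((fpr, tpr))
    return points


# Hypothetical test scores
diseased = [3, 5, 6, 7, 8]
healthy = [1, 2, 3, 4, 6]

curve = roc_points(diseased, healthy)
for fpr, tpr in curve:
    print(fpr, tpr)
```

Lowering the threshold moves the operating point up and to the right along the curve: sensitivity rises while specificity falls.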

The area under the ROC curve (Az) is widely considered to be a useful figure for assessing the effectiveness of a test and comparing it with that of other tests. There are a few reasons why Az is not endorsed with unanimity. First, contributions from the end of the curve are weighted equally with contributions from the center of the curve, where clinically useful thresholds are more likely to be found. Measurement of Az only within the upper left quadrant (sensitivity and specificity both more than 0.50) has been proposed as an alternative figure. Another way of summarizing the ROC into a single statistic is the partial area index, in which the area under the curve is calculated only over a clinically useful range of sensitivities. To avoid bias, that range must be selected before the meta-analysis is conducted.
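As an illustration, Az can be approximated from a set of ROC points by the trapezoidal rule. The function name and the points below are hypothetical; other estimators of Az exist, and this sketch does not implement the partial area index.

```python
def area_under_roc(points):
    """Trapezoidal area under (false-positive rate, true-positive rate) points.
    The points are expected to run from (0, 0) to (1, 1)."""
    ordered = sorted(points)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical ROC points (FPR, TPR)
hypothetical_curve = [
    (0.0, 0.0), (0.0, 0.2), (0.0, 0.4), (0.2, 0.6), (0.2, 0.8),
    (0.4, 0.8), (0.6, 1.0), (0.8, 1.0), (1.0, 1.0),
]

print(round(area_under_roc(hypothetical_curve), 2))  # 0.84
print(area_under_roc([(0.0, 0.0), (1.0, 1.0)]))      # 0.5, a test no better than chance
```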

Another alternative means of comparing tests using the ROC is to construct a challenge ROC using known or estimated values for the costs and benefits of each test outcome (true positive, false negative, false positive, true negative). A desired level of cost utility or net cost is chosen. For each value of specificity, the sensitivity at which the cost is exactly the chosen level is calculated. The calculated points are then connected into an ROC curve. Experimental sensitivity and specificity results for the test in question are superimposed on the graph, and it can be determined whether the test meets or does not meet the cost criteria. Finally, cost information can be used to determine an optimal operating point on the ROC, and curves can be compared at their respective optimum points.

In many situations (e.g., the base case for a cost-effectiveness model), reporting of diagnostic effectiveness as a single sensitivity/specificity point is desirable. Various ways of choosing such points have been reported:

  • Sensitivity at an arbitrary false-positive rate

  • False-positive rate at an arbitrary sensitivity

  • Sensitivity and specificity at the point where sensitivity equals specificity

  • Sensitivity and specificity at the point where their sum is greatest

Each of these methods has the disadvantage of describing results at a point on the ROC curve that may be outside the range of thresholds at which the test is actually used. This is particularly bad for tests in which either sensitivity or specificity is especially valued, such as screening tests. The sensitivity/specificity at mean threshold (SMT) avoids these problems and is used in our evaluation of diagnostic tests.

Calculation of a Summary ROC

Because of the tradeoff between sensitivity and specificity, calculating an average sensitivity or specificity by averaging the sensitivity or specificity from multiple trials will underestimate the true sensitivity or specificity. For the same reasons, pooling of results from multiple trials in order to calculate a pooled sensitivity or specificity will also underestimate the true value. An exception can be made for tests that have a fixed threshold and are not subject to the interpretation of a human observer. Such tests are rare because the threshold between positive and negative results is relative and is rarely fixed absolutely, even in laboratory-evaluated tests. The principle is best explained graphically. The curve in Figure E-2 represents the sensitivity and specificity of a hypothetical test. If the results of two trials of the test (points A and B) are averaged (point M), it can be seen that the resulting mean sensitivity and specificity underestimate the value of the test.

A better way to combine the results of the trials is to generate a summary ROC (SROC) curve. Littenberg and Moses (1993) have developed a linear regression method for calculation of a summary ROC. In this method, the sensitivity and specificity are transformed into a coordinate space where there is a linear relationship between sensitivity and specificity. The least-squares method can then be used to fit a straight line to the experimental results. The regression line is then back-transformed into sensitivity/specificity space to yield an SROC.

Littenberg and Moses found that logits (log odds) gave the desired linear relationship necessary for the calculation. The logit of the sensitivity and specificity results of each trial is calculated according to the following equations:

logit(TPR) = ln[TPR / (1 - TPR)]
logit(FPR) = ln[FPR / (1 - FPR)]

Then parameters S and D are calculated (see the equations below) so that the regression will not favor TPR (true-positive rate) fitting over FPR (false-positive rate) fitting, as would be the case if the regression were to fit TPR as a function of FPR, or vice versa. S corresponds to the threshold used to distinguish between positives and negatives in a particular trial, while D corresponds to the effectiveness of the test as measured in that trial. For some tests, D may depend strongly on S.

S = logit(TPR) + logit(FPR)
D = logit(TPR) - logit(FPR)

Where sensitivity or specificity is either 0 or 1, the logit is undefined. This error arises when, due to case mix, study size, and/or test effectiveness, there is a zero value for the number of patients with results [true positive (TP), false positive (FP), false negative (FN), or true negative (TN)]. Before any such study is included in the regression, 0.5 is added to each of the categories, a correction term suggested by Littenberg and Moses. We do not add the correction term to data from studies with no zero values for TP, TN, FP, or FN. In most cases in which the correction is necessary, the zero group is either false positives or false negatives, so the effect of the correction is to very slightly underestimate the true sensitivity and specificity found in the study.
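The calculation of S and D for a single study, including the 0.5 correction, can be sketched as follows. The function names and counts are hypothetical; the rule of adding 0.5 to every cell only when at least one cell is zero follows the description above.

```python
import math


def logit(p):
    """Log odds of a proportion p, defined for 0 < p < 1."""
    return math.log(p / (1 - p))


def s_and_d(tp, fn, fp, tn):
    """Return (S, D) for one study.
    Adds 0.5 to all four cells only if any cell is zero."""
    if 0 in (tp, fn, fp, tn):
        tp, fn, fp, tn = (count + 0.5 for count in (tp, fn, fp, tn))
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    s = logit(tpr) + logit(fpr)
    d = logit(tpr) - logit(fpr)
    return s, d


# Hypothetical studies: the second has no false negatives and needs the correction
print(s_and_d(tp=80, fn=20, fp=10, tn=90))
print(s_and_d(tp=10, fn=0, fp=5, tn=20))
```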

S and D are plotted (see Figure E-3), and a straight line is fitted to them (with S defined as the independent variable and D as the dependent variable) by the least-squares method. Confidence intervals on the fitted values of D can also be calculated. The fitted line is then back-transformed into ROC space.

logit(TPR) = (S + D) / 2
logit(FPR) = (S - D)/2

sensitivity = exp[logit(TPR)] / (1 + exp[logit(TPR)])
specificity = 1 - (exp[logit(FPR)] / (1 + exp[logit(FPR)]))
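The regression and back-transformation steps can be sketched as follows. This is an illustration using ordinary least squares in plain Python, not the SPSS procedure used for this report; the function names and the (S, D) pairs are hypothetical.

```python
import math


def fit_line(s_values, d_values):
    """Least-squares line D = intercept + slope * S."""
    n = len(s_values)
    mean_s = sum(s_values) / n
    mean_d = sum(d_values) / n
    sxx = sum((s - mean_s) ** 2 for s in s_values)
    sxy = sum((s - mean_s) * (d - mean_d) for s, d in zip(s_values, d_values))
    slope = sxy / sxx
    return mean_d - slope * mean_s, slope


def back_transform(s, d):
    """Convert a logit-space point (S, D) to (sensitivity, specificity)."""
    logit_tpr = (s + d) / 2
    logit_fpr = (s - d) / 2
    sensitivity = math.exp(logit_tpr) / (1 + math.exp(logit_tpr))
    specificity = 1 - math.exp(logit_fpr) / (1 + math.exp(logit_fpr))
    return sensitivity, specificity


# Hypothetical (S, D) pairs, one per trial
studies = [(-1.5, 3.2), (-0.8, 3.6), (0.1, 3.4), (0.9, 3.9)]
s_values = [s for s, _ in studies]
d_values = [d for _, d in studies]
intercept, slope = fit_line(s_values, d_values)

# Trace the summary ROC only between the smallest and largest observed S
s_min, s_max = min(s_values), max(s_values)
sroc = [
    back_transform(s, intercept + slope * s)
    for s in (s_min + i * (s_max - s_min) / 10 for i in range(11))
]
for sensitivity, specificity in sroc:
    print(round(sensitivity, 3), round(specificity, 3))
```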

Extrapolating the line beyond the maximum or minimum S values of the original data should be avoided. This prevents calculation of Az, the area under the ROC curve, which is a widely used (but sometimes criticized) measure of test effectiveness.

Experiments using simulated data have shown that the Littenberg and Moses method gives a close approximation to real ROC curves. Without a priori knowledge of the form of the curve, it is impossible to know what the true fit is. Actual ROC data for different tests fit different curve forms, but the logit regression method is an accepted compromise.

To obtain the sensitivity/specificity at mean threshold, we averaged the S values for all trials in the meta-analysis. The corresponding value of D was calculated from the regression equations, and the logit-space (S, D) point was back-transformed into ROC space (sensitivity, specificity) to yield sensitivity/specificity at mean threshold. This mean threshold method is an objective way to select one clinically relevant or typical data point from an SROC curve to use as a base case in decision analysis.
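The mean threshold calculation can be sketched as below. The function name is hypothetical, and the intercept, slope, and S values are illustrative inputs standing in for the results of a regression of D on S.

```python
import math


def sensitivity_specificity_at_mean_threshold(s_values, intercept, slope):
    """Average S over all trials, read D from the fitted line D = intercept + slope * S,
    and back-transform the (S, D) point into (sensitivity, specificity)."""
    mean_s = sum(s_values) / len(s_values)
    d = intercept + slope * mean_s
    tpr = 1 / (1 + math.exp(-(mean_s + d) / 2))  # inverse logit of (S + D) / 2
    fpr = 1 / (1 + math.exp(-(mean_s - d) / 2))  # inverse logit of (S - D) / 2
    return tpr, 1 - fpr


# Hypothetical S values and regression coefficients
s_values = [-1.5, -0.8, 0.1, 0.9]
sensitivity, specificity = sensitivity_specificity_at_mean_threshold(
    s_values, intercept=3.6, slope=0.25
)
print(round(sensitivity, 3), round(specificity, 3))
```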

For this technology assessment, regression calculations were carried out using the SPSS statistical software system (version 6.1 for Windows, SPSS Inc., Chicago, Illinois). Results from the SPSS system were transferred to an Excel spreadsheet (Microsoft Corp., Redmond, Washington) for plotting and to allow comparison of one set of results with another.

Figure E-1. Threshold Effects in Diagnosis
The left peak represents the frequency distribution of the nondiseased population, and the right peak represents the diseased population. The value of the phenomenon measured by the diagnostic test increases horizontally from left to right. The vertical line represents the threshold value. In B, specificity has been increased by raising the threshold, which decreases the number of false positives, but which decreases sensitivity and increases the number of false negatives.

Figure E-2. Mean Sensitivity and Specificity are Not Accurate

Figure E-3. Logit Transform of Sensitivity/specificity Results

Figure E-4. Regression Line Transformed into ROC Space
The outer X marks on these summary ROC plots represent the minimum and maximum thresholds observed in the clinical trials going into the summary ROC. Portions on the curve beyond these thresholds are extrapolations; these sensitivities and specificities may not be attainable in clinical practice.
The X in the center of each curve represents the sensitivity and specificity at the mean threshold. This point is the one best estimate of the sensitivity and specificity of the test.

Appendix F. Extension and Cost-Effectiveness Analysis of Suggested Clinical Trial

In the Future Research section of this evidence report, we described a multicenter, multiarm, randomized trial that would examine the impact of two to four different dysphagia diagnosis and treatment strategies on patient outcomes. The four diagnostic methods of interest were a noninstrumented test alone (the control group), and three groups receiving a noninstrumented test followed by a single instrumented exam. We also suggested that the primary intergroup comparison be on the incidence rate of aspiration pneumonia after treatment, and noted that this trial (or any practical trial) might not have the statistical power to detect certain differences. Therefore, we suggested that a simulated trial be conducted on the basis of the results of the suggested clinical trial. Such a simulation involves decision analysis. Another advantage of incorporating the suggested trial's results into a decision analysis is that we can use the same model to determine the incremental cost-effectiveness of the various diagnostic strategies. When these latter types of analyses are performed, small between-group outcome differences can sometimes be seen to result in substantial cost-effectiveness differences.

Figure F-1 displays a possible structure of such a decision analysis. The instrumented tests shown are used simply as examples and should not be taken as a suggestion of which instrumented tests to include in the proposed trial; we leave this up to the investigators. The beginning of the management pathway is at the leftmost side of the tree, and, depending on how patients were randomized in the clinical trial, they pass through the branches of the tree representing each diagnosis and treatment combination until they experience their ultimate outcome - aspiration pneumonia or no pneumonia - at the terminal nodes on the right side of the tree. Aspiration pneumonia is shown because the purpose of the figure is only to illustrate how decision analysis can be used to analyze results. Other outcomes, such as death, malnutrition, or need for a feeding tube, could be substituted for pneumonia or, better yet, also included in this decision tree.

The figure also illustrates one way the results of the clinical trial can be extended by a decision analysis. Thus, what is shown is how data from the trial can be used to determine whether certain patients given certain treatments fare better than those given others. In other words, one can determine whether patients given diet modification fare as well as, say, patients given diet modification plus speech-language therapy.

We suggest that this decision tree also be used to perform a cost-effectiveness analysis. In this case, costs would be stored at the terminal node at the end of each branch on the right side of the tree and would consist of all medical costs incurred during diagnosis and treatment, including any costs for complications that occur as a result of the dysphagia.1 For example, even if the outcome of interest were pneumonia, costs for a temporary feeding tube should be included, in addition to the costs of treating the pneumonia. The cost stored in the tree, then, would be the average cost incurred by each patient in the clinical trial undergoing that particular diagnosis-treatment strategy. Distributions of the incurred costs could also be incorporated into the tree (this would be done for a Monte Carlo analysis; see below for further discussion of these analyses). Cost-effectiveness would then be measured as the incremental cost-effectiveness of a given strategy and/or treatment.
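One common way to express incremental cost-effectiveness is as the difference in mean cost divided by the difference in effect between two strategies. The sketch below assumes that definition; the function name, costs, and pneumonia rates are hypothetical and are not results from any trial.

```python
def incremental_cost_effectiveness(cost_new, effect_new, cost_ref, effect_ref):
    """Extra cost per extra unit of effect for a strategy relative to a reference."""
    delta_effect = effect_new - effect_ref
    if delta_effect == 0:
        raise ValueError("the two strategies have identical effectiveness")
    return (cost_new - cost_ref) / delta_effect


# Hypothetical per-patient mean costs (dollars) and pneumonia rates
control = {"cost": 5000.0, "pneumonia_rate": 0.13}
instrumented = {"cost": 5600.0, "pneumonia_rate": 0.07}

# Effect is expressed as the proportion of patients who avoid pneumonia
icer = incremental_cost_effectiveness(
    instrumented["cost"], 1 - instrumented["pneumonia_rate"],
    control["cost"], 1 - control["pneumonia_rate"],
)
print(round(icer))  # dollars per additional pneumonia case avoided
```

With these illustrative inputs, a 6-percentage-point reduction in pneumonia at an added cost of $600 per patient corresponds to roughly $10,000 per case avoided.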

To more closely approximate cost-effectiveness from the societal perspective, the decision tree could include additional costs that society would ultimately pay, such as any costs that family members might incur as a result of transporting a patient to a hospital or other care facility. This would provide information at the highest level of evidence in the hierarchy we describe in the Methods section of this evidence report. We note, however, that such analyses are not often conducted because of their difficulty. More practical might be to incorporate information about the prevalence of disease (such as that contained in the Burden of Illness section of this report) to extend the analysis to estimate the total national direct costs of the disease.

Other ways in which the results of the clinical trial can be extended are shown in Figure F-2. Figure F-2a shows how it is possible to use a decision tree to compare the outcomes of those who do and do not experience morbidity. Separate branches can be constructed for each morbidity to determine which is most likely to result in death, and/or which is most costly. Figure F-2b illustrates how the outcomes of patients with different clinical courses can be compared. In this hypothetical decision tree, patients are followed on the basis of their weight status; feeding tube usage is the short-term outcome of interest. Then the proportion of patients needing feeding tubes and who experience morbidity is compared with the proportion of such patients who do not experience morbidity. This particular analysis, for example, could be conducted to determine: (a) whether patients who lose weight but do not receive a feeding tube fare better or worse than those who do receive a feeding tube, (b) whether such patients fare better than those who maintain or gain weight but who do not get a feeding tube, or (c) whether it is cost-effective to place patients who lose weight on a feeding tube.

It is prudent to consider analyzing the results of any decision tree that results from the suggested clinical trial as a Monte Carlo simulation. In effect, such simulations allow one to model what would happen to a much larger group of patients than one could attain in a clinical trial setting. Further, the results of Monte Carlo simulations are expressed in ways not unlike the results of analyses of clinical trials. Thus, such simulations result in a mean value, standard deviations, and so on. This enables one to ask questions such as: Given that the model suggests that treating silent aspiration improves patient outcomes, what percentage of individuals in the simulated population can be expected to benefit?

The data for a Monte Carlo analysis will be available from the results of the proposed clinical trial. Thus, one will know not only the probability of each outcome, but also the distribution of each outcome. This includes even distributions for dichotomous variables expressed in terms of a proportion (e.g., the proportion of patients who developed pneumonia). This is because from this proportion and the 95 percent confidence interval (CI), or any other measure of dispersion, that surrounds it, one can calculate the shape of the binomial distribution from which the proportion was drawn. Thus, this tree can be run as a second-order Monte Carlo analysis in which each probability and each cost entered into the tree has an accompanying distribution. In this way the decision tree will include all of the relevant trial data, which will make it a simulated clinical trial that is as similar as possible to an actual clinical trial.
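
A second-order Monte Carlo analysis of a single chance node might look like the sketch below. It rests on assumptions that are not part of this report: the trial counts (8 pneumonia cases in 100 patients) are hypothetical, and a beta distribution built from those counts is used as a convenient stand-in for the uncertainty around the observed proportion.

```python
import random
import statistics


def simulate_pneumonia_rate(cases: int, patients: int, n_trials: int,
                            cohort: int, rng: random.Random) -> list[float]:
    """Second-order Monte Carlo: redraw the probability, then simulate a cohort."""
    rates = []
    for _ in range(n_trials):
        # Outer (second-order) draw: uncertainty about the true probability.
        # Beta(cases + 1, non-cases + 1) is an assumed stand-in for the
        # distribution implied by the observed proportion.
        p = rng.betavariate(cases + 1, patients - cases + 1)
        # Inner (first-order) draw: patient-level outcomes in a simulated cohort.
        events = sum(1 for _ in range(cohort) if rng.random() < p)
        rates.append(events / cohort)
    return rates


rng = random.Random(0)  # fixed seed so the run is repeatable
rates = simulate_pneumonia_rate(cases=8, patients=100, n_trials=2000, cohort=500, rng=rng)
print(f"mean simulated rate: {statistics.mean(rates):.3f}")
print(f"standard deviation:  {statistics.stdev(rates):.3f}")
```

In a full tree, every probability and every cost would be redrawn in the outer step before each simulated cohort is run.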

Figure F-1. Suggested Design for Decision Analysis of Diagnosis and Treatment of Dysphagia
Note: Instrumented exams included in this figure are examples only, and not meant as recommendations for trial inclusion.

Figure F-2. Examples of Other Extensions of Suggested Clinical Trial
F-2a. Morbidity and Mortality Resulting from Each Diagnostic Method
F-2b. Comparing Patients with Different Clinical Courses

Appendix G. The Effect of Treatment on Diagnostic Test Results

The worth of a diagnostic test is inextricably bound up with the effectiveness of the treatment being given based on the results of the test. Therefore, clinical trials measuring after-treatment outcomes are preferable in assessing the impact of the test. The problem with this approach is that effective treatment will decrease the apparent effectiveness of the test.

This is best illustrated by the example tabled below, which represents the results of a hypothetical diagnostic test.

                    Patient's Actual Condition
Test Result         Positive            Negative
Positive            20 TP               10 FP               67% PPV
Negative            10 FN               60 TN               86% NPV
                    67% sensitivity     86% specificity
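
The four test characteristics in the table above follow directly from the cell counts. The short sketch below recomputes them; the function name is an illustrative choice, not something taken from this report.

```python
def diagnostic_metrics(tp: float, fp: float, fn: float, tn: float) -> dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # diseased patients who test positive
        "specificity": tn / (tn + fp),  # non-diseased patients who test negative
        "ppv": tp / (tp + fp),          # positive tests that are correct
        "npv": tn / (tn + fn),          # negative tests that are correct
    }


metrics = diagnostic_metrics(tp=20, fp=10, fn=10, tn=60)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
# sensitivity: 67%
# specificity: 86%
# ppv: 67%
# npv: 86%
```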

Now assume that a treatment is added between the time of the test and the time the outcome (patient's condition) is measured. The treatment would be given only to those who test positive. Assume also that the treatment is 50 percent effective, so out of the 20 patients with a true-positive (TP) test result, the condition of 10 will change from positive to negative. [The 10 false-positive (FP) cases will also get treated, but since these patients do not have the disease in question, the treatment has no effect on patient outcome.] The results of this hypothetical trial are shown in the table below.

                    Patient's Actual Condition
Test Result         Positive            Negative
Positive            10 TP               20 FP               33% PPV
Negative            10 FN               60 TN               86% NPV
                    50% sensitivity     75% specificity

Notice that sensitivity and specificity both decreased as a result of treatment. Positive predictive value (PPV) decreased substantially, but negative predictive value (NPV) was unchanged.

The magnitude of this treatment bias depends on the effectiveness of treatment. If the hypothetical treatment were 100 percent effective, then all 20 true positives would become false positives, and the sensitivity and PPV would both fall to zero:

                    Patient's Actual Condition
Test Result         Positive            Negative
Positive            0 TP                30 FP               0% PPV
Negative            10 FN               60 TN               86% NPV
                    0% sensitivity      67% specificity
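
The two after-treatment tables above can be reproduced by moving cured true positives into the false-positive cell. The sketch below does this for 50 percent and 100 percent effectiveness; the function names are illustrative only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def treat_test_positives(tp, fp, fn, tn, effectiveness):
    """Treat only test-positive patients; cured true positives are then counted as false positives."""
    cured = tp * effectiveness
    return tp - cured, fp + cured, fn, tn


baseline = (20, 10, 10, 60)  # TP, FP, FN, TN from the first table
for effectiveness in (0.5, 1.0):
    cells = treat_test_positives(*baseline, effectiveness)
    metrics = diagnostic_metrics(*cells)
    summary = ", ".join(f"{name} {value:.0%}" for name, value in metrics.items())
    print(f"effectiveness {effectiveness:.0%}: {summary}")
# effectiveness 50%: sensitivity 50%, specificity 75%, ppv 33%, npv 86%
# effectiveness 100%: sensitivity 0%, specificity 67%, ppv 0%, npv 86%
```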

Treating all patients regardless of test results will not eliminate this bias. It will, however, eliminate its effect on sensitivity because the numbers of true positives and false negatives (FN) will each be reduced by the same proportion (treatment effectiveness). This is shown in the table below.

                    Patient's Actual Condition
Test Result         Positive            Negative
Positive            10 TP               20 FP               33% PPV
Negative            5 FN                65 TN               93% NPV
                    67% sensitivity     76% specificity

Treatment, if given to all patients, will almost always cause PPV to decrease and NPV to increase. Sensitivity will be unchanged, while specificity will decrease. Exceptions to this rule exist when the test results are actually worse than chance. (We stress, however, that such a test is unlikely to ever be used in clinical practice.) In this case, specificity would increase, as shown in the table below.

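Before treatment
                    Patient's Actual Condition
Test Result         Positive            Negative
Positive            10 TP               20 FP               33% PPV
Negative            40 FN               30 TN               43% NPV
                    20% sensitivity     60% specificity

After treatment
                    Patient's Actual Condition
Test Result         Positive            Negative
Positive            5 TP                25 FP               17% PPV
Negative            20 FN               50 TN               71% NPV
                    20% sensitivity     67% specificity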
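
When every patient is treated, both the true-positive and false-negative cells shrink by the same factor, which is why sensitivity is unchanged. The sketch below applies a 50 percent effective treatment to all patients for the two starting tables used in this appendix; the function names are illustrative only.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def treat_everyone(tp, fp, fn, tn, effectiveness):
    """Treat all patients: cured TPs become FPs and cured FNs become TNs."""
    cured_tp = tp * effectiveness
    cured_fn = fn * effectiveness
    return tp - cured_tp, fp + cured_tp, fn - cured_fn, tn + cured_fn


useful_test = (20, 10, 10, 60)        # TP, FP, FN, TN from the first table
worse_than_chance = (10, 20, 40, 30)  # the worse-than-chance example

for label, cells in (("useful test", useful_test), ("worse than chance", worse_than_chance)):
    before = diagnostic_metrics(*cells)
    after = diagnostic_metrics(*treat_everyone(*cells, effectiveness=0.5))
    print(label)
    for name in before:
        print(f"  {name}: {before[name]:.0%} -> {after[name]:.0%}")
```

For the useful test, specificity falls from 86 to 76 percent; for the worse-than-chance test, it rises from 60 to 67 percent, matching the tables above.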

Appendix H. Original Calculations in This Report

This evidence report contains numerous original calculations, the purpose of which was to derive data not presented in the published articles or to convert data to standard metrics to facilitate between-study comparisons. In this section, we concentrate only on those original calculations that were performed to assist us in arriving at the conclusions of this report. A number of other original calculations in the appendices are not discussed here.

Epidemiology and Burden of Illness Sections

These two sections (and their associated appendices) contain 99 original calculations. These calculations include all of the totals and rates shown in the in-text tables in these sections and most of the percentages shown in Evidence Tables 1-23. To convert rates of disease to total burden of illness (the total number of people affected by the disease), we multiplied the disease rates reported in the clinical literature (per 100,000 individuals) by the total number of individuals in the relevant population as reported by government statistics.
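
The rate-to-burden conversion is simple arithmetic, sketched below. The rate and population figures are hypothetical and are not taken from the evidence tables.

```python
def total_cases(rate_per_100k: float, population: int) -> float:
    """Convert a disease rate per 100,000 individuals into a total number of affected people."""
    return rate_per_100k / 100_000 * population


# Hypothetical inputs, for illustration only.
cases = total_cases(rate_per_100k=250.0, population=40_000_000)
print(f"Estimated number of people affected: {cases:,.0f}")
```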

Question 1

  • Pneumonia rates standardized: This question concerned pneumonia rates (cumulative proportional incidence) in patients in dysphagia programs compared with historical controls. Pneumonia rates were usually reported as raw numbers of cases observed in a certain number of patients. For comparison purposes, these fractional rates (all having different denominators) were all converted by ECRI to percentages. One or more percentages were calculated and tabled for 10 studies (Evidence Tables 24-29).

  • Mean pneumonia rates for multi-interval dysphagia programs: Two studies reported pneumonia rates for dysphagia programs after more than just one followup period. For descriptive purposes, rates for these years were averaged for each program.

  • Pneumonia rates adjusted to standardized time interval: Each study collected data for a different period of followup [usually hospital length of stay (LOS)]; some researchers reported on different followup periods for different groups of patients within a single study. We adjusted all of these pneumonia rates to a common interval following stroke as described in Appendix D. Briefly, we found two studies that reported chest infection rates out to 30 days following stroke. We fitted a curve to these data using SPSS statistical software and used this curve to interpolate or extrapolate the proportion of chest infection cases for any chosen interval of followup after stroke. (We assumed the proportion of pneumonia cases that accumulated over a given interval would be similar for all the studies, although the proportion of patients acquiring pneumonia would be unique for each study.) We then used the pneumonia rate reported in each study, together with the proportion given by our fitted curve, to calculate the expected rate (called the adjusted pneumonia incidence) over an arbitrary standardized interval of 2 weeks, chosen by us for its proximity to all of the reported intervals. This time adjustment was calculated for seven historical control pneumonia rates and three dysphagia program rates.

  • 95 percent confidence intervals (CIs): None of the studies reported confidence intervals for their pneumonia rates. Therefore, we calculated and tabled the upper and lower 95 percent confidence limits for each rate, both time-adjusted and unadjusted. We also calculated the 95 percent CI for the within-study or between-study means calculated by ECRI. In all, we calculated upper and lower 95 percent confidence limits for 30 rates (60 calculations).

  • Exploratory meta-analysis of historical control pneumonia rate for acute stroke: Some of the dysphagia program studies were case series. As such, they had no control group. Therefore, to estimate the pneumonia rate following stroke in the absence of a dysphagia management program, we carried out an exploratory analysis that involved pooling the results of the four historical control studies reported in the published literature. We calculated a range, an unweighted between-study mean, and a pooled N-weighted between-study mean; all three measures were presented as both time-adjusted and unadjusted.

  • Effect sizes for individual studies (absolute difference between historical controls and dysphagia program pneumonia rates): Two of the four studies reporting pneumonia rates in dysphagia management programs (one for acute care of stroke patients and one for dysphagia patients in nursing homes) included within-study historical controls. However, neither of these studies calculated and reported an effect size. We calculated effect sizes using the difference between each dysphagia management program pneumonia rate and the within-study historical-control pneumonia rate. For the two studies with no within-study controls, we used the historical control pneumonia rate from our above-mentioned meta-analysis to calculate an effect size. In addition, for consistency, the one acute-care stroke study with a within-study control group was also contrasted with this out-of-study historical control from our meta-analysis, and an effect size was calculated for comparison to the effect sizes calculated for the studies without in-study controls. Altogether, five single-study effect sizes were calculated, both time-adjusted and unadjusted (10 calculations).

  • Statistical significance tests: To test the statistical significance of the above effect sizes, we calculated the upper and lower 95 percent CIs around the above five effect sizes (the absolute differences), both time-adjusted and unadjusted. Any interval that did not include zero was reported by us as a statistically significant effect size (with alpha = 0.05).

  • Meta-analysis of acute stroke dysphagia program studies' mean effect size: The three acute stroke dysphagia program studies were pooled, and the range, unweighted between-study mean, and N-weighted between-study mean pneumonia rate were calculated, both time-adjusted and unadjusted. We calculated the upper and lower 95 percent confidence limits for all of these means. These weighted and unweighted means were contrasted with the weighted and unweighted means from the above historical control meta-analysis, and the effect sizes were calculated by us as the absolute difference for each set of means (weighted and unweighted), both time-adjusted and unadjusted. The statistical significance for these four effect sizes was tested by calculating the 95 percent confidence limits for each difference and assessing whether each confidence interval included zero.

  • Proportional reduction of risk calculated as a secondary effect size: For small proportions such as the pneumonia rates considered above (mean historical rates, 6.7 to 13 percent; dysphagia management program rates, 0 to 2.8 percent), substantial reductions of risk can appear misleadingly small when only the absolute difference is reported (6.4 to 9.2 percent in these studies). To balance this, we also calculated the proportional reduction in risk (74 to 100 percent in these studies) as a secondary effect size for each individual study and for all weighted or unweighted between-study means, both time-adjusted and unadjusted if appropriate (20 calculations).

Calculations in Evidence Tables: To facilitate identification of the above-described original calculations in evidence tables, we provide the following list:

  • Evidence Table 24: Converted all fractions to percents for pneumonia rates.

  • Evidence Table 25: Converted all fractions to percents for pneumonia rates; calculated the mean for the 2 years; adjusted pneumonia rates to a 1-week interval for the pre-program year, years 1 and 2, and the mean for the 2 years; calculated 95 percent CIs for the pre-program year, years 1 and 2, and the mean of the 2 years, all with and without time adjustment (8 intervals X 2 limits each = 16 calculations); calculated differences between the pre-program year and years 1 and 2 and the mean, with and without time adjustment; calculated 95 percent CIs around these differences (6 intervals X 2 limits = 12 calculations); assessed statistical significance (6 intervals); calculated proportional reduction of risk for years 1 and 2 and the mean, with and without time adjustment (6 calculations).

  • Evidence Table 26: Calculated pneumonia rates; converted all fractions to percents; adjusted all rates to a 2-week interval; calculated 95 percent CIs for each rate, with and without time adjustment (6 X 2 X 2 = 24); calculated the between-studies range, unweighted mean, and N-weighted mean, with and without time adjustment (6 calculations); calculated 95 percent CIs for the weighted mean, with and without time adjustment (4 calculations).

  • Evidence Table 27: For pneumonia rates, converted all fractions to percents; time-adjusted all rates (3 calculations); calculated 95 percent CIs, with and without time adjustment (3 X 2 X 2 limits = 12 calculations); calculated the difference from the historical control pneumonia rate, with and without time adjustment (6 calculations); calculated 95 percent CIs around these differences (3 X 2 X 2 limits = 12 calculations); assessed statistical significance (6 calculations); calculated proportional reduction of risk, with and without time adjustment (3 X 2 = 6 calculations).

  • Evidence Table 28: Calculated pneumonia rate and 95 percent confidence intervals.

  • Evidence Table 29: For pneumonia rates, converted all fractions to percents; calculated 95 percent CIs for the pre-program year and program year; calculated the difference between the pre-program and program year; calculated the 95 percent CI around this difference; assessed statistical significance; computed proportional reduction in risk.
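
The chain of Question 1 calculations (percentages, confidence limits, absolute differences, significance, proportional reduction of risk, and N-weighted means) can be sketched as below. Two points are assumptions rather than statements about this report: the text does not say which confidence interval formula was used, so a normal-approximation interval is shown, and all case counts are hypothetical.

```python
import math

Z_95 = 1.96  # two-sided 95 percent normal quantile


def proportion_ci(cases: int, n: int) -> tuple[float, float, float]:
    """Rate with normal-approximation 95 percent confidence limits (assumed method)."""
    p = cases / n
    half_width = Z_95 * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def difference_ci(cases_a: int, n_a: int, cases_b: int, n_b: int) -> tuple[float, float, float]:
    """Absolute difference between two rates with 95 percent confidence limits."""
    p_a, p_b = cases_a / n_a, cases_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_a - p_b
    return diff, diff - Z_95 * se, diff + Z_95 * se


def proportional_reduction(control_rate: float, program_rate: float) -> float:
    """Share of the control-group risk removed by the program."""
    return (control_rate - program_rate) / control_rate


def n_weighted_mean(studies: list[tuple[int, int]]) -> float:
    """Pooled rate over (cases, n) pairs; each study is weighted by its size."""
    return sum(cases for cases, _ in studies) / sum(n for _, n in studies)


# Hypothetical counts: 12 of 100 controls and 2 of 100 program patients with pneumonia.
diff, lower, upper = difference_ci(12, 100, 2, 100)
significant = lower > 0 or upper < 0  # interval excludes zero
reduction = proportional_reduction(12 / 100, 2 / 100)
pooled = n_weighted_mean([(12, 100), (3, 50)])

print(f"difference {diff:.1%} (95 percent CI {lower:.1%} to {upper:.1%}), significant: {significant}")
print(f"proportional reduction of risk: {reduction:.0%}")
print(f"N-weighted mean rate: {pooled:.1%}")
```

The example also shows why the proportional reduction was reported alongside the absolute difference: a difference of 10 percentage points corresponds here to removing most of the control-group risk.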

Question 2

  • Sensitivity and specificity of clinical signs and symptoms: To examine the ability of the bedside examination (BSE) and its subtests to detect aspiration, we abstracted data from six different trials that described 24 subtests. For each of these data points, we calculated sensitivity and specificity (if it was not calculated already) or we verified the authors' calculations. The points were then plotted on a receiver operating characteristic (ROC) graph. For each of the nine different subtests, we examined the available data to determine whether it was appropriate to combine data from different trials in a meta-analysis. In most of these cases, there were too few data for a valid meta-analysis to be performed. In the others, design of the original studies varied too much to make them combinable.

  • Meta-analysis of the sensitivity and specificity of the BSE: Six trials measured the ability of a complete BSE to predict aspiration (with barium swallow results used as the gold standard). After evaluating these trials, we determined that they could be combined in a meta-analysis. ECRI's standard method for meta-analysis of diagnostic tests is based on the logit regression method of Littenberg and Moses (1993) and is portrayed as a summary ROC curve. Following the method of Littenberg and Moses, we applied a correction term to one data point with a reported sensitivity of zero, because the logit of zero is undefined. The details of this meta-analytic method are provided in Appendix E but, briefly, this meta-analysis was performed using SPSS-based meta-analysis routines developed at ECRI. Regression results were entered into a spreadsheet for calculation of the summary ROC curves. The spreadsheet calculated and plotted the summary ROC curve, the 95 percent confidence interval on the summary ROC curve, and the sensitivity and specificity values given by the curve at the mean, maximum, and minimum thresholds observed in the clinical trials. We also examined available data comparing the BSE to the barium swallow in terms of pneumonia prediction and found too few data to permit meta-analysis.

  • Meta-analysis of the sensitivity and specificity of the 3-ounce water test: The summary ROC meta-analysis was repeated for the four clinical trials measuring the ability of the 3-ounce water test to predict aspiration.

  • Calculations in evidence tables: All of the calculations of sensitivity, specificity, PPV, and NPV shown in Evidence Tables 33 through 38 are ECRI-performed.
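
A summary ROC fit in the spirit of the logit regression method named above can be sketched as follows. This is not ECRI's SPSS routine: it uses an unweighted least-squares fit of D = a + bS, where D and S are the difference and sum of the logits of the true-positive and false-positive rates. Adding 0.5 to each cell of a table that contains a zero is an assumed form of the correction term, and the 2x2 counts are hypothetical.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1 - p))


def corrected_rates(tp, fp, fn, tn):
    """True- and false-positive rates; 0.5 is added to every cell if any cell is zero."""
    c = 0.5 if 0 in (tp, fp, fn, tn) else 0.0
    tpr = (tp + c) / (tp + fn + 2 * c)
    fpr = (fp + c) / (fp + tn + 2 * c)
    return tpr, fpr


def fit_sroc(tables):
    """Unweighted least-squares fit of D = a + b*S across studies."""
    d_vals, s_vals = [], []
    for tp, fp, fn, tn in tables:
        tpr, fpr = corrected_rates(tp, fp, fn, tn)
        d_vals.append(logit(tpr) - logit(fpr))
        s_vals.append(logit(tpr) + logit(fpr))
    n = len(tables)
    s_mean, d_mean = sum(s_vals) / n, sum(d_vals) / n
    sxx = sum((s - s_mean) ** 2 for s in s_vals)
    sxy = sum((s - s_mean) * (d - d_mean) for s, d in zip(s_vals, d_vals))
    b = sxy / sxx
    a = d_mean - b * s_mean
    return a, b


def sroc_sensitivity(fpr: float, a: float, b: float) -> float:
    """Sensitivity on the summary curve at a given false-positive rate."""
    z = a / (1 - b) + (1 + b) / (1 - b) * logit(fpr)
    return 1 / (1 + math.exp(-z))


# Hypothetical (TP, FP, FN, TN) counts; the third study has a sensitivity of zero.
studies = [(18, 6, 7, 29), (10, 10, 5, 35), (0, 2, 8, 30), (25, 12, 5, 18), (12, 4, 8, 26)]
a, b = fit_sroc(studies)
for fpr in (0.1, 0.2, 0.3):
    print(f"false-positive rate {fpr:.1f}: sensitivity {sroc_sensitivity(fpr, a, b):.2f}")
```

The curve equation follows from solving D = a + bS for the logit of the true-positive rate, so each point printed is a sensitivity and (1 minus) specificity pair lying on the fitted curve.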

Question 3

The original calculations in this section appear primarily in evidence tables. For this reason, these calculations are listed by table.

  • Evidence Tables 41 and 42: Original calculations include the sensitivities, specificities, PPVs, and NPVs, as well as many of the percentages shown in these tables.

  • Evidence Table 43: For pneumonia rates, converted all fractions to percents, calculated 95 percent CI, calculated difference, calculated 95 percent CI around difference, assessed statistical significance, and computed proportional reduction in risk.

  • Evidence Table 44: Calculated test characteristics (sensitivity, specificity, PPV, NPV) for modified barium swallow (MBS), fiberoptic endoscopic examination of swallowing (FEES), and MBS with FEES.

  • Evidence Table 45: Calculated positive agreement, negative agreement, MBS positive and FEES negative, FEES positive and MBS negative for four studies.

Question 4

  • Performed statistical power calculations.

Synthesis of Results Common to the Four Questions

  • Evidence Table 69: For the first four studies in this table, the percents, differences, confidence intervals, and reduction in relative risk were calculated as described under Question 1 above. For the remaining four studies these same calculations were carried out and statistical significance of the differences was assessed. In addition, Fisher's exact test was used to compute a p-value for each of the eight studies and statistical significance of these p-values was assessed.

  • Illustrative meta-analysis: For eight studies, conducted meta-analysis by sum of Fisher p-values, conducted meta-analysis by z-scores, conducted statistical test of heterogeneity.
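
Two standard ways of combining p-values across studies are sketched below. Reading "sum of Fisher p-values" as Fisher's combined probability method and "z-scores" as the summed z-score (Stouffer) method is an assumption, and the eight p-values are hypothetical one-sided values, not results from Evidence Table 69.

```python
import math
from statistics import NormalDist


def fisher_combined(p_values: list[float]) -> tuple[float, float]:
    """Fisher's method: -2 * sum(ln p) is chi-square with 2k degrees of freedom."""
    k = len(p_values)
    statistic = -2 * sum(math.log(p) for p in p_values)
    # Closed-form upper tail of the chi-square distribution for even degrees of freedom (2k).
    half = statistic / 2
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return statistic, tail


def stouffer_z(p_values: list[float]) -> tuple[float, float]:
    """Combine one-sided p-values by summing their z-scores."""
    normal = NormalDist()
    z_scores = [normal.inv_cdf(1 - p) for p in p_values]
    z = sum(z_scores) / math.sqrt(len(z_scores))
    return z, 1 - normal.cdf(z)


# Hypothetical one-sided p-values for eight studies.
p_values = [0.04, 0.20, 0.01, 0.35, 0.08, 0.15, 0.03, 0.50]

chi_square, p_fisher = fisher_combined(p_values)
z, p_stouffer = stouffer_z(p_values)
print(f"Fisher: chi-square = {chi_square:.2f} on {2 * len(p_values)} df, p = {p_fisher:.4f}")
print(f"Summed z: z = {z:.2f}, p = {p_stouffer:.4f}")
```

Both functions return the original p-value when given a single study, which is a convenient sanity check.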

Future Research

  • As part of proposed trial, provide statistical power analysis.

  • Calculate number of subjects needed in proposed trial to perform statistical analysis.

Supplemental Analysis

Our supplemental analysis consists primarily of original calculations, and the reader is referred to it for further detail. Briefly, however, we performed a complete sensitivity analysis on the incremental cost-effectiveness calculation. This entailed selecting the variables and ranges to evaluate; a total of nine variables were analyzed. Sensitivity analyses were performed using the DATA decision tree software with which the trees were created (version 3.0.18, TreeAge Software, Williamstown, MA). Incremental cost and incremental pneumonias prevented were calculated with DATA, and the incremental cost-effectiveness was calculated with an Excel spreadsheet. The sensitivity analysis was repeated for barium swallow and for FEES.

Six different two-way sensitivity analyses were performed. Three pairs of variables were analyzed for both MBS and FEES. Two-way analysis of incremental cost and incremental pneumonias prevented was carried out with DATA, and the results of each analysis were read into an Excel spreadsheet developed by ECRI for this purpose. The spreadsheet validates the input data, calculates incremental cost-effectiveness for each of 121 points (11 points, 10 intervals for each variable), and provides output in the form of a table and a three-dimensional graph.
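
The 121-point grid described above can be sketched as a pair of nested loops over two variables, each taking 11 evenly spaced values. The cost model, variable names, and ranges below are hypothetical stand-ins for the decision-tree output; only the grid layout (11 points, 10 intervals per variable) comes from the text.

```python
PNEUMONIA_COST = 10_000.0  # hypothetical cost of treating one pneumonia case


def linspace(low: float, high: float, points: int = 11) -> list[float]:
    """Evenly spaced values: 11 points give 10 intervals."""
    step = (high - low) / (points - 1)
    return [low + i * step for i in range(points)]


def incremental_cost(test_cost: float, risk_reduction: float) -> float:
    """Stand-in for the tree output: added test cost minus pneumonia treatment cost avoided."""
    return test_cost - risk_reduction * PNEUMONIA_COST


test_costs = linspace(100.0, 600.0)      # hypothetical range for variable 1
risk_reductions = linspace(0.01, 0.11)   # hypothetical range for variable 2

grid = []
for test_cost in test_costs:
    row = []
    for risk_reduction in risk_reductions:
        prevented = risk_reduction  # pneumonias prevented per patient tested
        row.append(incremental_cost(test_cost, risk_reduction) / prevented)
    grid.append(row)

n_points = sum(len(row) for row in grid)
print(f"grid points evaluated: {n_points}")  # 11 x 11 = 121
```

Each row of `grid` corresponds to one value of the first variable, so the nested list maps directly onto a table or a three-dimensional surface plot.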

Evidence Tables

References
Adams PF, Marano MA. Current estimates from the national health interview survey, 1994. Vital Health Stat 10. 1995 Dec; (193): 1213, 18-19.
Ahmad M, Fergus L, Stothard P, Harrington D, Sivak E, Farmer R. Impact of diagnosis-related groups' prospective payment on utilization of medical intensive care. Chest. 1988 Jan; 93(1): 1769.
Akpunonu BE, Mutgi AB, Roberts C, Khuder SA, Federman DJ, Lee L. Modified barium swallow does not affect how often PEGs are placed after stroke. J Clin Gastroenterol. 1997 Mar; 24(2): 748.
Ali GN, Wallace KL, Schwartz R, Decarle DJ, Zagami AS, Cook IJ. Mechanisms of oral-pharyngeal dysphagia in patients with Parkinson's disease. Gastroenterology. 1996 Feb; 110(2): 38392.
American Speech-Language-Hearing Association (ASHA), Health Care Financing Administration. Salary equivalency per hour limits. Medicare in 1998-1999. Health Care Financing Administration; 1998 Mar 6. 7 p. Available from ASHA.
Annegers JF, Appel S, Lee JR, Perkins P. Incidence and prevalence of amyotrophic lateral sclerosis in Harris County, Texas, 1985-1988. Arch Neurol. 1991 Jun; 48(6): 58993.
Armitage P, Berry G. Statistical methods in medical research. 3rd ed. Oxford: Blackwell Scientific; 1994. Chapter 16, Statistical methods in epidemiology.
Arrowsmith H. Nursing management of patients receiving gastrostomy feeding. Br J Nurs. 1996 Mar 14-27; 5(5): 26873.
Aviv JE. Sensory discrimination in the larynx and hypopharynx. Otolaryngol Head Neck Surg. 1997 Mar; 116(3): 3314.
Aviv JE, Kim T, Sacco RL, Kaplan S, Goodhart K, Diamond B, Close LG. FEESST: a new bedside endoscopic test of the motor and sensory components of swallowing. Ann Otol Rhinol Laryngol. 1998 May; 107(5 Pt 1): 37887.
Aviv JE, Martin JH, Keen MS, Debell M, Blitzer A. Air pulse quantification of supraglottic and pharyngeal sensation: a new technique. Ann Otol Rhinol Laryngol. 1993 Oct; 102(10): 77780.
Aviv JE, Sacco RL, Mohr JP, Thompson JL, Levin B, Sunshine S, Thomson J, Close LG. Laryngopharyngeal sensory testing with modified barium swallow as predictors of aspiration pneumonia after stroke. Laryngoscope. 1997 Sep; 107(9): 125460.
Aviv JE, Sacco RL, Thomson J, Tandon R, Diamond B, Martin JH, Close LG. Silent laryngopharyngeal sensory deficits after stroke. Ann Otol Rhinol Laryngol. 1997 Feb; 106(2): 8793.
Awe WC, Fletcher WS, Jacob SW. The pathophysiology of aspiration pneumonitis. Surgery. 1966 Jul; 60(1): 2329.
Axelsson K, Asplund K, Norberg A, Eriksson S. Eating problems and nutritional status during hospital stay of patients with severe stroke. J Am Diet Assoc. 1989 Aug; 89(8): 10926.
Barer DH. The natural history and functional consequences of dysphagia after hemispheric stroke. J Neurol Neurosurg Psychiatry. 1989 Feb; 52(2): 23641.
Barker WH, Mullooly JP. Stroke in a defined elderly population, 1967-1985. A less lethal and disabling but no less common disease. Stroke. 1997 Feb; 28(2): 28490.
Bartlett JG. The bacteriology of pulmonary infections following aspiration. West J Med. 1974; 121: 3957.
Bartlett JG, Gorbach SL. The triple threat of aspiration pneumonia. Chest. 1975 Oct; 68(4): 5606.
Bass NH, Morrell RM. The neurology of swallowing. In: Groher ME, editor(s). Dysphagia: diagnosis and management. 2nd ed. Boston: Butterworth-Heinemann; 1992. p. 1-29.
Bastian RW. Videoendoscopic evaluation of patients with dysphagia: an adjunct to the modified barium swallow. Otolaryngol Head Neck Surg. 1991 Mar; 104(3): 33950.
Bastian RW. The videoendoscopic swallowing study: an alternative and partner to the videofluoroscopic swallowing study. Dysphagia. 1993 Fall; 8(4): 35967.
Baum BJ, Bodner L. Aging and oral motor function: evidence for altered performance among older persons. J Dent Res. 1983 Jan; 62(1): 26.
Beard CM, Kokmen E, Offord K, Kurland LT. Is the prevalence of dementia changing? Neurology. 1991 Dec; 41(12): 19114.
Beck-Sague C, Villarino E, Giuliano D, Welbel S, Latts L, Manangan LM, Sinkowitz RL, Jarvis WR. Infectious diseases and death among nursing home residents: results of surveillance in 13 nursing homes. Infect Control Hosp Epidemiol. 1994 Jul; 15(7): 4946.
Beck TJ, Gayler BW. Image quality and radiation levels in videofluoroscopy for swallowing studies: a review. Dysphagia. 1990; 5(3): 11828.
Berg K, Mor V. Medicare nursing home residents with a stroke: characteristics and 90-day outcomes of care. J Aging Health. 1995 Aug; 7(3): 384401.
Biem HJ, Laupacis A. Hard to swallow test [letter; comment]. Arch Neurol. 1994 Feb; 51(2): 11920.
Bisch EM, Logemann JA, Rademaker AW, Kahrilas PJ, Lazarus CL. Pharyngeal effects of bolus volume, viscosity, and temperature in patients with dysphagia resulting from neurologic impairment and in normal subjects. J Speech Hear Res. 1994 Oct; 37(5): 104159.
Bleach NR. The gag reflex and aspiration: a retrospective analysis of 120 patients assessed by videofluoroscopy. Clin Otolaryngol. 1993 Aug; 18(4): 3037.
Blonsky ER, Logemann JA, Boshes B, Fisher HB. Comparison of speech and swallowing function in patients with tremor disorders and in normal geriatric patients: a cinefluorographic study. J Gerontol. 1975 May; 30(3): 299303.
Bower JH, Maraganore DM, McDonnell SK, Rocca WA. Incidence of progressive supranuclear palsy and multiple system atrophy in Olmsted County, Minnesota, 1976 to 1990. Neurology. 1997 Nov; 49(5): 12848.
Boyce HW Jr. Dysphagia: clinical update. St. Louis: Mosby; 1997 Nov 17 [cited 1997 Nov 17]. [5 pages]. Available: http://www.asge.org/library/cu/cudysph.html.
Boyce JM, Potter-Bynoe G, Dziobek L, Solomon SL. Nosocomial pneumonia in Medicare patients. Hospital costs and reimbursement patterns under the prospective payment system. Arch Intern Med. 1991 Jun; 151(6): 110914.
Brenner H, Gefeller O. Variation of sensitivity, specificity, likelihood ratios and predictive values with disease prevalence. Stat Med. 1997 May 15; 16(9): 98191.
Britton JE, Lipscomb G, Mohr PD, Rees WD, Young AC. The use of percutaneous endoscopic gastrostomy (PEG) feeding tubes in patients with neurological disease. J Neurol. 1997 Jul; 244(7): 4314.
Broderick JP, Phillips SJ, Whisnant JP, O'Fallon WM, Bergstralh EJ. Incidence rates of stroke in the eighties: the end of the decline in stroke? Stroke. 1989 May; 20(5): 57782.
Brown RD, Whisnant JP, Sicks JD, O'Fallon WM, Wiebers DO. Stroke incidence, prevalence, and survival: secular trends in Rochester, Minnesota, through 1989. Stroke. 1996 Mar; 27(3): 37380.
Buchholz DW. What is dysphagia? [editorial; comment]. Dysphagia. 1996 Winter; 11(1): 234.
Bushmann M, Dobmeyer SM, Leeker L, Perlmutter JS. Swallowing abnormalities and their response to treatment in Parkinson's disease. Neurology. 1989 Oct; 39(10): 130914.
Bynum LJ, Pierce AK. Pulmonary aspiration of gastric contents. Am Rev Respir Dis. 1976 Dec; 114(6): 112936.
Campbell-Taylor I, Fisher RH. The clinical case against tube feeding in palliative care of the elderly. J Am Geriatr Soc. 1987 Dec; 35(12): 11004.
CDC Wonder [database online]. Atlanta (GA): Department of Health and Human Services; 1998; [cited 1998 Jul 26]. Population of 15 to 85+ years; All races; Both genders; By age (the United States);1996a; [1p].
CDC Wonder [database online]. Atlanta (GA): Department of Health and Human Services; 1998; [cited 1998 Jul 26]. Population of 65 to 85+ years; All races; Both genders; By age-gender (the United States);1996b; [1p].
CDC Wonder [database online]. Atlanta (GA): Department of Health and Human Services; 1998; [cited 1998 Jul 26]. Population of 65 to 85+ years; All races; Both genders; By age (the United States);1996c; [1p].
Centers for Disease Control and Prevention (CDC). Fastats A to Z. Pneumonia. 1998; [cited 1998 Jul 28]. [2]. Available: http://www.cdc.gov/nchswww/fastats/newmonia.htm.
Centers for Disease Control and Prevention (CDC). Table L. Number of deaths and fatality rate for discharges from short-stay hospitals, by age and selected first-listed diagnosis: United States, 1995. In: National Hospital Discharge Survey: Annual Summary, 1995. Washington D.C.: U.S. Government Printing Office; 1998 Jan. 1p. (Vital and Health Statistics; vol. 13, no. 133).
Chandra RK. Nutritional regulation of immunity and risk of infection in old age. Immunology. 1989 Jun; 67(2): 1417.
Chandra RK. The relation between immunology, nutrition and disease in elderly people. Age Ageing. 1990 Jul; 19(4): S2531.
Chen MY, Ott DJ, Peele VN, Gelfand DW. Oropharynx in patients with cerebrovascular disease: evaluation with videofluoroscopy. Radiology. 1990 Sep; 176(3): 6413.
Chen MY, Peele VN, Donati D, Ott DJ, Donofrio PD, Gelfand DW. Clinical and videofluoroscopic evaluation of swallowing in 41 patients with neurologic disease. Gastrointest Radiol. 1992; 17(2): 958.
Chua KS, Kong KH. Functional outcome in brain stem stroke patients after rehabilitation. Arch Phys Med Rehabil. 1996 Feb; 77(2): 1947.
Ciocon JO. Indications for tube feedings in elderly patients. Dysphagia. 1990; 5(1): 15.
CKMD101@aol.com. Cost of videofluoroscopic swallow study. In: Dysphagia.com (Dysphagia discussion group) [online]. [cited 1997 Nov 16]. Available from Internet: dysphagia@cyberport.com.
Coates C, Bakheit AM. Dysphagia in Parkinson's disease. Eur Neurol. 1997; 38(1): 4952.
Cogen R, Weinryb J. Aspiration pneumonia in nursing home patients fed via gastrostomy tubes. Am J Gastroenterol. 1989 Dec; 84(12): 150912.
Collins MJ, Bakheit AM. Does pulse oximetry reliably detect aspiration in dysphagic stroke patients? Stroke. 1997 Sep; 28(9): 17735.
Cook TD, Campbell DT. Quasi-experimentation: design and analysis issues for field settings. Boston: Houghton Mifflin Company; 1979. 405p.
Croghan JE, Burke EM, Caplan S, Denman S. Pilot study of 12-month outcomes of nursing home patients with aspiration on videofluoroscopy. Dysphagia. 1994 Summer; 9(3): 1416.
Daniels SK, Brailey K, Priestly DH, Herrington LR, Weisberg LA, Foundas AL. Aspiration in patients with acute stroke. Arch Phys Med Rehabil. 1998 Jan; 79(1): 149.
Daniels SK, McAdam CP, Brailey K, Foundas AL. Clinical assessment of swallowing and prediction of dysphagia severity. Am J Speech Lang Pathol. 1997 Nov; 6(4): 1724.
Davalos A, Ricart W, Gonzalez-Huix F, Soler S, Marrugat J, Molins A, Suner R, Genis D. Effect of malnutrition after acute stroke on clinical outcome. Stroke. 1996 Jun; 27(6): 102832.
Davenport RJ, Dennis MS, Wellwood I, Warlow CP. Complications after acute stroke. Stroke. 1996 Mar; 27(3): 41520.
de Lama Lazzara G, Lazarus C, Logemann JA. Impact of thermal stimulation on the triggering of the swallowing reflex. Dysphagia. 1986; 1: 737.
DeMeester TR, Bonavina L, Iascone C, Courtney JV, Skinner DB. Chronic respiratory symptoms and occult gastroesophageal reflux. A prospective clinical study and results of surgical therapy. Ann Surg. 1990 Mar; 211(3): 33745.
Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics. Health United States 1996-97 and injury chartbook. Washington D.C.: U.S. Government Printing Office; 1997 Jul. Table 34 (page 1 of 2). Leading causes of death and numbers of deaths, according to age: United States, 1980 and 1995.
Department of Health and Human Services, Centers for Disease Control and Prevention, National Center for Health Statistics. Health United States 1996-97 and injury chartbook. Washington D.C.: U.S. Government Printing Office; 1997 Jul. Table 89 (page 1 of 3). Discharges and average length of stay in non-Federal short-stay hospitals, according to sex, age, and selected first-listed diagnosis: United States, 1985, 1990, 1994, and 1995.
DePaso WJ. Aspiration pneumonia. Clin Chest Med. 1991 Jun; 12(2): 26984.
DePippo KL, Holas MA, Reding MJ. The Burke dysphagia screening test: validation of its use in patients with stroke. Arch Phys Med Rehabil. 1994 Dec; 75(12): 12846.
DePippo KL, Holas MA, Reding MJ. Validation of the 3-oz water swallow test for aspiration following stroke. Arch Neurol. 1992 Dec; 49(12): 125961.
DePippo KL, Holas MA, Reding MJ, Mandel FS, Lesser ML. Dysphagia therapy following stroke: a controlled trial. Neurology. 1994 Sep; 44(9): 165560.
Dey AN. Characteristics of elderly nursing home residents: data from the 1995 National Nursing Home Survey. Adv Data. 1997 Jul 2; 289: .
Dobkin B. Neuromedical complications in stroke patients transferred for rehabilitation before and after diagnostic related groups. J Neurol Rehabil. 1987; 1(1): 37.
Dorland's illustrated medical dictionary. 28th ed. Philadelphia (PA): W.B. Saunders Company; 1994. 1940p.
Edwards LL, Quigley EM, Harned RK, Hofman R, Pfeiffer RF. Characterization of swallowing and defecation in Parkinson's disease. Am J Gastroenterol. 1994 Jan; 89(1): 1525.
Eibling D. The efficacy of endoscopic swallowing evaluation in dysphagic population [dissertation]. 82 p.
Ekberg O. Defective closure of the laryngeal vestibule during deglutition. Acta Otolaryngol (Stockh). 1982 Mar-Apr; 93(3-4): 30917.
Ekberg O. Posture of the head and pharyngeal swallowing. Acta Radiol Diagn (Stockh). 1986 Nov-Dec; 27(6): 6916.
Ekberg O, Feinberg M. Clinical and demographic data in 75 patients with near-fatal choking episodes. Dysphagia. 1992; 7(4): 2058.
Ekberg O, Nylander G, Fork FT, Sjoberg S, Birch-Iensen M, Hillarp B. Interobserver variability in cineradiographic assessment of pharyngeal function during swallow. Dysphagia. 1988; 3(1): 468.
Elmstahl S, Persson M, Andren M, Blabolil V. Malnutrition in geriatric patients: a neglected problem? J Adv Nurs. 1997 Nov; 26(5): 8515.
Emery JL. Two cases of lentil pneumonitis. Proc R Soc Med. 1960; 53: 9523.
Ergun GA, Kahrilas PJ, Logemann JA. Interpretation of pharyngeal manometric recordings: limitations and variability. Dis Esophagus. 1993; 6: 116.
Feinberg MJ, Ekberg O, Segall L, Tully J. Deglutition in elderly patients with dementia: findings of videofluorographic evaluation and impact on staging and management. Radiology. 1992 Jun; 183(3): 8114.
Feinberg MJ, Knebl J, Tully J. Prandial aspiration and pneumonia in an elderly population followed over 3 years. Dysphagia. 1996 Spring; 11(2): 1049.
Fleming SM. Index of dysphagia: a tool for identifying deglutition problems. Dysphagia. 1987; 1(4): 2068.
Fletcher RH, Fletcher SW, Wagner EH. Clinical epidemiology: the essentials. 2nd ed. Baltimore (MD): Williams and Wilkins; 1988. 246p.
Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med Decis Making. 1991 Apr-Jun; 11(2): 8894.
Fuh JL, Lee RC, Wang SJ, Lin CH, Wang PN, Chiang JH, Liu HC. Swallowing difficulty in Parkinson's disease. Clin Neurol Neurosurg. 1997 May; 99(2): 10612.
Gann PH. Prostate-specific antigen screening for prostate cancer: issues involving test validity. Endocr Related Cancer. 1996 Fall; 3(3): 17989.
Garb JL, Brown RB, Garb JR, Tuthill RW. Differences in etiology of pneumonias in nursing home and community patients. JAMA. 1978 Nov 10; 240(20): 216972.
Gardner Manzella, Inc. Cost [unpublished]. 1998. 1 p.
Garon BR, Engle M, Ormiston C. Silent aspiration: results of 1000 videofluoroscopic swallow evaluation. J Neurol Rehabil. 1996; 10(2): 1216.
Gilbert RJ, Daftary S, Woo P, Seltzer S, Shapshay SM, Weisskoff RM. Echo-planar magnetic resonance imaging of deglutitive vocal fold closure: normal and pathologic patterns of displacement. Laryngoscope. 1996 May; 106(5 Pt 1): 56872.
Gillum RF, Ingram DD. Relation between residence in the southeast region of the United States and stroke incidence. The NHANES I Epidemiologic Followup Study. Am J Epidemiol. 1996 Oct 1; 144(7): 66573.
Golbe LI, Davis PH, Schoenberg BS, Duvoisin RC. Prevalence and natural history of progressive supranuclear palsy. Neurology. 1988 Jul; 38(7): 10314.
Golden A, Beber C, Weber R, Kumar V, Musson N, Silverman M. Long-term survival of elderly nursing home residents after percutaneous endoscopic gastrostomy for nutritional support. Nurs Home Med. 1997 Oct; 5(11): 3829.
Gordon C, Hewer RL, Wade DT. Dysphagia in acute stroke. Br Med J (Clin Res Ed). 1987 Aug 15; 295(6595): 4114.
Gottlieb D, Kipnis M, Sister E, Vardi Y, Brill S. Validation of the 50-ml 3 drinking test for evaluation of post-stroke dysphagia. Disabil Rehabil. 1996 Oct; 18(10): 52932.
Grant JP. Percutaneous endoscopic gastrostomy. Initial placement by single endoscopic technique and long-term follow-up. Ann Surg. 1993 Feb; 217(2): 16874.
Gray C, Sivaloganathan S, Simpkins KC. Aspiration of high-density barium contrast medium causing acute pulmonary inflammation-report of two fatal cases in elderly women with disordered swallowing. Clin Radiol. 1989 Jul; 40(4): 397400.
Greenfield LJ, Singleton RP, McCaffree DR, Coalson JJ. Pulmonary effects of experimental graded aspiration of hydrochloric acid. Ann Surg. 1969 Jul; 170(1): 7486.
Gresham SL. Clinical assessment and management of swallowing difficulties after stroke. Med J Aust. 1990 Oct 1; 153(7): 3979.
Groher ME. Bolus management and aspiration pneumonia in patients with pseudobulbar dysphagia. Dysphagia. 1987; 1: 2156.
Groher ME, Bukatman R. The prevalence of swallowing disorders in two teaching hospitals. Dysphagia. 1986; 1: 36.
Gustafsson B. The experiential meaning of eating, handicap, adaptedness, and confirmation in living with esophageal dysphagia. Dysphagia. 1995 Spring; 10(2): 6885.
Gustafsson B, Tibbling L. Dysphagia, an unrecognized handicap. Dysphagia. 1991; 6(4): 1939.
Haerer AF, Smith RR. Medical and surgical experiences in patients of a large southern stroke center. South Med J. 1974 Jun; 67(6): 66771.
Hageman C. Feasibility of videofluoroscopy and piezoelectric technique: utilization for dysphagia in nursing homes [unpublished]. 11 p.
Hamdy S, Aziz Q, Rothwell JC, Crone R, Hughes D, Tallis RC, Thompson DG. Explaining oropharyngeal dysphagia after unilateral hemispheric stroke. Lancet. 1997 Sep 6; 350(9079): 68692.
Hamlet S, Muz J, Farris R, Kumpuris T, Jones L. Scintigraphic quantification of pharyngeal retention following deglutition. Dysphagia. 1992; 7(1): 126.
Hanson P, Lawson G, Remacle M. Electromyography for swallowing disorders in the elderly. Eur J Phys Med Rehabil. 1995; 5(3): 769.
Hardy E, Robinson NM. Swallowing disorders treatment manual. Bisbee (AZ): Imaginart; 1993. Chapter 2, The normal swallow.
Harkness GA, Bentley DW, Roghmann KJ. Risk factors for nosocomial pneumonia in the elderly. Am J Med. 1990 Oct; 89(4): 45763.
Hartelius L, Svensson P. Speech and swallowing symptoms associated with Parkinson's disease and multiple sclerosis: a survey. Folia Phoniatr Logop. 1994; 46(1): 917.
Health Care Financing Administration, Bureau of Data Management and Strategy. National physician fee schedule relative value file. Calendar year 1998. Rockville (MD): U.S. Department of Health and Human Services, Public Health Service, AHCPR; 1997 Dec 22. 17p. (Public Use Files;).
Hedges LV, Olkin I. Statistical methods for meta-analysis. Academic Press, Inc., 1985. 369 p.
Helmick CG, Wrigley JM, Zack MM, Bigler WJ, Lehman JL, Janssen RS, Hartwig EC, Witte JJ. Multiple sclerosis in Key West, Florida. Am J Epidemiol. 1989 Nov; 130(5): 93549.
Henderson CT. Safe and effective tube feeding of bedridden elderly. Geriatrics. 1991 Aug; 46(8): 568, 63-6.
Hendrie HC, Osuntokun BO, Hall KS, Ogunniyi AO, Hui SL, Unverzagt FW, Gureje O, Rodenberg CA, Baiyewu O, Musick BS. Prevalence of Alzheimer's disease and dementia in two communities: Nigerian Africans and African Americans. Am J Psychiatry. 1995 Oct; 152(10): 148592.
Hing E, Sekscenski E, Strahan G. The National Nursing Home Survey; 1985 summary for the United States. National Center for Health Statistics. Washington D.C.: Government Printing Office; 1989 Jan. 249p. (Vital and health statistics; vol. 13, no. 97).
Holas MA, DePippo KL, Reding MJ. Aspiration and relative risk of medical complications following stroke. Arch Neurol. 1994 Oct; 51(10): 10513.
Hopkins RS, Indian RW, Pinnow E, Conomy J. Multiple sclerosis in Galion, Ohio: prevalence and results of a case-control study. Neuroepidemiology. 1991; 10(4): 1929.
Horner J, Alberts MJ, Dawson DV, Cook GM. Swallowing in Alzheimer's disease. 1994 Fall. pp. 177–89.
Horner J, Buoyer FG, Alberts MJ, Helms MJ. Dysphagia following brain-stem stroke. Clinical correlates and outcome. Arch Neurol. 1991 Nov; 48(11): 11703.
Horner J, Massey EW. Silent aspiration following stroke. Neurology. 1988 Feb; 38(2): 3179.
Horner J, Massey EW, Brazer SR. Aspiration in bilateral stroke patients. Neurology. 1990 Nov; 40(11): 16868.
Horner J, Massey EW, Riski JE, Lathrop DL, Chase KN. Aspiration following stroke: clinical correlates and outcome. Neurology. 1988 Sep; 38(9): 135962.
Houston MS, Silverstein MD, Suman VJ. Community-acquired lower respiratory tract infection in the elderly: a community-based study of incidence and outcome. J Am Board Fam Pract.