NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Beth Smith ME, Nelson HD, Haney E, et al. Diagnosis and Treatment of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome. Rockville (MD): Agency for Healthcare Research and Quality (US); 2014 Dec. (Evidence Reports/Technology Assessments, No. 219.)

  • This publication is provided for historical reference only and the information may be out of date.



Diagnosis and Treatment of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome.



Key Findings

Thirty-six studies contributed to our understanding of diagnostic methods, diagnostic accuracy or concordance, and benefits or harms associated with a diagnosis of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS). Multiple case definitions have been used to define ME/CFS, and those that are labeled as myalgic encephalomyelitis (ME) and require the presence of post-exertional malaise (PEM) and other neurological and autonomic manifestations appear to represent a smaller but more impaired population. Validating new diagnostic tests is challenged by the lack of a ‘gold standard’ or universally accepted case definition. A self-reported symptom scale, the artificial neural network test, was found to have good sensitivity (95%), specificity (85%), and accuracy (90%) for identifying patients with ME/CFS compared with healthy controls. Another, the Schedule of Fatigue and Anergia for Chronic Fatigue Syndrome (CFS) scale, and certain 36-item Short Form Survey (SF-36) subscales or combinations of subscales show moderate ability to discriminate between patients with ME/CFS and those without the condition. However, none have been adequately tested in a large population to determine validity and generalizability. Other tests, including serum parameters and cardiopulmonary function and recovery, have been insufficiently tested in broad populations to determine utility. We found little evidence on how diagnostic tests for ME/CFS vary by subgroups of the population and few studies that evaluated strategies for approaching the diagnostic workup to rule out other conditions prior to making an ME/CFS diagnosis. Evidence suggests that having an ME/CFS diagnosis is associated with perceived stigma, financial instability, difficulty in social interactions and relationships, and a greater risk of receiving a psychiatric diagnosis.
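As a rough consistency check on the sensitivity, specificity, and accuracy figures reported above, the following sketch computes all three from a 2×2 table. The counts are hypothetical (100 cases and 100 healthy controls, chosen only so that the reported 95% and 85% are reproduced); they are not taken from the underlying study. Under that assumption of equal group sizes, accuracy is the simple average of sensitivity and specificity, which agrees with the reported 90%.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from 2x2 table counts."""
    sensitivity = tp / (tp + fn)               # cases correctly identified
    specificity = tn / (tn + fp)               # controls correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # all correct classifications
    return sensitivity, specificity, accuracy


# Hypothetical counts (an assumption for illustration only):
# 100 patients with ME/CFS and 100 healthy controls.
sens, spec, acc = diagnostic_metrics(tp=95, fn=5, tn=85, fp=15)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

Unlike sensitivity and specificity, accuracy depends on the mix of cases and controls in the sample, so the same test could show a different accuracy in a population with a different proportion of affected patients.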

Thirty-five trials contributed to our understanding of the efficacy of interventions to treat ME/CFS. Although most of the medication trials targeted an underlying pathophysiological dysfunction, most of the other treatments targeted associated symptoms of the disease. Trials of the immune modulator, rintatolimod, found improvement in exercise performance and suggested potential improvement in symptoms, including activities of daily living, and reduced use of other medications for relief of ME/CFS symptoms. A trial of the antiviral, valganciclovir, suggested improvement in fatigue, but further studies are required to determine if this is replicable. Different complementary and alternative (CAM) therapies have been studied only in small pilot trials with methodological limitations, and although homeopathy, pollen extracts, and carnitine preparations found improvement on some measures from baseline, methodological limitations and inconsistency in results across different measurement tools preclude any determination of potential effectiveness. Harms of CAM therapies have been poorly reported. Counseling, behavioral therapies, and graded exercise therapy (GET) were found to be beneficial compared with control groups for outcomes of fatigue, function, and clinical global impression of change. Counseling techniques were also beneficial for outcomes of quality of life and employment. The magnitude of benefit is likely similar between cognitive behavioral therapy (CBT) and GET; however, the studies selected for less disabled patients as they did not use a case definition of ME. Only one study of CBT and GET performed a subgroup analysis on patients meeting the London (Dowsett, 1994) criteria and may have been underpowered to detect a difference. Furthermore, benefit was lost in a sub-analysis that only considered studies using formal CBT approaches. 
The ultimate goal is recovery, and the lack of consistent and meaningful outcome thresholds for measuring recovery limits any interpretation of the results from the few trials that considered this outcome. Results of four small trials suggest that younger, less disabled patients who focus less on their symptoms and avoid overexertion or underexertion seem to do better. No differences were found for all other interventions and outcomes, because outcomes were not reported, study quality was poor, and/or sample size was inadequate to provide a useful estimate. Although harms were not well reported across trials, GET was associated with a higher number of reported harms in several trials, higher withdrawal rates in one trial, and refusal of repeat exercise testing in another.

The key findings for this review are summarized in the summary of evidence table (Table 8, below) and the factors used to determine the overall strength of evidence grades are summarized in Appendix K.

Table 8. Summary of evidence.


Strength of Evidence

Our assessment of the strength of the evidence for major clinical outcomes is summarized in the strength of evidence table (Appendix K). We did not summarize the strength of evidence on diagnostic methods (Key Question 1) because the methods for doing so are not yet sufficiently developed to account for the variety of study designs, the uncertainty around determination of precision for estimates of test performance, the lack of consensus about the case definition for identifying a consistent study population, and the absence of a reference standard (“gold standard”).38 For intervention trials, major clinical outcomes are those explicitly stated in Key Question 2. The National Institutes of Health (NIH) Working Group and Technical Expert Panel members identified these as important outcomes because they are most relevant to patients, clinicians, and policymakers. Outcomes of benefit included in the strength of evidence table are overall function, fatigue, quality of life, days spent at work/school, proportion working full- or part-time, and clinical global impression of change. Harms outcomes included in the strength of evidence table are withdrawals due to harms, rates of harms, total withdrawals, serious harms, and total harms.

The strength of evidence table includes the five required domains: study limitations, directness, consistency, precision, and reporting bias (these terms are defined in Appendix F).31 The table summarizes the strength of evidence. Whenever possible, a quantitative estimation of the effect size was provided. When a quantitative estimate was not possible due to the heterogeneity in measuring outcomes and the small number of studies per intervention, a symbolic representation of effect was included, with + representing benefit, <> representing no difference, and – representing a negative effect.

We qualitatively rated the overall strength of evidence as high, moderate, low, or insufficient for each outcome. Strength of evidence is high for outcomes with a low level of study limitations, consistency in results, and adequate precision (certainty surrounding the result). The strength of evidence was downgraded to moderate for outcomes with a medium level of study limitations, imprecise estimates, and inconsistency between trials. Strength of evidence was ranked low if multiple deficiencies existed. Strength of evidence was moderate for GET compared with usual care, support, relaxation or adaptive pacing for outcomes of function, and global improvement, and for CBT for global improvement. Strength of evidence was low for CBT on measures of fatigue, function, quality of life, and employment; for GET on measures of fatigue and work impairment; and for rintatolimod on measures of function. There is low strength of evidence that CBT is not associated with an increase in harms. For all other interventions and outcomes, strength of evidence was insufficient because these outcomes were not reported, the study quality was poor, and/or the sample size was inadequate to provide a useful estimate.

Findings in Relationship to What Is Already Known

The lack of a clear etiology for ME/CFS, the multisystem involvement of the syndrome, and its overlap with other chronic conditions all contribute to the difficulty in diagnosing ME/CFS. Furthermore, there exists the risk of misdiagnosing a patient with an overlapping condition or incorrectly labeling a patient with ME/CFS. ME/CFS is a condition that does not have a universally accepted diagnostic (gold) standard, a set of criteria that defines the condition. The lack of a gold standard poses significant challenges for evaluation of diagnostic tests, and yet this is a situation that arises commonly with conditions that are syndromes. A syndrome is a “combination of symptoms and signs which have been observed to occur together so frequently and to be so distinctive that they constitute a recognizable clinical picture.”130 That is, the combination of findings is so unusual as to be thought not a coincidence. In such situations, the traditional evaluation of a diagnostic test is more challenging. The ME/CFS literature is beginning to test diagnostic strategies but as yet has not presented data that would sufficiently differentiate the diagnosis of ME/CFS from other similar conditions in a population of patients with substantial diagnostic uncertainty.

One of the primary limitations in the literature about diagnostic tests for ME/CFS was that very few studies included a validation cohort. Instead, these studies primarily evaluated a diagnostic test in a single initial population (a derivation cohort). Derivation studies are a necessary first step when attempting to achieve a valid diagnostic test, but they also have inherent methodological problems. They often involve the use of cases and controls, two very distinct populations, in order to determine whether the test can distinguish between those two groups. If the test is capable of distinguishing between two distinct groups, then further testing should use populations that are more closely related (i.e., they have overlap in terms of symptoms), in order to more rigorously test the diagnostic capability of a particular test. The more rigorous diagnostic testing studies will include a population for whom the clinician is likely to face diagnostic uncertainty, and then test how well the test performs in classifying that population accurately. The studies identified for evaluation of diagnostic tests for ME/CFS fell into three main categories. The first group comprises studies that evaluated how the published case definitions compare with each other, and whether they identify the same or different populations. While this was not a distinct Key Question, it was felt to shed light on the evolving definition of ME/CFS and the difficulty with identifying a universally acceptable reference standard. A second group of studies evaluated a diagnostic test or a scale against a chosen reference standard. In this case, the reference standard was typically one or more of several case definitions that have been published (CDC Holmes, 1988 or Fukuda, 1994, Canadian ME/CFS definition, International Consensus Criteria for ME, etc.). The third group comprises studies that address harms of diagnosis.

There were no studies that quantitatively compared the diagnostic concordance of two case definitions. Several studies attempted to demonstrate that ME, ME/CFS, and CFS case definitions identify clinically different groups of people. Studies did this by identifying people who met one set of criteria but not the other.5,9,50–52 Using this approach, it appears that the ME and ME/CFS case definitions select a population with more impairment, lower functioning, and higher symptom reporting compared with CFS alone. Other studies compared subjects who met a definition of CFS with subjects who had other disease states and/or those who comprised a healthy control population.57–59 As expected, these studies demonstrated CFS subjects have lower functioning and higher symptom burdens than people from the general population.

Using a slightly different approach, a prior systematic review compared case definitions for ME/CFS to summarize how the prevalence of ME/CFS in a population and the symptom burden for patients vary when using different case definitions.131 That study attempted to bring some consistency to case definitions for ME/CFS in the absence of a reference standard. The inclusion criteria were broader than those for this report but similarly found that the validation studies were weak and heterogeneous. This group called for the community of ME/CFS researchers to prioritize research on treatments using existing case definitions, rather than development of additional new case definitions.131 They felt the CDC (Fukuda, 1994) criteria had the most studies on validation and comparison with other measures and was the most appropriate for clinical practice.

Notably, many of the intervention studies used the Oxford (Sharpe, 1991) case definition for inclusion, yet it has been criticized as being so nonspecific that it is unable to differentiate a patient with ME/CFS from a patient with an overlapping condition. The Oxford (Sharpe, 1991) criteria have been shown to include more patients than either the CDC (Fukuda, 1994) or the London ME (Dowsett, 1994) criteria. In the PACE trial, only 30 percent of patients enrolled using the Oxford (Sharpe, 1991) case definition also met the London (Dowsett, 1994) case definition for ME.121 Indeed, when comparing criteria across different case definitions, the symptom set of the Oxford (Sharpe, 1991) case definition is more generalized and as such is at greater risk of including patients with other overlapping conditions. Based on feedback from public comments to the draft of this review, patients and advocacy groups prefer the Canadian or International case definitions and have argued strongly against using a case definition that does not require the presence of PEM. (An Open Letter was received during the public comment period for this review from 53 advocates and experts.)

Much research in this field focuses on discovering etiologies rather than testing diagnostic strategies in patients. Studies that attempted to define an etiology on the basis of a biochemical marker or a particular physiologic test were not included in this review; the intent of these studies was to identify an etiology rather than understand how the specific test could distinguish patients that would respond to treatment. In addition to biomarker studies (cell function, immunologic, virologic/bacteriologic, hormonal, etc.), studies identified subgroups on the basis of exercise testing,132,133 cerebral blood flow as measured by arterial spin labeling,134 gait kinetics,135 impaired blood pressure variability/hemodynamic instability,136,137 bioenergetics (capacity to recover from acidosis),138 and many others. These studies did not report diagnostic testing outcomes, such as receiver operating characteristic (ROC) curve/area under the curve (AUC), sensitivity, specificity, or concordance, and were therefore not useful in evaluating diagnostic testing for this report. The studies on serum biomarkers and cardiopulmonary function/recovery that did meet the inclusion criteria were not adequately tested in a broad spectrum of patients to determine utility for distinguishing patients with ME/CFS from other patients with chronic and disabling conditions.

In research studies, patients with ME/CFS reported feeling stigmatized by their diagnosis in terms of financial stability, work opportunities, perceived judgments on their character, social isolation, and interactions with health care providers. Compounding these difficulties is the substantial burden of misdiagnosis among this patient population. Two studies objectively identified prejudice and stereotypes towards patients with ME/CFS from members of the medical community; medical trainees and mental health practitioners make judgments about a patient’s condition based on the name it carries (ME, CFS, or other) and which treatment is being given. While these studies were descriptive and based on survey data, the results suggest valid concerns about the harm of labeling patients with a diagnosis of ME/CFS. These harms may reflect the chronic and disabling nature of this disease, combined with a lack of understanding about the diagnosis among the medical community and uncertainty about the etiology of ME/CFS. One commentary suggested that the harm is associated with the implications of a label rather than the label itself, and that it is “acceptable and often beneficial to make diagnoses such as CFS, provided that this is the beginning and not the end, of the therapeutic encounter.”139

Determining the efficacy of medication and CAM interventions to treat ME/CFS was limited because most were only evaluated in single studies at one center and had significant methodological limitations, including small sample sizes with some enrolling fewer than 20 subjects in one arm. Additionally, outcomes were assessed using different methods and different scales. Some medication trials were primarily intended to measure intermediate outcomes, such as natural killer cell-mediated cytotoxicity,89 and most were underpowered for the health outcomes relevant to this systematic review. While several fatigue and function outcomes were based on validated scales and measures, others were not, and the clinical significance of changes in scores over time is not clear.

Although placebo-controlled trials of immune modulating and antiviral medications suggested potential improvement in fatigue and functioning, some findings were of borderline statistical significance and other outcomes did not differ between groups. The rationale for treating patients with medications that have antiviral or immunomodulatory properties is based on the association of ME/CFS with viruses and immunological abnormalities that may underlie or promote its pathogenesis.18,140–142 Although small trials of acyclovir,91 immunoglobulin G,85,143 and isoprinosine89 indicated no statistically significant differences between treatment and placebo groups for measures of fatigue, quality of life, or function, two trials of intravenous rintatolimod87,88 and a trial of oral valganciclovir86 suggested improvement. These trials differed from the earlier trials by using newer medications and applying selective inclusion criteria for participants that targeted patient subgroups based on clinical history of a likely viral onset of ME/CFS and high antibody titers86 or severe disability.87,88 However, most of these trials were meant as pilot studies to determine potential benefit and as a foundation for larger trials of longer duration. The results were not definitive and were limited by inconsistencies in methods and findings, small sample sizes, methodological shortcomings, and lack of long-term followup. Trials of galantamine, hydrocortisone, and immunoglobulin G indicated no significant improvement compared with placebo. Harms related to medications that were statistically significantly higher for the treatment versus placebo groups included suppression of adrenal glucocorticoid responsiveness, increased appetite, weight gain, and difficulty sleeping with hydrocortisone; flu-like syndrome, chills, vasodilatation, dyspnea, and dry skin with rintatolimod; and headaches with immunoglobulin G.

Consistent with other systematic reviews, both CBT and GET were found to improve symptoms, primarily based on fatigue and function outcomes, whereas evidence on other nonpharmacological interventions was inconclusive.144–147 Results need to be interpreted with caution given that studies often used multiple methods of evaluating outcomes and several had mixed results on the same outcome when comparing different tools. No study included patients based on a case definition for ME and only one included homebound patients. One study performed a subgroup analysis of those meeting the London ME (Dowsett, 1994) case definition but may have been too small to detect a difference even if a difference existed.121 Recovery as an outcome was reported in few trials, and the variability in definitions and thresholds leaves the results meaningless for comparison. In the PACE trial, the criterion for inclusion was an SF-36 physical functioning score of 65 or less (revised protocol), yet the threshold for recovery was a score of 60 or more together with a Chalder fatigue score of less than 18, whereas normal is considered less than 4. An ideal definition of recovery would mean a return to baseline function, which would be unique to each individual. Since this would be a difficult measure for research purposes, refining an acceptable definition with meaningful values is needed. Another critique of this literature is that some investigators teach patients that the disease is psychologically based and caused by misperceptions and volitional deconditioning. When patients are then educated and trained to believe that they can overcome their disease by changing attitudes, they would expect to do better and consequently report improvement on self-reported surveys.
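The overlap between the PACE entry criterion and recovery threshold for SF-36 physical functioning, described above, can be made concrete with a short sketch. It uses only the two thresholds stated in the text; the assumption that subscale scores run from 0 to 100 in steps of 5 is ours and is labeled as such in the code.

```python
ENTRY_MAX = 65     # inclusion: SF-36 physical functioning score of 65 or less
RECOVERY_MIN = 60  # recovery threshold: score of 60 or more

# Assumption for illustration: scores lie on a 0-100 scale in steps of 5.
possible_scores = range(0, 101, 5)

# Scores that satisfy the entry criterion and the recovery threshold at once.
overlap = [s for s in possible_scores if RECOVERY_MIN <= s <= ENTRY_MAX]
print(overlap)
```

Under these assumptions, a participant scoring 60 or 65 at enrollment would already meet the physical functioning component of the recovery threshold, which illustrates why such thresholds are difficult to interpret.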

When considering responders compared with nonresponders to treatments, one study comparing GET with usual care found that a reduction in symptom focusing was associated with improvement in self-reported measures of function, fatigue, and global change.127 In a different fair-quality study using a cluster analysis to identify coping strategies for ME/CFS patients, the investigators determined standardized discriminant function and structure coefficients for three clusters.148 One function separated the clusters and was significant (F=3.31, p=0.01) and accounted for 10 percent of the variance between groups (Rc=0.32). Adaptive coping accounted for 56 percent of the variance explained by the function (Rs=0.75) and less adaptive coping accounted for 25 percent (Rs=0.50). These strategies have obvious merit in general but also raise the question of whether reported improvements translate into meaningful change (i.e., returning to work, maintaining a household, meeting the demands of parenting). This question remains unanswered in the current literature. Additionally, although some of the studies attempted to measure adherence, inherent inaccuracies exist with self-reporting, particularly when it applies to home exercise programs. The one trial that considered homework compliance found that degree of improvement paralleled degree of homework compliance; however, only the cognitive therapy group had 75 percent or greater compliance and GET was not evaluated. It remains uncertain whether improved adherence, particularly with GET, is associated with greater benefit and meaningful change or greater harm.
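The percentages quoted above for the cluster analysis appear to follow from squaring the reported coefficients. The sketch below is only an arithmetic check under that assumption; the labels are ours, and nothing here reproduces the original analysis.

```python
def percent_variance(coefficient):
    """Square a correlation-type coefficient and express it as a percentage."""
    return 100 * coefficient ** 2


# Coefficients as reported in the text; the labels are illustrative only.
reported = {
    "Rc (function separating the clusters)": 0.32,
    "Rs (adaptive coping)": 0.75,
    "Rs (less adaptive coping)": 0.50,
}

for label, value in reported.items():
    print(f"{label}: {value} squared is {percent_variance(value):.2f} percent")
```

Rounded to whole numbers, the squared values are about 10, 56, and 25 percent, matching the figures given in the text.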

Harms were not well reported throughout all of the nonpharmacological and CAM interventions. When reported, the harms associated with exercise included total adverse events, serious adverse events, nonserious adverse events, harms attributable to treatment, or withdrawal due to harms, but the specific harms were not delineated.90,108,121,127 In the combination trials, the greatest number of adverse events was reported in the GET arm of one trial,121 the lowest adherence was in the exercise arm in another trial,108 and one trial had the greatest withdrawal in the exercise arm.90,125,127 That a significant number of patients refused to repeat physiological testing implies significant harm in at least some of the patients.127 Although not scientific, a survey sponsored by the ME Association found that patients believed that GET made more people worse compared with other treatments.149,150 One study comparing CBT with cognitive therapy, anaerobic exercise, or relaxation found that those patients who remained within their energy envelope (avoided overexertion and underexertion by exerting a comfortable range of energy) had a significant improvement in mean fatigue and functioning scores regardless of treatment arm.106 This line of therapy needs to be further studied in varied settings to determine its utility over time and whether these interventions can widen one’s energy envelope and reduce harm.

A serious gap in the body of the evidence is the lack of subgroup analysis based on factors or symptom sets such as clinical features at baseline (extent of PEM, autonomic dysfunction, neurocognitive impairment, etc.), severity of disease, duration of disease, and patient demographics. In the current literature, ME definitions were not used for inclusion into any treatment trials and subgroup analysis was rarely performed. Effectiveness and/or harms may differ between patient subgroups, and given the small sample size of most of the trials, combining all patients may have lessened the effect size. A recent systematic review that compared different case definitions agreed that patients should be classified according to their severity and symptom patterns in order to optimally guide therapy and predict prognosis.131


The applicability of our findings to real-world clinical settings is supported by several features of the body of literature we reviewed. First, we included all recognized case definitions of ME/CFS in order to allow a broad representation of patients. Studies were conducted primarily in the United States or Western Europe and the patient population was predominantly female, which is consistent with clinical practice. Duration of symptoms, while not consistently reported, was broadly represented across studies. The interventions and comparators represented most of the therapeutic modalities commonly used in clinical practice.

However, there are several features of this body of evidence that limit its generalizability to the broader population of patients with ME/CFS, including factors surrounding the diagnosis itself. Given that the condition is a syndrome with a constellation of symptoms and lacking a gold standard for diagnostic comparison, diagnosis is at inherent risk of bias by the opinion of experts. Additionally, numerous comments on the draft report of this review emphasized that PEM is the critical feature of ME/CFS, yet most diagnostic studies used CDC CFS case definitions as reference standards (Holmes, 1988, Fukuda, 1994, or Reeves, 2005), which do not require the presence of PEM; no intervention trial used an ME case definition. Many of the diagnostic studies were conducted in a referral based environment and lacked a broad-based spectrum of patients, some with and some without the disease. Patients from specialty clinics may also represent more severe forms of the condition. Additionally, patients from rural centers or who lack insurance or financial resources may not have access to specialty clinics or clinical trials. Patients in research studies tended to be white middle-aged women, and it is unknown if the results in this population are generalizable to other demographic populations. The largest trial, PACE, excluded patients who could not read or speak English and only 7 percent of the study participants were from ethnic minority populations.121 Few trials enrolled homebound patients, with most trials requiring patients to be well enough to attend multiple sessions of treatment.

We elected to include trials using any predefined case definition but recognize that some of the earlier criteria, in particular the Oxford (Sharpe, 1991) criteria, could include patients with 6 months of unexplained fatigue with physical and mental impairment but no other specific features of ME/CFS. Applying such criteria has the potential of inappropriately including patients who would not otherwise be diagnosed with ME/CFS and may provide misleading results. Most of the intervention trials used the Oxford (Sharpe, 1991) or CDC (Fukuda, 1994) case definitions for inclusion and the results may not be applicable to patients meeting case definitions for ME.

In clinical practice, treatment of ME/CFS often involves multiple concurrent therapies but we found few trials that compared one intervention with another or that compared a combination of concurrent therapies with another. We also found few trials that selected patients based on symptom patterning. The trial on valganciclovir, an antiviral medication, preselected patients with an inciting febrile event with lymphadenopathy and found improvement in fatigue in this population of ME/CFS, while the trials on immune modulators, which included patients who were severely disabled, found some improvement in exercise capacity. Both counseling techniques and GET showed improvement in most outcomes but studies to date have focused on efficacy rather than effectiveness. The combination of CBT and GET has not been adequately studied (one trial) to determine if this is more effective than a single intervention or if some patients may do better with this combination. It remains uncertain whether these results apply to all patients with ME/CFS or if there are patient subgroups that might receive greater benefit or experience greater harm, particularly in the GET trials, due to the lack of subgroup analysis.

Limitations of the Evidence Base

The main limitation of the evidence base in this review was poor study quality. Most trials did not specify randomization method, did not conceal allocation, and did not mask outcome assessment. Most studies were small and many were underpowered to detect significant differences. Studies were also highly variable in terms of methods used to measure outcomes, limiting our ability to combine or compare results across studies.

A potential limitation of this review is that important studies whose findings might influence clinical and policy decisionmaking may not have been identified. A comprehensive, broadly inclusive search was conducted that produced 6,175 study titles and abstracts. Although non-English language studies and studies published before 1988 were excluded, it is unlikely that important studies of therapies used in current practice were missed; the general consistency of the findings with other systematic reviews provides some assurance that this review was not biased by the selection criteria. This review focused on diagnostic methods that provided data on a test’s utility in identifying patients with ME/CFS (receiver operating characteristic [ROC] curve/area under the curve [AUC], sensitivity, specificity, concordance). Other testing strategies were not reviewed and may provide further insight into methods of identifying patients with ME/CFS.

To evaluate the benefits and harms of treatments, studies with durations of 12 weeks or longer were included because of the fluctuating nature of ME/CFS. This approach may have excluded studies of antiviral or other types of medications that are traditionally prescribed for shorter durations. To account for this, excluded studies were searched for medication trials that were appropriately given for a shorter duration, identifying two trials.91,92 Although intravenous rituximab was superior to placebo on SF-36 physical health and function scores and intravenous acyclovir was similar to placebo on fatigue and wellness scores, these results represent insufficient evidence. Neither study changed the overall conclusions of this report.

Outcome measurements for this report included overall improvement, fatigue, function, quality of life, and employment, which represent patient-centered functional health outcomes. Some interventions may have provided benefit for other symptoms of ME/CFS, and this review would not have identified these outcomes.

There may have been biased reporting of results in the literature, such that only selected studies were published and retrievable, and published studies may have been affected by conflicts of interest, outcome reporting bias, or analysis reporting bias. Reporting bias and conflicts of interest are concerns with any systematic review. Quantitative analyses to evaluate the possibility of publication bias were not conducted because of the heterogeneity across studies in this review; in many cases, the lack of key information needed to perform quantitative syntheses also precluded meaningful comparison of effect sizes. Weighing against the likelihood of publication bias, however, is the fact that the majority of included studies were small (most <100 patients, many <50) and most reported no significant effect of the intervention. Publication bias typically results in selective publication of larger studies and/or those with positive findings, and studies biased by conflicts of interest would also be more likely to report positive findings. A search of gray literature was conducted to look for unpublished data, and no evidence of unreported studies was found. The limited and vague reporting of harms in many studies may suggest outcome reporting bias for these outcomes.

Future Research and Implications for the Pathways to Prevention Workshop

What Are the Future Research Needs for Definition, Diagnosis, and Treatment of ME/CFS?

Given the prevalence and health impacts of ME/CFS, future research is necessary in several areas:

  • Case definitions: Consensus about which case definition is appropriate to use as the gold standard will further advance the study of diagnostic methods for ME/CFS. In the absence of consensus, future studies aimed at clarifying the diagnosis of ME/CFS should consider reporting how well a particular diagnostic test compares with more than one of the published case definitions. The lack of a definitive diagnostic test should not discourage the support of intervention and treatment studies. Ideally future intervention studies would consistently use an agreed upon single case definition to reduce variability in the patient samples and facilitate comparison of therapeutic benefit across studies. If a single definition cannot be agreed upon, future research should retire the use of the Oxford (Sharpe, 1991) case definition, given that it is at high risk of including patients who may have an alternate fatiguing illness, or whose illness resolves spontaneously with time.
  • Diagnostic instruments: Future studies evaluating the diagnostic capability of instruments for identifying ME/CFS should enroll a broad range of people with relevant conditions that require clinical distinction from ME/CFS, such as fibromyalgia. The ideal diagnostic test for ME/CFS would adequately distinguish between ME/CFS and these conditions. Additionally, studies should report statistics on how well a particular measure distinguishes a group with ME/CFS from a group that does not meet these criteria, using concordance and the net reclassification index. For physiological and metabolic testing, a broader spectrum of patients is needed as the comparison group, rather than healthy controls alone.
  • ME/CFS registry: A national longitudinal registry of patients with a diagnosis of ME/CFS would allow for comparison of diagnostic criteria between patients and clarification of diagnoses over time. This strategy could also identify a well-characterized population for use in both diagnostic and treatment trials.
  • Treatment inclusion criteria: Use of selective inclusion criteria, as was done in some of the antiviral and rintatolimod trials, may help to identify those with greater immunological versus neurological symptom sets, which may aid in furthering the understanding of etiology and diagnosis, as well as targeting treatment approaches. Consideration of the biomarker studies may aid in identifying these subsets of patients.
  • Treatment interventions: Reflective of the current clinical environment in which patients receive more than one treatment, intervention studies should be conducted at multiple sites, use multicomponent treatments, enroll larger samples sized by power calculations for key outcomes, and adhere more rigorously to methodological standards for clinical research. Given the fluctuating nature of the condition, followup periods greater than 1 year would be optimal to determine effectiveness over time.
  • Treatment analyses: Reporting of information about co-interventions, the timing of studied interventions in relation to other interventions, and adherence to interventions would improve the applicability of study findings. Similarly, stratification of findings by patient characteristics (e.g., baseline severity, comorbidities, demographics, symptom sets) would help determine the applicability of different interventions for specific patients and situations. It is particularly important for future studies to report findings according to the cardinal features of ME/CFS such as PEM, neurocognitive status, and autonomic function, as treatment choices may differ for subsets of the population.
  • Outcome evaluation: Given the plethora of outcome measures, the development of a set of core outcomes including patient-centered outcomes such as quality of life, employment, and time spent supine versus active, would help guide research and facilitate future data syntheses. In 2003 Reeves and colleagues recommended using an activity recorder to quantify activity, yet no study included in our review reported on this outcome. With today’s readily available personal activity trackers that can record activity as well as physiological responses, these outcomes should be easily obtained. Recovery needs to be better defined and should include functionally meaningful outcomes. Clearly reporting harms, particularly surrounding exercise therapy and testing and treatment for specific subgroups, may help identify patients more negatively affected by these interventions. Personal activity trackers could also be used to identify harms that result in reduced activity.
  • Other: Research is ongoing in diagnosing and treating specific symptoms such as PEM or orthostasis, and synthesizing this literature and evaluating its utility in diagnosing the syndrome of ME/CFS or subsets of the population is needed.6,151–157 Further studies are needed to determine the utility of 2-day cardiopulmonary exercise testing to identify or monitor symptoms of post-exertional malaise.
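The diagnostic instruments recommendation above names the net reclassification index (NRI). As a minimal sketch of how a categorical, two-category NRI could be computed when comparing a new test with an existing one, the Python function below sums the net proportion of reference-positive participants moved to a positive classification and the net proportion of reference-negative participants moved to a negative classification. The function name and the example data are hypothetical. The sketch also assumes that one case definition is treated as the reference standard, which is itself an open question for ME/CFS.

```python
def net_reclassification_index(old, new, reference):
    """Two-category NRI for a new test versus an old test.

    old, new:   0/1 classifications from each test, one per participant
    reference:  0/1 status under the chosen reference case definition
    """
    if not (len(old) == len(new) == len(reference)):
        raise ValueError("all inputs must have the same length")

    cases = [(o, n) for o, n, r in zip(old, new, reference) if r == 1]
    noncases = [(o, n) for o, n, r in zip(old, new, reference) if r == 0]
    if not cases or not noncases:
        raise ValueError("need at least one case and one non-case")

    # Among cases, moving from negative to positive is an improvement.
    up_cases = sum(n > o for o, n in cases) / len(cases)
    down_cases = sum(n < o for o, n in cases) / len(cases)

    # Among non-cases, moving from positive to negative is an improvement.
    up_noncases = sum(n > o for o, n in noncases) / len(noncases)
    down_noncases = sum(n < o for o, n in noncases) / len(noncases)

    return (up_cases - down_cases) + (down_noncases - up_noncases)


# Hypothetical data for eight participants (illustrative only).
reference = [1, 1, 1, 1, 0, 0, 0, 0]
old_test = [1, 0, 0, 1, 0, 1, 1, 0]
new_test = [1, 1, 1, 0, 0, 0, 1, 0]
print(net_reclassification_index(old_test, new_test, reference))
```

Repeating the calculation with more than one published case definition as the reference would show how sensitive the result is to that choice.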

The stories that were shared by patients and advocates in response to the draft report of this review iterated the devastating impact that this condition has had on patients and their loved ones. Although this review has focused on scientific literature, these messages have been heard and appreciated. It is recommended that future studies include the patient and/or advocate voice in the planning and development phases so that future research is relevant and meaningful to those affected by ME/CFS.


Multiple case definitions for ME/CFS exist. Those that require symptoms of PEM, neurological impairment, and autonomic dysfunction represent a more severe form of the condition. No current diagnostic tool or method has been adequately tested to identify patients when diagnostic uncertainty exists. Reports suggest stigmatization as a potential harm of receiving a diagnosis of ME/CFS; however, no studies specifically evaluated the potential positive aspects of getting a diagnosis, such as relief at having an explanation for the symptoms. Although counseling approaches and GET have shown benefit in some measures of fatigue, function, and global improvement, they have not been well studied in subgroups of the population. Most other interventions have insufficient evidence to direct clinical practice. Harms reporting has been poor, and although GET appears to be associated with worsening symptoms in some patients, the cause remains uncertain. Acceptance of a single case definition and development of a core outcomes set would aid future research efforts to study effectiveness of interventions. Use of selective inclusion criteria such as symptom subsets (e.g., neuroendocrine/immune, neurological/neurocognitive) is needed to better inform diagnosis and treatment of ME/CFS. In general, future research focused on correcting limitations of current evidence is important to move the science forward.

