NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Fordham B, Sugavanam T, Edwards K, et al. Cognitive–behavioural therapy for a variety of conditions: an overview of systematic reviews and panoramic meta-analysis. Southampton (UK): NIHR Journals Library; 2021 Feb. (Health Technology Assessment, No. 25.9.)

Chapter 7 Discussion

Principal findings and their meaning

Cognitive–behavioural therapy has been evaluated (with systematic reviews) in most conditions (68%, 27/40 of ICD-11 categories). These reviews have summarised the RCT evidence of whether or not CBT improved outcomes in these conditions. The review estimates were similar enough between the different conditions for us to generate a general (as opposed to a condition-specific) effect estimate. We found that CBT produced a modest general benefit to HRQoL, anxiety and pain outcomes. The evidence was consistent across all 22 out of 40 (55%) conditions (and comorbidities), populations and contexts that have been tested.

The estimates for depression outcomes between conditions were too different; therefore, we could not produce a pooled general effect estimate. Although there were many more reviews in the depression PMA, the reviews used fewer different outcome measurements than in the HRQoL, anxiety and pain PMAs. Therefore, it is unlikely that the high heterogeneity is due to the variation in the outcome measurements used. CBT has been shown to be very effective for people with clinical depression [531,532]. Our overview does not suggest that CBT is ineffective for symptoms of depression, only that its effectiveness in changing depression symptoms varied greatly across different conditions.

Cognitive–behavioural therapy was effective whether it was delivered in high- or low-intensity formats. This is not to imply that CBT can be delivered in high- or low-intensity formats interchangeably. The findings simply state that when low-intensity CBT has been tested in RCTs and synthesised into reviews, we found that it improved HRQoL, anxiety and pain outcomes. This adds strength to the argument that the mechanisms by which CBT is effective remain effective when delivered via high- or low-intensity formats.

Cognitive–behavioural therapy was effective in the short and long term. However, there was a paucity of reporting on the longer-term follow-ups (i.e. > 5 years post intervention); therefore, we have not captured the importance of relapse. We highlighted in the mapping exercise that there is a paucity of systematic review evidence regarding relapse prevention; this is an essential consideration to take into account when interpreting these findings.

When we pooled reviews that compared CBT with active interventions (e.g. pharmacotherapy, psychotherapy, exercise, education or relaxation), the effect estimates became very small. We found a significant interaction in the HRQoL analyses between those reviews that compared CBT with an active comparator and those that compared CBT with an inactive comparator. This could suggest that CBT and these other active interventions share mechanisms that improve HRQoL for patients.
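The subgroup interaction described above can be illustrated with a simple z-test on the difference between two pooled estimates. This is a minimal sketch of one common textbook approach, not necessarily the test used in this overview; the function name and all numbers are hypothetical.

```python
import math

def subgroup_interaction(est_a, se_a, est_b, se_b):
    """Two-sided z-test for a difference between two independent pooled estimates.

    Returns (z, p). Assumes both estimates are approximately normal and
    independent; this is an illustrative assumption, not the report's method.
    """
    diff = est_a - est_b
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Hypothetical pooled SMDs: CBT vs. active comparator, CBT vs. inactive comparator
z, p = subgroup_interaction(est_a=0.05, se_a=0.05, est_b=0.30, se_b=0.05)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value here would indicate that the two subgroup estimates differ by more than chance alone would readily explain, which is the sense in which an interaction is called significant.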

We assume that CBT will help children, adolescents and adults, but we are uncertain as to how much it will help older adults, as there is less available evidence for this age population. We feel confident that CBT will be equally effective for male and female participants. The evidence base over-represents people who live in Europe, North America and Australasia, and poorly reports the ethnicity of the samples in the reviews. Consequently, we do not know if the effects will translate across people of different ethnicities in Europe, North America or Australasia, or to people who live in Asia, Africa or South America.

Strengths

Even when an individual systematic review pools evidence across many trials, the total sample size can remain small. Small sample sizes mean that the effect estimate is less certain. One of the major strengths of this overview is that, by pooling data from many reviews across conditions, we become more certain of the effect estimates. Our HRQoL and anxiety outcome estimates each include > 4000 participants, a sample size that, according to guidance, indicates a certain effect [533].

To maintain the lowest risk of bias and the greatest design homogeneity, we conducted our primary analyses with the highest-quality (rated ‘moderate’ or ‘high’ on the AMSTAR-2 checklist) reviews. The most common criticism of all the reviews was that we, as readers, could not access a review protocol to check if the authors had performed what they had intended to perform and had not simply ‘cherry-picked’ results to present in the review publication. Another common problem was reviews not reporting the reasons why they excluded trials from their review. Without this information, we cannot check if there was any bias towards including some, but not other, trials. Our sensitivity analyses suggest that the higher-quality reviews report consistent findings (less heterogeneity), compared with poorer-quality reviews, but the quality of the reviews did not alter the effect estimates.

As the inclusion of the lower-quality reviews did not alter the effect estimates, we concluded that the general effect is consistent across conditions represented by higher- and lower-quality reviews. We also suggested that the effect could be generalised to comorbid conditions represented by these reviews. Consequently, the general effect can be generalised to over half (55%) of all conditions represented in the ICD-11.

Weaknesses

The main methodological weakness stemmed from our decision to remain at the review level, as opposed to including RCT-level extraction and analysis. The mapping exercise identified 494 reviews. Of these, 279 were not included in the PMA because we could not extract the purely CBT RCT evidence. For example, a review that synthesised 10 CBT RCTs with three non-RCTs would be excluded from the PMA unless the review had presented any of the purely RCT evidence in isolation (even if it was one single RCT, which we could include). Similarly, if a review included 30 CBT RCTs combined with six mindfulness-based cognitive therapy RCTs, then this would have been excluded unless the review also presented a separate CBT subgroup analysis. The only way we would have been able to include these RCTs would have been to return to the original RCTs, extract the data and perform a meta-analysis of those data for entry into the PMA. This was a difficult trade-off: the evidence base was so large that we did not have the resources to perform RCT-level extraction or analysis, but remaining at the review level came at the expense of many RCTs being excluded from the PMAs.

Another consequence of remaining at the review level was the limitation of the quality assessments. A review of high quality may include RCTs judged to have a high risk of bias. Without performing additional RCT-level assessment and a separate analysis of RCTs with low risks of bias, we could not restrict the data to the best-quality RCT-level data.

This overview does not examine the health economics of the CBT evidence base, which is an essential element of commissioning and is the context of evidence-based medicine. We could not perform this analysis because it was beyond the scope of this current overview.

Our method for classifying reviews was to represent each review in one ICD-11 code. We classified the review by the primary condition the CBT was being used to treat. For example, a review of CBT for depression in COPD patients was classified as a review of CBT for depression with comorbid COPD. This meant that we could not reflect the multimorbidity represented in these reviews. A total of 158 out of 494 reviews included a comorbid condition such as alcohol abuse or dementia. Our methodology means that we have under-represented the number of different conditions for which CBT has been used to improve HRQoL and reduce symptoms of depression, anxiety and pain.

We have mapped the systematic review data across each condition and by the following groups: ‘who’ (populations with different clinical severity), ‘what’ (the CBT intensity format) and ‘when’ (delivered at what time, i.e. preventatively, in response to clinical diagnosis or as a relapse prevention). We were restricted to reporting and analysing the review-level data. Reviews often combined RCTs conducted across multiple subgroups. We did not perform RCT-level exploration of the subgroups, which limits the accuracy of our findings.

Our indirect (intensity subgroup analyses) and direct (high- compared with low-intensity CBT reviews) evidence suggested no difference in effectiveness between using high-intensity and using low-intensity CBT. However, although reviews of high-intensity CBT produced broadly similar estimates of CBT’s effectiveness, the estimates from low-intensity reviews varied widely. The large variation in the low-intensity CBT estimates may be due to our definition of low-intensity CBT [1]. We combined face-to-face delivery of CBT by paraprofessionals with self-help delivery of CBT (e.g. internet CBT). Future subgroup analyses could test if these two methods of delivery moderate the effectiveness of CBT.

When a review did not report how CBT was delivered in the included trials (i.e. high- or low-intensity CBT), we assumed that it was delivered face to face by a specialist (high intensity). We made this assumption because high-intensity CBT was the original and most common delivery method. When we developed our data extraction methods, we checked the trials in reviews that did not specify the CBT intensity. We found that these trials had tested high-intensity CBT. However, we did not check the included trials for every review included in our overview; therefore, this assumption may have led to us over-representing high-intensity CBT.

We made another assumption, whereby, if a review did not specify the time when the follow-up data were collected, we presumed that it was short term (< 12 months post intervention). We made this assumption because the majority of trials employed short-term follow-ups. However, as before, this assumption may be incorrect for some reviews.

Although most reviews estimated their effects with a random-effects meta-analysis, a few used a fixed-effects approach. We made an assumption that the within-review variability had been appropriately allowed for. If this assumption was incorrect, then our results might underestimate the amount of variation within conditions.
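To illustrate why the choice of model matters, the sketch below contrasts inverse-variance fixed-effect pooling with a DerSimonian-Laird random-effects estimate. It is a generic textbook illustration under assumed inputs, not the analysis code used in this overview; the function names and example values are hypothetical.

```python
import math

def pool_fixed(effects, variances):
    """Inverse-variance fixed-effect pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

def dl_tau2(effects, variances):
    """DerSimonian-Laird estimate of between-study variance (needs >= 2 studies)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = total - sum(w * w for w in weights) / total
    df = len(effects) - 1
    return max(0.0, (q - df) / c)

def pool_random(effects, variances):
    """Random-effects pooling: add tau^2 to each within-study variance."""
    tau2 = dl_tau2(effects, variances)
    return pool_fixed(effects, [v + tau2 for v in variances])

# Hypothetical, deliberately heterogeneous trial effects (SMDs) and variances
effects = [0.1, 0.5, 0.9]
variances = [0.01, 0.01, 0.01]
print("fixed :", pool_fixed(effects, variances))
print("random:", pool_random(effects, variances))
```

With heterogeneous inputs such as these, the fixed-effect standard error is noticeably smaller than the random-effects one, which is how a fixed-effects review could understate the variation within a condition.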

To make a meaningful interpretation of our effect estimates, we transformed them into mean differences using the standard deviation of the target outcome measure. If the value of this standard deviation is not a good approximation of the true standard deviation for this outcome, we might be underestimating or overestimating the effect size for each outcome considered.
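The transformation described above amounts to multiplying a standardised mean difference by a reference standard deviation. The following minimal sketch shows how sensitive the result is to the assumed standard deviation; the function name and all values are hypothetical and are not taken from the report.

```python
def smd_to_mean_difference(smd, reference_sd):
    """Re-express a standardised mean difference in the units of one instrument."""
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return smd * reference_sd

# Hypothetical pooled SMD and a range of assumed standard deviations
pooled_smd = 0.30
for assumed_sd in (8.0, 10.0, 12.0):
    md = smd_to_mean_difference(pooled_smd, assumed_sd)
    print(f"assumed SD {assumed_sd:>4}: mean difference {md:.1f} points")
```

Because the conversion is linear, any proportional error in the assumed standard deviation carries over directly to the reported mean difference.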

We used techniques, such as using workbooks to record personal reflections, in the ECG meetings to ensure that each member could contribute equally. However, the debates often became polarised between academic discussions. Nevertheless, talking to PPI representatives more informally and during breaks generated rich feedback that helped the research group. On reflection, we should have included a formal forum at the end of each ECG meeting in which every member could summarise the day’s discussion and ask specific questions.

Implications

We have high-quality systematic review evidence, which demonstrates that, in comparison to no intervention, CBT improves HRQoL, anxiety and pain outcomes by a modest amount. This includes CBT that is delivered through low- and high-intensity formats. The benefit has been consistent in every condition, population and context in which it has been tested and synthesised into a systematic review. There are some conditions and contexts in which we are less certain about generalising the estimates from the PMA. These include conditions for which there is no similar condition already included in the review (e.g. vision impairments), or if CBT is being applied in contexts that we believe will vary substantially because of, among other things, cultural issues and health beliefs.

We have used a framework of broader generalisation that has considered pathophysiological rationale alongside the traditional quality indicators used in evidence-based medicine [534,535]. The early proponents of evidence-based medicine were more subtle in their approach, and demanded that values, circumstances, expertise and even pathophysiologic rationale be considered, especially when generalising evidence from clinical trials [536,537]. The most prominent evidence-based medicine rule of evidence is the GRADE system, which does not allow any role for pathophysiologic rationale at all, even though it does allow for recommendations to populations outside the trial (generalising) [5]. We suggested that using the results of the PMA alongside knowledge of mechanistic actions of CBT in particular situations may enable the generalisation of this effective treatment (CBT) to a greater range of physical and mental conditions (and hence patients).

Copyright © Queen’s Printer and Controller of HMSO 2021. This work was produced by Fordham et al. under the terms of a commissioning contract issued by the Secretary of State for Health and Social Care. This issue may be freely reproduced for the purposes of private research and study and extracts (or indeed, the full report) may be included in professional journals provided that suitable acknowledgement is made and the reproduction is not associated with any form of advertising. Applications for commercial reproduction should be addressed to: NIHR Journals Library, National Institute for Health Research, Evaluation, Trials and Studies Coordinating Centre, Alpha House, University of Southampton Science Park, Southampton SO16 7NS, UK.
Bookshelf ID: NBK567927
