A detailed summary of the critical appraisal of individual studies is provided in Appendix 5.
Randomized Controlled Trials (RCTs)
The RCTs included in this review were of mixed quality. Some RCTs included design features such as blinding of data analysts4 and adverse event tracking,4,6 and their reports included explicit descriptions of study methods to allow a comprehensive study appraisal. Others were poorly reported, and perhaps poorly conducted, for example omitting descriptions of important study processes such as randomization,5,6,8,10,11,16,17 allocation concealment,5,6,8,10,11,16,17 and intervention compliance.8,16 Most of the included RCTs provided an explicit description of study objectives, hypotheses, eligibility criteria, outcomes, and interventions, which allows for a thorough appraisal of study quality relating to these elements. With most studies omitting details of randomization and allocation concealment, the success of the randomization process in reducing or removing the influence of known and unknown confounders cannot be assessed. Similarly, only two of the RCTs included a power calculation and justification for the sample size,4,6 as evidence of the ability of the study to detect a clinically meaningful difference in measured outcomes. These two RCTs likewise included an analysis of clinical significance for primary outcomes, as did one further RCT that did not include a sample size calculation within the report.10 For the remaining five RCTs that included neither a power calculation nor an analysis of clinical significance, it remains unclear whether these studies were sufficiently powered to detect meaningful differences. Of note, two of these five studies were small pilot studies that were not intended for hypothesis testing.5,17 Two RCT reports included an explicit description of adverse events and a monitoring process,4,6 while the remaining RCT reports did not discuss safety. In these cases it is unclear whether adverse events were not monitored, not reported, or did not occur.
All but one report provided an explicit description of the sampling process;8 the omission raises the potential for selection bias within this RCT. All but two reports included a description of the therapist(s) and the training related to providing the study intervention,8,16 raising the potential for variation in intervention delivery within these two studies and thereby limiting the ability to associate the intervention with observed outcomes. Intervention compliance was adequate in four of the RCTs,4,6,11,17 not reported in two,8,16 and low in two.5,10 Low intervention compliance reduces the internal validity of study results, as the ability to measure relevant outcomes associated with the intervention decreases. An intention-to-treat analysis was completed in three RCTs.5,6,10 For the remaining studies, non-compliant patients were excluded from the analysis, raising the potential for overestimating treatment results.
While validated outcome questionnaires were used to assess outcomes across all included RCTs, the potential for social desirability bias cannot be ruled out with these self-report measures. Further, due to the nature of the intervention, blinding of patients or therapists is not possible, although in one study the personnel who entered and checked data were blinded to group assignment.4 The lack of blinding of patients and therapists across all included studies, in addition to the use of self-report measures, increases the potential for bias in outcome assessment, in particular since participants are aware that they received the intervention and of the desired direction of effect.
Observational Studies
As with the RCTs, the included observational studies were of mixed quality. Two design features common within this group were the non-randomized, uncontrolled nature of the nine included pre-post studies7,9,14,15,19–25 and the non-randomized nature of the one included cohort study.26 The lack of randomization increases the potential for selection bias, while the uncontrolled nature of the pre-post studies makes it impossible to distinguish intervention effects from other effects such as regression to the mean, natural progression, or social desirability. As with the RCTs, validated outcome measures were used across all observational studies, but due to the subjective nature of these self-report instruments, especially within a non-blinded, non-randomized, uncontrolled design, the potential for measurement error is increased. Finally, due to the nature of the intervention, blinding of patients or therapists was not possible. The lack of blinding, in addition to the use of self-report measures, increases the potential for bias in outcome assessment, in particular since participants are aware that they received the intervention and of the desired direction of effect.
As with the RCTs, most observational studies included an explicit description of study objectives, hypotheses, eligibility criteria, outcomes, and interventions, allowing for a comprehensive appraisal of study quality. Further, an explicit description of the therapist(s) and their related training across all studies provides assurance that the intervention was delivered consistently and as intended.
None of the included studies, however, provided a power calculation or justification for the number of included couples, although three studies were identified as pilot studies,7,9,19,20 where hypothesis testing was not the main goal. Despite not providing justification for the sample size, four of the included observational studies included an analysis of the clinical significance of the primary outcome, suggesting these studies were adequately powered to detect a meaningful difference.14,15,20,23,24 As with the RCT reports, none of the observational studies included information about adverse event tracking, raising the possibility that these important outcomes were not tracked, as opposed to not reported. External validity is further limited in five of the pre-post studies due to poor reporting of sampling procedures, in particular whether people who agreed to participate differed in any meaningful way from those who did not.7,9,19–21,24
Intervention compliance was adequate within seven of the ten included observational studies,7,9,19,20,22,23,25,26 but low within two.14,15,24 Compliance was not reported in one study.21 In each of the observational studies, non-compliant patients were excluded from the analysis, which raises the likelihood of overestimating treatment results, especially for the studies with low compliance.14,15,24 In one pre-post study, compliance was low within both study sites, but compliance rates differed significantly between the sites.14,15 In addition, the publication for this study notes considerable differences in participant characteristics across study sites in terms of ethnicity, religion, education, and income, as well as differences in intervention delivery in terms of focus, scope, and treatment duration.14,15 Given that results for this study were combined across study sites, these between-site differences increase the potential for bias in outcome measurement, since outcomes will be affected differently by both compliance and intervention delivery across the included participants.