Included under terms of UK Non-commercial Government License.
NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Corbett M, Chehadah F, Biswas M, et al. Certolizumab pegol and secukinumab for treating active psoriatic arthritis following inadequate response to disease-modifying antirheumatic drugs: a systematic review and economic evaluation. Southampton (UK): NIHR Journals Library; 2017 Oct. (Health Technology Assessment, No. 21.56.)
The effectiveness of SEC and CZP has been summarised in Chapter 3. Results for the main outcome measures, ACR, PsARC, PASI, HAQ-DI and HAQ-DI conditional on PsARC, for all the comparator agents (ETN, ADA, INF, GOL, UST and APR), have also been presented. These data indicate that all these agents demonstrate statistically significant clinical efficacy in PsA. In order to determine the relative efficacy of these agents it would be ideal to have the results from good-quality adequately powered RCTs comparing active treatments with one another. However, as the evidence base is made up almost entirely of comparisons with placebo, statistical methods for making indirect comparisons, such as an NMA, should be considered. NMA enables the comparison of multiple treatments using both direct comparisons of interventions within RCTs and indirect comparisons across trials based on a common comparator.111 As suggested by the term, NMA needs a ‘network of evidence’ to be established between all of the interventions of interest. The drugs being evaluated here all have a common comparator: placebo. It is this common comparator that allows the network between SEC, CZP and all the active comparators to be established and to provide information on the benefits of these agents relative to placebo and each other. The relevant comparators included in the evidence base are presented in Table 38 and the basic network diagram is presented in Figure 7.
TABLE 38
List of comparators included in evidence synthesis

FIGURE 7
Network of evidence (not outcome or subgroup specific). PLA, placebo.
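The principle of an indirect comparison through a common placebo comparator can be illustrated with a minimal sketch. This is a simple frequentist (Bucher-style) calculation, not the Bayesian model used in this report; the function name and all input values are hypothetical and chosen only for illustration.

```python
import math

def indirect_log_or(log_or_a_vs_pla, se_a, log_or_b_vs_pla, se_b):
    """Indirect log-OR of A versus B via a shared placebo comparator.

    The placebo effect cancels on the log-odds scale, and the variances
    of the two placebo-controlled estimates add.
    """
    diff = log_or_a_vs_pla - log_or_b_vs_pla
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se

# Hypothetical placebo-controlled estimates (not taken from any trial).
log_or_ab, se_ab = indirect_log_or(1.2, 0.30, 0.9, 0.40)
lower = log_or_ab - 1.96 * se_ab
upper = log_or_ab + 1.96 * se_ab
print(f"Indirect OR A vs B: {math.exp(log_or_ab):.2f} "
      f"(95% CI {math.exp(lower):.2f} to {math.exp(upper):.2f})")
```

The sketch shows why indirect estimates are less precise than either of the placebo-controlled estimates they are built from: the standard error of the indirect comparison is always larger than that of either input.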
Four separate outcomes were considered. Three outcomes were included in the NMA to inform the economic model: PsARC response; change of HAQ-DI score conditional on PsARC response; and PASI 50, PASI 75 and PASI 90 responses. In addition, ACR 20, ACR 50 and ACR 70 responses were analysed, as ACR response is the primary outcome in most of the included trials. Trials with data suitable for the NMA are identified in Table 39. Data from the 12-week time point were used, when available, otherwise data relating to the closest time point after 12 weeks were used (normally 14 or 16 weeks). Not all trials provided data for all of the outcomes analysed.
TABLE 39
Evidence on PsARC, HAQ-DI conditional on PsARC, PASI and ACR, by trial
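The time-point rule described above (use 12-week data when available, otherwise the closest time point after 12 weeks) can be sketched as follows. The function name and the example lists of reporting weeks are hypothetical.

```python
def select_timepoint(available_weeks, target=12):
    """Return the target week if reported, else the closest later week.

    Returns None when no time point at or after the target is available.
    """
    if target in available_weeks:
        return target
    later = [week for week in available_weeks if week > target]
    return min(later) if later else None

# Hypothetical reporting schedules for three trials.
print(select_timepoint([12, 24]))     # trial reporting at 12 and 24 weeks
print(select_timepoint([14, 24]))     # no 12-week data: next later week is used
print(select_timepoint([8, 16, 24]))  # earlier time points are not considered
```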
Framework of analyses
The evidence synthesis was undertaken using WinBUGS (version 1.4.3; MRC Biostatistics Unit, Cambridge, UK). WinBUGS is a Bayesian analysis software tool that, through the use of Markov chain Monte Carlo methods, evaluates posterior distributions for the parameters of interest given likelihood functions derived from data and prior probabilities (uninformative priors were used throughout). There were few individual studies on each treatment; therefore, fixed-effect models were used across studies in all analyses. Parameter estimates for all functional parameters were reported from the models. These differ by outcome, and further details are presented in Methods. Treatment effects were expressed in relation to placebo. Owing to the sparse evidence imposing a high level of uncertainty over estimates of functional parameters, point estimates are medians throughout. Some models assumed exchangeability across treatments within a class, that is, different treatments of the same class were assumed to be similar, rather than equal. Within such models we reported the relative effectiveness estimates for each treatment (called shrunken estimates), rather than the class means, allowing us to represent any residual differences across treatments.
The validity of an NMA depends on an assumption of homogeneity/exchangeability between all the trials included in the network [i.e. that there are no essential differences between the methods, populations and interventions being studied, and that any differences are a result of chance (as in a standard meta-analysis)]. The lack of homogeneity/exchangeability between studies involving one of the treatments of interest and studies involving the other treatments of interest may generate inconsistency. Checking for consistency in the current network was not possible because of the lack of trials that directly compare active agents. Our examination of the study details and patient characteristics (see Chapter 3, Characteristics of the randomised controlled trials included in the systematic review of short-term efficacy) identified that the trials of the newer agents (SEC, CZP, UST and APR) included biologic-experienced patients as well as biologic-naive patients. Given that it is evident from large observational data sets (see Chapter 3, Review of anti-tumour necrosis factor patient registry studies) that efficacy response rates in biologic-experienced patients are lower than in biologic-naive patients, it was considered inappropriate to conduct an ‘all-patients’ NMA for any outcome; instead, biologic-naive and biologic-experienced patients were analysed separately. Therefore, separate analyses (separate networks) for treatment-naive and treatment-experienced patients were constructed for each of the four outcomes: one each for PsARC, HAQ-DI conditional on PsARC, PASI 50, 75 and 90, and ACR 20, 50 and 70 responses. A summary of the trials reporting data on each of these outcomes is presented in Table 39. It should be noted that the NICE scope112 for the present appraisal subdivides biologic-naive patients into those who have not responded to one cDMARD and those who have not responded to two cDMARDs. However, sufficient data were not available for these further levels of subgroup analysis.
As discussed in Chapter 3, Evaluating the secukinumab and certolizumab pegol trial results in comparison with other treatments, another important difference between the included trials is in the observed results in the placebo arms, particularly for PsARC (see Table 40), PASI outcomes (see Table 50) and ACR (see Table 56). Our investigations on trial designs and patient characteristics did not identify any clear reasons for such differences, other than that placebo response rates appear to have increased over time. This observation (termed ‘placebo creep’) has been made in several other areas of clinical research and its impact on indirect treatment comparisons has been discussed.113 In the current review, across all trials, the PsARC placebo response rates are high, but are much higher in more recently conducted trials, and this has implications when interpreting unadjusted effect estimates. This is because the ceilings (maximum values) of RRs are limited by baseline response rates. For example, in the FUTURE 2 trial,48 the placebo response rate for PsARC in the biologic-naive subgroup was (confidential information has been removed), which meant that the maximum possible RR would be (confidential information has been removed); this maximum result is lower than some of the actual RRs for other biologics (see Table 40). Higher placebo rates therefore appear to dilute effect estimates somewhat. This is also demonstrated by examining the RRs moving up the ACR outcome thresholds from ACR 20 to ACR 70, which generally increase (see Table 29). However, it is not clear exactly how these varying placebo rates will affect treatment effects when calculated using ORs. The evidence synthesis – which was based on ORs – therefore explored a potential relationship between baseline risk and relative effectiveness. The NMA explored scenarios where a metaregression on baseline risk (i.e. placebo response) was implemented for PsARC, PASI and ACR outcomes, which imposes an interaction effect between baseline risk and relative effectiveness.114 Further details of these analyses are presented below. Given that HAQ-DI scores are modelled conditional on PsARC response, such an interaction effect was deemed to be less relevant, and a metaregression model was not implemented on HAQ-DI.
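The ceiling on RRs imposed by the placebo response rate can be shown with a short calculation. The placebo rates used here (23% and 46%) are illustrative values only, and the function names are hypothetical.

```python
def max_relative_risk(placebo_rate):
    """Largest possible RR: reached when every treated patient responds."""
    return 1.0 / placebo_rate

def odds_ratio(treatment_rate, placebo_rate):
    """OR of response for treatment versus placebo."""
    treatment_odds = treatment_rate / (1.0 - treatment_rate)
    placebo_odds = placebo_rate / (1.0 - placebo_rate)
    return treatment_odds / placebo_odds

# Illustrative placebo response rates (assumed for this example).
for placebo_rate in (0.23, 0.46):
    print(f"placebo rate {placebo_rate:.0%}: "
          f"maximum RR = {max_relative_risk(placebo_rate):.2f}")

# The OR has no such ceiling: it grows without bound as the
# treatment response rate approaches 100%.
print(f"OR at 99% response vs 46% placebo: {odds_ratio(0.99, 0.46):.1f}")
```

Doubling the placebo rate halves the maximum attainable RR, whereas the OR remains unbounded. This does not by itself show how ORs behave across trials with different placebo rates, which is the question the metaregression explores.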
Psoriatic Arthritis Response Criteria response
Subpopulation: biologic naive
Data
For the biologic-naive population, trial-specific PsARC response data were available from 14 trials47,48,50–56,58–61,65,66 of nine active treatments (150 mg of SEC, 300 mg of SEC, CZP, UST, GOL, ADA, INF, ETN and APR), and all treatments were compared with placebo (Table 40).
TABLE 40
Summary of trial-specific data in the biologic-naive subpopulation for PsARC response
The nine active treatments were categorised into three classes (anti-TNF, anti-IL and APR). Outcome data for GOL, INF and APR at 14–16 weeks, and for UST at 24 weeks, were included in the analysis and assumed equivalent to outcomes at 12 weeks. The inclusion of the 24-week PsARC data for UST was based on an assumption that they fairly reflected the 12-week results (subgroup results for PsARC at 12 weeks in the PSUMMIT 2 trial59,66 were not available, although 12-week data for the full population were available); this issue is discussed further in Appendix 3, Data used for the ustekinumab (PSUMMIT) trials. The trial-specific data included in the PsARC response analysis are presented in Table 40.
Methods
The NMA implemented separate models for the pooling of treatment effects and of placebo responses. We first implemented a model with independent treatment effects across treatments. Then a number of alternative models were implemented to explore the possibility of placebo response acting as a treatment effect modifier and, within this, whether or not there was similarity between treatment effects for treatments of the same class.
Exploring placebo response as a treatment effect modifier
An examination of individual trial results suggests that studies presenting higher placebo rates report lower relative effectiveness estimates (see Appendix 3, Detailed methods for the biologic-naive subpopulation). In addition, recent trials, which evaluate newer treatments, also tend to show higher placebo response rates. For example, a recent study on 300 mg of SEC showed a placebo response rate of 46% (the FUTURE 2 trial48), which is much higher than that reported in an earlier study evaluating ETN, of 23% (Mease et al.53). Our investigations regarding trial designs and patient characteristics did not identify a clear reason for such differences, although placebo response rates appear to have increased over time. We investigated the effect of placebo response as a potential treatment effect modifier. It should be noted that the source of any relationship between placebo response and treatment effect is unclear, and the reader should interpret the results carefully and with caution.
To account for the differences in placebo response rates across the trials, a metaregression was undertaken. The baseline risk estimated for each trial within the synthesis model was used as the adjustment covariate. This allows for uncertainty in the estimation of baseline risk to be considered in the adjustment alongside any correlation with the log-ORs. Note that the baseline risk is expressed as the log-odds of PsARC response in the placebo arm. As is typical of metaregression, the relationship between the treatment effect and baseline risk is defined by an interaction term (beta).
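One common way of writing a baseline-risk metaregression of this kind is sketched below. The exact parameterisation used in this report is given in Appendix 3; the notation here, including the centring constant \(\bar{\mu}\), is an assumption made for illustration.

```latex
\begin{align}
r_{ik} &\sim \mathrm{Binomial}(p_{ik},\, n_{ik}) \\
\operatorname{logit}(p_{ik}) &=
  \begin{cases}
    \mu_i, & \text{placebo arm of trial } i \\
    \mu_i + d_{t} + \beta\,(\mu_i - \bar{\mu}), & \text{arm of trial } i \text{ receiving treatment } t
  \end{cases}
\end{align}
```

Here \(r_{ik}\) and \(n_{ik}\) are the number of PsARC responders and patients in arm \(k\) of trial \(i\), \(\mu_i\) is the log-odds of response in the placebo arm (the baseline risk), \(d_t\) is the log-OR of treatment \(t\) versus placebo at the reference baseline \(\bar{\mu}\), and \(\beta\) is the interaction term. A negative \(\beta\) means that trials with a higher placebo log-odds are expected to show smaller log-ORs.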
Within the independent treatment-effects analysis, beta is estimated by comparing the treatment effects across multiple studies on the same treatment with different placebo response rates. Within the evidence base, not all treatments were studied in multiple trials. Thus, only a subset of treatments contribute evidence to estimate beta: ADA (ADEPT55 and Genovese et al.56), ETN (Mease et al.53,54) and APR (the PALACE 1,60,61 PALACE 261,65 and PALACE 361,65 trials). This limitation in the evidence base meant that beta had to be assumed to be independent of treatment (i.e. equal for all treatments). Moreover, the evidence base also showed that studies on the same treatment report reasonably similar placebo response rates. This may limit the validity of inferences over beta. For example, the Genovese et al.56 and ADEPT55 studies report placebo response rates of 27% and 26%, respectively, whereas across the whole set of studies the placebo response rates range from 21% to (confidential information has been removed).
As inferences on beta are drawn from differences between trials of the same treatment, a small difference in placebo rates paired with a sizeable difference in reported treatment effects is highly influential. The two trials on ADA (ADEPT55 and Genovese et al.56) illustrate this: the small (1 percentage point) difference in placebo response is associated with a 10 percentage point difference in response rate in the treatment group (from 51% to 61%). These data are thus influential to estimates of beta. Of the studies that contribute to inferences on beta, two trials have the smallest sample size of the whole set of trials: Mease et al.53 (ETN) and Genovese et al.56 (ADA). Given this, a sensitivity analysis excluding both Mease et al.53 and Genovese et al.56 was performed and effects on the estimate of beta ascertained (see Appendix 3, Detailed methods for the biologic-naive subpopulation for a more detailed account of the methods).53,56
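The influence of the two ADA trials can be illustrated with a crude two-point calculation of the slope of the log-OR against the placebo log-odds. This is not the Bayesian estimate of beta from the synthesis model, only a sketch using the rounded percentages quoted above. The text does not state which treatment-arm response rate belongs to which trial, so both pairings are shown; the function names are hypothetical.

```python
import math

def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))

def implied_slope(placebo_1, treat_1, placebo_2, treat_2):
    """Slope of log-OR against placebo log-odds implied by two trials."""
    log_or_1 = logit(treat_1) - logit(placebo_1)
    log_or_2 = logit(treat_2) - logit(placebo_2)
    return (log_or_2 - log_or_1) / (logit(placebo_2) - logit(placebo_1))

# Assumption: pairing of treatment-arm rates with trials is not given in
# the text, so both possible pairings are evaluated.
slope_a = implied_slope(0.26, 0.61, 0.27, 0.51)
slope_b = implied_slope(0.26, 0.51, 0.27, 0.61)
print(f"pairing A: {slope_a:.1f}, pairing B: {slope_b:.1f}")
```

Under either pairing, the implied slope is large in absolute value because the denominator (the difference in placebo log-odds) is very small. This is why a pair of trials with near-identical placebo rates can dominate the estimate of beta, and why the sensitivity analysis excluding the smallest trials is informative.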
Exploring treatment effects as class
In the context of an adjusted model for placebo response, we explored the possibility of there being class effects. Three different class groupings were considered: all treatments as a single class; all biologics as a class with APR separate; and, to reflect the pharmacology, anti-TNFs grouped, ILs grouped and APR separate. Additionally, for the last two groupings, we explored two within-class assumptions: assuming treatments within a class to have equal effectiveness and, alternatively, that treatments within a class have similar (exchangeable) effectiveness. Fixed effects across studies were assumed for all models. We did not consider models assuming exchangeability between classes.
Summary of all treatment effect models explored
All models implemented for the evidence synthesis of PsARC response are presented in Table 41. The models are numbered for ease of reference. Details of the models are presented in Appendix 3, Detailed methods for the biologic-naive subpopulation.
TABLE 41
Key assumptions of models implemented for the evidence synthesis of PsARC response
Model A1 considers the effectiveness of treatments as independent of each other. Model B1 considers the relative effectiveness of the alternative treatments as independent of each other, but that they all depend on the response in the placebo arm. Models C1, C2 and C3 consider the treatments as equal in terms of their effectiveness within class, but dependent on the effect of the placebo arm. Models D1 and D2 assume the treatments to have a similar, but not equal, effectiveness that is dependent on the effect of the placebo arm; these models introduce more flexibility than assuming treatment effects to be equal (models C2 and C3), but do not fully assume treatments to differ as in model A1. They allow for differences between the effectiveness of treatments that we may not be able to explain but that we should consider.
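The three within-class assumptions can be contrasted in a generic form. The notation below is an assumption made for illustration (in particular, whether the between-treatment variance \(\tau^2\) is shared across classes is not stated here); the models as implemented are described in Appendix 3.

```latex
\begin{align}
\text{Independent (A1, B1):} \quad & d_t \ \text{estimated separately for each treatment } t \\
\text{Equal within class (C1--C3):} \quad & d_t = D_{c(t)} \\
\text{Exchangeable within class (D1, D2):} \quad & d_t \sim \mathrm{Normal}\!\left(D_{c(t)},\, \tau^2\right)
\end{align}
```

Here \(d_t\) is the log-OR of treatment \(t\) versus placebo, \(c(t)\) is the class to which treatment \(t\) belongs and \(D_{c}\) is the class mean. Under exchangeability, each \(d_t\) is pulled towards its class mean by an amount that depends on \(\tau^2\) and on how much evidence informs that treatment, which is why the resulting estimates are described as shrunken.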
As stated earlier, sensitivity analyses around the adjustment for placebo response were performed: sets of analyses (models A1, B1, C1, C2, C3, D1 and D2) were conducted for PsARC response, excluding the Mease et al.53 and Genovese et al.56 trials.
Network meta-analysis results
Treatment effect models
Table 42 presents results of the treatment effects of PsARC response on the log-odds scale. Results are presented for all the alternative models with measures of goodness of fit. There were no issues with convergence. More detailed results of the models (A1, B1, C1, C2, C3, D1 and D2) are presented in Appendix 3, Detailed results for the biologic-naive subpopulation (ORs as well as log-odds, together with means, medians and 95% CIs are presented).
TABLE 42
Network meta-analysis results of PsARC response: log-ORs (median) of treatments analysed (including the studies of Genovese et al. and Mease et al.) in the biologic-naive subpopulation
The unadjusted model A1 indicates an appropriate model fit (with residual deviance close to the number of data points informing the model). The placebo response-adjusted model B1 fits well compared with the unadjusted model A1 [it presents a smaller deviance information criterion (DIC) and residual deviance, but not significantly so, as the difference in DIC is < 5 points].115 Model B1 imposes an association between the log-odds of placebo response and treatment effect. The estimated beta implies that a trial with higher odds of a placebo response is expected to report smaller treatment effects. Consider 300 mg of SEC in unadjusted model A1: the treatment effect is evaluated at 1.178, but the studies on this treatment have a higher log-odds of placebo response than those on the anti-TNFs. The treatment effects reported in the adjusted model assume all treatments were trialled with the same baseline risk. Thus, after adjustment for the placebo response in model B1, the treatment effect estimate for 300 mg of SEC is higher (2.110). This is why the results (and rankings) generated by model B1 are very different from the observed trial results and the results generated by model A1.
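The model comparison rule used above (a difference in DIC of fewer than 5 points is not treated as meaningful) can be sketched as follows. The DIC is the posterior mean deviance plus the effective number of parameters; the function names and all numerical inputs are hypothetical and are not the values in Table 42.

```python
def dic(mean_deviance, effective_parameters):
    """DIC = posterior mean deviance + effective number of parameters (pD)."""
    return mean_deviance + effective_parameters

def compare_dic(dic_first, dic_second, threshold=5.0):
    """Compare two models; lower DIC is better if the gap reaches the threshold."""
    difference = dic_first - dic_second
    if abs(difference) < threshold:
        return "no meaningful difference"
    return "second model preferred" if difference > 0 else "first model preferred"

# Hypothetical fit statistics for two competing models.
dic_unadjusted = dic(mean_deviance=30.0, effective_parameters=23.0)
dic_adjusted = dic(mean_deviance=27.5, effective_parameters=24.0)
print(compare_dic(dic_unadjusted, dic_adjusted))
```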
The assumptions imposed by the placebo-adjusted model may be difficult to justify or to refute, and the limitations in the evidence base that underlie its inferences also limit interpretation. First, the distinction between treatment effects and placebo effect is unclear. This is because newer treatments tested under higher placebo response rates show lower treatment effects, whereas older treatments tested under lower placebo response rates show higher treatment effects. Second, there is limited evidence on the effects of different placebo response rates for the newer treatments (SEC and CZP), as these drugs were studied in a single trial each.
We have further explored treatment effects as class. Model C1, which assumes that all treatments are equal, does not fit well with the existing data as it shows a much increased residual deviance. Models C2 and C3, which assume treatments to be equal within their class (model C2 separates APR from other drugs and model C3 separates ILs, anti-TNFs and APR), also do not fit well with the existing data, resulting in higher residual deviance and DIC. Models D1 and D2, however, relax the assumption of equality and apply a class effect where treatments within a class are assumed to be similar, not equal. These models fit equally well when compared with model B1 (similar DIC and residual deviance).
In all models exploring treatment effects as a class, the interaction term (beta) is negative. Among the best-fitting models (B1, D1 and D2), the most negative interaction term is observed in model B1. The interaction terms are similar between models D1 and D2.
In sensitivity analyses, we explored the effect of excluding the studies of Genovese et al.56 and Mease et al.53 on the placebo interactions (see Appendix 3, Detailed results for the biologic-naive subpopulation for details). The results showed that beta is still negative, although of lower absolute value.53,56
Preferred models
The unadjusted model A1 fits the data as well as any of the other models and generates results that reflect the observed results of individual trials. Alternatively, we considered a model adjusted for placebo response. Despite no clear rationale for why placebo response rate should affect the treatment effect, when allowing for such an association (model B1), lower treatment effects are expected with higher placebo response rates. The results (and rankings) attained with model B1 are very different to those evaluated in model A1, and depend on the credibility of the association assumed. Regarding possible class effects, the analyses found that an assumption of equal class effect for the treatments does not produce a better-fitting model (models C1, C2 and C3) than assuming independent treatment effects (models A1 and B1) or similar treatment effects (models D1 and D2). There was little difference in goodness-of-fit statistics (DIC and residual deviance) between models D1 and D2, and we consider the exchangeable class effect model (D2), which utilised two classes (anti-IL and anti-TNF) with APR separate, to be the most clinically plausible.53,56 Hence, we consider models A1 and D253,56 to be our preferred models for the economic model in Chapter 6. Given the limited effect in sensitivity analysis, the Genovese et al.56 and Mease et al.53 studies were included in the preferred models.
A comparison of these analyses with those presented in the company submissions (CSs; Novartis and UCB Pharma) and those in the previous MTA (Rodgers et al.33) is presented in Appendix 3, Comparison of the network meta-analysis of Psoriatic Arthritis Response Criteria responses in the company submissions (Novartis and UCB Pharma), a previous multiple technology appraisal (Rodgers et al.) and the current Assessment Group.
Table 43 presents the probabilities and ORs for PsARC response from these preferred models.
TABLE 43
Network meta-analysis results: probability of PsARC response and ORs by treatments in the biologic-naive subpopulation
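The link between a treatment effect on the log-odds scale and an absolute probability of response can be sketched as follows. The log-OR of 1.178 for 300 mg of SEC is the model A1 estimate quoted in the text; the baseline placebo probability of 30% is a hypothetical value chosen for illustration and is not the pooled placebo response used for Table 43.

```python
import math

def response_probability(placebo_probability, log_or):
    """Probability of response on treatment, given a baseline and a log-OR."""
    baseline_log_odds = math.log(placebo_probability / (1.0 - placebo_probability))
    treated_log_odds = baseline_log_odds + log_or
    return 1.0 / (1.0 + math.exp(-treated_log_odds))

baseline = 0.30        # hypothetical placebo response probability
log_or_sec300 = 1.178  # model A1 estimate for 300 mg of SEC quoted in the text
probability = response_probability(baseline, log_or_sec300)
print(f"Probability of PsARC response: {probability:.3f}")
```

Because the conversion depends on the baseline chosen, the same log-OR yields different absolute probabilities under different assumed placebo responses.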
The NMA that does not adjust for the placebo response finds that SEC is more effective than CZP, and both are more effective than UST and APR, but both are somewhat less effective than all comparator anti-TNFs. After adjusting for the unexplained increase in placebo rates seen in more recent trials (and, hence, of newer agents), and under a class effect that allows for exchangeability for treatments within each class, the probability of a response with SEC remains slightly higher than with CZP and both remain more effective than UST and APR, but now their probability of response is similar to, or only slightly less than, that of the anti-TNF comparators.
These results indicate that, although SEC and CZP are effective in terms of the PsARC outcome, the relative effectiveness of these biologics compared with ETN, ADA, GOL, UST and INF and with each other, is uncertain. Both agents do seem to be more effective than APR.
Subpopulation: biologic experienced
For the biologic-experienced population, trial-specific PsARC response data were available from three trials for three active treatments (300 mg of SEC, CZP and UST), all compared with placebo.47,48,59,66 However, the data from the CZP trial were not included in the analysis, as the RAPID-PsA trial excluded patients with primary failures of a prior anti-TNF (i.e. no response within the first 12 weeks of treatment) from being recruited in its biologic-experienced population and so is not comparable to the other two trials. The data included in the NMA for treatment-experienced patients are presented in Table 44.
TABLE 44
Summary of trial-specific data in the biologic-experienced subpopulation for PsARC response outcome
The NMA conducted for the synthesis of data in the biologic-experienced population is the same as that implemented in the treatment-naive population: treatment effects are assumed to be independent and the model assumed fixed effects across trials. The evidence for the biologic-experienced subpopulation was sparse. The results of the analysis are presented in Table 45. They show that the probability of a PsARC response is higher with SEC than with UST, but the CrIs overlap and the difference is unlikely to be significant. The results are comparable to the observed data (compare Tables 44 and 45) and consistent with those of the biologic-naive subpopulation (compare Tables 43 and 45).
TABLE 45
Network meta-analysis results of PsARC response: probability of a PsARC response, ORs and treatment effects on a log-scale in the biologic-experienced subpopulation
Health Assessment Questionnaire-Disability Index changes conditional on Psoriatic Arthritis Response Criteria response/non-response
Subpopulation: biologic naive
Data
For the biologic-naive population, HAQ-DI changes conditional on PsARC responses were available for nine active treatments (150 mg of SEC, 300 mg of SEC, CZP, UST, GOL, ADA, INF, ETN and APR) from 13 trials (see Table 39).47,48,50–52,54–56,58–61,65,66 The data for HAQ-DI change conditional on PsARC response are presented in Table 46.
TABLE 46
The HAQ-DI changes conditional on PsARC response and non-response by trials and treatments in the biologic-naive subpopulation: observed data
Outcome data for GOL and INF at 14–16 weeks, and for UST at 24 weeks, were included in the analysis and assumed equivalent to outcomes at 12 weeks. The rationale for the inclusion of the 24-week data for UST is discussed in Appendix 3, Data used for the ustekinumab (PSUMMIT) trials. The observed data indicate that HAQ-DI changes conditional on PsARC response do vary by treatment, ranging between (confidential information has been removed) (300 mg of SEC, FUTURE 2 trial48) and –0.290 (APR, PALACE 3 trial61,65). The observed HAQ-DI changes conditional on PsARC non-response in treatments range between (confidential information has been removed) (150 mg of SEC, FUTURE 2 trial48) and –0.049 (GOL, GO-REVEAL trial50).
For the placebo arms, the observed HAQ-DI changes conditional on PsARC response and non-response differ between trials [ranging between (confidential information has been removed) (FUTURE 2 trial48) and –0.160 (IMPACT 252) for response, and from (confidential information has been removed) (RAPID-PsA trial47) to 0.070 (IMPACT 252) for non-response].
The observed HAQ-DI changes conditional on PsARC response and non-response with treatments are greater than with placebo in all trials.
Methods
We consider three models to estimate the HAQ-DI changes conditional on PsARC responder or non-responder status. Detailed descriptions of the models and their underlying assumptions are presented in Appendix 3, Detailed methods for the biologic-naive subpopulation. Model E1 considers that treatments are independent and considers fixed effects across studies. Models E2 and E3 apply a class effect comprising three groups: anti-TNFs, ILs and APR. This class effect reflects the best-fitting class effect model for PsARC (see Network meta-analysis results). Model E2 assumes that the treatments are similar within class (exchangeable) and considers fixed effects across studies; model E3 considers that the treatments are equal within class and considers fixed effects across studies.
Network meta-analysis results
The results are presented as absolute changes in HAQ-DI score in relation to baseline (Table 47). More detailed results are presented in Appendix 3, Detailed results for the biologic-naive subpopulation.
TABLE 47
Network meta-analysis results of HAQ-DI score changes (median) conditional on PsARC response and non-response in the biologic-naive subpopulation
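Conditional HAQ-DI changes can be combined with a probability of PsARC response to give an expected change for a treated cohort, using the law of total expectation. The sketch below is an illustration only: how these quantities are combined in the economic model is described in Chapter 6, and the function name and all input values here are hypothetical.

```python
def expected_haq_change(response_probability, change_if_responder,
                        change_if_non_responder):
    """Cohort-level expected HAQ-DI change, weighting by response status."""
    return (response_probability * change_if_responder
            + (1.0 - response_probability) * change_if_non_responder)

# Hypothetical inputs: 60% PsARC response, HAQ-DI change of -0.50 in
# responders and -0.10 in non-responders (negative values are improvements).
change = expected_haq_change(0.60, -0.50, -0.10)
print(f"Expected HAQ-DI change: {change:.2f}")
```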
The model fit statistics (DIC) indicate that neither class effect model (E2 or E3) is a better fit for the data than the unadjusted, independent treatments model (E1). The class effect models had similar fits, but the one that allowed exchangeability within classes (E2) was considered to be the most clinically plausible. For the purposes of the economic model, in Chapter 6, models E1 and E2 were the preferred models.
The results from the two preferred models are similar. The results from the unadjusted independent treatment effects model found that significant reductions in mean HAQ-DI score were achieved with response to all nine treatments and response to placebo. However, patients who responded to placebo achieved a lower level of improvement in the HAQ-DI score than those who responded to active treatment. Furthermore, the improvement in response to placebo is below the minimally important difference for PsA of –0.35.116
The median HAQ-DI change conditional on response was greatest with INF and ETN, followed by 300 mg of SEC, but 150 mg of SEC and CZP were worse than all treatments except for APR.
Subpopulation: biologic experienced
For the biologic-experienced population, HAQ-DI changes conditional on PsARC responses were available for three active treatments (300 mg of SEC, CZP and UST) from three trials.47,48,59,66 However, the data from the CZP trial were not included in the analysis as the biologic-experienced population in the RAPID-PsA trial is not comparable to that in the other two trials48,59,66 (see Psoriatic Arthritis Response Criteria response, Subpopulation: biologic experienced). The data included in the NMA for treatment-experienced patients are presented in Table 48.
TABLE 48
The HAQ-DI score changes conditional on PsARC response and non-response by trials and treatments in the biologic-experienced subpopulation: observed data
Outcome data at 24 weeks were included in the analysis and assumed equivalent to outcomes at 12 weeks [see Appendix 3, Data used for the ustekinumab (PSUMMIT) trials]. The observed data indicate that, as in the treatment-naive subgroup, HAQ-DI changes conditional on PsARC response do vary by treatment. The observed HAQ-DI changes conditional on PsARC response and non-response in placebo arms differ between trials. The observed HAQ-DI changes conditional on PsARC response and non-response with treatments are greater than with placebo in all trials.
The NMA conducted for the synthesis of data in the biologic-experienced population is the same as that implemented in the treatment-naive population: treatment effects are assumed to be independent and the model assumed fixed effects across trials. No class effect assumption was made for this subgroup analysis. The results are presented as absolute changes in HAQ-DI score in relation to baseline (Table 49). These results are generally comparable with the observed estimates from the primary studies.
TABLE 49
Network meta-analysis results of evidence synthesis of HAQ-DI changes conditional on PsARC response and non-response in biologic-experienced subpopulation
The results from the independent treatment effects model found that significant reductions in mean HAQ-DI score were achieved with response to SEC and UST, and response to placebo. As for the biologic-naive patients, those who responded to placebo achieved a lower level of improvement in the HAQ-DI score than those who responded to active treatments. Furthermore, the improvement in responders to placebo is below the minimally important difference for PsA of –0.35.116
Psoriasis Area and Severity Index response
Subpopulation: biologic naive
Data
For the biologic-naive population, PASI response data were available for nine active treatments (150 mg of SEC, 300 mg of SEC, CZP, UST, GOL, ADA, INF, ETN and APR) from 13 trials47–67 (see Table 2). A brief summary of PASI responses in different trials is presented in Table 50. Outcomes at 14 and 16 weeks were included in the analysis and assumed to be equivalent to outcomes at 12 weeks. Data from the 12-week time point were used for the two PSUMMIT trials. Not all patients who were randomised to trials were eligible for the PASI evaluation, and the proportion of PASI-evaluable patients differed between trials, ranging between 42% and 84% in treatment arms and between 31% and 87% in placebo arms. All trials reported PASI 50 and PASI 75, except the PSUMMIT 2 and SPIRIT-P1 (Study of Ixekizumab in Participants With Active Psoriatic Arthritis) trials,57,59,66,67 which did not report PASI 50. A few trials did not report PASI 90 (i.e. the PALACE trials,60,61,65 Mease et al.53 and PSUMMIT 259,66).
TABLE 50
Summary of trial-specific data in the biologic-naive subpopulation for PASI response outcome
Methods
The NMA for PASI utilised a framework of analysis that evaluated the probability of PASI responses in different categories of PASI thresholds (50/75/90) within a single model:117 the single model included all categories of PASI and generated a single effect estimate for each treatment and also probabilities of achieving PASI 50, PASI 75 and PASI 90.
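As an illustration of how a single treatment effect on a probit scale can yield probabilities for several PASI thresholds at once, the following is a minimal sketch rather than the model code used in the report (which is in Appendix 3). It assumes that the probability of reaching at least a given threshold is Φ(baseline + treatment effect − cut-off). The sign convention, the function name `pasi_probabilities` and all numerical values are hypothetical and are not taken from any table in this chapter.

```python
from statistics import NormalDist


def pasi_probabilities(baseline_probit, treatment_effect, cutoffs):
    """Return P(response >= threshold) for each PASI threshold.

    Assumed parameterisation (illustrative only):
        P(>= threshold j) = Phi(baseline_probit + treatment_effect - cutoff_j)
    where Phi is the standard normal CDF and cut-offs increase with the
    stringency of the threshold.
    """
    phi = NormalDist().cdf
    return {
        label: phi(baseline_probit + treatment_effect - cut)
        for label, cut in cutoffs.items()
    }


# Hypothetical values chosen only to show the mechanics.
cutoffs = {"PASI 50": 0.0, "PASI 75": 0.6, "PASI 90": 1.2}
placebo = pasi_probabilities(baseline_probit=-1.0, treatment_effect=0.0, cutoffs=cutoffs)
active = pasi_probabilities(baseline_probit=-1.0, treatment_effect=1.5, cutoffs=cutoffs)

for label in cutoffs:
    print(f"{label}: placebo {placebo[label]:.3f}, active {active[label]:.3f}")
```

Because one effect parameter shifts the whole latent scale, the ordering PASI 50 ≥ PASI 75 ≥ PASI 90 is preserved automatically for every treatment.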
Reflecting the analyses on PsARC, alternative assumptions were tested in two analyses. The first analysis assumed independent treatment effects and did not include any metaregression for placebo effects (model F1). As the number of trials to inform each treatment effect was small, a fixed-effect model was used. In a second analysis, we explored the impact on treatment effects of adjusting for placebo responses [i.e. baseline effects (metaregression model)]. As can be seen in Table 50, there were large differences between trials for PASI responses in the placebo arms, ranging between 0% (in IMPACT51) and 27% (in RAPID-PsA47). The IMPACT trial51 had a very small sample size and reported 0% response in the placebo arm and 100% response in the treatment arm, which led to very extreme values for placebo adjustment. Therefore, IMPACT51 could not be included in the metaregression analysis. Unlike the analysis for PsARC, for PASI, we did not assume a class effect as the evidence from individual trials does not support such an assumption. Table 51 presents the key assumptions for the models implemented for the PASI response. The detailed model assumptions are presented in Appendix 3, Detailed methods for the biologic-naive subpopulation.
TABLE 51
Summary of models implemented for evidence synthesis of the PASI response
Model F1 considers that treatments are independent of each other and assumes fixed effects on cut-off points/thresholds. Model G1 considers the same assumption as model F1, but IMPACT51 was excluded from the analysis. Model G2 assumes that treatments are independent of each other, but treatment effects are adjusted with the trial-specific baseline effects assuming a common interaction term (beta).
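The baseline adjustment in model G2 can be pictured with a small sketch. It assumes the common form of a baseline-risk metaregression, in which the trial-specific effect equals the treatment effect at a reference placebo response plus a common interaction term (beta) multiplied by the trial's deviation from that reference. The function name, the centring choice and all numbers are hypothetical; the report's actual specification is in Appendix 3.

```python
def adjusted_effect(d, beta, trial_baseline, reference_baseline):
    """Trial-specific treatment effect under a baseline-risk metaregression.

    Assumed form (illustrative only):
        effect_in_trial = d + beta * (trial_baseline - reference_baseline)
    All quantities are on the same (probit) scale.
    """
    return d + beta * (trial_baseline - reference_baseline)


# Hypothetical values: one treatment effect, a negative interaction term,
# and two trials whose placebo arms differ in response.
d = 1.0
beta = -0.5
reference = 0.0

low_placebo_trial = adjusted_effect(d, beta, trial_baseline=-0.4, reference_baseline=reference)
high_placebo_trial = adjusted_effect(d, beta, trial_baseline=0.4, reference_baseline=reference)

print(f"low placebo response trial:  {low_placebo_trial:.2f}")
print(f"high placebo response trial: {high_placebo_trial:.2f}")
```

Under this parameterisation a single beta is shared by all treatments, which is what allows it to be estimated when each treatment is informed by only a few trials.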
Network meta-analysis results
Table 52 presents the results of the treatment effects for the PASI responses estimated from the three models with measures of goodness of fit. There were no issues with convergence.
TABLE 52
Network meta-analysis results of the PASI response: treatment effects (median) on a probit scale in the biologic-naive subpopulation
The results of models G1 and F1 are similar, except for a small effect on the estimate of effect for INF; therefore, model F1 is the preferred unadjusted model, as it does not exclude any trial evidence. In model G2, the DIC and residual deviance are lower than in model G1, indicating that the model fits well with the existing data and the data support the assumption of adjustment with baseline effects.
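For readers less familiar with these fit statistics, the DIC is the posterior mean residual deviance plus the effective number of parameters, and lower values indicate a better trade-off between fit and complexity. The sketch below ranks models by DIC and flags those within a chosen margin of the best. The margin of 3 points is a commonly quoted rule of thumb and is an assumption here, as are the model labels and DIC values, which are hypothetical and not taken from Table 52.

```python
def rank_by_dic(dic_by_model, threshold=3.0):
    """Rank models by DIC (lower is better).

    Returns a list of (model, difference from best DIC, comparable flag),
    where the flag is True if the model lies within `threshold` points of
    the best-fitting model. The threshold is a rule of thumb, not a test.
    """
    best = min(dic_by_model.values())
    ranked = sorted(dic_by_model.items(), key=lambda item: item[1])
    return [(name, dic - best, (dic - best) < threshold) for name, dic in ranked]


# Hypothetical DIC values for illustration only.
example = {"model A": 210.4, "model B": 205.1, "model C": 206.9}
ranking = rank_by_dic(example)

for name, delta, comparable in ranking:
    note = "comparable to best" if comparable else "meaningfully worse"
    print(f"{name}: DIC difference {delta:.1f} ({note})")
```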
Table 53 shows the probabilities of achieving PASI 50, PASI 75 and PASI 90 from the preferred unadjusted and placebo-adjusted models in the biologic-naive population.
TABLE 53
Network meta-analysis results of the PASI response: probability of achieving PASI 50, PASI 75 and PASI 90 in the biologic-naive subpopulation
The results of the unadjusted NMA for the PASI, as a single outcome or as separate categorical variables, show that all treatments are more effective than placebo. The difference between treatments is uncertain, with wide CrIs that mostly overlap with each other. The results show that patients taking INF have the highest probability of achieving PASI 50, PASI 75 and PASI 90 responses. However, after adjustment for placebo, 300 mg of SEC has the highest probability of response. The probabilities for CZP changed between the models. It appears to be less efficacious than all other treatments, except APR and ETN, in achieving PASI responses in the unadjusted model. However, in the adjusted model, it appears to be more efficacious than GOL, UST, APR and ETN, and similar to ADA. The estimated probabilities from the analysis reflect fairly closely those from the primary studies, indicating that the model fits the data well.
Subpopulation: biologic experienced
For the biologic-experienced population, trial-specific PASI response data were available for three active treatments (300 mg of SEC, CZP and UST) from three trials,47,48,59,66 but, as for the other outcomes, the data from the CZP trial were not included in the analysis as the biologic-experienced population in the RAPID-PsA trial47 is not comparable to the population in the other two trials48,59,66 (see Psoriatic Arthritis Response Criteria response, Subpopulation: biologic experienced). The data included in the NMA for the treatment-experienced patients are presented in Table 54.
TABLE 54
Summary of trial-specific data in the biologic-experienced subpopulation for PASI response outcome
In the FUTURE 2 trial,48 only a small proportion of patients were eligible for the PASI evaluation: 33% in the treatment arm and 34% in the placebo arm. The small sample size and associated lack of events in this placebo arm increase uncertainty in the analysis.
A NMA was conducted under the same specification as used in model F1 (independent treatments, unadjusted biologic-naive analysis). Because the data were sparse, no adjustment was undertaken for this subgroup analysis. The results of the analysis are presented in Table 55.
TABLE 55
Network meta-analysis results of the PASI response: probability of achieving PASI 50, PASI 75, PASI 90 and treatment effects in the biologic-experienced subpopulation
The results show that the probability of achieving a PASI response in all categories is much higher with SEC than with UST, although the estimates are highly uncertain, with wide CrIs that overlap with each other. The results are fairly comparable with the observed data.
American College of Rheumatology response
Subpopulation: biologic naive
Data
For the biologic-naive population, evidence on ACR response was available for nine active treatments (150 mg of SEC, 300 mg of SEC, UST, CZP, GOL, ADA, INF, ETN and APR) from 15 trials.47,48,50–61,65–67 A brief summary of the ACR responses in the different trials is presented in Table 56. Outcomes at 14 and 16 weeks were included in the analysis and assumed to be equivalent to outcomes at 12 weeks. All 15 trials reported all three categories of ACR response (20/50/70).
TABLE 56
Summary of trial-specific data in the biologic-naive subpopulation for ACR response outcome
Methods
As ACR is, like PASI, a categorical variable (ACR 20, ACR 50 and ACR 70), the NMA for ACR utilised a similar framework of analysis to that used to estimate the probability of PASI responses: all categories of ACR were within a single model which generated a single effect estimate for each treatment and also probabilities of achieving an ACR 20, ACR 50 and ACR 70.
Analogously to the analyses on PsARC, sets of alternative analyses were conducted for ACR response outcomes. We explored the effect of differences in trial-specific placebo responses on treatment effect by undertaking a metaregression. In the context of an adjusted model for placebo response, we explored the possibility of there being class effects. Three different class groupings were considered: all treatments as a single class; all biologics as a class with APR separate; and, to reflect the pharmacology, anti-TNFs grouped, ILs grouped and APR separate. In addition, we explored two within-class assumptions: assuming treatments within a class to have equal effectiveness and, alternatively, assuming that those treatments within a class have similar (exchangeable) effectiveness. Fixed effects across studies were assumed for all models. We have not considered models assuming exchangeability between classes.
Summary of all treatment effect models explored
All models implemented for the evidence synthesis of an ACR response are presented in Table 57. Detailed coding of the models is presented in Appendix 3, Detailed methods for the biologic-naive subpopulation.
TABLE 57
Key assumptions of models implemented for evidence synthesis of ACR response
Model H1 considers that the treatments are independent of each other. Model I1 considers the relative effectiveness of the alternative treatments as independent of each other, but that they all depend on the response in the placebo arm. Model J1 considers the treatments as equal in terms of their effectiveness, but dependent on the effect of the placebo arm. Models J2 and J3 consider the treatments as equal in terms of their effectiveness within class, but dependent on the effect of the placebo arm. Models K1 and K2 assume the treatments to have a similar, but not equal, effectiveness and to be dependent on the effect of the placebo arm.
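The three kinds of treatment-effect assumption described above can be written compactly. This is a schematic under assumed notation, not the report's own specification (see Appendix 3): d_k is the effect of treatment k relative to placebo on the probit scale, c(k) is the class to which treatment k belongs, D_c is the class mean and τ is an assumed within-class standard deviation.

```latex
\begin{aligned}
\text{Independent effects (H1, I1):}\quad
  & d_k \ \text{estimated separately for each treatment } k \\
\text{Equal effects within class (J1, J2, J3):}\quad
  & d_k = D_{c(k)} \\
\text{Exchangeable effects within class (K1, K2):}\quad
  & d_k \sim \mathrm{N}\!\left(D_{c(k)},\, \tau^{2}\right)
\end{aligned}
```

In model J1 all treatments form a single class, so every active treatment shares one effect. In the exchangeable models, treatments in the same class are shrunk towards a common class mean but are allowed to differ. Whether τ is common to all classes or class specific is not stated in this section and is left open in this sketch.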
Network meta-analysis results
Table 58 presents the results of the treatment effects for ACR responses estimated from the seven models with measures of goodness of fit. There were no issues with convergence.
TABLE 58
Network meta-analysis results of ACR response: treatment effects (median) on a probit scale in a biologic-naive subpopulation
The placebo response-adjusted model I1 fits well compared with the unadjusted model H1 (smaller DIC and residual deviance), but is not significantly better. In addition, the results (rankings) generated by model I1 are very different from the observed trial results. Models J1, J2 and J3 do not fit the existing data well, as shown by their significantly higher residual deviance and DIC. Both models K1 and K2 fit as well as the unadjusted model H1 (similar DIC and residual deviance).
Among all the placebo response-adjusted models, models I1, K1 and K2 show similar DIC and residual deviance, which means that these three models fit the existing data equally well, although not significantly better than the unadjusted model.
The interaction term (beta) is negative in all models, which means that higher placebo response rates in trials are associated with lower treatment effects, demonstrating that adjustment for heterogeneity in the placebo responses across trials was required. The interaction term varies between models, but is similar between models K1 and K2.
Preferred models
The unadjusted model, H1, fits the data as well as any of the other models and generates results that reflect the observed results. Considering the placebo-adjusted models, the results (rankings) generated by model I1 are very different from the observed trial results and from the results generated by model H1. Using an assumption of equal class effect for the treatments (models J1, J2, J3) does not produce a better-fitting model than assuming independent treatment effects (models H1, I1) or similar (exchangeable) treatment effects (models K1, K2). In addition, there was little difference in the goodness-of-fit statistics (DIC and residual deviance) between models K1 and K2, and we consider the exchangeable class effect model, which utilised two classes (anti-ILs and anti-TNFs) with APR separate, to be the most clinically plausible. Hence, our preferred models are models H1 and K2. Note that the economic model uses PsARC; thus, these results were not implemented in the economic model in Chapter 6.
Table 59 presents the probabilities of achieving ACR 20, ACR 50 and ACR 70 responses in a biologic-naive population from the preferred models, H1 and K2.
TABLE 59
Network meta-analysis results of ACR response: probability of achieving ACR 20, ACR 50 and ACR 70 responses in a biologic-naive subpopulation
The results of the unadjusted NMA for ACR, as a single outcome or as separate categorical variables, show that all treatments are more effective than placebo. The difference between treatments is uncertain, with wide CrIs that mostly overlap with each other. The results show that patients taking INF have the highest probability of achieving ACR 20, ACR 50 and ACR 70 responses. The probabilities for SEC are lower than those for INF, ETN, GOL and ADA. After adjustment for placebo, the probabilities for 300 mg of SEC and 150 mg of SEC increase and are very similar to those for INF. The probabilities of achieving ACR 20, ACR 50 and ACR 70 responses with CZP varied between the models: in the unadjusted model the probabilities were higher than only those for APR and UST, but after adjustment they were also higher than those for GOL, ADA and UST.
Subpopulation: biologic experienced
For the biologic-experienced population, trial-specific ACR response data were available for three active treatments (300 mg of SEC, CZP and UST) from three trials,47,48,59,66 but, as for the other outcomes, the data from the CZP trial were not included in the analysis as the biologic-experienced population in the RAPID-PsA trial is not comparable to the populations of the other two trials.48,59,66 The data included in the NMA for treatment-experienced patients are presented in Table 60.
TABLE 60
Summary of trial-specific data in a biologic-experienced subpopulation for ACR response outcome
The NMA model was similar to model H1 (independent treatment effects) used for the biologic-naive subpopulation. Owing to the lack of data, no adjustment was undertaken for this subgroup analysis.
The results of the analysis are presented in Table 61 and show that the probabilities of achieving an ACR response in all categories are slightly higher with UST than with SEC, although the differences are not statistically significant. The results are fairly comparable to the observed data (compare Tables 60 and 61).
TABLE 61
Network meta-analysis results of ACR response: probability of achieving ACR 20, ACR 50 and ACR 70 responses, and treatment effects in a biologic-experienced subpopulation
Limitations
Data were sparse; there were few studies for each treatment [a maximum of three studies, for two treatments (ADA55–57,67 and APR60,61,65)]. For this reason, we were not able to fit random-effects models, especially when considering placebo adjustment. Hence, fixed-effect models were used in all analyses.
Summary of findings of relative efficacy from network meta-analysis
The NMA was conducted to formally investigate the relative efficacy of SEC and CZP and the other active comparators. Analyses were conducted on four outcomes: PsARC, HAQ-DI conditional on PsARC response, PASI and ACR. Analyses were not run for the full-trial populations because of the heterogeneity across trials, but instead were performed separately for the biologic-naive and biologic-experienced subgroups. The data suggest the rate of placebo response to be a potential source of heterogeneity within the biologic-naive population networks, despite there being no clear rationale for such an effect. For this reason, we explored models that adjust for the placebo response, alongside unadjusted models.
Biologic-naive patients
In terms of PsARC response, the results indicated that, although SEC and CZP are effective, the relative effectiveness of these biologics compared with ETN, ADA, GOL and INF, and with each other, is uncertain, although both agents do seem to be more effective than APR.
In terms of HAQ-DI conditional on PsARC response, the results from the preferred adjusted model were similar to the independent treatment effect analysis. The results from the unadjusted independent treatment effects model showed that significant reductions in mean HAQ-DI score were achieved with response to all nine treatments and response to placebo, although the improvement in response to placebo is below the minimum clinically significant threshold for PsA of –0.35.116 The median HAQ-DI score change was highest with INF and ETN, followed by 300 mg of SEC, but 150 mg of SEC and CZP were worse than all treatments except for APR.
The results of the unadjusted NMA for PASI, as a single outcome or as separate categorical variables, indicated that all treatments were more effective than placebo. The difference between treatments was uncertain, with wide CrIs that mostly overlap with each other. The results showed that patients treated with INF have the highest probability of achieving PASI 50, PASI 75 and PASI 90 responses. However, after adjustment for placebo, 300 mg of SEC has the highest probability of response. The probabilities for CZP changed between the models. It appears to be less efficacious than all other treatments, except APR and ETN, in achieving PASI responses in the unadjusted model. However, in the adjusted model, CZP appears to be more efficacious than GOL, UST, APR and ETN, and similar to ADA.
Similarly, for ACR responses, differences between treatments were uncertain, with wide CrIs that mostly overlapped with each other. The unadjusted results suggested that patients taking SEC or CZP had lower probabilities of a response than those for INF, ETN, GOL and ADA. After adjustment for placebo response, the probabilities of a response for both SEC and CZP increased; those for SEC were very similar to those for INF.
Biologic-experienced patients
The evidence for the biologic-experienced subpopulation is very sparse with only two trials evaluating two treatments. Hence, only two treatments (SEC and UST) could be included in these analyses. The results showed that, across all outcomes analysed, both SEC and UST were significantly more effective than placebo. Most of the results suggested SEC may be better than UST, although the results were uncertain with wide overlapping CrIs.