Vedolizumab for induction and maintenance of remission in Crohn's disease

Abstract

Background
Vedolizumab blocks inflammatory activity within the gastrointestinal tract. Systematic reviews have demonstrated the efficacy of vedolizumab in ulcerative colitis and inflammatory bowel disease in general. This systematic review and meta‐analysis summarises the current evidence of vedolizumab in the induction and maintenance of remission in Crohn's disease.

Objectives
To evaluate the benefits and harms of vedolizumab versus placebo for the induction and maintenance of remission in people with Crohn's disease.

Search methods
We used standard, extensive Cochrane search methods. The latest search date was 30 November 2022.

Selection criteria
We included randomised controlled trials (RCTs) and quasi‐RCTs comparing vedolizumab to placebo for the induction or maintenance of remission in people with Crohn's disease.

Data collection and analysis
We used standard Cochrane methods. For induction studies, the primary outcome was 1. clinical remission, and secondary outcomes were rates of 2. clinical response, 3. adverse events, 4. serious adverse events, 5. surgery, 6. endoscopic remission and 7. endoscopic response. For maintenance studies, the primary outcome was 1. maintenance of clinical remission, and secondary outcomes were rates of 2. adverse events, 3. serious adverse events, 4. surgery, 5. endoscopic remission and 6. endoscopic response. We used GRADE to assess certainty of evidence.

Main results
We analysed induction (4 trials, 1126 participants) and maintenance (3 trials, 894 participants) studies representing people across North America, Europe, Asia and Australasia separately. One maintenance trial administered subcutaneous vedolizumab whilst the other studies used the intravenous form. The mean age ranged between 32.6 and 38.6 years.
Vedolizumab was superior to placebo for the induction of clinical remission (71 more per 1000 with clinical remission with vedolizumab; risk ratio (RR) 1.61, 95% confidence interval (CI) 1.20 to 2.17; number needed to treat for an additional beneficial outcome (NNTB) 13; 4 studies; high‐certainty evidence) and superior to placebo for inducing clinical response (105 more per 1000 with clinical response with vedolizumab; RR 1.43, 95% CI 1.19 to 1.71; NNTB 8; 4 studies; high‐certainty evidence).
For the induction phase, vedolizumab may be equivalent to placebo for the development of serious adverse events (9 fewer serious adverse events per 1000 with vedolizumab; RR 0.91, 95% CI 0.62 to 1.33; 4 studies; low‐certainty evidence) and probably equivalent to placebo for overall adverse events (6 fewer adverse events per 1000 with vedolizumab; RR 1.01, 95% CI 0.93 to 1.11; 4 studies; moderate‐certainty evidence).
Vedolizumab was superior to placebo for the maintenance of clinical remission (141 more per 1000 with maintenance of clinical remission with vedolizumab; RR 1.52, 95% CI 1.24 to 1.87; NNTB 7; 3 studies; high‐certainty evidence).
During the maintenance phase, vedolizumab may be equivalent to placebo for the development of serious adverse events (3 fewer serious adverse events per 1000 with vedolizumab; RR 0.98, 95% CI 0.68 to 1.39; 3 studies; low‐certainty evidence) and probably equivalent to placebo for the development of overall adverse events (0 difference in adverse events per 1000; RR 1.00, 95% CI 0.94 to 1.07; 3 studies; moderate‐certainty evidence).

Authors' conclusions
High‐certainty data across four induction and three maintenance trials demonstrate that vedolizumab is superior to placebo in the induction and maintenance of remission in Crohn's disease. Overall adverse events are probably similar and serious adverse events may be similar between vedolizumab and placebo during both induction and maintenance phases of treatment.
Head‐to‐head research comparing the efficacy and safety of vedolizumab to other biological therapies is required.



⊕⊕⊝⊝
Outcome occurred in 9% in the vedolizumab group compared with 9.2% in the placebo group.
*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).
Moderate certainty: we are moderately confident in the effect estimate: the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.
Low certainty: our confidence in the effect estimate is limited: the true effect may be substantially different from the estimate of the effect.
Very low certainty: we have very little confidence in the effect estimate: the true effect is likely to be substantially different from the estimate of effect.
a Not downgraded for study limitations even though there were some risk of bias domains that were unclear. Overall these were not considered serious.
b Downgraded once for imprecision due to a moderately narrow confidence interval.
c Downgraded twice for imprecision due to a wide confidence interval.
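The footnote on how the intervention-group risk is derived can be illustrated with a short calculation. The following is a minimal sketch only: the function name and the example numbers are hypothetical and are not taken from the included trials.

```python
def absolute_effect_per_1000(assumed_risk, rr, rr_low, rr_high):
    """Absolute difference per 1000 people versus the comparison group.

    assumed_risk: risk in the comparison (placebo) group, as a proportion.
    rr, rr_low, rr_high: risk ratio and the bounds of its 95% CI.
    Returns (point estimate, lower bound, upper bound), each rounded to
    whole people per 1000; negative values mean fewer events.
    """
    def difference(ratio):
        # Intervention risk = assumed risk * RR, so the difference is
        # assumed risk * (RR - 1).
        return round(1000 * assumed_risk * (ratio - 1))

    return difference(rr), difference(rr_low), difference(rr_high)


# Hypothetical example: 10% comparison-group risk, RR 1.5 (95% CI 1.2 to 2.0).
print(absolute_effect_per_1000(0.10, 1.5, 1.2, 2.0))
```

The same relationship explains why a risk ratio below 1 is reported as "fewer per 1000" and a risk ratio above 1 as "more per 1000".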

⊕⊕⊝⊝
Outcome occurred in 13.1% in the vedolizumab group compared with 13.7% in the placebo group.
*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

Library
Trusted evidence. Informed decisions. Better health.
Cochrane Database of Systematic Reviews

Description of the condition
Crohn's disease (CD) is a chronic inflammatory bowel disease (IBD) characterised by transmural inflammation of the gastrointestinal tract. Its pathophysiology is thought to involve a complex interplay between genetic susceptibility, immune and environmental factors (Boyapati 2015). Worldwide, the incidence of CD is increasing, with the highest incidence in westernised nations (Molodecky 2012).
Symptoms depend on the area of bowel involved but often include diarrhoea, abdominal pain, gastrointestinal bleeding and weight loss. Complications may further arise with the development of stricturing or fistulising disease. Conventional therapy is with corticosteroids followed by an immunomodulator (methotrexate, azathioprine, 6-mercaptopurine) or a tumour necrosis factor (TNF) inhibitor (infliximab, adalimumab, certolizumab). However, this approach results in up to 55% to 60% of people failing to achieve remission at one year following diagnosis (D'Haens 2008). Despite the advent of TNF inhibitors, many people have either primary nonresponse or secondary loss of response and, as such, new therapies with different mechanisms have been advanced.

Description of the intervention
Vedolizumab is a humanised monoclonal antibody which inhibits the α4β7 integrin. Integrins are adhesion molecules which allow for lymphocyte trafficking, an important process in T-cell-mediated inflammation. The value of inhibiting α4 integrins was recognised in the form of natalizumab, for both the treatment of multiple sclerosis and CD (von Andrian 2003). However, the contemporary use of natalizumab for CD has been limited by the risk of progressive multifocal leukoencephalopathy (Bloomgren 2012). Vedolizumab specifically inhibits the α4β7 integrin from binding to MAdCAM-1, a molecule selectively expressed in the gastrointestinal tract (Butcher 1996). This more selective mechanism of action should theoretically reduce the likelihood of progressive multifocal leukoencephalopathy.

How the intervention might work
The value of α4β7 as a target in the treatment of IBDs was initially demonstrated in animal colitis models (Hesterberg 1996). Furthermore, a previous Cochrane Review suggested vedolizumab is effective in inducing and maintaining remission in moderate-to-severe ulcerative colitis (UC) (Bickston 2014).

Why it is important to do this review
This review aims to highlight the efficacy and risks of vedolizumab in CD compared to placebo. A prior systematic review in 2014 concluded that vedolizumab was more effective than placebo as an induction and maintenance therapy for IBD, which includes CD and UC (Wang 2014). Similarly, another systematic review and meta-analysis in 2014 concluded that vedolizumab was superior to placebo for inducing remission and response in UC (Bickston 2014).
Moćko 2016 previously published a systematic review and meta-analysis for the effectiveness and safety of vedolizumab in Crohn's disease and identified two studies for quantitative analysis. This analysis was limited to outcomes within the induction phase of vedolizumab treatment (induction of remission, clinical response and safety).
This is an up-to-date review and analysis for outcomes in both the induction and maintenance phases of vedolizumab in CD.

OBJECTIVES
To evaluate the benefits and harms of vedolizumab versus placebo for the induction and maintenance of remission in people with Crohn's disease.

METHODS
Criteria for considering studies for this review

Types of studies
We included randomised controlled trials (RCTs) and quasi-RCTs (where treatment allocations were determined by non-randomised methods). This included published conference abstracts.

Types of participants
We included adults, children, or both with CD (defined by clinical, histological or endoscopic criteria) in this review. We also considered studies where only a subset of participants met the inclusion criteria.

Types of interventions
We included studies comparing vedolizumab to placebo at any dose, frequency and route of administration (subcutaneous and intravenous (IV)).

Types of outcome measures
We analysed studies which looked at induction of remission (induction studies) separately to studies where the primary outcome was to maintain remission (maintenance studies).

Primary outcomes
For induction studies:
• proportion of people who achieved clinical remission
For maintenance studies:
• proportion of people who maintained clinical remission

Secondary outcomes
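For induction studies, rates of:
• clinical response
• adverse events
• serious adverse events
• surgery
• endoscopic remission
• endoscopic response
For maintenance studies, rates of:
• adverse events
• serious adverse events
• surgery
• endoscopic remission
• endoscopic response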

Timing of outcome measurement
For induction studies, outcomes were measured after induction vedolizumab and prior to commencement of maintenance therapy, as defined by the authors.
For maintenance studies, outcomes were measured at or after 52 weeks where available.

Search methods for identification of studies
The search strategies are reported in Appendix 1.

Electronic searches
We searched the following electronic databases: •

Searching other resources
We conducted further searches based on reference lists of identified studies. We searched key terms in major conference abstracts (Digestive Diseases Week, United European Gastroenterology Week, European Crohn's and Colitis Organisation Congress) between December 2008 and November 2022 to identify additional unpublished trials.
We restricted the search to 2008 onwards as the first industry sponsored phase 2 trial in CD was published in this year (Feagan 2008).

Selection of studies
Two review authors (SH and AK) independently screened titles and abstracts from our literature search for relevance based on our inclusion criteria. We retrieved and reviewed the full text of potentially relevant publications. We resolved any disagreements between review authors regarding inclusion criteria through discussion with a third review author (RB).

Data extraction and management
We extracted the following data onto a piloted data collection form.

Assessment of risk of bias in included studies
Two review authors (SH and AK) independently assessed the methodological quality of each study using the Cochrane RoB 1 tool (Higgins 2017). We judged the following factors at high, low or unclear risk of bias:
• sequence generation (i.e. was the allocation sequence adequately generated?);
• allocation sequence concealment (i.e. was allocation adequately concealed?);
• blinding (i.e. was knowledge of the allocated intervention adequately prevented during the study?);
• incomplete outcome data (i.e. were incomplete outcome data adequately addressed?);
• selective outcome reporting (i.e. were reports of the study free of suggestion of selective outcome reporting?);
• other potential sources of bias (i.e. was the study apparently free of other problems that could have put it at high risk of bias?).

Measures of treatment effect
We analysed data using Review Manager Web on an intention-to-treat basis (RevMan Web 2020). Primary and secondary outcomes were all dichotomous, and we expressed results as risk ratios (RR) with corresponding 95% confidence intervals (CI). The investigators of included studies set the definitions for clinical and endoscopic remission.
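For readers less familiar with the effect measure, the sketch below shows how a risk ratio and its 95% CI can be computed for a single trial using the standard large-sample formula on the log scale. The event counts are hypothetical, the function name is illustrative, and zero-event cells (which would need a continuity correction) are not handled.

```python
import math


def risk_ratio(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Risk ratio with a Wald-type 95% CI computed on the log scale.

    Assumes every cell has at least one event; no continuity correction.
    """
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl
    )
    low = math.exp(math.log(rr) - z * se_log_rr)
    high = math.exp(math.log(rr) + z * se_log_rr)
    return rr, low, high


# Hypothetical trial: 30/100 in remission on treatment, 20/100 on placebo.
rr, low, high = risk_ratio(30, 100, 20, 100)
print(f"RR {rr:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

A CI that includes 1, as in this hypothetical example, corresponds to an effect that crosses the line of no effect.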

Unit of analysis issues
When studies reported multiple observations for the same outcome, we combined the outcomes for fixed intervals of follow-up (e.g. clinical remission at eight weeks). We included cross-over trials if data were available from the first phase of the study (i.e. before any cross-over). Where studies allocated participants to more than one treatment arm, then we pooled these arms for the primary analysis. Although some studies may have reported more than one efficacy or safety event per participant, the primary analysis considered the proportion of participants who experienced at least one event.
Cluster-randomised trials were eligible for inclusion in this review. If any cluster-randomised trials were identified, we intended to adjust for clustering using an estimate of the intraclass correlation coefficient as described in the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2017).
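Pooling several active arms of one trial into a single comparison group amounts to summing events and participants across the arms. The sketch below assumes this simple summation; the function name and the counts are hypothetical.

```python
def pool_arms(arms):
    """Combine several treatment arms into one group.

    arms: list of (participants_with_at_least_one_event, participants) pairs,
    one pair per arm. Each participant is counted once, regardless of how
    many events they experienced.
    """
    events = sum(arm_events for arm_events, _ in arms)
    total = sum(arm_total for _, arm_total in arms)
    return events, total


# Hypothetical two-dose trial: 10/50 in one dose arm, 15/60 in the other.
print(pool_arms([(10, 50), (15, 60)]))
```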

Dealing with missing data
For studies with missing or unclear data, we contacted the study authors by e-mail. We counted dichotomous data that remained missing or unclear as a treatment failure, in line with the intention-to-treat principle. For missing continuous data, we planned to conduct an available-case analysis. Where appropriate, we conducted sensitivity analyses to assess the impact of including unclear data on the effect estimate.

We contacted the study authors by e-mail to follow-up other missing information, such as study design and standard deviations.
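Counting missing dichotomous outcomes as treatment failures means that the denominator stays at the number randomised while only observed successes enter the numerator. A minimal sketch, assuming each participant's outcome is recorded as True, False or None (missing); the representation and the function name are hypothetical.

```python
def itt_counts(outcomes):
    """Return (successes, randomised) with missing outcomes as failures.

    outcomes: one entry per randomised participant, where True means the
    outcome was achieved, False means it was not, and None means missing
    or unclear.
    """
    randomised = len(outcomes)
    # Only explicit True values count as successes; None is a failure.
    successes = sum(1 for outcome in outcomes if outcome is True)
    return successes, randomised


# Hypothetical arm of five participants, two with missing data.
print(itt_counts([True, False, None, True, None]))
```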

Assessment of heterogeneity
We planned to assess for heterogeneity first by visual inspection of forest plots. We observed the presence of statistical heterogeneity based on the Chi² test (with a P value of 0.10 considered significant). We then aimed to quantify statistical heterogeneity using the I² statistic, in line with the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2017). We based our interpretation of the I² statistic on:
• 0% to 40%: might not be important;
• 30% to 60%: may represent moderate heterogeneity;
• 50% to 90%: may represent substantial heterogeneity;
• 75% to 100%: considerable heterogeneity.
In situations of moderate-to-considerable heterogeneity, we aimed to exclude visually obvious outliers if there were methodological or clinical factors present in those studies to explain the heterogeneity.
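The relationship between the Chi² (Cochran's Q) statistic and I² can be sketched as follows, assuming inverse-variance weights on the log risk ratio scale. The inputs are hypothetical, and the P value for Q is omitted because it needs a chi-squared distribution function.

```python
def heterogeneity(log_effects, std_errors):
    """Cochran's Q, its degrees of freedom and I² (as a percentage)."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, log_effects)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_effects))
    df = len(log_effects) - 1
    # I² is the share of Q in excess of its expectation (df), floored at 0.
    i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i_squared


# Hypothetical pair of studies with clearly discordant effects.
print(heterogeneity([0.0, 1.0], [0.1, 0.1]))
```

Because the interpretation bands listed above overlap, an I² value such as 35% can fall into two bands, and the judgement also depends on the size and direction of the effects.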

Assessment of reporting biases
We evaluated potential reporting bias by comparing outcomes listed in protocols to published manuscripts. If the protocols were unavailable, we compared outcomes listed in the methods section of published manuscripts to those described in the results section. If there were a sufficient number of studies included (i.e. more than 10) in the pooled analyses, we planned to investigate potential publication bias using funnel plots.

Data synthesis
We combined data for meta-analysis when we determined, by consensus, that participant groups, interventions and outcomes were sufficiently similar. For binary outcomes, we calculated the pooled RR and 95% CIs. We used a random-effects model to pool studies.
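The text does not state which random-effects estimator was applied. As an illustration only, the sketch below assumes the DerSimonian and Laird method with inverse-variance weights on the log RR scale; the function name and the inputs are hypothetical.

```python
import math


def dersimonian_laird(log_rrs, std_errors, z=1.96):
    """Pooled RR, 95% CI and tau² under a DerSimonian-Laird model (assumed)."""
    w = [1 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_rrs)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rrs))
    df = len(log_rrs) - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau² to each within-study variance.
    w_re = [1 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_rrs)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return (
        math.exp(pooled),
        math.exp(pooled - z * se_pooled),
        math.exp(pooled + z * se_pooled),
        tau2,
    )


# Hypothetical log risk ratios and standard errors from three trials.
print(dersimonian_laird([0.5, 0.3, 0.6], [0.20, 0.25, 0.30]))
```

When tau² is estimated as zero, the random-effects and fixed-effect results coincide, which is relevant to the sensitivity analysis comparing the two models.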

Subgroup analysis and investigation of heterogeneity
We performed subgroup analysis for the primary outcomes according to prior TNF inhibitor failure, compared to those who had not previously failed TNF inhibitors.

Sensitivity analysis
Sensitivity analyses examined the impact of the following variables on the pooled effect.
• Random-effects versus fixed-effect model
• Only including studies at low risk of bias across all domains (selection, performance, detection, attrition and reporting bias)
• Loss to follow-up (greater than 10% versus less than 10%)

Summary of findings and assessment of the certainty of the evidence
We assessed the certainty of the total body of evidence using the GRADE criteria (Schünemann 2011). Evidence from RCTs was considered high certainty and was downgraded according to:
• study limitations (risk of bias);
• indirectness;
• inconsistency (unexplained heterogeneity);
• imprecision:
  • for the optimum information size calculation, we used a previously published resource that offers data on appropriate sample sizing for trials in this field (Gordon 2021);
  • for effects that crossed the line of no effect, we used the size of CIs to judge for imprecision. As there is no existing published resource in the field to judge imprecision based on CI sizes, we determined the following ranges following discussion within our review team:
    ▪ for serious adverse events, we defined a narrow CI as within ± 10 per 1000 events, a moderately narrow CI as within ± 20 per 1000 events and anything greater than ± 20 per 1000 events as a wide CI;
    ▪ for overall adverse events, we defined a narrow CI as within ± 30 per 1000 events, a moderately narrow CI as within ± 50 per 1000 events and anything greater than ± 50 per 1000 events as a wide CI;
• publication bias.
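The CI ranges used to judge imprecision for adverse-event outcomes can be written as a small lookup. This is a sketch under a stated assumption: "within ± X per 1000" is read here as both bounds of the absolute-difference CI lying within X per 1000 of zero, which is an interpretation and is not confirmed by the text. The names are hypothetical.

```python
# Thresholds per 1000 people: (narrow limit, moderately narrow limit).
CI_THRESHOLDS = {
    "serious_adverse_events": (10, 20),
    "overall_adverse_events": (30, 50),
}


def classify_ci(outcome, low_per_1000, high_per_1000):
    """Label an absolute-difference CI as narrow, moderately narrow or wide.

    Assumes the CI is judged by its bound furthest from zero.
    """
    narrow_limit, moderate_limit = CI_THRESHOLDS[outcome]
    extent = max(abs(low_per_1000), abs(high_per_1000))
    if extent <= narrow_limit:
        return "narrow"
    if extent <= moderate_limit:
        return "moderately narrow"
    return "wide"


# Hypothetical CI of 35 fewer to 30 more serious adverse events per 1000.
print(classify_ci("serious_adverse_events", -35, 30))
```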
We classified the overall certainty of the evidence for each outcome as: high certainty (i.e. further research is very unlikely to change our confidence in the estimate of effect); moderate certainty (i.e. further research is likely to have an important impact on our confidence in the estimate of effect and may change the estimate); low certainty (i.e. further research is very likely to have an important impact on our confidence in the estimate of effect and is likely to change the estimate); or very low certainty (i.e. we are very uncertain about the estimate).
We used GRADEpro GDT to produce the summary of findings tables. The tables included the following key outcomes.

Description of studies
The results of the search are presented in the PRISMA flow diagram (Figure 1). Study characteristics are included in Table 1 and Table 2. There are no studies awaiting classification or ongoing.

Included studies
A summary of key characteristics across the included studies is shown in Table 1, Table 3, Table 4 (induction studies) and Table 2, Table 5, Table 6 (maintenance studies), and the Characteristics of included studies table.

Study design
This review included five RCTs. Several trials had separate induction and maintenance phases, where a second randomisation would occur at the maintenance phase amongst those who had responded to induction therapy. Therefore, we analysed these studies as separate induction and maintenance trials.

Induction studies
Four RCTs were induction studies (Feagan 2008; Sandborn 2013 -Induction Phase; Sands 2014; Watanabe 2020 -Induction Phase). All were multicentre studies; Sandborn 2013 -Induction Phase and Sands 2014 were conducted across multiple countries, while Feagan 2008 was based in Canada and Watanabe 2020 -Induction Phase was a Japanese cohort.

Maintenance studies
Three RCTs were maintenance trials (Sandborn 2013 -Maintenance Phase; Vermeire 2021; Watanabe 2020 -Maintenance Phase). Sandborn 2013 -Maintenance Phase and Watanabe 2020 -Maintenance Phase had an earlier randomised induction phase that then followed into a second randomisation for the maintenance phase. Vermeire 2021 only included a randomised maintenance phase with an open-label, non-placebo-controlled induction phase. This was a multicentre study conducted across multiple countries.

Participants
For the induction phase, the studies included 1025 participants with active CD. For the maintenance phase, the studies included 895 participants who had active CD and had then developed a clinical response to induction vedolizumab.

Induction studies
• Feagan 2008 compared IV vedolizumab (reported as MLN0002) 0.5 mg/kg and 2 mg/kg groups versus placebo, administered at days one and 29.
• Sandborn 2013 -Induction Phase compared IV vedolizumab 300 mg versus placebo, administered at weeks zero and two.
• Sands 2014 compared IV vedolizumab 300 mg versus placebo, administered at weeks zero, two and six.
• Watanabe 2020 -Induction Phase compared IV vedolizumab 300 mg versus placebo, given at weeks zero, two and six.

Maintenance studies
• Sandborn 2013 -Maintenance Phase compared IV vedolizumab 300 mg in an eight-weekly group, to a four-weekly group and a placebo group, amongst those who had responded in an induction phase, measured at week six.
• Vermeire 2021 compared subcutaneous vedolizumab 108 mg to a placebo group, administered two weekly, amongst those who had responded to an induction phase of IV vedolizumab measured at week six.
• Watanabe 2020 -Maintenance Phase compared IV vedolizumab 300 mg at an eight-weekly interval to placebo, amongst those who had responded to an induction phase of IV vedolizumab measured at week 10.

Control/comparisons
All studies (both induction and maintenance) were placebo controlled.

Induction studies
• Feagan 2008 and Watanabe 2020 -Induction Phase did not specify the type of IV placebo administered.
• Sands 2014 and Sandborn 2013 -Induction Phase used 250 mL of 0.9% sodium chloride for the placebo group.

Maintenance studies
• Sandborn 2013 -Maintenance Phase used 250 mL of 0.9% sodium chloride for the placebo group.
• Watanabe 2020 -Maintenance Phase used an unspecified IV placebo.
• Vermeire 2021 used an unspecified subcutaneous placebo.

Induction studies
• Feagan 2008 allowed participants to use concurrent mesalamine.
• Sandborn 2013 -Induction Phase allowed participants to use concomitant corticosteroids and immunosuppressive agents, but not recent biological agents, mesalamine or topical glucocorticoids.
• Sands 2014 allowed participants to use corticosteroids, immunosuppressives or mesalamine.
• Watanabe 2020 -Induction Phase allowed participants to use corticosteroids, immunosuppressives or mesalamine.

Maintenance studies
• Sandborn 2013 -Maintenance Phase allowed participants to use concomitant corticosteroids and immunosuppressive agents, but not recent biological agents, 5-aminosalicylic acid or topical glucocorticoids.
• Vermeire 2021 allowed participants to use concomitant corticosteroids, mesalamine or immunosuppressive agents.
• Watanabe 2020 -Maintenance Phase allowed participants to use corticosteroids, immunosuppressives or mesalamine.

Disease activity
All studies reported disease activity at the beginning of all induction and maintenance phases. All the induction studies required at least moderate disease activity (Crohn's Disease Activity Index (CDAI) of 220 or greater). All the maintenance studies required a clinical response to induction vedolizumab, defined by a CDAI reduction of 70 or greater.
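The two CDAI-based criteria described above can be expressed as simple checks. This sketch encodes only the thresholds stated in the text (a baseline CDAI of 220 or greater, and a reduction of 70 or greater); the function names are hypothetical, and individual trials applied further eligibility criteria not modelled here.

```python
def meets_induction_activity_threshold(baseline_cdai):
    """At least moderate disease activity: CDAI of 220 or greater."""
    return baseline_cdai >= 220


def has_clinical_response(baseline_cdai, post_induction_cdai):
    """Clinical response: CDAI reduction of 70 or greater from baseline."""
    return baseline_cdai - post_induction_cdai >= 70


# Hypothetical participant: baseline CDAI 300, CDAI 150 after induction.
print(meets_induction_activity_threshold(300), has_clinical_response(300, 150))
```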

Induction studies
In Feagan 2008, the baseline mean CDAI was 288 for the vedolizumab 0.5 mg/kg group, 296 for the vedolizumab 2 mg/kg group and 288 for the placebo group. In Sandborn 2013 -Induction Phase, the baseline mean CDAI was 327 in the vedolizumab group and 325 in the placebo group. In Sands 2014, the baseline mean CDAI was 314 in the vedolizumab group and 301 in the placebo group. In Watanabe 2020 -Induction Phase, the mean CDAI was 304 in the vedolizumab group and 295 in the placebo group.

Maintenance studies
All three maintenance studies were preceded by an induction arm. Participants who developed a clinical response (CDAI reduction of 70 or greater) to induction therapy were then randomised for the maintenance study. In Sandborn 2013 -Maintenance Phase, the mean CDAI following induction therapy was not reported. In Vermeire 2021, the median CDAI at week six was 150.5 in the subcutaneous vedolizumab group and 147.5 in the placebo group. In Watanabe 2020 -Maintenance Phase, the mean CDAI at week 10 was 147.9 in the vedolizumab group and 149.7 in the placebo group.

Disease duration
All studies reported disease duration, which ranged between a mean of 7.5 years and 9.6 years.

Induction studies
In Feagan 2008, the mean disease duration was 8.8 years for the vedolizumab 0.5 mg/kg group, 8.0 years for the vedolizumab 2 mg/kg group and 9.1 years for the placebo group. Sandborn 2013 -Induction Phase reported a mean disease duration of 9.2 years for the vedolizumab group and 8.2 years for the placebo group. Sands 2014 reported a mean disease duration of 8.4 years for the vedolizumab group and 8.0 years for the placebo group. In Watanabe 2020 -Induction Phase, the mean disease duration was 9.0 years in the vedolizumab group and 9.1 years in the placebo group.

Maintenance studies
Sandborn 2013 -Maintenance Phase reported a mean disease duration of 8.4 years in the eight-weekly vedolizumab group, 7.7 years in the four-weekly vedolizumab group and 9.6 years in the placebo group. Vermeire 2021 reported a mean disease duration of 9.5 years for the vedolizumab group and 8.2 years for the placebo group. Watanabe 2020 -Maintenance Phase reported a mean disease duration of 9.0 years in the vedolizumab group and 7.5 years in the placebo group.

Extent of disease
All studies except Feagan 2008 reported extent of disease. Ileocolonic disease was the most common disease distribution amongst all reported induction and maintenance studies.

Induction studies
Feagan 2008 did not report disease extent or distribution. Sandborn 2013 -Induction Phase reported ileal-only disease in 16.8% of the vedolizumab group and 14.2% of the placebo group; colon-only disease in 28.2% of the vedolizumab group and 29.1% of the placebo group; and ileocolonic disease in 55% of the vedolizumab group and in 56.8% of the placebo group. Sands 2014 reported ileal-only disease in 16% of the vedolizumab group and 14% of the control group; colon-only disease in 23% of the vedolizumab group and 25% of the placebo group; and ileocolonic disease in 61% of both groups. Watanabe 2020 -Induction Phase reported ileal-only disease in 16.5% of the vedolizumab group and 11.5% of the placebo group; colon-only disease in 13.9% of the vedolizumab group and 24.4% of the placebo group; and ileocolonic disease in 69.6% of the vedolizumab group and 64.1% of the placebo group.

Maintenance studies
Sandborn 2013 -Maintenance Phase reported ileal-only disease in 19% of the eight-weekly vedolizumab group, 22% of the four-weekly vedolizumab group and 12% of the placebo group; colonic disease in 18% of the eight-weekly vedolizumab group, 31% of the four-weekly vedolizumab group and 28% of the placebo group; and ileocolonic disease in 64% of the eight-weekly vedolizumab group, 47% of the four-weekly vedolizumab group and 59% of the placebo group. Vermeire 2021 reported ileal-only disease in 24% of the vedolizumab group and 15.7% of the placebo group; colonic-only disease in 20% of the vedolizumab group and 19.4% of the placebo group; ileocolonic disease in 44.4% of the vedolizumab group and 55.2% of the placebo group; and "other" disease locations in 11.3% of the vedolizumab group and 9.7% of the placebo group. Watanabe 2020 -Maintenance Phase reported ileal disease in 16.7% of both groups; colonic disease in 41.7% of the vedolizumab group and 8.3% of the placebo group; and ileocolonic disease in 41.7% of the vedolizumab group and 75% of the placebo group.

Age
All studies reported mean or median participant age. In the induction studies, the mean ranged between 32.6 and 38.6 years. In the maintenance studies, the mean age ranged between 34.9 and 38.2 years.

Induction studies
All four induction studies were randomised. Two studies had sufficient information within the study or protocol about randomisation to judge them at low risk of bias (Sandborn 2013 -Induction Phase; Sands 2014). Two studies did not mention the randomisation method and so were at unclear risk (Feagan 2008; Watanabe 2020 -Induction Phase). We wrote to the study authors and received no clarification.
Three induction studies provided sufficient information about allocation concealment to characterise them as low risk (Sandborn 2013 -Induction Phase; Sands 2014; Watanabe 2020 -Induction Phase). We contacted the authors for Feagan 2008, but received no response (unclear risk).

Maintenance studies
All three maintenance studies were RCTs. Sandborn 2013 -Maintenance Phase outlined the randomisation process within the protocol so was at low risk. We contacted the authors for Vermeire 2021 who confirmed the randomisation schedule was generated using interactive response technology and so was at low risk. The specific randomisation process for Watanabe 2020 -Maintenance Phase was unclear so was at unclear risk.
Allocation concealment was low risk for Sandborn 2013 -Maintenance Phase and Watanabe 2020 -Maintenance Phase where investigators were blinded to the allocations, except for unblinded pharmacists at treating sites. We contacted the authors for Vermeire 2021 who confirmed that the personnel from the vendor who had access to the randomisation schedule were not involved in the study conduct or data analysis. This was judged at low risk.

Blinding
All studies (induction and maintenance) were described as double-blind.

Induction studies
All four induction studies were placebo controlled and at low risk of bias for blinding of participants and personnel (Feagan 2008; Sandborn 2013 -Induction Phase; Sands 2014; Watanabe 2020 -Induction Phase).
Sandborn 2013 -Induction Phase was at low risk of bias for blinding for outcome assessment. The authors highlighted in their protocol that the study sponsors were unblinded and analysed the data only after completion of the induction phase. We contacted the authors for Sands 2014, who confirmed that all outcome assessors were blinded to the treatment assignment and so this was at low risk. Feagan 2008 and Watanabe 2020 -Induction Phase contained insufficient information to determine risk of bias for blinding of outcome assessment. We contacted the authors but received no information (unclear risk).

Maintenance studies
All three maintenance studies were placebo controlled and considered low risk of bias for blinding of participants and personnel (Sandborn 2013 -Maintenance Phase; Vermeire 2021; Watanabe 2020 -Maintenance Phase). Sandborn 2013 -Maintenance Phase had two treatment arms (four-weekly and eight-weekly vedolizumab) and a placebo arm. To preserve the blinding, all participants were administered four-weekly study drug or placebo.
Only Sandborn 2013 -Maintenance Phase was at low risk of bias for blinding of outcome assessment. Procedures were in place to preserve the blinding until the completion of the maintenance phase. The other two studies contained insufficient information to determine risk of bias for blinding of outcome assessment (Vermeire 2021; Watanabe 2020 -Maintenance Phase). We contacted the authors but received no information (unclear risk).

Incomplete outcome data
All four induction studies and all three maintenance studies were at low risk of attrition bias. They all had low discontinuation rates, and they were balanced between groups.

Selective reporting
All four induction studies and all three maintenance studies were at low risk of selective reporting. All results were reported as outlined in their methods sections. Whilst all studies reported a pretrial protocol, only Sandborn 2013 -Induction Phase and Sandborn 2013 -Maintenance Phase had a published protocol that we could access. These studies' results matched their registered outcomes, except that one outcome in the Sandborn 2013 -Induction Phase protocol (CDAI-100 response) was changed from a secondary endpoint to a primary endpoint.

Other potential sources of bias
All studies had sufficient information on baseline characteristics between groups and were at low risk of other biases.

Effects of interventions
See: Summary of findings 1 Vedolizumab compared to placebo for induction of remission in Crohn's disease; Summary of findings 2 Vedolizumab compared to placebo for maintenance of remission in Crohn's disease.
See Summary of findings 1 for the results of the induction studies.

Induction studies
Four studies compared vedolizumab to placebo for the induction of remission in CD (Feagan 2008; Sandborn 2013 -Induction Phase; Sands 2014; Watanabe 2020 -Induction Phase). The induction phase consisted of either two or three doses of IV vedolizumab and outcomes were measured between weeks six and 10.
We conducted a subgroup analysis for the primary outcome, comparing participants who had previously failed TNF inhibitor therapy with those who had not. In this analysis, the test for subgroup differences showed no evidence of a difference between the subgroups (P = 0.21). In those who had previously failed TNF inhibitor therapy (613 participants), there was no evidence of a difference in induction of clinical remission between groups (12% with vedolizumab versus 10% with placebo; RR 1.21, 95% CI 0.65 to 2.25; Analysis 1.1), although this result was affected by imprecision and some inconsistency (I² = 27%). In those who had not failed anti-TNF alpha therapy (513 participants), induction of clinical remission may be more likely in the vedolizumab group than in the placebo group (28% with vedolizumab versus 14% with placebo; RR 1.94, 95% CI 1.32 to 2.84; Analysis 1.1). However, as the test for subgroup differences showed no evidence of a difference, we cannot be certain whether there is any true difference between subgroups.
When we used a fixed-effect method of analysis, our conclusions remained the same. All studies were at low risk of bias and had a less than 10% loss to follow-up, so these prespecified sensitivity analyses were not performed.
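The test for subgroup differences can be approximated from the two reported subgroup estimates alone. The sketch below is illustrative only: it recovers standard errors from the rounded 95% CIs using a normal approximation (z = 1.96), and the function names are our own, not part of the review's analysis software. Because it starts from rounded figures, it should land close to, but not exactly on, the reported P value of 0.21.

```python
import math


def log_rr_and_se(rr, ci_low, ci_high, z=1.96):
    """Return (log RR, standard error) recovered from a reported 95% CI."""
    log_rr = math.log(rr)
    # On the log scale the CI is symmetric, so its width is 2 * z * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return log_rr, se


def subgroup_difference(rr_a, ci_a, rr_b, ci_b):
    """Chi-square test (1 degree of freedom) comparing two subgroup estimates."""
    log_a, se_a = log_rr_and_se(rr_a, *ci_a)
    log_b, se_b = log_rr_and_se(rr_b, *ci_b)
    q = (log_a - log_b) ** 2 / (se_a ** 2 + se_b ** 2)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(q / 2)).
    p = math.erfc(math.sqrt(q / 2))
    return q, p


# Reported subgroup results: prior TNF inhibitor failure vs no prior failure.
q, p = subgroup_difference(1.21, (0.65, 2.25), 1.94, (1.32, 2.84))
print(f"Q = {q:.2f}, P = {p:.2f}")
```

A P value of about 0.2 is consistent with the conclusion above: the point estimates differ, but the data do not exclude chance as the explanation.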

Induction of clinical response (CDAI-100 response)
Four induction studies recorded clinical response. When we used a fixed-effect method of analysis our conclusions remained the same. We did not perform other preplanned sensitivity analyses.
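The two efficacy endpoints are thresholds on the Crohn's Disease Activity Index (CDAI). As a minimal sketch, assuming clinical remission means a CDAI score of 150 or less (the definition given in the characteristics of included studies) and a CDAI-100 response means a fall of at least 100 points from baseline (the conventional definition), the classification could be written as follows. The function names and the participant scores are hypothetical.

```python
def clinical_remission(cdai_score):
    """Clinical remission: CDAI score of 150 or less."""
    return cdai_score <= 150


def cdai_100_response(baseline_score, current_score):
    """CDAI-100 response: fall of at least 100 points from baseline."""
    return baseline_score - current_score >= 100


# Hypothetical participant: baseline CDAI 310, week 6 CDAI 190.
baseline, week_6 = 310, 190
print("Clinical response:", cdai_100_response(baseline, week_6))
print("Clinical remission:", clinical_remission(week_6))
```

The hypothetical participant illustrates why response rates exceed remission rates: a 120-point fall counts as a response even though the score remains above 150.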

Adverse events
All four studies recorded the proportion of participants who developed any adverse event.
For the development of one or more adverse event (treatment or non-treatment related) during induction therapy, there was no evidence of a difference between groups (64.1% with vedolizumab versus 61.9% with placebo; RR 1.01, 95% CI 0.93 to 1.11; 1126 participants; Analysis 1.3).
When we used a fixed-effect method of analysis our conclusions remained the same. We did not perform other preplanned sensitivity analyses.

Serious adverse events
All four studies recorded the proportion of participants who developed any serious adverse event.
For the development of one or more serious adverse event during induction therapy, there was no evidence of a difference between groups (9.0% with vedolizumab versus 9.2% with placebo; RR 0.91, 95% CI 0.62 to 1.33; 1126 participants; Analysis 1.4).
When we used a fixed-effect method of analysis our conclusions remained the same. We did not perform other preplanned sensitivity analyses.
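Confidence intervals for risk ratios are calculated on the log scale, so the point estimate should sit at the geometric mean of the two bounds. The sketch below uses that property as a quick plausibility check on the reported induction safety results. It is a reader's sanity check rather than part of the review's methods, and the tolerance of 0.02 is an arbitrary allowance for rounding to two decimal places.

```python
import math


def implied_point_estimate(ci_low, ci_high):
    """RR implied by a CI that is symmetric on the log scale (geometric mean)."""
    return math.sqrt(ci_low * ci_high)


def is_consistent(rr, ci_low, ci_high, tol=0.02):
    """True if the reported RR matches the CI midpoint within rounding error."""
    return abs(implied_point_estimate(ci_low, ci_high) - rr) <= tol


# Reported induction results: (RR, lower 95% bound, upper 95% bound).
reported = {
    "any adverse event": (1.01, 0.93, 1.11),
    "serious adverse event": (0.91, 0.62, 1.33),
}
for outcome, (rr, low, high) in reported.items():
    print(outcome, round(implied_point_estimate(low, high), 3), is_consistent(rr, low, high))
```

The same property explains why the pooled RR for serious adverse events (0.91) need not equal the crude ratio of the two percentages: the pooled estimate weights each study separately.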

Surgery
No studies reported the proportion of participants requiring surgery during the induction phase.

Endoscopic remission
No studies reported endoscopic remission during the induction phase.

Endoscopic response
No studies reported endoscopic response during the induction phase.

Maintenance studies
See Summary of findings 2 for results of the maintenance studies.
Three studies compared vedolizumab to placebo for the maintenance of remission in CD (Sandborn 2013 -Maintenance Phase; Vermeire 2021; Watanabe 2020 -Maintenance Phase). Vermeire 2021 used subcutaneous vedolizumab whilst Sandborn 2013 -Maintenance Phase and Watanabe 2020 -Maintenance Phase used IV vedolizumab. Outcomes were recorded between weeks 52 and 60.

Maintenance of clinical remission
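In this pooled analysis of three studies (894 participants), vedolizumab was superior to placebo for the maintenance of clinical remission (RR 1.52, 95% CI 1.24 to 1.87; NNTB 7; high-certainty evidence).
When we used a fixed-effect method of analysis our conclusions remained the same. All studies were at low risk of bias and had a less than 10% loss to follow-up, so these prespecified sensitivity analyses were not performed.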

Adverse events
All maintenance studies recorded the proportion of participants who developed any adverse event.
For the development of one or more adverse event (treatment or non-treatment related) during maintenance therapy, there was no evidence of a difference between groups (RR 1.00, 95% CI 0.94 to 1.07; 3 studies).
When we used a fixed-effect method of analysis our conclusions remained the same. We did not perform other preplanned sensitivity analyses.

Serious adverse events
All maintenance studies recorded the proportion of participants who developed serious adverse events.
For the development of one or more serious adverse event during maintenance therapy, there was no evidence of a difference between groups (13.1% with vedolizumab versus 13.7% with placebo; RR 0.98, 95% CI 0.68 to 1.39; Analysis 2.3).
When we used a fixed-effect method of analysis our conclusions remained the same. We did not perform other preplanned sensitivity analyses.

Surgery
No studies reported the proportion of participants requiring surgery during the maintenance phase.

Endoscopic remission
No studies reported endoscopic remission during the maintenance phase.

Endoscopic response
No studies reported endoscopic response during the maintenance phase.

Summary of main results
Four induction RCTs enrolling 1126 participants and three maintenance RCTs enrolling 894 participants met the criteria for inclusion in this review.

Induction phase
• The evidence is very certain that vedolizumab is superior to placebo in inducing clinical remission in CD.
• The evidence is very certain that vedolizumab is superior to placebo in inducing a clinical response (CDAI-100 response).
• There was no evidence of a difference in overall adverse events between vedolizumab and placebo during induction therapy, but the evidence was of moderate certainty due to moderately narrow CIs.
• There was no evidence of a difference in serious adverse events between vedolizumab and placebo during induction therapy, but the evidence was of low certainty due to imprecision from wide CIs.
• No induction studies reported the rate of endoscopic remission, endoscopic response or surgery.

Maintenance phase
• The evidence is very certain that vedolizumab is superior to placebo in maintaining clinical remission in CD.
• There was no evidence of a difference in overall adverse events between vedolizumab and placebo during maintenance therapy, but the evidence was of moderate certainty due to moderately narrow CIs.
• There was no evidence of a difference in serious adverse events between vedolizumab and placebo during maintenance therapy, but the evidence was of low certainty due to imprecision from sparse events.
• No maintenance studies reported the rate of endoscopic remission, endoscopic response or surgery.

Overall completeness and applicability of evidence
We used a comprehensive peer-reviewed search strategy at the protocol stage to minimise the likelihood of missing eligible reports (Hui 2020). We are unaware of any unpublished data related to the study question, although there is always the potential that randomised data within the grey literature have been missed.
The overall results were mostly relevant to the study question in our protocol. Within our protocol for induction studies, induction of endoscopic remission and the need for surgery were secondary outcomes that none of the included trials reported. For our maintenance studies, rate of endoscopic relapse and surgery were secondary outcomes that none of the included trials reported. The lack of endoscopic relapse assessment does somewhat limit the applicability of the evidence, particularly given the increasing recognition of mucosal healing as a target to achieve long-term outcomes in the management of CD (De Cruz 2013; Shah 2016). Despite this, all identified induction and maintenance studies contributed to the primary outcome and the main purpose of the systematic review and meta-analysis was met.
For both the induction and maintenance studies, there was variation in the route and dosing of vedolizumab. In contemporary clinical practice, induction dosing consists of IV vedolizumab 300 mg at weeks zero, two and six. For the induction studies, Feagan 2008 administered doses which would be considered subtherapeutic. Sandborn 2013 -Induction Phase assessed outcomes only following doses at weeks zero and two. Within the maintenance studies, Vermeire 2021 demonstrated subcutaneous vedolizumab was superior to placebo for the maintenance of remission after IV induction. As the underlying mechanism of action is identical to the IV form, we considered it appropriate to include these data in the overall meta-analysis.
Our analysis highlights uncertainty as to whether there is a subgroup difference between those who had previously failed a TNF inhibitor and those who had not. For the induction of remission in the subgroup who had previously failed TNF inhibitor therapy, vedolizumab may not be superior to placebo, although this result was affected by imprecision and some inconsistency.
In the maintenance studies, vedolizumab may still be superior to placebo, regardless of previous TNF inhibitor failure. It may be that there is a subclass of people with CD who respond well to vedolizumab induction therapy despite prior TNF inhibitor failure and proceed to develop a sustained response. However, the identification of this subset of patients is not yet defined, and raises the wider challenges of precision medicine through the use of biomarkers in CD (Boyapati 2016).
The timing of outcome measurement for the induction studies is worth highlighting given that vedolizumab is frequently viewed as a biological with a comparatively slower onset of action in contemporary practice. This is reflected in expert consensus from the STRIDE-II initiative (Selecting Therapeutic Targets in Inflammatory Bowel Disease), which suggests a median time to clinical response of 11 weeks and time to clinical remission of 17 weeks (Turner 2021). By contrast, within the four induction studies included in this review, outcome measurement ranged between six and 10 weeks. Notably, Feagan 2008 and Sandborn 2013 -Induction Phase documented induction outcomes following two doses of vedolizumab while Sands 2014 and Watanabe 2020 -Induction Phase recorded outcomes after three doses. This may represent an additional explanation as to why vedolizumab may be effective at maintaining clinical remission within the TNF-inhibitor failed subgroup, whereas there was less certainty of its effect at inducing remission in this cohort. Furthermore, it was found that vedolizumab may be effective at inducing a clinical response within this TNF-inhibitor failed subgroup, again highlighting the potential that the assessment of clinical remission may have been conducted prematurely. However, STRIDE-II stresses that the recommendations for onset of action are guided by a rough estimate of experts' opinion due to a paucity of high-quality scientific evidence.

Quality of the evidence
The overall body of evidence allows a robust conclusion regarding the objective of this review. We included four induction studies (1126 participants) and three maintenance studies (894 participants). The methodological basis for these conclusions is sound. All trials were randomised, placebo controlled and described as double-blinded.
The RoB 1 tool suggested a low risk of bias across most domains for induction and maintenance studies. There were several areas of unknown risk of bias despite contacting the study authors for clarification. Nonetheless, these overall limitations were not serious and the evidence was not downgraded for risk of bias for any of the outcomes within the summary of findings tables.
We did not downgrade for inconsistency for any outcomes across both induction and maintenance studies. The I² value was low (0% to 15%) across all outcomes. We did not downgrade for indirectness for any outcomes for either induction or maintenance studies. Specifically, there was no major indirectness with regards to the population, intervention and outcome measurement.
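For readers unfamiliar with the statistic, I² expresses the share of variability in effect estimates that is attributable to heterogeneity rather than chance, and is derived from Cochran's Q and its degrees of freedom. The sketch below shows the standard formula. The Q values are hypothetical, chosen only to illustrate how values in the 0% to 15% range arise, and are not taken from the review's analyses.

```python
def i_squared(q, n_studies):
    """Higgins' I-squared (%) from Cochran's Q and the number of studies."""
    df = n_studies - 1
    if q <= 0 or df <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical Q values for a four-study analysis (df = 3).
for q in (2.1, 3.0, 3.5):
    print(f"Q = {q}: I-squared = {i_squared(q, 4):.1f}%")
```

Whenever Q is no larger than its degrees of freedom, I² is reported as 0%, which is why several outcomes can share that value despite differing study-level results.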
We downgraded twice for serious adverse events for both induction and maintenance studies due to wide CIs. We downgraded once for adverse events for both induction and maintenance studies due to moderately wide CIs.
We did not downgrade for publication bias for any outcomes. All included studies were RCTs and there were no observational data included in this review. There was an insufficient number of studies to construct a funnel plot.
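The certainty ratings described above follow simple arithmetic: evidence from RCTs starts at high certainty and drops one level for each downgrade across the five GRADE domains. The sketch below is a simplified illustration of that bookkeeping, using the downgrades reported in this section. The names are our own, and GRADE judgements in practice involve more than counting.

```python
LEVELS = ["very low", "low", "moderate", "high"]


def grade_certainty(downgrades, start="high"):
    """Certainty level after subtracting the total downgrades from the start."""
    total = sum(downgrades.values())
    return LEVELS[max(0, LEVELS.index(start) - total)]


NO_CONCERNS = {
    "risk of bias": 0,
    "inconsistency": 0,
    "indirectness": 0,
    "imprecision": 0,
    "publication bias": 0,
}

# Downgrades as reported in this section (same for induction and maintenance).
outcomes = {
    "clinical remission": dict(NO_CONCERNS),
    "adverse events": {**NO_CONCERNS, "imprecision": 1},
    "serious adverse events": {**NO_CONCERNS, "imprecision": 2},
}
for outcome, downgrades in outcomes.items():
    print(f"{outcome}: {grade_certainty(downgrades)}")
```

Under these assumptions the three outcomes come out as high, moderate and low certainty respectively, matching the ratings in the summary of findings tables.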

Potential biases in the review process
One area of potential bias was the changes introduced between the protocol and review stage. The major change was timing of outcome measurement. In the protocol, outcomes were to be measured at weeks six, 12 and 52 where available (Hui 2020). On subsequent review of the available trials, these were broadly divided into induction and maintenance studies with separate randomisation phases. For both induction and maintenance studies, there was some variation in timepoints for outcome measurement and so we removed the timepoints for induction study outcomes. Despite this, two of four studies representing the majority of participants were measured at week six, whilst Feagan 2008 reported results at week nine (day 57) and Watanabe 2020 -Induction Phase recorded outcomes at week 10. For maintenance studies, in the review, we determined that outcomes were most appropriate to be measured at or after 52 weeks rather than strictly at week 52. In our results for maintenance studies, two of three studies reported week 52 data while Watanabe 2020 -Maintenance Phase reported results at week 60. Overall, the variation in the timing of outcome measurement across both induction and maintenance studies was small and unlikely to impact the overall results significantly.
Another limitation of the review was the heterogeneity in intervention dosage and route (subcutaneous versus IV) between trials. Amongst the induction studies, Feagan 2008 and Sandborn 2013 -Induction Phase used vedolizumab dosing that would be considered subtherapeutic in contemporary practice. In the maintenance studies, Vermeire 2021 investigated subcutaneous vedolizumab.
Finally, we identified only one published protocol for the studies in our systematic review, and there were no major discrepancies between planned and reported outcomes. For the studies which did not have a published protocol, this introduced a theoretical risk of publication bias. However, within these trials, the interventions and outcomes described in the methods section were consistent with the results. Furthermore, the primary and secondary outcomes were consistent between trials, including the definition of induction and maintenance of remission, clinical response and adverse events.

Agreements and disagreements with other studies or reviews
This is the first Cochrane Review to investigate the efficacy and safety of vedolizumab in CD. Our findings for induction of remission were similar to a previously published systematic review (Moćko 2016), including the uncertainty of vedolizumab compared to placebo in TNF inhibitor-experienced participants for induction of clinical remission. The overall safety and efficacy of vedolizumab in inducing and maintaining remission in IBD (both CD and UC) was also investigated in a prior systematic review (Wang 2014). This review demonstrated vedolizumab was superior to placebo for the induction and maintenance of remission of IBD, including within a subgroup analysis of induction of remission of CD.
US guidelines support the use of vedolizumab in CD and particularly recommend its use for maintenance therapy where vedolizumab has successfully induced remission (Lichtenstein 2018). UK guidelines also support the use of vedolizumab in active CD, and specifically include patients where TNF inhibitors have previously failed (Lamb 2019). The basis of this latter recommendation is the Swedish Inflammatory Bowel Disease Registry (SWIBREG) (Eriksson 2017), which reported clinical remission of 54% at a median follow-up of 17 months amongst a cohort of people with active CD (86% of whom had previously failed TNF inhibitor therapy). These data were notably not placebo controlled or blinded. Even at six to 10 weeks within our meta-analysis, it should be noted that 10% of participants receiving
The GEMINI long-term safety study (Vermeire 2017) was an open-label extension study to GEMINI 2 which offered four-weekly vedolizumab as maintenance therapy. This study continued to report long-term clinical remission rates that were statistically similar between TNF inhibitor-failed and TNF inhibitor-naive people at week 152. This is largely consistent with the results of our review, where vedolizumab was probably superior to placebo in maintaining clinical remission in the subgroup who had previously failed TNF inhibitors.

Authors' conclusions
Implications for practice
There is high-certainty evidence that vedolizumab is effective at inducing and maintaining clinical remission in Crohn's disease.
There is low- to moderate-certainty evidence that there may be no increased risk of adverse events compared to placebo.
The certainty of the evidence is primarily impacted by imprecision, due to wide confidence intervals in the estimate of the effect size.

Implications for research
This review highlights that there is minimal need for further induction and maintenance studies to demonstrate the efficacy and safety of vedolizumab in Crohn's disease in mixed populations of people who have and have not had prior anti-tumour necrosis factor (TNF) exposure.
The findings suggest that future research should investigate vedolizumab in separate trials according to whether participants have experienced prior failure of TNF-inhibitor therapy, given the observed suggestion of a difference between these groups.
Furthermore, future research must also consider the efficacy and safety of vedolizumab in head-to-head trials with other biological therapies in Crohn's disease. Presently, the selection of biologicals in Crohn's disease is often based on clinical judgement as there remain very few head-to-head randomised controlled trials (RCTs) to inform clinical decisions. To our knowledge, the unpublished SEAVUE study (ustekinumab versus adalimumab) is the only RCT to date that has reported head-to-head results in Crohn's disease (Irving 2021).
Clear reporting of concurrent and prior therapies from other classes is also key, such as purine analogues and corticosteroids, as this informs wider future comparisons with other trials.
Finally, endoscopic remission was a secondary endpoint that did not reveal any results in our systematic review. Mucosal healing is gaining increased acceptance as an outcome of interest in the treatment of inflammatory bowel disease and future studies should consider this as an endpoint.
Key policymakers and stakeholders need to be involved in future studies to address the evidence gaps. This is especially important with biological medications in Crohn's disease given their significant cost to individuals and healthcare systems.

Acknowledgements
Cochrane Gut supported the authors in the development of this systematic review.
The following people conducted the editorial process for this review.

Study characteristics
Methods
Study design: 3-arm double-blind randomised trial (designed with 2 intervention arms and 1 placebo arm; we grouped the intervention arms for this analysis)

Study dates: February 2000 to June 2002
Setting: NR

Inclusion criteria
• Adults with endoscopic, histopathological or radiological documentation of CD of the ileum or colon (or both), and a CDAI score 220-400 at screening.
• Participants could receive concomitant treatment for CD with mesalamine or antibiotics provided they had been maintained on a stable dose for 2 weeks immediately before screening.

Exclusion criteria
• With an ostomy, an active fistula, or evidence of fixed obstruction
• Requiring ciclosporin or immunosuppressants within 3 months or investigational drugs within 30 days before screening
• Requiring systemic corticosteroids, heparin, non-steroidal anti-inflammatory drugs, tube feeding, defined formula diets or parenteral alimentation
• Previously treated with biological therapy for CD
• Markedly abnormal laboratory tests (haemoglobin < 10 g/dL; white blood cell count < 3 × 10⁹/L; platelet count < 100 × 10⁹/L; serum aspartate aminotransferase, alanine aminotransferase or alkaline phosphatase > 2.5 × ULN; serum creatinine > 1.5 × ULN; positive stool for enteric pathogens or proteinuria)
• Unable to comply with the protocol

| Bias | Authors' judgement | Support for judgement |
| --- | --- | --- |
| Random sequence generation (selection bias) | Unclear risk | Process of random sequence generation not described. Contacted study authors by e-mail but received no response. |
| Allocation concealment (selection bias) | Unclear risk | Study described as 'double-blind' but details on allocation concealment not described. Contacted study authors by e-mail (Dr Brian Feagan) but received no response. |
| Blinding of participants and personnel (performance bias) All outcomes | Low risk | Study was 'double-blind' and placebo-controlled. |
| Blinding of outcome assessment (detection bias) | Unclear risk | Study was 'double-blind' but details of blinding of outcome assessment were not described. Contacted study authors by e-mail but received no response. |

Measurement timepoints during study
• Study visits scheduled at weeks 0, 2, 4 and 6
• Adverse events, CDAI, neurological symptoms of PML by means of questionnaires, use of concomitant medications, and presence/absence of fistulae were evaluated at these visits
• Blood testing performed at baseline and "throughout the study"

Outcomes
Primary outcomes

Secondary outcomes
• Mean change in CRP from baseline to week 6
Notes
Funding source: Takeda Pharmaceuticals

| Bias | Authors' judgement | Support for judgement |
| --- | --- | --- |
| Random sequence generation (selection bias) | Low risk | Quote: "Randomization was computer-generated and was performed at a central location." |
| Allocation concealment (selection bias) | Low risk | Quote from protocol, "treatment assignments will be obtained through the interactive voice response system (IVRS) and for dose preparation according to the procedures outlined in the Study Manual. Information regarding the treatment assignments will be kept securely at Millennium per its standard operating procedures." Quote: "Randomization schedules will be generated by the Millennium Biostatistics Group and archived within the Biostatistics and Medical Writing Department of Millennium. Each patient who is qualified for treatment will be assigned a unique randomization number. The IVRS will provide treatment assignments based on these randomization numbers." |
| Blinding of participants and personnel (performance bias) All outcomes | Low risk | Trial was double blinded and placebo controlled. The protocol stated that all study site personnel except the investigational pharmacist or designee were blinded to the treatment assignments for the duration of the study. |
| Blinding of outcome assessment (detection bias) All outcomes | Low risk | Quote from protocol, "after the Induction Phase has been completed, select and pre-specified personnel at Millennium will become unblinded to patient-level data in order to conduct the analyses and reporting of the Induction Phase data. As these activities will occur while the Maintenance Phase is ongoing, proper procedures will be in place to protect the blind until completion of the Maintenance Phase." |
| Incomplete outcome data (attrition bias) All outcomes | Low risk | According to the flowchart of Supplementary Figure 1 (S1), attrition was balanced in all groups with adequate reasons provided for loss in numbers. |
| Selective reporting (reporting bias) | Low risk | Outcomes remained the same between published protocol and final trial. The only outcome change highlighted was CDAI-100 response changed from being a secondary to primary endpoint. |

Duration of study: 46 weeks
Measurement timepoints during study: study visits were conducted every 4 weeks during the maintenance trial. The primary and secondary outcomes were assessed at week 52 from time of induction (46 weeks into maintenance study)
Follow-up measurements after study end: those who had no unacceptable adverse events or did not require CD-related surgery were continued in the open-label GEMINI long-term safety trial.

Primary outcomes as defined by study authors
• Clinical remission at week 52 (in maintenance therapy trial)

Secondary outcomes as defined by study authors
• CDAI-100 response
• Glucocorticoid-free remission (defined as clinical remission at week 52 without glucocorticoid therapy)
• Durable clinical remission (defined as clinical remission at ≥ 80% of study visits, including the final visit) at week 52

| Bias | Authors' judgement | Support for judgement |
| --- | --- | --- |
| Random sequence generation (selection bias) | Low risk | Quote: "Randomization was computer-generated and was performed at a central location." |
| Allocation concealment (selection bias) | Low risk | Quote from protocol "treatment assignments will be obtained through the interactive voice response system (IVRS) and for dose preparation according to the procedures outlined in the Study Manual. Information regarding the treatment assignments will be kept securely at Millennium per its standard operating procedures." Quote: "Randomization schedules will be generated by the Millennium Biostatistics Group and archived within the Biostatistics and Medical Writing Department of Millennium. Each patient who is qualified for treatment will be assigned a unique randomization number. The IVRS will provide treatment assignments based on these randomization numbers." |
| Blinding of participants and personnel (performance bias) All outcomes | Low risk | Quote: "all patients and all study personnel except for those directly involved with study drug preparation will be blinded to study drug assignment for the entire study." Comment: use of placebo was described in protocol. During the maintenance phase all arms of the study (IV vedolizumab 8 weekly, 4 weekly and placebo) would receive either placebo or study drug 4 weekly to maintain the blind. IV cover bags were also used to maintain blinding. |
| Blinding of outcome assessment (detection bias) All outcomes | Low risk | Quote: "after the Induction Phase has been completed, select and pre-specified personnel at Millennium will become unblinded to patient-level data in order to conduct the analyses and reporting of the Induction Phase data. As these activities will occur while the Maintenance Phase is ongoing, proper procedures will be in place to protect the blind until completion of the Maintenance Phase." |
| Incomplete outcome data (attrition bias) All outcomes | Low risk | According to the flowchart of Supplementary Figure 1 (S1), attrition was balanced in all groups with adequate reasons provided for loss in numbers. |
| Selective reporting (reporting bias) | Low risk | No published protocol found. According to trial registration, authors reported relevant data accordingly: clinical response and clinical remission (CDAI scores) at relevant intervals. |
| Other bias | Low risk | Baseline characteristics reported and balanced for participants in all groups. No other apparent sources of bias. |

Study characteristics
Methods

Induction or maintenance study: induction study
Active or inactive disease at beginning of study: active
Participants were considered for the primary outcome if they had failed TNF antagonists. They were also included for randomisation and some of the secondary outcomes if they were TNF antagonist naive.

Inclusion criteria
• Aged 18-80 years
• CD with known ileal or colon (or both) involvement ≥ 3 months before enrolment (based on clinical and endoscopic evidence, corroborated by histopathology)
• At least moderately active CD (defined by CDAI score 220-400 within 7 days before enrolment) in addition to ≥ 1 of: CRP > 2.87 mg/L, colonoscopy within prior 4 months, fecal calprotectin > 250 μg/g during screening in conjunction with features of active CD on small bowel imaging
• Inadequate response, loss of response or intolerance to: TNF inhibitors, immunosuppressive or corticosteroids in past 5 years (for primary outcome population)

Exclusion criteria
Follow-up measurements after study end: those who had no unacceptable adverse events or did not require CD-related surgery were continued in the open-label GEMINI long-term safety trial
Outcomes

Primary outcomes as defined by study authors
• Proportion of participants in clinical remission at week 6 (defined by CDAI ≤ 150) from the TNF antagonist failure population

Secondary outcomes
• Proportion of participants in clinical remission at week 6 from the overall study population (including about 25% of TNF antagonist-naive participants)
• Proportion of participants in clinical remission at week 10 (from the TNF antagonist-failure populations in addition to the additional TNF-naive population)
• Proportion of participants with a CDAI-100 response at week 6 (from the TNF antagonist-failure population)

Notes
Funding source: Takeda Pharmaceuticals International, Inc.
Sponsored and funded by Millennium Pharmaceuticals, Inc (trading as Takeda Pharmaceuticals International Co).

Conflicts of interest:
(quote) "The authors disclose the following: Bruce Sands has received consulting and advisory board fees as well as clinical research/institutional grant support from AbbVie, Inc, Janssen Pharmaceuticals, Inc, and Takeda Pharmaceuticals International Co; Brian Feagan has received consulting fees and research grant support from Janssen Pharmaceuticals, Inc, Takeda Pharmaceuticals International Co, and UCB SA; Paul Rutgeerts has received consulting fees from Takeda Pharmaceuticals International Co. and UCB SA; Jean-Frédéric Colombel has received consulting fees from Takeda Pharmaceuticals International Co. and UCB SA; William Sandborn has received consulting fees and research grants from Janssen Pharmaceuticals, Inc, Takeda Pharmaceuticals International Co, and UCB SA, and speaker fees from Janssen Pharmaceuticals, Inc; Richmond Sy has received consulting, lecture, and advisory board fees as well as research grant support from AbbVie, Inc, and Janssen Pharmaceuticals, Inc, and both advisory board fees and clinical trial support from Takeda Pharmaceuticals International Co; Geert D'Haens has received consulting and lecture fees from AbbVie, Inc, Janssen Pharmaceuticals, Inc, Takeda Pharmaceuticals International Co, and UCB SA, research grants from Janssen Pharmaceuticals, Inc, and speaking honoraria from UCB SA; Shomron Ben-Horin has received consultancy and advisory board fees from Janssen Pharmaceuticals, Inc, and AbbVie, Inc, and an unrestricted research grant from Janssen Pharmaceuticals, Inc; Asit Parikh is an employee of Takeda Pharmaceuticals International, Inc; Jing Xu, Maria Rosario, Irving Fox, and Catherine Milch are employees of Takeda Pharmaceuticals International Co; and Stephen Hanauer has received consultancy and advisory board fees as well as clinical research/institutional grant support from AbbVie, Inc, Janssen Pharmaceuticals, Inc, Takeda Pharmaceuticals International Co, and UCB SA.

Cochrane Database of Systematic Reviews
Random sequence generation (selection bias) Low risk Quote: "randomization was computer-generated centrally."
Allocation concealment (selection bias) Low risk Quote: "patient enrollment, monitored by an interactive voice response system."
Blinding of participants and personnel (performance bias) All outcomes Low risk Quote: "treatment-qualified patient received a unique randomization number used to provide treatment assignments for dose preparation via the interactive voice response system. Saline bag covers and labels maintained blinding. Only the study site pharmacist was aware of treatment assignments."
Blinding of outcome assessment (detection bias) All outcomes Low risk Contacted study authors by e-mail (Dr Bruce Sands), who confirmed that outcome assessors were blinded to treatment assignments.
Incomplete outcome data (attrition bias) All outcomes Low risk Attrition in each group reported and balanced in all groups. Discontinuation due to adverse events reported and balanced between treatment and placebo groups.
Selective reporting (reporting bias) Low risk Although a published protocol was reported to be available, we could not find it. According to the trial registration and methods section, the authors reported the relevant outcomes: proportion of participants in clinical remission and response (CDAI scores).
Other bias Low risk Baseline characteristics reported for and balanced in all groups. No other apparent sources of bias.

Study characteristics
Methods After a 28-day screening period, all enrolled participants received open-label IV vedolizumab 300 mg at weeks 0 and 2 with disease assessment at week 6. Those who responded (decrease in CDAI of ≥ 70) were randomised 2:1 to maintenance vedolizumab

Inclusion criteria
• Aged 18-80 years • Diagnosis of CD established ≥ 3 months before screening by clinical and endoscopic evidence and corroborated by a histopathology report • Moderate-to-severe active CD (CDAI score 220-450) within 7 days before the first dose of study drug and ≥ 1 of: CRP > 2.87 mg/L; ileocolonoscopy with a minimum of 3 non-anastomotic ulcers or 10 aphthous ulcers, within 4 months before screening; or fecal calprotectin > 250 μg/g with CT/MRI/small bowel radiography/capsule endoscopy revealing CD ulcerations within 4 months before screening • Ileal or colonic (or both) involvement at minimum • People with > 8 years' duration of extensive colitis or pancolitis or left-sided colitis of > 12 years' duration must have documented surveillance endoscopy performed within 12 months of screening • Up-to-date cancer screening • Inadequate response to, loss of response to or intolerance of ≥ 1 of: immunomodulators, corticosteroids or anti-TNF therapies

Exclusion criteria
Gastrointestinal exclusion criteria
• Abdominal abscess, extensive colonic resection, subtotal or total colectomy • History of > 3 small bowel resections or diagnosis of short bowel syndrome • Received tube feeding, defined formula diets or parenteral alimentation within 28 days before the administration of the first dose of study drug • Previous ileostomy, colostomy or known fixed symptomatic stenosis of the intestine • Receipt of any investigational or approved biological/biosimilar within 60 days or 5 half-lives of screening, or receipt of any non-permitted investigational or approved non-biological therapies within 30 days or 5 half-lives of screening • Oral 5-ASA, probiotics and antibiotics were permitted if doses were stable for 2 weeks before first dose of study and remained stable throughout the study. Antidiarrhoeals were permitted. Azathioprine, 6-mercaptopurine or methotrexate could be continued if the participant's dose had been stable for 8 weeks before the start of the study • Topical (rectal) treatment with 5-ASAs or corticosteroid enemas/suppositories within 2 weeks of the administration of the first dose of study drug • Requirement or anticipated requirement for surgical intervention for CD during the study • History or evidence of adenomatous colonic polyps that had not been removed or colonic mucosal dysplasia • Suspected or confirmed diagnosis of ulcerative colitis, indeterminate colitis, ischaemic colitis, radiation colitis, diverticular disease associated with colitis or microscopic colitis

Infectious disease exclusion criteria
• Evidence of an active infection during the screening period • Evidence of, or treatment for, Clostridium difficile infection or other intestinal pathogen within 28 days before the first dose of study drug • People with chronic HBV infection or chronic HCV infection or HBV-immune may have been included • Active or latent tuberculosis • Any identified congenital or acquired immunodeficiency (e.g. common variable immunodeficiency, HIV infection, organ transplantation) • Receipt of any live vaccinations within 30 days before screening • Clinically significant infection (e.g. pneumonia, pyelonephritis) within 30 days before screening, or ongoing chronic infection

General exclusion criteria
• Previous exposure to approved or investigational anti-integrin antibodies (e.g. natalizumab, efalizumab, etrolizumab, abrilumab (AMG 181)), antimucosal addressin cell adhesion molecule-1 antibodies or rituximab • Previous exposure to vedolizumab • Hypersensitivity or allergies to any of the vedolizumab excipients • Any unstable or uncontrolled cardiovascular, pulmonary, hepatic, renal, gastrointestinal, genitourinary, haematological, coagulation, immunological, endocrine/metabolic or other medical disorder that, in the opinion of the investigator, would confound the study results or compromise patient safety • Any surgical procedure requiring general anaesthesia within 30 days before screening or plan to undergo major surgery during the study period

• Any history of malignancy, except for the following: adequately treated non-metastatic basal cell skin cancer; squamous cell skin cancer that had been adequately treated and that had not recurred for ≥ 1 year before screening and history of cervical carcinoma in situ that had been adequately treated and that had not recurred for ≥ 3 years before screening. People with remote history of malignancy (e.g. > 10 years since completion of curative therapy without recurrence) were to be considered on a case-by-case basis based on the nature of the malignancy and the therapy received • History of any major neurological disorders, including stroke, multiple sclerosis, brain tumour or neurodegenerative disease • Positive PML subjective symptom checklist at screening (or before the administration of the first dose of study drug at week 0) • Any of the following laboratory abnormalities during the screening period: haemoglobin level < 8 g/dL, white blood cell count < 3 × 10⁹/L, lymphocyte count < 0.5 × 10⁹/L, platelet count < 100 × 10⁹/L or > 1200 × 10⁹/L, alanine aminotransferase or aspartate aminotransferase > 3 × ULN, alkaline phosphatase > 3 × ULN, serum creatinine > 2 × ULN • History of drug abuse (defined as any illicit drug use) or a history of alcohol abuse within 1 year before screening • Active psychiatric problem that, in the investigator's opinion, may have interfered with compliance with study procedures • Patient or carer was unable to attend all study visits or comply with study procedures • Unwilling or unable to self-inject, or did not have a carer (defined as a legal adult) to inject the study medication • Lactation or pregnancy during the screening period or a positive urine pregnancy test at week 0, before study drug administration • Intention to reproduce before, during or within 18 weeks after participating in study • Immediate family member, study site employee or in a dependent relationship with a study site employee who was involved in conduct of study (e.g. spouse, parent, child, sibling), or may have consented under duress

Measurement timepoints during study
• Participants completed validated instruments to measure quality of life and work productivity at weeks 0, 6, 30 and 52, including the IBDQ • Blood samples were drawn for determination of vedolizumab serum concentrations predose at weeks 0, 6, 8, 14, 22, 30, 38, 46, 50 and 68
Any follow-up measurements after study end? NR

Primary outcomes as defined by study authors
• Clinical remission (defined as CDAI score ≤ 150) at week 52

Secondary outcomes as defined by study authors
• Enhanced clinical response (defined as a ≥ 100 decline in CDAI score from baseline (week 0)) at week 52 • Corticosteroid-free clinical remission (participants using oral corticosteroid at baseline who discontinued corticosteroid and were in clinical remission at week 52) • Clinical remission at week 52 in anti-TNF-naive participants
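
The CDAI-based outcome definitions used in this study can be expressed as simple threshold rules. The following is a minimal illustrative sketch, not part of the trial or the review methods: the function and class names are hypothetical, and the thresholds are taken from the definitions stated in this table (clinical remission: CDAI ≤ 150; response: decrease in CDAI of ≥ 70; enhanced clinical response: decrease of ≥ 100 from baseline).

```python
from dataclasses import dataclass

# Thresholds as stated in the study characteristics above.
REMISSION_MAX_CDAI = 150   # clinical remission: CDAI score <= 150
RESPONSE_MIN_DROP = 70     # response: decrease in CDAI of >= 70
ENHANCED_MIN_DROP = 100    # enhanced clinical response: decrease of >= 100


@dataclass
class CdaiOutcome:
    """Hypothetical container for one participant's CDAI-based outcomes."""
    remission: bool
    response: bool
    enhanced_response: bool


def classify_cdai(baseline: float, follow_up: float) -> CdaiOutcome:
    """Classify outcomes from a baseline and a follow-up CDAI score."""
    drop = baseline - follow_up  # positive value means improvement
    return CdaiOutcome(
        remission=follow_up <= REMISSION_MAX_CDAI,
        response=drop >= RESPONSE_MIN_DROP,
        enhanced_response=drop >= ENHANCED_MIN_DROP,
    )


if __name__ == "__main__":
    # Example: baseline 300, follow-up 220 (an 80-point decrease).
    print(classify_cdai(300, 220))
```

Note that the three outcomes are not mutually exclusive: a participant can meet the response threshold without being in remission, which is why the trial reports them separately.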

We contacted the authors who confirmed the randomisation schedules were generated using interactive response technology.

Allocation concealment (selection bias)
Low risk Quote: "An interactive web response system was used for patient randomisation" and "All randomisation information was stored in a secured area, accessible only by authorised personnel." We contacted the authors who confirmed that the personnel from the vendor who had access to the randomisation schedule were not involved in the study conduct or data analysis.
Blinding of participants and personnel (performance bias) All outcomes Low risk Quote: "double-blind, placebo-controlled" and "All randomisation information was stored in a secured area, accessible only by authorised personnel." We contacted the authors who confirmed participants and study personnel were unaware of treatment assignments.
Blinding of outcome assessment (detection bias) All outcomes Low risk We contacted the authors (Dr Severine Vermeire), who confirmed that outcome assessors were blinded to the treatment assignments.
Incomplete outcome data (attrition bias) All outcomes Low risk Attrition was accounted for and balanced in both groups with adequate reasons provided for loss in numbers.
Selective reporting (reporting bias) Low risk No published protocol found. According to the trial registration and methods section, the authors reported the necessary endpoints: proportion of participants with clinical remission and response.
Other bias Low risk Baseline characteristics were reported for and balanced in all groups. No other apparent sources of bias.

Study characteristics
Methods Study design: 2-arm, phase 3, double-blind randomised study

Study dates: January 2014 to November 2017
Setting: NR
Participants Induction or maintenance study: induction (maintenance trial assessed separately)

Inclusion criteria
• In opinion of investigator, the person was capable of understanding and complying with protocol requirements • Person or, when applicable, the person's legally acceptable representative signed and dated the informed consent form prior to initiation of any study procedures • Male or female, aged 15-80 years, inclusive, at signing of informed consent

• A non-sterilised male participant who had a female partner of child-bearing potential had to agree to use adequate contraception during the period from the signing of informed consent to 6 months after the last dose of the study drug • A female participant of child-bearing potential (i.e. non-sterilised or whose last regular menses was within previous 2 years) who had a non-sterilised male partner had to agree to use adequate contraception during the period from the signing of informed consent to 6 months after the last dose of the study drug • Diagnosed with ileal, colonic or ileocolonic CD ≥ 3 months prior to first dose of study drug according to the Revised Diagnostic Criteria for CD issued by Research Group for Intractable Inflammatory Bowel Disease Designated by the Ministry of Health, Labor, and Welfare of Japan (2012) • CDAI score 220-450 (inclusive) at first dose of study drug, and meeting ≥ 1 of: CRP > 0.30 mg/dL at screening; irregular-to-round shaped ulcers or multiple aphtha (≥ 10 lesions) in extensive area of the small or large intestine on endoscopy or imaging test within 4 months before first dose of study drug; longitudinal ulcers or a cobblestone appearance in the small or large intestine on endoscopy or imaging test within 4 months before first dose of study drug • Complication of colon cancer or dysplasia was ruled out by total colonoscopy at first dose of study drug (or the results from total colonoscopy performed within 1 year before giving consent were available), if patients met any of: extensive or limited colitis of ≥ 8 years' duration, aged ≥ 50 years or with a first-degree family history of colon cancer • Met the treatment failure criteria below with ≥ 1 of the following agents within 5 years before signing of informed consent • Corticosteroids ▪ Resistance: patients whose response was inadequate despite the treatment of ≥ 40 mg/day (oral or IV) for ≥ 1 week or 30-40 mg/day (oral or IV) for ≥ 2 weeks ▪ Dependence: patients who had failed to reduce the dosage to < 10 mg/day due to recurrence during gradual dose reduction (oral or IV) ▪ Intolerance: patients who were unable to receive continuous treatment due to adverse reactions (e.g. Cushing's syndrome, osteopenia/osteoporosis, hyperglycaemia, insomnia, infection) • Immunomodulators (azathioprine, 6-mercaptopurine or methotrexate) ▪ Refractory: patients whose response was inadequate despite the treatment for ≥ 12 weeks ▪ Intolerance: patients who were unable to receive continuous treatment due to adverse reactions (e.g. nausea/vomiting, abdominal pain, pancreatitis, liver function test abnormalities, lymphopenia, thiopurine S-methyltransferase genetic mutation, infection) • Anti-TNFα ▪ Inadequate response: patients whose response was considered inadequate (determined by investigators) despite the induction therapy in the dosage described in the package insert (this definition was different from the one used in GEMINI 2 and GEMINI 3) ▪ Loss of response: patients who had relapse during the scheduled maintenance therapy after achieving clinical response (those who withdrew for other reasons than relapse were not applicable here) ▪ Intolerance: patients who were unable to receive continuous treatment due to adverse reactions (e.g. infusion-related reaction, demyelinating disease, congestive heart failure, infection)

Exclusion criteria
• Evidence of or suspected abscess • History of subtotal or total colectomy • History of small intestine resections in ≥ 3 locations, or a history of diagnosis of short bowel syndrome • Had ileostomy, colostomy, internal fistula, or severe intestinal stenosis • Treatment history with natalizumab, efalizumab or rituximab • Started oral 5-ASAs, probiotics, antibiotics for CD treatment or oral corticosteroids (≤ 30 mg/day) within 13 days before first dose of study drug. Or patients who changed dosage of or discontinued these drugs within 13 days before first dose of study drug if the patient had used these drugs for > 14 days before first dose of study drug • Received 5-ASA, corticosteroid enemas/suppositories, corticosteroid IV infusion, oral corticosteroid at > 30 mg/day, drugs for diarrhoea-predominant irritable bowel syndrome, or Chinese herbal medicine for CD treatment (e.g. Daikenchuto) within 13 days before first dose of study drug

Interventions
• CG: IV placebo at weeks 0, 2 and 6 • IG: IV vedolizumab 300 mg at weeks 0, 2 and 6
Duration of study: 14-week induction phase
Measurement timepoints during study: weeks 2, 6, 10 and 14
Follow-up measurements after study end: participants were either included in the maintenance phase or in an open-label cohort and reinduced with vedolizumab. Participants were followed up until week 94
Outcomes

Primary outcomes as defined by study authors
• CDAI-100 response at week 10 (CDAI reduction of ≥ 100)

Secondary outcomes
• Percentage of participants who achieved clinical remission at week 10 (CDAI ≤ 150) • Change over time in CRP concentration during the induction phase in participants with baseline CRP 0.30 mg/dL
Notes Funding: Takeda Pharmaceutical Company Limited.

Bias Authors' judgement Support for judgement
Random sequence generation (selection bias) Unclear risk Quote: "Randomization schedules were generated by personnel designated by the sponsor."

ADDITIONAL TABLES
Table columns: Study ID; Numbers randomised group; Trial registration number; Published protocol; Do the outcomes reported match the protocol or trial register?

CONTRIBUTIONS OF AUTHORS
SH: completed the search, data extraction, analysis and completed the writing of the review, and approved the final version prior to submission.
VS: provided substantial comments regarding intellectual content, contributed to the analysis, and approved the final version prior to submission.
MG: provided substantial comments regarding intellectual content, contributed to the analysis and editing the review, and approved the final version prior to submission.
AK: completed the search and approved the final version prior to submission.
GA: provided substantial comments regarding intellectual content, contributed to writing and editing the review, and approved the final version prior to submission.
NSD: contributed to editing the review, and approved the final version prior to submission.
RKB: oversaw and contributed to the search, analysis, contributed to writing and editing the review, and approved the final version prior to submission.

DECLARATIONS OF INTEREST
SH: none.
GA was the Managing Editor of the Cochrane Gut group. However, she has not been involved in any stage of the review's editorial process.
NSD has received a research grant from Takeda and GESA; speaking fees from Abbvie, Ferring, Shire and Pfizer; and is on an advisory board for Abbvie.

DIFFERENCES BETWEEN PROTOCOL AND REVIEW
We made the following changes from our protocol (Hui 2020).

Methods -types of studies
Quasi-randomised trials were eligible in the final review, whereas the protocol did not specifically refer to them.

Methods -types of participants
We initially planned to include only participants with medically induced remission and to exclude people who had Crohn's-related surgery in the preceding six months. We removed this requirement as these data were not available for any of the included studies.
We clarified in the review that studies with only a subset of eligible participants would be included.

Methods -types of interventions
The protocol stated that vedolizumab was to be compared to a control arm of placebo or other medical therapy. We changed this to a comparison with placebo only in the review.
We provided more details on the intervention of interest (doses, frequency, duration) in the review.

Methods -primary and secondary outcomes
We reversed the primary outcomes from "failure to achieve clinical remission" to "induction of clinical remission" and "clinical relapse" to "maintenance of clinical remission".
We reversed the secondary outcomes from "failure to achieve clinical response" to "induction of clinical response", "failure to achieve endoscopic remission" to "endoscopic remission" and "failure to achieve endoscopic response" to "endoscopic response". The remaining secondary outcomes did not differ.
Primary and secondary outcomes were initially grouped in the protocol. In our review, we separated outcomes and studies into an induction phase (with primary and secondary outcomes) and maintenance phase (with primary and secondary outcomes) as several of the included studies conducted separate randomisation processes at the maintenance phase.

Methods -search methods
In our protocol, we planned to contact experts in the field for additional published and unpublished studies. We omitted this from the review. We removed the Cochrane IBD Review Group Specialised Trials Register as it is now integrated within CENTRAL.
For the search strategy for ClinicalTrials.gov, we included further details on the specific search. We included the WHO ICTRP search strategy.
Although we did not contact experts in the field, we contacted the authors of the included studies. We also removed the manual search for conference abstracts because all relevant conferences publish their abstracts as journal supplements, which are indexed in Embase. Our Embase search included these conference abstracts.

Methods -timing of outcome measurement
The protocol stated assessments at six, 12 and 52 weeks. We replaced these timepoints with separate analyses for induction studies and maintenance studies. The timing of outcome assessment within induction and maintenance studies was as defined by the study authors.

Methods -measures of treatment effect
In the protocol, we described our plans to deal with continuous and time-to-event data. This was not required with the available studies.

Methods -assessment of risk of bias in included studies
We specified that we used the RoB 1 tool to determine the risk of bias of included studies.

Methods -unit of analysis issues
In the review, we have outlined how data from cluster randomised controlled trials would be incorporated.