The effect of exercise interventions on inflammatory markers in major depressive disorder: protocol for a systematic review and meta-analysis

Background: Depression currently affects 4.4% of the global population, and 93.7% of this population suffer from major depressive disorder (MDD) according to 2017 statistics. MDD patients are more likely to suffer from co-morbidities such as cardiovascular disease and high body mass index (BMI), which contribute to the disorder's large cost to society. Throughout the literature, there are known links between inflammation and MDD. While exercise is considered a promising intervention for MDD, the mechanism(s) of action remain unclear, thereby preventing the creation of optimal, cost-saving exercise “prescriptions” for those with MDD. Thus, the aim of this review and meta-analysis is to summarize and analyse the current literature exploring how quantified exercise interventions modulate inflammatory molecules in MDD patients. Methods: Electronic databases (APA PsycINFO and PubMed/MEDLINE via the EBSCO interface, and EMBASE) will be searched using a detailed search strategy comprising three search term themes: exercise, depression/MDD, and inflammation/inflammatory molecules. Only quantified exercise interventions performed in adult humans with MDD validated via recognized diagnostic criteria will be included. Studies should also include an MDD control group and explore changes in inflammatory molecules. Examples of these molecules include: C-reactive protein (CRP), interleukin-6 (IL-6), interleukin-1 beta (IL-1β), tumour necrosis factor-alpha (TNF-α), homocysteine, d-dimer and myeloperoxidase (MPO). After eligible studies are identified, standardized data extraction will be employed and the risk of bias in each study will be appraised using the Cochrane handbook checklists. In the event of two or more homogeneous studies exploring exercise effects over a similar time period, raw mean differences or standardized mean differences will be pooled using random-effects analysis.
This systematic review and meta-analysis will be reported according to the Preferred Reporting Items for Systematic reviews and Meta-Analyses (PRISMA) guidelines. Dissemination: This systematic review and meta-analysis will be disseminated in peer-reviewed journals. PROSPERO registration: CRD42020186006 (31/08/2020)


Rationale
Depression is a leading cause of disability worldwide, affecting more than 300 million people or 4.4% of the global population today 1 . In 2015, it was calculated that depressive disorders account for more than 50 million 'years lived with disability' (YLD) internationally, with the World Health Organization predicting that depressive disorders will be the leading cause of global health burden by 2030 1 . According to the Global Burden of Disease (GBD), the number of incident cases of depression worldwide has seen a steep increase from 172 million in 1990 to more than 258 million in 2017, whereby a large majority (93.7%) of these patients with depression suffer from major depressive disorder (MDD) 2 . The Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), classifies MDD as a clinical depression syndrome where a patient experiences persistent low mood for at least two weeks, inclusive of other depressive symptoms: dysphoria, reduced motivation, psychomotor and cognitive dysfunctions ('pseudo-dementia'), anhedonia, sleep and weight changes, and reduced libido 3 . MDD is also associated with physical health comorbidities (cardiovascular disease, high BMI, and premature mortality), all of which compound its high financial cost to society 4 .
The connection between inflammation and MDD is well recognized throughout the literature [5][6][7][8][9] . The mechanism of this relationship is described by the inflammation/cytokine model of depression, which purports that a pro-inflammatory state caused by an increase in pro-inflammatory cytokines and/or a reduction in anti-inflammatory cytokines results in the development of clinical depression in humans 10-12 . The pro-inflammatory cytokines involved in the cycle include tumour necrosis factor (TNF)-α, interferon (IFN)-γ, interleukin-1 beta (IL-1β), and interleukin-6 (IL-6), the last of which is also known to have anti-inflammatory properties 13 . The key cytokines that have been implicated in the sickness behaviours that overlap with MDD (anhedonia, social withdrawal, decreased activity) are TNF-α and IL-1β 10 . IL-6 is less strongly associated, although it is elevated in those with SSRI-resistant depression 14 . Interestingly, TNF-α and IL-6 levels were found to be particularly elevated in depressed subjects compared to controls in a 24-study meta-analysis conducted by Dowlati et al. 15 .
It has been consistently reported throughout the literature that exercise has antidepressant effects 16 . For example, a meta-analysis of 49 prospective cohort studies (1,837,794 person-years) by Schuch et al. determined that participants with higher exercise frequency had a 17% reduction in odds of developing depression 16 . Conversely, there is evidence that low cardiorespiratory fitness (CRF), an indicator of physical inactivity, is associated with a 64% increased risk of developing depression compared to high CRF across at least 3,540,450 person-years of data 17 . Apart from these findings, exercise has also been shown to enhance self-esteem and carries less stigma than psychotherapy 18 . The low side-effect profile of exercise, and the opportunity to adapt it to a patient's comorbidities, make it an attractive intervention when considering the complexity of depression in relation to pharmacological and psychotherapeutic options 19 . Despite the apparent utility of exercise in treating MDD, the precise mechanisms and mediators underlying its antidepressant effects remain to be elucidated, thus prohibiting the creation of optimal exercise treatments for MDD patients. We are aware of a detailed systematic review regarding the neurobiological effects of exercise on major depressive disorder, which provided evidence that exercise training had no significant acute or chronic adaptations on any inflammatory marker reviewed in people with MDD 20 . We believe that a 6-year time gap is an appropriate timeframe within which to look anew at the data. The additional novelty of our review is its specific focus on how the quantified exercise interventions published to date modulate inflammatory biomarkers in MDD patients, in contrast to those authors' more expansive review, which accounts for adaptations in neurogenesis, inflammation biomarkers, and brain structure, and for how improved depressive symptoms relate to hippocampal volume.
We believe that our review will extend the observations around inflammation specifically beyond 2016. It is also necessary to restate that a paucity of studies remains in this area of research and that limitations within existing studies preclude a definitive conclusion that exercise reduces inflammation in people with MDD.
Amendments from Version 2
This version includes minor changes as a direct response to the constructive feedback from the reviewer, which was to incorporate our responses to the reviewer within the body of the manuscript.
These are as follows:
1) Incorporation of a description of the need for this review, the novelty it brings to the research field, and how this paper will further expand the literature (Introduction, paragraph three).
2) Objectives, third objective: minor amendment to clarify that the review will evaluate the effects of different exercise for acute exercise (a single bout of exercise) or chronic exercise (repeated bouts of exercise).
3) Methods, Interventions section: incorporation of more detail about the need for included literature to have clear quantification and specification of exercise intensity and measurements.
4) Data synthesis section, paragraph two: incorporation of more detail to clarify how non-standardised measurements (e.g., cytokine dosage) will be handled, as suggested by the reviewer.
5) Discussion, paragraph one: incorporation of further detail to clarify how studies evaluating the effects of exercise will be handled and reported depending on whether they include a control group of people with MDD who did not exercise.
6) Update to the author details (JM) to include current employer address.
Any further responses from the reviewers can be found at the end of the article.

It is known that IL-6 levels are increased post-exercise when it is released from skeletal muscles, and it has been shown to inhibit the production of TNF-α; as such, IL-6 may be anti-inflammatory in the context of immediate exercise 21 . However, the mediation of IL-6 in relation to its pro-inflammatory function, along with other similar molecules, is still being explored with reference to different exercise styles and durations. According to a longitudinal exercise study by Kohut et al., cardiovascular exercise performed three times per week for 45 minutes per day over 10 months resulted in decreased serum levels of CRP and IL-6 22 . It was also found that reductions in TNF-α were seen in both the cardiovascular group and the flex group, who performed elements of yoga, Tai Chi, flex band, free hand weights and stability balls over the same time period 22 . Interestingly, a shorter-term, 10-week exercise study by Donges et al. showed that resistance exercise training reduced CRP levels by 32.8%, compared to the aerobic group, which saw only a 16.1% reduction 23 . Therefore, there is evidence from exercise training research that suggests the long-term anti-inflammatory outcome of exercise could be due to the reduction of pro-inflammatory molecules.
Considering (1) the inflammatory aetiology of MDD, (2) the connections between inflammation and exercise, and (3) the urgency for accessible, economical treatments for MDD on a global scale, it would be useful to both the scientific and medical communities to perform a novel systematic review and meta-analysis that will inform areas for further research and identify gaps in the field. It is important to note that this study will only evaluate the impact of quantified exercise interventions, rather than "physical activity" broadly. Such terms are often conflated, or used interchangeably, in the literature despite their distinct definitions. Physical activity (PA) refers to any bodily movement by the skeletal muscles that necessitates energy expenditure (e.g., gardening, leisurely walking, sports); exercise, by contrast, is defined as a subset of PA that is structured, planned, and repetitive, and has the objective of improving or maintaining fitness 24 . Thus, our inclusion of only quantified exercise interventions aims to ensure standardization across all studies in order to increase the validity of review findings.

Objectives
The purpose of this research is to systematically review previous studies that have investigated the effects of quantified exercise interventions on inflammatory cytokine and biomarker levels in individuals diagnosed with MDD, via validated criteria.
The second focus of this review is to undertake a meta-analysis on inflammatory molecule changes pre- and post-intervention in MDD patients included in the systematic review studies.
Further objectives include: (1) Evaluate the inter-study consistency of MDD populations and the criteria of diagnosis.
(2) Establish which inflammatory molecules are most studied in the MDD populations in research that involves an exercise intervention.
(3) Evaluate the effects of different exercise modalities, exercise intensity, exercise session duration, and exercise intervention duration (i.e., acute exercise, defined as a single bout of exercise, or chronic exercise, defined as repeated bouts of exercise), and how they modulate inflammation in MDD patients.
(4) Ascertain similar studies that measured the same outcomes, to determine those suitable for meta-analysis.
This protocol is reported in line with the Preferred Reporting Items for Systematic review and Meta-Analyses Protocols (PRISMA-P) guidelines 25 .

Eligibility criteria
Studies will be selected for inclusion in the systematic review and meta-analysis according to the following criteria:
Study designs. We intend to include all published intervention studies examining the effects of an exercise intervention on inflammatory molecule levels in human MDD patients and controls. Exercise intensity and duration must be reported. The following will be excluded from the review and meta-analysis: case studies; qualitative studies; and quantitative studies performed on animal models, on human populations with no clear MDD diagnosis or with subclinical symptoms of depression, or on those with psychiatric or somatic co-morbidities. Systematic reviews and meta-analyses will be used as a resource to identify relevant/eligible studies but will not be reported herein.
Participants. The study population must be human, aged 18 years or older, with a clinical diagnosis of MDD according to a recognised, widely used diagnostic classification approach, i.e., the Diagnostic and Statistical Manual of Mental Disorders.
Interventions. The systematic review and meta-analysis will evaluate the impact of exercise interventions that have provided clear quantification of the exercise intensity using a clearly specified unit of measure (for example, work rate specified in watts, joules, heart rate, or a percentage of maximal oxygen uptake or peak oxygen uptake). Studies should specify the following parameters: exercise (type, mode, frequency, and duration); work done (measures of power (watts) or energy transferred (joules)); measures of heart rate; repetitions performed; duration of each session; number of total sessions; and duration of the intervention. Interventions evaluating yoga or tai-chi only, or those combined with pharmacological treatments, will not be included in this review.
Comparators. Study control groups (i.e., not performing intervention) should include those diagnosed with MDD by the same standards as the participants and not having any other major psychiatric or health co-morbidities. Controls should ideally have the same amount of interaction time with the researchers.

Time frame. There will be no restrictions on the length of follow up of outcomes.
Setting. There will be no restrictions on type of setting for the interventions.

Report characteristics. We will only include studies reported in the English language, due to limited resources to adequately translate studies. Only published journal articles available from our selected online databases will be included.
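The eligibility criteria above can be read as a checklist applied to each candidate study. The following is a minimal sketch of that reading, not the screening tool used by the authors; the `CandidateStudy` record, its field names, and the wording of the exclusion reasons are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CandidateStudy:
    """Hypothetical screening record; field names are illustrative only."""
    human_participants: bool
    min_age_years: int
    mdd_clinical_diagnosis: bool          # recognised diagnostic classification
    other_comorbidities: bool             # psychiatric or somatic co-morbidities
    intensity_reported: bool
    duration_reported: bool
    yoga_or_tai_chi_only: bool
    combined_with_pharmacotherapy: bool
    has_mdd_control_group: bool
    language: str
    published_journal_article: bool


def exclusion_reasons(study: CandidateStudy) -> List[str]:
    """Return every criterion the study fails; an empty list means eligible."""
    checks = [
        (study.human_participants, "not a human population"),
        (study.min_age_years >= 18, "participants younger than 18 years"),
        (study.mdd_clinical_diagnosis, "no clear clinical MDD diagnosis"),
        (not study.other_comorbidities, "psychiatric or somatic co-morbidities"),
        (study.intensity_reported, "exercise intensity not reported"),
        (study.duration_reported, "exercise duration not reported"),
        (not study.yoga_or_tai_chi_only, "yoga or tai-chi only"),
        (not study.combined_with_pharmacotherapy, "combined with pharmacological treatment"),
        (study.has_mdd_control_group, "no MDD control group"),
        (study.language.lower() == "english", "not reported in English"),
        (study.published_journal_article, "not a published journal article"),
    ]
    return [reason for passed, reason in checks if not passed]


eligible = CandidateStudy(True, 18, True, False, True, True,
                          False, False, True, "English", True)
ineligible = CandidateStudy(True, 16, True, False, True, True,
                            True, False, True, "English", True)
print(exclusion_reasons(eligible))
print(exclusion_reasons(ineligible))
```

Returning the full list of failed criteria, rather than a single yes/no flag, matches the need to report reasons for exclusion in the PRISMA flow chart.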

Information sources
The following electronic databases will be searched: EMBASE, APA PsycINFO (EBSCO interface), PubMed/MEDLINE (EBSCO interface). The literature search will include all searchable published articles up until current day (2021). Literature saturation will be ensured by examining study references as well as identifying related systematic reviews for further relevant studies. While specific authors will not be contacted, research by prominent authors in the field will be screened to ensure all relevant material has been captured.

Search strategy
Planned limits on the search include papers published in the English language only. The search strategy will include terms relating to the following three subject categories: exercise, depression/MDD, and inflammation/inflammatory molecules. Further search terms will be identified via descriptive terms under MeSH terms. The preceding search strategy content has been formulated and agreed upon by all authors of the study. A Health Sciences Librarian with expertise in formulating systematic review searches will be asked to help refine the search strategy for each database. The EBSCO interface search strategy, which can be used for both the PubMed/MEDLINE and APA PsycINFO databases, will be adapted for input into the EMBASE database interface. The detailed search strategy is presented as extended data 25 .
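A three-theme search of this kind (exercise, depression/MDD, and inflammation/inflammatory molecules, as named in the abstract) is conventionally built by joining synonyms within a theme with OR and joining the themes with AND. The sketch below illustrates only that structure. The terms are placeholder examples, not the registered search strategy (which is provided as extended data), and real database interfaces add their own field tags and syntax.

```python
def or_block(terms):
    """Join the synonyms of one theme with OR, quoting multi-word phrases."""
    quoted = ['"{}"'.format(t) if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(themes):
    """Join the per-theme OR blocks with AND, in the order given."""
    return " AND ".join(or_block(terms) for terms in themes.values())


# Placeholder terms for illustration only (assumed, not from the protocol).
themes = {
    "exercise": ["exercise", "resistance training", "aerobic training"],
    "depression": ["major depressive disorder", "MDD", "depression"],
    "inflammation": ["inflammation", "cytokine", "C-reactive protein"],
}

query = build_query(themes)
print(query)
```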
Data collection, extraction and assessment
Study selection process. Two authors (DG and JM) will independently screen titles and abstracts for all articles retrieved by the search to identify studies that meet the inclusion and exclusion criteria. The full texts of all selected and potentially relevant articles will be collected and then independently examined by both authors to decide whether or not all eligibility criteria have been met at this phase. Any disagreements will be mediated through discussion and/or a third reviewer (GC). Duplicates will be excluded. A PRISMA flow chart will display the articles examined at each stage, detailing the number of papers included and excluded and reasons for exclusion. All reviewers will be unblinded to the journal titles, study authors and institutions.

Data extraction and collection process. The following data items will be extracted from all studies and tabulated: study author and publication date, sample size (N), mean age and sex breakdown of participants (N and %), MDD diagnostic criteria, intervention type, intervention intensity, length of individual intervention sessions, duration of intervention (number of total sessions to discern between acute (a single exercise bout) and chronic (repeated exercise bouts) exercise interventions), outcome(s) measured. Data items will be manually extracted independently by two authors (DG and JM). Software will be used to organize and compare data extracted by both reviewers. As in the selection process, any disagreements between reviewers (DG and JM) will be arbitrated through discussion and/or with the aid of the third author (GC). Any uncertainties regarding data will be resolved by contacting study authors via email.
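One way to picture the tabulated data items and the comparison of two reviewers' independent extractions is sketched below. This is an illustration under assumptions: the protocol does not name the software to be used, and the `ExtractionRecord` fields, the rule that one total session means "acute", and the example values are all hypothetical.

```python
from dataclasses import asdict, dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ExtractionRecord:
    """Hypothetical extraction form mirroring the data items listed above."""
    study_author: str
    publication_year: int
    sample_size: int
    mean_age: float
    n_female: int
    mdd_diagnostic_criteria: str
    intervention_type: str
    intervention_intensity: str
    session_length_min: float
    total_sessions: int
    outcomes: Tuple[str, ...]

    @property
    def exercise_category(self) -> str:
        # Assumed rule: a single session is "acute", repeated sessions "chronic".
        return "acute" if self.total_sessions == 1 else "chronic"


def discrepancies(first: ExtractionRecord, second: ExtractionRecord) -> dict:
    """Map each field on which the two reviewers disagree to both values."""
    a, b = asdict(first), asdict(second)
    return {name: (a[name], b[name]) for name in a if a[name] != b[name]}


# Invented example values, used only to exercise the comparison.
reviewer_1 = ExtractionRecord("Example et al.", 2015, 40, 41.5, 24, "DSM-IV",
                              "aerobic", "70% HRmax", 45.0, 30, ("IL-6", "CRP"))
reviewer_2 = replace(reviewer_1, sample_size=42)

print(discrepancies(reviewer_1, reviewer_2))
print(reviewer_1.exercise_category)
```

Fields flagged by `discrepancies` would then go to discussion or to the third author, as described above.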
Outcome measures and prioritisation. The outcomes of interest included in this study fall into two broad categories: pro-inflammatory cytokines and inflammatory biomarkers. Cytokines are non-structural proteins that are produced by nearly all nucleated cells in the body, and they can be classified either as pro-inflammatory or anti-inflammatory 26 . Pro-inflammatory cytokines function to induce inflammation by modulating gene expression; examples include interleukin-6 (IL-6), interleukin-1 beta (IL-1β), and tumour necrosis factor-alpha (TNF-α) 26 . C-reactive protein (CRP), homocysteine, d-dimer and myeloperoxidase (MPO) are not categorized as pro-inflammatory cytokines but are known as inflammatory biomarkers; they are highly associated with inflammatory processes and have been found to either induce or mediate pro-inflammatory cytokines 27-30 .
In summary, our primary outcome measures of interest include those molecules most highly associated with pro-inflammatory processes. Examples of these molecules are C-reactive protein (CRP), interleukin-6 (IL-6), interleukin-1 beta (IL-1β), and tumour necrosis factor-alpha (TNF-α). Secondary outcome measures of interest include inflammatory biomarkers such as homocysteine, d-dimer and myeloperoxidase (MPO). It is important to note that throughout the study, pro-inflammatory cytokines and inflammatory biomarkers will be collectively referred to as inflammatory molecules.

Risk of bias
Two review authors (DG and JM) will independently assess the quality of each study according to study design using existing appraisal checklists provided in the Cochrane handbook, including the Risk of Bias (RoB) 2 tool for randomised controlled trials 31 . Meta-biases, such as outcome reporting bias, will be explored by evaluating whether the study's protocol was published before the recruitment of participants; this will be performed for all available protocols. Trial registries will also be reviewed to establish whether the outcome measures and statistical methods reported in the studies match their original protocols.
Data quality for each study will be recorded in a spreadsheet, and a table summarising the quality of assessment/evidence will be compiled and included in the systematic review and meta-analysis. The risk of confounding will be explored independently for each study at the quality assessment stage by examining the published study design (e.g., inclusion criteria, selectivity) or analysis (e.g., employing adjustment techniques or analysis of covariance). To reduce the risk of bias in determining study quality, all discrepancies will be resolved via reviewer consensus or through consultation with a third reviewer (GC).

Data synthesis
Studies will be grouped according to intervention type (i.e., acute or chronic), and summary tables of the characteristics of all articles included in the review will be compiled. Tables will outline the following: characteristics of study populations (N, age, sex), MDD diagnostic criteria, study design, intervention/test type, intensity, and outcomes. A narrative will summarise the findings of the tables and explore inter- and intra-study relationships.
In the event of two or more homogeneous studies in the same acute or chronic intervention group that have the same outcome unit of measure and comparator, the raw mean difference will be used in the meta-analyses (e.g., Beck Depression Inventory Score). Where studies use different measurement units, the raw mean difference (RMD) values will be extracted from each study and transformed to standardised mean difference (SMD) values to allow comparison between different units of measure. An example of where this will be pertinent is the reporting of cytokine dosage. Such a measure is not standardised and may be subject to the influence of different analysis kits that may have different ranges and sensitivities (an important factor indicated by one reviewer). For clarity, as outlined, the SMD test will be used to calculate the effect sizes of each mediator (see for example: Hedges and Olkin 32 ). For studies that evaluate the acute effects of exercise, the effect size will be calculated as the change from pre- to post-test. The baseline or rest measure prior to exercise will be used as the pre-test value. For post-test, the measure acquired immediately after exercise will be used. Where discrepancies/differences in sample timings exist, we will indicate this to allow the reader to make an informed interpretation of the observations reported. The RMDs or the SMDs, as appropriate, will be pooled using random-effects analysis and will be displayed in a forest plot with 95% confidence intervals (CIs) and weights. Cohen's criteria (Cohen 1988) will be used to interpret effect sizes: small (0.2), medium (0.5), and large effects (0.8). Subgroup analysis will be employed to explore sources of heterogeneity based on the following parameters: patient characteristics, MDD criteria used for diagnosis, intervention type (acute vs. chronic exercise), intervention mode (for example, bike vs. ergometer). Analyses will be performed using SPSS.
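To make the synthesis steps concrete, the sketch below computes a bias-corrected SMD (Hedges' g) for a two-group comparison and pools several SMDs with a random-effects model. It is an illustration under assumptions rather than the authors' analysis code: the protocol does not name a between-study variance estimator, so the DerSimonian-Laird estimator is assumed here, the "below small" label for effects under 0.2 is our own wording, and the input numbers are invented.

```python
import math


def hedges_g(m_ex, sd_ex, n_ex, m_ctl, sd_ctl, n_ctl):
    """Return (g, variance of g) for an exercise vs. control comparison."""
    df = n_ex + n_ctl - 2
    sd_pooled = math.sqrt(((n_ex - 1) * sd_ex ** 2 + (n_ctl - 1) * sd_ctl ** 2) / df)
    d = (m_ex - m_ctl) / sd_pooled
    var_d = (n_ex + n_ctl) / (n_ex * n_ctl) + d ** 2 / (2 * (n_ex + n_ctl))
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    return j * d, j ** 2 * var_d


def pool_random_effects(effects, variances, z=1.96):
    """Pool effect sizes using DerSimonian-Laird random effects (assumed)."""
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need two or more effects with matching variances")
    w = [1 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)  # between-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return {"pooled": pooled, "se": se, "tau2": tau2,
            "ci_low": pooled - z * se, "ci_high": pooled + z * se}


def cohen_label(effect):
    """Interpret |effect| against the 0.2 / 0.5 / 0.8 thresholds."""
    size = abs(effect)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


# Invented post-intervention values (exercise vs. control), for illustration only.
studies = [(2.1, 0.9, 20, 2.6, 1.0, 20),
           (3.0, 1.2, 15, 3.4, 1.1, 16),
           (1.8, 0.7, 30, 2.5, 0.8, 28)]
gs, variances = zip(*(hedges_g(*s) for s in studies))
result = pool_random_effects(list(gs), list(variances))
print(result, cohen_label(result["pooled"]))
```

With the exercise group entered first, a negative g indicates a lower marker level after exercise than in controls.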
In the event of missing data, we will endeavour to contact the original study authors to obtain the raw data or the reported data to avoid the need to extract data from figures, should the required data be available. Where no data is available, this will be identified and detailed within the review for transparency. Where data is required to be extracted from figures, due to no other source being forthcoming or available, the method used will be specified with calculations (and citations for calculations) included to allow reproduction of these results.
Review Manager (RevMan) software, version 5 (Cochrane Collaboration), will be used to perform statistical analysis and to combine results in a forest plot, using random-effects models. The pooled SMD and standard error (SE) will be used to generate a funnel plot in order to estimate the likelihood of publication bias if 10 or more studies are included in the meta-analysis. If possible, subgroup analyses will be undertaken to evaluate the impact of these sources of heterogeneity.
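A funnel plot places each study's SMD against its SE, with guide lines spreading out from the pooled estimate. The following sketch prepares only the plot data and applies the 10-study threshold stated above. The pseudo 95% limits (pooled estimate ± 1.96 × SE) are a common convention assumed here rather than taken from the protocol, and the function and key names are hypothetical.

```python
def funnel_points(effects, std_errors, pooled, min_studies=10, z=1.96):
    """Return funnel-plot data, or None when too few studies are available."""
    if len(effects) != len(std_errors):
        raise ValueError("effects and std_errors must have the same length")
    if len(effects) < min_studies:
        return None  # too few studies for a funnel plot under the stated rule
    points = []
    for smd, se in zip(effects, std_errors):
        lower, upper = pooled - z * se, pooled + z * se
        points.append({
            "smd": smd,                  # x-axis value
            "se": se,                    # y-axis value (usually inverted)
            "limit_low": lower,          # pseudo 95% limit at this SE
            "limit_high": upper,
            "inside_funnel": lower <= smd <= upper,
        })
    return points


# Invented values, for illustration only.
example_effects = [-0.4, -0.2, -0.6, -0.3, -0.1, -0.5, -0.35, -0.25, -0.45, 0.9]
example_ses = [0.20, 0.25, 0.30, 0.15, 0.22, 0.28, 0.18, 0.24, 0.26, 0.20]
points = funnel_points(example_effects, example_ses, pooled=-0.3)
print(sum(p["inside_funnel"] for p in points), "of", len(points), "inside the funnel")
```

Marked asymmetry of the points around the pooled estimate is what would suggest possible publication bias.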

Confidence in cumulative evidence
The Grading of Recommendations Assessment, Development and Evaluation (GRADE) working group methodology will be employed to evaluate the quality of evidence. The following domains will be assessed for each PICO outcome: risk of bias in individual studies, inconsistency of results between studies, indirectness of evidence, imprecision, publication bias, and factors that increase the quality of evidence 33 . Quality of evidence will be graded in terms of the level of confidence that the true effect lies close to that of the estimated effect (i.e., high, moderate, low, and very low) 33 .

Ethics and dissemination
Ethical approval is not required for this study as it does not involve the collection or use of any experimental or personal data that would require informed consent. This systematic review and meta-analysis will be disseminated in peer-reviewed journals.

Amendments
In the event of protocol amendments, the date of each amendment will be documented alongside a description of the change and its rationale. Amendments will also be recorded and tracked on the PROSPERO registry of this protocol.

Study status
The study was first registered on PROSPERO (31/08/2020) with the protocol finalised in March 2021. The literature search will be completed in May, data extraction and risk of bias assessment will be completed in June, the statistical analysis will be completed in September/October, and we expect to finish the study in May 2022.
Any protocol amendments will be stated in the review article.
Any additional analyses will be reported in the review article and stated post-hoc.

Discussion
By providing an up-to-date systematic review and meta-analysis, this study will synthesize current evidence of how exercise interventions modulate inflammatory molecule levels in those with diagnosed MDD compared to controls. Changes in pre- and post-intervention inflammatory molecule levels will be statistically summarised via meta-analysis, which will analyse acute and chronic exercise interventions separately. The underlying goal of this study is to explore whether exercise has anti-inflammatory effects in those with MDD, in relation to different types of activity and duration. The aim is to include studies that evaluate the effects of an exercise-based intervention, be it a single bout or a longer-term training intervention, on inflammation biomarkers, using controls that have been randomly allocated or created using another mechanism; before-and-after studies without controls will not be included. Ideally, these would be randomised controlled trials that have used healthy controls and/or depressed participants as controls. We intend to discuss where this is not the case and identify inconsistencies in the existing literature on the definition of a control intervention. These findings may indicate areas for further research and substantiate more targeted "exercise prescriptions" for those suffering from MDD.

Potential limitations
This study may have several limitations. The search will be restricted to English language publications only. Not including unpublished literature may increase the risk of publication bias for the included studies, given that reports that describe a significant finding or a positive outcome have a greater probability of being published than those that do not. In any case, a funnel plot will be used to evaluate the presence of publication bias. Studies focusing on adolescent or child populations will also be excluded from the data set, thereby limiting the number of potentially relevant studies included in this review and meta-analysis. In addition, the inclusion of only validated MDD populations and specific inflammatory molecules may disregard other applicable studies. Sources of heterogeneity across the studies, in criteria such as patient characteristics, MDD criteria used for diagnosis, intervention duration, intervention type, and intervention mode, may also limit meta-analysis findings. If possible, subgroup analyses will be undertaken to evaluate the impact of these sources of heterogeneity.

Data availability
Underlying data
No data are associated with this article.

Selection of the studies to include will be performed by JM and DG. Extraction of data from studies will be conducted by JM and DG. Statistical analyses will be performed by JM and DG. All authors will be involved in the interpretation and analysis of the results, while an appraisal of study quality will be performed by DG, GC and JM. The results and subsequent manuscript will be written and edited by all authors. All authors approved this submission. GC is the guarantor of the review.
When I referred to the 2016 citation, I suggested to the authors to strengthen the need and the novelty of their study, discussing the extant literature and showing how this new paper will further expand the literature. However, they have just added: "We are aware of a systematic review regarding the neurobiological effects of exercise on major depressive disorder, which provided evidence that exercise training had no significant acute or chronic adaptations on any inflammatory marker reviewed in people with MDD20." In the response letter, the authors respond: "We believe that a 6-year time gap is an appropriate timeframe within which to look anew on the data. The additional novelty of our review is its specific focus on how quantified exercise interventions modulate inflammatory biomarkers in MDD patients that have been published to date rather than the authors more expansive review that accounts the adaptations on neurogenesis, inflammation biomarkers and brain structure and how associations between improved depressive symptoms relate to hippocampal volume. We believe that our review will extend the observations around inflammation specifically beyond 2016 and is necessary to restate that the paucity of studies remains in this area of research and that limitations presented within existing studies precludes a definitive conclusion that exercise reduces inflammation in people with MDD."

I agree entirely with the authors. This response is exactly what should be incorporated into the introduction.
The same should be done for the responses to the other questions I've raised.

1. The eligibility criteria are not informative enough. Are the authors including both studies evaluating a single bout of exercise (acute effects) and studies with trials of several weeks of intervention (chronic effects)?
2. If they include studies evaluating the acute effects of exercise, do they need a control group of people with MDD who did not exercise?
3. Data synthesis: Does the author think using the raw mean difference to summarize the effects is appropriate? Cytokine dosage is not standardized, and different kits may have different ranges, sensitivities.
4. Data acquisition: are the authors planning to extract data (means and standard deviations) from figures if these values are not being presented on tables or in the body of the text? If so, how?
5.

1) The eligibility criteria are not informative enough. Are the authors including both studies evaluating a single bout of exercise (acute effects) and studies with trials of several weeks of intervention (chronic effects)?
For clarity, within this review we aim to include all existing studies that evaluate the effect of a single bout of exercise or the effect of repeated exercise training over a longer time period (e.g., weeks to months). To help make this information clear and informative to the reader, we now refer to chronic exercise (defined as repeated bouts of exercise) as simply "chronic" and acute exercise (defined as a single bout of exercise) as "acute". Importantly, the exercise interventions included must provide clear quantification of exercise intensity using a clearly specified unit of measure (for example, work rate specified in watts, joules, heart rate, or a percentage of maximal oxygen uptake or peak oxygen uptake) and a clearly stated duration.

2) If they include studies evaluating the acute effects of exercise, do they need a control group of people with MDD who did not exercise?
We agree that a control group of people with MDD is an important comparator group. The aim is to include those studies that evaluate the effects of an exercise-based intervention, be it a single bout or a longer-term training intervention, on inflammation biomarkers, using controls that have been randomly allocated or created using another mechanism, but not before-and-after studies without controls. Ideally, these would be randomised controlled trials that have used healthy controls and/or depressed participants as controls. We intend in our discussion to comment on some inconsistencies in the existing literature on the definition of a control intervention.

3) Data synthesis: Does the author think using the raw mean difference to summarize the effects is appropriate? Cytokine dosage is not standardized, and different kits may have different ranges, sensitivities.
We thank the reviewer for raising this very good point; the lack of standardisation across the kits used to analyse cytokine levels will be an important discussion point. The standardized mean difference (SMD) will be used to calculate the effect size for each mediator (see for example: Hedges and Olkin, 2014. Statistical Methods for Meta-Analysis. Academic Press.). For studies that evaluate the acute effects of exercise, the effect size will be calculated as the change from pre-test to post-test. The baseline or resting measure taken prior to exercise will be used as the pre-test value. For the post-test value, the measure acquired immediately after exercise will be used. Where discrepancies or differences in sample timings exist, we will indicate this to allow the reader to make an informed interpretation of the observations reported.
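To illustrate why the SMD is less dependent on kit-specific measurement scales than the raw mean difference, the following is a minimal Python sketch of a bias-corrected SMD (Hedges' g) for a two-group comparison. The function name `hedges_g`, its parameters, and the numbers in the usage lines are hypothetical and are not taken from the protocol or from any study; the small-sample correction uses the common approximation J = 1 - 3/(4df - 1), and the standard error uses the usual large-sample variance formula.

```python
import math


def hedges_g(mean_ex, sd_ex, n_ex, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (g, se) for exercise vs. control, using a pooled SD.

    Illustrative helper only; names and arguments are assumptions.
    """
    df = n_ex + n_ctrl - 2
    if df <= 0:
        raise ValueError("need at least two participants per comparison")
    # Pooled within-group standard deviation
    sd_pooled = math.sqrt(
        ((n_ex - 1) * sd_ex ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    )
    if sd_pooled == 0:
        raise ValueError("pooled SD is zero; SMD is undefined")
    d = (mean_ex - mean_ctrl) / sd_pooled  # Cohen's d
    j = 1 - 3 / (4 * df - 1)               # small-sample correction factor
    var_d = (n_ex + n_ctrl) / (n_ex * n_ctrl) + d ** 2 / (2 * (n_ex + n_ctrl))
    return j * d, math.sqrt(j ** 2 * var_d)


# Hypothetical IL-6 values: the same comparison expressed on two scales
g_a, se_a = hedges_g(2.0, 1.0, 20, 3.0, 1.0, 20)
g_b, se_b = hedges_g(0.002, 0.001, 20, 0.003, 0.001, 20)
print(round(g_a, 3), round(g_b, 3))  # the two values agree
```

Rescaling all concentrations by a constant leaves g unchanged, which is the property the sketch is meant to show. It does not correct for differences in assay sensitivity or detection limits between kits, which remain a point for discussion.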

4) Data acquisition: are the authors planning to extract data (means and standard deviations) from figures if these values are not being presented on tables or in the body of the text? If so, how?
Initially, we will contact manuscript authors to request the raw data or the reported data to avoid the need to extract data from figures, should the required data be available. Where no data is available, this will be identified and detailed within the review for transparency. Where data must be extracted from figures because no other source is forthcoming or available, the following will be used. Medians and inter-quartile ranges will be transformed to means (M) and standard deviations (SD) using a web-based tool (Mean Variance Estimation, http://www.math.hkbu.edu.hk/~tongt/papers/median2mean.html). For data that are significantly skewed, an alternative validated procedure will be followed (Wan et al., 2014). Where possible and not otherwise available, numerical data will be extracted from graphs using WebPlotDigitizer (Rohatgi, 2020) and Adobe Acrobat's inbuilt measuring tool (Adobe Systems Inc., California, USA), as previously done by others (Safadi et al., 2021).

I thought it was clear and well presented, but had some minor comments around terminology and clarity.
The purpose of this research is to 'systematically review previous studies that have investigated the effects of quantified exercise interventions on inflammatory cytokine and biomarker levels in individuals diagnosed with MDD'. The statement 'as compared to controls without any other major psychiatric or somatic co-morbidities' is odd as it makes it sound like these are selected on the basis of different criteria.
Can they clarify the scope of 'interventional'? I think that means with controls which can be randomly allocated or created using another mechanism, but not before-and-after studies without controls. Is that correct?
I was a little unclear about the term 'quantified exercise interventions'. I think the distinction they make with PA is clear and helpful, but what does 'quantified' mean? Does this mean that the exercise regime has to have a clearly stated duration and intensity? So the intervention needs to be quantified, even if adherence to the ideal is not perfect? How will they deal with adherence issues in the analysis? Or is this restricted to studies under near-laboratory conditions where adherence is high? This is a little out of my area of expertise so this could be usefully clarified.
I felt that the statement about the review having the function to 'inform areas for further research and identify gaps in the field' could be clarified. What areas particularly?
I always think it is odd to state that systematic reviews and meta-analyses will be 'excluded'. As is made clear later, they will be farmed for studies, not excluded in the same way as other studies.
I am not sure of the meaning of the term 'sub-depressive symptoms'? Do they mean sub-threshold depression?
This is not my area at all, but are there quality criteria for assessment of the inflammatory cytokines and biomarkers?
I did not really understand the phrase 'The risk of confounds will be reduced by independently examining each study for potential confounds and how they were controlled for in the published study design (e.g., inclusion criteria selectivity) or analysis (e.g. employing adjustment techniques or analysis of covariance)'. How does this play out in the analysis?
What is meant by 'intervention type (i.e., acute or chronic)'? Those terms are not used in the description of interventions on page 4. Does that mean intervention FOR acute and chronic depression? I wasn't sure those terms were current in psychiatry.
They state that 'the pooled OR and standard error (SE) will be used to generate a funnel plot in order to estimate the likelihood of publication bias if 10 or more studies are included in the meta-analysis'. Isn't the main analysis using continuous measures? What are the ORs they are referring to?
In limitations, they state 'if possible, subgroup analyses will be undertaken to evaluate the impact of these sources of heterogeneity'. This seems to be something that should be in the methods.
Is the rationale for, and objectives of, the study clearly described? Yes

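I was a little unclear about the term 'quantified exercise interventions'. I think the distinction they make with PA is clear and helpful, but what does 'quantified' mean? Does this mean that the exercise regime has to have a clearly stated duration and intensity? So the intervention needs to be quantified, even if adherence to the ideal is not perfect? How will they deal with adherence issues in the analysis? Or is this restricted to studies under near-laboratory conditions where adherence is high? This is a little out of my area of expertise so this could be usefully clarified.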
'Quantified exercise interventions' means that the exercise intensity has been quantified using a clearly specified unit of measure (for example, work rate specified in watts, joules, heart rate, or a percentage of maximal oxygen uptake or peak oxygen uptake) and that the duration is clearly stated.
The issue of adherence is a great point to raise and will certainly be noted and discussed in detail. We intend to identify where and how studies have dealt with measuring adherence since we currently believe that some but not all studies will have been able to measure and/or report adherence.

I felt that the statement about the review having the function to 'inform areas for further research and identify gaps in the field' could be clarified. What areas particularly?
We aim to inform future exercise-related research that investigates the topic of depression and inflammation by providing a concise summary of the existing evidence thus far and of the conclusions that can be drawn about the response of inflammatory biomarkers to exercise. We also aim to make clear the strengths and weaknesses of current research, to help focus future research towards a better understanding of the mechanisms of action that underpin exercise interventions in stress-related disorders such as depression and, where possible, to bring some clarity about the optimal exercise intensity and/or duration required for a beneficial intervention. We further aim to:
1. Critique study designs to identify where improvements can be made.
2. Understand which biomarkers have been robustly investigated vs. those that have received little or no attention, so that we are able to suggest avenues of exploration for future research (e.g., su-PAR).

I always think it is odd to state that systematic reviews and meta-analyses will be 'excluded'. As is made clear later, they will be farmed for studies, not excluded in the same way as other studies.
We have clarified this statement within the study design paragraph (Page 7, paragraph 1) to now read: "The following study designs will be excluded from this review and meta-analysis: case-studies, qualitative studies or quantitative studies performed on animal models or human populations with depressive symptoms but are not diagnosed with MDD or those with psychiatric or somatic co-morbidities. Systematic reviews and meta-analyses will be used as a resource to identify relevant/eligible studies but will not be reported herein."

I am not sure of the meaning of the term 'sub-depressive symptoms'? Do they mean subthreshold depression?
The term 'sub-depressive symptoms' was used to refer to subclinical symptoms of depression that fall below the threshold to be clearly categorised and diagnosed as major depressive disorder. This phrase has been changed to make it clearer: "The following study designs will be excluded from review and meta-analysis: case-studies, qualitative studies or quantitative studies performed on animal models or human populations with: no clear MDD diagnosis; or with subclinical symptoms of depression; or those with psychiatric or somatic co-morbidities." (Page 7, paragraph 1)

This is not my area at all, but are there quality criteria for assessment of the inflammatory cytokines and biomarkers?
A variety of research tools and experimental methods are available and used for the assessment of inflammatory cytokines and other biomarkers, not all of which are certified as diagnostic in nature. Nevertheless, manufacturers report quality control and performance evaluation features for each assay type, including, for example with ELISA assays, linearity, precision, limits of detection, recovery and stability, and ROC curves. The assessment of biomarker validation and qualification is beyond the scope of this review. It is possible that authors may report information about ROC curves as indicators of accuracy and performance for analysis using ELISA (for example), to convey the sensitivity and specificity and the relevant optimal cut-off value used for ELISA-quantified biomarkers. Currently, it does not appear to be common practice to report such information in this research field; this may be in part due to the high performance (~>0.97) for an ELISA kit today.

I did not really understand the phrase 'The risk of confounds will be reduced by independently examining each study for potential confounds and how they were controlled for in the published study design (e.g., inclusion criteria selectivity) or analysis (e.g., employing adjustment techniques or analysis of covariance)'. How does this play out in the analysis?
This was a phrase we used to try to encapsulate all the PRISMA guidelines in a simple phrase. On reflection, we agree this is not as clear as we would have liked. We have simplified this sentence to read: "The risk of confounds will be explored and identified in each study at the quality assessment stage by examining the published study design (e.g., inclusion criteria, selectivity) or analysis (e.g., employing adjustment techniques or analysis of covariance)." (Page 9, paragraph 5) This information will be collated and reported using the PRISMA flowchart; further, a Cochrane risk of bias assessment will be performed and reported for all eligible studies.

What is meant by 'intervention type (i.e., acute or chronic)'? Those terms are not used in the description of interventions on page 4. Does that mean intervention FOR acute and chronic depression? I wasn't sure those terms were current in psychiatry.
The terms "acute" and "chronic" were meant to refer to the type of exercise training. We very much thank the reviewer for indicating that this is not very clear at present. We have attempted to make it clearer with the inclusion of an explanation: "A typical exercise training intervention requires an individual to repeat bouts of exercise over a short (1-2 weeks) or longer time period (i.e., weeks, months, or years) and can be defined as "chronic exercise" while "acute exercise" is defined as a single bout. For clarity, within this review we now refer to chronic exercise (defined as repeated bouts of exercise) as simply "chronic" and acute exercise (defined as a single bout of exercise) as "acute."" (see objective 3 on Page 6 and Page 9, paragraph 1)

They state that 'the pooled OR and standard error (SE) will be used to generate a funnel plot in order to estimate the likelihood of publication bias if 10 or more studies are included in the meta-analysis'. Isn't the main analysis using continuous measures? What are the ORs they are referring to?
We very much thank the reviewer for highlighting this typo we missed; this should have been SMDs and not ORs here. This has been corrected. The main analysis is using continuous measures.
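As an illustration of the corrected statement, the following is a minimal Python sketch of how funnel plot coordinates could be derived from study-level SMDs and their standard errors. The function name `funnel_points` and all inputs are hypothetical. Centring the funnel on an inverse-variance weighted mean and drawing pseudo 95% limits at 1.96 × SE are assumptions made for this sketch, not the protocol's stated procedure; only the threshold of 10 studies comes from the protocol text.

```python
def funnel_points(smds, ses, min_studies=10):
    """Return (centre, rows) for a funnel plot, or None if too few studies.

    Each row is (smd, se, lower_limit, upper_limit, inside_funnel).
    Illustrative helper only; names and defaults are assumptions.
    """
    if len(smds) != len(ses):
        raise ValueError("smds and ses must have the same length")
    if len(smds) < min_studies:
        return None  # funnel plot only considered with 10 or more studies
    # Inverse-variance weights (assumed choice for the funnel centre)
    weights = [1.0 / se ** 2 for se in ses]
    centre = sum(w * y for w, y in zip(weights, smds)) / sum(weights)
    rows = []
    for y, se in zip(smds, ses):
        lower = centre - 1.96 * se  # pseudo 95% limits at this SE
        upper = centre + 1.96 * se
        rows.append((y, se, lower, upper, lower <= y <= upper))
    return centre, rows


# Hypothetical inputs: nine similar studies and one outlying study
result = funnel_points([0.5] * 9 + [3.0], [0.2] * 10)
if result is not None:
    centre, rows = result
    outside = [row for row in rows if not row[4]]
    print(len(outside), "study/studies outside the pseudo 95% limits")
```

Plotting each SMD against its SE (with the SE axis inverted) and overlaying the limits gives the familiar funnel shape; asymmetry would then be judged visually or with a formal test, which this sketch does not attempt.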

In limitations, they state 'if possible, subgroup analyses will be undertaken to evaluate the impact of these sources of heterogeneity'. This seems to be something that should be in the methods.
We thank the reviewer for highlighting this point and have relocated this sentence to the data synthesis section.

Competing Interests:
No competing interests were disclosed.