
PubMed Health. A service of the National Library of Medicine, National Institutes of Health.

Pavey TG, Anokye N, Taylor AH, et al. The Clinical Effectiveness and Cost-Effectiveness of Exercise Referral Schemes: A Systematic Review and Economic Evaluation. Southampton (UK): NIHR Evaluation, Trials and Studies Coordinating Centre (UK); 2011 Dec. (Health Technology Assessment, No. 15.44.)

6 Economic modelling of cost-effectiveness


There is limited evidence on the cost-effectiveness of ERS. The available evidence highlights significant uncertainty, particularly around the effectiveness of ERS. The result is that decision-makers are currently making decisions on the availability of ERS with only limited evidence on its cost-effectiveness.

In light of this, a de novo analysis has been developed to further explore the cost-effectiveness of ERS. The analysis considers a target population of sedentary adults, with further analysis presented to explore the impact of ERS on those with specific pre-existing conditions, where evidence suggests that ERS might improve outcomes. The approach taken uses previous research as a point of departure, and builds on this through use of evidence synthesis (see Chapter 3) and through further analysis of the impact of PA on HRQoL.

The approach here comprises three main activities:

  1. The development of a cost–utility analysis, similar to earlier analyses, to estimate the impact of ERS on long-term outcomes based on the effectiveness evidence identified herein, including subgroup analysis, to explore the cost-effectiveness of ERS in individuals with pre-existing conditions.
  2. The development of methods to quantify and incorporate short-term benefits of PA into this cost–utility framework.
  3. A cost–consequence framework that summarises the costs and benefits associated with ERS in a disaggregated fashion.

Cost–utility analysis

Cost–utility analysis is widely considered to be the prevailing approach to economic evaluation in the UK, mainly as a result of the guidance laid out in the NICE reference case for economic evaluations.118 There are known to be challenges inherent in applying cost–utility analyses to public-health interventions,119 although the approach has been used previously to estimate the benefits of PA, notably as part of the development of guidance on PA issued by NICE.120

In order to generate generalisable findings in the form of an incremental cost per QALY and also allow for comparison of our findings with earlier analyses, we sought to develop a cost–utility analysis of ERS based on the evidence reported in Chapter 3 of this report.

Methods for cost–utility modelling approach

Modelling approach

Figure 17 illustrates our modelling approach, which is based on the structure of the model developed by NICE.76 A decision-analytic model was developed, which followed a cohort of individuals over time to examine the impact of PA on their health. Specifically, the model considered the lifetime risk of developing a series of conditions that are known to be associated with being physically active. The model considered the impact of ERS on coronary heart disease, stroke and type 2 diabetes, because these are considered to be the conditions for which the most robust quantifiable evidence is available on the relationship between PA and incidence of disease. Furthermore, evidence on the QALY losses associated with the development of these conditions is also available from previous research.84 Although PA has been associated with a wide range of conditions, data limitations meant that no attempt was made to incorporate the effect of PA on other conditions, such as musculoskeletal or respiratory diseases.

FIGURE 17. Model structure.



The model considers a cohort of individuals, aged between 40 and 60 years, who present in a sedentary state. The age of the population was selected to reflect the evidence on the clinical effectiveness of ERS reported in Chapter 3. Individuals enter the model as either exposed to an ERS intervention or not; modelling considers two hypothetical cohorts, comparing costs and outcomes of a cohort exposed to ERS with a control cohort not exposed to ERS. Those exposed to ERS are assumed to have a greater probability of becoming active. A physically active individual is assumed to have both improved life expectancy and quality of life (QoL), as a result of a reduced risk of developing each of the morbidities considered in the model. The primary end point for the analysis was QALYs.

The intervention

The ERS intervention in the model is consistent with the definition used throughout this report (see Chapter 1, Physical activity promotion in primary care). Effectiveness data for ERS are derived from the meta-analysis presented herein (Figure 3). For the purposes of our analysis, we assume that the ERS is leisure centre based, as is the case for the majority of studies considered in Chapter 3. Estimates of the cost of the intervention are derived on this basis.


The comparator for the analysis is ‘usual care’, which is specified as no active intervention and as the recognised alternative in a sedentary population. This acknowledges that some sedentary individuals may choose to participate in PA without an intervention, although the probability of doing so is assumed to increase as a result of exposure to an intervention.


The model adopts an NHS/Personal Social Services perspective, in line with the NICE reference case for cost-effectiveness analysis.121 It is acknowledged that PA may have important effects on non-health-care costs and benefits. These are excluded from the primary/base-case cost–utility analysis, but are addressed in sensitivity analysis and through the presentation of cost–consequence analyses.

Time horizon

A lifetime horizon is adopted to acknowledge the long-term benefits of PA, with alternative time horizons considered in sensitivity analysis.

Model inputs

Data on costs and effects were synthesised to populate the model. Data were primarily derived from the systematic reviews undertaken in Chapters 3 and 4. Further details are provided below.

Effectiveness of exercise referral scheme/comparator

Evidence of the effectiveness of ERS/comparator, measured in terms of the probability of moving from a sedentary state to an active state, was derived from the meta-analysis conducted as part of the clinical effectiveness review in Chapter 3. This was based on ITT analyses, which adjusted for adherence and uptake and showed ERS to be associated with a higher probability (RR 1.11, 95% CI 0.99 to 1.25) of being active compared with usual care (Figure 3). The active state is defined in line with the effectiveness literature, i.e. doing 90–150 minutes of at least moderate-intensity PA per week. Thus, a sedentary lifestyle corresponds not only to non-participation in PA but also to participation below the requisite amount. The active state is assumed to last long enough to enable health benefits to be obtained, although the required duration remains undefined given the inadequate evidence on the dose-response relationship between PA and the incidence of long-term outcomes. Previous analyses of behaviour change have referred to this scenario as ‘fully engaged’122 to describe an individual who makes lasting changes to his or her lifestyle following an intervention.

Risks of developing health states associated with inactivity

Evidence of the effect of PA on the development of the outcomes considered in the model (CHD, stroke and type 2 diabetes) is derived from a systematic review of economic evaluations in Chapter 4 and HSE – 2006.123 The derivation of the estimates involved two steps. First, the probability of developing these conditions among sedentary individuals was generated from the prevalence of these conditions in that population using the HSE – 2006123 data. Although it is acknowledged that a potential limitation of such univariate analyses is that they do not adjust for confounders, data constraints precluded the inclusion of those confounders. The second step involved estimating the probability of developing the health states among active individuals, using RR estimates identified from NICE76 to adjust the estimates derived from the first step. It must be emphasised that the PA levels and study population used to measure the RR estimates match those identified in our clinical effectiveness review. A number of assumptions were made in generating these estimates. First, the risk estimates were assumed to be equivalent to the risk of developing those conditions over a lifetime. Second, the risk of experiencing any of these health states was assumed to be independent of the risk of experiencing other health states. Third, individuals were assumed to experience only one health state within the model.
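As an illustration of the second step, the sketch below scales a sedentary-population probability by an RR to obtain the corresponding probability for active individuals. The function name and all numerical inputs are hypothetical and are not the values used in the model; this is a minimal sketch in Python under those assumptions, not the model's implementation.

```python
def risk_if_active(risk_sedentary: float, relative_risk: float) -> float:
    """Scale a sedentary-population risk by the relative risk for active people."""
    if not 0.0 <= risk_sedentary <= 1.0:
        raise ValueError("risk_sedentary must be a probability")
    if relative_risk < 0.0:
        raise ValueError("relative_risk must be non-negative")
    # Cap at 1 so the result is always a valid probability.
    return min(1.0, risk_sedentary * relative_risk)


# Hypothetical inputs for illustration only (not taken from the report).
sedentary_risk = {"CHD": 0.10, "stroke": 0.05, "type 2 diabetes": 0.08}
rr_active = {"CHD": 0.90, "stroke": 0.85, "type 2 diabetes": 0.70}

# Each condition is adjusted separately, mirroring the independence assumption.
active_risk = {
    condition: risk_if_active(sedentary_risk[condition], rr_active[condition])
    for condition in sedentary_risk
}
```

Because each condition is adjusted on its own, the sketch reflects the stated assumption that the risks of the three health states are independent of one another.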

Exercise referral scheme intervention costs

The cost of the ERS intervention was derived from previously published research identified as part of the review conducted for this study. The study by Isaacs et al.,61 presenting a detailed bottom-up costing exercise, was identified via a systematic review of the literature, and is regarded here as the best available evidence/estimate for costing of ERS. The estimated cost of the intervention was based on resource use in a health service and/or local authority setting, consistent with the primary perspective taken for analyses here. See Table 26 for further details (information on the calculation of these costs can be found in Isaacs et al.61). The validity of the cost estimates was assessed by the expert advisory group on this project and judged to be representative of ERS schemes currently in operation. The cost estimates were adjusted for inflation into 2010 prices using the Consumer Price Index. Discounting of the intervention costs was not undertaken as intervention costs were assumed to be wholly incurred in the first year. No attempt was made to estimate a net cost of the intervention, which subtracts any cost savings that might result from ERS from the cost of the intervention. Where this was explored in the systematic review in Chapter 4 (Isaacs et al.,61 Gusi et al.70), there was no clear evidence of a change in health-care utilisation (e.g. medications, hospital or primary care) as a result of the intervention.
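The inflation adjustment described above can be sketched as a simple index ratio. The function name and the index values in the example are hypothetical placeholders, not Consumer Price Index figures or costs from the report.

```python
def inflate_cost(cost: float, index_base_year: float, index_target_year: float) -> float:
    """Re-express a cost in target-year prices using a price index ratio."""
    if index_base_year <= 0.0 or index_target_year <= 0.0:
        raise ValueError("price index values must be positive")
    return cost * index_target_year / index_base_year


# Hypothetical example: a cost of 100 in the base year, with the index
# rising from 100 to 110, becomes 110 in target-year prices.
cost_in_target_prices = inflate_cost(100.0, 100.0, 110.0)
```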

Treatment costs and quality-adjusted life-years associated with coronary heart disease, stroke and type 2 diabetes

The model considers three outcomes associated with PA: CHD, stroke and type 2 diabetes. The total lifetime treatment costs and QALYs associated with each condition were estimated based on assumptions relating to the age at onset and the likely life expectancy, combined with estimates of the annual cost of treating an individual with the condition. This approach was in line with the earlier analysis conducted by NICE.76

It was assumed that the treatment cost of stroke, unlike that of the other health states, was an event cost that occurs once, rather than a recurring cost. This is acknowledged as a simplification in the model, as in reality there are likely to be acute and ongoing costs associated with stroke. Treatment costs were discounted using the prevailing discount rates as determined by the Treasury and/or NICE guidelines (i.e. a discount rate of 3.5%).

Primary outcome measure (quality-adjusted life-years)

The primary outcome of the economic evaluation is expressed in terms of QALYs. QALY losses associated with each of the conditions considered in the model are calculated. QALYs were discounted at a rate of 3.5%. The formula for calculating the QALYs is:

[Equation 1]

where Q1 = mean QoL associated with being in a non-disease health state; Q2 = mean QoL associated with a particular disease health state; ts = number of years before onset of the disease health state (average age minus 55 years); t3 = age at disease health state onset and t4 = mean age of mortality associated with health state (average age of mortality minus loss of life-years associated with the particular condition). Loss of life-years was calculated by subtracting life-years remaining after onset of the disease health state from the average life-years remaining for the non-disease health state.
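The equation itself is not reproduced here, so the following is only a sketch of one plausible discounted-QALY calculation consistent with the definitions above. It assumes whole years, annual discounting at 3.5% with the first year undiscounted, and a constant QoL weight within each period. All function and parameter names are hypothetical.

```python
def discounted_sum(value_per_year: float, start_year: int, n_years: int,
                   rate: float = 0.035) -> float:
    """Present value of a constant annual amount accrued for n_years,
    beginning start_year years from now (year 0 is undiscounted)."""
    return sum(
        value_per_year / (1.0 + rate) ** t
        for t in range(start_year, start_year + n_years)
    )


def lifetime_qalys(q_healthy: float, q_disease: float,
                   years_before_onset: int, years_with_disease: int,
                   rate: float = 0.035) -> float:
    """Discounted QALYs: time without the disease, then time with it until death."""
    healthy = discounted_sum(q_healthy, 0, years_before_onset, rate)
    diseased = discounted_sum(q_disease, years_before_onset, years_with_disease, rate)
    return healthy + diseased
```

Under the same assumptions, `discounted_sum` could also be applied to a recurring annual treatment cost, since costs and QALYs are discounted at the same rate in the analysis.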

Assessment of uncertainty

Uncertainty in parameter estimates was explored through the use of deterministic and probabilistic sensitivity analyses (PSAs). The deterministic sensitivity analysis, which covered one-way and scenario analysis, explored a number of uncertainties that were recognised at the outset of the analysis. These included uncertainties around the effectiveness of ERS and changes in the cost of ERS to take into account costs incurred by participants as well as providers. The effectiveness of ERS was varied according to estimates of uncertainty reflected in the upper and lower limits of the 95% CI of the RR estimate. Sensitivity analysis also considered how a less intensive form of ERS might look, using evidence on a walking-based intervention (as opposed to a structured leisure centre-based intervention) from Isaacs et al.61 Further sensitivity analyses considered ‘best-case’ and ‘worst-case’ scenarios that examined the combined effect of extreme values of effectiveness and cost.

In addition, uncertainties around parameters considered to be key drivers of the cost-effectiveness of ERS were addressed simultaneously using PSAs. The parameters that had different unit values in the two arms of the model (i.e. the probability of being active and the probability of developing the disease conditions) were specified as incremental differences between the two arms and not as absolute values. The intuition is that the distributions of these parameters may be correlated and, hence, representing them as absolute values may overestimate the uncertainty. The choice of distributions and the calculation of their alpha and beta parameters were based on Briggs et al.124 In cases where there were no data on standard errors (SEs), the standard approach of using 10% of the mean estimate as the SE was followed. A total of 10,000 Monte Carlo simulations were generated for the PSA.
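The sketch below illustrates how distribution parameters might be derived from a mean and an SE and then sampled. It uses the common method-of-moments formulae for beta and gamma distributions; whether these match the exact calculations used in the report is an assumption, as are the function names, the seed and the example means, none of which come from Table 48.

```python
import random


def beta_params(mean: float, se: float) -> tuple[float, float]:
    """Method-of-moments (alpha, beta) for a probability parameter."""
    variance = se ** 2
    if not 0.0 < mean < 1.0 or variance <= 0.0 or variance >= mean * (1.0 - mean):
        raise ValueError("mean and SE are not compatible with a beta distribution")
    common = mean * (1.0 - mean) / variance - 1.0
    return mean * common, (1.0 - mean) * common


def gamma_params(mean: float, se: float) -> tuple[float, float]:
    """Method-of-moments (shape, scale) for a positive parameter such as a cost."""
    if mean <= 0.0 or se <= 0.0:
        raise ValueError("mean and SE must be positive")
    return (mean / se) ** 2, se ** 2 / mean


def default_se(mean: float) -> float:
    """Fallback when no SE is reported: 10% of the mean estimate."""
    return 0.10 * abs(mean)


def draw(rng: random.Random, n: int, prob_mean: float, cost_mean: float):
    """Draw n (probability, cost) pairs from beta and gamma distributions."""
    alpha, beta = beta_params(prob_mean, default_se(prob_mean))
    shape, scale = gamma_params(cost_mean, default_se(cost_mean))
    return [
        (rng.betavariate(alpha, beta), rng.gammavariate(shape, scale))
        for _ in range(n)
    ]


# Hypothetical means for illustration only.
rng = random.Random(2011)
samples = draw(rng, 10_000, prob_mean=0.30, cost_mean=200.0)
```

For simplicity the sketch samples absolute values of a probability and a cost, whereas the analysis described above specifies the arm-specific parameters as incremental differences.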

Model validation

The following procedures were employed to check the validity of the model (Chilcott et al.125):

  1. Internal validation: a series of changes was simulated in the input values that are likely to vary the results of the model, with checks that the impacts on the results were as expected. For example, all QALY parameters were set to zero to check that the QALY output in each arm was zero. In addition, the model was reviewed by an experienced health economist who was not part of the research team.
  2. Peer review: a peer-review process in which a modeller who understands the complexities of the model scrutinised the model spreadsheet and the formulae behind it.


Costs of exercise referral schemes

Estimates of the cost of ERS were derived from a detailed, bottom-up costing exercise conducted as part of a previous health technology assessment (Isaacs et al.61) and inflated to current prices. Estimates of the intervention costs are presented in Table 38 (see Table 26 for details).

TABLE 38. Intervention costs estimates.



Effectiveness of exercise referral schemes

Estimates of the effectiveness of ERS on PA levels were derived from the meta-analyses conducted in Chapter 3. These are reported in Table 39.

TABLE 39. Inputs used in the model.



Estimates of the outcomes associated with physical activity

Tables 40–42 summarise the derivation of the outcomes associated with PA. Firstly, the probability of experiencing an outcome (CHD, stroke or type 2 diabetes) considered in the model is generated based on the earlier analysis conducted by NICE.76 This is reported in Table 40.

TABLE 40. Probability of experiencing an outcome associated with PA.



TABLE 42. Lifetime treatment costs/QALYs associated with health states.



Estimates of the QALYs associated with each outcome in the model are derived by multiplying the utility of being in a particular health state with the life expectancy in that health state. Life expectancy is derived by assuming an average age at onset. Assumptions about the average age at onset of a health state and the utility of health states were derived from the model developed by NICE.76 These are reported in Table 41.

TABLE 41. Inputs used in calculating QALYs/treatment costs.



The lifetime treatment costs/QALYs for an individual in each health state are summarised in Table 42. Among the conditions included in the model, type 2 diabetes incurred the largest treatment cost and stroke the least, although it should be noted that stroke was considered as an event, whereas other chronic outcomes were associated with ongoing treatment costs.

Estimating the cost-effectiveness of exercise referral schemes

Table 43 shows the estimated ICER of the base-case analyses using a cohort of 1000 individuals and a lifetime horizon. Total costs and outcomes are divided by the cohort size (1000) to generate per-person estimates of costs and benefits. The ICER was calculated with respect to the standard comparator ‘usual care’. Compared with usual care, ERS is marginally more expensive, with additional costs of £169.54, with an incremental QALY gain of 0.008 (i.e. eight QALYs gained in the total cohort). The base-case cost per QALY of ERS compared with usual care is £20,876. If adopting a willingness-to-pay threshold of £30,000, as used by NICE, these findings indicate a net health gain, and suggest that ERS is a cost-effective use of resources.
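The ICER arithmetic can be sketched as follows, using the rounded per-person figures quoted above. The function names are hypothetical. Dividing £169.54 by the rounded gain of 0.008 gives roughly £21,193 rather than the reported £20,876, presumably because the reported ICER is based on the unrounded QALY gain (about 0.0081).

```python
def icer(delta_cost: float, delta_qaly: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if delta_qaly == 0.0:
        raise ValueError("ICER is undefined when the QALY difference is zero")
    return delta_cost / delta_qaly


def describe(delta_cost: float, delta_qaly: float) -> str:
    """Classify the intervention relative to the comparator."""
    if delta_cost > 0.0 and delta_qaly <= 0.0:
        return "dominated"  # more costly and no more effective
    if delta_cost <= 0.0 and delta_qaly > 0.0:
        return "dominant"   # no more costly and more effective
    return "trade-off"      # compare the ICER with a threshold


# Rounded per-person figures from the text.
base_case_icer = icer(169.54, 0.008)
```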

TABLE 43. Base-case cost-effectiveness results comparing ERS with usual care.



Deterministic sensitivity analysis

Deterministic sensitivity analysis was carried out around parameters with known uncertainty. The sensitivity analyses conducted are summarised in Table 44. Table 45 shows the impact of the variation in parameter estimates (one-way analysis) on the cost-effectiveness of ERS. Assuming a less intensive ERS or a more effective ERS resulted in an ICER below £30,000 and lower than the base case. On the other hand, including intervention costs to participants led to an ICER above £30,000, and a less effective ERS resulted in ERS being dominated by usual care (negative ICER), i.e. ERS is more expensive and leads to a loss of health gains.

TABLE 44. Deterministic one-way sensitivity analysis inputs.



TABLE 45. Cost-effectiveness results (after one-way sensitivity analyses) comparing ERS with usual care.



Further analyses were conducted which considered ‘best-case’ and ‘worst-case’ scenarios for ERS. These scenarios are summarised in Table 46. The findings of the analysis are presented in Table 47. In the worst-case scenario, ERS was dominated by the comparator. In the best-case scenario, the ICER fell to under £700 per QALY. These findings of the deterministic sensitivity analysis (excluding the dominated cases) are presented in the form of a tornado diagram (Figure 18) to illustrate the relative magnitude of effect of changing each of the parameter values or scenarios. Overall, the cost-effectiveness was found to be most sensitive to changes in the scenarios (best cases of cost and effectiveness).

TABLE 46. Deterministic scenario sensitivity analysis inputs.



TABLE 47. Cost-effectiveness results (after scenario sensitivity analyses) comparing ERS with usual care.



FIGURE 18. Impact of deterministic sensitivity analysis on base case ICER (£20,876.27).



Probabilistic sensitivity analysis

Probabilistic sensitivity analysis, based on 10,000 simulations, was also conducted. A summary of the distributions adopted in the PSA is presented below in Table 48.

TABLE 48. Probabilistic sensitivity analysis inputs.



A scatterplot of the probabilistic findings, showing simulated estimates of cost difference against QALY difference between ERS and usual care, is provided in Figure 19. The scatterplot shows that all the simulations generated an improved effectiveness of ERS, but also a higher cost than usual care (i.e. all points were in the north-east quadrant of the cost-effectiveness plane). This reflects the relatively modest uncertainty around the cost of the intervention and assumptions about the distribution of uncertainty around the estimates of effect size.

FIGURE 19. Cost-effectiveness plane showing the scatter plot of 10,000 Monte Carlo simulations for ERS compared with usual care.



The decision as to whether or not these findings can be considered cost-effective depends on the maximum amount decision-makers are willing to spend to obtain an additional unit of effectiveness (in this case, a QALY). This can be best presented in the form of a cost-effectiveness acceptability curve, as presented in Figure 20. At a threshold of £20,000 there is a 0.508 probability that ERS is cost-effective. This increases to 0.879 when a threshold of £30,000 is considered.
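Each point on an acceptability curve can be computed as the share of simulations in which the incremental net monetary benefit is positive at a given threshold. The sketch below shows this calculation; the function name and the four simulated pairs are hypothetical and are not output from the model.

```python
def prob_cost_effective(delta_costs, delta_qalys, threshold: float) -> float:
    """Share of simulations with positive incremental net monetary benefit."""
    pairs = list(zip(delta_costs, delta_qalys))
    if not pairs:
        raise ValueError("at least one simulation is required")
    favourable = sum(1 for dc, dq in pairs if threshold * dq - dc > 0.0)
    return favourable / len(pairs)


# Hypothetical simulated increments (cost in pounds, effect in QALYs).
sim_costs = [100.0, 100.0, 100.0, 100.0]
sim_qalys = [0.002, 0.004, 0.006, 0.008]

# Evaluating several thresholds traces out the curve.
curve = {
    threshold: prob_cost_effective(sim_costs, sim_qalys, threshold)
    for threshold in (20_000, 30_000)
}
```

With real PSA output, the two lists would hold the 10,000 simulated cost and QALY differences, and a finer grid of thresholds would give a smooth curve.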

FIGURE 20. Cost-effectiveness acceptability curve showing the probability of cost-effectiveness for ERS at varying levels of threshold.



Subgroup analysis of exercise referral schemes in individuals with pre-existing conditions

The remit of this HTA report was to examine the clinical effectiveness and cost-effectiveness of ERS in individuals with a pre-existing condition. The cost-effectiveness evidence reviewed in Chapter 4 captured relatively little existing evidence on such individuals. Rather, ERS was used to mitigate against unhealthy behaviours or risk factors for future conditions.

The aim of this section is to evaluate the cost-effectiveness of ERS in people with a diagnosed condition known to benefit from PA. We focused on the top three conditions (Table 49) that have been found to benefit most from increases in PA (BHFNC34): obesity, hypertension and depression (see Appendix 1, Figure 21, for the full list).

TABLE 49. Inputs used in the model.



Methods for subgroup analysis in individuals with pre-existing conditions

The subgroup analysis is based on the same framework for cost–utility analysis reported above. The model was adjusted to reflect differences in the underlying risk of developing each of the morbidities in the model (CHD, diabetes and stroke), according to the existence of a pre-existing condition. The values (Tables 38–42) of other parameters (i.e. efficacy of ERS/control, costs and utilities associated with health events) from the base-case model are assumed to hold for these cohorts. Analysis was run separately for each of the disease-specific cohorts. Table 49 shows the data inputs and the data sources used for the probabilities of experiencing the health states in the respective cohorts. The sources for data were selected based on their relevance to our methodology (e.g. age and gender characteristics) and their methodological rigour. Calculation of these probabilities follows the approach in the base case. Data insufficiency precluded the fitting of different probabilities for all health states in all cohorts. In the absence of incidence data to generate the probabilities (e.g. CHD in the obese cohort), we used mortality data, on the assumption that the probability of experiencing that health state was similar to the probability of death related to that condition. Also, in cases where data were observed for cardiovascular disease (in the obese and hypertensive cohorts), it was assumed that those probabilities hold for both stroke and CHD.


Table 50 presents the estimated ICER for the disease-specific cohorts. For each of the conditions considered, the ICER is lower than the base case, reflecting the increased likelihood of developing one of the morbidities considered in the model if the individual has a pre-existing condition. Compared with usual care, ERS in these cohorts remains more costly (albeit less so than in a general population cohort). In terms of effectiveness, ERS (compared with usual care) is more effective, leading to improved QALY gains that are higher than in the base case (ranging from 0.011 to 0.017). The cost per QALY of ERS compared with usual care is between £8414 and £14,618, and thus ERS can be considered cost-effective at the NICE threshold.

TABLE 50. Cost-effectiveness results (disease specific cohorts) comparing ERS with usual care.



Summary of the cost–utility analysis

Our analysis attempts to estimate the cost-effectiveness of ERS using a cost–utility analysis framework similar to that used in previous analyses (NICE 200676). Our base-case assumptions result in a favourable cost-effectiveness ratio of £20,876 per QALY gained from ERS compared with usual care. It should be acknowledged that our base-case estimate includes some optimistic assumptions with respect to cost and effectiveness. However, our deterministic sensitivity analyses and PSAs suggest that there is a low possibility of the ICER increasing above £30,000 when these assumptions are relaxed.

Analysis of ERS in groups of individuals with pre-existing conditions suggests that it may be more cost-effective in these groups than in a general sedentary population. ERS is frequently prescribed to individuals with risk factors for CVD. Our subgroup analysis includes populations with obesity and hypertension to reflect these individuals. In these groups, the cost per QALY of ERS falls to around £11,000. In a population with depression, ERS cost-effectiveness is more favourable, generating an ICER of approximately £8000. Given the higher risk of developing the long-term illnesses considered in the model in these groups, it is not surprising that the subgroup analyses produce more favourable ICERs. This is an encouraging finding and suggests that it might be possible to target ERS to individuals with pre-existing conditions in which the pay-offs/impact may be higher. However, there remain some major uncertainties over whether or not the evidence used to populate the model, derived from the meta-analysis, is applicable to these groups. There may be good reason to believe that uptake, adherence and effectiveness might differ according to the characteristics of the recipients. Although we have attempted to adjust the model to take into account differences in the rate of long-term illnesses, no data were identified as part of the effectiveness review to allow for adjustment of the effect of ERS in different populations. There is a pressing need for better primary evidence to inform these uncertainties.

Although our cost-effectiveness estimates suggest that ERS is a cost-effective use of NHS resources, it should be noted that the individual-level lifetime QALY gains are relatively modest (< 0.01 in our base-case analysis). This estimate is predicated on the evidence of effectiveness derived from the meta-analysis presented earlier in this report. We believe that the meta-analysis has provided the most robust estimate to date of the effectiveness of ERS compared with usual care. However, it should be acknowledged that the cost-effectiveness analysis is attempting to capture lifetime benefits based on evidence of relatively modest effect sizes derived from short-term studies. Any such analysis inevitably involves some assumptions about the degree to which behaviour change is lasting and fails to consider other health behaviours that may impact on long-term outcomes. The result is that the cost-effectiveness analysis estimates that ERS has a modest lifetime cost and a marginal lifetime QALY gain. Even small changes in the source data used to populate the model, particularly evidence of effect size and cost, may lead to significant changes in the resulting ICER. This can best be illustrated through consideration of the net benefit calculation. If we value each QALY gained at £30,000 and accept that our analysis is generating a lifetime QALY gain of approximately 0.008 in most cases, then the value of the benefits generated in monetary terms is approximately £240, which exceeds the cost of the intervention. However, even a modest reduction in the lifetime QALY gain, to below approximately 0.0057 (the base-case incremental cost of £169.54 divided by £30,000), would result in the costs exceeding the benefits, making the cost-effectiveness of ERS questionable.
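The net benefit reasoning can be sketched using the base-case figures reported in this chapter (an incremental cost of £169.54, a QALY gain of approximately 0.008 and a threshold of £30,000 per QALY). The function names are hypothetical.

```python
def net_monetary_benefit(delta_qaly: float, delta_cost: float, threshold: float) -> float:
    """Monetary value of the QALY gain minus the incremental cost."""
    return threshold * delta_qaly - delta_cost


def break_even_qaly(delta_cost: float, threshold: float) -> float:
    """QALY gain at which the value of benefits exactly equals the cost."""
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    return delta_cost / threshold


# Base-case figures quoted in this chapter.
nmb = net_monetary_benefit(0.008, 169.54, 30_000)  # about 70 pounds per person
minimum_gain = break_even_qaly(169.54, 30_000)     # about 0.0057 QALYs
```

The small margin between the base-case gain and the break-even gain shows why the conclusion is sensitive to modest changes in the estimated effect.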

Although sensitivity analysis has sought to address this point, it should be acknowledged that, in many cases, source data were derived from a single study (e.g. cost data from Isaacs et al.61) and it was necessary to fit distributions to parameters to allow for PSA. Although every effort has been made to explore uncertainty, there is a possibility that the uncertainty around parameter estimates may be greater than predicted within our analysis, which would have a material impact on the ICER.

Although some caution should be taken in interpreting the findings, the authors would wish to emphasise that the estimates of cost-effectiveness generated are believed to be conservative. Our approach generates a partial analysis that considers only the impact of ERS on a number of morbidities known to be associated with PA. The impact on other morbidities was excluded owing to limitations in the available evidence. On this basis, our estimates of cost-effectiveness should be regarded as conservative, as we have made no attempt to quantify these benefits within our analysis.

Limitations of the analysis

The analysis had a number of limitations which should be acknowledged. First, we examine only the long-term impact of PA on selected morbidities. It was not possible to include other morbidities that may be affected by PA owing to uncertainty over the relationship between PA, incidence and quality-adjusted life expectancy. Nor does our model account for potential negative outcomes of PA, such as injuries. Although the risk of injury may be an important determinant in taking up PA, particularly in the elderly, the evidence suggests that injuries are rare (Munro et al.48), and they are not expected to significantly affect results when considered at a population level. Another set of limitations includes assumptions relating to constant and independent risk of experiencing disease health states and age at onset of disease. These assumptions were derived from the NICE 2006 report76 and were meant to allow our analysis to be comparable with previous research. Although we recognise that these assumptions are limiting, their impact on the ICER, when investigated through sensitivity analysis, was considered minimal.

A number of other weaknesses in the model design were identified which were prioritised for further analysis. These include:

  • the potential to capture the short-term improvements in QoL associated with PA (process benefits), which may be particularly important in certain groups, such as those who are prescribed PA for mental-health problems, such as depression
  • the wide range of health benefits associated with increases in PA, including mental health, cancer and musculoskeletal conditions, which are currently excluded from the analysis.

These points are addressed in the remaining sections of this chapter, first through further development of the cost–utility analysis and subsequently through the development of a cost–consequence framework that allows for consideration of other health and non-health costs and benefits that might be associated with ERS.

Further development of the cost–utility analysis to include short-term quality-adjusted life-year gains resulting from physical activity

The previous section highlighted the need to consider the short-term improvements in QoL (e.g. improved mental health) that might result from increased PA, as well as longer-term impacts on common conditions. A key step in achieving this is to estimate the HRQoL gain associated with increases in PA. This section seeks to address this point by first estimating the short-term QoL gain associated with PA using econometric models, and, second, incorporating the estimated QoL gains into the base-case model, reported above, to generate a revised ICER.

Participation in PA has been found to lead to enhanced QoL, an effect that is consistent across socioeconomic details.131 Nonetheless, to date, economic evaluations of exercise interventions have rarely accounted for these QoL gains. A notable exception is Beale et al.,84 who included QoL gains associated with a unit increase in PA and found a favourable impact on the ICERs generated for environmental interventions to promote PA. Therefore, this section attempts to build on previous analyses by demonstrating the impact of including the QoL gains associated with an active state (achieved via, say, ERS) on the cost-effectiveness of ERS.

Methods for further development of the cost–utility analysis to include short-term quality-adjusted life-year gains resulting from physical activity


Data from HSE – 20086 have been used to conduct econometric analyses to explore and estimate the impact of PA on HRQoL. The HSE is a routine cross-sectional survey that draws a nationally representative sample of persons residing in private households in England. The sample and focus of the survey vary each year. Data from the 2008 survey were used in this study and included a sample of 9191 households with 15,102 adults aged 16 years or over, and a total child sample of 7521. This study draws on data for 5537 observations of 40- to 60-year-olds among the adult sample. Sampling was based on a multistage stratified random sampling design that uses the Postcode Address File as a sampling frame. The primary focus of HSE – 20086 was PA and fitness. The method of data collection involved the use of face-to-face interviews, self-completion questionnaires, clinical measurements and physical measurements (including objective measurements of PA via accelerometers). To compensate for seasonal variation in responses, the time period for interviews covered January to December 2008, with the fieldwork spanning from January 2008 to April 2009.

Health-related quality of life

Health-related quality of life is measured in the HSE survey using the EQ-5D, with the summary measure of HRQoL (or health-state utility value) derived from the EQ-5D responses.132 These utility scores were generated using the descriptive system of the EQ-5D questionnaire (UK version), a standard HRQoL instrument with preference weights which are attached to combinations of responses. The EQ-5D descriptive system describes HRQoL in five dimensions (i.e. mobility, self-care, usual activities, pain/discomfort and anxiety/depression), with each dimension including three levels: no problems, some/moderate problems, and severe/extreme problems. Different health states are created from the responses to the descriptive system of the EQ-5D by combining one level from each of the dimensions. A tariff is then applied to these health states to generate utility scores.132 The utility scores usually range from ‘1’ (perfect health) to ‘0’ (death), with states that are perceived to be worse than death having a negative utility score.
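The step from EQ-5D responses to a utility score can be illustrated with a minimal sketch. The chapter does not state the tariff coefficients, so the decrements below are an assumption: they are the commonly cited UK time trade-off values, and the function and variable names are purely illustrative.

```python
from itertools import product

# ASSUMPTION: decrements from the commonly cited UK time trade-off value set.
# The chapter itself does not list the tariff, so treat these as illustrative.
CONSTANT = 0.081  # subtracted once if any dimension is above level 1
N3 = 0.269        # subtracted once if any dimension is at level 3
DECREMENTS = {
    "mobility":           {1: 0.0, 2: 0.069, 3: 0.314},
    "self_care":          {1: 0.0, 2: 0.104, 3: 0.214},
    "usual_activities":   {1: 0.0, 2: 0.036, 3: 0.094},
    "pain_discomfort":    {1: 0.0, 2: 0.123, 3: 0.386},
    "anxiety_depression": {1: 0.0, 2: 0.071, 3: 0.236},
}
DIMENSIONS = tuple(DECREMENTS)  # insertion order = order of digits in a state


def eq5d_utility(state):
    """Return the utility for a five-digit health state such as '21232'."""
    levels = [int(ch) for ch in state]
    if len(levels) != 5 or any(level not in (1, 2, 3) for level in levels):
        raise ValueError("state must be five digits, each 1, 2 or 3")
    if all(level == 1 for level in levels):
        return 1.0  # no problems on any dimension
    utility = 1.0 - CONSTANT
    for dimension, level in zip(DIMENSIONS, levels):
        utility -= DECREMENTS[dimension][level]
    if 3 in levels:
        utility -= N3
    return round(utility, 3)


# One level from each of five dimensions gives 3**5 possible health states.
ALL_STATES = ["".join(map(str, combo)) for combo in product((1, 2, 3), repeat=5)]
```

Under these assumed values, the state with severe problems on every dimension scores below zero, which is the "worse than death" case described above.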

Physical activity

As shown in Table 51, PA in the HSE – 20086 is measured/assessed via (1) specific activities – including walking and sports – and (2) a composite indicator – a combination of different types of PA (i.e. walking, housework, occupational activity and sports/exercise). The composite indicator was captured through either subjective (self-reports) or objective (accelerometers) measurements. Each of these activities is operationalised as a binary variable indicating being ‘physically active’ or not. The variable takes the value of 1 if PA (defined as a minimum of 90 minutes of at least moderate-intensity PA) was done per week, and zero otherwise (not physically active). This definition of ‘physically active’ is consistent with the approach in the literature on ERS (see Chapter 3), and was adopted to allow future modelling of the cost-effectiveness of ERS.

TABLE 51. Specification of indicators of PA.


Control variables

A set of sociodemographic, economic, health and other variables that have been found in the literature to be correlates of HRQoL were considered as covariates. Table 52 lists these variables and a priori expectations about the direction of their correlation with HRQoL (see Appendix 7, Table 62, for references). In developing the expected signs, consideration was given to the methodology (e.g. the specification of the dependent variable and the control variable; the origin and characteristics of the sample) used by the studies reporting those findings.

TABLE 52. Overview of control variables.


Methods of statistical analysis

Means [standard deviation (SD)] and proportions were calculated for continuous and categorical data, respectively. The chi-squared and Fisher's exact tests were used to check the association between the HRQoL (dependent variable) and dummy variables representing item non-response for independent variables in order to examine the mechanisms under which the missingness occurred (i.e. missing completely at random or not).124 If the pattern of missingness did not occur completely at random, a regression-based imputation method was used to replace missing values of continuous variables and a dummy variable specifying item non-response was added. For the categorical variables, item non-response was included in the omitted category and a dummy variable for item non-response was created.133

Tobit regression with upper censoring at 1.0 and robust SEs were used to model the relationship between HRQoL and indicators of PA controlling for potential confounders (covariates). Separate Tobit regressions were fitted for each of the indicators of PA to avoid unstable estimates resulting from the collinearity among those indicators. In each case, two models were used: (1) a model that excludes missing observations and (2) a model that includes missing observations. The models were estimated with sampling weights that were calculated as the inverse of the probability of being a respondent in a household multiplied by the household weight, which accounts for non-responding households.134 Reduced models were derived for each of the regression models by identifying and removing independent variables that were not statistically significant via stepwise regression. Categories of significant categorical variables that were dropped by the stepwise regression were added back into the model, after which variables with the largest p-value (average p-value for categorical variables) were removed one by one, until the reduced model had only significant variables. The Wald test was used to test significance of variable/variables before their removal.135
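For readers less familiar with the Tobit model, the estimation problem described above can be written out as follows. This is a generic textbook formulation under the stated upper censoring at 1.0, not a specification taken from the report; the symbols ($y_i^{*}$ for latent HRQoL, $\mathrm{PA}_i$ for the binary activity indicator, $\mathbf{x}_i$ for the covariates, $w_i$ for the sampling weight) are notation introduced here for illustration.

```latex
\begin{aligned}
y_i^{*} &= \mathbf{x}_i^{\top}\boldsymbol{\beta} + \gamma\,\mathrm{PA}_i + \varepsilon_i,
  \qquad \varepsilon_i \sim N(0,\sigma^{2}),\\
y_i &= \min(y_i^{*},\,1),\\
\ln L &= \sum_{i:\,y_i<1} w_i \ln\!\left[\frac{1}{\sigma}\,
  \phi\!\left(\frac{y_i-\mu_i}{\sigma}\right)\right]
  + \sum_{i:\,y_i=1} w_i \ln\!\left[1-\Phi\!\left(\frac{1-\mu_i}{\sigma}\right)\right],
  \qquad \mu_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + \gamma\,\mathrm{PA}_i .
\end{aligned}
```

Here $\phi$ and $\Phi$ are the standard normal density and distribution functions, and $\gamma$ is the coefficient on the PA indicator, the quantity of interest in each of the four models.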

Specification errors and goodness-of-fit of regression models were examined using the linktest5,136 and penalised log-likelihood values via the Akaike information criterion (AIC) and Bayesian information criterion (BIC),137 respectively. [The idea behind the linktest is that if a regression model is well specified, extra independent variables that are significant should be found only by chance. The linktest works by creating two variables (i.e. the variable of prediction and the variable of squared prediction), after which the model is fitted with these two variables. The null hypothesis is that there is no specification error. This is checked by looking at the statistical significance of the variable of squared prediction, which should not be a statistically significant predictor (at 5%) if the null hypothesis is to be accepted.] In addition, pseudo-R2 was computed by calculating the R2 between the predicted and observed values.138 The existence of multicollinearity among independent variables was assessed to ascertain whether or not they lie within tolerance ranges.139,140 [This was measured by the variance inflation factor (VIF), which measures the amount of inflation of the SE that is caused by collinearity, and by ‘tolerance’, which shows the amount of collinearity a regression model can tolerate. A tolerance value of 0.1 or less, and a VIF of 10 or more, shows a variable to be highly collinear and, hence, likely to provide imprecise estimates.] The threshold for statistical significance was set at ≤ 10% in all analyses. All analyses were undertaken using Stata version 10 (StataCorp LP, College Station, TX, USA).
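The collinearity and fit diagnostics named above follow standard formulas, sketched below. The function names are illustrative, and the inputs (an auxiliary R2 from regressing one covariate on the others, a model log-likelihood) are assumed to come from whatever estimation software is in use; nothing here reproduces the report's Stata output.

```python
import math


def vif_from_r2(r2):
    """Variance inflation factor for one regressor.

    r2 is the R-squared from regressing that regressor on all the others.
    """
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2)


def tolerance_from_r2(r2):
    """Tolerance is the reciprocal of the VIF, i.e. 1 - R-squared."""
    return 1.0 - r2


def is_highly_collinear(r2):
    """Apply the thresholds quoted in the text: tolerance <= 0.1 or VIF >= 10."""
    return tolerance_from_r2(r2) <= 0.1 or vif_from_r2(r2) >= 10.0


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: penalises parameters by log(n)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood
```

Because tolerance is the reciprocal of the VIF, the two thresholds flag the same variables; reporting both, as the chapter does, is a matter of convention.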

Incorporation in the cost–utility analysis

To generate the ICER, the estimated QoL gain associated with PA is then included in the cost–utility model reported above. Where an individual becomes physically active (with or without ERS) they accrue an additional QALY gain. Given the absence of evidence on the duration of this QALY gain we take a conservative approach by assuming that it is a one-off gain that lasts for 1 year. Sensitivity analysis addresses the impact of this assumption on the cost-effectiveness of ERS by generating ICERs at varying levels of duration, which included 1 day, 1 week, 1 month, 6 months and lifetime.


Description of sample

The mean EQ-5D for the sample was 0.86 (SD 0.23) and few had limiting illness (23.4%). The proportion of the sample that was ‘physically active’ ranged from 11.5% (via objective measurement) to 44.4% (via subjective measurement). The sample was predominantly white (90.8%), with the remaining 9% comprising those of mixed race, Asians, Chinese, Black people and those of other race, and had a mean (SD) age of 50 (6.2) years. Of the sample, 54.5% were female and most were married and living with their partners (66.3%), most had an educational qualification (80.8%) and most were in employment (76.1%). A minority were classified as obese (25.6%) or were smokers (21.8%), although the majority were ‘drinkers’ (84.9%). Further details are available in Appendix 7, Table 63.

Missing observations

The dependent variable (EQ-5D) had 84 missing observations (1.5%). All of the independent variables (except walking; sports and exercise; age; marital status; and region of residence and urbanisation) had missing observations (see Appendix 7, Table 63). Most variables had around 1% of data missing and PA (via objective measurement) had the highest proportion of missing observations (84%). The mean EQ-5D utility scores for individuals who had missing values for social class, BMI or smoking status were statistically significantly different from those of individuals who did not. The mean EQ-5D utility scores for individuals who had missing values for the indicators of PA were not, however, statistically different from those of individuals who did not.

Regression models

Table 53 shows the reduced regression models estimating the correlation between indicators of PA and HRQoL, controlling for covariates. Emphasis is placed on the models that exclude missing observations because they provide better fit and specification. Notably, results were similar across models with or without missing observations. Recall that separate models were fitted for each indicator of PA: model 1 (walking); model 2 (sports and exercises); model 3 (objective measurement); and model 4 (subjective measurement). Hereafter, the models will be referred to by these names.

TABLE 53. Estimation results of regression models (reduced models without missing observations).


The results indicate that being ‘physically active’ through walking was statistically significantly associated with better HRQoL (0.026; p-value at 10%) compared with being inactive. Similarly, those who were reported to be ‘physically active’, defined as participation in sports and exercise (0.034), overall PA measured via objective indicators (0.072) or subjective indicators (0.047) were all found to have a statistically significant better HRQoL (p-value at 5–10%) than inactive individuals.

Other factors statistically significantly correlated with better HRQoL included high-income earners, having no/non-limiting illness, and residing in town/fringe or village/hamlet/isolated dwelling. Conversely, people with heart problems, musculoskeletal/mental/urinary/blood pressure problems and psychosocial well-being were likely to have worse HRQoL. Being relatively older, a ‘non-drinker’ of alcohol, economically inactive or obese also had a statistically significant association with worse HRQoL.

Model diagnostics

The specification error tests show that the models had good specification and that additional statistically significant regressors could be found only by chance (see Appendix 7, Table 64). The models' estimates could be considered stable, as no sign of multicollinearity was found, with average variance inflation factors and tolerance levels at 1.2 and 0.8, respectively. A reasonable proportion (between 10% and 40%) of variation in HRQoL was explained by the models as indicated by the pseudo-R2-value. Model 3 seems to have the best fit, as it had the lowest AIC and BIC values.

Impact of short-term health gains on the incremental cost-effectiveness ratio

Table 54 shows the estimated ICER following the inclusion of the short-term QALY gains in the base-case model. As expected, the inclusion of short-term QALY gains leads to lower ICERs for ERS. Compared with usual care, ERS is still more expensive, as it incurs additional costs of £169.54, but it is more effective, leading to QALY gains ranging from 0.009 to 0.011 per person. The cost per QALY of ERS compared with usual care is estimated to be between £15,513 and £18,559. This compares with the estimate from our base-case analysis, which excluded consideration of short-term benefits, of about £20,000. The results are, however, sensitive to the duration over which the short-term QALY gains last (Table 55). Assuming they last for between 1 day and 1 month leads to insignificant improvements in the ICER, whereas at 6-month and lifetime durations there is a significant improvement in the ICER to < £6000 per QALY.
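The arithmetic behind these revised ICERs can be approximated with a short sketch. It rests on several assumptions that are not stated as a formula in the report: that the short-term gain per person equals the additional proportion becoming active (3.9%, as reported in the cost–consequence results) multiplied by the regression coefficient for the relevant PA indicator and by the duration in years, and that the base-case QALY gain can be back-calculated from the reported incremental cost (£169.54) and the base-case ICER (£20,876/QALY). Function and variable names are illustrative.

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost-effectiveness ratio (cost per QALY gained)."""
    if delta_qaly <= 0:
        raise ValueError("incremental QALYs must be positive for a meaningful ratio")
    return delta_cost / delta_qaly


def icer_with_short_term_gain(delta_cost, base_delta_qaly, extra_active,
                              utility_gain, duration_years=1.0):
    """ICER after adding a one-off short-term QALY gain for newly active people.

    ASSUMPTION: short-term gain = extra_active * utility_gain * duration_years.
    """
    short_term_qaly = extra_active * utility_gain * duration_years
    return icer(delta_cost, base_delta_qaly + short_term_qaly)


DELTA_COST = 169.54                        # incremental cost per person (reported)
BASE_ICER = 20876.0                        # base-case ICER (reported)
BASE_DELTA_QALY = DELTA_COST / BASE_ICER   # ASSUMPTION: back-calculated, not reported
EXTRA_ACTIVE = 0.039                       # additional proportion active (reported)

# Regression coefficients on the four PA indicators (reported in the text).
UTILITY_GAIN = {
    "walking": 0.026,
    "sports_exercise": 0.034,
    "objective": 0.072,
    "subjective": 0.047,
}

revised_icers = {
    name: icer_with_short_term_gain(DELTA_COST, BASE_DELTA_QALY, EXTRA_ACTIVE, gain)
    for name, gain in UTILITY_GAIN.items()
}
```

With a 1-year duration, the sketch yields values close to the reported range, with the objective composite measure giving the lowest ICER and walking the highest. Shorter durations can be explored by passing a fraction of a year as `duration_years`; a lifetime duration would additionally require life expectancy and discounting inputs that are not given in this section.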

TABLE 54. Cost-effectiveness results (after inclusion of short-term QALY gains) comparing ERS with usual care.


TABLE 55. ICERs (after inclusion of short-term QALY gains) at different duration levels of QALY gains.



Results from our econometric analysis support the hypothesis that PA is associated with improved QoL, as measured by the EQ-5D. It is important to note, however, that the analysis in this chapter does not prove causality. In the case of the covariates, the a priori expectations about their association with HRQoL, formulated from the literature, were all met, which lends validity to the models. Further confidence can be drawn from the findings because all regression models had good specification and fit.

The inclusion of short-term QALY gains for individuals who are physically active resulted in reductions in the ICER for ERS, as expected. Assuming that the health gain associated with ERS lasts for 1 year, the base-case ICER is reduced by approximately £1500–4000. If we assume that these ‘feel-good’ benefits resulting from PA are sustained if an individual remains active over the course of his or her lifetime then the ICER falls significantly to < £5000. These benefits have been referred to as short-term benefits in the current analysis to distinguish them from the longer-term impacts of PA on the development of ill-health. However, they might better be regarded as process benefits that arise from the process of engaging in PA. The degree to which the process benefits resulting from PA are lasting is an issue that warrants further exploration. ERS based on a composite measure of PA appears to be associated with the greatest short-term health gain and thus the lowest ICER, and walking-based ERS with the highest. Further studies are needed to examine how long these short-term QALY gains last, as that is critical to their impact on the ICER.

Cost–consequence analysis

In addition to the development of the cost–utility analysis, we also sought to develop a cost–consequence analysis of ERS. This was an attempt to acknowledge that ERS and PA more generally might impact on a number of conditions not considered within the cost–utility analysis because of data constraints. In many cases, these impacts relate to an association between PA and an outcome that has not been shown to be causal or has not been adequately quantified to allow for it to be included in the cost–utility analysis. An attempt was made to capture both positive and negative outcomes of ERS that were excluded from the cost–utility analysis. A cost–consequence approach allows these issues to be explored, although it acknowledges that in many cases the effect cannot be quantified, and no attempt is made to generate a single composite end point (such as a QALY or a cost–benefit ratio).

Methods for cost–consequence analysis

The analysis was conducted from a partial societal perspective, including health- and non-healthcare costs and benefits. The intervention and its cost remain unchanged from the cost–utility analysis. However, attempts were made to identify a broader range of benefits and disbenefits that might be associated with ERS and PA more generally. The evidence incorporated into the cost–consequence analysis was derived from the base-case model and the literature reviews conducted as part of this assessment.

Outcomes are presented as a synthesis of the available evidence. Wherever possible, attempts are made to quantify the effects of ERS on the outcome under consideration. For example, based on our cost–utility analysis, it is possible to provide an indication of how many strokes might be avoided as a result of increased participation in ERS. Where quantified outcomes are possible, these are expressed as the number of events per 100,000 population.

However, in many cases it is only possible to indicate the direction of change that might be achieved through increased PA, not the magnitude of effect. As such, outcomes are ultimately presented in a disaggregated fashion.


Impacts of exercise referral schemes/physical activity

Table 56 presents the costs and benefits identified in the cost–consequence analysis and their sources of data. The identification of the benefits of ERS was primarily based on the key conditions where PA has been shown to be beneficial (see Table 1).

TABLE 56. Costs and consequences of ERS.


The majority of the evidence identified suggested that PA could have a positive impact on health outcomes. Excluding the three health outcomes already considered in the cost–utility analysis, our searches identified evidence of an association between PA and improved outcomes in musculoskeletal disease, cancers and mental health. Non-health benefits and disbenefits were also identified. These suggest that ERS might have a positive impact on absenteeism, although it might also induce some injuries that have a countering effect. Relatively few disbenefits were identified within our searches.

Cost–consequence analysis

Table 57 shows the outcomes for ERS. The results are presented as incremental costs and outcomes attributable to ERS (compared with usual care).

TABLE 57. Results of cost–consequence analysis (a cohort of 100,000).


In an attempt to present meaningful, population-level outcomes, the analysis considers a cohort of 100,000 individuals who might be eligible for ERS. The cost of ERS for this cohort is estimated to be £22M (2010 prices) to the health-care provider and £12M (2010 prices) to the participants, generating a total cost of £33M. This is based on a leisure centre-based intervention as defined in the cost–utility analysis.

The benefits of ERS, compared with a no active intervention comparator, are as follows: an additional 3900 (3.9%) people becoming physically active, 51 cases of CHD avoided, 16 cases of stroke avoided, 86 cases of diabetes avoided and 152 additional people in health states devoid of illnesses (CHD, stroke and diabetes), resulting in an expected gain of approximately 800 QALYs. If we assume that each QALY is valued at £30,000 then this generates a positive net benefit of approximately £2M (£24M − £22M) from a health service perspective and a negative net benefit of approximately £9M (£24M − £33M) from a societal perspective.
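The net benefit figures follow directly from the rounded values quoted above, as this minimal sketch shows. The function name is illustrative; all inputs are the rounded figures from the text for a cohort of 100,000.

```python
def net_monetary_benefit(qalys_gained, value_per_qaly, cost):
    """Monetary value of the health gain minus the cost of achieving it."""
    return qalys_gained * value_per_qaly - cost


QALYS_GAINED = 800                 # approximate expected gain for the cohort
VALUE_PER_QALY = 30_000            # assumed willingness to pay per QALY (GBP)
HEALTH_SERVICE_COST = 22_000_000   # cost to the health-care provider (GBP)
SOCIETAL_COST = 33_000_000         # provider plus participant costs (GBP)

nmb_health_service = net_monetary_benefit(QALYS_GAINED, VALUE_PER_QALY, HEALTH_SERVICE_COST)
nmb_societal = net_monetary_benefit(QALYS_GAINED, VALUE_PER_QALY, SOCIETAL_COST)
```

The sign of the result flips between the two perspectives only because participant costs are added on the cost side while no additional non-health benefits are valued on the benefit side.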

In addition to the quantifiable benefits, ERS is also expected to have a positive effect on the prevention and/or management of mental health, metabolic disease, cancer and musculoskeletal conditions. It also had an impact on non-health benefits, leading to an improvement in productivity through a reduction in absenteeism at work. There are potential adverse effects in terms of injuries and pain, which are considered rare,48,61 but could still negate some of the positive impacts of ERS.

Summary of cost–consequence analysis

Our cost–utility analysis found ERS to be a cost-effective intervention. The cost-effectiveness was further improved when short-term benefits in QoL were considered and ERS was targeted at individuals with pre-existing conditions. However, it is recognised that the cost–utility analysis failed to take into account a range of costs, benefits and disbenefits associated with ERS.

The cost–consequence analysis presented above attempts to take into account some of the broader impacts of ERS. In addition to reducing rates of CHD, stroke and diabetes, the evidence also suggests that ERS has the potential to reduce the incidence or severity of a number of other conditions. Although it has not proven possible to estimate the costs and benefits (in terms of QALYs) associated with these conditions, the majority of the evidence reviewed suggests that ERS may have a favourable effect on a number of other health outcomes. In addition to this, there is evidence that ERS may lead to non-health benefits, notably an improvement in productivity.

The only major disbenefit associated with ERS is an increased risk of injury, although this is relatively modest and likely to have only a marginal effect on its cost-effectiveness. However, it could be that there is some degree of publication bias in the evidence identified, as the majority of studies indicated positive effects of ERS, with relatively few suggesting any negative effects for participants.

The cost–consequence analysis was conducted as a means of presenting the economic findings generated herein in a manner that might be more easily digested by a broader group of stakeholders. By providing disaggregated benefits, for example in the form of the number of strokes avoided per 100,000 population, it is hoped that this makes the outcomes of ERS more easily understood. However, it should be noted that the cost–consequence analysis was entirely based on the cost–utility analysis and literature reviews presented herein. No attempt was made to undertake a systematic review of the literature to identify further evidence on the impacts of ERS and it might be that some evidence has been overlooked.

The findings of the cost–consequence analysis support our hypothesis that the cost-effectiveness estimates generated by our cost–utility analysis are conservative. A more holistic analysis, taking into account the broader range of benefits associated with ERS, is likely to lead to much improved cost-effectiveness ratios compared with those presented earlier in this report. However, there is a pressing need to generate further evidence on both the short- and longer-term impacts of ERS to better determine whether or not it is a cost-effective use of health-care resources.

Comparisons with previous research findings

Previous studies have tended to conclude that ERS is a cost-effective use of resources, although they too have highlighted the uncertainty around many of the estimates of effect and cost-effectiveness. Isaacs et al.61 generated an ICER in the form of the incremental cost per unit change in SF-36 score and concluded that, in comparison with controls, ERS led to an incremental cost of £19,500 per unit change in SF-36 score at 6-month follow-up. Given the outcome measure adopted in that study, direct comparison with our own findings is impossible, although it should be noted that this study also found only a modest change in health status.

In contrast, the study by Gusi et al.70 showed that ERS resulted in an incremental QALY gain of 0.132 over a 6-month period as measured by change in the EQ-5D, at an incremental cost of £41 per participant, generating an ICER of £311/QALY. The individuals in this study were obese and/or depressed and the findings may provide further evidence to suggest that PA can have process benefits far greater than those suggested by our own analysis. However, no attempt was made to ascertain whether or not the benefits might be sustained beyond the study period.

The findings in NICE76 showed that ERS compared with controls led to an incremental cost per person of £25.10 and a lifetime QALY gain of 0.31 per person, equating to an incremental cost per QALY of £80.96. We are inclined to relate our findings more directly to NICE76 because of similarities in the methods used in both studies. For example, the model used in our study was based on NICE.76
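As a quick arithmetic check, the two ICERs quoted from previous studies follow from dividing the incremental cost by the incremental QALY gain. The symbols $\Delta C$ and $\Delta E$ are notation introduced here; the inputs are the rounded figures quoted in the text.

```latex
\mathrm{ICER} = \frac{\Delta C}{\Delta E}, \qquad
\frac{\pounds 41}{0.132\ \mathrm{QALYs}} \approx \pounds 311/\mathrm{QALY}, \qquad
\frac{\pounds 25.10}{0.31\ \mathrm{QALYs}} \approx \pounds 80.97/\mathrm{QALY}
```

The small difference between the second result and the reported £80.96 presumably reflects rounding of the published inputs.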

The analysis conducted for NICE showed a greater QALY gain than our own findings. This might be partially explained by the inclusion of colon cancer as an additional outcome in the NICE model. In addition to this, the NICE model adopted higher estimates of the effectiveness of ERS than our analysis (RR of becoming active of 1.60 vs 1.11 herein) and there are differences in the handling of uptake and adherence between the two analyses. Coupled with a lower estimated cost of ERS, this resulted in the NICE analysis generating improved ICERs compared with our own findings. In testing our own model we sought to reproduce the findings of the NICE model by incorporating the improved effectiveness of ERS. Despite slight differences in the modelling approach, it produced relatively consistent findings. Although we have based our approach to modelling the cost-effectiveness of ERS on the original NICE work, we believe that our meta-analysis of effectiveness has resulted in more robust input data and ultimately more accurate estimates of the cost-effectiveness of ERS.


  • The cost–utility analysis presented herein was an attempt to adhere to best practice principles in economic evaluation119 and also replicate the methods adopted in previous research.76
  • Using this method, our base-case analysis in sedentary individuals aged 40–60 years shows an indicative ICER for ERS versus usual care of £20,876/QALY. This result was sensitive to changes in key input parameters, particularly the estimate of effectiveness of ERS (change in PA) sourced from our systematic review. There was a 51% probability that ERS was cost-effective at £20,000/QALY and an 88% probability that ERS was cost-effective at £30,000/QALY.
  • Further developments of this model to incorporate short-term benefits in HRQoL associated with ERS reduced the base-case ICER somewhat to £17,032 to £18,559/QALY.
  • The cost-effectiveness of ERS appeared to be improved in disease-specific subgroups compared with base case, i.e. obesity £14,618/QALY, hypertension £12,834/QALY, and depression £8414/QALY.
  • The cost–consequence analysis presented above is an attempt to support this hypothesis and reports further benefits of ERS that could not be incorporated into the cost–utility analysis, although, had they been included, they would almost certainly have further improved the cost-effectiveness of ERS.
  • The previous sections include some lengthy discussion about the limitations of the approaches adopted, in particular the use of decision-analytic modelling and cost–utility analysis to model ERS. ERS involves a complex process, from the point at which an individual is ‘prescribed’ ERS, to the point at which he or she accesses the service and then the degree to which he or she adheres to the programme and beyond. Interventions of this sort, which comprise behaviour change, are difficult to simplify into standard economic evaluation frameworks, and this is exemplified by the analyses presented herein, which include a significant number of assumptions (some of which could fairly be described as heroic) and are partial, capturing only some of the costs and benefits of ERS.
  • Consideration needs to be given to the trade-off between developing a simple model (as we have done here) which can be populated and acknowledges its limitations versus a more complex model which may be a better representation of reality but can only be partially populated, which might result in even greater uncertainty. In both cases, the fundamental issue that needs to be addressed is improvements in the source data on the effectiveness of ERS, including evidence on long-term outcomes.
© 2011, Crown Copyright.

Included under terms of UK Non-commercial Government License.

