# Evidence of methodological bias in hospital standardised mortality ratios: retrospective database study of English hospitals

^{1}Jonathan J Deeks, professor of health statistics,

^{1}Alan Girling, senior research fellow,

^{1}Gavin Rudge, data scientist,

^{1}Martin Carmalt, consultant physician,

^{2}Andrew J Stevens, professor of public health and epidemiology,

^{1}and Richard J Lilford, professor of clinical epidemiology

^{1}

^{1}Unit of Public Health, Epidemiology and Biostatistics, University of Birmingham, Birmingham B15 2TT

^{2}Royal Orthopaedic Hospital, Birmingham B31 2AP

Corresponding author.

**This article has been corrected.** See BMJ. 2009 April 01; 338: b1348.

## Abstract

**Objective** To assess the validity of case mix adjustment methods used
to derive standardised mortality ratios for hospitals, by examining the
consistency of relations between risk factors and mortality across
hospitals.

**Design** Retrospective analysis of routinely collected hospital data
comparing observed deaths with deaths predicted by the Dr Foster Unit case mix
method.

**Setting** Four acute National Health Service hospitals in the West
Midlands (England) with case mix adjusted standardised mortality ratios ranging
from 88 to 140.

**Participants** 96948 (April 2005 to March 2006), 126695 (April 2006
to March 2007), and 62639 (April to October 2007) admissions to the four
hospitals.

**Main outcome measures** Presence of large interaction effects between
case mix variable and hospital in a logistic regression model indicating
non-constant risk relations, and plausible mechanisms that could give rise to
these effects.

**Results** Large significant (P≤0.0001) interaction effects were seen
with several case mix adjustment variables. For two of these variables—the
Charlson (comorbidity) index and emergency admission—interaction effects could
be explained credibly by differences in clinical coding and admission practices
across hospitals.

**Conclusions** The Dr Foster Unit hospital standardised mortality ratio
is derived from an internationally adopted/adapted method, which uses at least
two variables (the Charlson comorbidity index and emergency admission) that are
unsafe for case mix adjustment because their inclusion may actually increase the
very bias that case mix adjustment is intended to reduce. Claims that variations
in hospital standardised mortality ratios from Dr Foster Unit reflect
differences in quality of care are less than credible.

## Introduction

The longstanding need to measure quality of care in hospitals has led to publication
of league tables of standardised mortality ratios for hospitals in several
countries, including England, the United States, Canada, the Netherlands, and
Sweden.^{1} ^{2} ^{3} ^{4} ^{5} ^{6} Standardised mortality ratios for
hospitals in these countries have been derived with methods heavily influenced by
the seminal work of Jarman et al,^{1} who first
developed standardised mortality ratios for National Health Service (NHS) hospitals
in England in 1999, and by the subsequent methodological developments by the Dr
Foster Unit.^{7} ^{8} The Dr Foster Unit methodology is used by
Dr Foster Intelligence, a former commercial company that is now a public-private
partnership, to annually publish standardised mortality ratios for English hospitals
in the national press.

A consistent, albeit controversial,^{9} ^{10} ^{11} inference drawn from the wide variation
in published standardised mortality ratios for hospitals is that this reflects
differences in quality of care. In the 2007 hospital guide for England,^{12} Dr Foster Intelligence portrayed
standardised mortality ratios for hospitals as “an effective way to measure and
compare clinical performance, safety and quality.” Although an increasing
international trend exists for standardised mortality ratios for hospitals to be
developed and published,^{13} ^{14} we must be sure that the underlying case
mix adjustment method is fit for purpose before inferences about quality of care are
drawn.

Case mix adjustment is widely used to overcome imbalances in patients’ risk factors
so that fairer comparisons between hospitals can be made. Methods for case mix
adjustment are often criticised because they can fail to include all the important
case mix variables and may not adequately adjust for a variable because of
measurement error.^{10} ^{11} Despite these criticisms, case mix
adjustment is widely done because the adjusted comparisons, although imperfect, are
generally considered to be less biased than unadjusted comparisons.

However, a third, more serious problem exists that can affect the validity of case
mix adjustment. In a study that compared unadjusted and case mix adjusted treatment
effects from non-randomised studies against treatment effects from randomised
trials, Deeks et al observed that on average the unadjusted and not the adjusted
non-randomised results agreed best with the randomised comparisons.^{15} In this instance, case mix adjustment had
increased bias in the comparisons. Nicholl pointed out that case mix adjustment can
create biased comparisons when underlying relations between case mix variables and
outcome are not the same in all the comparison groups.^{16} This phenomenon has been termed “the constant risk
fallacy,” because if the risk relations are assumed to be constant, but in fact are
not, then case mix adjustment may be more misleading than crude comparisons.^{16} Two key mechanisms can give rise to
non-constant risk relations. The first mechanism involves differential measurement
error, and the second one involves inconsistent proxy measures of risk. Each is
illustrated below.

Consider two hospitals that are identical in all respects (case mix, mortality, quality of care) except that one hospital (B) systematically under-records comorbidities (measurement error) in its patients. If mortality is case mix adjusted for comorbidity then the expected (but not the observed) number of deaths in hospital B will be artificially depleted, because its patients seem to be at lower risk than they really are. The effect of case mix adjustment is to erroneously inflate the standardised mortality ratio (observed number of deaths/expected number of deaths × 100) for that hospital. The box presents a numerical example of this scenario.
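The ratio described above can be written out explicitly. The notation is an assumption introduced here for illustration and is not the authors' own: $O$ is the observed number of deaths in a hospital, $n_k$ is the number of its admissions recorded in case mix category $k$, and $p_k$ is the modelled risk of death for that category.

$$
\mathrm{SMR} = 100 \times \frac{O}{E}, \qquad E = \sum_{k} n_k \, p_k
$$

Under-recording comorbidity moves admissions into categories with smaller $p_k$. This lowers $E$ while leaving $O$ unchanged, and so raises the ratio.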

### Example of differential measurement error

To illustrate the constant risk fallacy we construct hypothetical hospital mortality data with a single case mix variable—a comorbidity index (CMI) that takes values 0 to 6. The relation between in-hospital mortality and CMI value has been modelled for the population, estimating risks of in-hospital death of 0.02, 0.04, 0.08, 0.14, 0.25, 0.40, and 0.57 in the seven CMI categories (equivalent to an odds ratio of two for each unit increase in the index).
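As a rough check on the stated equivalence, the seven risks can be regenerated from the baseline risk of 0.02 and an odds ratio of two per unit increase. This is a minimal sketch; the function and constant names are hypothetical and chosen only for this illustration.

```python
# Baseline risk for CMI = 0 and the odds ratio per unit increase, both from the text.
BASELINE_RISK = 0.02
ODDS_RATIO_PER_UNIT = 2.0


def risk_for_cmi(cmi: int) -> float:
    """Risk of in-hospital death implied by a constant odds ratio per CMI unit."""
    baseline_odds = BASELINE_RISK / (1.0 - BASELINE_RISK)
    odds = baseline_odds * ODDS_RATIO_PER_UNIT ** cmi
    return odds / (1.0 + odds)


# One risk per CMI category, 0 to 6, rounded to two decimal places.
risks = [round(risk_for_cmi(cmi), 2) for cmi in range(7)]
print(risks)
```

Rounded to two decimal places, the values match the seven risks quoted in the text.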

Consider two hospitals, A and B, both of which admit 1000 patients a year in each of the seven CMI categories. Assume that the case mix of the groups of patients and the quality of care in the two hospitals are identical and that 1500 deaths are observed in both hospitals. Consider that hospital A correctly codes the comorbidity index, whereas hospital B tends to under-code, such that in hospital B for each true CMI the following are recorded:

- CMI=0: all are coded as 0
- CMI=1: 50% coded 0, 50% coded 1
- CMI=2: 33% coded 0, 33% coded 1, 33% coded 2
- CMI=3: 25% coded 0, 25% coded 1, 25% coded 2, 25% coded 3
- CMI=4: 20% coded 0, 20% coded 1, 20% coded 2, 20% coded 3, 20% coded 4
- CMI=5: 20% coded 1, 20% coded 2, 20% coded 3, 20% coded 4, 20% coded 5
- CMI=6: 20% coded 2, 20% coded 3, 20% coded 4, 20% coded 5, 20% coded 6.

The consequence of this is that rather than observing 1000 patients in each of the seven CMI categories, in hospital B the numbers instead are 2283, 1483, 1184, 850, 600, 400, and 200. It thus looks as if a difference exists in the distribution of the CMI between the two hospitals, with hospital B having on average a lower CMI. Standardised mortality ratios are then calculated from expected numbers of deaths that are based on the reported (rather than true) CMI and the modelled risks.

The expected number of deaths in hospital A is (1000×0.02)+(1000×0.04)+(1000×0.08)+(1000×0.14)+(1000×0.25)+(1000×0.40)+(1000×0.57)=1500, yielding a standardised mortality ratio (observed/expected deaths × 100) of (1500/1500)×100=100.

The expected number of deaths in hospital B is (2283×0.02)+(1483×0.04)+(1184×0.08)+(850×0.14)+(600×0.25)+(400×0.40)+(200×0.57)=743, yielding a standardised mortality ratio of (1500/743)×100=202.
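The arithmetic in this box can be reproduced with a short script. This is a sketch under stated assumptions: the names are hypothetical, and the "33%" splits are treated as exact thirds, so the recorded counts differ from the rounded figures in the text by at most one patient.

```python
# Modelled risk of in-hospital death for CMI categories 0..6 (from the text).
RISK = [0.02, 0.04, 0.08, 0.14, 0.25, 0.40, 0.57]
PATIENTS_PER_CATEGORY = 1000
OBSERVED_DEATHS = 1500  # identical in both hospitals by construction

# Hospital B's under-coding: for each true CMI, the codes actually recorded,
# each equally likely (assumption: "33%" in the text means exactly one third).
RECORDED_CODES = {
    0: [0],
    1: [0, 1],
    2: [0, 1, 2],
    3: [0, 1, 2, 3],
    4: [0, 1, 2, 3, 4],
    5: [1, 2, 3, 4, 5],
    6: [2, 3, 4, 5, 6],
}


def recorded_counts(codes_by_true_cmi, patients_per_category):
    """Number of patients appearing in each recorded CMI category."""
    counts = [0.0] * len(RISK)
    for codes in codes_by_true_cmi.values():
        for code in codes:
            counts[code] += patients_per_category / len(codes)
    return counts


def expected_deaths(counts, risk):
    """Expected deaths given recorded category counts and modelled risks."""
    return sum(n * p for n, p in zip(counts, risk))


def smr(observed, expected):
    """Standardised mortality ratio on the usual 100-based scale."""
    return 100.0 * observed / expected


counts_a = [float(PATIENTS_PER_CATEGORY)] * len(RISK)  # hospital A codes correctly
counts_b = recorded_counts(RECORDED_CODES, PATIENTS_PER_CATEGORY)

expected_a = expected_deaths(counts_a, RISK)
expected_b = expected_deaths(counts_b, RISK)
smr_a = smr(OBSERVED_DEATHS, expected_a)
smr_b = smr(OBSERVED_DEATHS, expected_b)

print([round(n) for n in counts_b])
print(round(expected_a), round(smr_a))
print(round(expected_b), round(smr_b))
```

Changing only the entries of `RECORDED_CODES` shows how sensitive the ratio is to coding practice, because neither the true case mix nor the observed deaths are altered.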

It thus wrongly seems that the mortality in hospital B is twice that in
hospital A. Adjustment has changed a fair comparison (1500
*v* 1500) into a biased comparison. This is an
illustration of the constant risk fallacy. Furthermore, modelling the data
by using logistic regression reveals that whereas the relation between CMI
and mortality in hospital A is the same as in the population (odds ratio=2.0
per category increase), the relation in hospital B is weaker (odds ratio=1.6
per category increase in CMI) (as would be expected through
misclassification introducing attenuation bias) and the interaction between
hospital B and CMI is clinically and statistically significant (P<0.001).
If CMI was measured with equal measurement error in all hospitals the
problem would be one of residual confounding caused by regression dilution
or attenuation bias (in which case the standardised mortality ratios would
be preferable to crude mortality but will not fully adjust for the risk
factor). Because measurement errors differ among hospitals, the constant
risk fallacy (where standardised mortality ratios may be more misleading
than the crude mortality comparison) is a possibility.

The second mechanism can occur even in the absence of measurement error. Consider, for example, emergency admissions to hospitals. Patients admitted as emergencies are usually regarded as being seriously ill, but if an individual hospital often admits the “walking wounded” (who are not seriously ill) as emergencies, then the risk associated with being an emergency admission in that hospital will be reduced. Variation in this practice across hospitals leads to a non-constant relation between emergency admission and mortality. The standardised mortality ratio for hospitals that admit more walking wounded will receive an unjustified downward case mix adjustment, because elsewhere emergencies are generally the sickest patients and the case mix adjustment will endeavour to reflect this.

A general feature of these two mechanisms that allows identification of case mix variables prone to the constant risk fallacy is that the value recorded for a given patient would change if he or she presented at a different hospital. Comorbidity would be under-coded in one hospital compared with another, whereas the patient may be admitted (and thus coded) as an emergency in some hospitals and elsewhere treated and discharged without being admitted at all. Case mix variables such as age, sex, and deprivation (on the basis of the patients’ home address) are not prone to these two mechanisms because their values do not change with different hospitals.

A simple way to screen case mix variables for their susceptibility to non-constant
risk relations, on a scale sufficient to bias the case mix adjustment method, is to
do a statistical test for interaction effects between hospital and case mix
variables in a logistic regression model that predicts death in hospital.^{16} If large interaction effects are not found
then no apparent evidence of non-constant risk relations exists and the constant
risk fallacy (within the limits of statistical inference) may be discounted
(although the other challenges in interpreting standardised mortality ratios, such
as omitted covariates, will still remain^{9} ^{10}). However, if a large interaction effect
is found, then this indicates a non-constant risk relation. If this is due to
inconsistent measurement practices across hospitals (as in the comorbidity index
example in the box), it will result in a misleading adjustment to standardised
mortality ratios. If the interaction occurs because the covariate genuinely has
different relations with death across hospitals (as in the emergency admission
example above), this too will result in a misleading adjustment to standardised
mortality ratios. Alternatively, the interaction could occur if different levels of
the covariate were associated with different standards of care across hospitals, in
which case the standardised mortality ratio will appropriately reflect the average
of the associated increases in mortality. Unfortunately, no statistical method
exists for teasing apart these non-exclusive explanations, but they can be explored
and resolved, to some extent, by doing “detective work” seeking a likely cause for
the observed interaction effect.

In this paper we screened the Dr Foster Unit method,^{1} which is used to derive standardised mortality ratios for English
hospitals and which has been adopted/adapted internationally,^{1} ^{2} ^{3} ^{4} ^{5} ^{6} ^{12} for its susceptibility to the constant
risk fallacy. We first tested for the presence of large interaction effects and
then, in respect of two key case mix variables (comorbidity index and emergency
admission), we did detective work to seek likely explanations.

## Methods

### Dr Foster Unit case mix adjustment method

The Dr Foster Unit case mix adjustment method uses data derived from routinely
collected hospital episode statistics.^{12} These data include admission date, discharge date, in-hospital
mortality, and primary and secondary diagnoses according to ICD-10
(international classification of disease, 10th revision) codes on every
inpatient admission (or spell) in NHS hospitals in England. The Dr Foster Unit
standardised mortality ratio is derived from logistic regression models, which
are based on 56 primary diagnosis groups derived from hospital episode
statistics data accounting for 80% of hospital mortality. Covariates for case
mix adjustment in the model are sex, age group, method of admission (emergency
or elective), socioeconomic deprivation, primary diagnosis, the number of
emergency admissions in the previous year, whether the patient was admitted to a
palliative care specialty, and the Charlson (comorbidity) index (range 0-6),
which is derived from secondary ICD-10 diagnoses codes.^{17}

### Study hospitals and data sources

This study involves four hospitals, representing a wide range of the published case mix adjusted Dr Foster Unit standardised mortality ratios (88-143, for the period April 2005-March 2006), which had purchased the Dr Foster Intelligence Real Time Monitoring computer system and so were able to provide anonymised output data (including case mix variables, the Dr Foster Unit predicted risk of death, and whether a death occurred) for this study. The hospital with the lowest standardised mortality ratio (88) is a large teaching hospital (University Hospital North Staffordshire, 1034 beds); those with higher ratios were one large teaching hospital (123: University Hospitals Coventry and Warwickshire, 1139 beds) and two medium sized acute hospitals (127: Mid Staffordshire Hospitals, 474 beds; 143: George Eliot Hospital, 330 beds).

Our analyses are based on data and predictions from the Real Time Monitoring system, which were available for the following time periods: April 2005 to March 2006 (year 1), April 2006 to March 2007 (year 2), and April to October 2007 (part of year 3—the most recent data available at the time of the study).

### Statistical analyses

We constructed logistic regression models to test for interactions to assess whether the case mix adjustment variables used in the Dr Foster Unit method were prone to the constant risk fallacy. The Dr Foster Unit dataset includes the predicted risk of death for each patient, generated from the Dr Foster Unit case mix adjustment model, which we included (after logit transformation) as an offset term in a logistic regression model of in-hospital deaths. To this model we added terms for each hospital (thus allowing for the differences between hospitals in adjusted mortality) and then interaction terms for each hospital and case mix variable in turn (which estimate the degree to which the relation between the case mix variable and mortality in each hospital differed from that implemented in the Dr Foster Unit case mix adjustment model).
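One way to write the model just described is sketched below. The notation is an assumption made for illustration. The source does not give the exact parameterisation, such as how categorical variables were coded or whether a reference hospital was used. Here $\hat{p}_i$ is the Dr Foster Unit predicted risk of death for admission $i$, $h(i)$ is the hospital of that admission, and $x_i$ is the case mix variable under test.

$$
\operatorname{logit} \Pr(\text{death}_i) = \operatorname{logit}(\hat{p}_i) + \alpha_{h(i)} + \beta_{h(i)} \, x_i,
\qquad \operatorname{logit}(p) = \ln \frac{p}{1-p}
$$

The offset $\operatorname{logit}(\hat{p}_i)$ has its coefficient fixed at one. The hospital terms $\alpha_h$ absorb overall differences in adjusted mortality, and $\exp(\beta_h)$ is the interaction odds ratio for hospital $h$. A value of $\beta_h = 0$ (an odds ratio of one) means the relation between $x$ and mortality in that hospital matches the one built into the Dr Foster Unit model. For a categorical variable such as primary diagnosis, $x_i$ would be a set of indicator variables, with one $\beta$ per category and hospital.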

Interaction terms that produced odds ratios close to one indicated that the relation between the case mix variable and mortality was constant and so not prone to the constant risk fallacy. The presence of large significant interactions suggested that the case mix variable was potentially prone to the constant risk fallacy, because its relation to mortality differed from the Dr Foster Unit national estimate. We tested the significance of interactions by using likelihood ratio tests; we deemed P values ≤0.01 to be statistically significant. We report the odds ratios, including 95% confidence intervals and P values, for each hospital-variable interaction over the three years.

### Selected variables

The following patient level variables included in the Dr Foster Unit adjustment were available and tested: Charlson index (0-6, a measure of comorbidity), age (10 year age bands), sex (male/female), deprivation (fifths), primary diagnosis (1 of 56), emergency admission (no/yes), and the number of emergency admissions in the previous year (0, 1, 2, 3, or more). We excluded the palliative care variable from our analyses because no admissions to this specialty occurred in the hospitals. We excluded less than 1.5% of all the data from the Real Time Monitoring system because of missing data (for example, age not known, deprivation not known). The total numbers of admissions for each year were 96948 (April 2005 to March 2006), 126695 (April 2006 to March 2007), and 62639 (April to October 2007, a part year).

For two prominent case mix variables—the Charlson index of comorbidity and emergency admission—we did detective work to seek explanations for the presence of large interaction effects, as described below.

### Investigation of interaction effects seen with Charlson index

Patients with a lower Charlson index (less comorbidity) have lower expected mortality in the Dr Foster Unit model. Therefore, if the Charlson index was systematically under-coded in some hospitals they would be assigned artificially inflated standardised mortality ratios. We investigated the possibility of such misclassification in the Charlson index in two ways.

Firstly, we investigated changes in the depth of clinical coding (number of ICD-10 codes for secondary diagnoses identified per admission) over time within the hospitals and examined the hypothesis that the increase would be most rapid in those starting with the lowest Charlson indices (as they have the greatest headroom to improve through better coding). We formed the contingent hypothesis that any such change would be accompanied by diminished interactions between Charlson index and mortality across hospitals.

Secondly, we considered that if clinical coding was similarly accurate in all hospitals, then differences in the Charlson index should reflect genuine differences in case mix profiles. We postulated that hospitals with higher Charlson indices were therefore more likely to admit older patients and to have higher proportions of emergency admissions, longer lengths of stay, and a higher crude mortality. If this was not the case, then this finding would corroborate a hypothesis that differences in the Charlson indices across hospitals were primarily attributable to systematic differences in clinical coding practices.

### Investigation of interaction effects seen with emergency admission

In the original analyses by Jarman et al, the emergency admission variable was
noted to be the best predictor of hospital mortality.^{1} We explored this variable in more depth by investigating
the proportion of emergency admissions that were recorded as having zero length
of stay (being admitted and discharged on the same day). Although clinically
valid reasons may exist to admit patients for zero stay, and some patients may
die on admission, the practice of admitting less seriously ill patients has been
recognised as a strategy that is increasingly used in the NHS to comply with
accident and emergency waiting time targets.^{18} ^{19} This potentially leads to a
reduction in the mortality risk associated with emergency admissions in
hospitals that more often follow this practice. We examined the magnitude of
differences in the proportion of emergency admissions with zero length of stay
both within hospitals over time and between hospitals, as well as the observed
risk associated with zero and non-zero lengths of stay.

## Results

We determined the extent to which case mix variables used in the Dr Foster Unit method had a non-constant relation with mortality across hospitals by examining the odds ratios of interaction terms for hospital and case mix variables derived from a logistic regression model (with death as the outcome). Table 1 reports the odds ratios of tests of interactions for six case mix variables.

Two variables (sex and deprivation) had no significant interaction with hospitals, indicating that these two variables are safe to use for case mix adjustment because they are not prone to the constant risk fallacy. However, the remaining variables had significant interactions. The number of previous emergency admissions was significant in year 2; the three hospitals with high standardised mortality ratios had 6% to 10% increases in odds of death with every additional previous emergency admission over and above the allowance made in the Dr Foster Unit model. Age had a significant interaction in year 2, but the effect was small—a 10 year age change was associated with an additional 1% increase in odds of death across the hospitals. Primary diagnosis also had significant interactions in all three years (results not shown as 56 categories and four hospitals produce 224 interaction terms).

The Charlson index had significant interaction effects in year 1 and year 2 but not in year 3. A unit change in the Charlson index was associated with a wide range of effect sizes—up to a 7% increase in odds of death (George Eliot Hospital, year 1) and an 8% reduction in odds of death (University Hospital North Staffordshire, year 2) over and above that accounted for in the Dr Foster Unit model. Across the full range of the Charlson index, these correspond to increases in odds of death of 50% or decreases of 39%.
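The full-range figures follow from compounding the per-unit odds ratios over the six steps of the index. The sketch below assumes the per-unit odds ratios are exactly 1.07 and 0.92 (the rounded values quoted above) and that they act multiplicatively on the odds scale; the names are hypothetical.

```python
CHARLSON_STEPS = 6  # the index runs from 0 to 6, so six unit increases


def odds_change_over_range(per_unit_odds_ratio: float, steps: int = CHARLSON_STEPS) -> float:
    """Percentage change in odds after compounding a per-unit odds ratio."""
    return (per_unit_odds_ratio ** steps - 1.0) * 100.0


increase = odds_change_over_range(1.07)  # 7% increase in odds per unit
decrease = odds_change_over_range(0.92)  # 8% reduction in odds per unit
print(round(increase), round(decrease))
```

The results round to an increase of 50% and a decrease of 39%, which agrees with the figures reported in the text.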

We found significant interactions with being an emergency admission in all years across all hospitals. The effect sizes ranged from 38% (University Hospital North Staffordshire, year 3) to 355% (Mid Staffordshire Hospitals, year 2) increases in odds of death above those accounted for in the Dr Foster Unit equation.

### Investigation of interaction effects seen with Charlson index

The 96948 admissions in the four hospitals for 2005/06 had an overall mean Charlson index of 1.17 (median 1, interquartile range 0-2). Table 2 shows the mean Charlson index for the four study hospitals. The hospital with a low standardised mortality ratio (University Hospital North Staffordshire) had the highest mean Charlson index (1.54), whereas the three hospitals with high standardised mortality ratios had mean Charlson index values near or below the median (1).

An indicator of completeness of coding is depth of coding, that is, the number of ICD-10 codes per admission (table 2). University Hospital North Staffordshire had the highest mean coding depth and Charlson index in all years; more importantly, as coding depth increased over the years in all hospitals (table 3), the interaction between the Charlson index and hospitals became smaller and statistically non-significant (table 1). We also explored the extent to which differences in the Charlson index between hospitals reflect genuine differences in case mix profiles (table 2). Although University Hospital North Staffordshire serves a more deprived population with a higher proportion of male patients than the other hospitals, the percentage of emergency admissions, readmissions, length of stay, and crude mortality are at variance with the view that this hospital treats a systematically “sicker” population of patients. The evidence from table 2 is therefore inconsistent with the explanation that differences in the Charlson index reflect genuine differences in case mix profiles.

### Investigation of interaction effects seen with emergency admission

We investigated the emergency admission variable in more depth by considering proportions of emergency/non-emergency admissions with a zero length of stay (days). Combining data across hospitals, the crude in-hospital mortality for non-emergency admissions was 1/1000 for zero length of stay and 23/1000 for non-zero length of stay; the mortality for emergency admissions was 46/1000 for zero length of stay and 107/1000 for non-zero length of stay. Table 4 shows that the proportion of emergency admissions with zero length of stay varied between 10.4% and 20.4% across hospitals. The hospital with the lowest case mix adjusted standardised mortality ratio (University Hospital North Staffordshire) had the highest proportion of zero stay emergency patients in years 2 and 3 (20.4% and 17.7%), whereas the hospital with the highest standardised mortality ratio (George Eliot Hospital) had the lowest proportion of zero stay emergency patients in all three years (10.4%, 11.0%, and 12.9%). The large variations in proportions of emergency/non-emergency patients with zero length of stay indicate that systematically different admission policies were being adopted across hospitals. The net effect of this is that the relation between an emergency admission and risk of death varies substantially across hospitals (that is, the risk of death is not constant), apparently because of differences in hospital admission policies.
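The dilution effect can be illustrated by treating overall emergency mortality as a weighted mix of the zero stay and non-zero stay rates. This is only a sketch: it assumes the pooled rates (46 and 107 per 1000) apply unchanged in every hospital, which the source does not state, and the names are hypothetical.

```python
# Pooled crude mortality per 1000 emergency admissions (from the text).
MORTALITY_ZERO_STAY = 46.0
MORTALITY_NONZERO_STAY = 107.0


def emergency_mortality_per_1000(zero_stay_share: float) -> float:
    """Crude emergency mortality implied by a given share of zero stay admissions."""
    return (zero_stay_share * MORTALITY_ZERO_STAY
            + (1.0 - zero_stay_share) * MORTALITY_NONZERO_STAY)


# Lowest and highest observed shares of zero stay emergency admissions.
low_share = emergency_mortality_per_1000(0.104)
high_share = emergency_mortality_per_1000(0.204)
print(round(low_share, 1), round(high_share, 1))
```

Under this simplification, a hospital at the upper end of the observed range would have lower crude mortality among its emergency admissions than one at the lower end, even if it cared for each type of patient equally well.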

## Discussion

The league tables of mortality for NHS hospitals in England from Dr Foster
Intelligence,^{12} compiled by using case
mix adjustment methods that have been internationally adopted or adapted,^{2} ^{3} ^{4} ^{5} ^{6} have been published annually since 2001
and continue to raise concerns about the wide variations in standardised mortality
ratios for hospitals and quality of care.^{20}
Unsurprisingly perhaps, similar concerns have been raised in other countries that
have developed their own standardised mortality ratios.^{5} ^{21} Before such concerns can be legitimately
aired, we must ensure that methods used by Dr Foster Unit are fit for purpose and
not potentially misleading.^{8} ^{9}

Our results show that a critical, hitherto often overlooked, methodological concern
is that the relation between risk factors used in case mix adjustment and mortality
differs across the hospitals, leading to the constant risk fallacy. This phenomenon
can increase the very bias that case mix adjustment is intended to reduce.^{16} The routine use of locally collected
administrative data for case mix variables makes this a real concern.^{16} A serious problem is that no statistical
fix exists for overcoming the challenges of variables susceptible to this constant
risk fallacy.^{16} It has to be investigated
by a more painstaking inquiry.

As the Dr Foster Unit method, like other case mix adjustment methods, does not report
screening variables for non-constant risk,^{1} ^{12} we investigated seven variables and
found that three of them—age, sex, and deprivation—were safe in this respect.
However, we found that emergency admission, the Charlson (comorbidity) index,
primary diagnosis, and the number of emergency admissions in the previous year had
clinically and statistically significant interaction effects. For two variables, the
Charlson index and emergency admission, we found credible evidence to suggest that
they are prone to the constant risk fallacy caused by systematic differences in
clinical coding and emergency admission practices across hospitals.

For the Charlson index variable, we showed how the interaction effects seemed to
relate to the number of ICD-10 codes (for secondary diagnoses) per admission—that
is, depth of clinical coding.^{22} Overall, we
reasoned that as the increased depth of coding (over time) was accompanied by a
decrease in the interaction effect and as differences in the Charlson index did not
reflect genuine differences in case mix profiles, we could reasonably conclude that
the Charlson index is prone to the constant risk fallacy largely as a result of
differential measurement error from clinical coding practices. Drawbacks in
determining the Charlson index by using administrative datasets have been reported
previously.^{23} Hospitals with a lower
depth of coding were disadvantaged because this was associated with a lower Charlson
index, which in turn underestimated the expected mortality and so inflated the
standardised mortality ratio. For the emergency admission variable, we found strong
evidence of systematic differences across hospitals in numbers of patients admitted
as emergencies who were admitted and discharged on the same day. The higher risk
usually associated with emergencies would be diluted by the inclusion of zero length
of stay admissions in some hospitals. Thus, we judge these two variables—the
Charlson index and emergency admission—to be unsafe to use in case mix adjustment
methods because, ironically, their inclusion may have increased the bias that case
mix adjustment aims to reduce. Further research to understand the mechanisms behind
the other variables with large interactions is clearly warranted.

Given that our analyses are based on a subset of hospitals in the West Midlands, our
study urgently needs to be replicated with more hospitals (for example, at the
national level) to examine the extent to which our findings are generalisable.
Furthermore, given the widespread use of standardised mortality ratios for hospitals
in other countries (such as the United States,^{2} ^{3} Canada,^{4} the Netherlands,^{5} and
Sweden^{6}), with similar methods to those
of the Dr Foster Unit, we are concerned that these comparisons may also be
compromised by the possibility of the constant risk fallacy. In addition, given the
widespread use of case mix adjusted outcome comparisons in health care (for example,
for producing standardised mortality ratios to compare intensive care units^{8}), we urge that all case mix adjustment
methods should screen (and report) variables for their susceptibility to the
constant risk fallacy. A similar analysis could also be done within a single
hospital, such that a logistic regression model with an offset term could be used to
discover which set of the case mix variables has any systematic relation with
mortality over and above the original adjustments. This may be an effective way for
a hospital to identify variables that are susceptible to the constant risk fallacy
and may give hospitals, especially those with a high standardised mortality ratio, a
focal point for their subsequent investigations. Hospitals with low standardised
mortality ratios may also find this analysis useful in increasing their
understanding of their standardised mortality ratio.

Our findings suggest that the current Dr Foster Unit method is prone to bias and that
any claims that variations in standardised mortality ratios for hospitals reflect
differences in quality of care are less than credible.^{8} ^{12} Indeed, our study may provide a partial
explanation for understanding why the relation between case mix adjusted outcomes
and quality of care has been questioned.^{24}
Nevertheless, despite such evidence, assertions that variations in standardised
mortality ratios reflect quality of care are widespread,^{25} resulting, unsurprisingly, in institutional stigma by
creating enormous pressure on hospitals with high standardised mortality ratios and
provoking regulators such as the Healthcare Commission to react.^{20}

We urge that screening case mix variables for non-constant risk relations become an
integral part of validating case mix adjustment methods. However, even with
apparently safe case mix adjustment methods, we caution that any conclusion that
differences in adjusted mortality reflect quality of care remains susceptible to the
case mix adjustment fallacy,^{10} because case mix adjustment by itself is devoid of
any direct measurement of quality of care.^{26}
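As an illustrative sketch of such screening, one possible formulation is a logistic regression with a hospital by variable interaction. The notation is an assumption of this example and is not taken from the Dr Foster Unit method. For admission $i$ in hospital $j$, let $p_{ij}$ be the probability of death, $\hat{p}_{ij}$ the risk predicted by the case mix method, and $x_{ij}$ the case mix variable being screened:

$$
\operatorname{logit}(p_{ij}) = \operatorname{logit}(\hat{p}_{ij}) + \alpha_j + \beta_j x_{ij}
$$

A constant risk relation corresponds to $\beta_1 = \beta_2 = \dots = \beta_J$ across the $J$ hospitals. A large and significant interaction, meaning that the $\beta_j$ differ materially between hospitals, indicates that the variable carries a different risk in different hospitals and that adjusting for it may introduce bias.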

### What is already known on this topic

- Case mix adjusted hospital standardised mortality ratios are used around the world in an effort to measure quality of care
- However, valid case mix adjustment requires that the relation between each case mix variable and mortality is constant across all hospitals (a constant risk relation)
- Where this requirement is not met, case mix adjustment may be misleading, sometimes to the degree that it will actually increase the very bias it is intended to reduce

### What this study adds

- Non-constant risk relations exist for several case mix variables used by the Dr Foster Unit to derive standardised mortality ratios for English hospitals, raising concern about the validity of the ratios
- The cause of the non-constant risk relation for two case mix variables—a comorbidity index and emergency admission—is credibly explained by differences in clinical coding and hospitals’ admission practices
- Case mix adjustment methods should screen case mix variables for non-constant risk relations

## Notes

Editor's note: The embargoed copy of this article, sent to the media, wrongly attributed to Dr Foster Intelligence the authorship of the standardised mortality ratio method that is considered here. The article, as published here, now attributes this standardised mortality ratio method to the Dr Foster Unit at Imperial College.

This independent study was commissioned by the NHS West Midlands Strategic Health Authority. We are grateful for the support of all the members of the steering group, chaired by R Shukla. We especially thank the staff of participating hospitals, in particular P Handslip. Special thanks go to Steve Wyatt for his continued assistance with the project. We also thank our reviewers for their helpful suggestions.

Contributors: MAM drafted the manuscript. MAM and GR did the preliminary analyses. JJD designed and did the statistical modelling to test for interactions, with support from AG. RJL and AJS provided guidance and support. MC provided medical advice and did preliminary investigations into the Charlson index. All authors contributed to the final manuscript. MAM is the guarantor.

Funding: The study was part of a study commissioned by the NHS West Midlands Strategic Health Authority. AG is supported by the EPSRC MATCH consortium.

Competing interests: None declared.

Ethical approval: Not needed.

Cite this as: *BMJ* 2009;338:b780

## References
