Included under terms of UK Non-commercial Government License.
NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.
Sandall J, Murrells T, Dodwell M, et al. The efficient use of the maternity workforce and the implications for safety and quality in maternity care: a population-based, cross-sectional study. Southampton (UK): NIHR Journals Library; 2014 Oct. (Health Services and Delivery Research, No. 2.38.)
The over-riding aim of this research was to understand the relationships between maternity workforce size, skill mix and quality outcomes including patient safety and quality, effectiveness and unit-level efficiency. The following sections first report the unadjusted variations between trusts on the range of selected outcomes. In our adjusted models, we then investigate the relationship between maternity workforce staffing and outcomes, and to what extent there may be an optimal staffing level (objective 1). We then explore the relationship between maternity skill mix and outcomes, and to what extent there may be an optimal staffing mix (objective 2). In both the above analyses we take into account how organisational factors may affect variability in outcomes. Finally, we explore the relationships between maternity workforce size, skill mix and quality outcomes including patient safety and quality, effectiveness and cost (objective 3).
Profile of women who gave birth in 2011
Two-thirds of women who gave birth in 2011 were aged between 20 and 34 years (67%); most were nulliparous (43%) or had one previous live birth (32%) (Table 14). Women were more likely to be classified as higher risk than lower risk (55% vs. 45%) according to the definition of women at increased risk of complications based on the NICE intrapartum care guidelines.80 Included in the 55% were 4% of women who required individual assessment to determine if they were at increased risk of complications. About two-thirds of women were categorised as white British, a further 9% were from another white background, 4% were Pakistani, 3% were African, 3% were Indian and 2% were from other Asian backgrounds. A higher proportion of women were living in a deprived area based on the IMD;78 28% versus 15% from the least deprived quintile. Most women lived in denser urban areas (86%).
TABLE 14
Women’s demographic and sociodemographic profile
Profile of NHS trusts in 2011
Tables 15 and 16 show that most trusts operated their maternity service either through one or more OUs only (42%) or through an OU with an AMU (OU/AMU, 32%). Nearly 30% of trusts were attached to a university and London SHA had the greatest number of trusts (24, 17%). The average number of births per trust in 2011 was 4620 (range 1214 to 10,678). There were 4.80 FTE staff for every 100 births, of which 0.82 FTE were doctors (0.21 to 1.65) (one trust had a particularly low FTE, the next lowest had 0.44 FTE and all other trusts had 0.50 FTE and above), 3.08 FTE were midwives (1.11 to 4.71) and 0.90 FTE were support workers (0.05 to 2.88). The ratio of doctors to midwives averaged 0.27 (0.07 to 0.50) and support workers to midwives 0.30 (0.02 to 0.85).
TABLE 15
Trust profile
TABLE 16
Trust staffing variables
Outcome indicators
The analysis that follows focuses on 10 indicators described in Chapter 2, Table 1 and listed in Table 17. Five of these were composites of three or more other indicators. The 10 indicators have been placed into three groups. The healthy mother and healthy baby indicators form a natural group, as do the mode of birth and caesarean indicators. Healthy outcomes comprised (1) healthy mother (delivery with bodily integrity, mother returned home within 2 days, not readmitted within 28 days and without instrumental delivery, maternal sepsis or anaesthetic complication), (2) healthy baby (baby’s weight 2.5–4.5 kg, gestational age 37–42 weeks, live baby) and (3) healthy mother/healthy baby dyad (cases in which both the mother and the baby were healthy). Mode of birth comprised (4) delivery with bodily integrity (without caesarean, uterine damage, second-/third-/fourth-degree tear, sutures and episiotomy), (5) normal birth (without induction, instrumental and caesarean birth, episiotomy, general and/or regional anaesthetic), (6) spontaneous vaginal delivery and (7) intact perineum. Caesarean indicators comprised (8) elective caesarean, (9) emergency caesarean and (10) all caesareans. The full reports of the models informing the results are in Appendix 4.
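The composite definitions above can be read as simple boolean rules. The following is a minimal sketch of the three healthy outcome composites, not the study's derivation code: the field names are hypothetical, and treating the weight and gestation ranges as inclusive at both ends is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Birth:
    # Hypothetical field names; the study's dataset variables are not listed here.
    bodily_integrity: bool
    home_within_2_days: bool
    readmitted_within_28_days: bool
    instrumental_delivery: bool
    maternal_sepsis: bool
    anaesthetic_complication: bool
    birthweight_kg: float
    gestation_weeks: int
    live_baby: bool


def healthy_mother(b: Birth) -> bool:
    """Indicator (1): every positive condition met and no adverse event."""
    return (
        b.bodily_integrity
        and b.home_within_2_days
        and not b.readmitted_within_28_days
        and not b.instrumental_delivery
        and not b.maternal_sepsis
        and not b.anaesthetic_complication
    )


def healthy_baby(b: Birth) -> bool:
    """Indicator (2): live baby, 2.5-4.5 kg, 37-42 weeks (bounds assumed inclusive)."""
    return (
        b.live_baby
        and 2.5 <= b.birthweight_kg <= 4.5
        and 37 <= b.gestation_weeks <= 42
    )


def healthy_dyad(b: Birth) -> bool:
    """Indicator (3): both the mother and the baby are healthy."""
    return healthy_mother(b) and healthy_baby(b)
```

Because the dyad requires both components, its rate can never exceed the lower of the two component rates.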
TABLE 17
Ten indicators: variation by trust
How each indicator varies across trusts is shown in Table 17 and variation by the variables listed in Tables 15 and 16 is shown graphically in Appendix 3. Intact perineum was the indicator that had the greatest variation between trusts and elective caesarean the least.
Unadjusted variation in outcomes by mothers’ characteristics, sociodemographic, trust-level and staffing variables
Those mothers whose age and IMD were not known were excluded from this and subsequent analyses. In the unadjusted analysis, a large proportion of the variation observed at the level of the individual is attributable to mothers’ age, parity and level of clinical risk (Table 18).
TABLE 18
Variation in outcomes by mother’s characteristics: unadjusted analysis (%)
The variation was much lower for sociodemographic, trust-level and staffing variables than for mothers’ characteristics (see Tables 18–21). Most indicators varied by mothers’ age, with the prime examples being normal birth [20% (aged ≥ 45 years) to 44% (aged ≤ 19 years)], spontaneous vaginal delivery [37% (aged ≥ 45 years) to 73% (aged ≤ 19 years)] and elective caesarean [2% (aged ≤ 19 years) to 30% (aged ≥ 45 years)]. Parity had a strong influence on certain outcomes (e.g. healthy mother, mode of delivery indicators) but less so on the healthy baby outcome. Most outcomes varied considerably by clinical risk; the one exception was intact perineum (43% for both lower and higher risk) but intact perineum varied more by ethnicity than any other indicator (26% to 63%).
TABLE 21
Variation in outcomes by staffing variables: unadjusted analysis (%)
For a number of indicators, deprived mothers had better outcomes (healthy mother, healthy mother/healthy baby dyad, delivery with bodily integrity, normal birth, spontaneous vaginal delivery, intact perineum and elective caesarean) than less deprived mothers. The two exceptions were the healthy baby indicator and emergency caesareans (see Table 19).
TABLE 19
Variation in outcomes by sociodemographic variables: unadjusted analysis (%)
It was difficult to discern any clear pattern of variation by the rural/urban classification or by SHA, although London seemed to perform less well than other regions on a number of indicators. The amount of variation attributable to trust-level factors was generally low (Table 20).
TABLE 20
Variation in outcomes by trust-level variables: unadjusted analysis (%)
University hospitals consistently performed less well than non-university hospitals although the differences tended to be quite small. Configuration appeared not to have much bearing upon outcomes. Trusts of the size 5.03 to 5.79 seemed to have the highest intervention rates and the lowest positive measures of birth. However, it is possible that larger trusts were often split between OU sites, so that the largest OUs were in the fourth quintile. Levels of staffing did not markedly impact upon the variation in outcomes but there were a few exceptions (Table 21).
Higher levels of doctor staffing improved a woman’s chance of delivery with bodily integrity (lowest quintile 31.5%; highest quintile 33.2%) and intact perineum (lowest quintile 42.0%; highest quintile 44.5%). Higher levels of midwifery staffing improved a woman’s chance of delivery with bodily integrity (lowest quintile 29.6%; highest quintile 33.7%) and intact perineum (lowest quintile 39.9%; highest quintile 44.5%) and having a birth where the mother was healthy (lowest quintile 25.5%; highest quintile 28.5%). More support staff improved a woman’s chance of delivery with bodily integrity (lowest quintile 30.0%; highest quintile 33.9%) and intact perineum (lowest quintile 40.4%; highest quintile 45.3%). A doctor-to-midwife ratio of 0.22–0.25 : 1 (second quintile) generally seemed to be best.
A summary of unadjusted analyses has been presented to provide complete information, but because these data are not risk adjusted, these results should be interpreted with caution.
Multilevel modelling
Given the detailed nature of the tables reporting the results of the multilevel models, we have placed them in Appendix 4 and have reproduced the results in summary form to aid the reader (Tables 22–26). Those mothers whose age and IMD were not known were excluded from the analysis. Note there were two different staffing models. The first model adds three variables to the model that contains mothers’ characteristics, sociodemographic and trust-level variables: FTE doctors per 100 maternities, FTE midwives per 100 maternities and FTE support workers per 100 maternities. The second model replaces the three previous staffing variables with FTE all staff per 100 maternities, doctor-to-midwife ratio and support worker-to-midwife ratio.
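The second staffing model is a re-expression of the three variables used in the first. The sketch below shows that transformation under the assumption that each ratio is a simple quotient of the FTE values; the function and key names are illustrative, and the input values are the trust-level averages reported earlier in this chapter.

```python
def second_model_variables(doctors: float, midwives: float, support: float) -> dict:
    """Turn the three first-model variables (FTE per 100 maternities)
    into the three variables used by the second staffing model."""
    if midwives <= 0:
        raise ValueError("midwife FTE must be positive to form the ratios")
    return {
        "all_staff_fte_per_100": doctors + midwives + support,
        "doctor_to_midwife_ratio": doctors / midwives,
        "support_to_midwife_ratio": support / midwives,
    }


# Reported averages: 0.82 doctors, 3.08 midwives, 0.90 support workers per 100 births.
example = second_model_variables(0.82, 3.08, 0.90)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

The total reproduces the reported 4.80 FTE per 100 births. Ratios formed from averages need not match an average of trust-level ratios exactly, which may explain small differences from the reported 0.27 and 0.30.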
TABLE 22
Magnitude of model effects measured using the relative chi-squared value
TABLE 26
Ten indicators: model adjusted for variation between trusts
TABLE 24
Mode of birth indicators: summary of findings from the multilevel models
The residual variance for the 10 intercept-only models (i.e. the model without any independent variables) ranged from 0.203 to 0.283. Based on these estimates, approximately 1–2% of the total variation in the outcome indicators is attributable to differences between trusts whereas 98–99% of the variation is attributable to differences between mothers within trusts. We are not able to say how much of the variation could be due to unknown characteristics that are predictive of outcome, or to variations in the quality of care and/or different models of care received by different women in the same trust.
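One common way to turn a between-trust estimate from a multilevel logistic model into a share of total variation is the latent-variable intraclass correlation, which fixes the within-trust variance at π²/3. The sketch below applies it to the two extremes quoted above. It assumes that the quoted values are standard deviations (sigma) on the logit scale and that this was the method used; neither is stated explicitly here.

```python
import math


def latent_icc(sigma_between: float) -> float:
    """Share of variation lying between trusts, latent-variable method.

    sigma_between is assumed to be a standard deviation on the logit scale.
    """
    level1_variance = math.pi ** 2 / 3  # variance of the standard logistic distribution
    between_variance = sigma_between ** 2
    return between_variance / (between_variance + level1_variance)


for sigma in (0.203, 0.283):
    print(f"sigma = {sigma}: {latent_icc(sigma):.1%} of variation between trusts")
```

Under these assumptions the two extremes give roughly 1.2% and 2.4%, in line with the approximate 1–2% quoted. If the values were variances rather than standard deviations, the same formula would give noticeably larger shares.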
The area under the curve (AUC) statistics discussed below provide some indication of how well these models fit the data. An AUC of 0.5 is no better than tossing a coin whereas an AUC of 1 implies perfect prediction.
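The AUC can be read as the probability that a randomly chosen woman with the outcome receives a higher predicted probability than a randomly chosen woman without it. A minimal sketch of that pairwise definition follows; the function name and toy data are illustrative, and the quadratic loop is only practical for small samples.

```python
def auc(scores: list, labels: list) -> float:
    """Pairwise (Mann-Whitney) AUC; ties between a positive and a negative count as half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy predicted probabilities and observed outcomes (not study data).
print(auc([0.9, 0.7, 0.4, 0.2], [1, 0, 1, 0]))
```

A model that assigns every woman the same predicted probability scores 0.5 under this definition, which is the coin-toss benchmark described above.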
Most of the variation observed for each individual indicator was explained by mothers’ characteristics. The relative chi-squared values for the effects of mother’s age, parity and clinical risk were often of a different order of magnitude from most of the other independent variables entered into the models (see Table 22). Age, parity and clinical risk were the only variables where the relative chi-squared value exceeded 1000 on one or more occasions (four, nine and nine occasions respectively). Parity and clinical risk were the only two variables where the relative chi-squared value exceeded 10,000 (three and nine occasions respectively). Ethnicity and the IMD were the only other two variables where the relative chi-squared value managed to exceed 100 on one or more occasions (two and four occasions respectively).
The effect sizes of the staffing variables were very small by comparison. For example, the relative chi-squared value for clinical risk was 15,841 for the healthy mother indicator whereas the largest effect size for a staffing variable for this indicator was only 3. In Tables 22–25 effect sizes have been summarised using symbols (see key beneath each table). A monotonic relationship is one that increases consistently upwards (or downwards) in either a linear or a curvilinear fashion.
TABLE 25
Caesarean indicators: summary of findings from the multilevel models
There was marginal improvement in a model’s capacity to predict outcomes following the addition of sociodemographic, trust-level and staffing variables. The largest improvement was for intact perineum (AUC 0.722 to 0.732). The improvement in the AUC brought about by adding the staffing variables to the model was negligible with hardly any change in the AUC. Based on the AUC, most models meet the criteria for fair (0.70 to 0.80) to good (0.80 to 0.90) prediction. The model for elective caesarean provided the best predictions (AUC 0.814) and the model for emergency caesareans was least able to predict this outcome accurately (AUC 0.698). Potentially there is capacity to improve the fit of these models by adding further variables to raise the AUC to 0.9 and above, although a number of the variables that might help in this respect were inadequate for use in the analysis (e.g. smoking, body mass index) because they were either poorly or not consistently recorded.
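The bands quoted above can be applied mechanically. In this sketch only the "fair" and "good" labels come from the text; the labels for values outside those bands, and the choice to place a boundary value in the upper band, are assumptions.

```python
def auc_band(auc: float) -> str:
    """Classify an AUC using the bands quoted in the text."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc >= 0.90:
        return "0.90 or above"  # band not named in the text
    if auc >= 0.80:
        return "good"
    if auc >= 0.70:
        return "fair"
    return "below fair"  # band not named in the text


for name, value in [("elective caesarean", 0.814),
                    ("intact perineum", 0.732),
                    ("emergency caesarean", 0.698)]:
    print(f"{name}: AUC {value} -> {auc_band(value)}")
```

Applied strictly, the emergency caesarean model (0.698) falls just short of the fair band, which is consistent with the statement that most, rather than all, models meet the criteria.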
The addition of variables representing mothers’ characteristics into the intercept-only models (i.e. without any independent variables) more often than not resulted in an increase in the residual sigma (between trusts). The only indicators where this did not occur were normal birth (falling from 0.270 to 0.225) and spontaneous vaginal delivery (falling from 0.226 to 0.200). Conversely, sociodemographic variables had the reverse effect with a marked reduction in the variation for four of the indicators when these variables were added to the mother level model: healthy mother (falling from 0.308 to 0.256), healthy mother/healthy baby dyad (falling from 0.306 to 0.257), delivery with bodily integrity (falling from 0.297 to 0.244) and intact perineum (falling from 0.334 to 0.280). This suggests that when studying variability between trusts it is important to include in the model both mothers’ characteristics, which increase variation between trusts, and sociodemographic variables, which decrease variation, otherwise our assessment of the variation will be biased. This provides a better indication of unexplained variation between trusts that adjusts for trust differences in women’s risk profile.
Healthy mother and healthy baby
The results for the three healthy mother and healthy baby indicators (healthy mother, healthy baby, healthy mother/healthy baby dyad) are summarised in Table 23. The detailed results from the multilevel modelling are shown in Appendices 3 and 5.
TABLE 23
Healthy mother and healthy baby indicators: summary of findings from the multilevel models
Mothers’ characteristics
Clinical risk was the most dominant predictor, followed by mother’s parity; there were moderate effects for mother’s age and IMD. The relationship with parity was linear for the healthy mother indicator, with mothers more likely to achieve a healthy outcome with increasing parity, and curvilinear for the healthy baby indicator. Babies were least likely to be born healthy if their mothers were nulliparous, and more likely to be born healthy if this was their mother’s second live birth (parity 1). Parity had a much more dominant effect upon the healthy mother indicator (relative χ2 = 10,018) than on the healthy baby indicator (relative χ2 = 615).
For the healthy mother indicator, the healthy baby indicator and the healthy mother/healthy baby dyad, clinical risk was a strong and dominant predictor with relative chi-squared values of 15,841, 26,718 and 23,436 respectively. The effect of mother’s age on these three indicators was noticeably smaller, with relative chi-squared values ranging from 14 for the healthy baby indicator to 1083 for the healthy mother indicator (noting that mother’s age is a confounder for parity and clinical risk). The relationship with age was linear for the healthy mother indicator, with mothers aged ≤ 19 years [odds ratio (OR) 3.744] most likely and those aged ≥ 45 years (OR 1.00) least likely to have a healthy mother outcome. By comparison, this relationship was far weaker and flatter for the healthy baby indicator with upwards step changes at 25–29 years (OR 1.213) and 40–44 years (OR 1.306) and then a step down at ≥ 45 years (OR 1.000). Babies born to this last age group had the lowest chance of being born healthy.
Ethnicity had a stronger effect upon the healthy mother indicator (relative χ2 = 74) than the healthy baby indicator (relative χ2 = 6). Mothers of white and black Caribbean origin were most likely to experience a healthy mother outcome (OR 1.776) and mothers of Indian origin (OR 0.605) the least likely. For the healthy mother/healthy baby dyad Caribbean mothers (OR 1.657) were most likely and Indian mothers (OR 0.597) were least likely to experience a healthy outcome.
Sociodemographic factors
The chance of a healthy mother outcome was positively associated with deprivation: mothers belonging to the most deprived IMD quintile were more likely to achieve a healthy mother outcome than mothers belonging to the least deprived quintile [OR 1.382, 95% confidence interval (CI) 1.344 to 1.422]. This relationship was reversed, and less strong, for the healthy baby indicator (OR 0.854, 95% CI 0.826 to 0.883). Mothers from the most deprived IMD quintile were most likely to have a birth that resulted in a healthy outcome for both mother and baby (OR 1.323, 95% CI 1.285 to 1.363) noting that the proportion of elective caesareans increases as deprivation decreases (see Table 19: Most deprived 8.5% vs. Least deprived 12.1%). The healthy mother/healthy baby dyad was clearly weighted towards the healthy mother indicator.
There was some variation by SHA and rural/urban classification, although the size of these effects was of a much lower order of magnitude than for mothers’ characteristics. Mothers giving birth in the North East were most likely to have a healthy mother outcome, while those in London were least likely (OR 1.240 vs. 0.891). For the healthy mother/healthy baby dyad the outcome was achieved most often in the East Midlands and least often in London (OR 1.253 vs. 0.907).
Trust factors
Mothers attending a university hospital trust were less likely to give birth to a healthy baby than those attending a non-university trust (non-university vs. university OR 1.134, 95% CI 1.016 to 1.265). Size of trust had no impact although there was a negative effect (i.e. the more births in a trust the poorer the outcome) that approached statistical significance for the healthy mother indicator (OR 0.972, 95% CI 0.944 to 1.001; p = 0.060). This may be because sick mothers and babies were referred to large units, skewing their proportions of healthy mothers and healthy babies, although clinical risk has been controlled for in these models.
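A reported OR and 95% CI are enough to approximate the underlying test. The sketch below recovers the standard error, z statistic and two-sided p-value for the trust size effect on the healthy mother indicator. It assumes a Wald interval that is symmetric on the log-odds scale, which may not be exactly how the published interval was computed, and it works from rounded figures.

```python
import math


def wald_from_ci(odds_ratio: float, lower: float, upper: float, z_crit: float = 1.96):
    """Approximate SE, z and two-sided p from an OR and its 95% CI (log scale)."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return se, z, p


# Trust size, healthy mother indicator: OR 0.972, 95% CI 0.944 to 1.001.
se, z, p = wald_from_ci(0.972, 0.944, 1.001)
print(f"SE = {se:.4f}, z = {z:.2f}, p = {p:.3f}")
```

The recovered p-value is close to, but not identical with, the reported p = 0.060, as expected when working back from figures rounded to three decimal places.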
Staffing factors
Staffing levels were not statistically related to any of the three healthy mother and healthy baby indicators.
Model fit
The AUCs for these models were all in the range 0.70 to 0.80 (fair). The AUC for the healthy mother/healthy baby dyad was highest for the three indicators reported in this section (0.753 vs. 0.742 and 0.726 for the separate healthy mother and healthy baby indicators respectively). There was some reduction in the between-trust variation, after adding the independent variables to the intercept-only model, for the healthy mother indicator model (residual sigma from 0.283 to 0.238), and the mother and baby dyad model (from 0.275 to 0.236), but there was a slight increase in variation for the healthy baby indicator model (from 0.227 to 0.233).
Mode of birth
The results for the four mode of birth indicators are summarised in Table 24. The detailed results from the multilevel modelling are shown in Appendices 3 and 5.
Mothers’ characteristics
Parity and clinical risk were the two dominant predictors of outcome. Increasing parity increased the chance of a delivery with bodily integrity, a normal birth, a spontaneous vaginal delivery and an intact perineum. Being at increased clinical risk of complications during the birth reduced the chances of these outcomes. The effect of clinical risk was stronger than that of parity for normal birth and spontaneous vaginal delivery. The difference was less for delivery with bodily integrity (relative χ2 17,470 vs. 14,185). Parity was far more important than clinical risk in determining whether or not a woman gave birth with her perineum intact (relative χ2 13,310 vs. 945).
The relationship with age was curvilinear for all four indicators. A positive outcome was most likely for mothers aged ≤ 19 years [(≤ 19 years vs. ≥ 45 years) delivery with bodily integrity OR 3.638, 95% CI 3.151 to 4.201; normal birth OR 4.116, 95% CI 3.448 to 4.915; spontaneous vaginal delivery OR 5.877, 95% CI 5.175 to 6.676; intact perineum OR 1.871, 95% CI 1.545 to 2.266]. A positive outcome was least likely for mothers aged ≥ 45 years, although an intact perineum was least likely for mothers aged 35–39 years (OR 0.762, 95% CI 0.630 to 0.922).
The effect of ethnicity was stronger for delivery with bodily integrity and intact perineum than the other two indicators (relative χ2 124 and 158 vs. 28 and 20). Mothers of Caribbean (black or black British) origin were most likely to deliver with bodily integrity (OR 1.799), have a normal birth (OR 1.249) and have an intact perineum (OR 2.242). In contrast, mothers of Indian and Chinese ethnicity were least likely to experience a positive outcome for delivery with bodily integrity (OR 0.594 and 0.627 respectively). Indian mothers were least likely to experience a spontaneous vaginal delivery (OR 0.830) and Chinese mothers to have an intact perineum (OR 0.563). Mothers of Irish origin were least likely to experience a normal birth (OR 0.731) and white and black Caribbean mothers (mixed) were most likely to experience a spontaneous vaginal delivery (OR 1.304).
Sociodemographic factors
Deprivation had a stronger effect upon delivery with bodily integrity and intact perineum than the other two indicators (relative χ2 300 and 337 vs. 32 and 23). Women living in more deprived areas were more likely to deliver with bodily integrity, have a normal birth, experience a spontaneous vaginal delivery and have an intact perineum. The effects of geographical location, defined by SHA, and type and density of the area in which mothers lived were of a much lower order of magnitude than the effects of mothers’ characteristics. The following outcomes were more likely to occur for mothers living in the East Midlands (delivery with bodily integrity, normal birth), North West (spontaneous vaginal delivery) and North East (intact perineum) and less likely to occur for those living in London (delivery with bodily integrity, normal birth, spontaneous vaginal delivery) and the South East Coast (intact perineum).
Trust factors
Giving birth in larger trusts with more deliveries lowered the chances of delivery with bodily integrity (OR 0.975, 95% CI 0.952 to 0.999) and an intact perineum (OR 0.971, 95% CI 0.945 to 0.998) but the effects were small (relative χ2 4 and 5 respectively). For spontaneous vaginal delivery (OR 1.090, 95% CI 1.012 to 1.175) the outcome was better in trusts not attached to a university but again the effect was small (relative χ2 5). Trust configuration, i.e. whether or not it had midwife-led units, appeared to have no effect upon mode of birth outcomes.
Staffing factors
A higher number of midwives (FTE per 100 maternities) was associated with improved chance of delivery with bodily integrity (OR 1.110, 95% CI 1.005 to 1.227) and an intact perineum (OR 1.132, 95% CI 1.010 to 1.268). The second staffing model suggests for both these indicators that higher levels of overall staffing increased the chances of a positive outcome (delivery with bodily integrity OR 1.079, 95% CI 1.016 to 1.147; intact perineum OR 1.092, 95% CI 1.019 to 1.170).
Model fit
The AUCs for these models were all in the range 0.70 to 0.80 (fair), similar to the healthy mother and healthy baby indicators. The AUC for normal birth was highest (0.756 vs. 0.733, 0.724 and 0.732 for delivery with bodily integrity, spontaneous vaginal delivery and intact perineum respectively). There was a reduction in the between-trust variation (residual sigma) across all four indicators, particularly for normal birth, from 0.270 to 0.198 when mothers’ characteristics, sociodemographics, trust-level and staffing variables were added to the intercept-only model (i.e. the model that contains no independent variables). This reduction was somewhat smaller for delivery with bodily integrity (0.283 to 0.230), spontaneous vaginal delivery (0.226 to 0.179) and intact perineum (0.283 to 0.263).
Caesareans
The results for the three caesarean indicators are summarised in Table 25. The detailed results from the multilevel modelling are shown in Appendices 3 and 5.
Mothers’ characteristics
The chances of a caesarean were lowest for mothers aged ≤ 19 years [(≤ 19 years vs. ≥ 45 years) elective caesarean OR 0.130, 95% CI 0.112 to 0.152; emergency caesarean OR 0.348, 95% CI 0.302 to 0.402], rising thereafter with increasing age.
The relationship between the chance of caesarean and parity depended on whether it was elective or emergency. For elective caesareans, the probability was lowest for nulliparous mothers and highest for mothers with two previous live births. For emergency caesareans, nulliparous women were the most susceptible, with a sharp decline in risk thereafter. Women at increased clinical risk were far more likely to undergo a caesarean whether this was elective (OR 22.666, 95% CI 21.726 to 23.647) or emergency (OR 3.263, 95% CI 3.208 to 3.319). The relative chi-squared value for increased clinical risk (20,858) was of a much higher order of magnitude than for either mother’s age (713) or parity (1785) for elective caesarean. The difference was smaller for emergency caesarean (parity 4290 vs. clinical risk 18,388).
Some variation between ethnic groups remains in the model having adjusted for all other independent variables. Irish women were most likely, and white and black Caribbean (mixed) women least likely, to undergo an elective caesarean (OR 1.233 vs. 0.692). African women were most likely, and women of any other white background were least likely, to have an emergency caesarean (OR 1.453 vs. 0.841).
Sociodemographic factors
Women living in the most deprived area, based on the IMD, were less likely to undergo an elective caesarean than those living in the least deprived areas [(most deprived vs. least deprived) OR 0.816, 95% CI 0.789 to 0.844] having corrected for mothers’ characteristics (age, parity, ethnicity, clinical risk), other sociodemographic, trust-level and staffing variables. The relationship is reversed for emergency caesarean and the effect of IMD is less strong [(most deprived vs. least deprived) OR 1.113, 95% CI 1.081 to 1.145]. There was some variation across SHAs for all caesareans, with women living in London most likely to have a caesarean (OR 1.119) and women in East Midlands least likely (OR 0.875).
Trust-level factors
None of the trust-level variables (trust size, university trust status or configuration) was statistically associated with the chances of undergoing a caesarean.
Staffing factors
Staffing variables had a non-significant effect upon the chances of a caesarean. The correlation between ONS maternities (size) and FTE doctors per 100 maternities is –0.20, p = 0.017, n = 143.
Model fit
The overall model for elective caesareans achieved a noticeably higher AUC statistic than for emergency caesareans (0.814 vs. 0.698) and therefore was far better at predicting the outcome.
Trust variation
Variation between trusts represents a comparatively small component of the overall variation (approximately 1–2%). This variation was not substantially reduced by the addition of the independent variables, although reduction was more evident amongst the mode of delivery indicators (see Table 26).
The funnel plots for the 10 indicators (see Appendix 3) provide further confirmation of this finding. The funnel plot limits suggest that there was more variability than expected by chance, with more data points outside the control limits. One possible reason could be that important explanatory variables have been omitted from the model. The funnel plots generally confirmed what was found in the multilevel models, based on the change in the residual variance estimates, that the models did not reduce the variability between trusts to any great degree.
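For readers unfamiliar with funnel plots, the sketch below shows how control limits narrow as the number of births in a trust grows. The construction of the plots in Appendix 3 is not described here, so the normal approximation to the binomial and the 95% limits are assumptions. The 43% rate is the unadjusted intact perineum rate quoted earlier, and the two sample sizes are the smallest and largest numbers of births per trust.

```python
import math


def funnel_limits(overall_rate: float, n_births: int, z: float = 1.96):
    """Approximate control limits for a trust with n_births deliveries."""
    if not 0.0 < overall_rate < 1.0 or n_births <= 0:
        raise ValueError("rate must be in (0, 1) and n_births positive")
    se = math.sqrt(overall_rate * (1.0 - overall_rate) / n_births)
    return max(0.0, overall_rate - z * se), min(1.0, overall_rate + z * se)


for n in (1214, 10678):
    low, high = funnel_limits(0.43, n)
    print(f"n = {n}: {low:.3f} to {high:.3f}")
```

If only chance were at work, about 95% of trusts would be expected to fall inside such limits. A larger share falling outside is the overdispersion that the paragraph above describes.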
Sensitivity analysis
The sensitivity analysis consisted of refitting the model using a three-category clinical risk variable (lower, individual assessment, higher), fitting the model to the 50 trusts that operated their maternity service solely through a single OU, testing the interactions between parity and individual staffing variables (FTE doctors per 100 maternities, FTE midwives per 100 maternities, FTE support workers per 100 maternities) and between clinical risk and the same staffing variables. It was not possible to fit all the interactions simultaneously to the pre-existing (main effects) model because of data dependency issues, so each interaction variable (e.g. parity × FTE doctors per 100 maternities) was fitted separately to the model. The final sensitivity analysis compared the healthy baby indicator multilevel model for all mothers with the model confined to deliveries that were not preterm (< 37 weeks) and were not antepartum stillbirths. Note that the model for the healthy mother indicator, confined to the same subset, was similar to the model for all mothers and for that reason is not reported here.
A consistent picture emerged when the parameter estimates from the pre-existing model (clinical risk dichotomised into lower and higher risk) were compared with the model using the three-category clinical risk variable. Those women whose risk was based on an individual assessment nearly always had a better outcome than those at higher risk (see Appendix 5). The one exception was for intact perineum; women in this group had poorer outcomes than women in the higher-risk group (OR 0.782 vs. 0.812).
The single OU analysis (see Appendix 5) seemed to throw up differences in the relationships between trust-level variables and the indicators from the all trusts model. There was a tendency for London SHA to improve its position. For example, London rose from last place to joint fifth for the healthy mother outcome and from ninth to fourth for intact perineum. For the healthy baby indicator there was statistically significant variation between SHAs that was less evident previously (p = 0.079 vs. p < 0.001). The effect of trust size upon the healthy mother outcome strengthened, and the healthy mother/healthy baby dyad outcome was now less likely to occur in bigger trusts. This finding was replicated for delivery with bodily integrity, normal birth, spontaneous vaginal delivery, intact perineum, emergency caesarean and all caesareans. Attending a university trust was now advantageous, compared with previously, in terms of normal birth outcome (p = 0.050). Non-university trusts were no longer better for the healthy baby indicator or for spontaneous vaginal delivery. More support workers reduced the chance of a healthy baby outcome (p = 0.048). The effect of FTE midwives upon delivery with bodily integrity was no longer significant despite the β-coefficient increasing in size from 0.105 to 0.113. This was also the case for intact perineum (β 0.124 to 0.147).
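The β-coefficients quoted above are on the log-odds scale, so exponentiating them gives the corresponding ORs. The sketch below performs that conversion for the four coefficients mentioned. It assumes that each β refers to a one-unit increase in FTE midwives per 100 maternities, and the labels are illustrative.

```python
import math


def odds_ratio(beta: float) -> float:
    """Convert a logistic regression coefficient (log-odds) to an odds ratio."""
    return math.exp(beta)


coefficients = {
    "bodily integrity, all trusts": 0.105,
    "bodily integrity, single OU": 0.113,
    "intact perineum, all trusts": 0.124,
    "intact perineum, single OU": 0.147,
}
for label, beta in coefficients.items():
    print(f"{label}: beta = {beta:.3f}, OR = {odds_ratio(beta):.3f}")
```

The all-trusts values agree, to within rounding, with the ORs of 1.110 and 1.132 reported for FTE midwives in the mode of birth section. A larger coefficient that is no longer significant is what would be expected when the same effect is estimated from fewer trusts (50 rather than all), because the standard error widens.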
A decision was taken to ascertain whether or not the effect of staffing levels upon outcomes could vary according to either a woman’s parity or her clinical risk by fitting interaction terms to the main model. These tests are shown in full in Appendix 6 and are summarised in Tables 27–29. Five tests involving the midwifery staffing variable were not possible because starting values for the model covariance matrix could not be obtained. Tables 27–29 show the OR at each level of the categorical variable (parity, clinical risk), the combined OR in brackets and the probability values associated with the chi-squared interaction test.
TABLE 27
Healthy mother and healthy baby indicators: effect of staffing levels by parity and clinical risk (from the multilevel model)
TABLE 29
Mode of birth indicators: effects of staffing levels by parity and clinical risk (from the multilevel model)
For example, in Table 27 the odds of a healthy mother outcome associated with staffing levels of doctors vary by parity. The effect of increasing the number of doctors is stronger for nulliparous women than it is for women of higher parity. The overall effect of FTE doctors for nulliparous women is calculated by multiplying 0.95 (shared across all parities) and 1.20 to give an OR of 1.14. For women of parity 4 and above, the OR is 0.95.
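The multiplication described above can be sketched in a few lines. This is an illustration only: the function names are hypothetical, and the only figures used are the two ORs quoted in the text (0.95 and 1.20). Multiplying ORs is the same as adding the corresponding coefficients on the log-odds scale, which the second function makes explicit.

```python
import math


def combined_odds_ratio(shared_or, level_or):
    """Combine the OR shared across all levels with a level-specific OR."""
    return shared_or * level_or


def combined_via_log_odds(shared_or, level_or):
    """Same calculation on the log-odds scale: add coefficients, then exponentiate."""
    return math.exp(math.log(shared_or) + math.log(level_or))


# Figures quoted in the text for FTE doctors and nulliparous women.
nulliparous = combined_odds_ratio(0.95, 1.20)
print(round(nulliparous, 2))  # 1.14
```

A level-specific multiplier of 1.00 would leave the shared OR unchanged, which is consistent with the OR of 0.95 quoted for women of parity 4 and above.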
Therefore, increasing FTE doctors improves outcomes for nulliparous women but has the reverse effect for women of parity 3 or more, although it should be noted that the ORs for women falling into these two higher parity groups are close to 1 (parity 3 OR = 0.96, parity 4 or more OR = 0.95). The results in Table 27 suggest that higher staffing levels for doctors will result in improved outcomes for nulliparous women. The chances of the healthy mother outcome being met are reduced when the number of support workers is increased, irrespective of parity (ORs range from 0.87 to 0.93). Support worker staffing levels are not associated with a healthy baby outcome (ORs range from 0.91 to 1.00). The chance of a healthy mother/healthy baby dyad outcome mirrors the finding for the healthy mother outcome (ORs range from 0.87 to 0.90).
Midwives have a more positive bearing upon the outcomes of women at lower risk for all three indicators in Table 27, whereas this observation applies to only the healthy mother indicator for doctors.
The results are presented in the same manner for mode of birth indicators (Table 28). It was not possible to fit the FTE midwives × parity and FTE midwives × clinical risk interactions to the normal birth model because obtaining initial estimates for the model covariance matrix was not possible.
TABLE 28
Mode of birth indicators: effects of staffing levels by parity and clinical risk (from the multilevel model)
Higher staffing levels of doctors decreased the chances of a normal birth in the lower parities and increased the chances of a spontaneous vaginal delivery in the higher parities. Women giving birth with an intact perineum were associated with higher staffing levels of midwives and support workers, especially in the higher parities. The effect of higher support worker levels on delivery with bodily integrity was advantageous for lower-risk women (OR 1.04) and not advantageous for higher-risk women (OR = 0.96). This broad trend that favoured lower-risk women was also apparent for normal birth (OR 1.06 vs. 0.96) and intact perineum (OR 1.04 vs. 0.99). Conversely, more doctors decreased the chances of a spontaneous vaginal delivery in lower-risk women (OR = 0.94) and increased the chances in higher-risk women (OR = 1.06).
Higher staffing levels of doctors were associated with bigger reductions in the proportion of elective caesareans in the higher parities (p = 0.001) (see Table 29). The combined ORs for FTE doctors ranged from 1.00 for parity 0 women to 0.75 for women of parity 2. The trend for midwives was similar but less strong statistically (p = 0.003). The effect of more doctors upon the chances of an emergency caesarean was felt most by women of parity 2 (OR 1.03) and least by women of parity 4 or more (OR 0.84). A higher number of doctors therefore reduces the odds of an emergency caesarean amongst women in the lowest (OR 0.87) and highest parities (OR 0.84). For all caesareans, a higher number of doctors is associated with fewer caesareans in the highest parities.
A higher number of doctors is associated with a reduction in the number of elective caesareans in the clinically higher-risk group (OR 0.84), but is associated with an increase in the number for lower-risk women (OR 1.43). This statistically significant interaction (p < 0.001) is not replicated for emergency caesareans (p = 0.38). The effect of doctor staffing levels upon the chances of an emergency caesarean is similar in both the higher- and lower-risk groups (OR 0.89 vs. 0.93). The all caesareans indicator incorporates the characteristics of both elective and emergency caesareans, so a higher number of doctors reduces the chances of any caesarean, but more so in the clinically higher-risk group [(higher vs. lower) OR 0.83 vs. 0.96].
Clinical risk is a less dominant predictor of the healthy baby outcome when preterm births (< 37 weeks) and antepartum stillbirths were excluded from the analysis sample (Table 30). The number of deliveries contributing to the analysis dropped from 431,391 to 403,052, a fall of 6.6%. The results from the sensitivity analysis show the model parameters and global statistical tests for model effects for all deliveries and deliveries that were not preterm births or antepartum stillbirths. While remaining the dominant factor, the effect of clinical risk, as measured by the chi-squared value, fell from 26,718 to 9606, and the trust-level random variance increased from 0.233 to 0.324 for the model that included mother-level, sociodemographic, trust-level and staffing variables. The model excluding preterm births and antepartum stillbirths is less able to predict the healthy baby outcome, with the area under the curve falling from 0.726 to 0.675. There is a stronger effect of midwife staffing when preterm births and antepartum stillbirths are excluded (OR 1.172, 95% CI 0.991 to 1.387; p = 0.063) than found for all deliveries (OR 1.029, 95% CI 0.912 to 1.161; p = 0.65).
TABLE 30
Healthy baby multilevel model with and without preterm births and antepartum stillbirths
Staffing, outcome and costs
The final objective was to examine the relationship between maternity workforce staffing levels, quality and safety outcomes and cost. Data on medical staffing were not detailed enough to ascertain the split between obstetric and gynaecological responsibilities in a trust. Therefore, this analysis investigated the relationships between midwifery staffing levels, where data quality was very good, midwifery-related outcome measures and the cost of providing maternity services for NHS trusts in England. This section draws on analyses conducted for the RCM.95
For this analysis we used trust-level data to investigate relationships between outcome measures, midwifery staffing levels and the cost of providing maternity services for NHS trusts in England.
Initial analyses
Number of deliveries
Maternities for each site derived from ONS data matched well with the number of deliveries calculated using reference cost data. Two NHS trusts had discrepancies, with > 70% more deliveries according to ONS data than recorded in the activity from reference costs. These were excluded from further analyses. For further analyses, the number of deliveries derived from reference costs was used. In the following text the expression y ∼ x ⊕ z is used for the schematic relationships of ‘y depends on x and z’ being tested at each stage, although the functional relationship may be more complicated if transformations or link functions have been used.
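As a reading aid, the schematic y ∼ x ⊕ z can be written out for the simplest case. This is a sketch under an assumption, namely that the terms enter additively in a linear predictor. The trust subscript t, the coefficients β and the link function g are notation introduced here for illustration and are not taken from the report.

```latex
y_t = \beta_0 + \beta_x x_t + \beta_z z_t + \varepsilon_t
\qquad\text{or, with a link function,}\qquad
g\bigl(\mathrm{E}[y_t]\bigr) = \beta_0 + \beta_x x_t + \beta_z z_t
```

On this reading, the bracketed terms in the schematics that follow, such as (⊕ risk ⊕ parity ⊕ age ⊕ IMD), would be further additive case-mix terms.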
Trust profiles and average costs
The average and range of variables used for the analyses are shown in Table 31.
TABLE 31
Variables used in cost analysis
Hypothesis 1: investing in providing good antenatal services results in reduced operative delivery rates and lower delivery costs
Operative delivery rate ∼ antenatal cost per delivery (⊕ risk ⊕ parity ⊕ age ⊕ IMD)
When the case-mix adjustments were included in the model there was no relationship with antenatal spend. Including the reference costs category ‘attendance’ (either face to face or not face to face, which is largely thought to be antenatal contact time) in the definition of antenatal expenditure reduced the dependence further.
Delivery cost per delivery ∼ antenatal cost per delivery ⊕ midwives per delivery ⊕ trust size ⊕ (age ⊕ % nulliparous ⊕ % risk ⊕ IMD)
An initial analysis showed that delivery costs have a significant dependence on the antenatal spend, with increased antenatal costs associated with a decrease in delivery costs. However, taken at face value, the extra amount incurred on antenatal care exceeded the reduction in delivery costs.
Figure 1 shows that there are around a dozen trusts with very high antenatal costs and correspondingly low delivery costs which are responsible for the regression relationship. When the analysis was repeated excluding trusts with antenatal costs in excess of £1500 there was no relationship between antenatal cost and delivery cost per delivery.

FIGURE 1
Relationship between antenatal cost/delivery and delivery cost/delivery (adjusted for Market Forces Factor).
Hypothesis 2: trusts with a high antenatal expenditure as a proportion of all maternity expenditure will have lower operative delivery rates
Operative delivery rates ∼ antenatal/total costs (⊕ risk ⊕ parity ⊕ age ⊕ IMD) ⊕ trust size
No relationship was found between the proportion of maternity expenditure spent on antenatal care and operative delivery rates after adjusting for maternal characteristics and trust size.
Hypothesis 3: there is a relationship between delivery cost per delivery and midwifery staffing levels
Delivery cost per delivery ∼ midwives per delivery ⊕ trust size (⊕ risk ⊕ parity ⊕ age ⊕ IMD)
Higher midwifery staffing levels were associated with higher costs of each delivery, although the relationship was not strong. This relationship strengthened when antenatal expenditure was included as an explanatory variable in the model.
Only around 17% of the variation between trusts’ delivery costs could be accounted for by variables included in this model (midwives per delivery, trust size and case mix), rising to 23% when antenatal expenditure was taken into account (hypothesis 2b). The remaining variation in the average cost of a delivery was not accounted for by maternal characteristics, size of trust, number of FTE registered midwives employed or antenatal spend and must be due to other factors not included in the analysis.
Hypothesis 4: outcome measures are correlated with the delivery cost per delivery
Delivery cost per delivery ∼ operative delivery rate (⊕ risk ⊕ parity ⊕ age ⊕ IMD)
The analysis found that higher operative delivery rates were not significantly associated with higher delivery costs after adjusting for maternal characteristics. Variations in costs between trusts were not related to the numbers of women having operative deliveries.
Delivery cost per delivery ∼ normal birth rate (⊕ risk ⊕ parity ⊕ age ⊕ IMD)
There was no association between delivery cost per delivery and the normal birth rate.
Delivery cost per delivery ∼ intact perineum rate (⊕ risk ⊕ parity ⊕ age ⊕ IMD)
There was no association between delivery cost per delivery and the intact perineum rate. As the denominator for intact perineum was vaginal deliveries, the analysis was repeated using delivery cost per vaginal delivery but this also showed no association.
Delivery cost per delivery ∼ healthy mother and healthy baby outcomes (⊕ risk ⊕ parity ⊕ age ⊕ IMD)
There was no association between delivery cost per delivery and any of the three healthy mother and healthy baby indicators (healthy mother, healthy baby and healthy mother/healthy baby dyad).
CQC scores ∼ delivery cost per delivery
No relationship could be established between delivery cost per delivery and women’s experience of maternity care as measured by the average of the CQC scores.
CQC scores ∼ antenatal cost per delivery
No relationship could be established between antenatal cost per delivery and women’s experiences of antenatal care as measured by the antenatal component of the CQC scores.
Hypothesis 5: trusts with high operative delivery rates have a high postnatal cost as a proportion of all maternity expenditure
Postnatal costs per delivery/total costs per delivery ∼ operative delivery rates (⊕ risk ⊕ parity ⊕ age ⊕ IMD)
A relationship could not be found that explained postnatal costs in terms of variations in operative delivery rates once adjustments were made.
Hypothesis 6: a trust’s total expenditure on maternity services is related to the characteristics of the women who use the trust
Total costs per delivery ∼ trust size ⊕ risk ⊕ parity ⊕ age
Having a higher proportion of women at increased risk of complications was associated with more expensive maternity care. Level of social deprivation approached significance (p = 0.052) with deliveries in trusts with a higher proportion of women living in deprived areas having more costly deliveries. However, in this analysis trust size and case mix accounted for only 22% of the variation in total delivery costs.
Some of the previous analyses also showed relationships between costs and maternal characteristics. For trusts with a larger proportion of women at increased risk of complications, delivery costs were higher and they spent a higher proportion of their total maternity costs on postnatal care, and trusts with a higher proportion of nulliparous women had higher costs per delivery.
Hypothesis 7: the efficiency of a trust measured in terms of the expenditure per delivery is not correlated with the size of the trust measured by number of deliveries
A number of previously described analyses included trust size as an explanatory variable. Larger trusts, measured by the numbers of deliveries, appeared to offer maternity services which cost less than those in smaller trusts. When the analysis was repeated for the 50 trusts whose maternity services are delivered by a single OU, the significant effect of size increased, suggesting that the economies of scale were greater in a single site.
The size of the trust had no relationship to the cost of a delivery only (i.e. when costs of antenatal or postnatal care were excluded). Deliveries were not cheaper in larger units once adjustments had been made for maternal characteristics and numbers of midwives. This lack of relationship held when the analysis was repeated for the 50 trusts that delivered maternity services through only a single OU.
Outcome measures and trust size
Larger trusts were associated with lower CQC scores of women’s experience of maternity care. When broken down into components, women’s experience of antenatal care and women’s experience of labour and birth were poorer in larger trusts. No significant relationships were found between trust size and staff during labour and birth, care in hospital after the birth or feeding the baby during the first few days, although there was a trend close to significance for larger trusts to have lower scores. Overall, the analysis of CQC was very sensitive to individual data points, with significant results being lost once outliers were removed. There was no relationship between trust size and the other outcome measures: normal birth, intact perinea and number of complaints.
Economic modelling
Descriptive statistics
For the economic analysis, the unit of analysis was the hospital trust and therefore the patient-level data were aggregated to trust level. Table 32 presents the descriptive statistics for the aggregated data. Two measures of maternity output are presented. The first measure of output is the total number of deliveries within the trust, which has a sample mean (SD) of 4600 (1991) deliveries. The second is the total number of deliveries weighted by relative cost, which combines vaginal and caesarean deliveries based upon the HRG tariff assigned to these modes of delivery. The purpose is to take account of the fact that caesarean deliveries require greater resources and have higher values (prices) attached to them. In comparison with total deliveries, the sample mean (SD) for the cost-weighted deliveries is 5740 (2491). The variation in both measures of output is relatively large.
TABLE 32
Descriptive statistics for the aggregated data
As described in the methods and data section of the report, the staffing variables are grouped into four categories: midwives, support staff, consultants and all other doctors. Note that for completeness the econometric analysis was repeated with consultants and other doctors combined into one variable but the results were statistically indistinguishable. However, there is a strong theoretical reason to consider these two groups separately, especially in relation to their complementarity or substitutability with other staffing groups. By far the largest staff group is midwives (mean FTE 135), followed by support workers (mean FTE 42). There are roughly half as many consultants as other categories of doctors (mean FTE 11 compared with 24). There is very little variation in the average number of consultants (SD 0.60 FTE), which is likely to be the result of regulations regarding consultant cover. There is relatively large variation in the number of midwives (mean FTE 135, SD 6.5). It is worth repeating that the workforce variables, and the consultant numbers in particular, are at best a proxy for delivery suite staffing levels, because staff will inevitably spend time delivering gynaecological services that could not be separated from time spent in obstetric services.
A number of control variables were included to adjust for any differences in case mix faced by trusts. The choice of variables was informed by the primary multilevel statistical analysis reported above, and these variables were aggregated to generate trust-level averages. The included variables were maternal age (mean 29 years), parity (mean 1), and the proportion of mothers classified as high risk using the NICE criteria80 (mean 50%). In all cases, the variation across the sample was low, with SDs of 1, 0.30, and 6.36 respectively. This supports the findings from the multilevel analysis that reported a high degree of intrahospital variation at patient-level characteristics but relatively little interhospital variation.
Production function results
Table 33 reports the parameter estimates of an ordinary least squares regression of the Diewert production function92 as specified in the methods and data section of the report. Table 34 presents the calculated marginal productivities (estimated at the means of the variables) and the Hicks elasticities of complementarity,64 again following the methods outlined earlier.
TABLE 33
Parameter estimates for generalised linear production function
TABLE 34
Estimates of Hicks elasticities of complementarity
In Table 33 the two alternative output measures are reported side by side in columns (1) and (2). The results are similar across the two output measures, with differences that are to be expected. The regression coefficients themselves are unhelpful in understanding the relationship between staffing levels, skill mix and the level of output produced by maternity units. Instead, the marginal productivities and the elasticities of complementarity are of greater interest and these are reported in Table 34. The marginal productivities – the number of additional deliveries that would be expected on average to occur if a staffing group was increased by one additional person per year – are calculated from the regression coefficients.
Despite a relatively high adjusted R-squared (> 88% in both models), very few parameters are statistically significant in either model at any of the commonly used levels of significance. This is probably because of the relatively low degrees of freedom in this aggregated data set (141 observations to fit 15 parameters) and the relatively high correlation among the explanatory variables which results from a production function such as the Diewert92 that includes multiple cross-products (interaction effects). Therefore the model suffers from multicollinearity, which is further evidenced by the table of variance inflation factors (VIFs) (Table 35). In Table 35, all of the variables except maternal age and proportion of mothers classified as high risk using the NICE criteria80 have a VIF in excess of 10, with many of the cross-products having VIFs in the thousands.
TABLE 35
Variance inflation factors (ordered by magnitude)
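To make the VIF threshold of 10 concrete, the sketch below uses the standard definition VIF = 1/(1 − R²), where R² comes from regressing one explanatory variable on the others. It then shows how a square-root cross-product term tracks one of its own components closely. The staffing figures are hypothetical values invented for illustration and are not the study data; with only two regressors, R² reduces to the squared Pearson correlation, which keeps the example free of any regression library.

```python
import math


def vif_from_r_squared(r_squared):
    """Variance inflation factor for a regressor whose auxiliary regression has this R^2."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# A VIF of 10 corresponds to an auxiliary R^2 of 0.9.
threshold_vif = vif_from_r_squared(0.9)

# Hypothetical FTE figures for five trusts (NOT taken from the report).
midwives = [90.0, 110.0, 135.0, 150.0, 180.0]
doctors = [18.0, 20.0, 24.0, 27.0, 30.0]

# Square-root cross-product of the kind used in a generalised linear production function.
cross = [math.sqrt(m * d) for m, d in zip(midwives, doctors)]

# With only these two regressors, the auxiliary R^2 is the squared correlation.
r = pearson_r(midwives, cross)
vif_two_regressors = vif_from_r_squared(r * r)
print(vif_two_regressors > 10)
```

In this toy case the cross-product is almost a linear function of the midwife count, so the VIF is far above 10 even before the remaining cross-products are added.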
Multicollinearity is a data problem rather than an econometric problem and the optimal solution is to gather more data to reduce the relative size of the SEs. We have an exhaustive sample and, therefore, obtaining further data is impossible in this case. Given the theoretical importance of the variables, omitting, combining or transforming the variables is not a satisfactory solution either. Ridge regression was attempted but the results differed little from the original ordinary least squares estimates, which we therefore continue to report here.
The marginal products are reported in Table 34, which estimates the change in output that would occur from a marginal change in each labour input holding the remaining labour variables constant. The top panel (A) in Table 34 reports these for the production function that uses all deliveries as the output measure, and the bottom panel (B) reports the marginal products for the model that uses a cost-weighted output measure. The results are qualitatively similar, so we shall confine ourselves to a discussion of the total deliveries marginal products. The marginal productivities are all positive, indicating that increasing any staff group would increase the production of deliveries that a provider could handle. This means (in non-economic terms) the number of deliveries an institutional provider could handle or accommodate. For example, adding an additional FTE midwife would allow a hospital to produce an additional 18 deliveries on average per annum. The marginal productivities for all types of delivery are highest for consultants (32 additional deliveries) and other doctors (43 additional deliveries). When the deliveries are weighted by their HRG tariff or cost then midwives become the most productive, followed by consultants. It is important to remember that the cost-weighted results tell us nothing about the actual number of deliveries and should not be relied upon for workforce planning or policy decisions.
Table 34 also reports the estimates of the elasticities of complementarity between the different staffing groups in the production of ‘deliveries’. Again, as the conclusions that are drawn from the two alternative output measures are the same, we shall limit ourselves to the discussion of the top panel (A) for total deliveries. To reiterate, a positive elasticity indicates that the two staffing groups are complements (need to be used together) and a negative elasticity indicates that the two staffing groups are substitutes (can be used in place of one another). Of the six possible combinations of staffing groups (the cross-products in the regression model), three of them are complements and three are substitutes. Therefore it is clear that a flexible production function such as ours which allows inputs to be complements is superior to more rigid, but simpler, specifications which force all inputs to be substitutes.
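The calculation of marginal products and elasticities of complementarity can be sketched as follows. Several assumptions are made here and should be read as such. The production function is assumed to take the form Y = Σ β_i x_i + Σ_{i<j} γ_ij √(x_i x_j), with the intercept and case-mix controls ignored; the report's exact specification is given in its methods section and may differ. The Hicks elasticity is computed as Y·f_ij/(f_i·f_j), the usual expression for a linearly homogeneous function. The mean FTE values are those quoted earlier in the text, but every coefficient is invented: the signs were chosen to mirror the pattern of complements and substitutes described in the text, and the magnitudes carry no meaning.

```python
import math

STAFF = ("midwives", "support", "consultants", "other_doctors")

# Mean FTE values quoted in the descriptive statistics.
MEAN_FTE = {"midwives": 135.0, "support": 42.0, "consultants": 11.0, "other_doctors": 24.0}

# Hypothetical coefficients (NOT the estimates in Table 33).
BETA = {"midwives": 10.0, "support": 20.0, "consultants": 20.0, "other_doctors": 25.0}
GAMMA = {
    ("midwives", "consultants"): 30.0,
    ("midwives", "other_doctors"): 20.0,
    ("midwives", "support"): -15.0,
    ("support", "consultants"): 10.0,
    ("consultants", "other_doctors"): -12.0,
    ("support", "other_doctors"): -5.0,
}


def output(x, beta, gamma):
    """Y = sum_i beta_i * x_i + sum_{i<j} gamma_ij * sqrt(x_i * x_j)."""
    total = sum(beta[s] * x[s] for s in STAFF)
    for (a, b), g in gamma.items():
        total += g * math.sqrt(x[a] * x[b])
    return total


def marginal_product(x, beta, gamma, s):
    """Partial derivative of Y with respect to staff group s."""
    mp = beta[s]
    for (a, b), g in gamma.items():
        if s == a:
            mp += 0.5 * g * math.sqrt(x[b] / x[a])
        elif s == b:
            mp += 0.5 * g * math.sqrt(x[a] / x[b])
    return mp


def hicks_elasticity(x, beta, gamma, a, b):
    """Y * f_ab / (f_a * f_b) for two different staff groups a and b."""
    g = gamma.get((a, b), gamma.get((b, a), 0.0))
    cross_partial = 0.25 * g / math.sqrt(x[a] * x[b])
    y = output(x, beta, gamma)
    return y * cross_partial / (
        marginal_product(x, beta, gamma, a) * marginal_product(x, beta, gamma, b)
    )


for s in STAFF:
    print(s, round(marginal_product(MEAN_FTE, BETA, GAMMA, s), 1))
```

Under this form, when output and both marginal products are positive, the sign of each elasticity follows the sign of the corresponding cross-product coefficient, which is why a specification with freely signed cross-products can represent both complements and substitutes.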
Midwives are quantity-complements with consultants and other doctors in the total number of deliveries produced by an organisation, as indicated by the positive Hicks elasticities of complementarity in the top panel of Table 34. However, they are quantity-substitutes with support workers. Besides being complements with midwives, consultants are also complements with support workers but are substitutes with other doctors. This makes intuitive sense given the overlap in their scope of work. It is important to remember that these are quantity complements or substitutes. However, substituting support workers for midwives may also have an impact on some aspects of the quality of care depending on the groups of women involved or the care setting.