Methods to identify postnatal depression in primary care: an integrated evidence synthesis and value of information analysis

Hewitt CE, Gilbody SM, Brealey S, et al.


Depression accounts for the greatest burden of disease among all mental health problems, and is expected to become the second-highest among all general health problems by 2020. Postnatal depression (PND) is an important category of depression in its own right. There is now considerable evidence that PND has a substantial impact on the mother and her partner, the family, mother–baby interactions and the longer-term emotional and cognitive development of the baby, especially when depression occurs in the first year of life. Unfortunately, fewer than 50% of cases of PND are identified by primary health-care professionals in routine clinical practice. PND screening and case identification strategies have been advocated as a remedy to this problem, but their use has attracted substantial controversy.


  1. To provide an overview of all available methods to identify PND and to assess their validity (in terms of key psychometric properties).
  2. To assess the acceptability of methods to identify PND.
  3. To assess the clinical effectiveness of methods to identify PND in improving maternal and infant outcomes.
  4. To assess the cost-effectiveness of methods to identify PND in improving maternal and infant outcomes.
  5. To identify research priorities and the value of further research into methods to identify PND from the perspective of the UK NHS.
  6. To assess whether methods to identify PND meet minimum criteria outlined by the National Screening Committee (NSC) in the light of this evidence synthesis.


A large search was undertaken across all phases of the review, which involved searching 20 electronic databases (including MEDLINE, CINAHL, PsycINFO, EMBASE, CENTRAL, DARE and CDSR), forward citation searching of key literature, personal communication with authors and scrutinising reference lists. A variety of review methods were utilised across the four systematic reviews. A generalised linear mixed model approach to the bivariate meta-analysis was undertaken for the validation review with quality assessment using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool. Within the acceptability review, a textual narrative approach was employed to synthesise qualitative and quantitative research evidence. For the clinical and cost-effectiveness reviews, methods outlined by the Centre for Reviews and Dissemination and the Cochrane Collaboration were followed. Probabilistic models were developed to estimate the costs associated with different identification strategies. Scenario-based sensitivity analyses were also performed.


Numerous generic and PND-specific measures were identified that may be used to identify possible cases of PND. A total of 14 identification strategies were found to have been validated among women during pregnancy or the postnatal period. The PND-specific measures were the Edinburgh Postnatal Depression Scale (EPDS), Postpartum Depression Screening Scale, Pregnancy Risk Questionnaire, and Predictive Index. The generic depression identification strategies were the Beck Depression Inventory (BDI), General Health Questionnaire (GHQ), Hospital Anxiety and Depression Scale, Hopkins Symptom Checklist, Hamilton Rating Scale for Depression (HAMD), Zung's Self-rating Depression Scale, Symptom Checklist-90-R, Raskin, and Montgomery–Asberg Depression Rating Scale. One study used both the EPDS and GHQ. By far the most frequently used identification strategy across all of the reviews was the EPDS. In terms of test performance, postnatally the EPDS performed reasonably well: sensitivity ranged from 0.60 (specificity 0.97) to 0.96 (specificity 0.45) for major depression only; from 0.38 (specificity 0.99) to 0.86 (specificity 0.87) for any psychiatric disorder; and from 0.31 (specificity 0.99) to 0.91 (specificity 0.67) for major or minor depression. In addition, for major or minor depression there were sufficient data to pool the BDI and HAMD data at a single cut point. Results from this analysis highlighted that generic identification strategies may be less sensitive than the EPDS, but more specific.
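Sensitivity and specificity on their own do not show how many positive results would be true cases, because that also depends on prevalence. As a rough illustration, the Python sketch below takes the two extremes of the EPDS range reported above for major depression only and converts them into predictive values. The 10% prevalence is a hypothetical value chosen purely for illustration, not a figure from the review, and the function names are likewise illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


def youden_index(sensitivity, specificity):
    """Single-number summary of the sensitivity/specificity trade-off."""
    return sensitivity + specificity - 1


# Extremes of the EPDS range reported in the text (major depression only).
EPDS_EXTREMES = {
    "high-specificity end": (0.60, 0.97),
    "high-sensitivity end": (0.96, 0.45),
}

# ASSUMPTION: hypothetical prevalence, for illustration only.
ASSUMED_PREVALENCE = 0.10

for label, (sens, spec) in EPDS_EXTREMES.items():
    ppv, npv = predictive_values(sens, spec, ASSUMED_PREVALENCE)
    print(f"{label}: PPV={ppv:.2f}, NPV={npv:.2f}, "
          f"Youden={youden_index(sens, spec):.2f}")
```

Under this assumed prevalence, the high-sensitivity end of the range yields far more false positives per true case than the high-specificity end, which is the trade-off that the choice of cut point has to balance.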

For the acceptability review, studies indicated that women and health professionals both felt that it was beneficial to inform women in advance that they would be asked to complete a questionnaire to identify PND and that the questionnaire should be administered in the woman's home. In general, when the instrument was administered, women preferred to talk rather than complete a standardised questionnaire and were critical of the lack of dialogue that could result from a paper and pencil assessment. Both women and health professionals found that the last question on the EPDS, about the thought of self-harm, caused difficulties. In addition, English women and health professionals also found difficulties with the question about sleeping. The interpersonal relationship between the mother and health professional was also identified as important; this relationship was strengthened after a number of meetings and when health professionals were given adequate training in identifying PND. In summary, in the majority of studies, the EPDS was acceptable to women and health-care professionals when women were forewarned of the process, when the EPDS was administered in the home, when those administering it were adequately trained, when the health visitor had empathetic skills and when due consideration was given to positive responses to question 10 about self-harm.

Within the clinical effectiveness review, five studies were identified that compared using either the EPDS (with or without enhancement of care) or feedback of the EPDS scores with not using the EPDS or usual care. All of the studies indicated beneficial effects of using the EPDS in reducing EPDS scores, although some of the individual studies did not show statistically significant differences. Studies reporting dichotomous outcomes (the number of women scoring above or below a cut point on the EPDS) were combined and the pooled estimate gave an odds ratio of 0.64 (95% confidence interval 0.52 to 0.78). It was difficult to disentangle the effects of using an identification strategy from the effects of the enhancement of care and/or any subsequent intervention given.
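For readers who want to reuse the pooled estimate, the standard error of the log odds ratio can be approximated from the reported confidence interval. The Python sketch below assumes the interval is a conventional 95% Wald interval that is symmetric on the log scale; the summary does not state how it was computed, so this is an assumption, and the function name is illustrative.

```python
import math


def log_or_standard_error(ci_lower, ci_upper, z_crit=1.96):
    """Approximate SE of log(OR) from a CI assumed symmetric on the log scale."""
    return (math.log(ci_upper) - math.log(ci_lower)) / (2 * z_crit)


# Pooled estimate and 95% confidence interval reported in the text.
odds_ratio, ci_lower, ci_upper = 0.64, 0.52, 0.78

se_log_or = log_or_standard_error(ci_lower, ci_upper)
z_statistic = math.log(odds_ratio) / se_log_or

# Consistency check: under the symmetry assumption, the geometric mean of
# the interval limits should sit close to the point estimate.
implied_point_estimate = math.sqrt(ci_lower * ci_upper)

print(f"SE of log(OR): {se_log_or:.3f}")
print(f"z statistic: {z_statistic:.2f}")
print(f"Point estimate implied by the interval: {implied_point_estimate:.2f}")
```

The geometric mean of the interval limits lies close to the reported odds ratio, which suggests that the log-scale symmetry assumption is reasonable for this estimate.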

With regard to the cost-effectiveness of methods to identify PND, despite an extensive systematic search of the literature, none of the studies identified presented full economic evaluations of PND identification strategies; hence a decision-analytic model was developed. The results of the base-case analysis suggested that the use of formal identification strategies did not appear to represent value for money based on conventional thresholds of cost-effectiveness used in the NHS. However, the scenarios considered demonstrated that this conclusion was primarily driven by the costs of false positives assumed in the base-case model. Alternative assumptions employed in separate scenarios resulted in more favourable estimates of cost-effectiveness, such that use of the EPDS to identify women with PND, considered in some of these scenarios, fell within these conventional thresholds. For example, when the cost of a false-positive diagnosis was assumed to be a single GP attendance, the EPDS using a cut point of 10 or higher emerged as the optimal strategy in terms of cost-effectiveness. Interestingly, this corresponded closely with the results presented in the validation review, in which the trade-off between sensitivity and specificity was considered. A definitive answer to the question of whether formal identification strategies are cost-effective, and, if they are, which individual strategy is optimal in cost-effectiveness terms, clearly requires more reliable evidence in relation to the costs of managing false positives.
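The sensitivity of the conclusion to the assumed cost of a false positive can be illustrated with a deliberately simplified calculation. The Python sketch below is not the report's decision-analytic model: every number in it (test accuracy, prevalence, costs, QALY gain and the willingness-to-pay threshold) is a hypothetical placeholder, and the function name is illustrative. It only shows how incremental net monetary benefit per woman assessed, relative to no formal identification, can change sign as the false-positive cost changes.

```python
def incremental_net_benefit(sensitivity, specificity, prevalence,
                            cost_test, cost_true_positive, cost_false_positive,
                            qaly_gain_true_positive, threshold):
    """Net monetary benefit per woman assessed, versus no formal identification."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    cost = (cost_test
            + true_pos * cost_true_positive
            + false_pos * cost_false_positive)
    qalys = true_pos * qaly_gain_true_positive
    return threshold * qalys - cost


# ASSUMPTION: all values below are hypothetical and chosen for illustration.
BASE = dict(
    sensitivity=0.85,
    specificity=0.85,
    prevalence=0.10,
    cost_test=5.0,                  # cost of administering the questionnaire
    cost_true_positive=300.0,       # cost of managing a correctly identified case
    qaly_gain_true_positive=0.03,   # QALYs gained per case identified and treated
    threshold=20_000.0,             # willingness to pay per QALY
)

for fp_cost in (40.0, 300.0):  # e.g. one short consultation vs. a full care pathway
    inb = incremental_net_benefit(cost_false_positive=fp_cost, **BASE)
    print(f"false-positive cost {fp_cost:.0f}: incremental net benefit {inb:.2f}")
```

With these placeholder inputs the strategy looks worthwhile at the lower false-positive cost and not at the higher one, mirroring qualitatively the pattern described in the scenario analyses.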

Clinical guidance on the management of antenatal and postnatal mental health care was issued by the National Institute for Health and Clinical Excellence (NICE) in October 2007. NICE recommended the use of the Whooley questions:

  1. 'During the past month, have you often been bothered by feeling down, depressed or hopeless?'
  2. 'During the past month, have you often been bothered by little interest or pleasure in doing things?'

A third help question should be considered if the woman answers 'yes' to either of the initial questions:

  1. 'Is this something you feel you need or want help with?'

No evidence was identified across the four systematic reviews for these three questions in a postnatal population in terms of validity, acceptability and clinical and cost-effectiveness.
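The branching between the two Whooley questions and the help question can be written as a small rule. The Python sketch below encodes only the logic stated above (the help question is considered if the woman answers 'yes' to either initial question); the function name and the returned labels are hypothetical and are not part of the NICE guidance.

```python
def whooley_next_step(felt_down: bool, little_interest: bool) -> str:
    """Decide whether the help question should be considered.

    felt_down: answer to 'bothered by feeling down, depressed or hopeless'.
    little_interest: answer to 'bothered by little interest or pleasure'.
    """
    if felt_down or little_interest:
        return "consider help question"
    return "no further question"


print(whooley_next_step(felt_down=True, little_interest=False))
print(whooley_next_step(felt_down=False, little_interest=False))
```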


In light of the results of our evidence synthesis and decision modelling we revisited the examination of PND screening against five of the NSC criteria. We found that the accepted criteria for a PND screening programme were not currently met. The evidence suggested that there is a simple, safe, precise and validated identification strategy, that in principle a suitable cut-off level could be defined and that the strategy is acceptable to the population. Evidence surrounding the clinical effectiveness and cost-effectiveness of methods to identify PND is lacking.

Implications for research

The results from the systematic reviews, the probabilistic decision model and the value of information analysis indicated that further research should aim to identify the:

  • Optimal identification strategy, in terms of key psychometric properties, for postnatal populations. Further research comparing the performance of the Whooley and help questions, the EPDS and a generic depression measure would be informative.
  • Acceptability of the identification strategies outlined above, with particular emphasis on collating acceptability data by whether women were correctly classified (i.e. true positives or true negatives) or not (i.e. false positives or false negatives).
  • Natural history of PND over time in populations in which formal methods to identify PND have been used and in populations in which formal methods of identification have not been used.
  • Costs associated with false positives.
  • Impact of PND on health-related quality of life.
  • Epidemiological data regarding prevalence rates of PND.
  • Clinical effectiveness of the most valid and acceptable method to identify PND. This could be achieved by carrying out further research within a randomised controlled trial.
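The idea behind a value of information analysis, which underlies the priorities listed above, can be sketched with a Monte Carlo estimate of the expected value of perfect information (EVPI): the difference between the expected net benefit of choosing the best strategy once uncertainty is resolved and that of choosing the best strategy on current information. The Python sketch below is not the report's model. The net benefit formula and the uncertainty range for the false-positive cost are hypothetical placeholders, and the function name is illustrative.

```python
import random


def evpi(net_benefit_samples):
    """EVPI from simulated net benefits.

    net_benefit_samples: list of tuples, one tuple per simulation,
    holding one net benefit value per strategy.
    """
    n_sims = len(net_benefit_samples)
    n_strategies = len(net_benefit_samples[0])
    mean_by_strategy = [
        sum(sample[d] for sample in net_benefit_samples) / n_sims
        for d in range(n_strategies)
    ]
    value_current_information = max(mean_by_strategy)
    value_perfect_information = (
        sum(max(sample) for sample in net_benefit_samples) / n_sims
    )
    return value_perfect_information - value_current_information


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = []
for _ in range(10_000):
    # ASSUMPTION: hypothetical uncertainty about the cost of a false positive.
    cost_false_positive = rng.uniform(40.0, 300.0)
    nb_no_identification = 0.0
    # ASSUMPTION: hypothetical net benefit of a formal identification strategy.
    nb_identification = 20.0 - 0.135 * cost_false_positive
    samples.append((nb_no_identification, nb_identification))

print(f"EVPI per woman assessed (hypothetical inputs): {evpi(samples):.2f}")
```

A positive EVPI indicates that resolving the uncertainty could change the preferred strategy in some simulations, which is the sense in which further research on a parameter such as the cost of false positives has value.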


  • Hewitt CE, Gilbody SM, Brealey S, Paulden M, Palmer S, Mann R, et al. Methods to identify postnatal depression in primary care: an integrated evidence synthesis and value of information analysis. Health Technol Assess 2009;13(36). [PubMed: 19624978]

NIHR Health Technology Assessment programme

The Health Technology Assessment (HTA) programme, part of the National Institute for Health Research (NIHR), was set up in 1993. It produces high-quality research information on the effectiveness, costs and broader impact of health technologies for those who use, manage and provide care in the NHS. 'Health technologies' are broadly defined as all interventions used to promote health, prevent and treat disease, and improve rehabilitation and long-term care.

The research findings from the HTA programme directly influence decision-making bodies such as the National Institute for Health and Clinical Excellence (NICE) and the National Screening Committee (NSC). HTA findings also help to improve the quality of clinical practice in the NHS indirectly in that they form a key component of the 'National Knowledge Service'.

The HTA programme is needs led in that it fills gaps in the evidence needed by the NHS. There are three routes to the start of projects.

First is the commissioned route. Suggestions for research are actively sought from people working in the NHS, from the public and consumer groups and from professional bodies such as royal colleges and NHS trusts. These suggestions are carefully prioritised by panels of independent experts (including NHS service users). The HTA programme then commissions the research by competitive tender.

Second, the HTA programme provides grants for clinical trials for researchers who identify research questions. These are assessed for importance to patients and the NHS, and scientific rigour.

Third, through its Technology Assessment Report (TAR) call-off contract, the HTA programme commissions bespoke reports, principally for NICE, but also for other policy-makers. TARs bring together evidence on the value of specific technologies.

Some HTA research projects, including TARs, may take only months; others need several years. They can cost from as little as £40,000 to over £1 million, and may involve synthesising existing evidence, undertaking a trial, or other research collecting new data to answer a research problem.

The final reports from HTA projects are peer reviewed by a number of independent expert referees before publication in the widely read journal series Health Technology Assessment.

Criteria for inclusion in the HTA journal series

Reports are published in the HTA journal series if (1) they have resulted from work for the HTA programme, and (2) they are of a sufficiently high scientific quality as assessed by the referees and editors.

Reviews in Health Technology Assessment are termed 'systematic' when the account of the search, appraisal and synthesis methods (to minimise biases and random errors) would, in theory, permit the replication of the review by others.

The research reported in this issue of the journal was commissioned by the HTA programme as project number 05/39/06. The contractual start date was in October 2006. The draft report began editorial review in May 2008 and was accepted for publication in January 2009. As the funder, by devising a commissioning brief, the HTA programme specified the research question and study design. The authors have been wholly responsible for all data collection, analysis and interpretation, and for writing up their work. The HTA editors and publisher have tried to ensure the accuracy of the authors’ report and would like to thank the referees for their constructive comments on the draft document. However, they do not accept liability for damages or losses arising from material published in this report.

The views expressed in this publication are those of the authors and not necessarily those of the HTA programme or the Department of Health.

Editor-in-Chief: Professor Tom Walley CBE

Series Editors: Dr Aileen Clarke, Dr Chris Hyde, Dr John Powell, Dr Rob Riemsma and Professor Ken Stein