

NIHR Health Technology Assessment programme: Executive Summaries. Southampton (UK): NIHR Journals Library; 2003-.


Systematic review and individual patient data meta-analysis of diagnosis of heart failure, with modelling of implications of different diagnostic strategies in primary care

J Mant, J Doust, A Roalfe, P Barton, MR Cowie, P Glasziou, D Mant, RJ McManus, R Holder, J Deeks, K Fletcher, M Qume, S Sohanpal, S Sanders, and FDR Hobbs.


Published: 2009.


Background

Heart failure is a syndrome resulting from a structural or functional cardiac disorder. A diagnosis of heart failure requires symptoms or signs, such as breathlessness, effort intolerance or fluid retention, together with objective evidence of cardiac dysfunction. Heart failure is associated with significant morbidity, mortality and health-care expenditure, but there is a good evidence base for interventions that improve prognosis. Nevertheless, diagnosis of heart failure in primary care is often inaccurate. Current National Institute for Health and Clinical Excellence (NICE) recommendations are that patients in whom heart failure is suspected should undergo an electrocardiogram (ECG) and/or a B-type natriuretic peptide (BNP) test, where available, and that, if either of these is positive, they should be referred for echocardiography as part of their diagnostic work-up. The purpose of this work is to determine the potential value of clinical features in the diagnostic assessment, and the relative value of the different diagnostic tests available in primary care, with the aim of producing clear recommendations on the optimal approach to diagnosing heart failure in primary care in the UK.


Objectives

  1. To perform a systematic review to assess the accuracy in diagnosing heart failure of:
    1. clinical features – both singly and, if possible, in combination
    2. potential primary care investigations – plasma natriuretic peptides, ECG and chest X-ray (CXR) (singly and, if possible, in combination).
  2. To perform an individual patient data (IPD) analysis to address the following questions:
    1. Can a clinical scoring system based on symptoms and signs usefully predict the presence of heart failure?
    2. To rule out heart failure in primary care, what is the optimum decision cut-off point for plasma natriuretic peptides (BNP)?
    3. Does the diagnostic performance of plasma natriuretic peptides vary according to patient characteristics?
    4. How accurate is the combination of plasma natriuretic peptides with ECG at diagnosing heart failure?
  3. To perform a decision analysis to test the impact of plausible diagnostic strategies for the diagnosis of heart failure in primary care on costs and diagnostic yield in the UK health-care setting.


Methods

Systematic review

Data sources

Primary studies were identified by searching MEDLINE and CINAHL, with supplementary checks of reference lists of all studies that met the inclusion criteria and any review articles. ‘Grey literature’ databases and conference proceedings were searched, and authors of relevant studies were contacted for data that could not be extracted from the published papers.

Study selection

Studies were included if they estimated the diagnostic accuracy of symptoms, signs or investigations for detecting heart failure. There needed to be an adequate reference standard [e.g. use of European Society of Cardiology (ESC) criteria for diagnosis of heart failure]. Studies in which the reference standard was echocardiographic assessment of left ventricular systolic dysfunction (LVSD) alone were reviewed but were not included in the meta-analysis.

Data extraction

Potentially relevant studies were assessed by two reviewers against the inclusion criteria, with a third reviewer arbitrating when necessary. Data were extracted by both reviewers and quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) criteria.

Data synthesis

Sensitivity and specificity were plotted on receiver operating characteristic (ROC) graphs. The data were pooled using a bivariate random-effects meta-analysis and summary estimates of test accuracy calculated. To explore the impact of setting and prevalence, predictive values were plotted against heart failure prevalence.
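The dependence of predictive values on prevalence can be illustrated with a short calculation. The following is a minimal Python sketch that applies Bayes' theorem to a fixed sensitivity and specificity; the function name and the input values are illustrative assumptions, not figures or code from the report.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative inputs only (assumed, not taken from the report).
for prevalence in (0.10, 0.30, 0.50):
    ppv, npv = predictive_values(0.90, 0.60, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}  NPV={npv:.2f}")
```

With sensitivity and specificity held fixed, the positive predictive value rises and the negative predictive value falls as prevalence increases, which is why the same test can perform differently across settings.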

Individual patient data analysis

Inclusion criteria for the IPD required the study to be set in primary care and to have a minimum of 100 recently symptomatic patients. A total of 11 studies were identified, and data were obtained from nine of these. A logistic regression model to predict heart failure was developed on one of the data sets. This was then validated on the other data sets that had the required variables. Validation included calculation of the area under the ROC curve (AUC) and use of goodness-of-fit calibration plots. The resultant model was then simplified into a decision rule that would be usable in clinical practice. The impact of potential effect modifiers (e.g. use of drugs, co-morbidity) was examined by their inclusion as interactions with BNP [and N-terminal pro-B-type natriuretic peptide (NT-proBNP)] adjusted for clinical score.
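As an illustration of the validation step, the AUC of a prediction model can be computed directly from predicted scores and observed outcomes. The sketch below uses the rank-based (Mann-Whitney) definition of the AUC; the function name and the example data are hypothetical and are not drawn from the study data sets.

```python
def auc(scores, labels):
    """Area under the ROC curve.

    Equals the probability that a randomly chosen case (label 1) has a
    higher score than a randomly chosen non-case (label 0); ties count half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


# Hypothetical predicted probabilities and observed diagnoses.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0]
print(f"AUC = {auc(example_scores, example_labels):.3f}")
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination. Calibration, which the report assessed separately with goodness-of-fit plots, is not captured by this statistic.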

Cost-effectiveness analysis

The cost-effectiveness modelling was based on a decision tree that compared different plausible investigation strategies. The outputs of the model were in terms of investigation costs and cases detected, from which an incremental cost-effectiveness ratio (ICER) was calculated comprising the cost per additional case detected. The amount of money that it would be worth spending to diagnose an extra case of heart failure was calculated in two ways. First, only the costs to the NHS were taken into account (including extra admissions through delayed diagnosis). Second, patient benefit in terms of improved quality-adjusted life-years (QALYs) was also taken into account, based on estimates of improved survival as a result of earlier diagnosis leading to earlier initiation of treatments with proven effects on survival. The robustness of the results of the model was tested by sensitivity analyses that varied the costs of the investigations and the time horizon over which the benefits accrued.
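The ICER described above is the difference in cost between two strategies divided by the difference in cases detected. A minimal Python sketch follows; the function name and all cost and yield figures are hypothetical placeholders, not outputs of the report's decision tree.

```python
def icer(cost_new, cases_new, cost_ref, cases_ref):
    """Incremental cost per additional case detected (new vs reference)."""
    extra_cases = cases_new - cases_ref
    if extra_cases == 0:
        raise ValueError("strategies detect the same number of cases")
    return (cost_new - cost_ref) / extra_cases


# Hypothetical figures per 1000 patients investigated (assumed values).
reference = {"cost": 50_000, "cases": 200}
alternative = {"cost": 80_000, "cases": 250}

ratio = icer(alternative["cost"], alternative["cases"],
             reference["cost"], reference["cases"])
print(f"Cost per additional case detected: {ratio:.0f}")
```

Under this framing, a strategy would be judged worthwhile when its ICER falls below the amount it is considered worth spending to diagnose one extra case.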


Results

Systematic review

Dyspnoea was the only symptom or sign with high sensitivity (89%), but it had poor specificity (51%). Several clinical features had relatively high specificity, including history of myocardial infarction (89%), orthopnoea (89%), oedema (72%), elevated jugular venous pressure (JVP) (70%), cardiomegaly (85%), added heart sounds (99%), lung crepitations (81%) and hepatomegaly (97%). However, the sensitivity of all of these features was low, ranging from 11% (added heart sounds) to 53% (oedema). ECG, BNP and NT-proBNP all had high sensitivities (89%, 93% and 93% respectively). CXR was moderately specific (76–83%) but insensitive (67–68%). BNP was more accurate than ECG, with a relative diagnostic odds ratio of ECG/BNP of 0.32 (95% CI 0.12–0.87). There was no difference between the diagnostic accuracy of BNP and NT-proBNP.
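For readers unfamiliar with the diagnostic odds ratio (DOR), the sketch below shows how it relates to sensitivity and specificity, using the dyspnoea figures quoted above as input. It is a simple illustration in Python with assumed function names. A DOR derived from summary estimates in this way is not the same as the pooled estimate produced by the bivariate model, so it should not be expected to reproduce the report's results.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def diagnostic_odds_ratio(sensitivity, specificity):
    """Odds of a positive test in cases divided by the odds in non-cases."""
    lr_pos, lr_neg = likelihood_ratios(sensitivity, specificity)
    return lr_pos / lr_neg


def relative_dor(dor_a, dor_b):
    """Ratio of two DORs; values below 1 favour test B."""
    return dor_a / dor_b


# Dyspnoea: sensitivity 89%, specificity 51% (figures quoted in the text).
dyspnoea_dor = diagnostic_odds_ratio(0.89, 0.51)
print(f"DOR for dyspnoea: {dyspnoea_dor:.2f}")
```

A relative DOR of ECG/BNP below 1, as reported above, indicates that BNP discriminates better than ECG.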

Individual patient data analysis

A model based upon simple clinical features (male gender, history of myocardial infarction, basal crepitations, oedema; ‘MICE’) and BNP derived from one data set was found to have good validity when applied to other data sets, with an AUC between 0.84 and 0.96 and reasonable calibration. A model substituting ECG for BNP was less predictive. From this a simple clinical rule was developed and is proposed by the authors:

  • In a patient presenting with symptoms such as breathlessness in whom heart failure is suspected, refer directly to echocardiography if the patient has any one of:
    • history of myocardial infarction
    • basal crepitations
    • male gender with ankle oedema.
  • Otherwise, carry out a BNP test and refer for echocardiography depending on the results of the test:
    • female without ankle oedema – refer if BNP > 210–360 pg/ml depending upon local availability of echocardiography (or NT-proBNP > 620–1060 pg/ml)
    • male without ankle oedema – refer if BNP > 130–220 pg/ml (or NT-proBNP > 390–660 pg/ml)
    • female with ankle oedema – refer if BNP > 100–180 pg/ml (or NT-proBNP > 190–520 pg/ml).
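The rule can be expressed as a small function. The following Python sketch is illustrative only: the direct-referral criteria are taken from the 'Implications for health care' section of this summary (history of myocardial infarction, basal crepitations, or male with ankle oedema), the class, function and parameter names are hypothetical, and the mapping of the `limited_echo_capacity` flag to the upper end of each threshold range is an assumption about how 'local availability of echocardiography' would be applied.

```python
from dataclasses import dataclass

REFER = "refer for echocardiography"
TEST = "measure BNP"
NO_REFERRAL = "no referral indicated by this rule"

# BNP referral thresholds in pg/ml, (lower, upper) as quoted in the summary.
# Keys are (is_male, has_ankle_oedema); males with oedema are referred directly.
BNP_THRESHOLDS = {
    (False, False): (210, 360),
    (True, False): (130, 220),
    (False, True): (100, 180),
}


@dataclass
class Patient:
    male: bool
    prior_mi: bool
    basal_crepitations: bool
    ankle_oedema: bool


def refer_directly(patient):
    """Direct-referral criteria as listed in the summary."""
    return (patient.prior_mi
            or patient.basal_crepitations
            or (patient.male and patient.ankle_oedema))


def recommend(patient, bnp_pg_ml=None, limited_echo_capacity=False):
    """Apply the rule; the upper threshold is used when capacity is limited
    (an assumption made for this sketch)."""
    if refer_directly(patient):
        return REFER
    if bnp_pg_ml is None:
        return TEST
    lower, upper = BNP_THRESHOLDS[(patient.male, patient.ankle_oedema)]
    threshold = upper if limited_echo_capacity else lower
    return REFER if bnp_pg_ml > threshold else NO_REFERRAL


# Hypothetical patient: woman without ankle oedema, BNP 250 pg/ml.
example = Patient(male=False, prior_mi=False,
                  basal_crepitations=False, ankle_oedema=False)
print(recommend(example, bnp_pg_ml=250))
print(recommend(example, bnp_pg_ml=250, limited_echo_capacity=True))
```

The same structure would apply to NT-proBNP by substituting the corresponding threshold ranges.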

Cost-effectiveness analysis

On the basis of the cost-effectiveness analysis carried out, such a decision rule is likely to be considered cost-effective to the NHS in terms of cost per additional case detected. The cost-effectiveness analysis further suggested that, if likely patient benefit in terms of improved life expectancy is taken into account, the optimum strategy would be to refer all patients with symptoms suggestive of heart failure directly for echocardiography.


Conclusions

The analysis that we have performed points to the need for important changes to the NICE recommendations. First, BNP (or NT-proBNP) should be recommended over ECG and, second, some patients should be referred straight for echocardiography without undergoing any preliminary investigation.

Implications for health care

  • If there is sufficient local capacity, the evidence synthesised here suggests that the optimal diagnostic strategy for many patients with symptoms indicating possible heart failure would be direct referral for echocardiography.
  • In the presence of a limited supply of echocardiography the authors suggest the following:
    • patients with symptoms suggestive of heart failure should be referred directly for echocardiography only if they have a history of myocardial infarction or if they have basal crepitations on examination or if they are male and have ankle oedema
    • otherwise, they should have a BNP (or NT-proBNP) test performed and the decision to refer for echocardiography should depend upon the BNP (or NT-proBNP) result, interpreted in the light of their gender and the presence or absence of ankle oedema.
  • There is no need to perform an ECG as part of the assessment of whether or not heart failure is present (although it is recognised that there may be other indications for performing an ECG).

Recommendations for research

  1. Evaluation of the usability of the clinical rule described above in clinical practice.
  2. Evaluation of the diagnostic value of repeated BNP (or NT-proBNP) measurements for the diagnosis of heart failure.
  3. Evaluation of the diagnostic accuracy of automated ECG readings in the diagnosis of heart failure compared with ECG reading by a specialist.
  4. Further development of methods to conduct IPD meta-analysis for diagnostic tests.


  • Mant J, Doust J, Roalfe A, Barton P, Cowie MR, Glasziou P, et al. Systematic review and individual patient data meta-analysis of diagnosis of heart failure, with modelling of implications of different diagnostic strategies in primary care. Health Technol Assess 2009;13(32). [PubMed: 19586584]

NIHR Health Technology Assessment programme

The Health Technology Assessment (HTA) programme, part of the National Institute for Health Research (NIHR), was set up in 1993. It produces high-quality research information on the effectiveness, costs and broader impact of health technologies for those who use, manage and provide care in the NHS. 'Health technologies' are broadly defined as all interventions used to promote health, prevent and treat disease, and improve rehabilitation and long-term care.

The research findings from the HTA programme directly influence decision-making bodies such as the National Institute for Health and Clinical Excellence (NICE) and the National Screening Committee (NSC). HTA findings also help to improve the quality of clinical practice in the NHS indirectly in that they form a key component of the 'National Knowledge Service'.

The HTA programme is needs led in that it fills gaps in the evidence needed by the NHS. There are three routes to the start of projects.

First is the commissioned route. Suggestions for research are actively sought from people working in the NHS, from the public and consumer groups and from professional bodies such as royal colleges and NHS trusts. These suggestions are carefully prioritised by panels of independent experts (including NHS service users). The HTA programme then commissions the research by competitive tender.

Second, the HTA programme provides grants for clinical trials for researchers who identify research questions. These are assessed for importance to patients and the NHS, and scientific rigour.

Third, through its Technology Assessment Report (TAR) call-off contract, the HTA programme commissions bespoke reports, principally for NICE, but also for other policy-makers. TARs bring together evidence on the value of specific technologies.

Some HTA research projects, including TARs, may take only months; others need several years. They can cost from as little as £40,000 to over £1 million, and may involve synthesising existing evidence, undertaking a trial, or other research collecting new data to answer a research problem.

The final reports from HTA projects are peer reviewed by a number of independent expert referees before publication in the widely read journal series Health Technology Assessment.

Criteria for inclusion in the HTA journal series

Reports are published in the HTA journal series if (1) they have resulted from work for the HTA programme, and (2) they are of a sufficiently high scientific quality as assessed by the referees and editors.

Reviews in Health Technology Assessment are termed 'systematic' when the account of the search, appraisal and synthesis methods (to minimise biases and random errors) would, in theory, permit the replication of the review by others.

The research reported in this issue of the journal was commissioned by the HTA programme as project number 05/06/01. The contractual start date was in February 2006. The draft report began editorial review in October 2007 and was accepted for publication in January 2009. As the funder, by devising a commissioning brief, the HTA programme specified the research question and study design. The authors have been wholly responsible for all data collection, analysis and interpretation, and for writing up their work. The HTA editors and publisher have tried to ensure the accuracy of the authors’ report and would like to thank the referees for their constructive comments on the draft document. However, they do not accept liability for damages or losses arising from material published in this report.

The views expressed in this publication are those of the authors and not necessarily those of the HTA programme or the Department of Health.

Editor-in-Chief: Professor Tom Walley CBE

Series Editors: Dr Aileen Clarke, Dr Chris Hyde, Dr John Powell, Dr Rob Riemsma and Professor Ken Stein

© 2009 Crown Copyright.

Included under terms of UK Non-commercial Government License.

PMID: 19586584

