Results

Cynthia Balion; Andrew Don-Wauchope; Stephen Hill; P Lina Santaguida; Ronald Booth; Judy A Brown; Mark Oremus; Usman Ali; Amy Bustamam; Nazmul Sohel; Robert McKelvie; Parminder Raina

NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

Balion C, Don-Wauchope A, Hill S, et al. Use of Natriuretic Peptide Measurement in the Management of Heart Failure [Internet]. Rockville (MD): Agency for Healthcare Research and Quality (US); 2013 Nov. (Comparative Effectiveness Reviews, No. 126.)

This publication is provided for historical reference only and the information may be out of date.

Results

The search yielded 25,864 records identified from six bibliographic databases (Figure 2). An additional 35 records were identified from three grey literature sources: regulatory agency Web sites, clinical trial databases, and conference sources. After duplicates were removed, a total of 16,893 records were screened at title and abstract level; a total of 3,616 citations moved on to be screened at full text. Following the application of full text screening criteria, there were 310 eligible papers for all research questions in this review. See Appendix G for list of all excluded articles.

Figure 2

Flow diagram showing the numbers of articles processed at each level. * Six articles dealt with two KQ groups: three dealt with both diagnosis and prognosis and three dealt with both prognosis and treatment. ** 22 publications in KQ4 were selected from KQ3.

A total of 104 papers were allocated for diagnostic accuracy, and from these 76 articles were evaluated for Key Question (KQ) 1, and 28 for KQ2. For KQ3, KQ4, and KQ5, 190 articles were eligible to address the research questions related to prognosis; from these 183 were eligible for KQ3, 22 for KQ4, and seven publications for KQ5. A total of nine articles were evaluated for treatment guided by BNP or NT-proBNP for KQ6, and seven articles for KQ7 focusing on biological variation.

Key Question 1. In patients presenting to the emergency department or urgent care facilities with signs or symptoms suggestive of heart failure (HF):

What is the test performance of BNP and NT-proBNP for HF?
What are the optimal decision cutpoints for BNP and NT-proBNP to diagnose and exclude HF?
What determinants affect the test performance of BNP and NT-proBNP (e.g., age, gender, comorbidity)?

Sample and Design Characteristics of Papers Assessing BNP

There were 51 publications that met the criteria for KQ1 and examined cutpoints for BNP. Thirty-seven examined BNP only³^,⁷²^-¹⁰⁷ and 14 examined both BNP and NT-proBNP.¹⁰⁸^-¹²¹ See Appendix H KQ1 Evidence Set.

Study Design

Prospective study designs included two randomized controlled trials (RCTs)⁹⁷^,¹⁰² and nine cohort studies.⁹²^,⁹⁸^,¹⁰⁶^,¹¹⁶^-¹²¹ The remaining papers (n=40) used a cross-sectional design. The selected articles were published between 2001 and 2011 and were conducted in a wide range of regions: nine in North America,⁷²^,⁸²^,⁸³^,⁹⁰^,¹⁰¹^,¹⁰⁵^,¹⁰⁷^,¹¹⁷^,¹²⁰ twenty-two in Europe,⁷⁴^,⁷⁹^,⁸⁵^,⁸⁷^,⁸⁸^,⁹⁴^,⁹⁶^,¹⁰⁰^,¹⁰²^-¹⁰⁴^,¹⁰⁶^,¹⁰⁹^-¹¹⁵^,¹¹⁸^,¹¹⁹^,¹²¹ two in Asia,⁸⁶^,⁹⁵ one in South America,⁷⁸ two in Australia,⁸⁹^,⁹⁷ and one in New Zealand.¹⁰⁸ Thirteen papers were conducted at multinational sites,³^,⁷³^,⁷⁵^-⁷⁷^,⁸⁰^,⁸¹^,⁸⁴^,⁹¹^-⁹³^,⁹⁸^,⁹⁹ and one did not clearly report the region of conduct.¹¹⁶

Population Characteristics

Most articles, with the exception of ten,⁷⁴^,⁸⁴^,⁸⁷^,⁹⁰^,⁹¹^,⁹⁶^,¹⁰⁹^,¹¹⁵^,¹¹⁹^,¹²⁰ provided diagnostic information on the overall study sample. Some papers provided diagnostic information on populations grouped according to age,⁷³^,⁷⁴^,⁸⁵^,⁸⁹^,¹⁰¹^,¹¹¹^,¹¹³^,¹¹⁹ sex,⁷³^,⁷⁴ and ethnicity.⁷³

Some papers presented diagnostic information according to body mass index (BMI) status,⁹¹^,¹⁰¹^,¹⁰² diabetes status,⁸⁴ previous history of heart failure (HF),⁷²^,⁸⁹^,⁹⁶ permanent/paroxysmal atrial fibrillation (AF),⁹² renal function/estimated glomerular filtration rate (eGFR),¹⁰¹^,¹⁰⁹^,¹¹³^,¹¹⁴^,¹²⁰ history of hypertension or blood pressure elevation on admission,⁹⁹ and left ventricular ejection fraction (LVEF).¹¹⁶ Three papers included information on HF populations.⁷⁶^,¹⁰⁰^,¹⁰²

In all papers, study patients presented to emergency departments with shortness of breath and were 18 years of age and older. Seventeen articles had a patient population with mean or median ages from 60 to 69 years old⁷²^,⁷³^,⁷⁵^-⁷⁷^,⁷⁹^-⁸¹^,⁸³^,⁸⁴^,⁹¹^,⁹³^,⁹⁹^,¹⁰⁴^,¹¹⁸^,¹²¹^,¹²² and 14⁷⁴^,⁷⁸^,⁸⁷^,⁸⁹^,⁹⁵^-⁹⁷^,¹⁰¹^,¹⁰²^,¹⁰⁶^-¹⁰⁹^,¹¹⁴ had populations with mean or median age ranges between 70 and 79. Four studies had a mean or median patient population over 80 years of age⁸⁵^,⁹⁴^,¹⁰⁵^,¹¹² and ten did not report on age of study population.³^,⁸²^,⁸⁶^,⁸⁸^,⁹⁰^,⁹⁸^,¹¹⁶^,¹¹⁷^,¹¹⁹^,¹²⁰ Six articles reported ages in the following ranges: 65 to 100,¹¹¹ 43 to 90,¹¹³ 67 to 82,¹²³ 58 to 82,¹¹⁰ 68 to 82,¹⁰⁰ and 30 to 95 years.¹⁰³

The percentage of males enrolled in each study ranged from 5.6 percent⁸⁴ to 100 percent⁷² (mean=66.2%; median=66.2%). Sample sizes (including subpopulations) ranged from 9⁸⁹ to 1,614³ (mean=404; median=251). The prevalence of HF in the study populations ranged from 8.3 percent¹⁰⁰ to 84 percent⁹⁶ (mean=45.1%; median=46.6%).

Component Articles

Of the 51 selected papers, 11 used data from the Breathing Not Properly Multinational Study,⁷³^,⁷⁵^-⁷⁷^,⁸⁰^,⁸¹^,⁸⁴^,⁹¹^-⁹³^,⁹⁹ three used data from the B-type Natriuretic Peptide for Acute Shortness of Breath Evaluation (BASEL) study,⁹⁶^,¹⁰²^,¹⁰⁶ one from the Biomarkers in Acute Heart Failure (BACH) study,³ and one from the BNP in Shortness of Breath study.⁹⁷ One article used data from the Heart Failure and Audicor technology for Rapid Diagnosis and Initial Treatment (HEARD-IT) study,⁹⁸ and one used data from the epidemiological study of acute dyspnea in elderly patients (EPIDASA).⁸⁵ One set of authors published two papers using the same data,¹¹⁴^,¹¹⁹ and the remaining articles (n=31) were independent papers reporting results from unique data sets.

Assay Tests

Seven articles used the Abbott AxSYM^® B-Type Natriuretic Peptide (BNP) Microparticle Enzyme Immunoassay (MEIA),⁹⁵^,⁹⁷^,⁹⁸^,¹⁰⁰^,¹⁰⁶^,¹¹⁰^,¹¹⁵ five used the TRIAGE-B-Type Natriuretic Peptide (BNP) test for the Beckman Coulter Immunoassay Systems,³^,¹⁰³^,¹¹⁶^,¹¹⁸^,¹²¹ two used the I-STAT BNP test,¹⁰¹^,¹⁰⁷ two used the ADVIA-Centaur^® BNP Assay, Bayer Diagnostics ACS:180^® BNP Assay,⁹⁸^,¹¹³ and two used the ADVIA-Centaur^® B-Type Natriuretic Peptide (BNP) Assay.⁸⁸^,⁹⁸ The remaining papers (n=35) used the TRIAGE-B-Type Natriuretic Peptide (BNP) test.

Diagnosis of Heart Failure in Papers

The majority of articles (n=45) based the diagnostic reference standard on clinical judgment.³^,⁷²^-⁸¹^,⁸³^-⁸⁵^,⁸⁷^,⁸⁹^-⁹⁹^,¹⁰¹^-¹⁰⁴^,¹⁰⁶^-¹⁰⁹^,¹¹¹^-¹²¹ Of these 45 articles, most (n=34) had a reference standard agreed upon by at least two physicians (mostly cardiologists), ten based the final diagnosis on the opinion of a single cardiologist or other type of clinician,⁷²^,⁷⁸^,⁸⁹^,⁹⁶^,¹⁰²^,¹⁰⁷^,¹⁰⁹^,¹¹⁸^-¹²⁰ and one article did not indicate this information.¹²¹ The adjudication physicians each arrived at a diagnosis of HF based on their interpretation of all available clinical data; this often included echocardiography results. One article¹⁰⁶ included BNP in the data used for adjudication. Of the 45 papers using clinical judgment to make the final diagnosis, the Framingham criteria were used in 15, and the National Health and Nutrition Examination Survey (NHANES) was used in 10.

Of the remaining articles (n=6), three based the final diagnosis of HF both on clinical judgment and results of echocardiography,⁸²^,⁸⁸^,¹⁰⁰ one based it on echocardiography results alone,⁸⁶ one reported that the definitive diagnosis was based on the Framingham criteria,¹¹⁰ and one reported that the HF status was based on discharge diagnosis.¹⁰⁵

BNP: Test Performance and Optimal Cutpoints in Emergency Department

Diagnostic Properties in BNP

The 51 papers evaluating BNP in the emergency department used several cutpoints ranging from 12.5⁸⁶ to 983.5⁸⁶ pg/mL or ng/L (mean=213.1; median=162). One study measured BNP in pmol/L and had cutpoints ranging from 20 to 100.¹⁰⁸ These were converted to pg/mL for analysis. Reported sensitivities ranged from 36 percent⁹² to 100 percent⁷⁴^,⁷⁸^,⁸⁶^,⁸⁹^,¹¹³ (mean=82.4 percent; median=86 percent), specificities from 14 percent⁷⁶ to 99 percent⁹⁶ (mean=75.4 percent; median=79.5%), and areas under the curve (AUC) of 0.08⁹² to 0.99⁷⁸^,⁸² (mean=0.84; median=0.89). Of the 51 papers looking at BNP, 14 also looked at NT-proBNP.⁸⁸^,¹⁰⁸^-¹²⁰ Appendix H Tables H-1 and H-2 present summary tables of these studies.
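The unit conversion mentioned above can be sketched as follows. The report does not state which conversion factor was used; this sketch assumes a factor of about 3.46 pg/mL per pmol/L, which follows from a BNP molar mass of roughly 3,464 g/mol. The function and constant names are illustrative only.

```python
# Assumption: BNP molar mass of about 3,464 g/mol, so 1 pmol/L is about 3.46 pg/mL.
# The report does not say which factor its authors applied.
ASSUMED_PG_PER_ML_PER_PMOL_PER_L = 3.46


def bnp_pmol_per_l_to_pg_per_ml(pmol_per_l: float,
                                factor: float = ASSUMED_PG_PER_ML_PER_PMOL_PER_L) -> float:
    """Convert a BNP concentration from pmol/L to pg/mL (numerically equal to ng/L)."""
    if pmol_per_l < 0:
        raise ValueError("concentration must be non-negative")
    return pmol_per_l * factor


# The two ends of the cutpoint range reported in pmol/L (20 and 100).
converted = {c: bnp_pmol_per_l_to_pg_per_ml(c) for c in (20, 100)}
```

Under this assumed factor, the 20 to 100 pmol/L range corresponds to roughly 69 to 346 pg/mL, which falls inside the range of cutpoints reported by the other papers.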

The majority of papers reported on the Triage BNP Point-of-Care test. Two papers reported on the Triage BNP test licensed to Beckman Coulter for use on their laboratory instruments.³^,¹⁰³ Four papers reported using the Abbott AxSYM,⁹⁷^,¹⁰⁰^,¹⁰¹^,¹¹⁰ and one reported using the ADVIA-Centaur system.⁸⁸ Gorissen et al.¹¹³ reported on two systems (ADVIA-Centaur and Triage).

Data were extracted and 2x2 tables were prepared. Forest plots of sensitivities, specificities, positive and negative likelihood ratios (LRs), and diagnostic odds ratios (DORs), together with summary receiver operating characteristic (ROC) curves, are presented in Appendix H Figures H-1 to H-12. Three cutpoints were selected: the lowest presented, the manufacturers' suggested cutpoint, and the optimal cutpoint as chosen by the authors.
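As an illustration of how the per-study measures follow from a 2x2 table, the sketch below computes sensitivity, specificity, both likelihood ratios, and the DOR using their standard definitions. The class name, function name, and example counts are hypothetical and are not taken from any included study; the pooling methods used for the summary estimates are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-classifying the BNP result against the reference diagnosis."""
    tp: int  # test positive, HF present
    fp: int  # test positive, HF absent
    fn: int  # test negative, HF present
    tn: int  # test negative, HF absent


def diagnostic_metrics(t: TwoByTwo) -> dict:
    """Return sensitivity, specificity, LR+, LR-, and DOR for one 2x2 table."""
    if min(t.tp, t.fp, t.fn, t.tn) <= 0:
        # Zero cells make some ratios undefined; this sketch simply refuses them.
        raise ValueError("all four cells must be positive in this sketch")
    sensitivity = t.tp / (t.tp + t.fn)
    specificity = t.tn / (t.tn + t.fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),
        "lr_neg": (1 - sensitivity) / specificity,
        "dor": (t.tp * t.tn) / (t.fp * t.fn),
    }


# Hypothetical study: 100 patients with HF and 100 without.
example = diagnostic_metrics(TwoByTwo(tp=90, fp=40, fn=10, tn=60))
```

With these hypothetical counts, sensitivity is 0.90, specificity is 0.60, LR+ is 2.25, LR- is about 0.17, and the DOR is 13.5 (which equals LR+ divided by LR-).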

If the lowest cutpoint presented by the authors is chosen, all papers except four¹¹¹^,¹¹³^,¹¹⁹^,¹²⁰ return sensitivities greater than 90 percent (summary estimate 95 percent, (95% confidence interval (CI) 93 to 97 percent)). Negative LRs (LR^-) were all less than 0.20 for this group. Overall, specificity was lower and much more variable, ranging from 27 to 88 percent (summary estimate 67 percent (95% CI, 58 to 75 percent)).

Among papers that reported a sensitivity less than 90 percent, Ray et al.¹¹¹ and Chenevier-Gobeaux et al.¹¹⁹ enrolled patients older than 65 years. Both papers used higher cutpoints than most other papers (Ray: 250 pg/mL; Chenevier-Gobeaux: 270 pg/mL for ages 65 to 84 years, and 290 pg/mL for ages 85 years and older). deFilippi et al.¹²⁰ enrolled a population with a high prevalence (47 percent) of subjects with eGFR <60 mL/min/1.73 m². Gorissen et al.¹¹³ reported using the ADVIA-Centaur and Triage assay systems. They also selected a high cutpoint (225 pg/mL) and reported sensitivities of 65 percent and 73 percent, below all other papers.

Using package inserts, 510(k) submission forms, and product brochures, we determined the manufacturers' recommended cutpoints. In all cases the manufacturer suggested a cutpoint of 100 pg/mL to rule out the diagnosis of HF. Twenty-one papers reported results for this cutpoint. Sensitivities ranged from 86 to 100 percent (summary estimate 95 percent (95% CI, 93 to 96 percent)), and specificities ranged from 31 to 97 percent (summary estimate 66 percent (95% CI, 56 to 74 percent)).³^,⁷⁴^,⁷⁹^,⁸¹^-⁸³^,⁸⁵^,⁸⁶^,⁸⁸^,⁸⁹^,⁹³^,⁹⁵^-⁹⁷^,¹⁰¹^,¹⁰⁴^,¹⁰⁷^,¹⁰⁸^,¹¹⁰^,¹¹²^,¹¹⁴

Twenty-eight papers³^,⁷⁴^,⁷⁷^-⁷⁹^,⁸¹^-⁸³^,⁸⁵^,⁸⁶^,⁸⁹^,⁹¹^,⁹³^-⁹⁸^,¹⁰⁰^,¹⁰⁴^,¹⁰⁸^,¹¹⁰^-¹¹⁴^,¹¹⁹^,¹²⁰ examined an optimal cutpoint. The majority (n=19) of the studies determined a cutpoint that maximized accuracy, either using an ROC curve or by examining several arbitrary cutpoints.⁷⁴^,⁷⁷^-⁷⁹^,⁸¹^-⁸³^,⁸⁵^,⁸⁶^,⁹⁴^,⁹⁶^,⁹⁷^,¹⁰⁸^,¹¹⁰^-¹¹³^,¹¹⁹^,¹²⁰ Three studies maximized sensitivity,⁸⁹^,⁹³^,¹⁰⁴ three others used the manufacturers' suggested cutpoint or other accepted threshold,³^,⁹¹^,¹¹⁴ one study used multiple logistic regression,⁹⁵ one set the sensitivity at 90 percent and determined specificity,¹⁰⁰ and one set the sensitivity at 96 percent in all subgroups and determined specificity.⁹⁸ Sensitivities ranged from 65 percent to 100 percent (summary estimate 91 percent (95% CI, 88 to 94 percent)), and specificities ranged from 34 percent to 97 percent (summary estimate 80 percent (95% CI, 74 to 85 percent)). Using the optimal cutpoint resulted in a higher overall estimate of the positive LR (LR⁺) (4.61 (95% CI, 3.49 to 6.09)) compared to either the lowest cutpoint (2.85 (95% CI, 2.23 to 3.65)) or the manufacturer cutpoint (2.76 (95% CI, 2.12 to 3.59)). The LR^- was not significantly different (p>0.05).
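The most common approach above, choosing the cutpoint that maximizes accuracy across candidate thresholds, can be sketched as below. This is an illustration only: the data are hypothetical, the function names are invented for this example, a result at or above the cutpoint is treated as positive, and ties are broken toward the lower cutpoint (an assumption that favors sensitivity). Individual papers may have used other criteria.

```python
def accuracy_at(cutpoint, bnp_values, has_hf):
    """Fraction of patients classified correctly when BNP >= cutpoint is called positive."""
    correct = sum((value >= cutpoint) == diseased
                  for value, diseased in zip(bnp_values, has_hf))
    return correct / len(bnp_values)


def accuracy_maximizing_cutpoint(bnp_values, has_hf):
    """Scan every observed BNP value as a candidate cutpoint and keep the most accurate one.

    Ties are broken toward the lower cutpoint (assumption of this sketch).
    """
    candidates = sorted(set(bnp_values))
    return max(candidates,
               key=lambda c: (accuracy_at(c, bnp_values, has_hf), -c))


# Hypothetical BNP results (pg/mL) and reference diagnoses for ten patients.
bnp = [30, 60, 95, 120, 150, 240, 400, 650, 900, 1200]
hf = [False, False, False, False, True, False, True, True, True, True]

best = accuracy_maximizing_cutpoint(bnp, hf)
best_accuracy = accuracy_at(best, bnp, hf)
```

In this hypothetical sample, cutpoints of 150 and 400 pg/mL both classify nine of ten patients correctly, and the tie-break selects 150 pg/mL.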

Choosing the lowest, manufacturer, or optimal cutpoint had little effect on the diagnostic performance of the test. The test displayed high sensitivity and a low LR^-, but low specificity and a low LR⁺.
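To show what likelihood ratios of this size mean for an individual patient, the sketch below converts a pre-test probability into a post-test probability through odds. It uses the mean HF prevalence reported above (45.1 percent) as the pre-test probability and the pooled LR⁺ of 4.61 for the optimal cutpoint; the LR^- of 0.20 is the upper bound reported for the lowest cutpoint and is used here only as an illustrative value. The function name is hypothetical.

```python
def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Apply a likelihood ratio to a pre-test probability via the odds form of Bayes' rule."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


PRE_TEST = 0.451            # mean HF prevalence across the BNP papers
after_positive = post_test_probability(PRE_TEST, 4.61)   # pooled LR+ at the optimal cutpoint
after_negative = post_test_probability(PRE_TEST, 0.20)   # illustrative LR- (upper bound)
```

Under these inputs a positive result raises the probability of HF to about 79 percent, and a negative result lowers it to about 14 percent or less, which matches the pattern of a test that is better at ruling out than ruling in.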

BNP: Determinants of Test Performance in Emergency Department

The effects of various determinants on the diagnostic performance of BNP for the diagnosis of HF were examined.

Age

Eight articles⁷³^,⁷⁴^,⁸⁵^,⁸⁹^,¹⁰¹^,¹¹¹^,¹¹³^,¹¹⁹ examined the relationship between age and BNP. In all cases, increasing age was associated with an increase in BNP concentration, but the correlation of age with the diagnostic performance of the test was not clear in the papers.

Six papers examined the effect of age on the AUC (Table 1).⁷³^,⁷⁴^,⁸⁵^,⁸⁹^,¹¹³^,¹¹⁹

Table 1

Effect of age on AUC for BNP.

Four papers⁷³^,¹⁰¹^,¹¹¹^,¹¹⁹ examined different decision cutpoints based upon age, each using different reasoning and criteria (Table 2). Maisel et al.⁷³ suggested cutpoints no greater than 100 pg/mL for both age groups, above and below 70 years of age. These decision points maximized sensitivity, with specificity being the second concern. Their reasoning was that a false negative result was less desirable than a false positive in terms of cost to the patient.

Table 2

Effect of age on diagnostic performance of BNP.

Rogers et al.¹⁰¹ using the manufacturers' suggested cutpoint of 100 pg/mL, established the sensitivity of the entire cohort at 91 percent. To achieve 91 percent sensitivity in those 75 years of age and older, the decision point was set at 184 pg/mL. The specificity at this point was 54 percent.
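The idea of moving a subgroup's decision point until it matches a target sensitivity can be sketched as follows. This is a hedged illustration, not the procedure documented by Rogers et al.: the BNP values are hypothetical, the function names are invented here, and a result at or above the cutpoint is treated as positive.

```python
import math


def sensitivity_at(cutpoint, hf_bnp_values):
    """Fraction of patients with HF whose BNP is at or above the cutpoint."""
    return sum(v >= cutpoint for v in hf_bnp_values) / len(hf_bnp_values)


def highest_cutpoint_for_sensitivity(hf_bnp_values, target_sensitivity):
    """Return the highest observed BNP value that still detects the target share of HF cases."""
    if not 0 < target_sensitivity <= 1:
        raise ValueError("target sensitivity must be in (0, 1]")
    values = sorted(hf_bnp_values)
    n = len(values)
    # Minimum number of HF patients that must test positive (small epsilon guards float error).
    required_hits = math.ceil(n * target_sensitivity - 1e-9)
    # Everything from this index upward is at or above the returned cutpoint.
    return values[n - required_hits]


# Hypothetical BNP results (pg/mL) for ten HF patients in an older subgroup.
older_hf_bnp = [80, 150, 190, 260, 340, 480, 610, 820, 1100, 1500]
subgroup_cutpoint = highest_cutpoint_for_sensitivity(older_hf_bnp, 0.90)
```

Raising the cutpoint this way holds sensitivity at the target while allowing specificity in the subgroup to improve, which is the trade-off the age-specific decision points above are trying to exploit.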

Chenevier-Gobeaux et al.¹¹⁹ compared the very elderly, 85 years of age and older, with those aged 65 to 84. For the younger group, the optimal cutpoint was 270 pg/mL (sensitivity 73%, specificity 83%), whereas for the very elderly the optimal cutpoint was 290 pg/mL (sensitivity 80%, specificity 69%).

For those aged 65 and older, Ray et al.¹¹¹ established an optimal cutpoint of 250 pg/mL (sensitivity 73%, specificity 91%). In an earlier paper,⁸⁵ these authors also established an optimal cutpoint of 250 pg/mL (sensitivity 78%, specificity 90%). It is not clear if these publications used independent study populations.

Gorissen et al.¹¹³ examined two different BNP assays and divided their population into three age groups. For the Triage assay, the optimal cutpoint for those less than 65 years was 91 pg/mL (sensitivity 55%, specificity 100%), for those 65 to 75 years of age it was 260 pg/mL (sensitivity 83%, specificity 82%), and for those greater than 75 years the optimal cutpoint was 309 pg/mL (sensitivity 71%, specificity 68%). Similarly, for the Siemens Centaur assay the cutpoints were 91 pg/mL (sensitivity 55%, specificity 100%), 188 pg/mL (sensitivity 83%, specificity 73%), and 247 pg/mL (sensitivity 77%, specificity 68%), respectively.

All authors reported that the optimal BNP threshold for diagnosis of HF increases with age, but there is no consensus on how to set the threshold.

Sex

Two papers examined sex and BNP⁷³^,⁷⁴ (Table 3). Maisel et al.⁷³ reported that the difference in BNP concentrations between men and women was not significant. Knudson et al.⁷⁴ noted differences in sensitivity between males and females using 100 pg/mL as the decision point (males: sensitivity 94.3%, specificity 54.9%; females: sensitivity 90.0%, specificity 55.2%).

Table 3

Effect of sex on AUC for BNP.

Ethnicity

One study examined the effect of ethnicity on the diagnostic properties of BNP. Maisel et al.⁷³ reported that the prevalence of HF in their population was significantly greater among whites than among African Americans. Similarly, the concentration of BNP in the white population was significantly greater than in the African American population (200 vs. 117 pg/mL, p<0.001). The AUC is shown in Table 4.

Table 4

Effect of ethnicity on AUC for BNP.

Obesity/Body Mass Index

Three papers⁹¹^,¹⁰¹^,¹⁰² examined the effect of obesity on the diagnostic properties of BNP. All three showed that increasing BMI was associated with reduced BNP concentrations. This was true if BMI and BNP were examined in the whole population,¹⁰¹^,¹⁰² or if the population was examined in two groups: those with and without HF.⁹¹

Daniels et al.⁹¹ examined the diagnostic properties using a fixed decision point of 100 pg/mL. The sensitivity decreased, but the specificity increased, as the BMI increased. In this study the decision points needed to achieve 90 percent sensitivity were 170 pg/mL for BMI less than 25 kg/m², 110 pg/mL for BMI 25 to 35 kg/m², and 54 pg/mL for BMI greater than 35 kg/m². Specificity was greater than 70 percent in all three subgroups. Rogers et al.¹⁰¹ also adjusted the decision point of the BMI greater than 35 kg/m² group to achieve the same sensitivity (91%) as the entire cohort (100 pg/mL). This decision point (25 pg/mL) resulted in a reduced specificity. Noveanu et al.¹⁰² examined the diagnostic properties at two decision points, 100 and 500 pg/mL. Table 5 displays the diagnostic properties reported in these papers.

Table 5

Effect of body mass index on diagnostic performance of BNP.

Renal Function

Five papers¹⁰¹^,¹⁰⁹^,¹¹³^,¹¹⁴^,¹²⁰ examined the relationship between renal function and the diagnostic properties of BNP. Four¹⁰⁹^,¹¹³^,¹¹⁴^,¹²⁰ examined eGFR (Table 6) and one¹⁰¹ examined serum creatinine concentration. Three papers¹⁰⁹^,¹¹⁴^,¹²⁰ optimized the decision point based on eGFR, two¹⁰⁹^,¹¹⁴ maximized sensitivity, and one¹²⁰ maximized accuracy.

Table 6

Effect of renal function on diagnostic performance of BNP.

The BNP concentration was inversely related to renal function: as the eGFR decreased or creatinine concentration increased, the BNP concentration increased.

Using the recommended cutpoint of 100 pg/mL, Rogers et al.¹⁰¹ reported a sensitivity of 100 percent and a specificity of 30 percent for those subjects with serum creatinine ≥2 mg/dL. They then adjusted the decision point for those subjects with serum creatinine ≥2 mg/dL to equal the sensitivity of the entire cohort using the recommended decision point of 100 pg/mL (sensitivity 91%, specificity 54%). This resulted in a cutpoint of 449 pg/mL (specificity 78%).

While these authors recognized that sex, ethnicity, obesity, and renal function have significant effects upon concentration of BNP and potentially on the diagnostic performance of BNP in the diagnosis of HF in the emergency department, all also recognized the difficulty in establishing multiple decision points.

Diabetes

One study⁸⁴ examined the effect of diabetes mellitus on the use of BNP for the diagnosis of HF. This study reported a nonsignificant difference in the AUC of 0.888 (95% CI, 0.860 to 0.912) for nondiabetics versus 0.878 (95% CI, 0.837 to 0.913) for diabetics.

Sample and Design Characteristics of Papers Assessing NT-proBNP

Thirty-nine articles met the criteria for KQ1 and examined NT-proBNP. Twenty-five examined NT-proBNP only¹^,²^,²⁶^,⁸⁸^,¹²²^,¹²⁴^-¹⁴³ and 14 examined both BNP and NT-proBNP.¹⁰⁸^-¹²¹ See Appendix H Table H-3.