Author information
- †Cardiology Division, Massachusetts General Hospital, Boston, Massachusetts
- ‡Department of Cardiology, University Medical Center Utrecht, Utrecht, the Netherlands
- ∗Reprint requests and correspondence:
Dr. James L. Januzzi, Jr, Division of Cardiology, Massachusetts General Hospital, Yawkey 5984, 32 Fruit Street, Boston, Massachusetts 02114.
“The important thing is to not stop questioning.” (1)
Following the introduction of B-type natriuretic peptide (BNP) and its amino-terminal equivalent (NT-proBNP), the use of biomarkers for the evaluation and management of heart failure (HF) has grown. Indeed, natriuretic peptide testing for diagnosis and prognosis recently earned a Class I, Level of Evidence: A recommendation in the 2013 American Heart Association/American College of Cardiology clinical practice guidelines for HF (2). Although such support took years to develop, it is proper recognition of the clinical role played by these important biomarkers.
The studies that ultimately led to this recognition more often than not indicated the value of BNP and NT-proBNP. However, during this time, a great evolution occurred in the standards for evaluating the merits of biomarkers in HF: compared with earlier analyses (which occasionally appear primitive in retrospect), the standards for assessment of novel biomarkers have become considerably more complex. Newer assays appropriately have a higher bar to clear, and this is critical, because their measurement raises important considerations of incremental cost relative to benefit.
To this point, and more noteworthy than the ascendancy of BNP and NT-proBNP, the same HF clinical practice guidelines gave “biomarkers of myocardial fibrosis” a Class IIb recommendation for use. The guidelines were specifically referring to galectin-3 and soluble ST2 (sST2). Both biomarkers have a scientific rationale for their measurement in patients with HF, and since early publications identified them as candidate prognostic biomarkers in HF (3,4), both have accumulated data showing their prognostic importance in a wide range of patient types, including those at risk for HF and those with established left ventricular dysfunction (with or without symptomatic congestion). Both galectin-3 and sST2 also have regulatory approval for clinical use in a broad range of markets worldwide, and both are commercially available.
Importantly, analogous to the early BNP and NT-proBNP experience, whereas galectin-3 and sST2 have a biological rationale for use, and both have been shown to be prognostic in HF, no head-to-head data exist with respect to the prognostic information provided by these 2 novel assays. This latter point is not insignificant: differences between galectin-3 and sST2 are considerably greater than those of the 2 natriuretic peptides, and whereas both share a common category and indication for their measurement, they are fundamentally different biomarkers and provide different information. Therefore, understanding their individual and collective merit is an important exercise, given their regulatory approval and availability for clinical use.
It is in this context that investigators from Barcelona examined the comparative prognostic value of galectin-3 and sST2 in a thorough manner. As shown by Bayes-Genis et al. (5), in this issue of the Journal, when tested in 876 patients with chronic, ambulatory HF (median age: 70 years, average left ventricular ejection fraction: 34%), both biomarkers were prognostic for adverse outcomes (including all-cause and cardiovascular mortality, as well as hospitalization for HF) when analyzed in a univariate manner. In adjusted models, sST2 had improved discrimination, good calibration, and considerably reclassified clinical risk for each outcome measure; in contrast, galectin-3 did not survive adjustment for baseline variables. In head-to-head comparisons, sST2 was more prognostic.
When considering the merits of the exploding number of biomarkers available for testing in HF, we recently argued for careful evaluation of candidates for clinical application (6), emphasizing the importance of comparative analyses with statistical rigor. In this regard, it is worth reviewing the methods by which an HF biomarker may be evaluated, in order to understand the potential clinical significance of the present results comparing galectin-3 and sST2.
Among the factors to consider in the study of biomarkers in HF are discrimination, calibration, and reclassification (summarized in Table 1); all are applied well in the present study.
“Discrimination” is a statistical term that reflects the ability of a prognostic tool to distinguish patients who will have an event from those who will not; in contrast, “calibration” measures how closely the model estimate of a specific outcome matches the true occurrence of the outcome, and it is an important component in validating models that assess discrimination.
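As a minimal sketch of what a calibration check can look like (not the method used in the study under discussion), the Python below sorts patients by predicted risk, splits them into equal-sized groups, and compares the mean predicted risk with the observed event rate in each group. The function name `calibration_table`, the number of groups, and the toy data are assumptions made purely for illustration.

```python
def calibration_table(risks, events, n_groups=2):
    """Compare predicted and observed risk within groups of similar predicted risk.

    risks  : predicted probabilities of the outcome (0 to 1), one per patient
    events : 1 if the outcome occurred, 0 otherwise
    Returns a list of (mean predicted risk, observed event rate, group size).
    """
    if len(risks) != len(events):
        raise ValueError("risks and events must have the same length")
    paired = sorted(zip(risks, events))  # order patients by predicted risk
    n = len(paired)
    rows = []
    for g in range(n_groups):
        chunk = paired[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # more groups than patients: skip empty groups
        mean_predicted = sum(r for r, _ in chunk) / len(chunk)
        observed_rate = sum(e for _, e in chunk) / len(chunk)
        rows.append((mean_predicted, observed_rate, len(chunk)))
    return rows


# Hypothetical toy data: a well-calibrated model, where 25% of the
# low-risk group and 75% of the high-risk group have the event.
toy_risks = [0.25, 0.25, 0.25, 0.25, 0.75, 0.75, 0.75, 0.75]
toy_events = [0, 0, 0, 1, 1, 1, 1, 0]
for predicted, observed, size in calibration_table(toy_risks, toy_events):
    print(f"predicted {predicted:.2f}  observed {observed:.2f}  n={size}")
```

In a well-calibrated model the predicted and observed columns track each other closely across groups; large gaps suggest that the model's absolute risk estimates should not be trusted even if its discrimination looks good.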
The most commonly used way to evaluate the discrimination of an HF biomarker is to assess the area under the receiver-operating characteristic curve, which summarizes the balance of sensitivity and specificity against a gold standard outcome; the area under the curve is also known as the C-statistic, particularly when used in logistic regression. The advantage of this approach is the ability to compare changes in the C-statistic resulting from the addition or subtraction of variables to a pre-existing model. However, small but statistically significant changes in the C-statistic may be clinically irrelevant, and there is no generally agreed-upon “clinically important” improvement in the C-statistic, which creates challenges for translating data into clinical practice. In some cases, a valuable variable may provide little change to a very robust baseline model, whereas less important variables may result in substantial changes in a more unstable model. In addition, receiver-operating characteristic testing is most valuable when an outcome is already present, such as for a diagnostic test; its use for prognosis is less trustworthy. To verify discriminative results, calibration is an important next step. A result may be significant in discriminative models but not sound if calibration is poor, as with the all-cause mortality risk predicted by galectin-3 in this study.
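To make the C-statistic concrete, the following Python sketch computes it for a binary outcome as the proportion of event/nonevent patient pairs in which the patient with the event received the higher predicted risk, with ties counted as one-half. This is an illustrative simplification: it ignores censoring, which time-to-event analyses must handle, and the function name `c_statistic` and the toy risk values are assumptions, not data from the study.

```python
from itertools import product


def c_statistic(risks, events):
    """Area under the ROC curve for a binary outcome, by pairwise concordance.

    risks  : predicted risks (any monotone score works), one per patient
    events : 1 if the outcome occurred, 0 otherwise
    """
    event_risks = [r for r, e in zip(risks, events) if e]
    nonevent_risks = [r for r, e in zip(risks, events) if not e]
    if not event_risks or not nonevent_risks:
        raise ValueError("need at least one event and one nonevent")
    concordant = 0.0
    for event_risk, nonevent_risk in product(event_risks, nonevent_risks):
        if event_risk > nonevent_risk:
            concordant += 1.0   # model ranked the pair correctly
        elif event_risk == nonevent_risk:
            concordant += 0.5   # tie: counted as half
    return concordant / (len(event_risks) * len(nonevent_risks))


# Hypothetical toy data: the same four patients scored by a baseline
# model and by the baseline model plus a new biomarker.
outcomes = [0, 0, 1, 1]
baseline_risks = [0.2, 0.4, 0.4, 0.7]
with_marker_risks = [0.2, 0.3, 0.5, 0.7]
print(c_statistic(baseline_risks, outcomes))     # 0.875
print(c_statistic(with_marker_risks, outcomes))  # 1.0
```

The difference between the two values is the change in C-statistic attributable to the added variable; as noted above, whether a given change is clinically meaningful is a separate question from whether it is statistically significant.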
To provide clinical relevance to results, “reclassification” measures have been introduced, with the net reclassification improvement or integrated discrimination improvement approaches applied most often. These methods essentially total the overall change in risk prediction from the addition of results to a model: the proportions of correctly upward-classified events and downward-classified nonevents are summed, and the proportions of incorrectly downward-classified events and upward-classified nonevents are subtracted. The value of reclassification analyses is that they provide immediately clinically useful information for assessing the potential impact of a novel biomarker when applied to a patient population in which it may be tested. One limitation of reclassification analyses is that a lack of gold standard categories of risk may result in the use of arbitrary risk strata.
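The arithmetic described above can be sketched in Python for the category-based net reclassification improvement. The sketch assumes each patient has been assigned an ordered risk category (for example 0 = low, 1 = intermediate, 2 = high) under the old and the new model; the function name, the category coding, and the toy data are illustrative assumptions and do not come from the study.

```python
def net_reclassification_improvement(old_categories, new_categories, events):
    """Category-based NRI.

    NRI = [P(up | event) - P(down | event)]
        + [P(down | nonevent) - P(up | nonevent)]

    Returns (overall NRI, event component, nonevent component).
    """
    up_events = down_events = up_nonevents = down_nonevents = 0
    n_events = n_nonevents = 0
    for old, new, event in zip(old_categories, new_categories, events):
        if event:
            n_events += 1
            up_events += new > old        # correct: event moved to higher risk
            down_events += new < old      # incorrect: event moved to lower risk
        else:
            n_nonevents += 1
            up_nonevents += new > old     # incorrect: nonevent moved up
            down_nonevents += new < old   # correct: nonevent moved down
    if n_events == 0 or n_nonevents == 0:
        raise ValueError("need at least one event and one nonevent")
    nri_events = (up_events - down_events) / n_events
    nri_nonevents = (down_nonevents - up_nonevents) / n_nonevents
    return nri_events + nri_nonevents, nri_events, nri_nonevents


# Hypothetical toy data: four patients with events, four without.
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]
old_model = [0, 1, 1, 2, 0, 1, 1, 2]
new_model = [1, 2, 1, 1, 0, 0, 1, 1]
print(net_reclassification_improvement(old_model, new_model, outcomes))
# (0.75, 0.25, 0.5)
```

Reporting the event and nonevent components separately is useful because a marker can improve classification in one group while worsening it in the other; a negative event component corresponds to misclassifying risk among patients who go on to have the outcome.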
In the analysis by Bayes-Genis et al. (5), change in the C-statistic (discrimination) was significant when sST2 was added to a strong clinical model for predicting all-cause mortality, cardiovascular mortality, the combination of both outcome measures, as well as for death or HF hospitalization; galectin-3 showed modest, comparable performance. That said, it is worth noting that the 95% confidence intervals of the C-statistic overlapped between sST2 and galectin-3; in this regard, superiority for sST2 is not necessarily conclusive. However, calibration measures were superior for sST2 versus galectin-3. Perhaps most clinically relevant and most significant, in this cohort, sST2 reclassified risk by a substantial degree, whereas galectin-3 did not. In fact, galectin-3 actually misclassified risk in those patients destined to die. Lastly, head-to-head analyses between the 2 markers in this study favored sST2.
This study has little to assail in terms of methodologic approach, with results that send a clear message. However, there are limitations worth emphasizing. First, it is a relatively small analysis of a predominantly male population, with a wide range of left ventricular function. Whereas study participants were nearly always taking angiotensin-converting enzyme inhibitors/angiotensin II receptor blockers or beta-adrenergic blockers, the doses administered are not provided; furthermore, only 39% were taking mineralocorticoid receptor antagonists, and a low percentage had received cardiac resynchronization therapy and/or implantable cardioverter-defibrillators. These differences make validation of these results in a broad range of other patient types important. Moreover, the analysis examined only a single measurement of both biomarkers; serial measurement adds considerable statistical and clinical value for prognostic biomarkers and may have affected the results of this analysis, as happened in recent analyses of galectin-3 (7). Finally, whereas this analysis found sST2 prognostically meaningful, translating prognostic merit to a strategic clinical response was not explored, something we should all hope to see with promising HF biomarkers in the future.
Whereas the results of this excellent study do not shut the door on galectin-3, they should give us pause, as should the data for any other novel biomarker being evaluated for use in our patients. The rigor of this analysis is a fine example of the evolution in comparative biomarker studies and how they may inform a path forward to clinical care. As the use of biomarkers expands for the care of our patients, we should never stop questioning and improving how we do so.
The authors thank Dr. Mona Fiuzat and Dr. Christopher O'Connor for their insights regarding this editorial.
∗ Editorials published in the Journal of the American College of Cardiology reflect the views of the authors and do not necessarily represent the views of JACC or the American College of Cardiology.
Dr. Januzzi has received grant support from Roche Diagnostics, Critical Diagnostics, BG Medicine, Siemens, and Thermo-Fisher, as well as consulting income from Singulex. Dr. van Kimmenade has reported that he has no relationships relevant to the contents of this paper to disclose.
- American College of Cardiology Foundation
- Einstein A. From the memoirs of William Miller, an editor, quoted in Life magazine, May 2, 1955; expanded, p. 281.
- Yancy C.W., Jessup M., Bozkurt B., et al.
- Januzzi J.L. Jr., Peacock W.F., Maisel A.S., et al.
- van Kimmenade R.R., Januzzi J.L. Jr., Ellinor P.T., et al.
- Bayes-Genis A., de Antonio M., Vila J., et al.
- van Kimmenade R.R., Januzzi J.L. Jr.