Author information
- Received March 26, 2018
- Revision received April 13, 2018
- Accepted April 16, 2018
- Published online June 18, 2018.
- Department of Medical Statistics, London School of Hygiene & Tropical Medicine, London, United Kingdom
- ∗Address for correspondence: Dr. Stuart J. Pocock, Department of Medical Statistics, London School of Hygiene & Tropical Medicine, Keppel Street, London WC1E 7HT, United Kingdom.
The late-breaking clinical trials presentations at the American College of Cardiology Scientific Sessions in March 2018 are an important contribution to the field of cardiology. This paper presents a constructive critical appraisal of 7 key studies: ODYSSEY OUTCOMES (Evaluation of Cardiovascular Outcomes After an Acute Coronary Syndrome During Treatment With Alirocumab), VEST (Vest Prevention of Early Sudden Death Trial), SECURE-PCI (Statins Evaluation in Coronary Procedures and Revascularization), TREAT (Ticagrelor in Patients with ST-Elevation Myocardial Infarction treated with Pharmacological Thrombolysis), POISE (PeriOperative ISchemic Evaluation), SMART-DATE (Safety of 6-Month Duration of Dual Antiplatelet Therapy After Percutaneous Coronary Intervention in Patients With Acute Coronary Syndrome), and CVD-REAL 2 (Comparative Effectiveness of Cardiovascular Outcomes in New Users of SGLT-2 Inhibitors). For each study, our aim is to document and interpret the main findings, noting particularly when “positive spin” appears to occur, and to provide a balanced account of each study, paying attention to both constructive new findings and study limitations. These topical examples also provide useful general insights on what to look for when critiquing clinical trial presentations and publications.
Each year, the American College of Cardiology (ACC) Scientific Sessions are a major forum for presentations of original findings across a broad spectrum of research activities in cardiology. Of particular interest are the late-breaking clinical trials sessions, because they provide the latest pivotal evidence on both new and established treatment practices in cardiology.
This year, from March 10 to 12, 2018, there were 8 such sessions in which 37 studies were presented. To review all of these studies would be an immense task; hence, we chose to provide a constructive critical appraisal of 7 key presentations (Central Illustration). These studies were chosen because they were: 1) of major clinical importance; and 2) within our sphere of expertise.
For each study, our aim is to place it in context, summarize the design, present the main findings, and then provide a critical interpretation. We paid particular attention to the multiplicity of data available for presentation and the consequent problems that arise (e.g., in having multiple secondary endpoints or multiple subgroup analyses). Potential biases (e.g., in the 1 nonrandomized study we review) are assessed.
Trialists naturally wish to emphasize the more positive aspects of their study findings. This “positive spin” carries the risk that presentations may not provide a balanced account of the totality of evidence (1). We point out instances when this appears to occur.
Overall, we hope this paper provides a meaningful commentary on some of the most topical (and sometimes controversial) presentations at ACC 2018.
The ODYSSEY OUTCOMES Trial
Alirocumab in acute coronary syndrome
The ODYSSEY OUTCOMES (Evaluation of Cardiovascular Outcomes After an Acute Coronary Syndrome During Treatment With Alirocumab) trial (2) recruited 18,924 patients who: 1) had an acute coronary syndrome (ACS) event in the past 1 to 12 months; 2) were on high-intensity statin therapy; and 3) had inadequate control of lipids (e.g., low-density lipoprotein [LDL] cholesterol ≥70 mg/dl). Patients were randomized to alirocumab (a PCSK9 inhibitor) or placebo. The primary composite efficacy endpoint was coronary heart disease (CHD) death, nonfatal myocardial infarction (MI), fatal or nonfatal ischemic stroke, or unstable angina requiring hospitalization. As is common practice, we will refer to these as major adverse cardiovascular events (MACE). Median follow-up was 2.8 years. As expected, patients on alirocumab had a marked reduction in LDL cholesterol compared with placebo: −62.7% at 4 months, which attenuated slightly to −54.7% at 4 years.
Results for the primary efficacy endpoint and its components are shown in the top half of Table 1. MACE had a highly significant 15% relative reduction (hazard ratio [HR]: 0.85) with 95% confidence interval (CI): 7% to 22%; p = 0.0003. All 4 components of MACE had fewer events on alirocumab compared with placebo, although this was not significant for CHD death.
It is relevant to express this primary result on an absolute scale. There were 149 fewer patients with a MACE event on alirocumab, with 9,462 patients per arm followed for a median of 2.8 years. This translates into a reduction of 5.62 first MACE events per 1,000 patient-years of treatment, with 95% CI: 2.35 to 8.89 per 1,000 patient-years. This can be converted to a number needed to treat: to prevent 1 MACE event, one needs to treat 63 patients for a median of 2.8 years (95% CI: 41 to 141 patients). This is helpful in elucidating whether an overall strategy of prescribing alirocumab to all eligible patients is sufficiently effective and in turn cost-effective.
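The arithmetic behind these absolute measures can be sketched as follows. This is a minimal illustration that uses only the figures quoted above. It approximates exposure in each arm as patients multiplied by median follow-up, which is our simplifying assumption and not necessarily how the trial derived patient-years. All variable names are ours.

```python
# Figures quoted in the text for ODYSSEY OUTCOMES
fewer_events = 149            # fewer patients with a first MACE event on alirocumab
patients_per_arm = 9_462
median_followup_years = 2.8

# Absolute risk reduction over the median follow-up
arr = fewer_events / patients_per_arm

# Number needed to treat (for a median of 2.8 years) to prevent 1 MACE event
nnt = 1 / arr

# Assumption: exposure per arm approximated as patients x median follow-up
patient_years = patients_per_arm * median_followup_years
events_prevented_per_1000_py = 1000 * fewer_events / patient_years

print(f"Absolute risk reduction: {100 * arr:.2f}%")
print(f"Number needed to treat: {nnt:.1f}")
print(f"Events prevented per 1,000 patient-years: {events_prevented_per_1000_py:.2f}")
```

Under this approximation, the sketch gives about 5.62 events prevented per 1,000 patient-years and a number needed to treat between 63 and 64.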
There are several important considerations here:
1. We are confined to the trial’s inevitably limited follow-up, so we cannot generalize to the effects of longer-term treatment.
2. The plot of cumulative MACE events over time by treatment group (Figure 1) reveals no separation of the curves out to 1 year. This significant treatment-time interaction (p = 0.03) means that all the benefit appears to kick in after 1 year of treatment. This departure from proportional hazards calls into question whether an HR is the best overall summary of the treatment effect.
3. This absolute benefit will vary from patient to patient: that is, higher-risk patients are liable to have a higher absolute benefit. For instance, the 27% of patients who were >65 years of age had a MACE rate around 55% higher than the rest. We would encourage the authors to undertake appropriate multivariable analysis so patients can be stratified according to their risk status (3). This will help refine which patients benefit the most from alirocumab treatment.
Now, we turn to the main secondary endpoints (bottom half of Table 1), which are listed in a pre-defined order for hierarchical statistical testing (4). This is to keep the overall type 1 error at 0.05. The first 4 on the list were all highly significant, but CHD death and cardiovascular (CV) death were not (p = 0.38 and p = 0.15, respectively).
For all-cause death, there is an observed 15% relative risk reduction (HR: 0.85) with a 95% CI: 2% to 27% reduction; p = 0.026. However, because this sits lower in the hierarchy of statistical testing, it does not fit in the formal list of claims for treatment efficacy within the bounds of strict type 1 error control. A counter-argument is that overall survival is clearly the most important matter for patients and, hence, merits special attention beyond statistical formalities. A weakness in this counter-argument is that the all-cause death finding rests on combining nonsignificant reductions in both CV and non-CV deaths (31 and 27 fewer deaths, respectively), and the latter has no obvious rationale.
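The logic of hierarchical (fixed-sequence) testing can be sketched in a few lines. This is a generic illustration, not the trial's statistical analysis plan. The names and p-values of the first 4 endpoints are hypothetical placeholders, because the text states only that they were highly significant. The last 3 p-values are those quoted above.

```python
def fixed_sequence_claims(ordered_results, alpha=0.05):
    """Return the endpoints that can be formally claimed.

    Endpoints are tested in their pre-defined order, each at the full alpha.
    Testing stops at the first nonsignificant result, so everything further
    down the list is exploratory, whatever its own p-value.
    """
    claimed = []
    for name, p_value in ordered_results:
        if p_value >= alpha:
            break
        claimed.append(name)
    return claimed


secondary_endpoints = [
    ("secondary endpoint 1", 0.001),  # hypothetical placeholder
    ("secondary endpoint 2", 0.001),  # hypothetical placeholder
    ("secondary endpoint 3", 0.001),  # hypothetical placeholder
    ("secondary endpoint 4", 0.001),  # hypothetical placeholder
    ("CHD death", 0.38),              # quoted in the text
    ("CV death", 0.15),               # quoted in the text
    ("all-cause death", 0.026),       # quoted in the text
]

claims = fixed_sequence_claims(secondary_endpoints)
print(claims)
```

Because the sequence stops at CHD death, all-cause death is never formally tested, even though its nominal p-value is below 0.05.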
The next concern is over the interpretation of subgroup analyses for the primary MACE outcome. For the 5 main pre-specified subgroups, there were no statistically significant interactions with treatment. This would normally be the end of the matter: insufficient evidence that there are any identifiable effect-modifiers. But, in this case, the idea is pursued that alirocumab may be more effective in the 30% of patients who had baseline LDL cholesterol ≥100 mg/dl: the observed relative risk reduction becomes 24% (95% CI: 13% to 35%), but it is questionable whether a post hoc emphasis on this finding is justifiable (5,6).
Even more doubtful is the claim that all-cause mortality is reduced by 29% (95% CI: 10% to 44%) in patients with LDL cholesterol ≥100 mg/dl. Such data dredging amongst subgroup analyses for a secondary endpoint has little merit.
Last, it is interesting to compare the main ODYSSEY findings with those of the FOURIER (Further Cardiovascular Outcomes Research with PCSK9 Inhibition in Subjects with Elevated Risk) trial of evolocumab (7), another PCSK9 inhibitor. The 2 study populations are different: FOURIER focused on patients with a history of MI, nonhemorrhagic stroke, or peripheral artery disease. Nevertheless, there is some consistency of findings. Both trials show that a PCSK9 inhibitor reduces the risk of MI and ischemic stroke. Also, neither trial shows an effect on CV death. Inconsistencies are that ODYSSEY shows apparent reductions in both unstable angina and in all-cause death, whereas FOURIER does not. This further weakens the claim that alirocumab reduces mortality in patients with ACS.
The VEST Trial
Wearable cardioverter-defibrillator in post-MI patients
The hypothesis posed for VEST (Vest Prevention of Early Sudden Death Trial) (8) is: can a wearable cardioverter-defibrillator (WCD) reduce the risk of sudden death in the immediate post-MI period (up to 90 days) in patients with reduced ejection fraction (EF)? The trial recruited 2,302 patients within 7 days of hospital discharge after acute MI who had EF ≤35%. They were randomized in a 2:1 ratio to WCD + guideline treatment (n = 1,524) versus guideline treatment only (n = 778) and were then followed for 90 days.
Results for the primary outcome (sudden death) and several pre-defined fatal and nonfatal secondary outcomes are shown in Table 2. There is not a significant reduction in sudden death (p = 0.18), and hence, some have called this a “negative” trial. This we find too dismissive, because the observed difference in incidence of sudden death (1.6% vs. 2.4%) is in favor of WCD: a 32.8% relative reduction, but with a wide 95% CI ranging from a 21.2% increase to a 62.8% decrease. A better term is to call the trial “inconclusive.”
The problem is that the trial only has good statistical power to detect very marked treatment differences. For instance, had the total of 44 sudden deaths split 22 (1.4%) on WCD and 22 (2.8%) on control, then this hypothetical 50% risk reduction would have been significant with p = 0.02.
Even if the trial had been twice as big (n = 4,604) the observed 32.8% reduction would still only have p = 0.06. It would require 3 times as many patients (n = 6,906) for such a risk reduction to achieve p = 0.02.
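These power considerations can be checked approximately with a simple 2 × 2 comparison. The sketch below assumes a Pearson chi-square test without continuity correction, which may differ from the test used in the trial report. It also assumes 25 versus 19 sudden deaths, counts that we back-calculated from the quoted 1.6% versus 2.4% and the total of 44. The function name is ours.

```python
import math


def pearson_chi2_p(events_a, n_a, events_b, n_b):
    """Two-sided p-value from Pearson's chi-square test (1 df, no correction)."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper tail equals erfc(sqrt(chi2 / 2))
    return math.erfc(math.sqrt(chi2 / 2))


N_WCD, N_CONTROL = 1_524, 778
# Assumption: counts back-calculated from 1.6% vs. 2.4% and the total of 44
DEATHS_WCD, DEATHS_CONTROL = 25, 19

# Same observed proportions in a trial 1, 2, or 3 times as large
p_by_scale = {
    k: pearson_chi2_p(DEATHS_WCD * k, N_WCD * k, DEATHS_CONTROL * k, N_CONTROL * k)
    for k in (1, 2, 3)
}

# Hypothetical 22 vs. 22 split of the 44 sudden deaths
p_hypothetical = pearson_chi2_p(22, N_WCD, 22, N_CONTROL)

for k, p in p_by_scale.items():
    print(f"n = {k * (N_WCD + N_CONTROL):,}: p = {p:.3f}")
print(f"22 vs. 22 split: p = {p_hypothetical:.3f}")
```

Under these assumptions, the p-values come out close to those quoted in the text: about 0.18 at the actual size, about 0.06 at twice the size, and about 0.02 at 3 times the size or with the hypothetical 22 versus 22 split.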
This is the dilemma we face when undertaking trials of an intervention strategy (9), such as wearing a WCD in the VEST trial. Patient recruitment is much harder than in drug trials (in VEST it took almost 10 years to recruit 2,302 patients), so that definitive evidence of efficacy is much harder to achieve.
A further issue is patients’ compliance with wearing the WCD; this averaged around 18 h/day initially and declined to around 12 h/day by 90 days (including nonusers). Such reduced compliance over time must inevitably compromise the ability to prevent sudden deaths.
Among the pre-defined secondary outcomes (Table 2), the one that really matters is all-cause death, with a 90-day incidence of 3.1% on WCD versus 4.9% on control. This is a 35.5% relative risk reduction with 95% CI: 2.2% to 57.5% reduction; p = 0.04.
It is a natural instinct to now label VEST as a “positive” trial. After all, surely a significant result for all-cause death justifies such a claim? But a more cautious interpretation is warranted. First, the result is statistically fragile: if there had been just 1 fewer death in the control arm, the p value would have exceeded 0.05. Second, all-cause death is not the primary outcome. Third, it seems illogical that the WCD is equally effective in preventing both sudden and nonsudden deaths. Thus, although it is plausible that a WCD really does reduce mortality, the VEST trial’s evidence is not sufficiently convincing by itself.
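The fragility of the all-cause death result can be illustrated by recomputing the p-value after removing a single control-arm death. As a rough sketch, we assume a Pearson chi-square test without continuity correction and death counts of 48 versus 38. These counts are our back-calculation from the quoted 3.1%, 4.9%, and 35.5% relative reduction, so they are an assumption and are not figures stated in the text above.

```python
import math


def pearson_chi2_p(events_a, n_a, events_b, n_b):
    """Two-sided p-value from Pearson's chi-square test (1 df, no correction)."""
    a, b = events_a, n_a - events_a
    c, d = events_b, n_b - events_b
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2))


N_WCD, N_CONTROL = 1_524, 778
# Assumption: death counts back-calculated from the quoted percentages
DEATHS_WCD, DEATHS_CONTROL = 48, 38

p_observed = pearson_chi2_p(DEATHS_WCD, N_WCD, DEATHS_CONTROL, N_CONTROL)
p_one_fewer = pearson_chi2_p(DEATHS_WCD, N_WCD, DEATHS_CONTROL - 1, N_CONTROL)

print(f"Observed: p = {p_observed:.3f}")
print(f"One fewer control death: p = {p_one_fewer:.3f}")
```

With these assumed counts, a single death moves the result from just below to just above the conventional 0.05 threshold.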
The SECURE-PCI Trial
Loading dose of atorvastatin prior to planned percutaneous coronary intervention
The double-blind SECURE-PCI (Statins Evaluation in Coronary Procedures and Revascularization) trial (10) randomized 4,191 patients with ACS who had an angiogram with the intention of planned percutaneous coronary intervention (PCI) to either two 80-mg loading doses of atorvastatin or matching placebo. All patients subsequently received 40 mg atorvastatin daily for 30 days. The primary MACE outcome was a composite of all-cause death, MI, stroke, and unplanned coronary revascularization through 30 days.
Table 3 shows the results for the primary outcome and its components. Although there were numerically fewer primary events in the atorvastatin arm (130 of 2,087 [6.2%] vs. 149 of 2,104 [7.1%] in the placebo arm), this did not achieve statistical significance (HR: 0.88; 95% CI: 0.69 to 1.11; p = 0.27).
Similarly, none of the components of the primary outcome showed a significant treatment effect.
However, this apparently “negative” primary result should not be interpreted as proof that loading doses of atorvastatin have no effect (11). The confidence interval extends out to a 31% risk reduction, and such uncertainty means that a type 2 error (a false negative) is possible. In addition, the fact that patients in both arms received 40 mg atorvastatin for 30 days may have diluted any effect of the loading doses per se.
Among several exploratory subgroup analyses, there was no evidence of statistical interactions except when comparing those who did or did not undergo PCI (interaction p = 0.02) (bottom of Table 3, Figure 2). For those 65% of patients who actually underwent PCI, the HR was 0.72 (95% CI: 0.54 to 0.96; subgroup p = 0.02), whereas for the rest, the apparent treatment effect is in the opposite direction (HR: 1.36; 95% CI: 0.89 to 2.09; subgroup p = 0.15). The latter is a curious finding because it is counter-intuitive (perhaps due to chance): it is hard to imagine how loading doses of atorvastatin could increase CV risk in patients who are not undergoing PCI.
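The interaction p-value can be approximately reconstructed from the two subgroup HRs and their CIs. The sketch below uses a standard normal approximation on the log HR scale and treats the two subgroups as independent. This is our assumption about a reasonable method, not a description of the trial's actual interaction test, and the function names are ours.

```python
import math
from statistics import NormalDist

Z_975 = NormalDist().inv_cdf(0.975)


def log_hr_and_se(hr, ci_low, ci_high):
    """Recover log(HR) and its standard error from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_975)
    return math.log(hr), se


def interaction_p(subgroup_a, subgroup_b):
    """Two-sided p-value for a difference between two independent log HRs."""
    log_a, se_a = log_hr_and_se(*subgroup_a)
    log_b, se_b = log_hr_and_se(*subgroup_b)
    z = (log_a - log_b) / math.sqrt(se_a**2 + se_b**2)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# (HR, lower 95% limit, upper 95% limit) as quoted in the text
pci = (0.72, 0.54, 0.96)
no_pci = (1.36, 0.89, 2.09)

p_interaction = interaction_p(pci, no_pci)
print(f"Approximate interaction p = {p_interaction:.3f}")
```

The result is close to the reported interaction p = 0.02. The calculation also shows that the significance depends as much on the non-PCI subgroup pointing in the opposite direction as on the apparent benefit in the PCI subgroup.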
The subgroup claim that loading doses of atorvastatin substantially reduce the risk of MACE in patients actually undergoing PCI requires a cautious interpretation for the following reasons:
1. In any trial, post hoc emphasis on the most positive of several subgroup analyses tends to lead to an exaggeration of the true treatment effect.
2. The significant interaction (p = 0.02) is reached because the observed effect for the non-PCI subgroup is in the opposite direction. Such qualitative interactions are inherently implausible.
3. The subgroup of patients undergoing PCI is an improper subgroup (12,13) in the sense that it was not known at the time of randomization. Patient factors determining who underwent PCI (or who did not) may have affected the outcome: indeed, a few patients may have had a very early MACE event before PCI could begin.
4. This subgroup finding is of little practical value, because for future patients, the decision whether to give loading doses of atorvastatin needs to be taken before one knows whether the patient will actually undergo PCI. In the trial, 35% of patients did not undergo PCI.
The perils of subgroup analyses are further illustrated when one does separate PCI/no PCI comparisons for ST-segment elevation myocardial infarction (STEMI) and non–ST-segment elevation myocardial infarction (NSTEMI) patients (bottom of Figure 2). There is a significant interaction for STEMI patients (p = 0.04), but not for NSTEMI patients. At face value, this implies that the benefits of loading doses of atorvastatin are primarily confined to STEMI patients undergoing PCI. But, common sense suggests that a finding achieved by data dredging across multiple possible subgroup analyses should not be taken seriously.
The TREAT Study
Ticagrelor versus clopidogrel after fibrinolytic therapy
Worldwide, many STEMI patients receive fibrinolytic therapy rather than primary PCI, and hence, there is a need to evaluate the relative safety and efficacy of different antiplatelet regimens in this context. In the open-label TREAT (Ticagrelor in Patients with ST-Elevation Myocardial Infarction treated with Pharmacological Thrombolysis) trial (14) 3,799 patients were randomized to either ticagrelor (180-mg loading dose, 90 mg twice daily thereafter) or clopidogrel (300- or 600-mg loading dose, 75 mg daily thereafter). Median time from fibrinolysis to randomization was 11.4 h, and 90% of patients were pre-treated with clopidogrel.
The primary outcome was TIMI (Thrombolysis In Myocardial Infarction) major bleeding through 30 days. Secondary safety outcomes used other bleeding criteria. Also, exploratory efficacy outcomes included the composite of CV death, MI, and stroke. Results for the main safety and efficacy outcomes at 30-day follow-up are in Table 4.
The primary outcome of TIMI major bleed occurred in 0.73% and 0.69% of ticagrelor and clopidogrel patients, respectively, a difference of +0.04% (95% CI: −0.49% to +0.58%). There was a pre-defined noninferiority hypothesis with an absolute margin of +1.0%. Because the 95% CI excludes +1.0%, a claim of noninferiority of ticagrelor relative to clopidogrel can be made as regards TIMI major bleed. Similar conclusions can be made regarding the PLATO (Platelet Inhibition and Patient Outcomes) and BARC (Bleeding Academic Research Consortium) major bleeding criteria. However, for all bleeds there was a significant excess on ticagrelor compared with clopidogrel: absolute difference +1.57% (95% CI: +0.24% to +2.90%; p = 0.02). The composite efficacy outcome of CV death, MI, and stroke had a similar incidence in both groups (4.0% ticagrelor, 4.3% clopidogrel; p = 0.57), and both groups had 49 (2.6%) deaths from all causes within 30 days.
With the proviso that ticagrelor had more minor bleeds, a conclusion that ticagrelor appears as good as clopidogrel in these patients seems appropriate. However, a few outstanding issues remain:
1. This is a relatively young low-risk population of STEMI patients (e.g., patients age >75 years were excluded), and so the incidence of major bleeding is low.
2. The practical merit of demonstrating that ticagrelor is noninferior to clopidogrel is open to debate, given the former is more expensive and the latter is more widely established.
3. Was the delayed timing of ticagrelor administration in this trial making the most of its potential? An alternative trial could have explored the relative safety and efficacy of ticagrelor given alongside fibrinolytic therapy.
The POISE Trial
Metoprolol in patients undergoing noncardiac surgery
Whether beta-blockers are of benefit in patients undergoing noncardiac surgery is a controversial and unresolved topic. Previous positive recommendations in both U.S. and European guidelines were shaken when it was found that the integral DECREASE (Dutch Echocardiographic Cardiac Risk Evaluation Applying Stress Echocardiography) studies were fundamentally flawed (15). The latest European Society of Cardiology/European Society of Anaesthesiology guidelines (16) take a more cautious position and conclude that “a high priority needs to be given to new randomized clinical trials to better identify which patients derive benefit from beta-blocker therapy in the perioperative setting, and to determine the optimal method of beta-blockade.”
The POISE (PeriOperative ISchemic Evaluation) trial's previously published findings (17) on the short-term outcomes following extended-release metoprolol alerted everyone to potential harms. The latest findings on 1-year outcome are therefore of considerable interest. The POISE trial randomized 8,351 patients with, or at risk of, atherosclerotic disease who were undergoing noncardiac surgery to receive extended-release metoprolol or placebo, starting 2 to 4 h before surgery and continuing for 30 days. Table 5 presents results for the key outcomes for both 30-day follow-up (previously reported) and 1-year follow-up (new).
The main benefit of metoprolol is a highly significant reduction in the incidence of MI: 63 fewer at 30 days (p = 0.0017), which attenuated slightly to 52 fewer at 1 year (p = 0.008). The absolute 1-year reduction in incidence of MI (metoprolol vs. placebo) is 1.24% (95% CI: 0.26% to 2.23%). In contrast, there is a significant excess of stroke on metoprolol: 22 more at 30 days (p = 0.0053), and 26 more at 1 year (p = 0.0014). The absolute 1-year increase in incidence of stroke is +0.62% (95% CI: 0.07% to 1.18%).
There is also a significant mortality excess on metoprolol: 32 more deaths at 30 days (p = 0.032), which increases to 54 more deaths at 1 year (p = 0.036). This is mainly driven by an excess of non-CV deaths (39 more on metoprolol; p = 0.043), but there are also 15 more CV deaths on metoprolol (p = 0.37). Overall, the absolute 1-year increase in mortality is +1.30% (95% CI: 0.06% to 2.54%).
The overall picture is that metoprolol appears to do more harm than good: the risks of death and stroke outweigh the benefits of fewer MIs and coronary revascularizations. It has been suggested that this problem arose because the chosen dose regimen (based on 100 mg oral extended-release metoprolol initially) was too high.
Whether there exists a more judicious choice of beta-blocker and dose regimen that can still be of benefit to appropriate patients undergoing noncardiac surgery is an open question, which is only answerable by further large-scale randomized trials. As far as we know, no such trials are currently taking place.
The SMART-DATE Trial
6 months versus 12+ months of dual antiplatelet therapy after PCI in ACS patients
There are extensive published data dedicated to determining the optimal duration of dual antiplatelet therapy (DAPT) after drug-eluting stent (DES) implantation (18). In general, a longer DAPT duration is liable to reduce the incidence of ischemic events but is accompanied by an increased risk of bleeding complications.
SMART-DATE (Safety of 6-Month Duration of Dual Antiplatelet Therapy After Percutaneous Coronary Intervention in Patients With Acute Coronary Syndrome) (19) is the most recent trial to tackle this issue. A total of 2,712 patients with ACS undergoing PCI (99% received a DES) were randomized to either 6 months DAPT or 12 months or longer DAPT. The primary MACE endpoint was a composite of all-cause death, MI, or stroke at 18 months after PCI in the intention-to-treat population.
Results for the primary and pre-defined secondary endpoints are shown in Table 6. Let us first focus on the MACE primary endpoint, which occurred in 63 (4.7%) and 56 (4.2%) patients in the 6- and 12-month DAPT groups, respectively. The trial’s primary hypothesis was noninferiority of the former relative to the latter, with a pre-defined margin of 2%. The observed difference is 0.5% with an upper 1-sided 95% CI of 1.8%. Thus, the claim of noninferiority is formally established.
However, several concerns exist regarding the interpretation of such a noninferiority trial (20):
1. It is useful to calculate a 2-sided 95% CI, which in this case is from −1.05% to +2.05%. This corresponds to a 1-sided type 1 error of 2.5% (rather than 5%), as is often done in noninferiority trials. On this basis, the CI includes the 2.0% margin and makes the trial inconclusive regarding noninferiority.
2. A 2% margin is very wide and, with a 4.5% MACE rate being anticipated, is equivalent to a margin of 1.44 on a ratio scale (i.e., ruling out such a 44% excess of events would not really be convincing).
3. Because everyone receives DAPT for the first 6 months, only MACE events occurring after 6 months are really relevant to the treatment comparison. This landmark analysis, which necessarily excludes patients having an event before 6 months, is shown in Figure 3. The MACE rate is now somewhat higher in the 6-month DAPT group (HR: 1.69; 95% CI: 0.97 to 2.94; p = 0.07). The CI is very wide because the number of MACE events between 6 and 18 months (not given) is relatively small. We are now getting close to significant inferiority of the 6-month DAPT arm, and any claim of noninferiority is clearly ruled out.
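The first two concerns rest on simple arithmetic that can be sketched as follows. We assume a normal approximation and back out the standard error from the rounded figures quoted above (a 0.5% difference with a 1-sided 95% upper limit of 1.8%), so the results are approximate. The variable names are ours.

```python
from statistics import NormalDist

# Figures quoted in the text, in percentage points
observed_difference = 0.5    # 6-month minus 12-month DAPT MACE rate
upper_one_sided_95 = 1.8     # upper limit of the 1-sided 95% CI
margin = 2.0                 # pre-defined noninferiority margin
anticipated_rate = 4.5       # anticipated MACE rate

z_one_sided = NormalDist().inv_cdf(0.95)    # about 1.645
z_two_sided = NormalDist().inv_cdf(0.975)   # about 1.960

# Standard error implied by the reported 1-sided upper limit
se = (upper_one_sided_95 - observed_difference) / z_one_sided

# 2-sided 95% CI, equivalent to a 1-sided type 1 error of 2.5%
ci_low = observed_difference - z_two_sided * se
ci_high = observed_difference + z_two_sided * se

noninferior_at_5pct = upper_one_sided_95 < margin
noninferior_at_2_5pct = ci_high < margin

# The absolute margin expressed on a ratio scale
margin_as_ratio = (anticipated_rate + margin) / anticipated_rate

print(f"2-sided 95% CI: {ci_low:+.2f}% to {ci_high:+.2f}%")
print(f"Noninferior at 1-sided 5%: {noninferior_at_5pct}")
print(f"Noninferior at 1-sided 2.5%: {noninferior_at_2_5pct}")
print(f"Margin on ratio scale: {margin_as_ratio:.2f}")
```

The same data therefore meet the 2% margin under a 1-sided 5% criterion but not under the stricter 1-sided 2.5% criterion, because the upper limit moves from 1.8% to about 2.05%.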
Among the secondary endpoints in Table 6, prior trials and meta-analyses suggest it is wise to focus on MI, stent thrombosis, and bleeding events. Here, the 6-month DAPT group has 14 more MIs and 5 more stent thromboses than the 12-month DAPT group. This is counterbalanced by the former having 16 fewer BARC 2 to 5 bleeds and 4 fewer major bleeds. However, this is hard to interpret sensibly because of the third concern in the previous paragraph (i.e., the numbers include events before 6 months when the 2 groups were on identical DAPT treatment).
The meta-analysis by Giustino et al. (18) based on 10 randomized trials and 32,215 patients estimated that a shorter duration significantly increased the risk of stent thrombosis (odds ratio: 1.71; p = 0.001) and MI (odds ratio: 1.39; p < 0.001), reduced the risk of clinically significant bleeding (odds ratio: 0.63; p < 0.001), and may reduce the mortality risk (odds ratio: 0.87; p = 0.07). Put more simply, with approximately 16,000 patients in each group, shorter DAPT duration resulted in 111 more MIs, 63 more stent thromboses, 118 fewer clinically significant bleeds, and 49 fewer deaths. Of course, any meta-analysis has the potential to oversimplify findings from a heterogeneous mix of studies (e.g., the risk of stent thrombosis is lower in second-generation DES), but only through combining evidence across all relevant trials can one reach robust conclusions on this issue.
Although SMART-DATE appears to be a well conducted trial that can usefully contribute to future meta-analyses, it has too few patients and events to reach meaningful conclusions in its own right regarding the trade-off between poorer efficacy and improved safety by reducing the duration of DAPT.
The CVD-REAL 2 Study
Comparative effectiveness of SGLT2 inhibitors for cardiovascular outcomes
The CVD-REAL 2 (Comparative Effectiveness of Cardiovascular Outcomes in New Users of SGLT-2 Inhibitors) study (21) examines whether initiation of sodium-glucose cotransporter-2 inhibitors (SGLT2i) is associated with a lower risk of CV events compared with other glucose-lowering drugs (oGLD) using “real-world” data for over 400,000 patients with type 2 diabetes in 3 world regions. Before describing and interpreting the findings, it is relevant to make some general remarks on the pros and cons of this type of comparative effectiveness study.
The good news is that such studies can give access to very large numbers of patients, because they use data from medical claims, primary care/hospital records, and national registries. This facilitates more precise estimates of associations between treatments and patient outcomes than is possible in randomized trials, which are inevitably of limited size. Second, they reflect the “real-world” practice of medicine unconstrained by the strict eligibility criteria of randomized controlled trials. Thus, they have an aura of greater generalizability than randomized controlled trials.
The bad news is that the so-called “real world” is a “messy place” from the perspective of seeking robust, unbiased treatment comparisons (22-24). The biggest problem is the selection process that determines which treatment gets given to each patient. Without randomization, there is always a risk that patients on any given drug (or class of drug) have an underlying better average prognosis than others. One tries to correct for this by adjusting for known patient characteristics using propensity score methods, but the presence of residual confounding, and hence bias, in the results always remains a real possibility. Another limitation of “real-world” data is that they are collected in a less reliable manner. The recording of both baseline features and patient outcomes is left to the discretion of each practicing physician; for example, for nonfatal events (e.g., MI) and causes of death, robustness of definitions and centralized adjudication are lacking.
So, now to the results of the CVD-REAL (25) and CVD-REAL 2 (21) studies. The first study in 6 countries identified 166,033 eligible new users of SGLT2i and 1,226,221 eligible new users of oGLD. In every country, the former tended to be younger, have less established CVD and less chronic kidney disease, and be less frail. The second study, CVD-REAL 2, included 249,348 SGLT2i and 3,668,203 oGLD patients from a further 6 countries. This time, the former tended to be younger, have less chronic kidney disease, be less frail, be more often taking metformin and statins, and be recruited more recently. To correct for these imbalances and other patient characteristics, propensity-matched pairs of patients were extracted, leading to 235,064 patients for analysis in each group. In CVD-REAL 2, the SGLT2i use was 75% dapagliflozin, 9% empagliflozin, and 12% others.
The main results of CVD-REAL 2 are in Figure 4. Overall, the data show that patients on SGLT2i have a lower risk of all-cause death (HR: 0.51; 95% CI: 0.37 to 0.70), hospitalization for heart failure (HR: 0.64; 95% CI: 0.50 to 0.82), MI (HR: 0.81; 95% CI: 0.74 to 0.88), and stroke (HR: 0.68; 95% CI: 0.55 to 0.84). All 4 associations are highly significant, but the first 3 also show highly significant heterogeneity between countries (e.g., the associations were weaker in South Korea, which contributed >70% of patients).
There are 2 problems here. First, the random-effects meta-analysis used tends to weight countries with fewer patients more than is appropriate (26,27); for example, the biggest apparent heart failure effect in Canada pulls the overall effect in a more positive direction. A fixed-effect meta-analysis would lead to more modest overall treatment differences for death, heart failure, and stroke, because it gives more weight to the largest country, South Korea, which had much smaller treatment differences. Second, one suspects that heterogeneity across countries may be due to differing selection biases rather than genuine geographic variations in treatment effects.
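The first problem can be illustrated with a small sketch that contrasts fixed-effect (inverse-variance) weights with random-effects weights. The 3 “countries” below are entirely hypothetical numbers chosen for illustration and are not CVD-REAL 2 data. We use the DerSimonian-Laird estimator as one common random-effects method; the text does not state which estimator the study used.

```python
import math

# Hypothetical (HR, standard error of log HR) per country, for illustration only
countries = {
    "large country": (0.72, 0.05),
    "small country A": (0.45, 0.20),
    "small country B": (0.40, 0.25),
}

log_hrs = [math.log(hr) for hr, _ in countries.values()]
variances = [se**2 for _, se in countries.values()]

# Fixed-effect: weights are inverse variances
w_fixed = [1 / v for v in variances]
fixed_log_hr = sum(w * y for w, y in zip(w_fixed, log_hrs)) / sum(w_fixed)

# DerSimonian-Laird estimate of the between-country variance tau^2
q = sum(w * (y - fixed_log_hr) ** 2 for w, y in zip(w_fixed, log_hrs))
c = sum(w_fixed) - sum(w**2 for w in w_fixed) / sum(w_fixed)
tau2 = max(0.0, (q - (len(log_hrs) - 1)) / c)

# Random-effects: tau^2 is added to every variance, flattening the weights
w_random = [1 / (v + tau2) for v in variances]
random_log_hr = sum(w * y for w, y in zip(w_random, log_hrs)) / sum(w_random)

fixed_hr = math.exp(fixed_log_hr)
random_hr = math.exp(random_log_hr)
share_fixed = w_fixed[0] / sum(w_fixed)      # weight of the large country
share_random = w_random[0] / sum(w_random)

print(f"Fixed-effect HR:   {fixed_hr:.2f} (large country weight {share_fixed:.0%})")
print(f"Random-effects HR: {random_hr:.2f} (large country weight {share_random:.0%})")
```

In this hypothetical setting, heterogeneity sharply reduces the weight given to the large country, so the random-effects estimate is pulled toward the more extreme results of the small countries.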
These findings must be interpreted in the context of key randomized trials of SGLT2is: EMPA-REG OUTCOME (Empagliflozin Cardiovascular Outcome Event Trial in Type 2 Diabetes Mellitus Patients) (28) and CANVAS (Canagliflozin Cardiovascular Assessment Study) (29). Both showed reductions in heart failure hospitalization compared with placebo, consistent with what CVD-REAL 2 shows. For MI and stroke, the trials showed no significant benefit of SGLT2is, which casts doubt on the causal validity of such associations in CVD-REAL 2. For all-cause death, the overall claim of a 49% reduction in hazard seems liable to be an exaggeration, whereas the 28% reduction in South Korea seems more plausible, given the mortality reductions in EMPA-REG and CANVAS were 32% (p < 0.001) and 13% (p = 0.08), respectively.
Because dapagliflozin contributes most SGLT2i patients to this study, we clearly need to wait for the DECLARE (Dapagliflozin Effect on Cardiovascular Events Trial) (30,31) results before casting a final verdict on the believability of these findings in CVD-REAL 2. As always, such observational studies can only evaluate associations; whether they depict genuine beneficial treatment effects can only be established through evidence from randomized controlled trials.
In the process of reviewing these 7 late-breaking clinical trials, we find some general issues worthy of comment. First, presentations of new evidence at major scientific meetings such as ACC are not peer reviewed. Hence, the presenters and their collaborators have essentially a free rein to present their study findings as they see fit. Given that they have often devoted years of effort to conducting a major trial, there is a natural wish to present their findings “in a good light.” On the whole, presenters of major studies are top-level, highly-respected scientists for whom the quest for truth is paramount. Nevertheless, at the key moment of first presentation of pivotal findings, it is only human nature to allow a degree of “positive spin” to creep in (1). This same temptation may also be felt by study sponsors, whether commercial or public bodies. In the Central Illustration, the last 2 columns summarize how this may have occurred in these 7 specific presentations.
Second, a conference presentation is just the first step in the release and interpretation of study findings. The first peer-reviewed publication in a major medical journal is what matters in the long run. Indeed, 4 of the 7 studies we have reviewed had simultaneous publications. Given the high standards set by journal editors and reviewers and the use of CONSORT guidelines (32), such publications are less prone to “positive spin”: the conclusions usually focus on the pre-defined primary outcome in the whole trial population, with other possible claims on secondary endpoints and subgroup analyses referred to as exploratory findings. Sometimes, this practice seems a little too restrictive, as if any ideas after database lock can have no bearing on what future clinical practice should be. But there is a fine line to be drawn between flexibility (let all the data speak) and selectivity (how can I make my trial more “positive”?).
Of course, journal publications are of finite size and are often written quickly to meet conference deadlines, so they, in turn, do not necessarily provide the “whole truth.” For drug or device trials, a much more detailed regulatory dossier gets presented to the Food and Drug Administration, European Medicines Agency, and other agencies, where the final detailed totality of evidence gets judged.
But the importance and excitement of a first conference presentation are hard to overstate. Thus, we greatly appreciate this opportunity to review some of the key trials of ACC 2018, and we hope that our insights are a stimulus to further discussion on what each of these trials means for future best clinical practice.
Dr. Pocock has served on steering committees or data monitoring committees for trials sponsored by AstraZeneca, Bayer, Boehringer Ingelheim, Boston Scientific, Idorsia, Janssen, Medtronic, Novartis, Novo Nordisk, and Vifor; and has received grant funding from AstraZeneca and Merck. Dr. Collier has served on data monitoring committees for trials sponsored by Daiichi-Sankyo and Zoll.
- Abbreviations and Acronyms
- CHD = coronary heart disease
- DAPT = dual antiplatelet therapy
- MACE = major adverse cardiovascular events
- MI = myocardial infarction
- NSTEMI = non–ST-segment elevation myocardial infarction
- oGLD = other glucose-lowering drugs
- PCI = percutaneous coronary intervention
- SGLT2i = sodium-glucose cotransporter-2 inhibitors
- STEMI = ST-segment elevation myocardial infarction
- WCD = wearable cardioverter-defibrillator
- 2018 American College of Cardiology Foundation
- Boutron I, Altman DG, Hopewell S, Vera-Badillo F, Tannock I, Ravaud P.
- Schwartz GG, Bessac L, Berdan LG, et al. Effect of alirocumab, a monoclonal antibody to PCSK9, on long-term cardiovascular outcomes following acute coronary syndromes: rationale and design of the ODYSSEY Outcomes trial. Am Heart J 2014;168:682–9.e1.
- Food and Drug Administration.
- ClinicalTrials.gov. Vest Prevention of Early Sudden Death Trial and VEST Registry (VEST). October 5, 2011. Available at: http://clinicaltrials.gov/ct/show/NCT01446965?order=1. Accessed March 23, 2018.
- Pocock SJ, Clayton TC, Stone GW.
- Berwanger O, Santucci E, de Barros e Silva P, et al., TREAT Study Group.
- POISE Study Group. Effects of extended-release metoprolol succinate in patients undergoing non-cardiac surgery (POISE trial): a randomised controlled trial. Lancet 2008;371:1839–47.
- Giustino G, Baber U, Sartori S, et al.
- Hahn J-Y, Song YB, Oh J-H, et al., for the SMART-DATE Investigators.
- Macaya F, Ryan N, Salinas P, Pocock SJ.
- Kosiborod M, Lam CSP, Kohsaka S, et al.
- Freemantle N, Marston L, Walters K, Wood J, Reynolds MR, Petersen I.
- Kosiborod M, Cavender MA, Fu AZ, et al.
- Borenstein M, Hedges LV, Higgins JPT, Rothstein HR.
- ClinicalTrials.gov. Multicenter trial to evaluate the effect of dapagliflozin on the incidence of cardiovascular events (DECLARE-TIMI58). November 21, 2012. Available at: http://clinicaltrials.gov/ct/show/NCT01730534?order=1. Accessed March 23, 2018.
- Raz I, Mosenzon O, Bonaca MP, et al.
- Moher D, Hopewell S, Schulz KF, et al.