- Michelle Samuel, MPH,
- Tibor Schuster, PhD,
- Jay S. Kaufman, PhD,
- Robert W. Platt, PhD and
- James M. Brophy, MD, PhD, MEng∗
- ∗Division of Cardiology, McGill University Health Centre, McGill University, 1001 Decarie Boulevard, Room C04.5011, Montreal QC H4A 3J1, Canada
The paper by Elze et al. (1) is an important investigation that compares different propensity score (PS) methods with covariate adjustment for estimating treatment effects in 4 cardiovascular studies. Although the article aimed to give scientists the information needed to choose the most suitable PS approach, it does not draw the distinctions required for a meaningful interpretation of its results.
Although the authors mention it as a limitation, the paper compares marginal and conditional estimates produced by the different PS approaches. Marginal methods estimate the population-average effect, whereas conditional methods estimate the covariate-stratum-specific effect (2). Because of the intrinsic differences between the model structures these methods employ, the resulting hazard ratios (HRs) cannot be compared directly. The only exceptions are linear models and situations in which the true HR is 1. This incompatibility of conditional and marginal effect estimates is known as noncollapsibility and applies to various measures, including HRs and odds ratios (2). For example, in the CHARM (Candesartan in Heart Failure Assessment of Reduction in Mortality and Morbidity) study, marginal models compare the risk of all-cause mortality in beta-blocker users and nonusers overall, whereas conditional models compare mortality risk in users and nonusers with similar covariates (1).
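Noncollapsibility can be illustrated with a small numerical sketch. The example below uses odds ratios rather than HRs because they can be computed in closed form; the two strata, their baseline risks, and the function names are hypothetical and are not taken from CHARM or from Elze et al. It assumes treatment is independent of the covariate (no confounding) and that every stratum shares the same conditional odds ratio.

```python
def odds(p):
    """Convert a risk (probability) to odds."""
    return p / (1.0 - p)


def risk_from_odds(o):
    """Convert odds back to a risk (probability)."""
    return o / (1.0 + o)


def marginal_odds_ratio(conditional_or, baseline_risks, stratum_weights):
    """Marginal OR implied by a common conditional OR across strata.

    Assumes treatment is independent of the stratum variable, so the
    marginal risks are weighted averages of the stratum-specific risks.
    """
    treated_risks = [risk_from_odds(conditional_or * odds(p)) for p in baseline_risks]
    risk_treated = sum(w * p for w, p in zip(stratum_weights, treated_risks))
    risk_control = sum(w * p for w, p in zip(stratum_weights, baseline_risks))
    return odds(risk_treated) / odds(risk_control)


# Hypothetical population: two equally sized strata with untreated risks 0.2 and 0.8.
baseline_risks = [0.2, 0.8]
stratum_weights = [0.5, 0.5]

for conditional_or in (1.0, 3.0):
    marginal = marginal_odds_ratio(conditional_or, baseline_risks, stratum_weights)
    print(f"conditional OR = {conditional_or:.2f}, marginal OR = {marginal:.3f}")
```

Under these assumed numbers, a conditional odds ratio of 3 in both strata corresponds to a marginal odds ratio of about 2.08, even though no confounding is present, whereas a conditional odds ratio of 1 gives a marginal odds ratio of 1. The two estimands differ without either one being biased.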
Further clarification may also be necessary for the authors’ assertion that “doubly robust methods all estimate conditional HR” as some doubly robust methods can produce marginal effect estimates (1,3).
Finally, and possibly most importantly, it is crucial to note that when using real-world data, the true effect remains unknown (4). Definitive statements on the actual performance of different inference methods therefore cannot be made. For instance, in the presence of residual confounding, a larger standard error for a biased effect estimate may be desirable because the resulting confidence interval will be more likely to cover the true underlying effect, despite the bias. Confounding is a causal concept and cannot be adequately assessed with observational data alone (4).
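The point about standard errors can be made concrete with a short calculation. The sketch below assumes a normally distributed estimator with a fixed bias and a standard error that correctly describes its sampling variability; the function names and the numerical values are hypothetical. Under those assumptions, the probability that a 95% confidence interval covers the true effect is Φ(1.96 − bias/SE) − Φ(−1.96 − bias/SE).

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ci_coverage(bias, se, z=1.96):
    """Probability that estimate +/- z*se covers the true effect.

    Assumes estimate ~ Normal(true effect + bias, se**2), so the
    standardized error (estimate - truth) / se has mean bias / se.
    """
    shift = bias / se
    return normal_cdf(z - shift) - normal_cdf(-z - shift)


# Hypothetical bias of 0.2 on the log-HR scale with two different standard errors.
for se in (0.1, 0.2):
    print(f"bias = 0.2, SE = {se}: coverage = {ci_coverage(0.2, se):.3f}")
```

With the same assumed bias, doubling the standard error raises the coverage substantially, which is why a smaller standard error is not necessarily evidence of better performance when the true effect is unknown.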
Please note: The authors have reported that they have no relationships relevant to the contents of this paper to disclose. Deepak Bhatt, MD, MPH, served as Guest Editor-in-Chief for this paper.
- 2017 American College of Cardiology Foundation
- Elze M.C., Gregson J., Baber U., et al.
- Funk M.J., Westreich D., Wiesen C., Sturmer T., Brookhart M.A., Davidian M.
- Pearl J.