Author information
- Received January 15, 2013
- Revision received March 12, 2013
- Accepted March 26, 2013
- Published online June 4, 2013.
- *Department of Internal Medicine, Division of Cardiovascular Medicine, University of Michigan, Ann Arbor, Michigan
- †Department of Thrombosis and Hemostasis and Department of Nephrology, Leiden University Medical Center, Leiden, the Netherlands
- ‡Blue Cross Blue Shield of Michigan, Detroit, Michigan
Objectives: The aim of the study was to develop and validate a tool for predicting risk of contrast-induced nephropathy (CIN) in patients undergoing contemporary percutaneous coronary intervention (PCI).
Background: CIN is a common complication of PCI and is associated with adverse short- and long-term outcomes. Previously described risk scores for predicting CIN either have modest discrimination or include procedural variables and thus cannot be applied for pre-procedural risk stratification.
Methods: Random forest models were developed using 46 pre-procedural clinical and laboratory variables to estimate the risk of CIN in patients undergoing PCI. The 15 most influential variables were selected for inclusion in a reduced model. Model performance estimating risk of CIN and new requirement for dialysis (NRD) was evaluated in an independent validation data set using area under the receiver-operating characteristic curve (AUC), with net reclassification improvement used to compare full and reduced model CIN prediction after grouping in low-, intermediate-, and high-risk categories.
Results: Our study cohort comprised 68,573 PCI procedures performed at 46 hospitals between January 2010 and June 2012 in Michigan, of which 48,001 (70%) were randomly selected for training the models and 20,572 (30%) for validation. The models demonstrated excellent calibration and discrimination for both endpoints (CIN AUC for full model 0.85 and for reduced model 0.84, p for difference <0.01; NRD AUC for both models 0.88, p for difference = 0.82; net reclassification improvement for CIN 2.92%, p = 0.06).
Conclusions: The risk of CIN and NRD among patients undergoing PCI can be reliably calculated using a novel easy-to-use computational tool (https://bmc2.org/calculators/cin). This risk prediction algorithm may prove useful for both bedside clinical decision making and risk adjustment for assessment of quality.
Contrast-induced nephropathy (CIN) is a common complication among patients undergoing invasive cardiac procedures and is associated with increased morbidity, mortality, and healthcare expense (1,2). Multiple strategies have been demonstrated to be successful for prophylaxis of CIN, including adequate hydration, minimization of contrast dose, and the use of iso-osmolar or certain low osmolar contrast media (3–6). Prospectively identifying patients at risk of CIN would be of immense value in targeting prophylactic therapy to those at high risk. Current guidelines recommend use of prophylactic therapy in patients with altered renal function, although this approach has limited sensitivity and specificity (7–10).
For better risk stratification of patients, efforts have been made to develop risk prediction tools or risk scores to identify patients most likely to develop CIN (11,12). The most commonly used risk score was described by Mehran et al. (13) and is based on the presence of 8 factors (hypotension, use of intra-aortic balloon pump, congestive heart failure, chronic kidney disease, diabetes, age >75 years, anemia, and volume of contrast). This model cannot be used for pre-procedure identification of at-risk patients because it incorporates procedural variables for risk prediction. To overcome this limitation, Tsai et al. (14) recently reported a risk score derived from the American College of Cardiology National Cardiovascular Data Registry percutaneous coronary intervention (PCI) registry that only uses pre-procedure variables, but this model has modest discrimination and may not be appropriate for patient-level decision making.
Traditionally, simple risk scores have been favored to facilitate bedside calculations, although this can compromise the accuracy of prediction (15). The widespread use of computers in medical care has opened up the possibility of bedside application of more complex tools that leverage developments in statistical science and facilitate use of algorithms that cannot be easily converted into risk scores (16,17).
The goal of our work was to use such methods to develop a highly accurate model for prediction of CIN using pre-procedural variables that are routinely collected in patients undergoing PCI while retaining the advantages of bedside applicability.
Data for development and validation of a new CIN model were derived from the Blue Cross Blue Shield of Michigan Cardiovascular Consortium (BMC2), a quality improvement collaborative that tracks the inpatient outcome of consecutive patients undergoing PCI at all nonfederal hospitals in the state of Michigan. The details of the BMC2 and its data collection and auditing process have been described previously (18,19). Procedural data on all consecutive patients undergoing PCI at participating hospitals were collected using standardized data collection forms. Baseline data included clinical, demographic, procedural, and angiographic characteristics; medications used before, during, and after the procedure; and in-hospital outcomes. All data elements were prospectively defined, and the protocol was approved by local institutional review boards at each of the participating hospitals. In addition to a random audit of 2% of all cases, medical records of all patients undergoing multiple procedures or coronary artery bypass grafting and patients who died in the hospital were reviewed routinely to ensure data accuracy.
The study population for this analysis included all consecutive patients who underwent PCI between January 1, 2010, and June 30, 2012. Patients who were already on dialysis at the time of the procedure or those with missing serum creatinine levels pre- or post-procedure were excluded from outcome analysis. The type of contrast media and hydration protocols used were as per operator preference guided by institutional policy and practice.
The primary endpoint for our study was CIN, which was defined as impairment in renal function resulting in ≥0.5 mg/dl absolute increase in serum creatinine level from baseline (16). Baseline creatinine level was collected within a month of the procedure. Among patients who had multiple assessments of serum creatinine in the 30 days before the procedure, the value closest to the time of the procedure was considered the baseline value. Peak creatinine level was defined as the highest value of creatinine in the week following the procedure and was ascertained per local clinical practice. Peak creatinine level was collected at least 1 day post-procedure but varied depending on length of stay.
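The endpoint definition above can be expressed as a short rule. The following is a minimal Python sketch, not the registry's actual code: the function names, the `(date, value)` tuple layout, the inclusion of same-day pre-procedure values, and the rounding step are all assumptions made for illustration.

```python
from datetime import date, timedelta

CIN_THRESHOLD_MG_DL = 0.5  # absolute rise in serum creatinine, per the definition above


def baseline_creatinine(pre_values, procedure_date, window_days=30):
    """Return the pre-procedure value closest to the procedure, or None.

    pre_values: list of (date, creatinine in mg/dl) tuples (hypothetical layout).
    """
    earliest = procedure_date - timedelta(days=window_days)
    eligible = [(d, v) for d, v in pre_values if earliest <= d <= procedure_date]
    if not eligible:
        return None
    # The latest eligible date is the one closest to the procedure.
    return max(eligible, key=lambda dv: dv[0])[1]


def peak_creatinine(post_values, procedure_date, window_days=7):
    """Return the highest value measured 1 to window_days days after the procedure."""
    eligible = [v for d, v in post_values
                if 1 <= (d - procedure_date).days <= window_days]
    return max(eligible) if eligible else None


def has_cin(pre_values, post_values, procedure_date):
    """True/False for CIN; None when a creatinine value is missing (patient excluded)."""
    base = baseline_creatinine(pre_values, procedure_date)
    peak = peak_creatinine(post_values, procedure_date)
    if base is None or peak is None:
        return None
    # Rounding (an assumption) guards against floating-point artifacts at the threshold.
    return round(peak - base, 2) >= CIN_THRESHOLD_MG_DL
```

Returning `None` rather than `False` for missing values mirrors the exclusion of patients without pre- or post-procedure creatinine levels from the outcome analysis.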
The secondary endpoint for the study was nephropathy requiring dialysis (NRD), which was defined as new, unplanned need for dialysis during the hospitalization due to worsening of renal function after PCI.
The study cohort was divided randomly into training and validation data sets, with 70% of procedures assigned to training and the remaining 30% used for validation. A random forest regression model was trained for predicting CIN using 46 baseline clinical variables, including pre-procedural medications, with missing predictors imputed as the overall median for continuous variables and the mode for categorical variables. Random forest is an ensemble classification method that determines a consensus prediction for each observation by averaging the results of many individual recursive partitioning tree models. Each individual tree is fitted to a randomly selected subset of the observations and uses a random subset of the available predictors at each node as candidates for splitting. Random forests have been shown to have good predictive value and are generally robust to issues of overfitting, rendering them particularly useful for evaluating a large number of possible predictors and exploiting potential interactions between predictors and their relationship with the outcome (20). To facilitate regression rather than classification modeling, the CIN outcome was entered as a continuous variable coded as 1 in patients developing CIN and 0 in those not meeting the criteria; the estimated means (leaf node probabilities of CIN) assigned to a given observation were then aggregated across the ensemble. To facilitate the development of an easy-to-use bedside tool, a reduced model was also trained using only the 15 most important predictors, as assessed in the full model by the incremental decrease in node impurity (residual sum of squares) associated with splitting on the predictor, averaged over all trees in the ensemble.
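The data preparation steps described above (random 70/30 split, median/mode imputation, and selection of the highest-ranked predictors) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the dictionary-per-row layout, and the seed are hypothetical, and fitting the forest itself is left to a statistical library.

```python
import random
from statistics import median, mode


def split_train_validation(items, train_fraction=0.7, seed=0):
    """Randomly split items into training and validation lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_imputer(rows, continuous, categorical):
    """Learn fill values: median for continuous columns, mode for categorical ones."""
    fill = {}
    for col in continuous:
        fill[col] = median([r[col] for r in rows if r[col] is not None])
    for col in categorical:
        fill[col] = mode([r[col] for r in rows if r[col] is not None])
    return fill


def impute(rows, fill):
    """Return copies of rows with missing (None) values replaced by the fill values."""
    return [{k: (fill[k] if v is None and k in fill else v) for k, v in r.items()}
            for r in rows]


def top_predictors(importance, k=15):
    """Names of the k predictors with the largest importance score."""
    return sorted(importance, key=importance.get, reverse=True)[:k]
```

Here `importance` stands in for the per-variable decrease in node impurity reported by a fitted forest; the reduced model would then be retrained on the columns returned by `top_predictors`.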
The full and reduced models were evaluated in terms of discrimination and predictive power in the validation data set. Random forest estimates for observations in the validation data set were scaled so that the overall predicted CIN rate for the validation sample matched the overall CIN rate observed in the training data set. Overall diagnostic accuracy was estimated using the area under the receiver-operating characteristic curve (AUC) assessed via the trapezoidal method, with standard error estimation using the method of DeLong et al. (21). To evaluate bias or nonconstancy of variance across different levels of predicted risk, patients in the validation data set were grouped by predicted CIN risk level; for each of these groups, the incidence of CIN was calculated and plotted against mean predicted risk. The net reclassification improvement (NRI) statistic was used to compare model performance, with predicted risk values grouped into 3 levels (low, intermediate, and high risk); p values and CIs were obtained through bootstrapping (22,23). The additional utility of the models for discrimination of risk of NRD post-procedurally was evaluated in the validation data set through assessment of AUC.
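For readers unfamiliar with the trapezoidal AUC, the calculation can be sketched as follows. The code is a plain-Python illustration with hypothetical function names; it does not reproduce the DeLong standard error, and the multiplicative form of the rescaling step is an assumption, since the text states only that predictions were scaled to match the training CIN rate.

```python
def roc_points(scores, labels):
    """ROC points (FPR, TPR), sweeping the threshold from high to low.

    labels are 1 (event) or 0 (no event); tied scores share a single point.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda sl: -sl[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        points.append((fp / neg, tp / pos))
        i = j
    return points


def auc_trapezoid(scores, labels):
    """Area under the ROC curve by the trapezoidal rule."""
    pts = roc_points(scores, labels)
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def rescale_to_rate(predictions, target_rate):
    """Multiply predictions by a constant so that their mean equals target_rate."""
    factor = target_rate / (sum(predictions) / len(predictions))
    return [p * factor for p in predictions]
```

A constant multiplicative rescaling preserves the ranking of patients, so it changes calibration but leaves the AUC unchanged.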
The potential application of the tool for calculating predicted risk and guiding the decision to administer prophylactic therapy, such as pre-procedure hydration, was compared with the currently recommended strategy of glomerular filtration rate (GFR)–based selection. For each strategy, the percentage of patients developing CIN who would be captured was plotted against the percentage of all patients who would be selected for prophylaxis over the entire range of potential cutoff values (7,10). Patients who would not routinely be considered for prophylaxis, such as those with ST-segment elevation myocardial infarction, cardiogenic shock, or cardiac arrest, were excluded from this analysis.
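The comparison described above amounts to a capture curve: rank patients by a risk measure, select the top fraction, and count the share of CIN cases that fall within the selection. A minimal Python sketch follows; the function names and the number of steps are hypothetical, and ties in risk are broken arbitrarily by sort order.

```python
def capture_fraction(risk, outcome, selected_fraction):
    """Share of all events found among the top `selected_fraction` of patients by risk.

    risk: higher means higher priority for prophylaxis; outcome: 1 = CIN, 0 = no CIN.
    """
    n_select = round(len(risk) * selected_fraction)
    ranked = sorted(zip(risk, outcome), key=lambda ro: -ro[0])
    captured = sum(o for _, o in ranked[:n_select])
    return captured / sum(outcome)


def capture_curve(risk, outcome, steps=10):
    """(fraction selected, fraction of events captured) over a grid of cutoffs."""
    return [(k / steps, capture_fraction(risk, outcome, k / steps))
            for k in range(steps + 1)]
```

A GFR-based strategy can be evaluated with the same function by passing the negated GFR as the risk measure, so that patients with the lowest GFR are selected first.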
All analysis was performed in R version 2.14.1 using freely distributed statistical packages (24,25).
Our study cohort comprised 68,573 of the 81,218 procedures (84%) performed across Michigan between January 2010 and June 2012. Of the 12,645 procedures (16%) that were excluded from the analysis, 1,897 patients were already on dialysis at the time of the procedure and the remainder had missing serum creatinine values before (n = 1,903) or following the PCI procedure (n = 8,655).
The training data set consisted of 48,001 PCI procedures, of which 1,243 (2.59%) resulted in CIN; the validation data set consisted of 20,572 procedures, of which 505 (2.45%) resulted in CIN. NRD developed in 169 of the patients (0.35%) in the training cohort and 66 of the patients (0.32%) in the validation cohort. All 46 baseline variables presented in Table 1 were included in the full random forest model. The training and validation data sets were similar in terms of baseline covariates (Online Table 1). The 15 variables with the largest model-determined importance are listed in Table 2, and Table 3 provides their distribution in training data set patients both with and without CIN. All but one of these variables, patient weight, were univariately associated with CIN. This set of predictors was used to fit the reduced random forest model that is available for use at https://bmc2.org/calculators/cin.
When evaluated in the validation dataset, both models provided good discrimination for CIN, with the full model having a small but statistically significant advantage in AUC (full model 0.852, 95% CI: 0.835 to 0.869; reduced model 0.839, 95% CI: 0.821 to 0.857; p for difference = 0.001). Similarly, both models had good discrimination for NRD, but the full and reduced models performed equally well in terms of AUC for this outcome (full model 0.875, 95% CI: 0.819 to 0.931; reduced model 0.875, 95% CI: 0.823 to 0.931; p for difference = 0.815). Both models demonstrated excellent calibration (Fig. 1) with good concordance between observed and predicted risk of CIN.
The full and reduced model predictions were also grouped into low-risk (<1%), intermediate-risk (1% to 7%), and high-risk (>7%) categories, and the number of patients along with the observed CIN rate in each group is presented in Table 4. The NRI statistic for the full model relative to the reduced model for these categories was not statistically significant (NRI 2.92%; p = 0.062; 95% CI: −0.14% to 6.03%).
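The categorical NRI reported above can be illustrated with the following Python sketch, which uses the stated cutoffs of 1% and 7%. The function names are hypothetical, the handling of values exactly at a cutoff is an assumption, and the bootstrap used for the p value and CI is not shown.

```python
def risk_category(p):
    """0 = low (<1%), 1 = intermediate (1% to 7%), 2 = high (>7%)."""
    if p < 0.01:
        return 0
    if p <= 0.07:
        return 1
    return 2


def net_reclassification_improvement(old_risk, new_risk, outcome):
    """Categorical NRI of the new model relative to the old one.

    Events gain credit for moving up a category and nonevents for moving down:
    NRI = [P(up|event) - P(down|event)] + [P(down|nonevent) - P(up|nonevent)].
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_events = sum(outcome)
    n_nonevents = len(outcome) - n_events
    for old, new, y in zip(old_risk, new_risk, outcome):
        move = risk_category(new) - risk_category(old)
        if y:
            up_event += move > 0
            down_event += move < 0
        else:
            up_nonevent += move > 0
            down_nonevent += move < 0
    return ((up_event - down_event) / n_events
            + (down_nonevent - up_nonevent) / n_nonevents)
```

In the comparison reported here, the reduced model would play the role of `old_risk` and the full model that of `new_risk`.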
Figure 2 compares the performance of the full and reduced models with a GFR-based strategy for selection of patients for prophylactic therapy and demonstrates the advantage of using the full or the reduced model over the contemporary practice of using a GFR-based threshold. If a decision were made to select the 10% of patients at highest risk, this cohort would contain 52% of the patients who would develop CIN using the full model, 50% using the reduced model, and 41% using a GFR-based strategy. Using the currently recommended cutoff of GFR <60 ml/min/1.73 m², and after excluding the high-risk patients, 23.5% of the population would be selected for hydration; this would capture 62% of the patients who would develop CIN. Assuming the same number of patients were to be selected for pre-treatment, this would correspond to a predicted risk of >2.1% using the full model and >2.0% using the reduced model and would capture 75% and 72% of the patients who develop CIN, respectively.
The key finding of our study was that standard clinical and laboratory variables that are routinely collected in patients undergoing PCI can be easily and reliably used to predict the risk of CIN and NRD using a novel computational tool. The robust discrimination and calibration of this method, combined with the ease of use for simplified bedside prediction, make this model an easy tool to apply clinically.
Use of risk stratification has been advocated for both patient-level decision making (for guiding informed consent and therapeutic decision making) and risk adjustment for assessment of quality of care. Our work has several key advantages that make it particularly suitable for these purposes. First and foremost, our model has a much higher discrimination than has been traditionally reported for models or risk scores that have been developed to predict CIN. As an example, the Mehran model has a C-statistic of 0.67, whereas the National Cardiovascular Data Registry acute kidney injury (AKI) model has a C-statistic of 0.72 (13,14). Second, our model should be generalizable to routine clinical practice because it was developed and validated on all consecutive patients treated in Michigan and reflects contemporary practice across multiple institutions and operators. Third, the model was based only on pre-procedure variables and thus can be used for risk stratification before the procedure so that alternate therapeutic strategies could be explored and prophylactic strategies such as hydration or novel therapies that are currently being investigated could potentially be applied for patients who are deemed to be at high risk of complications. Finally, the model also provides highly accurate prediction for the risk of NRD, a complication that both physicians and patients are most keen to avoid.
Our model differs from traditional risk scores in that it requires a computer for calculation rather than being something that can be estimated as a bedside arithmetic risk score. Although simple scores have been favored in the past, this approach may sacrifice accuracy and has the potential to mislead rather than truly inform clinical practice (15). Further, the widespread use of electronic medical records and bedside use of computers in medicine have made it possible to embed complex algorithms into clinical workflow, and the widespread use of smart devices makes it practical to use these models at the bedside.
We developed 2 different models, with the full model providing a slightly greater discrimination, but this did not translate into a significant improvement in NRI. This would suggest that the abbreviated model developed for bedside use is almost as good as the full model for clinical decision making, although the full model would be preferable for use if it could be achieved without undue burden on clinicians. We envision that initially the simpler model would be used for routine clinical decision making, whereas the detailed model would be used for quality assessment and benchmarking across institutions and operators. We prefer the use of the full model for these purposes because the presence of minor risk factors such as atrial fibrillation or PCI after recent surgery could vary considerably across institutions and not accounting for the variation in these factors could result in misclassification of observed to expected ratios at an institutional level. It is technically easy to embed this full model into electronic medical records and specifically into the templates that are used for documentation of the initial history and physical assessment of a patient being evaluated for PCI. Having such availability would help patients and physicians make better informed decisions but would require greater input from the vendors of electronic medical record systems.
The clinical utility of risk scores beyond research and patient consent remains unexplored. We have suggested one possible use, whereby catheterization laboratories can determine policies targeted to specific patient populations based on their pre-procedure risk and therefore optimize patient safety without an increase in space and personnel utilization. It is, however, likely that as clinicians use this tool in practice, other uses will emerge that will lead to further optimization of patient care, as well as modification and refinement of the prediction tool.
Like most observational studies, our study findings must be evaluated with certain caveats. We have used the term “CIN,” although the role of contrast media in all patients who develop AKI after PCI remains debatable. It is likely that AKI after PCI is multifactorial, and it may be preferable to use terms that do not assume that all renal dysfunction after PCI is secondary to contrast media. However, regardless of the term used, our model performed well and helps identify patients prone to develop renal complications after PCI. Our risk model was derived from a population in which serum creatinine ascertainment was not standardized but was clinically driven. Only 1 post-procedure creatinine value was available, and no follow-up beyond the initial hospitalization was performed. No data on the type and amount of hydration used were available, and this likely varied across institutions. However, we believe this makes our model more generalizable to routine clinical care because it reflects findings from contemporary practice across multiple institutions.
We have developed a simple tool for accurately predicting risk of CIN and NRD among patients undergoing PCI. This risk prediction algorithm may prove useful for both bedside clinical decision making and risk adjustment for assessment of quality.
The authors are grateful to Mr. Ryan Hughes for developing the web application of this model.
For a supplemental table, please see the online version of this article.
The BMC2 registry is funded by Blue Cross Blue Shield of Michigan. The sponsor had no role in study design or review or the decision to submit the work for publication. Dr. Gurm receives research funding from Blue Cross Blue Shield of Michigan and the National Institutes of Health and Agency for Healthcare Research & Quality. Dr. Share is employed by Blue Cross Blue Shield of Michigan. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.
Abbreviations and Acronyms
- AKI = acute kidney injury
- CIN = contrast-induced nephropathy
- GFR = glomerular filtration rate
- NRD = nephropathy requiring dialysis
- PCI = percutaneous coronary intervention
- STEMI = ST-segment elevation myocardial infarction
- American College of Cardiology Foundation
- Rihal C., Textor S., Grill D., et al.
- Gurm H.S., Dixon S.R., Smith D.E., et al.
- Reed M., Meier P., Tamhane U.U., Welch K.B., Moscucci M., Gurm H.S.
- Reed M.C., Moscucci M., Smith D.E., et al.
- Fliser D., Laville M., Covic A., et al.
- Levine G.N., Bates E.R., Blankenship J.C., et al.
- Mehran R., Aymong E.D., Nikolsky E., et al.
- Tsai T., Patel U., Chang T., et al.
- Chia C.C., Rubinfeld I., Scirica B.M., McMillan S., Gurm H.S., Syed Z.
- Pencina M.J., D’Agostino R.B.
- Moscucci M., Rogers E.K., Montoye C., et al.
- Gurm H.S., Smith D.E., Collins J.S., et al.
- Liaw A., Wiener M.