## Journal of the American College of Cardiology

# Risk Stratification of In-Hospital Mortality for Coronary Artery Bypass Graft Surgery

- Received August 25, 2005
- Revision received October 13, 2005
- Accepted October 18, 2005
- Published online February 7, 2006.

## Author Information

- Edward L. Hannan, PhD, FACC⁎,
- Chuntao Wu, MD, PhD⁎,
- Edward V. Bennett, MD†,
- Russell E. Carlson, MD‡,
- Alfred T. Culliford, MD§,
- Jeffrey P. Gold, MD, FACC∥,
- Robert S.D. Higgins, MD¶,
- O. Wayne Isom, MD, FACC#,
- Craig R. Smith, MD⁎⁎, and
- Robert H. Jones, MD, FACC††,⁎ (jones060{at}mc.duke.edu)

- ⁎ **Reprint requests and correspondence:** Dr. Robert H. Jones, Duke Clinical Research Institute, P.O. Box 17969, Durham, North Carolina 27715.

Risk scores for in-hospital mortality for coronary artery bypass graft surgery have been used to assess patients’ operative risk. However, none has been developed using data from a population-based region in the U.S. in many years. Data from New York’s Cardiac Surgery Reporting System in 2002 were used to develop a risk index with 10 risk factors. The fit of the index was tested by applying it to another year (2003) and testing the correspondence of expected and observed mortality rates for each risk score in the index. The risk index appears to be a valuable tool for predicting patient risk.

## Abstract

**Objectives** The purpose of this research was to develop a risk index for in-hospital mortality for coronary artery bypass graft (CABG) surgery.

**Background** Risk indexes for CABG surgery are used to assess patients’ operative risk as well as to profile hospitals and surgeons. None has been developed using data from a population-based region in the U.S. for many years.

**Methods** Data from New York’s Cardiac Surgery Reporting System in 2002 were used to develop a statistical model that predicts mortality and to create a risk index based on a relatively small number of patient risk factors. The fit of the index was tested by applying it to another year (2003) of New York data and testing the correspondence of expected and observed mortality rates for each risk score in the index.

**Results** The risk index contains a total of 10 risk factors (age, female gender, hemodynamic state, ejection fraction, pre-procedural myocardial infarction, chronic obstructive pulmonary disease, calcified ascending aorta, peripheral arterial disease, renal failure, and previous open heart operations). The score possible for each variable ranges from 0 to 5, and total risk scores possible range from 0 to 34. The highest score observed for any patient was 22, and 93% of the patients had scores of 8 or lower. When the risk index was applied to another year of New York data with a considerably lower mortality rate, the C-statistic was 0.782.

**Conclusions** The risk index appears to be a valuable tool for predicting patient risk when applied to another year of New York data. It should now be tested against other risk indexes in a variety of geographical regions.

Numerous studies have been conducted to develop “risk scores” or identify risk factors for patients undergoing coronary artery bypass graft (CABG) or cardiac surgery in general (1–11). There are many reasons why risk scores are of interest. First, a patient’s predicted risk is of interest to surgeons because this is one of the factors used in determining whether CABG surgery is the appropriate intervention, and because surgeons need to know which patients should be carefully managed and monitored as a function of their predicted chances of adverse outcomes. Second, the patient has a right to know what risk he/she will be taking in consenting to undergo surgery, and for some patients this risk may not be acceptable depending on factors such as the patient’s age and degree of aversion to risk. Third, risk scores are of interest to hospitals, surgeons, and quality assurance/quality assessment experts in that they provide for a comparison of outcomes among providers (hospitals, surgeons) after adjusting for risk and provide an opportunity to assess changes in risk-adjusted outcomes for a single provider across time.

Although several risk scores have been proposed, the most frequently referenced one (EuroSCORE) is based on European data that may not be reflective of patients and outcomes in the U.S. This is particularly true given that the number of CABG procedures per capita is much higher in the U.S. than in Europe or anywhere else. Also, the most commonly referenced risk indexes from the U.S., by Higgins et al. (1) and Parsonnet et al. (10), are based on data that are very old and predate many advances in the state of the art of cardiac surgery.

The purpose of this study is to develop a risk stratification system based on a well-established large population-based cardiac surgery registry that has been in operation for more than 15 years. This system, the New York State Cardiac Surgery Reporting System (CSRS), contains outcomes and numerous risk factors for all patients undergoing cardiac surgery in non-federal hospitals. The system has been used for many reports and manuscripts that compare outcomes among hospitals and surgeons, track outcomes over time, and compare long-term outcomes for different technologies and types of procedures.

## Methods

### Data

The data used for the study were taken from New York’s CSRS, which was created in late 1988 for the purpose of improving the quality of cardiac care in the state as well as to inform hospitals, surgeons, and the public of patient outcomes and risk factors. Data in the system include patient demographics (age, gender, race, and so on), numerous patient risk factors and comorbidities that have been demonstrated to be related to short-term and long-term outcomes, patient disposition, complications of care, and hospital and surgeon identifiers. The dependent variable (outcome of interest) was in-hospital mortality, arguably the most important outcome measure and the one used by most of the earlier risk stratification systems. It is important to note that these data are audited annually for completeness of cases and determination of discharge disposition by comparing them to New York’s administrative acute care database. Also, the accuracy of risk factors in the system is checked by using New York’s utilization review agent to audit samples of cases from selected hospitals each year.

Patients used to develop the risk score were all 16,120 patients who underwent isolated CABG surgery (no other major cardiac procedures, such as valve surgery) in New York in 2002. The risk score was validated using 2003 isolated CABG surgery data from New York. This study is limited to patients who underwent CABG surgery rather than including other cardiac surgery patients because we have found in earlier studies and reports that the relative impact of various patient risk factors is not proportional in CABG and non-CABG cardiac surgery (12).

### Analysis plan

A total of more than 40 patient risk factors were available in the CSRS for predicting in-hospital mortality. First, the bivariate relationship between each of these independent variables and mortality was examined using chi-square tests after subdividing the two continuous risk factors (age and ejection fraction [EF]) into categories based on similar mortality rates within categories and dissimilar mortality rates between categories. All risk factors with p values <0.05 were then considered in a stepwise logistic regression model with in-hospital mortality as the binary dependent variable (a backward elimination approach yielded the same set of variables). The model was cross-validated by splitting the data into two patient subgroups with nearly identical mortality rates and prevalences for important risk factors, fitting a stepwise model to one-half to identify significant variables, and then using those variables with p < 0.10 in a stepwise model on the second half of the data. The remaining significant variables (with p < 0.05) were then used on the entire data set. This process identified the set of all patient risk factors that were independent predictors of mortality along with their logistic regression coefficients and their odds ratios (OR) with 95% confidence intervals (CI) and p values (12).

The fit of the logistic regression model was measured in terms of its discrimination and calibration. Discrimination, which is measured using the C-statistic (area under the receiver-operating characteristic curve), captures the model’s ability to distinguish between patients who die in the hospital and patients who are discharged alive. It is defined as the proportion of the time that a patient who survives is assigned a higher probability of survival than a patient who dies in the hospital. A value of 1.0 is perfect, and a value of 0.5 denotes only random ability to distinguish between deaths and survivors (13).
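The pairwise definition of the C-statistic given above can be sketched in a few lines of Python. This is an illustrative sketch only, not the SAS procedure used in the study: the function name `c_statistic` and the six example patients are hypothetical, and ties are counted as one-half by assumption. It works with predicted probabilities of death, which is equivalent to comparing predicted probabilities of survival.

```python
from itertools import product


def c_statistic(pred_death, died):
    """Share of (death, survivor) pairs in which the patient who died had the
    higher predicted probability of death; tied pairs count as 0.5."""
    deaths = [p for p, d in zip(pred_death, died) if d]
    survivors = [p for p, d in zip(pred_death, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p_death, p_survivor in product(deaths, survivors):
        if p_death > p_survivor:
            concordant += 1.0
        elif p_death == p_survivor:
            concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


# Hypothetical predictions for six patients (not CSRS data).
example_pred = [0.01, 0.02, 0.05, 0.10, 0.30, 0.60]
example_died = [0, 0, 1, 0, 0, 1]
example_c = c_statistic(example_pred, example_died)  # 6 of 8 pairs concordant
```

The double loop is quadratic in the number of patients, which is fine for illustration; a registry-sized data set would normally use a rank-based computation instead.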

Calibration, which is a measure of the model’s ability to predict survival for various levels of patient risk, is the other measure of model fit that was investigated. Calibration was assessed using the Hosmer-Lemeshow statistic (14), which is a variation of the chi-square statistic. Observed and predicted outcomes are compared for 10 equally populated levels of patient risk (defined by the predicted probabilities of survival obtained from the logistic regression model). It should be noted that there are problems related to the arbitrary manner in which patients are subdivided into groups, and that different conventions can lead to different conclusions regarding calibration adequacy.
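As a rough illustration of the grouping step described above, the following Python sketch computes a Hosmer-Lemeshow-type statistic from predicted probabilities of death. It is an assumption-laden sketch rather than the study's SAS analysis: the function name `hosmer_lemeshow`, the way ties and group boundaries are handled, and the toy data are all hypothetical. The resulting statistic is conventionally compared with a chi-square distribution with (number of groups minus 2) degrees of freedom; the p value is not computed here.

```python
def hosmer_lemeshow(pred_death, died, groups=10):
    """Sum over risk groups of (observed - expected)^2 / (n * p * (1 - p)),
    where p is the mean predicted probability of death in the group."""
    pairs = sorted(zip(pred_death, died))  # order patients by predicted risk
    n = len(pairs)
    statistic = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)
        observed = sum(d for _, d in chunk)
        mean_p = expected / size
        variance = size * mean_p * (1.0 - mean_p)
        if variance > 0:
            statistic += (observed - expected) ** 2 / variance
    return statistic


# Hypothetical toy data: two risk levels, five patients each.
toy_pred = [0.2] * 5 + [0.8] * 5
well_calibrated = hosmer_lemeshow(toy_pred, [1, 0, 0, 0, 0, 1, 1, 1, 1, 0], groups=2)
miscalibrated = hosmer_lemeshow(toy_pred, [1, 1, 1, 0, 0, 1, 1, 1, 1, 0], groups=2)
```

Changing `groups` or the tie-breaking rule can change the statistic, which is the arbitrariness the paragraph above warns about.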

All but 1 of the 10 total risk factors included in the logistic regression model were either binary or categorical with more than one category. Age was the single continuous variable. The process of converting the user-unfriendly logistic regression output into a risk index follows the procedure developed and described by Sullivan et al. (15). The process first consisted of splitting age into four groups (60 and under, 61 to 69, 70 to 79, and 80+) based on the constant risk under age 60 years and the linearity of the spline function after age 60 years. The youngest group was set as the base category, and each other age group was represented by its mid-point, or reference value, with the last group represented by the mid-point of 80 and the 99th percentile. All other variables except EF and previous myocardial infarction (MI) were binary, and their base categories were chosen as the absence of the characteristic. For EF, spline functions confirmed that the risk of mortality was essentially constant for patients with EFs above 40%, and linear for patients with EFs below 40%. This led to the choice of three categories, with a base category of 40% and over. The base category for previous MI was chosen to be “no previous MI, or MI more than 20 days prior to surgery,” which was the reference category in the logistic regression model. Definitions of the risk factors in the model are listed in the Appendix.

The constant corresponding to one point in the risk score was obtained by multiplying one-half the length of each age range (five years) by the age coefficient (5 × 0.0741 = 0.3705). For all other risk factors, each of which was represented by one or more categories in the logistic regression model, the coefficient of the categorical variable was divided by 0.3705 and then rounded off to the nearest integer. For example, the risk factor “previous open heart operations” has a coefficient of 1.1671, and 1.1671/0.3705 = 3.15, which rounds to 3. The total risk score for a patient is then the sum of the risk scores for each of the risk factors the patient has. Total possible risk scores ranged from 0 to 34.
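The conversion from a regression coefficient to integer points can be checked with a short Python sketch. The two numeric inputs (0.0741 and 1.1671) come from the text; the names `POINT_CONSTANT` and `points_for` are hypothetical, and round-half-up is an assumed tie rule that does not affect the example shown.

```python
import math

AGE_COEFFICIENT = 0.0741  # per year of age, as quoted in the text
YEARS_PER_POINT = 5       # one-half the length of a 10-year age range
POINT_CONSTANT = YEARS_PER_POINT * AGE_COEFFICIENT  # 0.3705 per point


def points_for(coefficient):
    """Divide a logistic regression coefficient by the one-point constant
    and round to the nearest integer (halves round up, by assumption)."""
    return math.floor(coefficient / POINT_CONSTANT + 0.5)


# "Previous open heart operations" has a coefficient of 1.1671 in the text.
previous_open_heart_points = points_for(1.1671)  # 1.1671 / 0.3705 is about 3.15
```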

A predicted mortality rate (PMR) for each risk score was obtained by using the risk score in a logistic regression formula for predicted value of death using methods described by Sullivan et al. (15). These resulting PMRs for each risk score are what can be used to inform surgeons and patients about each patient’s surgical risk.
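A minimal Python sketch of how a total score maps to a predicted probability is given below. It assumes a logistic form in which each point adds 0.3705 to the log-odds, and it back-solves the intercept from the 0.30% mortality that the Results report for a score of 0; that back-solved intercept is an assumption made here for illustration, not a coefficient taken from the published model, so the output only approximates Table 3.

```python
import math

POINT_CONSTANT = 0.3705  # log-odds added per risk point (5 x 0.0741)
BASELINE_RISK = 0.0030   # reported predicted mortality for a score of 0


def predicted_mortality(score, baseline=BASELINE_RISK, per_point=POINT_CONSTANT):
    """Logistic prediction: baseline log-odds plus per_point for each point."""
    log_odds = math.log(baseline / (1.0 - baseline)) + per_point * score
    return 1.0 / (1.0 + math.exp(-log_odds))


# Approximate predicted mortality for scores 0 through 12.
approximate_table = {score: predicted_mortality(score) for score in range(13)}
```

Because the 0.30% baseline is itself rounded, values from this sketch should be expected to land near, not exactly on, the published rates (for example, close to the 7.70% quoted in the Discussion for a score of 9).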

The accuracy of the risk index was evaluated by comparing predicted mortality values for each risk score with observed values from New York State in the following year (2003). The fit of the model used to develop the risk score was also assessed using 2003 New York data by calculating the C-statistic and the Hosmer-Lemeshow statistic. Because the mortality rate in 2003 was considerably lower than in 2002 (1.61% vs. 2.27%), and the predicted rate for 2003 based on the 2002 model was significantly higher than the 2003 observed rate, the 2002 model was then recalibrated, and the fit of the risk index on 2003 data was then reassessed. The recalibration process consists of calculating a new 2003 mortality rate for each risk score by multiplying the old 2002 rate by the ratio of the overall observed mortality rate (OMR) in 2003 divided by the overall rate predicted by the 2002 model when it is applied to 2003 data:

$$(MR)_{i,2003} = (MR)_{i,2002} \times \frac{(OMR)_{2003}}{(EMR)_{2003}}$$

where $(MR)_{i,2003}$ = the mortality rate in 2003 associated with risk score $i$; $(MR)_{i,2002}$ = the mortality rate in 2002 associated with risk score $i$; $(OMR)_{2003}$ = the overall OMR in 2003; $(EMR)_{2003}$ = the overall expected mortality rate in 2003 based on the 2002 model.
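The recalibration step amounts to rescaling every score-specific rate by a single ratio, as the following Python sketch shows. The 1.61% observed rate for 2003 and the two 2002 rates (0.30% and 7.70%) are quoted in the article; the expected rate of 2.30% is a hypothetical placeholder, because the text does not report the value predicted by the 2002 model for 2003 patients. The function name `recalibrate` is likewise an illustration.

```python
def recalibrate(rates_by_score, observed_overall, expected_overall):
    """Multiply each score-specific mortality rate by the ratio of the overall
    observed rate to the overall expected rate in the new population."""
    factor = observed_overall / expected_overall
    return {score: rate * factor for score, rate in rates_by_score.items()}


rates_2002 = {0: 0.0030, 9: 0.0770}  # 2002 rates quoted in the article
OMR_2003 = 0.0161                    # observed 2003 mortality, from the text
EMR_2003 = 0.0230                    # hypothetical expected rate, not from the text
rates_2003 = recalibrate(rates_2002, OMR_2003, EMR_2003)
```

A single multiplicative factor preserves the ranking of patients, so discrimination (the C-statistic) is unchanged; only calibration shifts.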

A final set of analyses was aimed at investigating the correspondence between the CABG mortality risk index and two other adverse outcome measures—complications and length of stay. Complications were defined as one or more of the complications available in CSRS, which included stroke, transmural MI, deep sternal wound infection, bleeding requiring reoperation, sepsis/endocarditis, gastrointestinal bleeding/perforation/infarction, renal failure, respiratory failure, and unplanned cardiac operation/interventional procedure.

All statistical analyses except the hierarchical logistic regression analyses were conducted in SAS version 9.1 (SAS Institute, Cary, North Carolina).

## Results

There were a total of 16,120 patients in the study, and their overall in-hospital mortality rate was 2.27%. Table 1 contains the significant independent risk factors for in-hospital mortality along with their logistic regression coefficients, OR with 95% CI, and p values. As indicated, there are 10 significant risk factors, with one (age) a continuous risk factor and the others consisting of discrete categories. For example, previous MI has three categories in addition to the reference category (<6 h, 6 to 23 h, 1 to 20 days). The risk factors with the highest ORs were previous MI <6 h (OR = 7.22, 95% CI 3.81 to 13.67), shock (OR = 5.85, 95% CI 3.05 to 11.24), and renal failure requiring dialysis (OR = 5.58, 95% CI 3.62 to 8.61). The logistic regression model had a very good C-statistic (0.823) and an acceptable Hosmer-Lemeshow statistic (p = 0.47).

Table 2 presents the score associated with each of the risk factors in the logistic regression model in Table 1. The non-zero individual risk factor scores range from 1 for patients aged 61 to 69 years old to 5 for patients who are at least 80 years old, are in shock, have had a previous MI <6 h before surgery, or have renal failure requiring dialysis. Patients with none of the conditions mentioned in Table 2 have a total score of 0. The highest score observed in our database was 22.

Table 3 contains the predicted probabilities of death for each risk score and the cumulative percentage of patients with each score or a lower score. The probabilities of mortality range from 0.30% for patients with a score of 0 to >90% for scores of 22 and higher. The mean mortality rate for all patients used to develop the risk score was 2.27%. A total of 74% of the patients in Table 3 had a risk score with a PMR lower than the mean, and these patients had risk scores of 5 or less. Slightly more than 50% of the patients had risk scores of 3 or less, and less than 1% of the patients had risk scores of 12 or higher. Therefore, data are combined for risk scores of 12 or higher.

Figure 1 demonstrates the correspondence between observed and predicted rates for each risk score where observed and predicted rates were obtained from 2002 data. As the figure demonstrates, the observed and predicted values are quite close together, particularly for the lower risk scores. For every risk score except the 12+ group, the predicted risk mortality is within the 95% CI for the observed mortality.

Figure 2 contrasts the observed rates for each risk score in the year 2003 with the predicted values based on the 2002 risk model after recalibrating the 2002 risk score probabilities to reflect the differences in performance between 2002 and 2003. The predicted values and observed values again demonstrate a reasonably good correspondence.

Figures 3 and 4 demonstrate that there are very strong correspondences between higher CABG risk scores and the prevalence of higher complication rates and longer lengths of stay. Complication rates rose monotonically from a low of 4.3% for a risk score of 0 to a high of 35.2% for a risk score of 12 or higher. Mean length of stay rose monotonically from a low of 5.3 days for a risk score of 0 to 14.0 days for a risk score of 12 or higher.

## Discussion

The purpose of this study was to develop a risk index for CABG surgery based on New York State CABG surgery data from 2002. The database used in this study has the advantage of being a large database that includes all New York patients who underwent CABG surgery in non-federal hospitals. Also, the accuracy of the database is maintained through auditing of medical records, and the completeness is assured through reconciliation with New York’s acute care hospital database. Although EuroSCORE is a well-established risk index based on European data, there are many differences between countries, and particularly continents, in utilization rates and types of patients undergoing CABG surgery. Consequently, it is of interest to develop a risk index based on U.S. data and at least compare it to EuroSCORE in other settings in the U.S. Although Higgins et al. (1) and Parsonnet et al. (10) have developed risk indexes based on U.S. data, both of these studies are more than 10 years old, and CABG surgery patients and outcomes have changed considerably in the interim.

Results of our study show that the risk index based on New York 2002 data contains a total of 10 risk factors and possible scores for each factor that range from 0 to 5 with a possible total risk score for an individual patient ranging from 0 to 34. However, in New York in 2002, the highest observed score for any patient was 22. A total of 93% of the patients had scores of 8 or lower. When the statistical model used to create the risk index was applied to another year of New York data with a considerably lower mortality rate, the discrimination of the model was very high (C = 0.782). However, because the crude mortality had dropped substantially and the mean patient risk was essentially the same, there was a need to recalibrate the model so that the overall population-predicted mortality was identical to the observed mortality. After recalibration, there were no statistically significant differences between any of the observed and expected mortality deciles. Thus, the risk index works well when applied to other time frames in New York.

The risk factors in the New York CABG surgery risk score are very similar to the ones used in a percutaneous coronary intervention (PCI) risk score based on New York data (16). Seven of the 10 variables (age, female gender, hemodynamic state, EF, pre-procedural MI, peripheral arterial disease, and renal failure) in the CABG risk score are contained in both scores. The CABG risk score also includes chronic obstructive pulmonary disease, extensively calcified ascending aorta, and previous open heart operations, and the PCI risk score also includes congestive heart failure and left main disease.

The two most important uses for predictions of treatment risks derived from large prospective, well-defined patient populations undergoing a specific therapy are to inform patient providers about the relative short- and long-term risks and benefits of alternative treatment strategies and to retrospectively assess quality of care by benchmarking outcome among different populations after correction for difference in baseline characteristics reflecting severity of illness. Statistical models (usually logistic regression models) previously used to predict an individual’s chance of surviving CABG surgery use a computer or calculator to estimate probabilities and consequently are not used frequently. The risk index reported facilitates quick estimation of CABG mortality obtained by adding numbers describing the prognostic weight of 10 variables and consulting a table that assigns a predicted mortality to each risk score value. For example, a 75-year-old woman with an EF of 35%, no previous MI, peripheral arterial disease, but no other risk factors has a risk score of 3 + 2 + 2 + 2 = 9 (Table 2). The predicted risk of in-hospital mortality for this patient is 7.70% (Table 3).
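The bedside arithmetic in the example above can be written out as a short Python sketch. The four point values are the ones summed in the text (3 + 2 + 2 + 2); the pairing of each value with a specific risk factor is an assumption made here for readability, since Table 2 is not reproduced in this section, and the dictionary name is hypothetical.

```python
# Points for the 75-year-old woman described in the text. The assignment of
# each value to a named factor is assumed; only the four values are quoted.
EXAMPLE_PATIENT_POINTS = {
    "age 70 to 79 years": 3,
    "female gender": 2,
    "ejection fraction 35%": 2,
    "peripheral arterial disease": 2,
}

# Factors the patient does not have (for example, previous MI) add 0 points.
total_score = sum(EXAMPLE_PATIENT_POINTS.values())

# The predicted mortality is then read from Table 3 (7.70% for this score).
```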

Although a risk score can never be as accurate as the statistical model from which it was derived, the small sacrifice of accuracy to gain simplicity often facilitates wide clinical use at the bedside. Providers commonly internalize risk score information structure into their quantitative thinking and into deliberations with colleagues and patients about choice of alternate treatments. Patients can readily understand how characteristics specific to them relate to the individual risk predictions provided to them as a basis for their consent to a specific recommended therapy.

Individual providers can develop a personal quality of care assessment program prospectively by monitoring observed deaths to compare with deaths predicted by the risk score from characteristics of the patients treated. The probabilities of death derived from the risk index table when summed for all patients and divided by the number of patients provide a PMR to compare to the individual provider’s OMR. If multiple providers in the same environment follow this same process, statistical tests can compare the OMR/PMR as an index of quality of care. This empowers each provider or group of providers to voluntarily monitor their quality of care prospectively by benchmarking among colleagues.
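The provider-level comparison described above can be sketched in Python as follows. The function name `observed_vs_predicted` and the four-patient caseload are hypothetical, and no statistical test of the ratio is included; the sketch only shows how the PMR and OMR are formed from patient-level inputs.

```python
def observed_vs_predicted(pred_death, died):
    """Return (OMR, PMR, OMR/PMR) for one provider's patients.

    PMR is the mean of the risk-index probabilities of death;
    OMR is the share of patients who actually died in hospital."""
    n = len(died)
    if n == 0 or n != len(pred_death):
        raise ValueError("inputs must be non-empty and of equal length")
    pmr = sum(pred_death) / n
    omr = sum(died) / n
    return omr, pmr, omr / pmr


# Hypothetical caseload of four patients (not CSRS data).
caseload_pred = [0.003, 0.077, 0.020, 0.100]
caseload_died = [0, 0, 0, 1]
omr, pmr, ratio = observed_vs_predicted(caseload_pred, caseload_died)
```

With a caseload this small the ratio is dominated by chance; in practice it would be interpreted only with a confidence interval or a formal test across many patients.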

To be used optimally, risk scores should only be used in populations whose patient characteristics were tabulated using the same variable definitions as were used in data collection for the New York State model. Moreover, if the outcomes for hospitals or surgeons in a population other than New York are to be compared with one another, the risk index must be recalibrated to adjust for differences in underlying mortality rates between the new population and the New York population so that a PMR and the OMR are calibrated to be the same for the new population. This recalibration is important given the findings of Ivanov et al. (7), who reported that provider assessments varied based on whether the model used was based on another population or if it was recalibrated using the same population. Without recalibration to actual observed deaths in a new population, the risk index or the model from which it was derived cannot be expected to yield accurate predictions of patient risk or accurate relative provider assessments when the underlying performance (outcome) of the population to which it is being applied is different from the underlying performance of the population from which it was derived. However, with proper recalibration reflecting actual total deaths, the model can be used to intercompare outcomes among different providers or care environments.

The risk index described is limited to CABG surgery because we have found that the risk factors for other open heart surgery, and valve surgery in particular, are somewhat different than the risk factors for CABG surgery. Therefore, the performance of this risk score will be expected to differ from other risk scores such as the EuroSCORE, which includes all open heart surgery. We have found that even when different types of open heart surgery share the same significant risk factors, the relative importance of these factors often varies tremendously. Consequently, we feel that a separate risk index should be created for other specific operations, especially for cardiac valve surgery. Further use of the risk index described in other care settings will define its ultimate usefulness in comparison to other risk indexes developed to predict risk of cardiac surgical care.

## Acknowledgments

The authors would like to thank Kenneth Shine, MD, the Chair of New York State’s Cardiac Advisory Committee (CAC), and the remainder of the CAC for their encouragement and support of this study; as well as Paula Waselauskas, Donna Doran, Kimberly Cozzens, Rosemary Lombardo, and the participating hospitals for their tireless efforts to ensure the timeliness, completeness, and accuracy of the registry data.

## Appendix

For the definition of risk factors in the logistic regression equation for CABG in-hospital deaths in New York State in 2002, please see the online version of this article.

## Abbreviations and Acronyms

- CABG: coronary artery bypass graft
- CI: confidence interval
- CSRS: Cardiac Surgery Reporting System
- EF: ejection fraction
- EuroSCORE: risk score based on European data
- MI: myocardial infarction
- OMR: observed mortality rate
- OR: odds ratio
- PCI: percutaneous coronary intervention
- PMR: predicted mortality rate

- American College of Cardiology Foundation

## References

- Nashef S.A., Roques F., Michel P., et al.
- Nashef S.A., Roques F., Hammill B.G., et al.
- Ivanov J., Tu J.V., Naylor C.D.
- Geissler H.J., Holzl P., Marohl S., et al.
- Gogbashian A., Sedrakyan A., Treasure T.
- (2004) Adult Cardiac Surgery in New York State: 2000–2002 (New York State Department of Health).
- Hosmer D.W., Lemeshow S.
- Wu C., Hannan E.L., Walford G., et al.