Author information
- Received June 10, 2008
- Accepted July 13, 2008
- Published online January 27, 2009.
- Matthew J. Budoff, MD⁎,
- Khurram Nasir, MD†,
- Robyn L. McClelland, PhD‡,
- Robert Detrano, MD, PhD§,
- Nathan Wong, PhD§,
- Roger S. Blumenthal, MD∥,
- George Kondos, MD¶ and
- Richard A. Kronmal, PhD‡
- ⁎Reprint requests and correspondence:
Dr. Matthew J. Budoff, Los Angeles Biomedical Research Institute at Harbor-UCLA, 1124 West Carson Street, RB2, Torrance, California 90502
Objectives In this study, we aimed to establish whether age-sex–specific percentiles of coronary artery calcium (CAC) predict cardiovascular outcomes better than the actual (absolute) CAC score.
Background The presence and extent of CAC correlate with the overall magnitude of coronary atherosclerotic plaque burden and with the development of subsequent coronary events.
Methods MESA (Multi-Ethnic Study of Atherosclerosis) is a prospective cohort study of 6,814 asymptomatic participants followed for coronary heart disease (CHD) events including myocardial infarction, angina, resuscitated cardiac arrest, or CHD death. Time to incident CHD was modeled with Cox regression, and we compared models with percentiles based on age, sex, and/or race/ethnicity to categories commonly used (0, 1 to 100, 101 to 400, 400+ Agatston units).
Results There were 163 (2.4%) incident CHD events (median follow-up 3.75 years). Expressing CAC in terms of age- and sex-specific percentiles had significantly lower area under the receiver-operating characteristic curve (AUC) than when using absolute scores (women: AUC 0.73 versus 0.76, p = 0.044; men: AUC 0.73 versus 0.77, p < 0.001). Akaike's information criterion indicated better model fit with the overall score. Both methods robustly predicted events (>90th percentile associated with a hazard ratio [HR] of 16.4, 95% confidence interval [CI]: 9.30 to 28.9, and score >400 associated with HR of 20.6, 95% CI: 11.8 to 36.0). Within groups based on age-, sex-, and race/ethnicity-specific percentiles there remains a clear trend of increasing risk across levels of the absolute CAC groups. In contrast, once absolute CAC category is fixed, there is no increasing trend across levels of age-, sex-, and race/ethnicity-specific categories. Patients with low absolute scores are low-risk, regardless of age-, sex-, and race/ethnicity-specific percentile rank. Persons with an absolute CAC score of >400 are high risk, regardless of percentile rank.
Conclusions Using absolute CAC in standard groups performed better than age-, sex-, and race/ethnicity-specific percentiles in terms of model fit and discrimination. We recommend using cut points based on the absolute CAC amount, and the common CAC cut points of 100 and 400 seem to perform well.
Computed tomography (CT) is a noninvasive tool for the detection and quantification of coronary artery calcium (CAC), a marker for atherosclerosis. The presence and extent of CAC correlate with the overall magnitude of coronary atherosclerotic plaque burden and with the development of subsequent coronary events (1–4). CAC occurs only in the setting of atherosclerosis and is a better index of global atherosclerotic burden than stenosis severity (5). CAC has been shown to add independent prognostic information in every study to date. Recently, overall results from MESA (Multi-Ethnic Study of Atherosclerosis) demonstrated that CAC improved risk prediction after taking into account the Framingham risk score (FRS) in a multiethnic population-based study (6).
The National Cholesterol Education Program Adult Treatment Panel III (NCEP ATP III) (7), American Heart Association (5), and American College of Cardiology (8) have each stated that it might be reasonable to measure CAC in selected patients at intermediate risk, but the precise way to use these scores has been debated in the published CT reports. Early data suggest that having a CAC above an age-sex–specific cut point (CAC ≥75th percentile) is associated with increased coronary heart disease (CHD) events and could be used as a marker to identify individuals requiring aggressive preventive management (9). The hypothesis that a low score in a young person is more abnormal than a low score in an older person and might carry independent risk has been incorporated into guidelines, including those from the NCEP, which recommend that persons with CAC >75th percentile for their age and sex be considered candidates for intensified low-density lipoprotein cholesterol-lowering therapy (7). Others have reported that increasing events are most associated with increasing absolute scores (i.e., >100 or >400) rather than with demographic-specific percentiles (10).
The large population-based observational study, MESA, with 6,814 persons undergoing calcium scoring and longitudinal follow-up, allows evaluation of the robustness of these different scoring approaches. In this study we aim to establish whether absolute coronary artery calcium scores (CACS) predict cardiovascular outcomes better than age-, sex-, and/or race/ethnicity-specific CAC percentiles of the MESA cohort—in other words, whether it is the actual amount of calcium present or the relative amount compared with others of the same age, sex, and race/ethnicity that is most strongly associated with risk.
Recruitment and baseline examination
The MESA cohort (11) is a longitudinal, population-based study of 6,814 men and women, free of clinical cardiovascular disease, ages 45 to 84 years at baseline, recruited from 6 field centers: Baltimore, Maryland; Chicago, Illinois; Forsyth County, North Carolina; Los Angeles, California; New York, New York; and St. Paul, Minnesota. Specific ethnicity groups enrolled included white, black, Hispanic, and Chinese. Over 50% of the participants enrolled were female. Details of the MESA recruitment strategy are contained elsewhere (12). The baseline visit took place between July 2000 and September 2002. The study was approved by institutional review boards at each site, and all participants gave written informed consent.
The purpose of the study is to examine the risk factors and progression of subclinical cardiovascular disease. The design of the study has been described in detail previously (12), but we describe the collection of pertinent variables here.
Measurement of CAC: CT scanning
Scanning centers assessed coronary calcium by chest CT with either a cardiac-gated electron-beam CT scanner (Chicago, Los Angeles, and New York Field Centers) or a multidetector CT system (Baltimore, Forsyth County, and St. Paul Field Centers). Certified technologists scanned all participants twice over phantoms of known physical calcium concentration. A radiologist or cardiologist read all CT scans at a central reading center (Los Angeles Biomedical Research Institute at Harbor–UCLA in Torrance, California). We used the average Agatston score (13) for the 2 scans in all analyses. Carr et al. (14) have reported the details of the MESA CT scanning and interpretation methods.
To date, the cohort has been followed for incident cardiovascular events for a median of 46 months (6). At intervals of 9 to 12 months, a telephone interviewer contacted each participant to inquire about interim hospital admissions, cardiovascular outpatient diagnoses, and deaths. To verify self-reported diagnoses, we requested copies of all death certificates and medical records for hospital stays and outpatient cardiovascular diagnoses and conducted next-of-kin interviews for out-of-hospital cardiovascular deaths. We obtained records on 98% of reported hospitalized cardiovascular events. Some information was available on 95% of reported outpatient diagnostic encounters.
Trained personnel abstracted medical records suggesting possible cardiovascular events. Two physicians independently classified and assigned incidence dates. If, after review and adjudication, disagreements persisted, a full mortality and morbidity review committee made the final classification. For purposes of this study, we used all incident CHD events as the end point, including definite or probable myocardial infarction (MI), resuscitated cardiac arrest, fatal CHD, definite angina, and probable angina if accompanied by revascularization. Definitions for each of these events are as follows. Reviewers classified MI as definite, probable, or absent, primarily on the basis of combinations of symptoms, electrocardiogram (ECG), and cardiac biomarker levels. In most cases, definite or probable MI required either abnormal cardiac biomarkers (2 times upper limits of normal) regardless of pain or ECG findings; evolving Q waves regardless of pain or biomarker findings; or a combination of chest pain and ST-T evolution or new left bundle branch block and biomarker levels 1 to 2 times upper limits of normal.
Reviewers classified resuscitated cardiac arrest when a patient successfully recovered from a full cardiac arrest through cardiopulmonary resuscitation (including cardioversion). Angina was classified except in the setting of MI, and a classification of angina required symptoms of typical chest pain or atypical symptoms, because asymptomatic coronary artery disease is not a MESA end point. Probable angina required, in addition to symptoms, a physician diagnosis of angina and medical treatment for it. Definite angina required 1 or more additional criteria, including coronary artery bypass graft surgery or other revascularization procedure; 70% or greater obstruction on coronary angiography; or evidence of ischemia by stress tests or by resting ECG. We did not consider coronary revascularization or a physician diagnosis of angina or CHD, in the absence of symptoms, to be angina. Fatal CHD required a documented MI within the previous 28 days, chest pain within the 72 h before death, or a history of CHD and required the absence of a known nonatherosclerotic or noncardiac cause of death.
Estimating Age-, Sex-, and/or Race/Ethnicity-Specific Percentiles
The methodology for estimating the age-, sex-, and/or race/ethnicity-specific percentiles is described in detail in McClelland et al. (15). A brief description is provided in the following text. The distribution of baseline CAC in this population is heavily skewed, with approximately 50% of participants having zero calcium. The positive portion of the CAC distribution is fairly symmetric and bell-shaped on the log scale. As a first step in obtaining age-, sex-, and race/ethnicity-specific quantiles, we model the mean of the log CAC distribution (positive CACS only) as a linear function of age, within each sex and race/ethnicity. Within each sex and race/ethnicity, the residuals from this model are then ranked, and we calculate the jth percentile for each of j = 1, … 100 of the residuals. Adding these to the fitted value for a particular age, sex, and race/ethnicity yields an estimated percentile for the log transformed positive CAC variable. Taking the exponential of this percentile yields the jth percentile of the positive portion of the CAC distribution. If a certain proportion (p) have zero calcium, then the jth percentile calculated in the preceding text is the 100 × [p + (1 − p)j/100] percentile of the overall CAC distribution (i.e., including the zeroes). We model p as a sex and race/ethnicity-specific function of age with logistic regression. To estimate age- and sex-specific percentiles, we follow the strategy outlined in the preceding text, but the models are only sex-specific and not race/ethnicity- and sex-specific, and residuals are ranked without regard to race/ethnicity. Percentiles by age and race/ethnicity and age-only are obtained similarly. Sex- and race/ethnicity-specific percentiles as well as overall percentiles are obtained by simply ranking the values within each group of interest. 
In all cases, participants with zero CAC are assigned a midrank percentile, equal to one-half the predicted probability of zero CAC from the logistic regression model in the preceding text.
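The conversion from a percentile of the positive-CAC distribution to a percentile of the overall distribution, together with the midrank rule for zero scores, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, the probability of zero CAC (`p_zero`) is taken as a given input rather than estimated by logistic regression, and the midrank is expressed on a 0 to 100 scale by assumption.

```python
def overall_percentile(j, p_zero):
    """Map the j-th percentile of positive CAC (0-100) to the overall percentile.

    Implements 100 * [p + (1 - p) * j / 100], where p (p_zero) is the
    proportion with zero CAC in the stratum. p_zero is assumed known here.
    """
    if not 0.0 <= p_zero <= 1.0:
        raise ValueError("p_zero must lie in [0, 1]")
    if not 0.0 <= j <= 100.0:
        raise ValueError("j must lie in [0, 100]")
    return 100.0 * (p_zero + (1.0 - p_zero) * j / 100.0)


def zero_cac_midrank(p_zero):
    """Midrank percentile for a zero score: half the probability of zero CAC,
    expressed on a 0-100 scale (the scale is an assumption of this sketch)."""
    if not 0.0 <= p_zero <= 1.0:
        raise ValueError("p_zero must lie in [0, 1]")
    return 100.0 * p_zero / 2.0


# Hypothetical stratum in which half of participants have zero CAC.
print(overall_percentile(50, 0.5))  # median of positive scores -> 75.0 overall
print(zero_cac_midrank(0.5))        # zero score -> 25.0
```

With half of a stratum at zero, the median positive score sits at the 75th overall percentile, which shows how strongly the zero fraction shifts the ranking.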
Models for Time to Incident CHD
Time to incident CHD was modeled with Cox proportional hazards models. We also considered parametric survival models, including exponential, Weibull, log-normal, and log-logistic, but conclusions were unaffected, and only the Cox model results are presented. We compared models with continuous versions of the percentiles and also categorized versions. For models based on the continuous variables, each model contained a percentile ranking of CAC and an indicator for whether CAC was positive at baseline. The indicator term allows a different intercept for those with and without CAC and is necessary due to the possible discontinuity between the continuous positive CAC values and zero CAC. We also fit a continuous model with log(CAC+1) instead of the CAC percentile. Because the percentiles would likely be categorized for use clinically, we also fit models with the following groups: zero CAC, ≤75th percentile, 75th to 90th percentile, and >90th percentile. A final model used CAC in 4 groups on the basis of cut points commonly seen in the published reports (zero CAC, 1 to 100, 101 to 400, >400).
Models were compared on the basis of several metrics, each of which reflects a different characteristic of a desirable prediction model. The hazard ratios represent the multiplicative increase in risk associated with a 1-percentile-point difference in ranking (or a 1 log Agatston unit difference for the log CAC model). Assuming the scales are comparable, a stronger predictor should have a higher hazard ratio. These are useful to compare the various percentile rankings, although they are not comparable between the percentile rankings and the model with log CAC or CAC group. For each model we calculated a proportion of variation explained, with a modified version of R-squared for censored data described in Royston (16). Additionally we estimated the area under the receiver-operating characteristic curve (AUC) and Akaike's Information Criterion (AIC). These statistics are comparable across all models within a given sex. The R-squared is a measure of model fit, whereas the AUC is a measure of discrimination. For both of these, higher values are preferable. The AIC is also a measure of model fit but includes a penalty for models with more parameters (such as with CAC group in 4 levels). Lower values of AIC indicate better model fit.
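The two categorizations compared in these models can be sketched as simple lookup functions. The function names and group labels are hypothetical, and the handling of non-integer averaged scores (for example, a value between 100 and 101) is an assumption of this sketch rather than something the text specifies.

```python
def absolute_cac_group(cac):
    """Assign an Agatston score to the standard absolute groups.

    Assumption: any positive score up to 100 falls in "1-100" and any score
    above 100 and up to 400 falls in "101-400", so averaged non-integer
    scores are still classified.
    """
    if cac < 0:
        raise ValueError("CAC score cannot be negative")
    if cac == 0:
        return "0"
    if cac <= 100:
        return "1-100"
    if cac <= 400:
        return "101-400"
    return ">400"


def percentile_group(cac, percentile):
    """Assign a participant to the percentile-based groups used in the models.

    `percentile` is the demographic-specific percentile (0-100); zero CAC
    forms its own group regardless of percentile.
    """
    if cac == 0:
        return "zero CAC"
    if percentile <= 75:
        return "<=75th"
    if percentile <= 90:
        return "75th-90th"
    return ">90th"


# The two qualitative examples discussed later in the article.
print(absolute_cac_group(25), percentile_group(25, 95))      # 1-100 >90th
print(absolute_cac_group(1572), percentile_group(1572, 72))  # >400 <=75th
```

The two example participants land in opposite extremes under the two schemes, which is the classification disagreement the analysis examines.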
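Two of these metrics are easy to illustrate in a few lines. The sketch below computes AIC from a log-likelihood and a parameter count, and computes AUC as the probability that a randomly chosen participant with an event has a higher predictor value than one without. It is illustrative only: the data are hypothetical, the AUC shown ignores censoring, and the text does not state exactly how AUC was estimated for the survival models. For a Cox model, the log-likelihood would be the partial log-likelihood.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 log L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def auc(scores, events):
    """Concordance-style AUC for a binary outcome.

    Counts, over all (event, non-event) pairs, how often the event case has
    the higher score; ties count as one half. Censoring is ignored.
    """
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictor values and outcomes (1 = incident event).
scores = [0.10, 0.40, 0.35, 0.80]
events = [0, 0, 1, 1]
print(auc(scores, events))  # 0.75

# A 3-parameter model needs a higher log-likelihood to beat a 1-parameter one.
print(aic(-100.0, 3), aic(-101.5, 1))  # 206.0 205.0
```

The second comparison shows the penalty at work: the smaller model has the lower AIC even though its log-likelihood is slightly worse.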
Overall the study population consisted of 6,809 individuals at baseline (mean age: 62 ± 10 years, 47% men). There were 163 incident CHD events (2.4%) observed over a median of 3.75 years. Table 1 demonstrates that the cardiovascular risk profile was less favorable in those who subsequently developed CHD than in those who did not. In addition, baseline coronary artery calcium score (CACS) was significantly higher among those who suffered an incident CHD event compared with those who did not.
Tables 2 and 3 display the sample size, event rates, hazard ratios, and AIC statistics for models with categories based on various adjusted percentile rankings (age-sex– and age-sex-race/ethnicity–adjusted) as well as based on absolute CAC cutoffs (0, 1 to 100, 101 to 400, and >400) in women and men. The best fitting model as measured by the lowest AIC used absolute CAC cut points, and these correspond quite closely to the 75th and 90th overall percentile. Using the percentiles continuously and comparing with a model containing log(CAC+1) yielded the same conclusions, in that the overall percentile or the model using log(CAC+1) performed best. This was also true in terms of AUC and R-squared. For example, among women the AUC was 0.76 for overall percentile (or log[CAC+1]) and was 0.73 for age- and sex-specific percentiles (p = 0.04). For men, the AUC was 0.77 for overall percentile (or log[CAC+1]) and 0.73 for age- and sex-specific percentiles (p < 0.001). The modified R-squared was 0.53 for the log(CAC+1) model for women and 0.50 for men. In contrast the modified R-squared was much lower for age- and sex-specific percentiles at 0.46 for women and 0.38 for men. As shown in Online Table 1, age-specific percentile rankings had the worst model fit, regardless of whether sex and race/ethnicity were also considered.
Figure 1 compares the incidence of CHD over time by CAC group. The absolute CAC categories yield curves with much better separation, indicating greater risk stratification ability. In Figure 2, we display the rates of incident CHD/1,000 person-years at risk by joint categories of absolute CAC group and age-, sex-, and race/ethnicity-specific percentiles. We note that the overall 75th and 90th percentiles for the MESA cohort are 88 and 398 CAC units, respectively, and hence the absolute CAC groups are essentially equivalent to dividing on the basis of the overall percentiles. Within a particular level of age-, sex-, and race/ethnicity-specific percentile, there remains a clear trend of increasing risk across levels of the absolute CAC groups. In contrast, once absolute CAC category is fixed, there is no increasing trend across levels of age-, sex-, and race/ethnicity-specific categories.
In addition, we assessed the risk of incident CHD according to increasing absolute CACS across age-sex-race/ethnicity–specific percentiles (Table 4). Among individuals with CACS <75th percentile for age-sex-race/ethnicity, as compared with those with CAC 1 to 100 (reference group), the hazard ratio for incident CHD after taking into account Framingham risk score was 2.50 (95% confidence interval [CI]: 1.27 to 4.92) with CAC 101 to 400 and 5.58 (95% CI: 2.34 to 13.33) with CAC >400. In contrast, within absolute CACS categories (Table 5), a higher adjusted percentile CAC was not associated with increased risk of incident CHD.
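A quick consistency check on the reported intervals: if a confidence interval is symmetric on the log scale (the usual Wald construction for Cox models, assumed here because the text does not state how the intervals were computed), the geometric mean of its limits should reproduce the hazard ratio, and the interval width gives the standard error of the log hazard ratio. The helper names below are hypothetical.

```python
import math


def ci_geometric_mean(lower, upper):
    """Point estimate implied by a CI that is symmetric on the log scale."""
    return math.sqrt(lower * upper)


def log_hr_standard_error(lower, upper, z=1.959964):
    """Standard error of log(HR) implied by a 95% Wald interval (assumed)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Intervals reported in the text for CAC 101 to 400 and for the highest group.
for hr, lower, upper in [(2.50, 1.27, 4.92), (5.58, 2.34, 13.33)]:
    implied = ci_geometric_mean(lower, upper)
    se = log_hr_standard_error(lower, upper)
    print(f"reported HR {hr:.2f}, implied HR {implied:.2f}, SE(log HR) {se:.2f}")
```

Both implied values agree with the reported hazard ratios to within rounding, which is what a log-symmetric interval would produce.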
The results of this study demonstrate that there is no advantage and, in some cases, considerable disadvantage to expressing CACS relative to age, sex, and/or race/ethnicity. The overall percentile does just as well as any other percentile ranking and in fact better than any percentile that is age-adjusted. Consider a qualitative example: a 50-year old Hispanic woman with a CAC of 25 Agatston units is at the 95th percentile relative to her age, sex, and race/ethnicity, with an annual risk of only 0.25% (10-year estimated risk of only 2.5%) on the basis of this model (Table 2). Now consider an 83-year-old white man with a CAC of 1,572 Agatston units. Relative to his age, sex, and race/ethnicity, he is at the 72nd percentile. However, the high absolute score drives the overall risk, and the annual risk is 2.8% (10-year estimated 28% risk). So, the age-, sex-, and race/ethnicity-specific percentiles would say the Hispanic woman is at much higher risk. Clearly, the estimates from the age-, sex-, and race/ethnicity-specific percentile model do not reflect what we know about CHD risk. The overall percentiles provide a more realistic picture.
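The 10-year figures in this example correspond to multiplying the annual risk by 10. The sketch below reproduces that arithmetic and, for comparison, shows the slightly lower value obtained by compounding a constant annual risk. The compounded version is an alternative assumption introduced here for illustration and is not taken from the article; the function names are hypothetical.

```python
def ten_year_risk_linear(annual_risk, years=10):
    """Simple extrapolation: annual risk times number of years."""
    return annual_risk * years


def ten_year_risk_compounded(annual_risk, years=10):
    """Risk of at least one event, assuming a constant annual risk that
    applies independently each year (an assumption of this sketch)."""
    return 1.0 - (1.0 - annual_risk) ** years


# Annual risks from the two examples in the text: 0.25% and 2.8%.
for label, annual in [("woman, CAC 25", 0.0025), ("man, CAC 1,572", 0.028)]:
    linear = ten_year_risk_linear(annual)
    compounded = ten_year_risk_compounded(annual)
    print(f"{label}: linear {linear:.1%}, compounded {compounded:.1%}")
```

Under either calculation the man's estimated 10-year risk is roughly 10 times the woman's, so the contrast drawn in the example does not depend on the extrapolation method.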
Although individuals with a higher demographic-adjusted CAC percentile will have higher CACS, there are still some major differences in classification. In the MESA study, approximately 50% of participants with age-sex-race/ethnicity–adjusted percentile scores in the 75% to 90% group had CACS <100. In contrast, approximately one-third (35%) of MESA subjects with CACS 100 to 399 were considered to have an adjusted percentile <75%. Our study results indicate that within an absolute score group there is no difference in the rate of individuals suffering CHD events associated with worsening CAC percentiles (Fig. 2, Table 5). Patients with low absolute scores are low risk, regardless of adjusted CAC percentile rank. Conversely, within the age-sex-race/ethnicity–specific percentiles, a positive relationship with events is observed across increasing CACS. In addition, after taking into account Framingham risk scores, those with CAC >100 were at 2 to 5 times higher risk of suffering an acute CHD event in the near-term follow-up (Table 4). This demonstrates that percentile rank is not as robust a risk stratifier as absolute scores.
Our data differ somewhat from previously published reports on this topic. Whether age-sex–based scores or absolute scores are better predictors has only been evaluated in 2 small studies to date. One such approach was taken by Raggi et al. (9), who reported on the occurrence of hard events in 632 patients followed for 32 ± 7 months from the time of electron beam tomography (EBT) calcium scanning and on the CT findings of 172 patients undergoing CT imaging within a few days of suffering an acute MI. In both groups the majority of patients (70%) who suffered an MI or a coronary death showed a calcium score above the age-sex–adjusted 75th percentile at the time of screening (70% found vs. 25% expected, p < 0.001). Of interest, the event rate in patients with large calcium scores (>401) was high (approximately 5%/year), but only a small proportion of the subjects studied (7%) presented this level of calcification. Therefore, although a large calcium score represents a serious risk of developing coronary events, the authors felt its low frequency in the population renders it inadequate for risk stratification purposes. This observation contrasted with the powerful risk stratification ability demonstrated by relative calcium scores. In fact, the risk of suffering a hard event in patients with a calcium score >75th percentile was 19 times that of patients with a score <25th percentile, whereas the risk of events in patients in the upper risk factor quartile was 6.5 times greater than that of patients in the lowest quartile.
Wong et al. (10) published a report on 926 asymptomatic patients followed for an average of 3.3 years from the time of EBT screening. Patients with CAC deposits on EBT had more prevalent risk factors, and the calcium scores were significantly greater in patients with events than in those without events. The risk of events in patients in the upper quartile of absolute calcium score (score >271) was 12 times that of patients in the lowest quartile (score <15; annual risk: 8.8 and 0.72, respectively; risk ratio: 12). In multivariable analysis adjusted for other risk factors, there was a modest increase in cardiovascular disease events seen among those in the 3rd age and sex quartile (relative risk: 4.3, p = 0.02), with a greater risk seen among those in the 4th quartile (relative risk: 6.0, p < 0.01) (compared with the 1st quartile). Results of this dataset demonstrated that age-sex stratification by percentile rank of CAC was not as accurate as absolute CACS for predicting cardiovascular disease events in asymptomatic persons.
The NCEP (ATP III) has recommended age-sex cut points: “In persons with multiple risk factors, high CACS (e.g., >75th percentile for age and sex) denotes advanced coronary atherosclerosis and provides a rationale for intensified low-density lipoprotein cholesterol-lowering therapy” (7). However, the results of MESA indicate that the relative percentiles do not predict incident CHD as well as simply using the absolute scores or overall percentiles. If adjusted CACS are used as a basis to identify high-risk individuals, nearly one-third of individuals with adjusted CAC <75th percentile have absolute CACS >100 and might not be considered candidates for lipid-lowering medications. It seems that the amount of CAC (as a surrogate for plaque burden) is more important than the relative percentile of an individual on the basis of age and sex. This is consistent with cardiovascular risk factors (such as cholesterol or blood pressure values), which are not normalized on the basis of age.
We would like to emphasize that cut points for treatment might still need to be sex or age specific. If the goal is to identify and treat patients who have a particular level of risk, say at least 2%/year, then the CAC threshold for women will have to be higher than that for men, because women have lower baseline risk. Using sex-based percentiles, however, actually does the opposite of this. By fixing the percentage of patients to target rather than the underlying risk, the threshold for women is lower than for men. Targeting the top 25% of each sex for instance, we would be treating women with much lower CACS and consequently at much lower risk than men.
Using overall percentile or CAC in standard groups performed much better than age-sex-race/ethnicity–specific percentiles in terms of model fit and discrimination. Cut points based on demographic specific percentiles have the additional problem that they are study-specific, and so we recommend using cut points based on the absolute CACS for evaluating risk of CHD events in short-term follow-up. Further study based on a greater number of events might help elucidate which specific cut points are best; however, at the moment the common choices of 100 and 400 seem to perform well.
The authors thank the other investigators, the staff, and the participants of the MESA study for their valuable contributions. A full list of participating MESA investigators and institutions can be found at http://www.mesa-nhlbi.org.
For a supplemental table on the prediction of incident CHD as a function of CAC percentiles calculated in different ways, please see the online version of this article.
Coronary Calcium Predicts Events Better With Absolute Calcium Scores Than Age-Gender-Race Percentiles—The Multi-Ethnic Study of Atherosclerosis
This research was supported by R01 HL071739 and contracts N01-HC-95159 through N01-HC-95165 and N01 HC 95169 from the National Heart, Lung, and Blood Institute.
- Abbreviations and Acronyms
- CAC = coronary artery calcium
- CACS = coronary artery calcium score
- CHD = coronary heart disease
- CI = confidence interval
- CT = computed tomography
- EBT = electron beam tomography
- MI = myocardial infarction
- NCEP ATP III = National Cholesterol Education Program Adult Treatment Panel III
- American College of Cardiology Foundation
- Arad Y., Roth M., Newstein D., et al.
- Taylor A.J., Bindeman J., Feuerstein I., Cao F., Brazaitis M., O'Malley P.G.
- Vliegenthart R., Oudkerk M., Hofman A., et al.
- Budoff M.J., Shaw L.J., Liu S.T., et al.
- Budoff M.J., Achenbach S., Blumenthal R.S., et al.
- (2002) Third Report of the National Cholesterol Education Program (NCEP) Expert Panel on Detection, Evaluation, and Treatment of High Blood Cholesterol in Adults (Adult Treatment Panel III) final report. Circulation 106:3143–3421.
- Greenland P., Bonow R.O., Brundage B.H., et al.
- Raggi P., Callister T.Q., Cooil B., et al.
- Bild D.E., Bluemke D.A., Burke G.L., et al.
- Bild D.E., Detrano R., Peterson D., et al.
- Agatston A.S., Janowitz W.R., Hildner F.J., Zusmer N.R., Viamonte M. Jr., Detrano R.
- Carr J.J., Nelson J.C., Wong N.D., et al.
- McClelland R.L., Chung H., Detrano R., et al.
- Royston P.