Authors
- Daniel B Mark, MD, MPH, FACC, Co-Chair,
- Leslee J Shaw, PhD, Co-Chair,
- Michael S Lauer, MD, MPH,
- Patrick G O’Malley, MD, MPH and
- Paul Heidenreich, MD, MS
In the U.S., an estimated 40 million noninvasive cardiac tests are performed annually, and this rate has been increasing by as much as 20% per year (1). This growth is part of a larger trend of progressive annual increases in total U.S. spending on medical care, which has accelerated over the past four years. Rising costs of care reflect both an increase in the prevalence of disease due to aging of the population and the development of expensive new diagnostic and therapeutic technologies for cardiovascular disease. For cardiologists, cardiac imaging encompasses approximately 30% of all Medicare reimbursement, totaling over $1 billion in 2000 (2). In the area of atherosclerosis imaging, procedural volume for computed tomography (CT) and magnetic resonance imaging (MRI) was 555,652 and 719,329 scans, respectively, in the year 2000 (Siemens Medical Engineering Group, Magnetic Resonance Division, Iselin, New Jersey), whereas 1999 Medicare utilization of carotid or peripheral extremity studies was 424,978 (2). Although no reliable statistics exist on the use of diagnostic tests to detect asymptomatic atherosclerosis, estimates on the use of electron beam tomography (EBT) suggest that approximately 300,000 scans are performed annually in 79 centers in the U.S. (personal communication, Leslee J. Shaw, 2002). Thus, diagnostic cardiovascular tests are not only a significant part of modern cardiovascular care; they are also a “big business.” The economics of this testing, therefore, is of importance for both clinicians and policymakers.
Economic evaluations, particularly cost-effectiveness analyses, are not simply concerned with costs. Instead, these analyses combine cost information with relevant clinical outcome data to provide a measure of the value of a new technology in relation to relevant alternatives. Unfortunately, very few published economic evaluations of atherosclerosis imaging techniques exist (3–9). Two major reasons for this deficiency can be postulated. First, many of the technological advances in cardiac imaging were introduced without undergoing rigorous scientific testing on effectiveness. Without adequate effectiveness data, economic evaluation is extremely limited. Second, economic analyses are most straightforward when evaluating therapies that save lives or improve quality of life. Assessing the value of tests that incrementally improve a diagnosis or an assessment of prognosis, which may or may not alter outcome, is more difficult and often yields less persuasive results.
Defining the effectiveness of diagnostic tests for economic analysis
Any assessment of value for money begins with the effectiveness side of the equation. What is the money purchasing? In the case of a new therapy, money is spent to improve survival or quality of life. A new diagnostic test is used with the same basic goals. To improve outcome, however, several intermediate steps must take place after the test is performed. First, the test results must be summarized in some clinically meaningful fashion—“positive,” “strongly positive,” “high risk,” and so forth. Second, the responsible clinician must link the test results with a subsequent management decision: for example, “high risk” equals need for coronary angiography and revascularization. Third, the therapies associated in this fashion with the test results must be capable of changing patient outcomes.
The ultimate value question for a diagnostic test is: “Does its use improve longevity or quality of life?” A test may fail to achieve this objective for four reasons: the test may not provide enough useful incremental information to alter management, clinicians may use the test results inconsistently in decision making, the therapy linked to the test results may be insufficiently effective, or the therapy may be of poor quality. As an example of the first reason, a new test may provide information regarding diagnosis or risk level that is already available to the clinician from previously collected data. A 65-year-old male with typical exertional angina has a high pretest probability of significant coronary artery disease (CAD). The addition of a noninvasive stress test to his work-up would be unlikely to alter management in any important way, much less alter outcome. Research on diagnostic tests typically uses an underlying conceptual model that looks at the total available information content of the test across the entire spectrum of patient pretest risk level. However, clinicians make management decisions using a very different conceptual model, one that frequently employs heuristics and informal decision thresholds (10). If a test provides measurably more information about a patient’s risk level but that information does not move that patient across a decision threshold, the added information may be invisible to the clinician and have no effect on management or outcome. For this reason, multivariable models that show a new test is significantly better at stratifying risk than an older test are not, in themselves, sufficient to demonstrate incremental effectiveness, as we have defined it above.
Another way that test information becomes uncoupled from patient outcome is when clinicians use the test results inconsistently. For the test to alter outcome, clinicians as a group must have a consensus about the management implication of test results. Several studies have shown that a significant proportion of symptomatic patients with high-risk stress nuclear perfusion scan results do not undergo coronary angiography (11). For such patients, management does not appear to be significantly altered because of the test result. The reasons why physicians do not act in the anticipated manner upon receiving test data are complex and outside the scope of this report. However, it is important to note that the idealized world often reflected in models of test use, where each test result is closely linked with a management decision, often does not match the real world of test use. New tests often become widely disseminated before there is general evidence-based agreement on what the results mean. The effect of the test on management will therefore be inconsistent, thereby reducing any possible impact on outcomes. The ongoing debate on the meaning of coronary calcium evident on the EBT test illustrates this problem well.
A third reason that testing may fail to yield changes in patient outcome is inadequate effectiveness of the therapies that are linked to the test results. A strongly positive atherosclerosis imaging test may lead to a diagnostic cardiac catheterization, which itself cannot improve outcome and carries a small procedural risk. Results of the catheterization in turn may lead to coronary revascularization. The ability of this therapy to improve prognosis is linked to the severity of underlying CAD (12). If the imaging test applied to a cohort of asymptomatic subjects identifies a subset that has significant CAD, but these patients have predominantly one-vessel disease, subsequent use of revascularization will have minimal impact on survival. Because the screened population is, by definition, asymptomatic, improvement in quality of life with revascularization is unlikely. Thus, screening in this example alters management, but a positive impact on clinical outcomes might be undetectable.
Finally, a test may fail to improve outcome if the quality of the resulting therapy is poor. Models of test use often assume that therapies applied in the real world will be of equivalent quality to the best results available in the published data. However, if the “high-risk” test result leads to a revascularization procedure in a low-volume community hospital that has a procedural morbidity and mortality rate several times higher than the expert high-volume centers, the ability of the test to improve outcomes may be significantly reduced. Similarly, if the “high-risk” test result leads to intensive risk-factor management, but this management does poorly in achieving target cholesterol levels, blood pressure control, and smoking cessation, the value of the test will be proportionately reduced.
For an atherosclerosis imaging test to be clinically effective in the assessment of asymptomatic individuals, it would have to provide new information above and beyond that from the clinical examination (history, physical examination) and initial laboratory data (e.g., cholesterol level, glucose, electrocardiogram). One complexity that is not well appreciated is that the added value of a test in a given cohort may vary with the baseline characteristics of the subjects being tested (13). In addition, a test result is most likely to alter clinical management when used in intermediate-risk subjects. As reflected in Bayes’ rule, a “negative” noninvasive test in a high-risk cohort will not be sufficient to make the cohort low risk, whereas a “positive” test in a low-risk cohort will not yield post-test probabilities that are in the high-risk range. When we require a diagnostic test to change management and thereby improve outcomes so as to demonstrate value for money, it might seem as if we are discounting the common practice of using tests, particularly screening tests, to provide reassurance to the patient. From an economic analysis perspective, reassurance is a “therapy” that has the goal of improving the patient’s quality of life. Use of testing to reassure a patient that he or she is disease free, for example, can be analyzed using cost-utility analysis methods (see Cost Utility Analysis section). The challenge in such an analysis is defining how much and for how long the quality of life is improved by “good news.”
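The Bayes' rule argument above can be illustrated numerically. The following is a minimal sketch, not a validated clinical calculator: the sensitivity and specificity (0.80 each) and the three pretest probabilities are hypothetical values chosen only to show the pattern, and the function name is our own.

```python
def post_test_probability(pretest, sensitivity, specificity, positive=True):
    """Apply Bayes' rule to a dichotomous test result.

    Returns the probability of disease after a positive (or negative) result.
    """
    if positive:
        true_pos = sensitivity * pretest
        false_pos = (1.0 - specificity) * (1.0 - pretest)
        return true_pos / (true_pos + false_pos)
    false_neg = (1.0 - sensitivity) * pretest
    true_neg = specificity * (1.0 - pretest)
    return false_neg / (false_neg + true_neg)


# Hypothetical test characteristics and cohorts, for illustration only.
SENSITIVITY = 0.80
SPECIFICITY = 0.80
COHORTS = [("low risk", 0.05), ("intermediate risk", 0.50), ("high risk", 0.90)]

for label, pretest in COHORTS:
    after_pos = post_test_probability(pretest, SENSITIVITY, SPECIFICITY, positive=True)
    after_neg = post_test_probability(pretest, SENSITIVITY, SPECIFICITY, positive=False)
    print(f"{label}: pretest {pretest:.2f}, "
          f"after positive {after_pos:.2f}, after negative {after_neg:.2f}")
```

Under these assumed values, a negative result leaves the high-risk cohort with a post-test probability well above 50%, and a positive result leaves the low-risk cohort below 20%. Only in the intermediate-risk cohort does the result move the probability far enough in either direction to plausibly cross a decision threshold.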
Problems with determining the effectiveness of screening
Because screening studies typically involve low-risk populations, large sample sizes with prolonged follow-up are required to assess the impact of screening on disease-related events. Large, randomized trials are the most rigorous means of determining whether an intervention improves outcome, but these trials for screening strategies are difficult to perform and expensive. Large-scale randomized trials have been performed for screening of some prevalent diseases, such as breast cancer, but not for cardiovascular disease. Even with such large randomized trials, interpretation of the results has often been controversial (14–17).
Because of the difficulty in performing randomized trials of screening for cardiovascular disease, some researchers have attempted to simulate such a trial using two observational cohorts, subjects who did and who did not have screening. Such an approach is subject to several important biases that may not be correctable analytically. The cohort that has the screening test has its disease detected at an earlier stage, introducing “lead-time” bias. Because of the earlier diagnosis, there is an appearance of improved survival (longer interval from “diagnosis” to death) in the screened cohort when in fact this is not the case (18). “Length-time” bias is another more subtle, yet particularly important problem in which patients with more aggressive disease are less likely to undergo screening merely because their disease becomes clinically manifest before they have an opportunity to show up for a screening examination (18). “Overdiagnosis” bias occurs when screening detects indolent disease that is highly unlikely to ever be clinically problematic, but leads to the impression that screening decreases the adverse impact of the disease (18). An example of this is a population screening program for neuroblastoma in children that not only yielded no benefit but also led to the discovery and treatment of clinically unimportant tumors (19,20).
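Lead-time bias can be made concrete with a deliberately simple calculation. In this sketch the ages and the three-year lead time are hypothetical, and screening is assumed to have no effect at all on the age at death; the function name is our own.

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years lived after diagnosis, the quantity compared in naive cohort studies."""
    return age_at_death - age_at_diagnosis


# Hypothetical patient: screening changes when disease is found, not when death occurs.
AGE_AT_DEATH = 70.0
AGE_AT_CLINICAL_DIAGNOSIS = 67.0   # diagnosis when symptoms appear
LEAD_TIME = 3.0                    # years by which screening advances the diagnosis

unscreened = survival_from_diagnosis(AGE_AT_CLINICAL_DIAGNOSIS, AGE_AT_DEATH)
screened = survival_from_diagnosis(AGE_AT_CLINICAL_DIAGNOSIS - LEAD_TIME, AGE_AT_DEATH)

print(f"Survival after diagnosis without screening: {unscreened:.1f} years")
print(f"Survival after diagnosis with screening:    {screened:.1f} years")
print(f"Apparent gain (all lead time, no real benefit): {screened - unscreened:.1f} years")
```

The apparent survival gain equals the lead time exactly, even though the patient dies at the same age in both scenarios.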
Traditional survival analyses, including Kaplan-Meier product limit calculations (21), Cox proportional hazards regression (22), and parametric modeling (23), are often used to analyze observational studies of screening programs. However, these methods are all based on the assumption that time zero, which is the time that follow-up begins, is clearly defined, has some kind of clinically or biologically meaningful substrate, and is not systematically different between different groups of patients. Assessment of screening and survival outside of randomized trials is inherently problematic because time zero is not known for patients not undergoing screening. This failure to determine time zero accurately leads to length-time and lead-time biases, which cannot be rectified by survival models.
Use of decision models to evaluate screening programs
Because large, randomized trials of atherosclerosis imaging have not been performed, researchers have often employed decision-analytic methods to examine alternative screening strategies. These methods use structured mathematical simulation models to estimate the cost for some benefit achieved. One major advantage of a modeling approach is the ability to consider all available evidence rather than to be restricted to the data from one trial involving a specific limited cohort.
One drawback to the use of a decision model is that comprehensive data needed to address the questions of interest are rarely available. Few empirical studies, for example, have directly compared the accuracy of several candidate screening strategies, and none have compared all in a single cohort (24,25). To compensate for a deficiency of data, decision models use numerous assumptions based on diverse types of evidence, including expert opinion. The cobbling together of unrelated fragments of “evidence” solves the problem of populating the model with the needed parameters, but it can create the impression in unwary consumers of greater certainty than is warranted. The impact of uncertainty on model results can be formally tested using a sensitivity analysis, which involves varying each uncertain model parameter over a range of plausible values and observing the result. Multi-way sensitivity analysis involves varying more than one uncertain parameter at a time. Considerable analytical judgment is required, however, in deciding what to vary and how much variation is required. Extrapolation of model results from the published “evidence” to the general population of interest requires the use of additional assumptions about treatment and target population characteristics that might not be available or might be biased (26,27).
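A one-way sensitivity analysis of the kind described above might be sketched as follows. The toy screening model, its parameter names, the base-case values, and the plausible ranges are all hypothetical and serve only to show the mechanics; a real decision model would be far more detailed.

```python
def cost_per_life_year(params):
    """Toy screening model: incremental cost divided by incremental life-years."""
    incremental_cost = params["test_cost"] + params["p_positive"] * params["followup_cost"]
    incremental_life_years = params["p_positive"] * params["life_years_gained_if_treated"]
    return incremental_cost / incremental_life_years


# Hypothetical base case and plausible ranges (illustrative assumptions only).
BASE_CASE = {
    "test_cost": 400.0,
    "p_positive": 0.10,
    "followup_cost": 3000.0,
    "life_years_gained_if_treated": 0.20,
}
PLAUSIBLE_RANGES = {
    "test_cost": (100.0, 800.0),
    "p_positive": (0.05, 0.40),
    "followup_cost": (1500.0, 6000.0),
    "life_years_gained_if_treated": (0.05, 0.50),
}


def one_way_sensitivity(model, base_case, ranges):
    """Vary one parameter at a time across its range, holding the others at base case."""
    results = {}
    for name, (low, high) in ranges.items():
        outcomes = []
        for value in (low, high):
            scenario = dict(base_case)   # copy so the base case is never mutated
            scenario[name] = value
            outcomes.append(model(scenario))
        results[name] = (min(outcomes), max(outcomes))
    return results


base_result = cost_per_life_year(BASE_CASE)
print(f"Base case: ${base_result:,.0f} per life-year")
for name, (lowest, highest) in one_way_sensitivity(
        cost_per_life_year, BASE_CASE, PLAUSIBLE_RANGES).items():
    print(f"{name}: ${lowest:,.0f} to ${highest:,.0f} per life-year")
```

A multi-way analysis would vary two or more entries of the scenario dictionary together. Note that the sketch evaluates only the endpoints of each range, which is adequate only when the model is monotonic in each parameter, as this toy model is.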
A second problem with the use of decision models to evaluate screening for asymptomatic atherosclerosis is that, currently, the optimal sequence of testing and screening intervals is not known. Modeling the possible permutations in a decision model can become quite complex. Thus, the analyst is required to make some simplifying assumptions about the choices the clinician and patient will make.
Principles of economic analysis relevant to cardiac imaging
Medical economics and accounting provide tools to answer two important questions relevant to any new test or therapy. First, what does it cost? Second, does it provide reasonable value for money? To assess the cost question, it is necessary to estimate not only the cost of the test itself but also the stream of costs that occur because the test was used and would not have otherwise occurred (induced costs). For example, the hospital or clinic cost to perform an EBT test may be $100. If the patient receiving that test subsequently has gated single-photon emission computed tomography (SPECT) and coronary angiogram, then these later procedures should be counted as part of the total cost of the strategy of using EBT. Thus, the cost of “screening with EBT” strategy may be significantly greater than $100 per patient.
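The EBT example can be expressed as an expected cost per patient screened. In this sketch only the $100 EBT cost comes from the text; the referral proportions and the costs of SPECT and angiography are hypothetical placeholders, and the function name is our own.

```python
EBT_COST = 100.0  # cost of the EBT test itself, as in the example above

# (procedure, proportion of screened patients who go on to have it, cost per procedure)
# All three proportions and costs below are hypothetical illustrations.
INDUCED_PROCEDURES = [
    ("gated SPECT", 0.20, 800.0),
    ("coronary angiography", 0.05, 3000.0),
]


def strategy_cost_per_patient(test_cost, induced_procedures):
    """Initial test cost plus the expected cost of procedures induced by the test."""
    induced_cost = sum(proportion * cost for _, proportion, cost in induced_procedures)
    return test_cost + induced_cost


total = strategy_cost_per_patient(EBT_COST, INDUCED_PROCEDURES)
print(f"Test alone: ${EBT_COST:,.0f}; screening strategy: ${total:,.0f} per patient")
```

With these assumed referral rates, the induced costs are roughly three times the cost of the test itself, which is why the strategy, and not the test, is the appropriate unit of cost analysis.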
To examine the value question, economic efficiency analysis is used to compare the incremental costs of the test strategy with its incremental benefits in a structured format. Three forms of economic efficiency analysis can be employed: cost-effectiveness, cost-utility, and cost-benefit. All three estimate the cost of producing one extra unit of benefit with the new test strategy relative to the comparison strategy (i.e., the efficiency with which benefit is generated for money spent). The most common metric used in cost-effectiveness analysis is dollars per life-year added. Similarly, dollars per additional correct diagnosis, per high-risk patient identified, per cardiac event prevented, or per gram of myocardium salvaged are also all legitimate measures for a cost-effectiveness analysis. The major difficulty in using something other than life-years (or quality-adjusted life-years in cost-utility analysis, as described later in this document) is the lack of benchmarks with which to interpret them.
Cost-utility analysis, a modification of cost-effectiveness analysis, takes account of both the quality and quantity of life added by the new strategy. Utility is a technical term that refers to the relative value or preference of the decision maker for a given health state. Although utility is related to the concept of quality of life, the techniques to measure it are different. Quality of life is typically measured with instruments that assess either functioning or well-being in a set of domains relevant to health and health care. For example, the New York Heart Association functional class measures physical functioning (crudely), whereas the Short Form-36 (SF-36) assesses both functioning (e.g., physical, role) and well-being (e.g., emotional) in nine domains.
In contrast, utility measurement evaluates how the assessor (typically a patient with the condition of interest) values a specific health state relative to defined benchmarks, such as excellent health (valued at 1.0) and death (valued at 0). The main utility measurement techniques are the standard gamble and the time trade-off. Because these are complex to use, especially in large-scale studies, recent work has favored the use of health utility indices, such as the EuroQoL. These indices are health status measures, similar to the SF-36, that have a finite number of possible health states reflecting the unique permutations of the component scales. Each unique health state has an associated population preference or utility weight, previously measured on a relevant cohort of patients or future patients (i.e., the general public). In a cost-utility analysis, length of survival in a particular health state is combined with the utility weight for that state. For example, a year of survival with mild angina that has been given a utility weight of 0.93 would equal 0.93 (1 year × 0.93 utility) quality-adjusted life-years (QALYs).
Cost-benefit analysis is a form of economic efficiency analysis in which the incremental health benefits created by the new strategy are converted to their monetary equivalent. Because of the controversies associated with valuing health and survival in terms of money, this form of economic analysis is infrequently used in medicine.
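The QALY arithmetic generalizes to a sequence of health states. The sketch below reproduces the mild angina example from the text (1 year at a utility of 0.93) and adds a second, hypothetical trajectory; the function name and the second set of values are our own assumptions, and discounting of future years is omitted for simplicity.

```python
def quality_adjusted_life_years(health_states):
    """Sum of (years spent in a state x utility weight of that state).

    health_states: iterable of (years, utility) pairs, with utility between
    0 (death) and 1.0 (excellent health). No discounting is applied.
    """
    return sum(years * utility for years, utility in health_states)


# Example from the text: one year with mild angina, utility weight 0.93.
mild_angina_year = quality_adjusted_life_years([(1.0, 0.93)])
print(f"{mild_angina_year:.2f} QALYs")

# Hypothetical trajectory: 5 years of mild angina, then 3 years in a worse state.
trajectory = quality_adjusted_life_years([(5.0, 0.93), (3.0, 0.80)])
print(f"{trajectory:.2f} QALYs over 8 calendar years")
```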
Economic efficiency analysis is always performed incrementally, in relation to an explicitly defined alternative. In the case of screening programs, the alternative is often “no screening,” but in some situations, the relevant comparison may be with an alternative screening test or strategy. The benefits and costs of the new strategy, then, are those that occur only in the presence of the new strategy but not with the comparison strategy.
The cost of an imaging test can be subdivided into fixed and variable components. Fixed costs do not change with procedural volume over the short term. Examples include rent on testing laboratory space, leasing costs for test equipment, and salaried employees. These costs will be the same whether the laboratory is operating at capacity or sits completely idle. Variable costs change with unit changes in procedure volume. Examples include disposable supplies (including contrast agents) and personnel who are paid only for hours worked. The total cost of a given test is the variable cost plus a share of the fixed cost. Current estimated costs of cardiac imaging modalities are reported in Table 1.
For diagnostic cardiovascular imaging tests, equipment is a major component of fixed cost. Equipment acquisition costs vary widely, but may be as much as $1 million to $4 million for MR, positron emission tomography, and multislice CT scanners. In general, equipment for low technology tests (e.g., treadmill exercise or ankle brachial index) is much less expensive. Recent innovations for atherosclerosis imaging include the use of multi-slice (e.g., 16 slice) CT, higher strength (e.g., 3 tesla [T]) magnets, and MRI spectroscopic methods. In some cases, existing equipment can be upgraded at minimal to no cost. For example, most CT scanners can perform coronary calcium scoring by the addition of often low-cost software upgrades.
Several accounting methods can be used to allocate fixed costs. For example, the annual fixed cost may be distributed equally over the annual volume of cases performed. If a laboratory is expected to do 1,000 cases, each case would be allocated 1/1,000 of the annual fixed costs. Thus, higher volumes tend to lower the fixed component of test cost, at least until the volume increase necessitates leasing more space or equipment and hiring more personnel. For new technologies, many unresolved issues remain that may add costs, including laboratory standards or certification, imaging protocols, and evolving equipment (e.g., 4 vs. 16 multi-slice CT or 1.5 T to 3.0 T MR) (28,29).
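The equal-allocation method can be sketched in a few lines. The annual fixed cost and the variable cost per case below are hypothetical figures; only the 1,000-case volume comes from the text, and the function name is our own.

```python
def cost_per_case(annual_fixed_cost, annual_volume, variable_cost_per_case):
    """Total cost of one test: its variable cost plus an equal share of fixed cost."""
    if annual_volume <= 0:
        raise ValueError("annual_volume must be positive to allocate fixed costs")
    return annual_fixed_cost / annual_volume + variable_cost_per_case


# Hypothetical laboratory: $500,000 annual fixed cost, $150 variable cost per case.
ANNUAL_FIXED_COST = 500_000.0
VARIABLE_COST_PER_CASE = 150.0

for volume in (500, 1_000, 2_000):
    per_case = cost_per_case(ANNUAL_FIXED_COST, volume, VARIABLE_COST_PER_CASE)
    print(f"{volume:>5} cases per year: ${per_case:,.0f} per case")
```

The sketch holds fixed cost constant across volumes, so it applies only within the capacity of the existing space, equipment, and staff.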
Induced test costs (savings)
Total costs of a testing strategy include induced downstream costs and savings. The results of a diagnostic test may lead to one or more additional tests and therapies (9). If these would not have been used in the absence of the test, then they constitute part of the induced cost of the test. Similarly, if the test in question demonstrates that other tests and therapies, which would have been done, are not required, these constitute an induced saving of the test. If the test leads to a therapy that prevents a future myocardial infarction (MI) or revascularization procedure, these savings should similarly be counted in the test’s balance sheet. Incidental test findings also drive downstream costs of care. In a recent report by Hunold et al. (30) using EBT, noncoronary abnormal findings were noted in 53% of patients, whereas specific incidental findings (e.g., lung disease) were noted in 20% of patients. In a younger cohort, the prevalence of incidental findings was 9%, one-third of which were major findings, often requiring invasive testing (31).
Other cost components
One issue that is rarely considered in most cost analyses is that the value of screening is sensitive to patient preferences (32). This is exemplified by self-referral patterns to EBT where patients’ willingness to know and pay drive its use as a screening tool. Previous reports have noted that patients with evidence of coronary calcium are more likely to consult with their physician, engage in weight loss, decrease dietary fat intake, and initiate new aspirin and cholesterol lowering medications (33). However, this increase in care-seeking behavior may also lead to an increase in worry and lower thresholds for coronary revascularization. The net result may be an increase in overall costs of care for this population. Travel costs, relevant family labor expenses, out-of-pocket costs for home monitoring and over-the-counter health care products, and insurance deductibles are all indirect costs that should be considered in an economic evaluation.
Defining cost-effectiveness of a diagnostic test
Cost-effectiveness analysis explicitly relates incremental costs to incremental health benefits. The cost-effectiveness ratio summarizes this relationship in terms of the cost required to produce one extra unit of benefit with the new testing strategy relative to the comparison strategy. The cost-effectiveness ratio takes the general form:
$$\mathrm{CE} = \frac{C_{\mathrm{New}} - C_{\mathrm{Standard}}}{HB_{\mathrm{New}} - HB_{\mathrm{Standard}}}$$
where CE = cost-effectiveness; C = costs; HB = health benefits; New = new testing strategy; and Standard = comparison testing strategy.
The principal benchmarks for the cost-effectiveness ratio have developed through an informal consensus in the field and should not be regarded as absolute. In general, a cost-effectiveness ratio of less than $50,000 per life-year added is considered “economically attractive” (34–41), whereas a ratio greater than $100,000 per life-year added is considered “economically unattractive.” The intermediate range is an economic gray zone, and many well-accepted medical-care programs fall into this area.
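As a worked illustration, the ratio (incremental cost of the new strategy over the comparison strategy, divided by its incremental health benefit) can be computed and compared against the informal benchmarks. The per-patient costs and life expectancies below are hypothetical, the function names are our own, and the sketch handles only the usual case in which the new strategy adds health benefit.

```python
ATTRACTIVE_BELOW = 50_000.0      # dollars per life-year added, informal benchmark
UNATTRACTIVE_ABOVE = 100_000.0   # dollars per life-year added, informal benchmark


def cost_effectiveness_ratio(cost_new, cost_standard, benefit_new, benefit_standard):
    """Incremental cost divided by incremental health benefit (e.g., life-years)."""
    incremental_cost = cost_new - cost_standard
    incremental_benefit = benefit_new - benefit_standard
    if incremental_benefit <= 0:
        # With no added benefit, the ratio is not a meaningful measure of value.
        raise ValueError("new strategy adds no health benefit")
    return incremental_cost / incremental_benefit


def classify(ratio):
    """Place a ratio relative to the informal consensus benchmarks."""
    if ratio < ATTRACTIVE_BELOW:
        return "economically attractive"
    if ratio > UNATTRACTIVE_ABOVE:
        return "economically unattractive"
    return "gray zone"


# Hypothetical per-patient values: the new strategy costs $2,000 more
# and adds 0.05 life-years relative to the comparison strategy.
ratio = cost_effectiveness_ratio(12_000.0, 10_000.0, 10.05, 10.00)
print(f"${ratio:,.0f} per life-year added: {classify(ratio)}")
```

Because the benchmarks are informal and should not be regarded as absolute, the classification is best read as a rough guide and not as a decision rule.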
These cost-effectiveness benchmarks represent a statement of societal willingness to pay for incremental health benefits. Thus, it follows that countries that spend more on health care (such as the U.S.) would be willing to accept a higher threshold for defining the zone of economic attractiveness than would countries that spend less.
Use of intermediate-outcome measures
As discussed earlier, much of the existing cardiac imaging outcomes data do not effectively link test results with post-test decision making in terms of the initiation of therapies that alter the outcome of a patient. Cost-effectiveness analysis has tremendous limitations when applied to noninvasive testing because the link between diagnosis and end results is often unknown and must be simulated in a model (42).
Given the difficulty of linking testing strategies with changes in patient outcome, some have recommended the use of intermediate-outcome measures, such as the cost to identify coronary disease or a cardiac event (6). An intermediate-outcome model would require fewer assumptions and extrapolations of long-term prognosis and would rely more upon actual observational data. The major difficulty with this type of model is that it generates a cost-effectiveness ratio for which no benchmarks have been established. Furthermore, use of an intermediate-outcome measure in a cost-effectiveness ratio does not allow for comparison across an array of medical therapeutic regimens and programs, which can be useful in using economic analysis to inform policy decision making.
The available evidence is mixed as to whether atherosclerotic imaging techniques in asymptomatic individuals add important management information over and above that contained in the Framingham risk index (43,44). For example, in the Rotterdam Study, carotid intima-media thickness measured in the common carotid artery did not improve the estimation of stroke or MI over and above a standard risk factor assessment (receiver operating characteristic [ROC] curve index = 0.75 vs. 0.72), although risk factors and ultrasound measures were equally predictive (ROC curve index = 0.72 vs. 0.71).
Subgroup effects in cost-effectiveness analysis
A cost-effectiveness ratio is not a precise point estimate, although it is often presented that way, and it is sensitive to multiple demographic variables such as age, gender, the risk of the disease, and the analysis perspective (e.g., society, patient, payer) (42,45). For screening, cost-effectiveness ratios often become more favorable beyond a given age or risk level (where disease is more prevalent) (46,47). Further, the proportional benefit of drug treatment is highly related to the underlying risk in the patient population (48–50). For both these reasons, imaging screening tests are generally more cost-effective in higher-risk population subsets in which the test is diagnostically and prognostically more accurate. For example, using a decision model to simulate the cost-effectiveness of screening 1,000 men undergoing Doppler ultrasound for the detection of carotid artery disease during a 20-year time period, a one-time screening program in a high-risk subset of the population had a cost-effectiveness ratio of $35,130 per QALY gained, as compared to $52,588 per QALY gained for lower-risk individuals (51).
Lessons from screening for preclinical cancer
Given the limited data currently available on screening for atherosclerosis, it is instructive to examine lessons learned and challenges encountered in using diagnostic tests to screen for non-CAD preclinical disorders. Much work has been performed in developing screening for preclinical cancer. Like atherosclerosis, cancer is a major cause of adult morbidity and mortality, and it accounts for a substantial portion of clinical health care spending. Our review of this area is intended to be illustrative rather than comprehensive or authoritative.
Lung cancer is responsible each year for the greatest number of cancer deaths among adults in the U.S. By the time the disease becomes clinically evident, it is usually at an advanced stage. Five-year survival rates average about 15% (52). Thus, the disorder seems an ideal one to screen for preclinical early-stage resectable tumors. Initial randomized trials employed chest radiographs and sputum cytology (18). In about 37,000 male smokers over age 45, screening detected more early stage resectable tumors, and initial results suggested improved survival. However, no reduction in lung cancer mortality was ultimately demonstrated with the screening intervention. Screening appeared to achieve its objective (increased detection of early-stage preclinical disease), but ultimate outcome was unaffected. Some of the uncoupling between diagnosis and outcome has been attributed to the biology of the disease. Even small tumors, at the threshold of radiographic detectability, may have metastasized. Thus, by the time these tumors were detected by radiographic screening, they were beyond the point of surgical curability. In addition, it appears that another subgroup of tumors detected by preclinical screening was prognostically insignificant, and their early detection led to extra procedures without improving survival. In short, lung cancer screening appeared to fail because a significant proportion of tumors detected were either too advanced to cure or were clinically unimportant. A new generation of studies is examining the utility of a more sensitive screening test for lung cancer, low-dose helical CT scans, but it is unclear that this test will be able to rectify the limitations of earlier screening technologies.
Colorectal cancer is the third most common cause of cancer deaths in U.S. adults. Most of these cancers arise from adenomatous polyps, although less than 1% of such polyps give rise to cancer. As with lung cancer, the rationale for screening is that early detection and removal of preclinical cancers or precancerous polyps will increase survival. Most studies of screening for colorectal cancer have examined the utility of fecal occult blood testing, while a few have evaluated direct imaging studies such as sigmoidoscopy or colonoscopy (53). A recent systematic review of cost-effectiveness analyses of colorectal cancer screening found six relevant studies (53). Each used a simulation model to combine published outcome data with cost data from Medicare and prior published reports (53). In these models, screening with any of the major tests currently employed was economically attractive (cost-effectiveness ratios ranging from $6,000 to $40,000 per life-year saved). However, these results are dependent on the reasonableness of the starting assumptions and, for most of the screening tests examined, there is little empirical randomized trial evidence to validate the survival benefits projected by these models. The uncertainty in these models also makes it impossible to confidently identify the most economically attractive testing strategy from among the possible candidates (53).
Thus, although some favorable trial data support the use of occult blood testing to reduce colon cancer mortality, similar data are lacking for use of widely advocated imaging techniques, such as colonoscopy. Further, empirical trial support for screening for lung cancer is quite limited, as are data for screening for preclinical atherosclerosis. Screening does identify more early stage cases (i.e., it does risk stratify the population) and does lead to more invasive therapy, but the assumption, made without empirical validation, that meeting these two criteria will lead to the desired result, improved patient outcomes, is clearly not warranted. Unfortunately, screening for preclinical disease seems so “reasonable,” so much in concordance with “common sense,” that the absence of adequate proof of desired effectiveness is often overlooked. In fact, screening may become so accepted that it is considered unethical to subject the screening strategy to a randomized test (54). Economic analyses performed in this environment are often built on weak evidence and may extrapolate even beyond this base to “discover” attractive screening strategies that have never been empirically tested.
Cost effectiveness of preclinical atherosclerosis imaging: current evidence
Initial cost estimates for screening asymptomatic populations
In the U.S., approximately 30 million Americans age 50 or older may be eligible for asymptomatic atherosclerosis screening, depending on how the target population is defined (55–59). The cost of screening alone could add $3 billion to our global health care costs. Detection of high-risk abnormalities ranges from 5% to 46%, depending on the age and degree of comorbidity in the population (60,61). Therefore, additional diagnostic tests following the initial screen could substantially increase the total costs, as discussed earlier (9,30,31).
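The arithmetic behind these figures can be sketched roughly as follows. The per-person screening cost is not stated in the text; it is only what the $3 billion total implies for 30 million eligible adults. Variable names are illustrative.

```python
# Back-of-envelope sketch of the screening cost figures quoted above.
ELIGIBLE_ADULTS = 30_000_000          # approximate eligible population (from the text)
TOTAL_SCREENING_COST = 3_000_000_000  # projected screening cost in dollars (from the text)

# Implied average cost per screening test (derived here, not reported in the text).
implied_cost_per_screen = TOTAL_SCREENING_COST / ELIGIBLE_ADULTS

# Reported range for detection of high-risk abnormalities.
DETECTION_LOW, DETECTION_HIGH = 0.05, 0.46

# Number of screened adults who would be candidates for follow-up testing.
followup_low = round(ELIGIBLE_ADULTS * DETECTION_LOW)
followup_high = round(ELIGIBLE_ADULTS * DETECTION_HIGH)

print(f"Implied cost per screen: ${implied_cost_per_screen:,.0f}")
print(f"Candidates for follow-up testing: {followup_low:,} to {followup_high:,}")
```

Even at the lower bound, more than a million screened adults would be candidates for further testing, which is why downstream costs can rival or exceed the cost of the initial screen.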
Large prospective studies and published models describing the costs of screening intermediate-risk asymptomatic individuals for evidence of atherosclerosis are lacking. The deficiency of high-quality data comparing the costs and outcomes of different screening strategies poses a severe limitation for economic analysts wishing to examine the cost-effectiveness of alternative strategies. Consequently, we review the few studies that present cost analyses of the use of EBT, carotid duplex scans, and ankle brachial index measurement in lower-risk symptomatic populations. Although these often are not directly relevant, they do serve to illustrate some of the issues germane to screening in asymptomatic subjects. In addition, we present some data from a model that has not yet been published to illustrate some of the potential pressure points in using these tests in asymptomatic subjects.
Four studies have examined the use of EBT. The first adapted a published decision model of diagnostic testing to compare five different testing strategies in symptomatic, ambulatory patients being evaluated for obstructive CAD (6). The five testing strategies were angiography alone, or exercise treadmill, stress echocardiography, stress myocardial perfusion imaging, or EBT, followed by angiography as indicated. Four different cut points for EBT calcium scores were considered. The major data used to drive the model results were taken from published diagnostic sensitivities and specificities. The “cost” of each testing strategy in this analysis was the cost of the initial screening test performed plus the cost of angiography for that proportion of the population presumed to be referred following an abnormal initial screening examination. “Cost effectiveness” was calculated as the average cost of testing per correct diagnosis of CAD. In a low prevalence cohort, this analysis found the EBT strategies to have the lowest cost per CAD patient correctly identified. However, this result was simply a consequence of three key assumptions: 1) in the absence of testing, no correct diagnoses would be made and no patients would be referred for angiography; 2) the cost of EBT was about one-third of stress echocardiography or stress myocardial perfusion imaging; and 3) the accuracy of EBT was equivalent to both of these tests.
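A minimal sketch of this kind of calculation shows why the cost assumption alone can decide the result. All numeric inputs below are hypothetical and are not the published model's values; the function name and the simplification that angiography confirms every true positive are assumptions made for illustration.

```python
def cost_per_correct_diagnosis(prevalence, sensitivity, specificity,
                               test_cost, angiography_cost):
    """Average testing cost per CAD patient correctly identified.

    Simplifying assumptions (mirroring the structure described above, not the
    published model's actual inputs): every patient with an abnormal initial
    test is referred for angiography, and angiography confirms every true
    positive.
    """
    true_positives = prevalence * sensitivity               # per patient tested
    false_positives = (1 - prevalence) * (1 - specificity)  # per patient tested
    referred = true_positives + false_positives
    cost_per_patient = test_cost + referred * angiography_cost
    return cost_per_patient / true_positives


# Hypothetical inputs: identical accuracy, but one test costs a third of the other.
common = dict(prevalence=0.10, sensitivity=0.85, specificity=0.85,
              angiography_cost=3000)
cheap_test = cost_per_correct_diagnosis(test_cost=200, **common)
costly_test = cost_per_correct_diagnosis(test_cost=600, **common)

print(f"Lower-cost test:  ${cheap_test:,.0f} per correct diagnosis")
print(f"Higher-cost test: ${costly_test:,.0f} per correct diagnosis")
```

When accuracy is assumed equal, the ranking of strategies is determined entirely by the assumed test cost, which is the point made about assumptions 2 and 3.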
The second EBT analysis used a similar model to compare the cost of identifying significant CAD with exercise treadmill, myocardial perfusion imaging, or EBT in symptomatic patients with a low to intermediate pretest probability (7). This model predicted a significant cost savings per correct diagnosis with EBT. These results were similar to what the investigators observed in an empirical cohort of 207 patients with a low to intermediate probability of CAD. The results of the model were driven by assumptions of cost for EBT that were only slightly higher than for exercise testing plus an improved diagnostic accuracy.
Both of these reports present simplified models of diagnostic evaluation and contain no outcome data. A third study reported on the costs to identify coronary disease events (death or MI) in a cohort of 676 asymptomatic subjects with one or more cardiac risk factors who were referred for EBT (9). Patients were followed for an average of 3.5 years after testing. Cost estimates were based upon direct health care costs within the Hospital Corporation of America hospital system; costs were also varied in a sensitivity analysis based on prior studies. The screening EBT cost per patient was $90 (62). Total screening and treatment costs were $1,923 per patient for low-risk subjects and $4,621 per patient for intermediate-risk subjects. Screening identified 2.6 per 100 low-risk subjects who had a subsequent cardiac event and 8.9 per 100 intermediate-risk subjects with a subsequent event. The cost per event identified was $73,000 in low-risk subjects and $37,260 in intermediate-risk subjects. Considering only death events, screening identified 5 deaths per 1,000 in low-risk subjects at a cost of $402,000 per death identified. In the intermediate-risk patients, screening identified 4.3 deaths per 1,000 at a cost of $108,400 per death identified. As noted earlier, there are no benchmarks available to interpret a cost-effectiveness ratio expressed as dollars per death identified. If each one of those “deaths identified” could be converted to “lives saved” with appropriate therapy and these saved patients lived an additional 15 or 20 years (mean age of screened cohort was 51), then it is possible that this screening could be economically attractive when valued in terms of dollars per life year saved. However, the pivotal point in this entire sequence is the assumption that EBT screening identifies patients who will die and allows their deaths to be prevented. As noted in the section on lessons from screening for preclinical cancer, stratifying the risk of future clinical disease development and death with a test is not equivalent to showing that screening with that test will save lives.
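The best-case conversion from dollars per death identified to dollars per life-year saved can be sketched as below. It assumes, purely for illustration, that every identified death is prevented, that no treatment costs arise beyond those already counted, and that life-years are not discounted. None of these assumptions has been validated, and the function name is illustrative.

```python
# Cost per death identified, as reported in the study described above.
COST_PER_DEATH_IDENTIFIED = {"low-risk": 402_000, "intermediate-risk": 108_400}

# Added years of life per averted death, the range suggested in the text.
LIFE_YEARS_GAINED = (15, 20)


def best_case_cost_per_life_year(cost_per_death, years_gained):
    """Best-case ratio: assumes every identified death is actually prevented."""
    return cost_per_death / years_gained


for group, cost in COST_PER_DEATH_IDENTIFIED.items():
    for years in LIFE_YEARS_GAINED:
        ratio = best_case_cost_per_life_year(cost, years)
        print(f"{group}, {years} years gained: ${ratio:,.0f} per life-year saved")
```

These ratios look attractive only because the numerator is divided by a survival gain that has been assumed. If only a fraction of identified deaths could be prevented, the ratios would rise in inverse proportion.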
The fourth study was a detailed decision analysis of the cost-effectiveness of EBT screening and follow-up testing in a cohort of 1,000 asymptomatic 40-year-olds (8). This analysis used decision analysis methods (Fig. 1) to determine: 1) the marginal cost per detection of an “at-risk” patient, and 2) the projected marginal cost per QALY, using favorable assumptions about the efficacy of primary prevention and the independent prognostic value of EBT. “At-risk” was defined as having a probability of a coronary event greater than or equal to 1% per year. This cutoff was chosen because primary prevention has been proven to be cost-effective only when risk exceeds this threshold, therefore identifying a population in whom intervention can make a difference, which is the goal of any screening program (63). As such, the prevalence of “at-risk” participants in this cohort was 7.2% using the Framingham risk model, rising to 22.4% when incorporating the results of EBT.
The costs for all variables were as follows: further cardiovascular testing was estimated at $400 if the initial follow-up test (e.g., exercise stress test) was normal, and $1,400 if abnormal (to include a cardiac catheterization). The annual cost of medications (such as statins, beta-blockers, aspirin, and perhaps angiotensin-converting enzyme inhibitors) was assumed to be $300. The cost of incidental abnormalities ranged from $50 for a minor finding requiring only a visit or phone call to reassure the patient, to $1,200 for a major finding (often requiring invasive procedures, such as a liver biopsy for a hepatic lesion, or bronchoscopy for perihilar lymphadenopathy). The baseline cost for EBT was $400, assuming there was no repeat scanning for progression.
The marginal cost of identifying each additional patient “at risk” missed with the Framingham risk model was $9,789 in the base case. This cost per diagnosis was most sensitive to the cost of EBT itself and the cost of medications. Changing the cost of the test to $800 increased the marginal cost per diagnosis to $12,421; halving the test cost to $200 resulted in a cost per diagnosis of $8,474. Varying the annual cost of medications from $100 to $600 changed the marginal cost from $5,276 to $16,565. The cost per diagnosis was not sensitive to other variables, or to the cost or frequency of incidental findings. Simultaneously varying the cost and frequency of incidental scan findings over a wide range changed the cost per diagnosis by less than or equal to $1,500.
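The one-way sensitivity results reported above can be ranked by the width of the range each input produces, as is done in a tornado diagram. The sketch below uses only the figures quoted in the text; the dictionary and function names are illustrative.

```python
BASE_CASE = 9_789  # marginal cost per additional "at-risk" patient identified

# (result at low input value, result at high input value), as reported above.
ONE_WAY_RESULTS = {
    "EBT cost ($200 to $800)": (8_474, 12_421),
    "Annual medication cost ($100 to $600)": (5_276, 16_565),
}


def swing(results):
    """Width of the output range produced by varying a single input."""
    low, high = results
    return high - low


# Rank inputs by swing, largest first (the ordering used in a tornado diagram).
ranked = sorted(ONE_WAY_RESULTS, key=lambda name: swing(ONE_WAY_RESULTS[name]),
                reverse=True)

print(f"Base case: ${BASE_CASE:,}")
for name in ranked:
    low, high = ONE_WAY_RESULTS[name]
    print(f"{name}: ${low:,} to ${high:,} (swing ${swing((low, high)):,})")
```

Over the ranges tested, the medication cost moves the result roughly three times as much as the cost of the scan itself.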
The marginal cost per QALY saved for the base case was $86,752. This marginal cost was most sensitive to the efficacy of primary prevention, the utility placed on a year of life on medications, and the independent prognostic value of EBT. Because the purpose of any screening program is to intervene early and thereby improve outcomes, this analysis assumed a five-year decrement in survival, and a large relative risk reduction of 30%, which yields an 18-month increase in survival in those patients “at-risk.” The marginal cost-effectiveness of screening EBT is very sensitive to the relative reductions in mortality. As the efficacy of primary prevention decreases, so does the life expectancy of those at risk, and as the relative risk reduction decreases to 25%, EBT becomes dominated by the Framingham Risk Model alone. If an intervention existed that would decrease mortality by 35%, the cost per QALY would fall to $36,076. This indicates that unless early intervention can reduce mortality by at least 25%, screening EBT would not provide any added value in this analysis. Thus, in this model, screening EBT costs at least $86,700 per QALY saved, despite liberal assumptions about the efficacy of primary prevention and the added prognostic value of EBT. However, this analysis was based on the value of screening a relatively young, low-risk population. These results would not be generalizable to older populations with a greater prevalence of intermediate-risk individuals.
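For readers less familiar with the terminology, the marginal cost per QALY is the added cost of a strategy divided by the added QALYs it yields relative to its comparator, and a strategy is "dominated" when it costs more without yielding more QALYs. A minimal sketch follows; the numeric inputs are hypothetical and do not reproduce the model described above.

```python
def marginal_cost_per_qaly(added_cost, added_qalys):
    """Incremental cost-effectiveness ratio of a new strategy versus a comparator.

    Returns None when the new strategy is dominated, i.e. it costs more and
    yields no additional QALYs.
    """
    if added_cost > 0 and added_qalys <= 0:
        return None  # dominated: more costly, no more effective
    if added_qalys == 0:
        raise ValueError("no difference in QALYs; the ratio is undefined")
    return added_cost / added_qalys


# Hypothetical per-patient values, chosen only to illustrate the two cases.
print(marginal_cost_per_qaly(added_cost=500, added_qalys=0.01))    # a finite ratio
print(marginal_cost_per_qaly(added_cost=500, added_qalys=-0.002))  # None (dominated)
```

Because the denominator is small for a screening test in a low-risk cohort, modest changes in assumed treatment efficacy or in utilities can move the ratio substantially or turn it into dominance.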
The adverse impact of screening tests is something that often goes unappreciated. In this model, although the impact of incidental findings was only marginal, the impact of even small but sustained decrements in health status (as reflected by utilities) had a powerful negative effect on the cost-effectiveness of the test. There are no data that directly assess the utility of being “at-risk” owing to coronary artery calcium on EBT, or any atherosclerosis imaging test. One study has shown that having calcification was associated with increased worry and hospitalization (33). In the Beaver Dam Health Outcomes Study, Fryback et al. (64) found that patients with hypertension valued a year of life at 94.4% relative to patients without hypertension. Hypertension is a reasonable surrogate for being diagnosed as “at-risk” because in both conditions the patient is asymptomatic but requires serial follow-up and interventions, including medications. Further research is needed to better understand the impact of screening imaging on quality of life in order to incorporate the patient’s perspective into any assessment of the efficacy of screening imaging.
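To illustrate how a small utility decrement accumulates, the sketch below applies the 94.4% hypertension weight as a stand-in for the "at-risk" label. The 10-year duration is a hypothetical choice, and discounting is omitted for simplicity.

```python
def quality_adjusted_life_years(years, utility):
    """QALYs accrued over `years` lived at a constant utility weight."""
    return years * utility


AT_RISK_UTILITY = 0.944      # hypertension weight cited above, used as a surrogate
YEARS_LABELED_AT_RISK = 10   # hypothetical duration, chosen for illustration

qalys_lost = (quality_adjusted_life_years(YEARS_LABELED_AT_RISK, 1.0)
              - quality_adjusted_life_years(YEARS_LABELED_AT_RISK, AT_RISK_UTILITY))

print(f"QALYs lost over {YEARS_LABELED_AT_RISK} years: {qalys_lost:.2f}")
```

A loss of this size per labeled patient is not trivial next to the 18-month survival gain assumed for those who benefit, which helps explain why the result is so sensitive to utilities.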
One study has examined the cost of routine screening for carotid and lower extremity arterial disease in 206 patients referred for abdominal aortic aneurysm repair (65). This, of course, represents a cohort with known advanced atherosclerosis in at least one portion of the arterial tree. Cost of testing was assigned using Medicare reimbursements. Carotid duplex scans revealed significant carotid stenosis (greater than or equal to 60%) in 18% of patients. Lower-extremity Doppler studies with ankle brachial index determinations revealed significant peripheral vascular disease in 12% of patients. Seventy-one percent of patients with advanced carotid disease and 83% with advanced peripheral arterial disease had overt clinical evidence of their disease. The cost of screening was $5,445 per advanced carotid stenosis identified and $3,732 per advanced peripheral vascular disease identified. Selective screening restricted to symptomatic patients was substantially less expensive.
Serial testing or monitoring for changes in risk: use of imaging as a surrogate outcome
The analyses described so far consider simple testing strategies where a positive screening test leads to the definitive diagnostic test and therapy. However, a “real world” alternative for management of asymptomatic individuals with lesser abnormalities or with intermediate-risk imaging results is the use of serial tests. In this setting, serial testing is defined as a repeat use of the initial screening examination to identify progressive changes or improvements as a result of risk-factor reduction or other therapeutic interventions. For example, a baseline carotid MRI scan could be followed at one to two years with an additional MRI scan after intensive statin therapy. In this manner, changes from baseline to one-year on the imaging test serve as surrogate outcomes. Serial testing requires defining significant thresholds of change. The aim of serial testing is to identify patients who have progressive disease in the setting of ongoing risk-factor management and who require more aggressive management. It appears from EBT that calcium score changes of approximately 25% over one to two years are more often associated with an increased risk of nonfatal MI (66). Greater thresholds of change would be required for patients with smaller abnormalities or for modalities that are less reproducible, especially in nonexpert hands (67). Imprecision and lower reproducibility will drive unnecessary testing and costs. Serial testing at one-year intervals using Doppler ultrasound for screening of asymptomatic carotid atherosclerotic disease was found to be cost-ineffective in one study (51).
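A threshold rule of this kind might be sketched as follows. The 25% figure comes from the text above; the measurement-variability parameter and the function names are hypothetical, introduced only to show why a less reproducible modality needs a larger threshold.

```python
def percent_change(baseline_score, followup_score):
    """Relative change from baseline, in percent."""
    if baseline_score <= 0:
        raise ValueError("baseline score must be positive to compute a relative change")
    return 100.0 * (followup_score - baseline_score) / baseline_score


def exceeds_change_threshold(baseline_score, followup_score,
                             clinical_threshold_pct=25.0,
                             measurement_variability_pct=0.0):
    """Flag progression only when the change exceeds both the clinical
    threshold and the (hypothetical) test-retest variability of the modality."""
    required_pct = max(clinical_threshold_pct, measurement_variability_pct)
    return percent_change(baseline_score, followup_score) >= required_pct


# A 30% rise is flagged with a reproducible test, but not when the assumed
# test-retest variability (40%) is larger than the observed change.
print(exceeds_change_threshold(100, 130))
print(exceeds_change_threshold(100, 130, measurement_variability_pct=40.0))
```

Setting the threshold below the modality's variability would flag measurement noise as progression and trigger the unnecessary testing and costs noted above.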
In the use of any imaging modality for serial monitoring, subsequent medical management or risk-reducing strategies should be clearly identified. To date, medical management following asymptomatic screening and based upon evidence of subclinical disease or other risk markers has not been adequately evaluated. However, statin treatment has been reported to halt progression of atherosclerotic disease, as determined by a number of cardiac imaging modalities (68–72). In a recent crossover design clinical trial in 66 patients with coronary calcium (low-density lipoprotein [LDL] greater than 130 mg/dl) receiving cerivastatin (0.3 mg/day), the median annual relative increase at 14 months in coronary calcium was 25% during the untreated versus 9% during the treatment period (p is less than 0.0001) (70). None of these prior reports, however, have considered the specifics of medical management nor examined marginal differences between 1) one or more atherosclerosis imaging techniques (e.g., CT vs. MRI) as compared with 2) the Framingham risk equation and, possibly, emerging low-cost laboratory parameters (e.g., high sensitivity C-reactive protein).
Establishing clinical pathways for testing and downstream procedure use
One additional strategy is to examine a clinical pathway of care that includes the initial clinical risk assessment, screening test, and follow-up diagnostic procedure. Fayad et al. (73) have proposed one such approach where low-cost treadmill exercise electrocardiographic testing is recommended for those patients with an intermediate CT calcium score. Additionally, for patients who have a high-risk CT calcium score, CT angiography and MR plaque characterization are recommended. This strategy attempts to allocate more expensive resources to those higher-risk individuals. As with many of the other strategies discussed above, no data on the economics of this management strategy have yet been presented, and large-scale outcome studies remain to be done.
Health policy implications and conclusions
Traditionally, medical decisions are made at the patient-physician level and focus on risks and benefits for the individual patient. Doctors are often poorly informed about the costs of tests and therapies they use, and patients are often insulated from these costs by insurance. Advances in medical diagnosis and therapy tend to progressively increase medical costs. However, payers are increasingly unwilling to spend more resources on health care. Theoretically, at least, these conflicts are resolved at the policy level. Policymakers are supposed to translate societal desires for health care and societal willingness to pay into a coherent program.
Finally, economic analysis is primarily a tool to inform the health policy debate. High-quality economic analysis, in turn, is heavily dependent on high-quality clinical outcome data. Currently, screening is an accepted strategy for reducing the morbidity and mortality of certain serious diseases through early detection and intervention (74). In the arena of screening for preclinical atherosclerosis, however, neither the clinical database nor the economic data have reached a satisfactory level of maturity. Thus, whether atherosclerosis imaging techniques could further reduce coronary heart disease mortality at an economically attractive price remains to be established.
Future directions
1. Cost-effectiveness data are increasingly being applied to the evaluation of imaging technology. A requisite amount of high-quality clinical effectiveness data is necessary for the determination of an added economic benefit. To date, for atherosclerosis imaging, there is a paucity of high-quality clinical outcomes and economic data for review. Thus, an important need exists for long-term outcomes data to be developed for all of the newer imaging modalities in order to inform potential models of cost-effectiveness.
2. Standards for defining cost-effectiveness include the amount of resources or costs required so as to achieve a given clinical benefit. Such standards have been developed from therapeutic intervention data and models. Benchmarks and thresholds for defining cost-effective care defined by those standards may not be directly applicable to the use and application of imaging modalities to detect subclinical atherosclerosis and define risk of future events. As such, professional societies and stakeholder government agencies as well as senior leaders in health care economic analysis should convene to create and define standards for evaluating imaging procedures with regard to costs and outcomes.
3. Current clinical and economic effectiveness analyses are hampered by a lack of clinical algorithms with noted inputs for serial testing, post-test treatment strategies, resultant proportional risk reduction, as well as induced resource consumption levels with a variety of atherosclerosis imaging modalities. Future research in the area of atherosclerosis imaging must provide more definitive data regarding the links between the initial imaging procedure and its results and subsequent downstream testing and treatment effectiveness.
4. The aim of a cost-effectiveness analysis is to guide health care payers and regulators in evaluating new therapies and technologies when setting standards for use, determining reimbursement, and approving use. Substantial additional data are needed for virtually all currently available and developing modalities of atherosclerosis imaging before any technique can be supported as cost-effective.
- American College of Cardiology Foundation
- Medicare allowed charge data for procedures commonly performed by cardiologists. www.acc.org/advocacy/advoc%5Fissues/impactchart.htm. Accessed on March 24, 2003
- O’Rourke R.A.,
- Brundage B.H.,
- Froelicher V.F.,
- et al.
- Rumberger J.A.,
- Behrenbeck T.,
- Breen J.F.,
- Sheedy P.F.
- O’Malley P.G.,
- Greenberg B.,
- Taylor A.J.
- Shaw L.J.,
- Callister T.,
- Raggi P.
- Mark DB. Decision making in clinical medicine. In: Braunwald E, editor. Harrison’s Principles of Internal Medicine. 15th ed. The McGraw-Hill Companies, 2001;8–14
- (1991) Guidelines and indications for coronary artery bypass graft surgery. A report of the American College of Cardiology/American Heart Association Task Force on Assessment of Diagnostic and Therapeutic Cardiovascular Procedures (Subcommittee on Coronary Artery Bypass Graft Surgery). J Am Coll Cardiol 17:543–589.
- Kaplan E.,
- Meier P.
- Cox DR. Regression models and life tables. J R Stat Soc 1972;34:187–220
- Blackstone EH, Naftel DC, Turner MEJ. The decomposition of time-varying hazard into phases, each incorporating a separate stream of concomitant information. J Am Stat Assoc 1986;81:615–24
- Oei H.H.,
- Vliegenthart R.,
- Hak A.E.,
- et al.
- Kajinami K.,
- Seki H.,
- Takekoshi N.,
- Mabuchi H.
- Tosteson A.N.,
- Weinstein M.C.,
- Hunink M.G.,
- et al.
- Mowatt G.,
- Bower D.J.,
- Brebner J.A.,
- Cairns J.A.,
- Grant A.M.,
- McKee L.
- Hunold P.,
- Schmermund A.,
- Seibel R.M.,
- Gronemeyer D.H.,
- Erbel R.
- Gold MR, Siegel JE, Russell LB, Weinstein MC. Cost-effectiveness in health and medicine. Oxford University Press, New York 1996
- Office of Technology Assessment
- Goldman L.,
- Garber A.M.,
- Grover S.A.,
- Hlatky M.A.
- del Sol A.I.,
- Moons K.G.,
- Hollander M.,
- et al.
- Detrano R.C.,
- Wong N.D.,
- Doherty T.M.,
- et al.
- Krumholz H.M.,
- Weintraub W.S.,
- Bradford W.D.,
- Heidenreich P.A.,
- Mark D.B.,
- Paltiel A.D.
- Derdeyn C.P.,
- Powers W.J.
- American Cancer Society
- Fedder D.O.,
- Koro C.E.,
- L’Italien G.J.
- Centers for Disease Control. www.cdc.gov/nccdphp/statbook/pdf/section3.pdf. 2003
- Wilson P.W.,
- D’Agostino R.B.,
- Levy D.,
- Belanger A.M.,
- Silbershatz H.,
- Kannel W.B.
- Greenland P.,
- Smith J.S. Jr.,
- Grundy S.M.
- Grundy S.M.,
- Pasternak R.,
- Greenland P.,
- Smith S. Jr.,
- Fuster V.
- Newman A.B.,
- Naydeck B.L.,
- Sutton-Tyrrell K.,
- Feldman A.,
- Edmundowicz D.,
- Kuller L.H.
- Shaw LJ, Raggi P, Schisterman EF, Berman DS, Callister TQ. Prognostic value of cardiac risk factors and coronary artery calcium screening for all-cause mortality. Radiology. In press
- Fryback D.G.,
- Lawrence W.F.,
- Martin P.A.,
- Klein R.,
- Klein B.E.
- O’Leary D.H.,
- Polak J.F.,
- Kronmal R.A.,
- Manolio T.A.,
- Burke G.L.,
- Wolfson S.K. Jr.
- Achenbach S.,
- Ropers D.,
- Pohle K.,
- et al.
- Smilde T.J.,
- van Wissen S.,
- Wollersheim H.,
- Trip M.D.,
- Kastelein J.J.,
- Stalenhoef A.F.
- Corti R.,
- Fayad Z.A.,
- Fuster V.,
- et al.
- Fayad Z.A.,
- Fuster V.,
- Nikolaou K.,
- Becker C.
- Shaw L.J.,
- Julvagh S.L.,
- Jacobson C.,
- et al.