- Received December 7, 2011
- Revision received March 5, 2012
- Accepted April 2, 2012
- Published online July 24, 2012.
- Reprint requests and correspondence: Dr. Sharon Einav, Shaare Zedek Medical Center Affiliated With the Hebrew University, P.O. Box 3235, Jerusalem 91031, Israel
Objectives The aim of this study was to determine the added value of the serum biomarkers S100 and neuron-specific enolase to clinical characteristics for predicting outcome after out-of-hospital cardiac arrest.
Background Serum S100 beta (S100B) and neuron-specific enolase concentrations rise after brain injury.
Methods A prolective observational study was conducted among all adult survivors of nontraumatic out-of-hospital cardiac arrest admitted to 1 hospital (April 3, 2008 to April 3, 2011). Three blood samples (on arrival and on days 1 and 3) were drawn for biomarkers, contingent on survival. Follow-up continued until in-hospital death or discharge. Outcomes were defined as good (Cerebral Performance Category score 1 or 2) or poor (Cerebral Performance Category score 3 to 5).
Results A total of 195 patients were included (65.6% men, mean age 73 ± 16 years), with presenting rhythms of asystole in 61.5% and ventricular tachycardia or ventricular fibrillation in 24.1%. Only 43 patients (22.0%) survived to hospital discharge, 26 (13.3%) with good outcomes. Patients with good outcomes had significantly lower S100B levels at all time points and lower neuron-specific enolase levels on days 1 and 3 compared with those with poor outcomes. Independent predictors at admission of a good outcome were younger age, a presenting rhythm of ventricular tachycardia or ventricular fibrillation, and lower S100B level. Predictors on day 3 were younger age and lower day 3 S100B level. The area under the receiver-operating characteristic curve of the admission-day model was 0.932 with and 0.880 without biomarker data (p = 0.027 for the difference).
Conclusions Risk stratification after out-of-hospital cardiac arrest using both clinical and biomarker data is feasible. The biomarkers, although adding an ostensibly modest 5.2% to the area under the receiver-operating characteristic curve, substantially reduced the level of uncertainty in decision making. Nevertheless, current biomarkers cannot replace societal considerations in determining acceptable levels of uncertainty. (Protein S100 Beta as a Predictor of Resuscitation Outcome; NCT00814814)
- biological markers
- brain injuries
- cardiac arrest
- cardiopulmonary resuscitation
- clinical decision making
- NSE protein
- S100 proteins
Hypoxic brain injury remains a leading cause of mortality and morbidity after cardiopulmonary arrest with return of spontaneous circulation (ROSC) (1,2). Survivors of cardiac arrest often require lengthy intensive care admission, rehabilitation, and ongoing treatment of chronic complications as a result of poor functional outcomes (3); however, correct prediction of such outcomes remains elusive.
Protein S100 beta (S100B) is a calcium-binding protein expressed mainly in human astroglial cells. Because astroglia are as sensitive as neurons to hypoxia, serum S100B levels have the potential to be a surrogate marker for neuronal damage and damage to the blood-brain barrier. S100B is eliminated by the kidneys (4) and has an estimated biological half-life of 2 h (5); thus, constant elevation of S100B level in the serum reflects its continuous release from damaged tissue.
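As a rough illustration of why a sustained elevation implies ongoing release, the sketch below assumes simple first-order elimination with the 2 h half-life quoted above. The function name, the single-compartment assumption, and the time points are illustrative and are not taken from the study.

```python
def fraction_remaining(hours_elapsed: float, half_life_hours: float = 2.0) -> float:
    """Fraction of an initial serum level left after first-order elimination."""
    return 0.5 ** (hours_elapsed / half_life_hours)


# Assumption: a single release at time 0 and no further release afterwards.
for hours in (2, 6, 12, 24):
    print(f"{hours:>2} h: {fraction_remaining(hours):.4%} of the initial level remains")
```

Under these assumptions, less than 0.1% of a single surge would remain after 24 h, so a level that stays high on day 1 or day 3 is hard to explain without continued release.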
Neuron-specific enolase (NSE) is a dimeric intracellular enzyme of glucose metabolism localized predominantly in neuronal cytoplasm (6). Serum NSE levels rise more slowly than S100B levels but are more specific for neuronal damage (7). The guidelines of the American Academy of Neurology for predicting outcomes in comatose survivors of cardiac arrest state that “serum NSE levels >33 μg/l at days 1–3 after cardiopulmonary arrest accurately predict poor outcome” (1). These guidelines, based on a systematic review of the existing published research, state that there are insufficient data to support or refute the use of other markers for prognostication and recommend the performance of additional research on biomarkers.
S100B and NSE serum levels rise in clinical situations reflecting each of the 3 classic models of brain injury: hypoxia (8), trauma (9–11), and ischemia (12–16). In the past decade, several commercial methods for measuring the blood concentrations of both S100B and NSE have been developed, making the use of these biomarkers technically simple. Despite this, neither biomarker is being used in clinical practice because of concerns regarding their discriminant power.
The purpose of the present study was to determine how biomarker and clinical data may be integrated to develop a model for the early prediction of outcome at hospital discharge and consequently to inform clinical decision making (“outcome” meaning neurologically intact survival vs. in-hospital death or survival with serious residual neurological deficit). The hypothesis was that clinical and laboratory characteristics, the latter determined by commercially available kits and recorded at clinically convenient times (on arrival, on the morning of day 1, and on the morning of day 3) in a nonselective population can be used to develop models for various levels of certainty in classification. The timing of the blood draws was based on previous observations regarding maximal increases in S100 and NSE levels after brain damage, which do not occur concurrently (12–16). Because the present study's aim was to assess how joint modeling of clinical and biomarker data may inform decision making, both biomarkers were included to more fully capture the laboratory picture. The models serve to demonstrate how clinicians may select the likelihood of misdiagnosis (of a patient who would most probably survive with a good neurological outcome) acceptable in their clinical practice.
Within the framework of an extensive ongoing study of nontraumatic out-of-hospital cardiac arrest (OHCA) in the Jerusalem district, this prolective study was conducted over a period of 3 years (April 3, 2008 to April 3, 2011). After approval was obtained from the local institutional review board, data were recorded in real time on all patients with nontraumatic OHCA, age ≥18 years, who were brought after ROSC to the Shaare Zedek Medical Center.
Clinical setting and inclusion and exclusion criteria
Shaare Zedek Medical Center, the second largest hospital in the Jerusalem district, is a 600-bed, university-affiliated acute care facility. All patients with OHCA who underwent attempted resuscitation within the Jerusalem district and survived to admission to Shaare Zedek Medical Center were eligible for blood collection. Excluded were patients whose arrest was triggered by acute hemorrhage, hanging, or drowning.
Definitions
OHCA was defined as the absence of either spontaneous respiration or palpable pulse or both, documented by the national emergency medical services while attending an emergency call at any location within the Jerusalem district that was not an acute care facility. Arrival was the time a patient was admitted to the hospital. Day 0 was the day the event occurred (until midnight). Day 1 was the day after the event occurred (midnight to 11:59 pm); further days were defined accordingly. A poor outcome was defined as a Cerebral Performance Category (CPC) score of 3, 4 (severe neurological impairment), or 5 (death) at the time of hospital discharge. A good outcome was defined as a CPC score of 1 or 2 (good to moderate neurological outcome) at the time of discharge from the hospital (17–19).
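The outcome and study-day definitions above can be sketched in code as follows. This is a minimal illustration, not the study's data-handling procedure; the function names and the example timestamps are hypothetical.

```python
from datetime import datetime


def outcome_from_cpc(cpc_score: int) -> str:
    """Dichotomize a discharge CPC score: 1 or 2 is 'good', 3 to 5 is 'poor'."""
    if cpc_score not in (1, 2, 3, 4, 5):
        raise ValueError(f"CPC score must be 1 to 5, got {cpc_score}")
    return "good" if cpc_score <= 2 else "poor"


def study_day(event_time: datetime, sample_time: datetime) -> int:
    """Calendar-day index of a sample; day 0 is the event date (until midnight)."""
    return (sample_time.date() - event_time.date()).days


event = datetime(2010, 5, 1, 23, 30)  # hypothetical arrest late in the evening
print(outcome_from_cpc(2), outcome_from_cpc(3))      # good poor
print(study_day(event, datetime(2010, 5, 2, 7, 0)))  # 1
```

Because days are counted by calendar date rather than by elapsed hours, a "day 1" sample may be drawn only a few hours after a late-evening arrest.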
No change was made in standard patient therapy. Jerusalem district emergency medical services resuscitation is performed in accordance with American Heart Association guidelines. All resuscitation attempts are recorded at the location of the arrest on standard forms by emergency medical services staff members. Event data collection was in accordance with accepted guidelines (20). Additional data were extracted on a daily basis from hospital emergency department and admission files as required.
Contingent on patient survival, blood samples for serum S100B and NSE levels were drawn at the following times: hospital arrival, the morning of day 1, and the morning of day 3. Because it was our intention to examine the value of biomarker testing as part of routine patient treatment, sampling times coincided either with the establishment of intravenous access or with routine blood testing for noninvestigational purposes.
Throughout the period of data collection, both treating staff members and investigators were blinded to the results of the investigational blood tests. Outcome was assessed within the 24 h before discharge using the CPC score (21). CPC score was used as the instrument for assessing outcome because a high CPC score implies a very low likelihood of a good Health Utilities Index score (22). CPC score was assessed by a single trained research nurse (N.K.) and was not recorded in the patient's chart. Patients were followed by study staff members until either hospital discharge or death, whichever occurred first.
Laboratory testing of blood samples
Serum S100B and NSE were measured on the LIAISON analyzer (DiaSorin, Saluggia, Italy) using 2 different DiaSorin immunometric chemiluminescence assays (sandwich principle) based on paramagnetic particles coated with monoclonal antibodies and a monoclonal tracer antibody labeled with an isoluminol derivative. The light signal, and hence the amount of isoluminol-antibody conjugate, is measured by a photomultiplier as relative light units and is indicative of the sample concentration of S100B or NSE. Interassay coefficients of variation were <15% and <5% for S100B and NSE, respectively.
Primary outcome measure
Poor versus good patient outcome at discharge (see “Definitions”) was the measure used to test the study hypotheses of improved prediction attributable to S100B and NSE concentration.
The study cohort included all enrolled patients. Descriptive statistics were used for patient sociodemographic and event characteristics. Categorical variables (e.g., patients' sex and survival status) are expressed as percentages. Numerical variables (e.g., patient age, biomarker levels) are presented with their means, standard deviations, medians, interquartile ranges, and minimum and maximum values. The nonparametric Friedman rank test was used to compare biomarker levels across the sampling times within each outcome group, with correction of p values to account for multiple testing.
Unadjusted differences between the 2 outcome groups (poor outcome [CPC score 3 to 5] vs. good outcome [CPC score 1 or 2]) in the biomarker variables (S100B and NSE levels) were assessed using the Mann-Whitney U test (because their distributions were not normal), with the Bonferroni correction for multiple comparisons. Each of the 6 tests was performed with α1 = 0.05/6 = 0.0083. Biomarker data are presented using box plots.
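The Bonferroni threshold quoted above can be checked with a short sketch. It only covers the threshold arithmetic, not the Mann-Whitney U test itself. The reading of the 6 tests as 2 biomarkers at 3 sampling times is an assumption, and the second p value is hypothetical.

```python
FAMILY_ALPHA = 0.05
N_TESTS = 6  # assumed: 2 biomarkers (S100B, NSE) x 3 sampling times


def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    return family_alpha / n_tests


def is_significant(p_value: float) -> bool:
    """Compare a single unadjusted p value with the per-test threshold."""
    return p_value < bonferroni_alpha(FAMILY_ALPHA, N_TESTS)


print(round(bonferroni_alpha(FAMILY_ALPHA, N_TESTS), 4))  # 0.0083
print(is_significant(0.059))  # False (the arrival NSE comparison reported in the results)
print(is_significant(0.004))  # True (hypothetical p value)
```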
The main analysis used multivariate logistic regression modeling (forward stepwise) to predict the probability of a poor outcome. Interactions between the presenting rhythm of ventricular tachycardia (VT) or ventricular fibrillation (VF) and biomarker levels were sought and not found. After determining the acceptable specificity of the relevant combination of variables for predicting a poor outcome (to minimize the likelihood of misclassification of a patient with survival potential), prediction models for a poor outcome (yes or no) were created for day 0 (decision support for the Department of Emergency Medicine) and for day 3 (decision support for the intensive care unit or ward staff). The day 0 model included all patients who contributed blood samples on arrival. Explanatory variables were the corresponding biomarker values and patient age, sex, and presenting rhythm. The day 3 model included all survivors to day 3 who had contributed 2 blood samples; to derive a more stable estimate, a model was created on the basis of the 74 observations available for days 1 and 3 rather than the 66 observations available for admission and day 3. Explanatory variables included the biomarker values of samples 2 and 3; patient age, sex, and presenting rhythm; and treatment with therapeutic hypothermia. Because of their non-normal distribution, biomarker levels underwent logarithmic transformation before inclusion in the models.
A receiver-operating characteristic (ROC) curve analysis was performed for each model, yielding both an estimate of the area under the curve (AUC) and cutoff values that can be used for prediction. DeLong's 1-sided test was used to compare the ROC curves with the biomarker data to those without. Data were analyzed using SPSS version 18.0 (SPSS, Inc., Chicago, Illinois) and R version 2.13.1 (23,24).
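For readers unfamiliar with the AUC, the sketch below computes it directly as the proportion of poor-outcome and good-outcome pairs that a model ranks correctly. It is a plain-Python illustration on made-up probabilities, not the SPSS or R procedure used in the study, and it does not implement DeLong's test.

```python
def roc_auc(scores, labels):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted probabilities of a poor outcome (1 = poor, 0 = good).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
clinical_only = [0.9, 0.7, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1]
with_biomarker = [0.95, 0.8, 0.7, 0.6, 0.5, 0.3, 0.2, 0.1]

print(roc_auc(clinical_only, labels))   # 0.9375
print(roc_auc(with_biomarker, labels))  # 1.0
```

In this toy example the model with the biomarker ranks every pair correctly, whereas the clinical-only model misranks 1 of the 16 pairs.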
Of the 250 patients who were screened, 55 patients were excluded because they arrived at the emergency department in an agonal state, 195 were eligible for study inclusion, and 184 contributed blood samples. The demographic and event details of the eligible compared with the ineligible patients are shown in Table 1. The eligible population was slightly younger but otherwise did not differ markedly from their ineligible counterparts.
Participants were mostly men (n = 128 [65.6%]), with a mean age of 73 ± 16 years (range: 19 to 111 years). Their presenting rhythm was most often asystole (n = 120 [61.5%]), VT or VF (n = 47 [24.1%]), or pulseless electrical activity (n = 24 [12.3%]). Of the 195 patients studied, 39 died in the emergency department, an additional 113 patients died later during admission, and 43 (22.0%, or 17.2% of the 250 patients screened) survived to hospital discharge. Thirty of the 47 patients who presented with VF were treated with hypothermia.
Biomarkers were sampled as follows: from 158 of the 195 patients on arrival (in 37 cases, the research nurse was not notified regarding patient arrival), 32 of whom survived (20%), 19 with good outcomes (12%); from 101 of the 126 patients alive on day 1, 40 of whom survived (40%), 25 with good outcomes (25%); and from 74 of the 87 patients alive on day 3, 37 of whom survived (50%), 24 with good outcomes (32%) (Fig. 1).
Median S100B levels were consistently higher at each time point in patients with poor outcomes (7.7, 1.8, and 1.4 μg/l at admission, day 1, and day 3, respectively) than in patients with good outcomes (2.3, 0.3, and 0.2 μg/l, respectively) (p < 0.0083 for each between-group comparison) (Fig. 2A).
Median S100B levels decreased sharply from admission to day 1 within each group (corrected p = 0.002 and p = 0.006 for the poor and good outcomes, respectively). The within-group change from admission to day 3 was steep in both groups (corrected p = 0.002 and p < 0.001, respectively), whereas the decline from day 1 to day 3 was minimal.
Median NSE levels at arrival did not differ significantly between patients with poor and good outcomes (37 and 28 μg/l, respectively, p = 0.059), but larger differences emerged subsequently (35 vs. 22 μg/l for day 1 and 61 vs. 16 μg/l for day 3) (p < 0.0083 for days 1 and 3) (Fig. 2B).
NSE levels declined in patients with good outcomes (Friedman rank test p = 0.007) and increased in patients with poor outcomes (p = 0.022). Significant changes were observed in both groups on day 3, although in opposite directions (corrected p = 0.018 and p = 0.008 for poor and good outcomes, respectively, for the difference between arrival and day 3).
Day 0 model (n = 158)
Table 2 presents the factors predicting a poor outcome on arrival. Model estimation without the biomarkers (with age [>67 or ≤67 years, i.e., the youngest age tertile] and VT or VF [yes or no] as the only explanatory variables) resulted in an AUC of 0.880 (95% confidence interval: 0.806 to 0.954). In a model that included both biomarkers, the level of NSE became nonsignificant, and the model retained only 3 significant variables: age, VT or VF, and the level of S100B at admission. ROC analysis performed on this model yielded an AUC of 0.932 (95% confidence interval: 0.887 to 0.976) (Fig. 3). Using DeLong's 1-sided test to compare the 2 correlated ROC curves with and without S100B yielded a p value of 0.027.
The model was used to tabulate the probability cutoff values (for a poor outcome) with their respective sensitivities, specificities, and positive and negative predictive values (Online Appendix). For example, if the physician decides that specificity must be at least 0.90, then for patients with values above 0.92, we predict a poor outcome. The sensitivity at this cutoff value was 0.79, and the exact specificity was 0.95 (i.e., there is a 5% likelihood of misdiagnosing a patient with a good prognosis).
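The tabulation described above can be sketched as follows: for each candidate probability cutoff, a poor outcome is predicted when the model probability exceeds the cutoff, and the 4 accuracy measures are computed from the resulting counts. The probabilities and outcomes below are hypothetical and do not reproduce the Online Appendix.

```python
def classification_metrics(probabilities, is_poor_outcome, cutoff):
    """Sensitivity, specificity, PPV, and NPV when 'poor' is predicted for probability > cutoff."""
    tp = fp = tn = fn = 0
    for prob, poor in zip(probabilities, is_poor_outcome):
        predicted_poor = prob > cutoff
        if predicted_poor and poor:
            tp += 1
        elif predicted_poor and not poor:
            fp += 1
        elif poor:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical model probabilities of a poor outcome and the observed outcomes.
model_probabilities = [0.99, 0.97, 0.95, 0.93, 0.80, 0.94, 0.60, 0.30, 0.20, 0.10]
poor_flags = [True] * 5 + [False] * 5

for cutoff in (0.50, 0.92, 0.96):
    metrics = classification_metrics(model_probabilities, poor_flags, cutoff)
    print(f"{cutoff:.2f}", "  ".join(f"{k}={v:.2f}" for k, v in metrics.items()))
```

Raising the cutoff trades sensitivity for specificity, which is the choice the text leaves to the physician.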
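Other cutoff values (that are meaningful to clinicians and socially acceptable) can be determined for selected values of specificity and sensitivity. For the aforementioned cutoff value of 0.92, we derived the inequality BX > ln [0.92/(1 − 0.92)] from the model of the poor outcome probability estimated at admission (Pr), which for a logistic regression model takes the standard form Pr = exp(BX)/[1 + exp(BX)], where BX is the linear combination of the explanatory variables and their estimated coefficients (Table 2).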
From this inequality, cutoff values for predicting a poor outcome despite the achievement of ROSC and survival to hospital admission for various combinations of our variables may be determined (Table 3).
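The step from a probability cutoff to a biomarker cutoff can be illustrated with a short sketch. The coefficients below are hypothetical placeholders chosen only to make the example run; they are not the fitted values of Table 2. The use of the natural logarithm for the S100B transformation is also an assumption, because the text does not state the base.

```python
import math


def logit(probability: float) -> float:
    """Log-odds of a probability; this is the right-hand side of BX > ln[p/(1 - p)]."""
    return math.log(probability / (1.0 - probability))


# HYPOTHETICAL coefficients, for illustration only (not the study's estimates).
INTERCEPT = -1.0
B_AGE_OVER_67 = 1.5
B_VT_VF = -2.0
B_LOG_S100B = 1.2


def linear_predictor(age_over_67: int, vt_or_vf: int, s100b_ug_per_l: float) -> float:
    """BX for one patient, with S100B entered after (assumed natural) log transformation."""
    return (INTERCEPT + B_AGE_OVER_67 * age_over_67 + B_VT_VF * vt_or_vf
            + B_LOG_S100B * math.log(s100b_ug_per_l))


def s100b_cutoff(age_over_67: int, vt_or_vf: int, probability_cutoff: float = 0.92) -> float:
    """S100B level above which the predicted probability exceeds the chosen cutoff."""
    remainder = (logit(probability_cutoff) - INTERCEPT
                 - B_AGE_OVER_67 * age_over_67 - B_VT_VF * vt_or_vf)
    return math.exp(remainder / B_LOG_S100B)


print(round(logit(0.92), 3))  # 2.442
for age_flag in (0, 1):
    for rhythm_flag in (0, 1):
        print(age_flag, rhythm_flag, round(s100b_cutoff(age_flag, rhythm_flag), 2))
```

With real coefficients, the loop would yield one S100B cutoff per combination of age group and presenting rhythm, which is the kind of table the text describes.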
Day 3 model
Among the patients who survived to day 3, the only variables to contribute independently to the model were a presenting rhythm of VT or VF and the serum level of S100B measured on day 3 (Table 4). ROC analysis performed on the day 3 model yielded an AUC of 0.931 (95% confidence interval: 0.873 to 0.989) (Fig. 4). Using the same method as above (the table derived from this model is not shown), we may determine that if the presenting rhythm was not VT or VF, S100B > 0.195 μg/l predicts a poor outcome, and if the presenting rhythm was VT or VF, serum S100B > 0.566 μg/l predicts a poor outcome (sensitivity 0.86, specificity 0.92).
The present study demonstrates how modeling of clinical data together with neurological biomarker values can assist in predicting outcome after OHCA within ethically acceptable safety margins. A local cohort of post-ROSC patients provided pilot estimates of cutoff values for S100 levels on arrival to the emergency department. Although biomarker data independently contributed an ostensibly modest 5.2% to the AUC, they substantially reduced the probability of misclassification error compared with that based solely on clinical criteria. When a presenting rhythm of VT or VF and the lowest age tertile were added as examples of key clinical covariates, cutoff values for S100 could be tabulated with their specificity and sensitivity characteristics and predictive values, providing a range of therapeutic limits to be discussed by ethicists and policy makers.
This study is unique in its examination of the effect size of adding biomarkers to clinical data for prognostication after cardiac arrest. Previous studies of S100B and NSE levels on hospital arrival included smaller sample sizes (13,14,25–28). Sampling of biomarkers at time of hospital arrival is rarely performed; the few studies that did sample S100B at this time and sought outcomes other than death also identified significant between-group differences in S100B levels at this time (13,14,25). Although a larger number of studies have examined NSE levels on arrival, their findings were not definitive (29). The present study takes the additional step of demonstrating how these data can be used by clinicians; all patients were sampled at the time of routine blood testing, and no change was made in care to emulate a real-life situation. In most studies, S100B and NSE were sampled at preset times after suspected onset of cardiac arrest. Our study end point was differentiation between poor and good outcomes, the latter including only surviving patients with good neurological outcomes. Shinozaki et al. (29) found 16 studies addressing the clinical usefulness of NSE or S100B as a prognostic predictor of neurological outcomes and 5 of functional outcomes; rarely did any study that involved blood sampling both on arrival and as an integral part of routine care (rather than at pre-specified times from arrest or ROSC) also seek an outcome other than death.
Several studies have merged biomarker and clinical data. Zingler et al. (25) investigated NSE and S100B on days 1, 2, 3, and 7 as well as somatosensory-evoked potentials recorded within 48 h and on day 7 after ROSC (n = 27) but did not integrate these data into a single model. Prohl et al. (30) performed a multivariate logistic regression analysis of NSE and S100B levels together with standardized clinical examinations (days 2 to 4) and short-latency and long-latency somatosensory-evoked potentials (n = 80). They constructed a model in which 85% of the variance in the neurological outcome (dichotomized Glasgow-Pittsburgh CPCs) was explained by age together with the clinical examination score and the level of NSE on day 4. Grubb et al. (31) performed a multiple regression on S100B and NSE concentrations (sampled within 24 to 48 h of ROSC), age, a social deprivation score, National Adult Reading Test score, and a local clinical prognostic score (including a presenting rhythm of VT or VF, bystander cardiopulmonary resuscitation, and Glasgow Coma Scale score on admission). They found that NSE and S100 concentrations were significant independent predictors of the Rivermead Behavioural Memory Test score, which was the principal outcome measure in their study (n = 105).
NSE levels on arrival do not consistently correlate with poor outcomes (29). It is therefore unsurprising that determination of NSE in addition to S100B at this time did not improve the likelihood of a correct prediction. However, NSE rises more slowly than S100B and should therefore have been of value in the day 3 model. Our seemingly unexpected finding may stem from the large number of in-hospital deaths, which led to a small sample size and hence limited power for the day 3 analyses. We also a priori selected the model on the basis of a greater number of cases (day 3 rather than day 1) to achieve a more stable estimate; in this model, NSE was not contributory. However, others have also suggested that S100B may be the better marker of the 2 (32).
The present study had several limitations, the key one being the need for a larger sample size to generate stable probability estimates. Several investigators have suggested that the time courses of S100B and NSE may provide more useful information than isolated sampling (29,31). We found no advantage to this method as assessed by repeated-measures analysis, perhaps because of the high in-hospital mortality rates in our sample, which diminished our sample size and thus the study power for day 3 analysis. A multicenter study is needed to increase both the stability (i.e., sample size) and the generalizability of the estimates. Lack of a validation cohort limits drawing stable inferences regarding survival from the sample reported here. However, a goal of this study was to demonstrate how joint modeling of clinical and biomarker data may inform the design of future analyses on substantially larger datasets.
Most studies of biomarkers in post-ROSC patients seek simple cutoff points, ignoring patient demographic and clinical characteristics. Few clinicians would agree to cease ongoing resuscitation efforts on the basis of the results of blood tests alone. However, many physicians would not hesitate to modify treatment in accordance with the likelihood of success. Stratification based on combined clinical and laboratory data should thus be the preferred clinical approach for the interpretation of biomarkers of brain damage. That said, modeling still cannot replace societal responsibility in determining the acceptable level of uncertainty in decision making. In parallel with the search for an infallible brain biomarker, policy makers should determine the level of risk for misclassification acceptable to society within this clinical setting.
The authors wish to thank their consultant statistician, Nurith Strauss-Liviatan, for her valuable contribution to the data analysis.
For supplementary data, please see the online version of this article.
This study was supported by grant 3-00000-3160 from the Chief Scientist Office of the Ministry of Health of Israel. All authors have reported that they have no relationships relevant to the contents of this paper to disclose.
Abbreviations and Acronyms
- AUC: area under the curve
- CPC: Cerebral Performance Category
- NSE: neuron-specific enolase
- OHCA: out-of-hospital cardiac arrest
- ROC: receiver-operating characteristic
- ROSC: return of spontaneous circulation
- S100B: S100 beta
- VF: ventricular fibrillation
- VT: ventricular tachycardia
- American College of Cardiology Foundation
- Wijdicks E.F., Hijdra A., Young G.B., Bassetti C.L., Wiebe S., Quality Standards Subcommittee of the American Academy of Neurology
- Berger R.P., Adelson P.D., Richichi R., Kochanek P.M.
- Martens P., Raabe A., Johnsson P.
- Bottiger B.W., Mobes S., Glatzer R., et al.
- Rosen H., Rosengren L., Herlitz J., Blomstrand C.
- Jacobs I., Nadkarni V., Bahr J., et al.
- Safar P.
- R Development Core Team
- Zingler V.C., Krumm B., Bertsch T., Fassbender K., Pohlmann-Eden B.
- Schoerkhuber W., Kittler H., Sterz F., et al.
- Grubb N.R., Simpson C., Sherwood R.A., et al.