- Received December 13, 2012
- Accepted December 19, 2012
- Published online May 7, 2013.
- H. Vernon Anderson, MD⁎,
- William S. Weintraub, MD†,
- Martha J. Radford, MD‡,
- Mark S. Kremers, MD§,
- Matthew T. Roe, MD, MHS‖,
- Richard E. Shaw, PhD¶,
- Dana M. Pinchotti, BS# and
- James E. Tcheng, MD‖
- ⁎Reprint requests and correspondence: Dr. H. Vernon Anderson, UTHSC Houston, 6431 Fannin, Suite 1.246, Houston, Texas 77030
Relatively little attention has been focused on standardization of data exchange in clinical research studies and patient care activities. Both are usually managed locally using separate and generally incompatible data systems at individual hospitals or clinics. In the past decade there have been nascent efforts to create data standards for clinical research and patient care data, and to some extent these are helpful in providing a degree of uniformity. Nonetheless, these data standards generally have not been converted into accepted computer-based language structures that could permit reliable data exchange across computer networks. The National Cardiovascular Research Infrastructure (NCRI) project was initiated with a major objective of creating a model framework for standard data exchange in all clinical research, clinical registry, and patient care environments, including all electronic health records. The goal is complete syntactic and semantic interoperability. A Data Standards Workgroup was established to create or identify and then harmonize clinical definitions for a base set of standardized cardiovascular data elements that could be used in this network infrastructure. Recognizing the need for continuity with prior efforts, the Workgroup examined existing data standards sources. A basic set of 353 elements was selected. The NCRI staff then collaborated with the 2 major technical standards organizations in health care, the Clinical Data Interchange Standards Consortium and Health Level Seven International, as well as with staff from the National Cancer Institute Enterprise Vocabulary Services. Modeling and mapping were performed to represent (instantiate) the data elements in appropriate technical computer language structures for endorsement as an accepted data standard for public access and use. Fully implemented, these elements will facilitate clinical research, registry reporting, administrative reporting and regulatory compliance, and patient care.
Clinical research studies are usually organized as separate and distinct efforts conducted locally at independent individual sites. Clinical information used in patient care is also typically managed locally using separate, distinct, and generally incompatible data systems at each individual institution. Relatively little attention has been focused on data exchange in either the clinical research or the patient care domain. Although some limited clinical data standards exist and can be helpful in standardizing certain aspects of clinical data and providing a certain amount of uniformity, for the most part these have not been converted into accepted computer-based language structures that could be used interchangeably across computer networks. So while clinicians in different locations may think, act, and talk alike in their activities, the basic computer systems that they use to store and retrieve data locally do not, and for the most part cannot, transmit, receive, combine, analyze, and use shared data as information. As a consequence, a robust infrastructure for conducting clinical research using commonly defined and electronically exchangeable data derived directly from clinical sources does not exist in the United States.
In 2009, the National Cardiovascular Research Infrastructure (NCRI) project was initiated by the Duke Clinical Research Institute (DCRI) and the American College of Cardiology Foundation (ACCF) to create a model infrastructure for clinical research, clinical registries, and patient care (Kong et al., personal communication, January 2012). Initial funding was provided by a grant through the American Recovery and Reinvestment Act (ARRA). The four goals of NCRI are to: 1) replace the repetitive assembly and disassembly of short-lived clinical investigator networks with a stable and enduring operational infrastructure for clinical research; 2) standardize and harmonize cardiovascular data to achieve complete syntactic and semantic interoperability throughout the network; 3) coordinate and facilitate the transfer of selected, standardized cardiovascular data into existing and future national registries; and 4) develop an enduring library of content for education and training of clinical investigators and site personnel. The NCRI seeks to overcome limitations of current approaches, including the absence of streamlined, one-time data-collection activities at each independent site; lack of common data terms used by all; and the inability to transmit, receive, combine, analyze, and use shared data in comparable and interchangeable formats (interoperability).
One crucial aspect of the NCRI is establishing a universal vocabulary of cardiovascular data elements. This includes establishing all of the formal technical features that are required of a controlled vocabulary that can operate on multiple computer networks in the healthcare environment, achieving both syntactic interoperability (format, packaging, transmission) and semantic interoperability (unambiguous shared meanings) (1,2). This also includes widely disseminating the selected data elements and their definitions, and then eliciting feedback from, and facilitating acceptance by, all relevant parties, including investigators, sponsors, regulatory bodies, clinicians, policymakers, payors, and the general public. We describe here the methodology and principal results of the project to identify and harmonize clinical definitions of a base set of standardized cardiovascular data elements applicable to clinical research, registries, and patient care. We also seek to engage the community in efforts to absorb and integrate this distinct advance. Our work continues and expands on recent work by the American College of Cardiology Foundation/American Heart Association (ACCF/AHA) Task Force on Clinical Data Standards, which previously established a base cardiovascular vocabulary of key data elements and definitions for electronic health records (EHRs) (3). That initiative identified 99 key terms that should be available in every general-purpose EHR, terms that are interoperable and applicable to every cardiovascular subspecialty EHR and that have maximal utility across the widest spectrum of clinical settings, including clinical care and clinical research, as well as in local institutional, state, regional, and national registries and all data-interchange environments. The NCRI Data Standards Workgroup followed these same principles in its efforts to build on that foundation.
The principal investigators of the NCRI collaborated with ACCF leadership to identify appropriate members for a Data Standards Workgroup charged with undertaking this project. The eight members selected have overlapping expertise in clinical research and clinical care, information technologies, informatics, clinical registries, data-standards development, and statistical analyses. The present document was composed and written by the Workgroup.
Relationships With Industry and Other Entities
The ACCF, DCRI, and NCRI and their committees, task forces, workgroups, and other bodies all make every effort to avoid actual or potential conflicts of interest. Specifically, all members of a workgroup are required to file statements disclosing current and recent relationships that may be perceived as relevant real or potential conflicts of interest, and the same is required of all peer reviewers of a document. These disclosures for the members of this Workgroup are listed in Online Appendix 1. Comprehensive disclosure information is available online at www.cardiosource.org/ACC/About-ACC/Who-We-Are/Leadership/Guidelines-and-Documents-Task-Forces.aspx.
Review of Literature and Existing Data Elements
This Workgroup identified several tasks involved in establishing the library of core universal cardiovascular concepts (i.e., vocabulary) to be developed for this project. The first task was identifying key clinical terms from among the many available data element concepts. To begin, the Workgroup examined the data dictionaries of the ACCF National Cardiovascular Data Registry (NCDR) and the Society of Thoracic Surgeons (STS) Adult Cardiac Surgery Registry, and then systematically examined all existing cardiovascular data dictionaries and standards documents published by these and other professional societies (3–10). The criterion for inclusion of a specific term (data element) from these sources was that the key clinical concept embodied in the term had the broadest utility and therefore would be collected commonly in cardiovascular clinical research investigations, including both randomized clinical trials and registries. Selection of terms from these sources was achieved by consensus of the group after review and discussion. In general, basic (simple, singular, or atomic) terms were preferred over composite terms. Once selected, all data elements were grouped into standard categories as previously outlined (3,11). These categories indicate the clinical context in which the data element is expected to be obtained or collected and reflect the usual workflow organization of information in typical clinical settings for a single episode of care. Categories are Personal History and Family History, Physical Examination (Clinical Condition) at the time of the encounter, Laboratory Values, Diagnostic Procedures, Therapeutic Procedures, Adverse Events, Medications, Discharge Information, and Outcomes.
Data Element Definitions and Consensus Development
The second task of the Workgroup was to harmonize the definitions of the elements selected, making certain that unambiguous definitions resulted. This task was intentionally focused on the needs of both the clinical care and clinical research communities, as one objective of NCRI is to promote and foster cross-domain compatibility (clinical and research) while accomplishing semantic interoperability. Nearly all clinical terms considered had multiple-source definitions. However, on closer examination, many source element definitions were the same or very nearly so. This reflects prior work harmonizing the NCDR registries and the STS Adult Cardiac Surgery Registry with existing clinical data standards. Where differences remained, the Workgroup used a hierarchical approach to select a final definition. Preference was given to sources as follows (sources shown in Table 1): 1) ACCF/AHA Adult Cardiovascular Vocabulary for EHR (3); 2) NCDR-STS harmonized data elements (7); 3) other ACCF/AHA Task Force on Clinical Data Standards–endorsed elements (3–8); and 4) other published data standards (9,10). The ACCF/AHA Adult Cardiovascular Vocabulary for EHR (containing 99 elements) was given highest priority because it is the most recently completed data-standardization effort and was developed specifically for EHR systems. Nonetheless, this hierarchy was not absolute and rigid; definitions were selected for best unambiguous structure and wording in the judgment of the Workgroup, regardless of source. Every element from every source was thoroughly reviewed and discussed. When inconsistencies, discrepancies, inaccuracies, ambiguity, or other substantive issues were discovered in existing data elements or definitions, the Workgroup proposed resolutions for consideration by the ACCF/AHA Task Force on Clinical Data Standards.
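The source hierarchy described above can be illustrated with a short sketch. This is not the Workgroup's tooling (selection was made by human consensus); it is a minimal Python illustration, assuming hypothetical source labels and placeholder definition text, of a ranked default that can be overridden by an explicit judgment.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Source tiers in the preference order given in the text (1 = most preferred).
# The short keys are hypothetical labels invented for this sketch.
SOURCE_PRIORITY = {
    "ACCF_AHA_EHR_VOCABULARY": 1,   # ACCF/AHA Adult Cardiovascular Vocabulary for EHR
    "NCDR_STS_HARMONIZED": 2,       # NCDR-STS harmonized data elements
    "OTHER_ACCF_AHA_ENDORSED": 3,   # other Task Force-endorsed elements
    "OTHER_PUBLISHED": 4,           # other published data standards
}


@dataclass(frozen=True)
class CandidateDefinition:
    source: str  # one of the keys in SOURCE_PRIORITY
    text: str    # definition wording (placeholders in this sketch)


def select_definition(
    candidates: Sequence[CandidateDefinition],
    workgroup_choice: Optional[CandidateDefinition] = None,
) -> CandidateDefinition:
    """Return the preferred definition for one data element.

    The hierarchy is only a default: an explicit choice (modeling the
    Workgroup's judgment about the least ambiguous wording) always wins.
    """
    if not candidates:
        raise ValueError("at least one candidate definition is required")
    if workgroup_choice is not None:
        if workgroup_choice not in candidates:
            raise ValueError("override must be one of the candidates")
        return workgroup_choice
    return min(candidates, key=lambda c: SOURCE_PRIORITY[c.source])


# Placeholder candidates for a single, unnamed element.
candidates = [
    CandidateDefinition("OTHER_PUBLISHED", "placeholder wording D"),
    CandidateDefinition("NCDR_STS_HARMONIZED", "placeholder wording B"),
    CandidateDefinition("ACCF_AHA_EHR_VOCABULARY", "placeholder wording A"),
]
default = select_definition(candidates)
overridden = select_definition(candidates, workgroup_choice=candidates[1])
print(default.source)
print(overridden.source)
```

Keeping the override as an explicit argument mirrors the statement that the hierarchy was not absolute: the ranking settles routine cases, and any departure from it is visible rather than hidden in the ordering.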
The Workgroup was assisted by informatics staff of the ACCF and DCRI, with additional help from two other organizations (further described subsequently): the Clinical Data Interchange Standards Consortium (CDISC) and Health Level Seven International (HL7) (12,13). Staff members provided technical informatics support for the project, including representation of elements and terms in a standard machine readable information model developed according to the specifications in the National Cancer Institute (NCI) Data Standards Repository (caDSR) (14). Materials were assembled by staff and circulated by e-mail. The work was conducted in a series of telephone conference calls and e-mail exchanges beginning in June 2010 and concluding in October 2011. In addition, there was one face-to-face meeting held during the ACC Scientific Sessions in March 2011.
Relations to Other Standards
As described earlier, the Workgroup reviewed available published data standards and current national registry data elements. From these source materials, a circumscribed set of data elements, along with single best definitions, was selected to serve as an initial cardiovascular data standard for computer network implementation in clinical research, clinical registries, and patient care activities.
Technical Development for Endorsement as a Recognized Data Standard
The final task for the Workgroup and supporting staff was to represent (instantiate) the selected vocabulary within accepted EHR technical language standards and publish it in a publicly available data library (15,16). The NCRI leadership and staff therefore contacted and collaborated with CDISC and HL7 as the two relevant international standards organizations working in this segment of the healthcare environment. Although likely not widely known among clinicians, the CDISC and HL7 technical standards are broadly accepted and have been generally adopted within the information technology platforms of both the patient care and clinical research communities (12,13). For example, the HL7 Reference Information Model (RIM), along with its clinical document standards and EHR functional profile, is widely recognized as the international technical standard for clinical information systems. The CDISC Study Data Tabulation Model (SDTM) and the Clinical Data Acquisition Standards Harmonization (CDASH) are technical standards used for clinical research data collection and exchange between different organizations, for data comparisons across different clinical trials, and for electronic data submission to regulatory agencies (9,17). The SDTM accommodates metadata (data format and content tags), which facilitate interoperability and data exchange. The U.S. Food and Drug Administration endorses the submission of clinical data in this standard for regulatory review purposes. The NCRI staff therefore created a Unified Modeling Language representation of elements as a Cardiovascular Domain Analysis Model, mapping the model to the specifics required for CDISC SDTM and HL7 RIM. The NCRI data elements were then matched with concept codes assigned by NCI Enterprise Vocabulary Services (EVS). The entire set of cardiovascular concepts will be published in the NCI EVS for public access and use (18).
The data model will be imported into the NCI caDSR and linked with the metadata tags required for full and complete semantic interoperability. This means that these 353 selected cardiovascular data elements should be fully exchangeable across computer networks and within EHR structures, which previously has not been possible.
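To make the idea of exchangeable elements concrete, the following Python sketch shows two sites that record the same clinical fact in different local formats and a mapping step that converts both to one shared element before pooling. Everything named here is an assumption for illustration only: the element name `history_of_diabetes`, its permissible values, the site identifiers, and the local field names are hypothetical and are not taken from the NCRI element set.

```python
from typing import Any, Dict, List

# Hypothetical shared element and its permissible values (not from the NCRI set).
SHARED_ELEMENT = "history_of_diabetes"
SHARED_VALUES = {"yes", "no", "unknown"}

# Hypothetical per-site mappings from a local field and local codes
# to the shared element's permissible values.
SITE_MAPPINGS: Dict[str, Dict[str, Any]] = {
    "site_a": {"field": "DM", "values": {"Y": "yes", "N": "no", "U": "unknown"}},
    "site_b": {"field": "diabetes_flag", "values": {1: "yes", 0: "no", None: "unknown"}},
}


def to_shared(site: str, record: Dict[str, Any]) -> Dict[str, str]:
    """Translate one local record into the shared representation."""
    mapping = SITE_MAPPINGS[site]
    local_value = record[mapping["field"]]
    if local_value not in mapping["values"]:
        # Refuse to guess: an unmapped code is a semantic gap, not a default.
        raise ValueError(f"{site}: unmapped local value {local_value!r}")
    shared_value = mapping["values"][local_value]
    assert shared_value in SHARED_VALUES
    return {SHARED_ELEMENT: shared_value}


def pool(records_by_site: Dict[str, List[Dict[str, Any]]]) -> List[Dict[str, str]]:
    """Combine records from several sites once they share one meaning."""
    pooled = []
    for site, records in records_by_site.items():
        pooled.extend(to_shared(site, r) for r in records)
    return pooled


pooled = pool({
    "site_a": [{"DM": "Y"}, {"DM": "N"}],
    "site_b": [{"diabetes_flag": 1}, {"diabetes_flag": None}],
})
print(pooled)
```

The sketch covers only the semantic side (shared meaning of values). Syntactic interoperability, meaning the format, packaging, and transmission of the records, is what the HL7 and CDISC specifications address and is not modeled here.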
Peer Review and Approval
Drafts of this report and the core set of cardiovascular data elements (excluding the technical representations required for CDISC and HL7 endorsement) were reviewed by the ACCF/AHA Task Force on Clinical Data Standards and discussed at the Task Force Meeting at the ACC Scientific Sessions in March 2012, with comments transmitted back to the Workgroup. The final version was reviewed and approved by the chairs of the Research and Publications committees of the NCDR registries, and also by the chair of the Science and Quality Oversight Committee. The Workgroup fully acknowledges and anticipates that these standardized data elements and definitions will require regular review and updating, as occurs with all other published guidelines, data standards, performance measures, and appropriateness criteria. NCRI staff will monitor and receive feedback and periodically review the controlled vocabulary work product to ascertain whether modifications should be considered.
Adoption and implementation of the cardiovascular data standards presented here should improve interoperability, accuracy, and efficiency in all domains: administrative, regulatory, clinical research, and patient care. Dependable and reliable data exchange should reduce the errors that arise when the same data are transcribed into several systems. At the local site level, this will facilitate efforts to extract and review local data and to transmit data to other entities, such as the large national registries. Combining uniform data from multiple sites for larger scale analyses will also be possible. Linkages of extracted data with administrative and long-term data records will facilitate longitudinal follow-up of specific patient groups of interest. Such linkages with outside data sources may have advantages over the direct clinical follow-up of patients and may be more efficient and more complete, especially for larger patient groups and for very long-term analyses. The Centers for Medicare and Medicaid Services (CMS) Medicare Provider Analysis and Review (MEDPAR) data files are an example of external data linkages that might be made. Linkages with longitudinal databases may provide opportunities to assess long-term mortality, hospital readmissions, subsequent procedures, and various other outcomes of interest. This is likely to enhance the study of the long-term safety and efficacy of drugs and devices in widespread clinical practice after the initial approval of a drug or device. Furthermore, clinical effectiveness and patient-centered outcomes research comparing a variety of options could be conducted, and evidence-based practice recommendations developed and validated (19,20).
Such efforts align with other national efforts to improve the clinical patient care domain, specifically the implementation of clinical decision-support tools, with the compilation and return of patient-specific, clinician-specific, and institution-specific data back to the point of care from which it originates. These efforts furthermore are significant steps toward achieving the goals of the CMS Meaningful Use program, including the use of certified EHR technologies for the purposes of exchanging health information to improve patient care (21). All of this is consistent with the policies of the national professional societies and conforms to the recent policy statement from the AHA on expanding the applications of existing and future clinical registries (22).
National Cardiovascular Research Infrastructure Data Elements
From the various sources examined, the Workgroup assembled a final list of 353 elements, including a number that are intended to exist as “parent–child” relationships. Elements that were judged to be the most commonly used in cardiovascular clinical research and clinical care were selected, including all 99 of the previously developed elements for the Adult Cardiovascular EHR. The Workgroup was also keenly aware of the need for parsimony. While this initial list is meant to be comprehensive, we recognize that it may not be adequate for all purposes. Furthermore, any list of data elements will always need ongoing review, with outdated ones deleted and new ones added. The underlying concepts leading to element formation also will change over time, and periodic revisions are intended.
Data Elements by Category
The elements and their source reference locations are shown in Table 2. Only the element names along with the sources of element values and definitions are listed. Complete element specifications and definitions can be found in Online Appendix 2, as well as the NCRI Web site (www.ncrinetwork.org) and the HL7 Web site (www.hl7.org). Most of these elements were selected from existing data sources. However, nine new data elements of a minor nature were adopted by the Workgroup. These nine new elements and their definitions are shown in Table 3.
Example Representation of Data Elements
Representation of the data elements was done according to the caDSR implementation of the International Organization for Standardization (ISO) 11179 metamodel (14). An example of this representation for the physical examination assessment of Killip Class is shown in Figure 1. More details can be found in Online Appendix 2. A description of the cardiovascular domain analysis model (CV_DAM) is available at the HL7 Web site (14).
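As a rough companion to the Killip Class example, the sketch below models the main parts of an ISO 11179-style data element in Python: a data element concept (object class plus property), a value domain with enumerated permissible values, and the data element that joins them. It is an assumption-laden illustration, not the caDSR representation shown in Figure 1: the class and field names are hypothetical, the value meanings are abbreviated clinical summaries rather than the NCRI definitions, and concept codes are left unassigned rather than invented.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PermissibleValue:
    value: str                          # the stored value
    meaning: str                        # the value meaning (abbreviated here)
    concept_code: Optional[str] = None  # terminology code; unassigned in this sketch


@dataclass(frozen=True)
class ValueDomain:
    name: str
    datatype: str
    permissible_values: Tuple[PermissibleValue, ...]


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str    # the thing being described
    property_name: str   # the characteristic being recorded


@dataclass(frozen=True)
class DataElement:
    name: str
    concept: DataElementConcept
    value_domain: ValueDomain
    category: str        # clinical context category used in the text

    def is_valid(self, value: str) -> bool:
        """True if the value is one of the enumerated permissible values."""
        return any(pv.value == value for pv in self.value_domain.permissible_values)


# Hypothetical instantiation; meanings are short summaries, not NCRI wording.
killip_class = DataElement(
    name="Killip Class",
    concept=DataElementConcept(object_class="Patient", property_name="Killip Class"),
    value_domain=ValueDomain(
        name="Killip Class Value",
        datatype="enumerated string",
        permissible_values=(
            PermissibleValue("I", "No clinical signs of heart failure"),
            PermissibleValue("II", "Rales, S3 gallop, or elevated jugular venous pressure"),
            PermissibleValue("III", "Acute pulmonary edema"),
            PermissibleValue("IV", "Cardiogenic shock"),
        ),
    ),
    category="Physical Examination",
)

print(killip_class.is_valid("III"))
print(killip_class.is_valid("V"))
```

Separating the concept from the value domain is what lets two systems agree on meaning while a registry later attaches terminology codes to each permissible value without changing the element itself.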
Clinical research in the United States is an enormous enterprise of great value to the nation's health. Yet the remarkable advances achieved over the past 80 years have been accomplished largely as a series of separate, organizationally distinct and disconnected efforts undertaken by individual public and private sponsors. For the most part, these were done using data-management procedures unique to each specific endeavor. Even when ultimate sponsorship has been through the federal enterprise (the National Institutes of Health and other agencies), the individual projects themselves have been dispersed and uncoordinated, and with little effort or attention focused on data interchange. There does not yet exist in the United States, Europe, or elsewhere a robust and sustainable unifying infrastructure that spans the entire translational research, clinical research, regulatory, and clinical practice continuum. Arguably, this absence leads to inefficiencies, delays, and increased costs, all of which have called into question the foundations on which our clinical research enterprise is built (23–25). In some instances, the increasing globalization of clinical research has allowed new techniques and therapies, including some that are federally funded, to become available first to other regions of the world.
It is noteworthy that the multiple available methods of data collection, storage, and transmission mostly remain generally incompatible with one another, even though they are parts of the same system involving administrative functions, patient care, clinical research, and regulatory reporting and compliance. Lack of full integration with clinical EHR systems has especially constrained efforts to coordinate information transfer, despite the fact that all of the functional areas mentioned have become increasingly interdependent. Development of standard data elements with clear and unambiguous definitions and that are compatible with EHR systems holds great promise for addressing the current absence of interoperability. The EHR thus becomes the definitive repository of valid and fully verifiable clinical data, as well as the substrate for facilitating extraction and exchange of data across multiple systems in both the clinical research and patient care domains. Properly constructed, this substrate will enable a broadly distributed yet interconnected network to facilitate information exchange with semantic interoperability among geographically dispersed sites. To begin, a single, authoritative set of interoperable data elements is needed as the basis for a unified nationwide infrastructure useful simultaneously in both clinical research and patient care. This portion of the NCRI project addresses that need.
Ideally, all clinical data captured via integrated clinical workflows into EHRs eventually will be subjected to data standards, including those endorsed by the ACCF, AHA, Society for Cardiac Angiography and Interventions, STS, and other organizations. However, the task is 2-fold. First, the relevant clinical data standards have to be created by the appropriate clinical workgroups. Then, these clinical terms and concepts must be converted into syntactically and semantically compatible computer language structures to make them interoperable across networked computer information systems. Implementing such structures for all existing clinical data standards is a daunting task and cannot be accomplished all at once. The NCDR and STS registries together contain approximately 2,400 data elements in current use. When other officially approved data elements are added, the total could grow by hundreds and possibly thousands more. The costs of fully developing the technical specifications and obtaining endorsement for all potential data elements will be quite large. Therefore, some selectivity is required initially to establish the core elements for a baseline data standard that can be put into place and then periodically modified. That was the task of this Workgroup. Ultimately, the NCRI project is intended to evolve into permanent stewardship by the ACCF of a fully accepted cardiovascular vocabulary. This stewardship will include mechanisms for constant oversight and periodic formal review and updating in response to research, development, and new discoveries. There will be continuing opportunities for engagement and involvement of all stakeholders. 
Much more work is still needed to harmonize even these initial standardized cardiovascular data elements with other recognized administrative data formats, such as the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT), the International Classification of Diseases (ICD-9 and ICD-10), Logical Observation Identifiers Names and Codes (LOINC) for laboratory values, and RxNorm for drugs and pharmacy systems (26–29).
The NCRI Data Standards Workgroup has assembled a set of 353 cardiovascular data elements with definitions that was designed to serve as a foundation of a national cardiovascular clinical and research infrastructure. The vast majority of elements were identified from already existing sources. This work builds on earlier efforts to establish a base cardiovascular vocabulary for EHRs, and it includes all of the technical developments required for adoption as an international standard. Once fully adopted and implemented, these elements will be useful in facilitating clinical research, registry reporting, administrative reporting and regulatory compliance, and all aspects of patient care.
The following staff participated in this research: from the ACCF: Dana M. Pinchotti, BS, and Arsalan Khalid, MBA; from the DCRI: Rebecca Wilgus, RN, MSN, Brian McCourt, BS, and David F. Kong, MD; from the CDISC: Chris Tolk, BS; from HL7: Mead Walker; and from NCI EVS: Erin Muhlbrandt, PhD, and Theresa Quinn, RN, BS.
For disclosures and definitions and technical specifications of NCRI data elements, please see the online version of this paper.
This work was supported by National Cardiovascular Research Infrastructure grant no. 1RC2HL101512-01. This paper is a collaboration of the Duke Clinical Research Institute and the American College of Cardiology-National Cardiovascular Data Registry. Dr. Kremers has received consultant's fees from Medtronic; has received speaker's fees from Boston Scientific; and has been an investigator for Medtronic, Boston Scientific, and St. Jude Medical. Dr. Tcheng has received consultant fees from Cardiovascular Systems Inc., Ischemix, and Philips Medical Systems; and has received grants from the U.S. Food and Drug Administration and the National Institutes of Health. All other authors have reported that they have no relationships relevant to the contents of this paper to disclose.
- Abbreviations and Acronyms
- ACCF = American College of Cardiology Foundation
- AHA = American Heart Association
- caDSR = National Cancer Institute Data Standards Repository
- CDISC = Clinical Data Interchange Standards Consortium
- CMS = Centers for Medicare and Medicaid Services
- DCRI = Duke Clinical Research Institute
- EVS = Enterprise Vocabulary Services
- HL7 = Health Level Seven International
- ISO = International Organization for Standardization
- NCDR = National Cardiovascular Data Registry
- NCI = National Cancer Institute
- NCRI = National Cardiovascular Research Infrastructure
- RIM = Reference Information Model
- SDTM = Study Data Tabulation Model
- STS = Society of Thoracic Surgeons
- American College of Cardiology Foundation
- Weintraub W.S.,
- Karlsberg R.P.,
- Tcheng J.E.,
- et al.
- Hendel R.C.,
- Budoff M.J.,
- Cardella J.F.,
- et al.
- Buxton A.E.,
- Calkins H.,
- Callans D.J.,
- et al.
- Cannon C.P.,
- Battler A.,
- Brindis R.G.,
- et al.
- The Society of Thoracic Surgeons
- Cannon C.P.,
- Brindis R.G.,
- Chaitman B.R.,
- et al.
- National Quality Forum
- Radford M.J.,
- Heidenreich P.A.,
- Bailey S.R.,
- et al.
- Health Level Seven International
- McCourt B.,
- Harrington R.A.,
- Fox K.,
- et al.
- Nahm M.,
- Walden A.,
- McCourt B.,
- et al.
- National Cancer Institute
- Centers for Medicare and Medicaid Services
- Bufalino V.J.,
- Masoudi F.A.,
- Stranne S.K.,
- et al.
- Kim E.S.H.,
- Carrigan T.P.,
- Menon V.
- Califf R.M.,
- Harrington R.A.
- Systematized Nomenclature of Medicine—Clinical Terms. www.nlm.nih.gov/research/umls/Snomed/snomed_main.html. Accessed January 15, 2012.
- World Health Organization
- Logical Observation Identifiers Names and Codes (LOINC). http://loinc.org. Accessed March 28, 2012.
- National Library of Medicine-Standardized Nomenclature for Clinical Drugs (RxNorm). www.nlm.nih.gov/research/umls/rxnorm. Accessed March 28, 2012.