Author information
- (a) Department of Medicine, Massachusetts General Hospital, Boston, Massachusetts
- (b) Brigham and Women’s Hospital Heart & Vascular Center and Harvard Medical School, Boston, Massachusetts
- ∗Address for correspondence: Dr. Muthiah Vaduganathan, Brigham and Women’s Hospital Heart & Vascular Center, 75 Francis Street, Boston, Massachusetts 02115.
Broad sharing of participant-level data from completed clinical trials has received a wide base of support, including from cardiovascular clinical trialists (1). Beginning in July 2018, all clinical trials published in International Committee of Medical Journal Editors journals will be required to provide formal data sharing statements (2). Clinical trial data sharing platforms are web-based systems that host individual participant data from completed studies. Several federal-, industry-, and university-supported data sharing platforms are already operational (3–5), with the volume of accessible data expected to increase (6). Despite this, currently available shared data remain underutilized (7), potentially related to a lack of awareness or uncertainty over the data sharing process. We believe that fellows-in-training and early career (FIT/EC) investigators will be at the forefront of shared data consumption (8). Although the overall data sharing process has previously been reviewed (6), we provide a more practical approach to accessing and exploring these data, highlighting anticipated barriers and potential solutions to successful data consumption by FIT/EC investigators.
What Shared Data Are Available for Access?
Data sharing platforms host raw and analysis-ready datasets, necessary metadata (e.g., variable labels, formats), and ancillary documents (e.g., protocols, analysis plans, case report forms). All data are carefully processed, deidentified, and deposited within platforms after a proprietary-use period of variable length. For instance, this period is approximately 2 years after trial completion in the National Heart, Lung, and Blood Institute (NHLBI) data repository. For more recently established, industry-sponsored initiatives, new and historical clinical trial databases continue to be added to platforms, such that median time from study completion to data accessibility remains ∼7 years (7). Many platforms allow data requests for access to multiple clinical trial datasets for pooled analyses.
Data Sharing Platforms
Although data sharing efforts are widespread across the biomedical sciences, we highlight established and ongoing initiatives related to cardiology.
In 2000, the NHLBI created a formal data repository to facilitate the sharing of data from NHLBI-supported clinical trials and observational studies. As of May 2016, the NHLBI data repository hosted individual patient-level data from nearly 350,000 participants enrolled in 100 clinical trials. More than 800 data requests have been submitted since the repository’s inception, resulting in 277 publications to date (3).
More recently, several industry-supported data sharing platforms have emerged. The largest of these, ClinicalStudyDataRequest.com (CSDR), hosts data from 13 pharmaceutical companies, and since its inception in early 2013, data from 537 cardiometabolic studies have been made available (7). The Yale University Open Data Access project, launched in 2011, is both an industry- and university-supported platform (4). Through partnership with Johnson & Johnson, Medtronic, and SI-BONE, the Yale University Open Data Access project has shared 228 clinical trials and received 73 data requests to date. Finally, Bristol-Myers Squibb and the Duke Clinical Research Institute partnered to form Supporting Open Access for Researchers, which has received 57 data requests to date (5).
Practical Steps to Shared Data Requests and Access
Most clinical trial data are not truly “open access” but are provided after review and processing by a “learned intermediary” (an independent review panel). Understanding the mechanics and practicalities of data requests, and the anticipated hurdles to effective utilization, is important for FIT/EC investigators embarking on this process (Figure 1).
Most data sharing platforms openly display currently approved proposals and clinical studies available for request. The first step is preparation of a brief unique proposal requesting data from 1 or more of the hosted trials, which may be used for a variety of analytic purposes. For instance, investigators may pool data for meta-analyses or systematic reviews. Walker et al. (9) accessed data from CSDR for a systematic review on dipeptidyl peptidase-4 inhibitors in comorbid diabetes mellitus and chronic kidney disease. Other investigators may conduct secondary analyses related to risk modeling, predictors of response, or subgroup analyses of treatment effects. Tsujimoto and Kajio (10) used a limited dataset hosted on the NHLBI data repository to better characterize abdominal obesity as a potential risk predictor in heart failure with preserved ejection fraction. Shared data are infrequently used for replication or validation of original trial findings (11). Focused proposals should typically include: study databases requested, background information, specific aims, research methodology, endpoints, funding sources, and potential conflicts of interest. The proposed study should be approved or exempted by the institutional review board.
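As a purely illustrative aid, the proposal components listed above can be tracked as a simple pre-submission checklist. The sketch below is a minimal example under stated assumptions: the field names and the `missing_sections` helper are hypothetical and do not correspond to any platform's actual submission form, which investigators should consult directly.

```python
# Hypothetical pre-submission checklist. Section names mirror the components
# listed in the text; they are illustrative, not any platform's real schema.
REQUIRED_SECTIONS = (
    "study_databases_requested",
    "background",
    "specific_aims",
    "research_methodology",
    "endpoints",
    "funding_sources",
    "conflicts_of_interest",
)


def missing_sections(proposal):
    """Return required sections that are absent or blank, in checklist order."""
    return [
        name
        for name in REQUIRED_SECTIONS
        if not str(proposal.get(name) or "").strip()
    ]


# Example draft with two sections still unwritten (contents are placeholders).
draft = {
    "study_databases_requested": "Trial A; Trial B",
    "background": "Placeholder background text",
    "specific_aims": "Placeholder aims",
    "research_methodology": "Placeholder methods",
    "endpoints": "",
    "conflicts_of_interest": "None",
}

print(missing_sections(draft))
```

An empty list would indicate that every listed component has at least some content; it says nothing about scientific merit or institutional review board status, which are assessed separately.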
Once prepared, the proposal may be submitted via a secure, web-based portal. Proposals are evaluated based on completeness, scientific merit and promise, and the research team’s qualifications. If approved, the data requesters will sign a data use agreement with the trial sponsor. Access is then granted via a password-protected portal. Primary data often cannot be directly downloaded; instead, statistical analyses need to be carried out by data requesters within the private workspace using limited provided software. Data access is generally granted for 1 to 2 years, and submission for peer-reviewed publication is often required.
Anticipated Barriers and Potential Solutions to FIT/EC Data Consumption
At each step of the data sharing process, FIT/EC investigators may encounter important barriers to successful consumption of shared data. First, in identifying potential study topics, few available studies may be directly related to the FIT/EC’s field of interest. For instance, of all cardiometabolic studies hosted by CSDR, 75% were related to hypertension and diabetes mellitus (7) and more than one-half became available more than 6 years after trial completion, which may limit the contemporary relevance of these data. Second, FIT/ECs may not be prepared for or have the immediate resources to process and negotiate lengthy data use agreements with trial sponsors. Up-front communication with mentors and the institutional legal team may help navigate this process more efficiently. Third, the data sharing process can be lengthy, and FIT/ECs should be aware of expected timelines. This timeline may vary from project to project and may depend on the platform, but FIT/EC investigators should anticipate up to 1 year (11) prior to data analyses, a timeframe that may not be optimal for certain shorter fellowships. Fourth, as with most clinical research, financial and materials support is often needed for secondary use of shared data. Recent studies of data sharing have suggested that only one-half of proposals are specifically funded, and lack of funding and statistical support may be a factor in delaying analyses and publication (8). The NHLBI offers an R21 Exploratory/Developmental Research Grant ($150,000 over a 2-year period), which may be used to perform secondary analyses of studies included in their repository. Other funding avenues or support from a mentor may be required. Last, FIT/EC investigators should anticipate performing necessary statistical analyses themselves or including a biostatistician on the research team.
Individual patient-level data from cardiovascular clinical trials are expected to continue to be broadly shared and become accessible sooner after trial completion (1). Data sharing platforms provide researchers with access to participant-level data, which may broaden the scientific impact of collected data and bolster return of patient participation. Secondary use of shared data may be an important adjunctive resource for FIT/EC investigators to complement primary data generation. As such, FIT/EC investigators should familiarize themselves with available data sharing platforms, the process and practicalities of data acquisition, and the anticipated barriers to efficient utilization.
Dr. Vaduganathan is supported by the National Heart, Lung, and Blood Institute T32 postdoctoral training grant (T32HL007604). Dr. McCarthy has reported that he has no relationships relevant to the contents of this paper to disclose.
- 2018 American College of Cardiology Foundation
- Taichman D.B., Sahni P., Pinborg A., et al.
- Pencina M.J., Louzao D.M., McCourt B.J., et al.
- Dey P., Ross J.S., Ritchie J.D., Desai N.R., Bhavnani S.P., Krumholz H.M.
- Vaduganathan M., Nagarur A., Qamar A., et al.
- Ross J.S., Ritchie J.D., Finn E., et al.
- Walker S.R., Komenda P., Khojah S., et al.
- Tsujimoto T., Kajio H.
- Gay H.C., Baldridge A.S., Huffman M.D.