- Received May 3, 2011
- Accepted August 9, 2011
- Published online November 1, 2011.
- Paul Sorlie, PhD⁎ and
- Gina S. Wei, MD, MPH
- ⁎Reprint requests and correspondence:
Dr. Paul Sorlie, Epidemiology Branch, Prevention and Population Sciences Program, Division of Cardiovascular Sciences, National Heart, Lung, and Blood Institute, 6701 Rockledge Drive, Suite 10210, Bethesda, Maryland 20892-7936
This commentary discusses the question of whether observational epidemiology studies using a population-based cohort design continue to make an impact on the prevention and treatment of cardiovascular disease. Although these studies are large and comprehensive, have they advanced from the early recognition of traditional risk factors to become relevant in the current complex research environment? Five themes are discussed: 1) their role in scientific discovery, including in the context of clinical trials' role in interventional research; 2) their value in encompassing diverse ethnic and age groups to remain relevant to the changing diversity of the United States; 3) the research potential of combining these datasets into large consortia; 4) the ability to use advances in biomedical research technologies; and 5) the recognition that these are national resources that allow the outside research community to analyze the collected data and to originate novel ancillary studies. The National Heart, Lung, and Blood Institute longitudinal cohort studies offer opportunities that hold great promise in improving evaluation of personal risk, identifying mechanisms of disease, and directing potential targets for behavioral and medical interventions.
Are observational epidemiology studies using a population-based cohort design still relevant? More specifically, do they make any significant impact on the prevention and treatment of cardiovascular disease? Population-based cohort studies are a specific category of epidemiology studies in which a defined population is followed up and observed longitudinally to assess exposure and outcome relationships (1). Some critics may argue that such studies have yielded little clinical impact recently, unlike decades ago when they helped uncover major cardiovascular risk factors such as smoking, hypertension, and dyslipidemia (2). In contrast, randomized control trials (RCTs) continue to make headlines, and the growing use of electronic medical records seems to offer abundant promise for research (3). Some may even blame observational studies for leading the public astray. A well-known example is hormone “replacement” therapy, whose beneficial effects, identified in several observational cohort studies, were later refuted by the Women's Health Initiative; this then led to extensive investigations to explain the differences between the observational and trial results (4).
The RCT, despite its limitations, is the gold-standard research method for determining the effectiveness of clinical interventions; nevertheless, population-based cohort studies are still extremely relevant for other research purposes, such as scientific discovery, informing the design of RCTs, and assessing effects of harmful exposures. Table 1 provides examples of the roles of observational cohort studies in research discovery, as well as clinical trials' roles in testing interventions. In general, although RCTs can test interventions in more controlled environments, the research questions are typically intentionally focused; hence, generalizability may be restricted to narrowly defined populations. In contrast, observational cohort studies are established with the capacity to address a wide range of research topics that can be generalized to a community or an even broader population. Aside from discovery, cohort studies can also be used to inform RCT designs. Hypothesis-generating discoveries from observational cohort studies are one source of research ideas for RCTs. They further provide fundamental data on disease rates to help estimate the trials' sample-size requirements. Observational studies are also critical when RCTs would be considered unethical, particularly when studying a potentially harmful exposure. For example, given the known damaging effects of second-hand smoke on the lung, observational studies were relied on to identify whether it increased the risk for coronary heart disease. Subsequent observational data that followed the implementation of smoking bans have further shown an ensuing decreased rate of heart attacks (5).
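As an illustration of how cohort-derived disease rates can feed a trial's sample-size estimate, the following minimal sketch applies the standard normal-approximation formula for comparing two proportions. The event rate, the assumed treatment effect, and the function name `per_arm_sample_size` are hypothetical values chosen for illustration; they do not come from any study discussed here.

```python
import math
from statistics import NormalDist


def per_arm_sample_size(p_control, p_treated, alpha=0.05, power=0.80):
    """Approximate participants per arm for a two-sided comparison of two
    event proportions (normal approximation, equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    effect = p_control - p_treated
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical inputs: a cohort suggests a 10% event rate over the planned
# follow-up, and the trial hopes to detect a 20% relative reduction.
control_rate = 0.10
treated_rate = control_rate * (1 - 0.20)
print(per_arm_sample_size(control_rate, treated_rate))
```

Under these assumed inputs the estimate is on the order of a few thousand participants per arm. A lower baseline event rate with the same relative reduction pushes the requirement higher, which is why an accurate population rate matters at the design stage.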
It is worth emphasizing the role of population-based cohort studies in scientific discovery, particularly in unearthing emergent risk factors and understanding the genetic and biological basis of cardiovascular diseases. Some may argue that the identification of cardiovascular risk factors is already complete; however, this statement does not take into account that behavioral, social, and environmental exposure patterns continue to evolve, and our ability to measure them continues to improve. Additionally, a better understanding of pathogenesis is relevant to improving translational research for developing new therapeutics. Electronic medical records are limited in these capacities, although they certainly hold great promise in other arenas, including health services and outcomes research, post-market surveillance of drug safety, and even nationwide disease surveillance. Medical care databases are generally designed as either claims data or clinical records, but in all cases reflect existing clinical practices; these electronic medical record databases do not include the novel bioassays or medical imaging required for innovative research purposes (6). In contrast, cohort studies can collect detailed data using measurement tools that mirror current clinical practices and also cutting-edge technologies that could lead to advances in health care.
The National Heart, Lung, and Blood Institute (NHLBI) has supported large population-based cohort studies in the United States beginning with the Framingham Heart Study in the 1940s. These cohorts have been initiated to remain relevant to the changing population composition and the increasing sociocultural diversity of the United States. Differences in health status among various population groups are consistently confirmed by data on variations in mortality (7). Current emphasis on understanding the genes and the environment, and their interactive effects on disease, requires research in diverse populations and cultures. The NHLBI cohort studies listed in Table 2 are drawn from many cultures, environments, and locations. They examine exposures such as racial discrimination, acculturation, changing life-style and food sources, economic disadvantage, and the built environment. In addition to these listed studies, the NHLBI has funded and continues to fund investigator-initiated cohort studies, along with other epidemiology studies such as community surveillance studies, disease registries, and follow-up of clinical trials (8–11).
Recently, the relevance of the traditional population-based cohort studies has been challenged by the concept of “mega-cohorts” that consist of several hundred thousand participants. The scientific rationale for such huge sample sizes stems from the need to identify the numerous genetic variants and their gene-gene and gene-environment interactions for common yet complex disorders such as cardiovascular disease (12). The mega-cohorts could originate from aggregated electronic medical records (e.g., HMO [Health Maintenance Organization] Research Network), a new identification of participants and collection of information (e.g., UK Biobank), or consortia of existing cohorts (e.g., the CHARGE [Cohorts for Heart and Aging Research in Genomic Epidemiology] consortium) (13–15). Studies that recruit and follow up very large populations de novo would require, at least in the United States, extremely high operational costs (16). European studies can use their extensive national health records systems to reduce these costs considerably. Data collections that rely on existing medical records, however, can be limited by the research scope (i.e., data collected for medical care rather than for research purposes) or by the unstandardized quality of the information (17). An unanswered question is whether these concerns are outweighed by the massive sample sizes. Consortia of ongoing cohort studies can provide another cost-efficient alternative to mega-cohorts. The studies listed in Table 2 have been included in various consortia, and the accumulated sample sizes can exceed 70,000 persons. These existing consortia have adopted very conservative rules for determining statistical significance, yet still discovered areas of the genome associated with cardiovascular risk factors or disease (18–20).
However, as clearly described by the National Human Genome Research Institute, the progression of “base pairs to bedside” is still in its early stages; substantial research in the coming decades is expected before the full impact of genomic discoveries on the practice of clinical medicine can be realized (21). The balance of cost, data validity, feasibility, and required sample size will likely continue to frame the welcome ongoing discussions of the role of mega-cohorts versus traditional population-based cohorts.
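To give a sense of what a "very conservative" significance rule can look like when many genetic variants are tested at once, the sketch below applies a Bonferroni correction. This is an assumption made for illustration: the text does not specify which rule the consortia adopted, and the function name and variant count are hypothetical.

```python
def bonferroni_threshold(family_alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate
    at or below family_alpha across n_tests comparisons."""
    return family_alpha / n_tests


# Hypothetical scan: one million variants tested at an overall alpha of 0.05.
n_variants = 1_000_000
threshold = bonferroni_threshold(0.05, n_variants)
print(f"{threshold:.1e}")

# Illustrative p-values only; just those below the cutoff would be reported.
p_values = [3e-9, 4e-7, 0.02]
print([p for p in p_values if p < threshold])
```

With a cutoff this strict, a result that would look striking in a single-hypothesis study (for example, p = 4e-7) does not qualify, which is one reason pooled sample sizes in the tens of thousands are attractive.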
To remain relevant in a technologically evolving world, cohort studies should be able to capture data using advances in biomedical research technologies. Recent decades have witnessed an explosion of discoveries of the genetic complexity associated with human diseases and the molecular signatures that describe underlying processes for health and disease. The study of blood components has moved well beyond the earliest measures of cholesterol to the most complex measures of proteins and metabolites. The genetic world has moved from assessing simple genetic characteristics to genotyping a million or more single-nucleotide polymorphisms, and recently to genomic sequencing. Many large NHLBI cohort studies have readily incorporated these measurements into their data collection. The use of noninvasive technologies such as ultrasound imaging, computerized tomography, and magnetic resonance imaging has further permitted cohort studies to assess the disease process from the healthy to the subclinical state and on to the manifestation of clinical disease, all adding to their value as research resources that more directly measure the processes leading to disease.
To remain relevant to research needs, observational cohort studies must also allow access to their rich resources. Their value is likely to increase as research funding becomes more limited and demands for high-quality scientific data rise. The primarily contract-based funding of the studies listed in Table 2 supports the infrastructure, recruitment, examinations, follow-up, event identification and validation, specimen storage, and statistical analysis. With this foundation, these studies create opportunities for investigator-initiated grants (either as a stand-alone project or as part of a consortium) to add examination components, to use collected biological samples for new analytes or genotyping, and to perform additional statistical analyses. The listed cohort studies often have 20 to 75 added ancillary studies, many involving non–study investigators. Additionally, non–study researchers can access data through an extensive data distribution process provided by the NHLBI. Phenotype data are available through the NHLBI's Biologic Specimen and Data Repository Information Coordinating Center (22), and the combined genetic and phenotypic data are accessible through the National Center for Biotechnology Information's Database of Genotypes and Phenotypes (23).
As the NHLBI looks to the future, its cohort studies will continue to evolve by taking advantage of rapid advances in research technology and adding (or ending) cohorts in response to research needs, while remaining committed to maximizing research potential. The NHLBI population-based cohort studies are national resources. They offer opportunities to evaluate countless scientific hypotheses to translate bench research to both prevention and treatment, furthering insight into the complex interaction of environment, behaviors, and genetics on cardiovascular disease. These opportunities hold great promise in improving evaluation of personal risk, identifying mechanisms of disease, and directing potential targets for behavioral and medical interventions.
Both authors have reported that they have no relationships relevant to the contents of this paper to disclose.
- Abbreviations and Acronyms
- NHLBI: National Heart, Lung, and Blood Institute
- RCT: randomized control trial
- American College of Cardiology Foundation
- Szklo M.
- Friedman C.P., Wong A.K., Blumenthal D.
- Institute of Medicine
- HMO Research Network (HMORN)
- UK Biobank
- The Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium
- Smith N.L., Felix J.F., Morrison A.C., et al.
- Zhu X., Young J.H., Fox E., et al.
- The Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC)
- The Database of Genotypes and Phenotypes (dbGaP)