Author information
- Department of Medicine, Stanford University School of Medicine, Stanford, California
- Human Genetics Center, Department of Epidemiology, Human Genetics, and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, Texas
- Address for correspondence: Dr. Themistocles L. Assimes, Stanford University School of Medicine, 1070 Arastradero Road, Suite 300, Palo Alto, California 94304-1334.
- causal inference
- coronary artery disease
- instrumental variable
- Mendelian randomization
- plasma proteomics
The extensive polygenic nature of traits responsible for the majority of human disease has only recently become evident. Genome-wide association studies conducted in the last decade have uniformly found that hundreds and possibly thousands of genetic variants additively contribute to the inherited susceptibility of most common but complex diseases, including coronary artery disease (CAD) (1). These findings have confirmed that Gregor Mendel’s law of dominance is more of an exception than a rule (2). Fortunately, a relatively simple analytic approach for causal inference, appropriately dubbed Mendelian randomization (MR), has emerged out of this complex situation to take full advantage of Mendel’s 2 other laws of segregation and independent assortment. In the last few years, the potential for MR analyses to provide insight on the causal and noncausal pathways of disease has clearly come into focus (3).
An MR study is a type of instrumental variable (IV) analysis at its root (3). Economists have been performing IV analysis for decades to estimate causal relationships within economic data. An IV is a measurable quantity that is associated with an exposure of interest but is not associated with any other competing risk factor that can serve as a confounder. Furthermore, an IV is not associated with an outcome of interest except potentially through the causal pathway that includes the exposure of interest. The IV in MR studies is simply 1 or more genetic variants that have been: 1) inherited from both parents; 2) randomly assigned at conception independent of genetic variation in other regions of the genome; and 3) shown to unequivocally associate with an exposure of interest (3). Given this definition, one can easily appreciate how the act of randomization in a clinical trial can also serve as an IV. However, randomly assigning exposure groups related to health and disease is neither feasible nor ethical in many circumstances.
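The IV conditions described above can be written compactly. In the sketch below, the symbols G (genetic instrument), X (exposure), Y (outcome), and U (unmeasured confounders) are notation introduced here for illustration and do not appear in the original text. The ratio form of the estimate additionally assumes linear effects without interaction.

```latex
\begin{align*}
&\text{(1) Relevance:} && \operatorname{Cov}(G, X) \neq 0 \\
&\text{(2) Independence:} && G \perp U \\
&\text{(3) Exclusion restriction:} && G \perp Y \mid X, U \\[4pt]
&\text{Ratio (Wald) estimate:} && \hat{\beta}_{XY} = \frac{\hat{\beta}_{GY}}{\hat{\beta}_{GX}}
\end{align*}
```

Here $\hat{\beta}_{GX}$ and $\hat{\beta}_{GY}$ denote the estimated associations of the variant with the exposure and with the outcome, respectively.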
A majority of genetic variants used as IVs are single-nucleotide polymorphisms that express only 1 of 2 alternate alleles and produce 1 of 3 genotype groups (e.g., AA, AB, BB), each with different exposure levels that are otherwise balanced with respect to confounders. Comparison of outcomes among these groups can then be used to distinguish causation from correlation, a major shortcoming of classical epidemiology. Such fundamental knowledge has the potential to dramatically accelerate the pace of translation of epidemiological studies into novel insights on pathophysiology followed by innovative therapeutic interventions. For example, MR studies have provided compelling evidence of causal effects of low-density lipoprotein cholesterol and triglycerides on CAD while showing that a causal effect of high-density lipoprotein cholesterol on CAD is absent (4). In this context, carefully designed MR studies were arguably able to predict the success of statins and the failure of the cholesteryl ester transfer protein (CETP) inhibitor drug trials (5,6). The causality of other modifiable traditional CAD risk factors such as adiposity, hypertension, and type 2 diabetes has also been corroborated by MR (4).
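The comparison across genotype groups can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not an analysis of real data: the allele frequency, effect sizes, and the variable names (`g`, `u`, `exposure`, `outcome`) are all hypothetical. It contrasts a naive exposure-outcome regression, which is distorted by an unmeasured confounder, with the ratio (Wald) estimate that uses the genotype as an instrument.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
true_effect = 0.5  # hypothetical causal effect of exposure on outcome

# Genotype coded as the number of B alleles (0 = AA, 1 = AB, 2 = BB);
# an allele frequency of 0.3 is assumed purely for illustration.
g = rng.binomial(2, 0.3, size=n)

# Unmeasured confounder that raises both the exposure and the outcome.
u = rng.normal(size=n)
exposure = 0.4 * g + u + rng.normal(size=n)
outcome = true_effect * exposure + u + rng.normal(size=n)


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


# Naive observational estimate: biased upward by the confounder.
naive = slope(exposure, outcome)

# Ratio (Wald) estimate: genotype-outcome slope over genotype-exposure slope.
beta_gx = slope(g, exposure)
beta_gy = slope(g, outcome)
wald = beta_gy / beta_gx

print(f"naive estimate: {naive:.3f}")
print(f"Wald estimate:  {wald:.3f} (simulated truth: {true_effect})")
```

Because the genotype is generated independently of the confounder, the three genotype groups differ in exposure but are otherwise balanced, which is why the ratio estimate lands near the simulated truth while the naive slope does not.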
The most fruitful application of MR going forward for CAD may come not in the form of confirming the causality of established risk factors but rather in the form of applying MR to a wealth of emerging molecular data including measures of gene expression, microRNAs, DNA methylation, metabolomics, and proteomics to identify novel causal associations in a hypothesis-free manner. A particularly alluring aspect of molecular MR studies in search of causal risk factors for CAD is the biological plausibility that many causal atherosclerotic processes active in the arterial wall are likely to be faithfully reflected in the levels of various protein biomarkers present in the easily accessible tissue of blood.
In this issue of the Journal, Sjaarda et al. (7) provide an example of this approach by conducting an MR analysis of 205 serum protein biomarkers in the ORIGIN (Outcomes Reduction with an Initial Glargine Intervention), CARDIoGRAMplusC4D (Coronary ARtery DIsease Genome wide Replication and Meta-analysis [CARDIoGRAM] plus The Coronary Artery Disease [C4D] Genetics), and UK Biobank studies. Their analysis implicates blood levels of colony-stimulating factor 1 (CSF1) and stromal cell–derived factor (CXCL12) in the etiology of CAD while simultaneously confirming the causal roles of lipoprotein(a), interleukin-6 receptor, apolipoprotein E, and apolipoprotein C3. The investigators are to be commended for their systematic approach to the identification of genetic instruments for their biomarkers followed by testing of these biomarkers for association with CAD in external datasets. Key strengths of their analytic plan include the 2-stage design, the replication of positive findings in an independent cohort, and the conduct of MR analysis not only between each biomarker and CAD but also between 2 biomarkers of interest to assess for the presence and direction of mediation. These strengths increase the confidence that the novel causal mechanisms identified are real and that the relationships between inflammatory markers presented in their central figure are accurate.
What can we expect in the future using this approach? As technologies mature and our ability to reliably measure the levels of all biomarkers in blood is perfected, we can anticipate well-powered MR studies of the entire plasma proteome to materialize and, hopefully, yield a relatively comprehensive map of causal and noncausal relationships among biological processes related to CAD. Current multiplex affinity-based or targeted assays are able to measure the absolute or relative levels of ∼1,000 proteins, including many low-abundance “leakage” proteins such as cytokines and troponins, as well as many actively secreted proteins (8). The exact number of proteins present in low abundance in blood remains unknown, but it seems likely that a large proportion of the human proteome may at some point in time be detectable in blood (8).
The new knowledge gained through such an approach will undoubtedly build on our current knowledge base and may be particularly helpful in illuminating the complicated relationship between markers of inflammation and the risk of CAD. However, the road to clarity on the pathophysiology of CAD using this approach is sure to be bumpy, given that hypothesis-free designs are also more susceptible to violations of 1 of the core assumptions of MR studies, namely the absence of horizontal pleiotropy (3). Horizontal pleiotropy occurs when a single-nucleotide polymorphism is associated, independently of the exposure of interest, with other traits that may in turn influence the outcome of interest. This type of pleiotropy is widely considered to be the single greatest threat to the validity of MR studies (9). Although it may not be possible to prove that this assumption holds, various specialized extensions of the basic MR design have been developed to detect its presence and to estimate the causal effect of the exposure even when the assumption is violated (10).
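To make the threat from horizontal pleiotropy concrete, the sketch below simulates summary statistics for 50 hypothetical variants that each carry a small direct effect on the outcome. It then compares two widely used estimators: inverse-variance weighted (IVW) regression through the origin, and MR-Egger regression, which adds an intercept to absorb directional pleiotropy. These two estimators are chosen here for illustration only and are not necessarily the extensions used in the studies cited. All effect sizes are assumptions, and MR-Egger itself relies on the further assumption that the pleiotropic effects are independent of instrument strength.

```python
import numpy as np

rng = np.random.default_rng(1)
n_snps = 50
true_effect = 0.3   # hypothetical causal effect of exposure on outcome
pleiotropy = 0.05   # hypothetical direct variant-outcome effect (bypasses exposure)

# Variant-exposure associations, oriented so that all are positive.
beta_x = rng.uniform(0.05, 0.30, size=n_snps)

# Variant-outcome associations = causal path + pleiotropic path + noise.
se_y = np.full(n_snps, 0.01)
beta_y = true_effect * beta_x + pleiotropy + rng.normal(0.0, se_y)

# Inverse-variance weights.
w = 1.0 / se_y**2

# IVW: weighted regression of beta_y on beta_x forced through the origin.
ivw = np.sum(w * beta_x * beta_y) / np.sum(w * beta_x**2)

# MR-Egger: the same weighted regression, but with an intercept term.
X = np.column_stack([np.ones(n_snps), beta_x])
W = np.diag(w)
egger_intercept, egger_slope = np.linalg.solve(X.T @ W @ X, X.T @ W @ beta_y)

print(f"IVW estimate:       {ivw:.3f}")
print(f"MR-Egger slope:     {egger_slope:.3f} (simulated truth: {true_effect})")
print(f"MR-Egger intercept: {egger_intercept:.3f} (simulated pleiotropy: {pleiotropy})")
```

In this simulation the IVW estimate is inflated because the pleiotropic effect is wrongly attributed to the exposure, whereas a nonzero MR-Egger intercept flags the violation and the slope moves back toward the simulated causal effect.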
We close by emphasizing that the results of positive MR studies, although often appealing and exciting on their own, should normally be regarded as hypothesis generating given the potential for subtle violations in 1 or more of the 3 core assumptions. Where possible, investigators should seek corroborating evidence from orthogonal sources such as experimental cellular or tissue studies in humans, animal studies, or clinical trials before drawing definitive conclusions about causality. More confidence regarding causality is justified when evidence generated by a range of diverse study designs, each with its own strengths and limitations, points to the same conclusion. In this respect, we are intrigued by the abundant additional experimental evidence that already exists on the role of colony-stimulating factor 1 and stromal cell–derived factor in the development of atherosclerosis as summarized by Sjaarda et al. in their report (7). To date, pharmacological inhibition of these 2 molecules has largely been explored in the context of noncardiovascular disorders including cancer and chronic pancreatitis (11,12). Is it time to explore such inhibition for the primary and/or secondary prevention of CAD? With the findings from Sjaarda et al. (7), the evidence in favor of this line of investigation is building and may soon reach a tipping point for action.
Editorials published in the Journal of the American College of Cardiology reflect the views of the authors and do not necessarily represent the views of JACC or the American College of Cardiology.
Both authors have reported that they have no relationships relevant to the contents of this paper to disclose.
- Mendel G. Experiments in plant hybridization. Verhandlungen des naturforschenden Vereines in Brünn 1866;4:3–47.
- Burgess S., Thompson S.G.
- Burgess S., Harshfield E.
- Sjaarda J., Gerstein H., Chong M., et al.
- Smith J.G., Gerszten R.E.
- Zheng J., Baird D., Borges M.C., et al.
- Cannarile M.A., Weisser M., Jacob W., Jegg A.M., Ries C.H., Ruttinger D.
- Neesse A., Ellenrieder V.