Avoiding bias from weak instruments in Mendelian randomization studies
Background: Mendelian randomization is used to test and estimate the magnitude of a causal effect of a phenotype on an outcome by using genetic variants as instrumental variables (IVs). Estimates of association from IV analysis are biased in the direction of the confounded, observational association between phenotype and outcome. The magnitude of the bias depends on the F-statistic for the strength of relationship between IVs and phenotype. We seek to develop guidelines for the design and analysis of Mendelian randomization studies to minimize bias. Methods: IV analysis was performed on simulated and real data to investigate the effect on bias of size of study, number and choice of instruments and method of analysis. Results: Bias is shown to increase as the expected F-statistic decreases, and can be reduced by using parsimonious models of genetic association (i.e. not over-parameterized) and by adjusting for measured covariates. Using data from a single study, the causal estimate of a unit increase in log-transformed C-reactive protein on fibrinogen (mmol/l) is shown to increase from −0.005 (P = 0.99) to 0.792 (P = 0.00003) due to injudicious choice of instrument. Moreover, when the observed F-statistic is larger than expected in a particular study, the causal estimate is more biased towards the observational association and its standard error is smaller. This correlation between causal estimate and standard error introduces a second source of bias into meta-analysis of Mendelian randomization studies. Bias can be alleviated in meta-analyses by using individual level data and by pooling genetic effects across studies. Conclusions: Weak instrument bias is of practical importance for the design and analysis of Mendelian randomization studies. Post hoc choice of instruments, genetic models or data based on measured F-statistics can exacerbate bias. In particular, the commonly cited rule of thumb that F > 10 avoids bias in IV analysis is misleading.
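The quantities named in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes a single genetic variant coded as an allele count, one unmeasured confounder, and linear effects, and all function names and parameter values (`simulate`, `gene_effect`, `causal_effect`, the 0.3 allele frequency) are hypothetical choices made for illustration. It computes the confounded observational slope, the ratio-of-coefficients IV estimate, and the first-stage F-statistic.

```python
import random


def slope(x, y):
    """Ordinary least-squares slope of y on x (model with intercept)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def first_stage_f(g, x):
    """F-statistic from regressing phenotype x on a single instrument g."""
    n = len(g)
    b = slope(g, x)
    mg = sum(g) / n
    mx = sum(x) / n
    a = mx - b * mg
    rss = sum((xi - a - b * gi) ** 2 for gi, xi in zip(g, x))
    tss = sum((xi - mx) ** 2 for xi in x)
    # One instrument: 1 numerator and n - 2 denominator degrees of freedom.
    return (tss - rss) / (rss / (n - 2))


def ratio_estimate(g, x, y):
    """Ratio-of-coefficients IV estimate: (slope of y on g) / (slope of x on g)."""
    return slope(g, y) / slope(g, x)


def simulate(n, gene_effect, causal_effect, seed):
    """Simulate instrument g, phenotype x and outcome y (illustrative model)."""
    rng = random.Random(seed)
    g, x, y = [], [], []
    for _ in range(n):
        # Allele count 0, 1 or 2 with an assumed allele frequency of 0.3.
        gi = (rng.random() < 0.3) + (rng.random() < 0.3)
        u = rng.gauss(0, 1)  # unmeasured confounder of x and y
        xi = gene_effect * gi + u + rng.gauss(0, 1)
        yi = causal_effect * xi + u + rng.gauss(0, 1)
        g.append(gi)
        x.append(xi)
        y.append(yi)
    return g, x, y


if __name__ == "__main__":
    g, x, y = simulate(n=20000, gene_effect=0.5, causal_effect=0.3, seed=1)
    print("observational slope:", slope(x, y))
    print("IV ratio estimate:  ", ratio_estimate(g, x, y))
    print("first-stage F:      ", first_stage_f(g, x))
```

With a large sample and a reasonably strong instrument, the ratio estimate should sit near the simulated causal effect while the observational slope is pulled upward by the confounder. Reducing `n` or `gene_effect` lowers the expected F-statistic and lets one explore the weak-instrument behaviour described above.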
Bayesian methods for meta-analysis of causal relationships estimated using genetic instrumental variables.
Genetic markers can be used as instrumental variables, in an analogous way to randomization in a clinical trial, to estimate the causal relationship between a phenotype and an outcome variable. Our purpose is to extend the existing methods for such Mendelian randomization studies to the context of multiple genetic markers measured in multiple studies, based on the analysis of individual participant data. First, for a single genetic marker in one study, we show that the usual ratio of coefficients approach can be reformulated as a regression with heterogeneous error in the explanatory variable. This can be implemented using a Bayesian approach, which is next extended to include multiple genetic markers. We then propose a hierarchical model for undertaking a meta-analysis of multiple studies, in which it is not necessary that the same genetic markers are measured in each study. This provides an overall estimate of the causal relationship between the phenotype and the outcome, and an assessment of its heterogeneity across studies. As an example, we estimate the causal relationship of blood concentrations of C-reactive protein on fibrinogen levels using data from 11 studies. These methods provide a flexible framework for efficient estimation of causal relationships derived from multiple studies. Issues discussed include weak instrument bias, analysis of binary outcome data such as disease risk, missing genetic data, and the use of haplotypes
Collaborative pooled analysis of data on C-reactive protein gene variants and coronary disease: judging causality by Mendelian randomisation
Many prospective studies have reported associations between circulating C-reactive protein (CRP) levels and risk of coronary heart disease (CHD), but causality remains uncertain. Studies of CHD are being conducted that involve measurement of common polymorphisms of the CRP gene known to be associated with circulating concentrations, thereby utilising these variants as proxies for circulating CRP levels. By analysing data from several studies examining the association between relevant CRP polymorphisms and CHD risk, the present collaboration will undertake a Mendelian randomisation analysis to help assess the likelihood of any causal relevance of CRP levels to CHD risk. A central database is being established containing individual data on CRP polymorphisms, circulating CRP levels, and major coronary outcomes as well as age, sex and other relevant characteristics. Associations between CRP polymorphisms or haplotypes and CHD will be evaluated under different circumstances. This collaboration comprises, at present, about 37,000 CHD outcomes and about 120,000 controls, which should yield suitably precise findings to help judge causality. This work should advance understanding of the relevance of low-grade inflammation to CHD and indicate whether or not CRP itself is involved in long-term pathogenesis
