Methodological Issues in Multistage Genome-Wide Association Studies
Because of the high cost of commercial genotyping chip technologies, many
investigations have used a two-stage design for genome-wide association
studies, using part of the sample for an initial discovery of ``promising''
SNPs at a less stringent significance level and the remainder in a joint
analysis of just these SNPs using custom genotyping. With this design, typical
cost savings of about 50% are possible at comparable levels of overall type I
error and power, by using about half the sample for stage I and carrying about
0.1% of SNPs forward to the second stage; the optimal design depends
primarily upon the ratio of costs per genotype for stages I and II. However,
with the rapidly declining costs of the commercial panels, the generally low
observed ORs of current studies, and many studies aiming to test multiple
hypotheses and multiple endpoints, many investigators are abandoning the
two-stage design in favor of simply genotyping all available subjects using a
standard high-density panel. Concern is sometimes raised about the absence of a
``replication'' panel in this approach, as required by some high-profile
journals, but it must be appreciated that the two-stage design is not a
discovery/replication design but simply a more efficient design for discovery
using a joint analysis of the data from both stages. Once a subset of
highly significant associations has been discovered, a truly independent
``exact replication'' study of the same promising SNPs is needed, in a similar
population and using similar methods.
Comment: Published at http://dx.doi.org/10.1214/09-STS288 in Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
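The cost argument in the abstract above can be illustrated with a rough calculation. The following is a minimal sketch, not the paper's own model: it counts genotyping cost only, and the function name, its parameters, and the stage II to stage I per-genotype cost ratio of 20 are hypothetical choices made for illustration. It does not address the type I error and power calibration that the abstract also refers to.

```python
def two_stage_relative_cost(sample_frac_stage1, marker_frac_stage2, cost_ratio):
    """Genotyping cost of a two-stage design relative to genotyping
    every subject on the full panel (one-stage design = 1.0).

    sample_frac_stage1: fraction of subjects genotyped on the full panel in stage I
    marker_frac_stage2: fraction of SNPs carried forward to stage II
    cost_ratio: per-genotype cost in stage II divided by that in stage I
                (hypothetical input; custom genotyping usually costs more per genotype)
    """
    if not 0.0 <= sample_frac_stage1 <= 1.0:
        raise ValueError("sample_frac_stage1 must lie in [0, 1]")
    if not 0.0 <= marker_frac_stage2 <= 1.0:
        raise ValueError("marker_frac_stage2 must lie in [0, 1]")
    if cost_ratio < 0.0:
        raise ValueError("cost_ratio must be non-negative")

    # Stage I: a fraction of the sample, all markers, at unit cost per genotype.
    stage1 = sample_frac_stage1
    # Stage II: the remaining subjects, only the promoted markers, at the higher unit cost.
    stage2 = (1.0 - sample_frac_stage1) * marker_frac_stage2 * cost_ratio
    return stage1 + stage2


# Half the sample in stage I, 0.1% of SNPs carried forward (figures from the abstract),
# with an assumed cost ratio of 20.
relative = two_stage_relative_cost(0.5, 0.001, 20.0)
print(f"relative cost: {relative:.2f}, saving: {1.0 - relative:.0%}")
```

Under these assumptions the stage I term dominates, so the saving stays close to the fraction of the sample held back for stage II unless the cost ratio becomes very large.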
Interethnic differences in pancreatic cancer incidence and risk factors: The Multiethnic Cohort.
While disparity in pancreatic cancer incidence between blacks and whites has been observed, few studies have examined disparity in other ethnic minorities. We evaluated variations in pancreatic cancer incidence and assessed the extent to which known risk factors account for differences in pancreatic cancer risk among African Americans, Native Hawaiians, Japanese Americans, Latino Americans, and European Americans in the Multiethnic Cohort Study. Risk factor data were obtained from the baseline questionnaire. Cox regression was used to estimate the relative risks (RRs) and 95% confidence intervals (CIs) for pancreatic cancer associated with risk factors and ethnicity. During an average 16.9-year follow-up, 1,532 incident pancreatic cancer cases were identified among 184,559 at-risk participants. Family history of pancreatic cancer (RR 1.97, 95% CI 1.50-2.58), diabetes (RR 1.32, 95% CI 1.14-1.54), body mass index ≥30 kg/m² (RR 1.25, 95% CI 1.08-1.46), current smoking (<20 pack-years RR 1.43, 95% CI 1.19-1.73; ≥20 pack-years RR 1.76, 95% CI 1.46-2.12), and red meat intake (RR 1.17, 95% CI 1.00-1.36) were associated with pancreatic cancer. After adjustment for these risk factors, Native Hawaiians (RR 1.60, 95% CI 1.30-1.98), Japanese Americans (RR 1.33, 95% CI 1.15-1.54), and African Americans (RR 1.20, 95% CI 1.01-1.42), but not Latino Americans (RR 0.90, 95% CI 0.76-1.07), had a higher risk of pancreatic cancer compared to European Americans. Interethnic differences in pancreatic cancer risk are not fully explained by differences in the distribution of known risk factors. The greater risks in Native Hawaiians and Japanese Americans are new findings, and elucidating the causes of these high rates may improve our understanding and prevention of pancreatic cancer.
A Kinship-Based Modification of the Armitage Trend Test to Address Hidden Population Structure and Small Differential Genotyping Errors
BACKGROUND/AIMS: We propose a modification of the well-known Armitage trend test to address the problems associated with hidden population structure and hidden relatedness in genome-wide case-control association studies. METHODS: The new test adopts beneficial traits from three existing testing strategies (principal components, mixed models, and genomic control) while avoiding some of their disadvantageous characteristics, such as the tendency of the principal components method to over-correct in certain situations or the failure of the genomic control approach to reorder the adjusted tests based on their degree of alignment with the underlying hidden structure. The new procedure is based on Gauss-Markov estimators derived from a straightforward linear model with an imposed variance structure proportional to an empirical relatedness matrix. Conceptual and analytical similarities to and distinctions from other approaches are emphasized throughout. RESULTS: Our simulations show that the power of the proposed test is quite promising compared with the competing strategies considered. The power gains are especially large when small systematic differences between cases and controls are present, a likely scenario when public controls are used in multiple studies. CONCLUSION: The proposed modified approach attains high power more consistently than the existing commonly implemented tests. Its performance improvement is most apparent when small but detectable systematic differences between cases and controls exist.
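For orientation, the baseline that the abstract above modifies can be sketched as follows. This is the standard, unmodified Cochran-Armitage trend statistic for a 2x3 genotype table with additive scores (0, 1, 2), not the kinship-based version proposed in the paper; the function name and the example counts are hypothetical. The standard statistic makes no correction for population structure or relatedness, which is the gap the proposed test is meant to address.

```python
def armitage_trend_chi2(cases, controls, weights=(0, 1, 2)):
    """Standard Cochran-Armitage trend statistic (1 df chi-square under the null).

    cases, controls: genotype counts, one entry per genotype class
                     (e.g. 0, 1, 2 copies of the minor allele)
    weights:         scores assigned to the genotype classes (additive by default)
    """
    if not (len(cases) == len(controls) == len(weights)):
        raise ValueError("cases, controls and weights must have equal length")

    n_case = sum(cases)
    n_ctrl = sum(controls)
    n = n_case + n_ctrl
    col_totals = [r + s for r, s in zip(cases, controls)]

    # Trend numerator: T = sum_i t_i * (r_i * S - s_i * R)
    t_stat = sum(t * (r * n_ctrl - s * n_case)
                 for t, r, s in zip(weights, cases, controls))

    # Null variance: (R * S / N) * (N * sum t_i^2 c_i - (sum t_i c_i)^2)
    sum_t2c = sum(t * t * c for t, c in zip(weights, col_totals))
    sum_tc = sum(t * c for t, c in zip(weights, col_totals))
    variance = (n_case * n_ctrl / n) * (n * sum_t2c - sum_tc ** 2)

    if variance == 0:
        # No genotype variation or a single group: the test is uninformative.
        return 0.0
    return t_stat ** 2 / variance


# Hypothetical counts: the minor allele is enriched among cases.
print(armitage_trend_chi2(cases=(10, 20, 30), controls=(30, 20, 10)))
```

The statistic equals the sample size times the squared Pearson correlation between genotype score and case status, which is one way to see why it can be recast as a linear model to which a relatedness-based variance structure is then attached.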
Improved Imputation of Common and Uncommon Single Nucleotide Polymorphisms (SNPs) with a New Reference Set
Statistical imputation of genotype data is an important technique for analysis of genome-wide association studies (GWAS). We have built a reference dataset to improve imputation accuracy for studies of individuals of primarily European descent using genotype data from the Hap1, Omni1, and Omni2.5 human SNP arrays (Illumina). Our dataset contains 2.5-3.1 million variants for 930 European, 157 Asian, and 162 African/African-American individuals. Imputation accuracy of European data from Hap660 or OmniExpress array content, measured by the proportion of variants imputed with $R^2 > 0.8$, improved by 34%, 23% and 12% for variants with MAF of 3%, 5% and 10%, respectively, compared to imputation using publicly available data from 1,000 Genomes and International HapMap projects. The improved accuracy with the use of the new dataset could increase the power for GWAS by as much as 8% relative to genotyping all variants. This reference dataset is available to the scientific community through the NCBI dbGaP portal. Future versions will include additional genotype data as well as non-European populations.
Leveraging population admixture to characterize the heritability of complex traits.
Despite recent progress on estimating the heritability explained by genotyped SNPs ($h_g^2$), a large gap between $h_g^2$ and estimates of total narrow-sense heritability ($h^2$) remains. Explanations for this gap include rare variants or upward bias in family-based estimates of $h^2$ due to shared environment or epistasis. We estimate $h^2$ from unrelated individuals in admixed populations by first estimating the heritability explained by local ancestry ($h_\gamma^2$). We show that $h_\gamma^2 = 2 F_{STC}\,\theta(1-\theta)\,h^2$, where $F_{STC}$ measures frequency differences between populations at causal loci and $\theta$ is the genome-wide ancestry proportion. Our approach is not susceptible to biases caused by epistasis or shared environment. We applied this approach to the analysis of 13 phenotypes in 21,497 African-American individuals from 3 cohorts. For height and body mass index (BMI), we obtained $h^2$ estimates of 0.55 ± 0.09 and 0.23 ± 0.06, respectively, which are larger than estimates of $h_g^2$ in these and other data but smaller than family-based estimates of $h^2$.
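The relation stated in the abstract above (heritability explained by local ancestry equals 2 × F_STC × θ(1 − θ) × total narrow-sense heritability) can be rearranged to recover the total heritability from the local-ancestry component. The following is a minimal sketch of that arithmetic only: the function names are hypothetical, the input values are illustrative and not taken from the paper, and the estimation of the local-ancestry heritability itself is not shown.

```python
def local_ancestry_h2(h2, fst_causal, theta):
    """Heritability explained by local ancestry, per the relation in the abstract:
    h2_gamma = 2 * F_STC * theta * (1 - theta) * h2
    """
    return 2.0 * fst_causal * theta * (1.0 - theta) * h2


def total_h2_from_local_ancestry(h2_gamma, fst_causal, theta):
    """Invert the relation to recover total narrow-sense heritability.

    h2_gamma:   heritability explained by local ancestry
    fst_causal: F_STC, frequency differentiation between populations at causal loci
    theta:      genome-wide ancestry proportion, strictly between 0 and 1
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between 0 and 1")
    if fst_causal <= 0.0:
        raise ValueError("fst_causal must be positive")
    return h2_gamma / (2.0 * fst_causal * theta * (1.0 - theta))


# Illustrative inputs only (not values reported in the paper).
h2_gamma = local_ancestry_h2(h2=0.5, fst_causal=0.1, theta=0.2)
print(h2_gamma, total_h2_from_local_ancestry(h2_gamma, fst_causal=0.1, theta=0.2))
```

Because the scaling factor 2 × F_STC × θ(1 − θ) is small for realistic inputs, the local-ancestry component is a small fraction of the total, so any error in it is magnified when dividing back out.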
