12 research outputs found
Regional heritability mapping method helps explain missing heritability of blood lipid traits in isolated populations
Single single-nucleotide polymorphism (SNP) genome-wide association studies (SSGWAS) may fail to identify loci with modest effects on a trait. The recently developed regional heritability mapping (RHM) method can potentially identify such loci. In this study, RHM was compared with the SSGWAS for blood lipid traits (high-density lipoprotein (HDL), low-density lipoprotein (LDL), plasma concentrations of total cholesterol (TC) and triglycerides (TG)). Data comprised 2246 adults from isolated populations genotyped using ∼300 000 SNP arrays. The results were compared with large meta-analyses of these traits for validation. Using RHM, two significant regions affecting HDL on chromosomes 15 and 16 and one affecting LDL on chromosome 19 were identified. These regions covered the most significant SNPs associated with HDL and LDL from the meta-analysis. The chromosome 19 region was identified in our data despite the fact that the most significant SNP in the meta-analysis (or any SNP tagging it) was not genotyped in our SNP array. The SSGWAS identified one SNP associated with HDL on chromosome 16 (the top meta-analysis SNP) and one on chromosome 10 (not reported by RHM or in the meta-analysis and hence possibly a false positive association). The results further confirm that RHM can have better power than SSGWAS in detecting causal regions, including regions containing crucial ungenotyped variants. This study suggests that RHM can be a useful tool to explain some of the 'missing heritability' of complex trait variation.
Effect of skill drills on neonatal ventilation performance in a simulated setting—observation study in Nepal
A cautionary note on the impact of protocol changes for genome-wide association SNP × SNP interaction studies: an example on ankylosing spondylitis
Peer reviewed. Genome-wide association interaction (GWAI) studies have increased in popularity. Yet
to date, no standard protocol exists. In practice, any GWAI workflow involves making
choices about quality control strategy, SNP filtering, linkage disequilibrium (LD)
pruning, and the analytic tool used to model or test for genetic interactions. Each of these can
have an impact on the final epistasis findings and may affect their reproducibility in
follow-up analyses. Choosing an analytic tool is not straightforward, as several such
tools exist and current understanding of their performance is often based on very
particular simulation settings. In the present study, we wish to raise awareness of the
impact that (minor) changes in a GWAI analysis protocol can have on final epistasis
findings. In particular, we investigate the influence of marker selection and marker
prioritization strategies, LD pruning and the choice of epistasis detection analytics on
study results, giving rise to 8 GWAI protocols. The results are discussed in the context of
the ankylosing spondylitis (AS) data obtained via the Wellcome Trust Case Control
Consortium (WTCCC2). As expected, the largest impact on AS epistasis findings is
caused by the choice of marker selection criterion, followed by marker coding and LD
pruning. In MB-MDR, co-dominant coding of main effects is more robust to the effects
of LD pruning than additive coding. We were able to reproduce previously reported
epistasis involvement of HLA-B and ERAP1 in AS pathology. In addition, our results
suggest involvement of MAGI3 and PARK2, responsible for cell adhesion and cellular
trafficking. Gene Ontology (GO) biological function enrichment analysis across the 8
considered GWAI protocols also suggested that AS could be associated with central
nervous system (CNS) malfunctions, specifically in nerve impulse propagation and in
neurotransmitter metabolic processes.
Efficiency of different strategies to mitigate ascertainment bias when using SNP panels in diversity studies
Background: Single nucleotide polymorphism (SNP) panels have been widely used to study genomic variations within and between populations. Methods of SNP discovery have been a matter of debate for their potential of introducing ascertainment bias, and genetic diversity results obtained from the SNP genotype data can be misleading. We used a total of 42 chicken populations where both individual genotyped array data and pool whole genome resequencing (WGS) data were available. We compared allele frequency distributions and genetic diversity measures (expected heterozygosity (He), fixation index (FST) values, genetic distances and principal components analysis (PCA)) between the two data types. With the array data, we applied different filtering options (SNPs polymorphic in samples of two Gallus gallus wild populations, linkage disequilibrium (LD) based pruning and minor allele frequency (MAF) filtering, and combinations thereof) to assess their potential to mitigate the ascertainment bias.
Results: Rare SNPs were underrepresented in the array data. Array data consistently overestimated He compared to WGS data, although the ranking of the breeds was similar, as demonstrated by Spearman's rank correlations ranging between 0.956 and 0.985. LD based pruning resulted in a reduced overestimation of He compared to the other filters and slightly improved the relationship with the WGS results. The raw array data and those with polymorphic SNPs in the wild samples underestimated pairwise FST values between breeds which had low FST (0.15). LD based pruned data underestimated FST in a consistent manner. The genetic distance matrix from LD pruned data was more closely related to that of WGS than the other array versions. PCA was rather robust in all array versions, since the population structure on the PCA plot was generally well captured in comparison to the WGS data.
Conclusions: Among the tested filtering strategies, LD based pruning accounted best for the effects of ascertainment bias, producing results that are most comparable to those obtained from WGS data, and is therefore recommended for practical use.
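The expected heterozygosity measure named in this abstract can be illustrated with a minimal sketch. It assumes biallelic SNPs and the textbook per-locus formula He = 2p(1 - p), averaged over loci. The function names and the allele frequencies below are hypothetical and are not taken from the study, and the MAF filter here only mimics a panel in which rare SNPs are underrepresented.

```python
def expected_heterozygosity(allele_freqs):
    """Mean expected heterozygosity over biallelic loci, He = 2 * p * (1 - p)."""
    if not allele_freqs:
        raise ValueError("need at least one locus")
    return sum(2.0 * p * (1.0 - p) for p in allele_freqs) / len(allele_freqs)


def drop_rare(allele_freqs, min_maf):
    """Keep loci whose minor allele frequency is at least min_maf."""
    return [p for p in allele_freqs if min(p, 1.0 - p) >= min_maf]


# Hypothetical allele frequencies, including several rare variants.
all_loci = [0.01, 0.02, 0.05, 0.3, 0.5]
# A panel that misses rare SNPs (illustrative threshold, not from the study).
panel_like = drop_rare(all_loci, min_maf=0.1)

he_all = expected_heterozygosity(all_loci)
he_panel = expected_heterozygosity(panel_like)
print(he_all, he_panel)
```

In this toy case the panel-like subset gives a higher He than the full set of loci, which is the direction of bias the abstract reports for array data.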
Development and application of a novel genome-wide SNP array reveals domestication history in soybean
Constructing endophenotypes of complex disease using non-negative matrix factorization and adjusted rand index
Complex diseases are typically caused by combinations of molecular disturbances that vary widely among different patients. Endophenotypes, a combination of genetic factors associated with a disease, offer a simplified approach to dissect complex traits by reducing genetic heterogeneity. Because molecular dissimilarities often exist between patients with indistinguishable disease symptoms, these unique molecular features may reflect pathogenic heterogeneity. To detect molecular dissimilarities among patients and reduce the complexity of high-dimensional data, we have explored an endophenotype-identification analytical procedure that combines non-negative matrix factorization (NMF) and the adjusted Rand index (ARI), a measure of the similarity of two clusterings of a data set. To evaluate this procedure, we compared it with a commonly used method, principal component analysis with k-means clustering (PCA-K). A simulation study with a gene expression dataset and genotype information was conducted to examine the performance of our procedure and PCA-K. The results showed that NMF mostly outperformed PCA-K. Additionally, we applied our endophenotype-identification analytical procedure to a publicly available dataset containing data derived from patients with late-onset Alzheimer's disease (LOAD). NMF distilled information associated with 1,116 transcripts into three metagenes and three molecular subtypes (MS) for patients in the LOAD dataset: MS1 (), MS2 (), and MS3 (). ARI was then used to determine the most representative transcripts for each metagene; 123, 89, and 71 metagene-specific transcripts were identified for MS1, MS2, and MS3, respectively. These metagene-specific transcripts were identified as the endophenotypes. Our results showed that 14, 38, 0, and 28 candidate susceptibility genes listed in the AlzGene database were found for all patients, MS1, MS2, and MS3, respectively. Moreover, we found that MS2 might be a normal-like subtype.
Our proposed procedure provides an alternative approach to investigate the pathogenic mechanism of disease and better understand the relationship between phenotype and genotype.
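The adjusted Rand index mentioned in this abstract can be sketched from its standard pair-counting definition. This is a minimal illustration and not the authors' procedure: the function name and the example labelings are hypothetical, and the NMF step is not shown.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two clusterings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")

    # Pairs of items placed together in both clusterings (contingency cells).
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together within each clustering separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / comb(n, 2)  # agreement expected by chance
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        # Degenerate case, e.g. both clusterings put every item in one cluster.
        return 1.0
    return (together_both - expected) / (maximum - expected)


# Hypothetical labelings of four samples.
same_up_to_renaming = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
fully_crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
print(same_up_to_renaming, fully_crossed)
```

The index is 1 when two clusterings agree up to a renaming of the cluster labels and falls to around 0 or below when agreement is no better than chance.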
